Design Goals

To support the Prospect-J development team, the Firm is implementing a highly available, yet low-cost, file service. The design goals are based on the business case, server requirements, cluster services, and expected service levels.

Business Case

The completed financial analysis shows that every day the Firm delays bringing Prospect-J to market could cost $100,000 in revenue. The expected primary cause for missing the schedule is interruptions in the development and test process.

To avoid interruptions in the development and test processes caused by source code becoming inaccessible after a file-system-related failure, the Firm decided to implement a highly available network file service. The Firm's founder believes that mistakes caused by tired employees can be more costly than those caused by hardware failure. The founder uses the phrase “you can get almost as much work finished in 11 hours as you can in 10” to illustrate his point of view. To help guard against the effects of human error, daily archival of the source code is required.

Server Requirements

To support Prospect-J development, the development group needs a highly available network file server that meets the following requirements:

  1. Support developers whenever they want to work. Essentially, this requirement demands continuous availability.

  2. Support nightly builds of software for testing the following day.

  3. Back up all files to tape so that a user without system administrator privileges can easily retrieve a specific file from a specific date.

  4. Acquire a system quickly to meet the impossible deadline.

  5. All administration tasks must be done from a remote location because the system does not have operators on duty continuously. Error reporting and status must be deliverable through two-way email for use by cell phones or pagers.

  6. Spare parts must be available in a reasonable time frame but are not required on site.

  7. Preserve file-locking semantics during a failover.

  8. Provide NFS file services.

  9. Support up to 100 users during peak periods. The user workload is sporadic and unpredictable.

Note

The Firm expects the system to supply file services only. Other systems will compile, edit, and handle other file manipulation.
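
Requirement 5 asks for error reports and status that can reach a cell phone or pager by email. As a rough illustration only, the following Python sketch builds such a message with the standard library. The node name, addresses, and the 60-character subject limit are assumptions made for this example, not details from the Firm's design, and the sketch covers only the outbound half of two-way email.

```python
from email.message import EmailMessage


def build_status_alert(node: str, severity: str, detail: str,
                       sender: str, recipient: str,
                       max_subject: int = 60) -> EmailMessage:
    """Build a short plain-text alert for a small-screen device.

    max_subject is an assumed display limit, not a documented one.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    # Put the essentials in the subject and truncate to the assumed limit.
    msg["Subject"] = f"[{severity}] {node}: {detail}"[:max_subject]
    msg.set_content(f"node={node}\nseverity={severity}\ndetail={detail}\n")
    return msg


alert = build_status_alert(
    node="fileserver-a",             # hypothetical node name
    severity="ERROR",
    detail="NFS service failed over",
    sender="cluster@example.com",    # hypothetical addresses
    recipient="oncall@example.com",
)
print(alert["Subject"])
```

Actually sending the message (for example through an SMTP relay) and handling replies from administrators are left out, because the source does not describe how the Firm's mail path is set up.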


Cluster Services

The Firm requires the following cluster services:

  • Network file system (NFS)

  • Common Internet File System (CIFS)

  • Daily file backup

Expected Service Level

The Firm is concerned about the time to market for Prospect-J. The developers tend to work nonstandard hours, and the Firm expects them to work longer hours as the deadlines approach. The group starts the daily builds in the early morning and expects them to be completed by 0800, when the test team begins work. For this reason, the system must be available continuously with minimal downtime.

The deadline for completing Prospect-J development is within nine months. Therefore, the Firm is not planning any downtime for upgrades, routine maintenance, or capacity expansion.

Because no scheduled downtime is available, the group leader must first approve emergency patches and then decide when the system administrator can apply them. Also, because no planned downtime is available for upgrades, the plan of record for rolling back emergency patches is to revert to the previous daily backup.

Design Priorities

The senior management of the Firm gave the group a prioritized list (TABLE 5-1) of major features. If a design decision involved a trade-off between two major features, the group used this prioritization to make a decision.

Table 5-1. Major Feature Priorities

Feature          Priority
Availability     1
Cost             2
Reliability      3
Recovery         4
Security         5
Performance      6
Serviceability   7
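
The way the group used this prioritization to settle a trade-off between two features can be sketched in a few lines of Python. The function name is hypothetical. Only the ranking itself comes from Table 5-1.

```python
# Priorities copied from Table 5-1 (1 = most important).
FEATURE_PRIORITY = {
    "Availability": 1,
    "Cost": 2,
    "Reliability": 3,
    "Recovery": 4,
    "Security": 5,
    "Performance": 6,
    "Serviceability": 7,
}


def preferred_feature(first: str, second: str) -> str:
    """Return whichever feature ranks higher (lower number) in Table 5-1."""
    return min(first, second, key=FEATURE_PRIORITY.__getitem__)


print(preferred_feature("Performance", "Cost"))  # Cost
```

For example, a design choice that improves performance but raises cost would be decided in favor of cost, because Cost (2) outranks Performance (6).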

Availability

Availability is the highest priority for the Prospect-J development file server cluster. Any downtime before the development deadline will cause schedule slippage and a subsequent loss of revenue opportunity.

Cost

The Firm wants to keep the cost of the system as low as possible without compromising the desired time-to-market deadline.

Reliability

Component reliability is more important than in typical highly available installations because the Firm does not have a hardened data center. Also, the Firm will not stock spare parts on site.

Software reliability is expected to be high. The Firm does not intend to change the software components. The file services, clustering, and backup software will be used as is, with little or no customization. The Firm will not develop any new software to install on the cluster.

Recovery

Recovery takes two forms: recovery of the cluster when it is down and recovery of archived files. Cluster recovery from a failure is expected to be as automated as possible because no operator is available continuously. Recovery of archived files must be handled by a backup-and-restore software product, without operator assistance.

Security

The system contains some of the Firm's proprietary information and intellectual property. The system does not contain confidential personnel information, credit card numbers, or any other data that would make the Firm legally vulnerable if it were stolen.

The Firm is concerned about security problems causing an outage. Therefore, it will restrict login access to the system to a few responsible administrators. Physical security is a minor concern, but the lack of a hardened data center with controlled physical access puts the Firm at risk. The building that contains the system and developers has a part-time receptionist and a card access system that controls building access.

Some of the Firm's developers work in remote locations and connect to the Firm's internal network through a virtual private network (VPN). A separate group maintains the VPN and all Internet access points. As an added precaution, the Firm uses network address translation (NAT) to help prevent direct access to internal systems from the Internet. File service will be restricted to a maintained list of client hosts.
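
Restricting file service to a known set of hosts amounts to an allowlist check. In practice this would normally be enforced in the file service's own access configuration rather than in application code, so the Python sketch below only illustrates the logic. The networks and addresses are placeholders from the reserved documentation ranges, and the function name is hypothetical.

```python
import ipaddress

# Hypothetical allowlist; the real list of hosts is maintained by the Firm.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),      # stand-in for developer workstations
    ipaddress.ip_network("198.51.100.17/32"),  # stand-in for a single build host
]


def is_allowed_client(address: str) -> bool:
    """Return True if the client address falls inside any allowed network."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in ALLOWED_NETWORKS)


print(is_allowed_client("192.0.2.10"))   # True
print(is_allowed_client("203.0.113.5"))  # False
```

Keeping the list in one place makes it easier to review who has access whenever hosts are added or retired.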

Performance, Sizing, and Capacity Planning

Predicting performance requirements and anticipating write workloads is difficult because the write workloads will be intermittent. The file service clients support local file caching, which reduces the read workload. The Firm does not expect that a sustained workload will exist for any significant period of time other than for backup.

Serviceability

The Firm does not have a centralized IT support organization. The Firm will rely on component supplier service contracts for installation, parts replacement, and any system troubleshooting.
