Appendix C. Calculating Availability

Note

Much of the information that follows is based on the concepts presented in the book High Availability Network Fundamentals, by Chris Oggerino (Cisco Press). At the time of this writing, the book is unfortunately out of print. If you can get your hands on a copy, it is worth your while.

This appendix provides richer detail to help you evaluate the components of system availability, as an extension of what was presented in Chapter 6. You can calculate the availability of a single component with the following equation:

Availability = MTBF / (MTBF + MTTR)

So, the availability of a component whose Mean Time Between Failures (MTBF) is 175,000 hours and Mean Time To Repair (MTTR) is 30 minutes would be:

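Availability = 175,000 / (175,000 + 0.5) = 0.99999714

(Both values are in hours, so the 30-minute MTTR is written as 0.5.)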

In other words, according to the manufacturer’s testing results, the component is expected to have only about 1.5 minutes of downtime per year (the unavailability, 1 - 0.99999714, multiplied by the 525,600 minutes in a year).
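As a quick check of the arithmetic above, here is a minimal Python sketch of the single-component calculation. The function names and the 365-day year are assumptions made for illustration; they do not come from the book.

```python
# Assumption: a 365-day year (525,600 minutes); a leap year would shift the result slightly.
MINUTES_PER_YEAR = 365 * 24 * 60


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Availability = MTBF / (MTBF + MTTR); both arguments must use the same unit."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def downtime_minutes_per_year(avail: float) -> float:
    """Expected yearly downtime implied by an availability figure."""
    return (1.0 - avail) * MINUTES_PER_YEAR


# MTBF of 175,000 hours and MTTR of 30 minutes (0.5 hours), as in the text.
a = availability(175_000, 0.5)
downtime = downtime_minutes_per_year(a)
print(f"availability: {a:.8f}")
print(f"downtime: {downtime:.2f} minutes per year")
```

The point to watch is the units: the MTTR has to be converted to hours before it is added to the MTBF.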

Most systems are composed of more than one component, of course. Multicomponent systems are arranged in a serial or a parallel fashion. In a serial system, each component is a single point of failure, so the system is available only when every component is available. In contrast, a parallel system has redundant components built such that the failure of a single component will not cause the entire system to fail. You can calculate the availability of serial components by multiplying together the availability numbers for each single component:

Serial availability = Availability(1) × Availability(2) × ... × Availability(n)
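The serial rule can be sketched in a few lines of Python. The three availability values below are hypothetical, chosen only to make the effect visible; they are not taken from the book or from any vendor data sheet.

```python
import math


def serial_availability(availabilities):
    """A serial chain is up only when every component is up, so the figures multiply."""
    return math.prod(availabilities)


# Hypothetical component availabilities, for illustration only.
components = [0.9999, 0.9995, 0.999]
system = serial_availability(components)
print(f"system availability: {system:.8f}")
print(f"weakest component:   {min(components):.8f}")
```

Because every factor is less than 1, a serial system is always less available than its weakest component, and each component added to the chain lowers the result further.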

Here’s how to calculate the availability of a serial multicomponent system, consisting of a processor, bus, and I/O card:

System availability = Availability(processor) × Availability(bus) × Availability(I/O card) = 0.99998

This represents 99.998% availability, which is also called “four 9s and an 8.” That was a simplified example. Now, let’s look at a redundant system availability calculation (see Figure C-1).

Figure C-1. Block diagram of a simple redundant system (Source: Chris Oggerino, High Availability Network Fundamentals, Cisco Press)

Figure C-1 shows a diagram of a simple redundant system with two CPUs, two power supplies, and two I/O cards. You can calculate availability on such a system in much the same way you would calculate serial availability. The difference is that each redundant group is first reduced to a single availability figure, and those figures are then multiplied together with the remaining serial components. Note this key qualifier: the availability of a single redundant group (e.g., two power supplies) is 1 minus the product of the individual components’ unavailabilities, where a component’s unavailability is 1 minus its availability. The following formula should help clear this up:

Parallel availability = 1 - [(1 - Availability(1)) × (1 - Availability(2)) × ... × (1 - Availability(n))]
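To see why redundancy helps so much, here is a small Python sketch of the parallel rule: 1 minus the product of each component's unavailability. The function name is an assumption, and the per-unit availability of 0.99994 is used purely as an illustrative input.

```python
import math


def parallel_availability(availabilities):
    """A redundant group is down only when every member is down at the same time."""
    return 1.0 - math.prod(1.0 - a for a in availabilities)


single = 0.99994  # illustrative availability of one power supply
pair = parallel_availability([single, single])
print(f"unavailability of one unit: {1.0 - single:.1e}")
print(f"unavailability of the pair: {1.0 - pair:.1e}")
```

Note that this treats component failures as independent. A shared cause, such as both power supplies being fed from the same circuit, would make the real figure worse than the formula suggests.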

Now that you understand serial versus parallel systems, you can begin to calculate more complex scenarios, such as what’s shown in the following calculation. Assume that you know your I/O card availability is .99995, your CPU availability is .99996, your power supply availability is .99994, and your chassis availability is .999998. The availability calculation would be as follows:

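I/O cards (parallel) = 1 - (1 - 0.99995)^2 = 0.9999999975
CPUs (parallel) = 1 - (1 - 0.99996)^2 = 0.9999999984
Power supplies (parallel) = 1 - (1 - 0.99994)^2 = 0.9999999964
System = 0.9999999975 × 0.9999999984 × 0.9999999964 × 0.999998 ≈ 0.99999799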

The preceding calculation shows that, based purely on hardware MTBF numbers, this scenario should have only 1.05 minutes of downtime per year; in other words, it is a “five 9s” system.
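The full calculation can be reproduced with a short Python sketch. It assumes the layout described for Figure C-1 (redundant pairs of I/O cards, CPUs, and power supplies) in series with a single chassis, plus a 365-day year. The helper names are hypothetical.

```python
import math

MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: non-leap year


def serial(availabilities):
    """Product of availabilities for components that must all be up."""
    return math.prod(availabilities)


def parallel(availabilities):
    """1 minus the product of unavailabilities for a redundant group."""
    return 1.0 - math.prod(1.0 - a for a in availabilities)


# Component figures from the text; each redundant group has two members.
io_cards = parallel([0.99995, 0.99995])
cpus = parallel([0.99996, 0.99996])
power_supplies = parallel([0.99994, 0.99994])
chassis = 0.999998  # assumed to be the only non-redundant component

system = serial([io_cards, cpus, power_supplies, chassis])
downtime = (1.0 - system) * MINUTES_PER_YEAR
print(f"system availability: {system:.8f}")
print(f"downtime: {downtime:.2f} minutes per year")
```

Under these assumptions, the redundant groups contribute almost nothing to the downtime. Nearly all of it comes from the single chassis, which is the one remaining single point of failure.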

You can obtain the MTBF component of the equation from your hardware manufacturer. If it is a network vendor, it most likely uses the Telcordia Parts Count Method, which is described in document TR-332 from http://www.telcordia.com/. Cisco lists MTBF information in its product data sheets, as do Juniper and others.
