Measuring availability is important to keeping your system highly available. Only by measuring availability can you understand how your application is performing now and examine how your application’s availability changes over time.
The most widely used method for measuring the availability of a web application is calculating the percentage of time it’s accessible for use by customers. We can describe this by using the following formula for a given period:
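A minimal way to write this out, where T_total (the total time in the period) and T_down (the time the site was unavailable) are symbol names chosen here for illustration and must be expressed in the same unit:

```latex
\text{Availability (\%)} = \frac{T_{\text{total}} - T_{\text{down}}}{T_{\text{total}}} \times 100
```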
Let’s consider an example. Suppose that over the month of April, your website was down twice; the first time it was down for 37 minutes, and the second time it was down for 15 minutes. What is the availability of your website?
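April has 30 days, or 43,200 minutes, and the site was down for 52 minutes in total. Your site availability is therefore (43,200 - 52) / 43,200, or approximately 99.88% (99.8796% to four decimal places).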
You can see from this example that it only takes a small amount of outage to have an impact on your availability percentage.
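The arithmetic for the April example can be sketched in a few lines of Python. The function name `availability_percent` and its parameters are illustrative choices, not part of any library, and the sketch assumes both durations are given in minutes.

```python
def availability_percent(total_minutes: float, downtime_minutes: float) -> float:
    """Return the percentage of the period during which the site was up."""
    if total_minutes <= 0:
        raise ValueError("total_minutes must be positive")
    if not 0 <= downtime_minutes <= total_minutes:
        raise ValueError("downtime_minutes must be between 0 and total_minutes")
    return (total_minutes - downtime_minutes) / total_minutes * 100


# April has 30 days, which is 43,200 minutes.
APRIL_MINUTES = 30 * 24 * 60

# The two outages from the example, in minutes.
outages = [37, 15]

april = availability_percent(APRIL_MINUTES, sum(outages))
print(f"{april:.2f}%")  # prints 99.88%
```

Changing the `outages` list shows how quickly a few short incidents erode the monthly figure.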
Often you will hear availability described as “the nines.” This is a shorthand way of indicating high availability percentages. Table 3-1 illustrates what it means.
Nines | Percentage | Monthly outage (a) |
---|---|---|
2 Nines | 99% | 432 minutes |
3 Nines | 99.9% | 43 minutes |
4 Nines | 99.99% | 4 minutes |
5 Nines | 99.999% | 26 seconds |
6 Nines | 99.9999% | 2.6 seconds |

(a) This assumes a 30-day month with 43,200 minutes in the month.
In Example 3-1, we see that the website has fallen just short of the 3 nines metric (about 99.88% compared to 99.9%). For a website that maintains 5 nines of availability, there can be only 26 seconds of downtime every month.
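The monthly outage figures in Table 3-1 follow directly from the number of nines. As a rough sketch, assuming the same 30-day month as the table (the function name `allowed_downtime_seconds` and its `period_days` parameter are hypothetical):

```python
def allowed_downtime_seconds(nines: int, period_days: int = 30) -> float:
    """Return the downtime budget, in seconds, for a given number of nines."""
    if nines < 1:
        raise ValueError("nines must be at least 1")
    # 2 nines (99%) leaves 1% unavailable, 3 nines leaves 0.1%, and so on.
    unavailable_fraction = 10 ** -nines
    period_seconds = period_days * 24 * 60 * 60
    return period_seconds * unavailable_fraction


for n in range(2, 7):
    seconds = allowed_downtime_seconds(n)
    print(f"{n} nines: {seconds / 60:.2f} minutes ({seconds:.2f} seconds)")
```

For 5 nines this gives 25.92 seconds, which the table rounds to 26 seconds.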
What’s a reasonable availability number for your system to be considered highly available?
It is impossible to give a single answer to this question because it depends heavily on your website, your customer expectations, your business needs, and your business expectations. You need to determine for yourself what number is required for your business.
For basic web applications, 3 nines is often considered acceptable availability. According to Table 3-1, this amounts to 43 minutes of downtime every month. For a web application to be considered highly available, 5 nines is often used as the benchmark. This amounts to only 26 seconds of downtime every month.
Don’t be fooled into thinking your site is highly available when it isn’t. Planned and regular maintenance that makes your application unavailable still counts against availability.
Here’s a comment that I often overhear: “Our application never fails. That’s because we regularly perform system maintenance. By scheduling weekly two-hour maintenance windows, and performing maintenance during these windows, we keep our availability high.”
Does this group keep its application’s availability high?
Let’s find out.
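A week has 168 hours, so two hours of maintenance every week leaves at most 166 hours of uptime. Without having a single failure of its application, the best this organization can achieve is 98.8% availability (166 / 168). This falls short of even 2 nines of availability (98.8% versus 99%).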
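The same check can be run for any maintenance schedule. This is a minimal sketch that assumes the application is unavailable for the entire window and never fails otherwise; `best_case_availability` is an illustrative name.

```python
HOURS_PER_WEEK = 7 * 24  # 168 hours


def best_case_availability(maintenance_hours_per_week: float) -> float:
    """Return the highest possible availability (%) given weekly planned downtime."""
    if not 0 <= maintenance_hours_per_week <= HOURS_PER_WEEK:
        raise ValueError("maintenance hours must be between 0 and 168")
    uptime_hours = HOURS_PER_WEEK - maintenance_hours_per_week
    return uptime_hours / HOURS_PER_WEEK * 100


weekly = best_case_availability(2)
print(f"{weekly:.1f}%")  # prints 98.8%
print("Meets 2 nines:", weekly >= 99.0)  # prints Meets 2 nines: False
```

Trying smaller windows makes the cost concrete: under these assumptions, even a 15-minute weekly window caps availability below 3 nines.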
Planned maintenance hurts nearly as much as unplanned outages. If your customer expects your application to be available and it isn’t, your customer has a negative experience. It doesn’t matter if you planned for the outage or not.