Timeouts


Computer systems use timeouts to help maintain state information. Because digital computers are discrete systems, you need some mechanism to generate an event and to detect that an expected event did not complete. You can use timeouts for cluster heartbeats, arbitration cycles that determine whether a processor has hung, and so forth.
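As an illustration of the heartbeat case, the following is a minimal sketch, not taken from any particular cluster product. The class name `HeartbeatMonitor`, its methods, and the node names are assumptions made for this example. Time is passed in explicitly so that the behavior is deterministic.

```python
from typing import Dict, List


class HeartbeatMonitor:
    """Hypothetical monitor: a node is suspected once it has been
    silent for longer than `timeout` (same time unit as `now`)."""

    def __init__(self, timeout: float) -> None:
        self.timeout = timeout
        self.last_seen: Dict[str, float] = {}

    def heartbeat(self, node: str, now: float) -> None:
        # Record the time of the most recent heartbeat from this node.
        self.last_seen[node] = now

    def suspected(self, now: float) -> List[str]:
        # A node is suspected only if its silence exceeds the timeout.
        return sorted(
            node
            for node, seen in self.last_seen.items()
            if now - seen > self.timeout
        )


monitor = HeartbeatMonitor(timeout=3.0)
monitor.heartbeat("node1", now=0.0)
monitor.heartbeat("node2", now=0.0)
monitor.heartbeat("node1", now=2.5)   # node1 keeps reporting; node2 goes silent
print(monitor.suspected(now=4.0))     # ['node2']
```

The monitor cannot distinguish a failed node from a slow one. It only observes that no heartbeat arrived within the timeout.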

Caution

It is tempting to reduce timeout values to detect component failures faster. However, timeout values that are too small destabilize the system. Ensuring that the system remains stable when timeouts are changed requires due diligence and detailed engineering analysis.

Timeouts are also opportunities for defects. In particular, the stability of a system can be undermined if you design a series of nested timeouts incorrectly. FIGURE 1-3 is a nested timing diagram example.

Figure 1-3. Nested Timing Diagram—Example

In this example, component A has a state change at time t₁ that causes a state change in component B at time t₂. Component B then has a state change at time t₃ that causes a state change in component A at t₄. This timing diagram is analogous to a host, A, issuing a read data command to a disk drive, B. This system has deterministic, sequential behavior. However, if a failure occurs in component B between t₂ and t₃, the state change of component A at t₄ will never occur.

Component A will hang. If a timeout is implemented, component A can detect the failure of component B. FIGURE 1-4 shows the failure of B and the point at which A times out.

Figure 1-4. Nested Timing Diagram With Timeout

In this case, A implements an internal timeout, A_to. At time t₁, a state change in component A starts the timeout counter, A_to, at t₂ and causes a state change in B at t₃. At time t₄, component B fails, never to recover. When A_to later expires, A changes its state. Because both A and A_to are part of component A, A knows that an error condition occurred in component B.
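The interaction can be sketched in a few lines of Python. This is an illustration only: the function names `component_a` and `component_b`, the parameters `service_time` and `a_to`, and the use of a thread and a queue to stand in for the host and the disk drive are all assumptions made for this example.

```python
import queue
import threading
import time


def component_b(replies: queue.Queue, service_time: float, fails: bool) -> None:
    """Hypothetical component B: replies after service_time unless it has failed."""
    if fails:
        return  # B fails and never recovers, so no reply is ever sent
    time.sleep(service_time)
    replies.put("data")


def component_a(service_time: float, a_to: float, b_fails: bool) -> str:
    """Hypothetical component A: send a request to B and wait at most a_to seconds."""
    replies: queue.Queue = queue.Queue()
    worker = threading.Thread(
        target=component_b, args=(replies, service_time, b_fails), daemon=True
    )
    worker.start()  # the state change in A causes work to start in B
    try:
        replies.get(timeout=a_to)  # A_to bounds how long A waits for B
        return "completed"
    except queue.Empty:
        # Without the timeout, A would block here forever.
        return "timeout: B presumed failed"


if __name__ == "__main__":
    print(component_a(service_time=0.01, a_to=0.5, b_fails=False))
    print(component_a(service_time=0.01, a_to=0.05, b_fails=True))
```

Note that A only presumes that B failed. From A's point of view, a B that never answers and a B that answers too late look the same.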

Stable System

FIGURE 1-5 shows a stable system implementing timeouts. In this case, the component timeout of A is greater than the service time of component B.

Figure 1-5. Stable System With Timeout

Unstable System

FIGURE 1-6 shows an unstable system implementing timeouts. In this case, the timeout value of component A is less than the service time of component B.

Figure 1-6. Unstable System With Timeout

The system is unstable because it detects false errors. The action taken by component A, because of the presumed failure of component B, determines the stability of the overall system.
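The difference between the stable and unstable cases comes down to which event reaches A first: the reply from B or the expiry of A's own timeout. The sketch below makes that comparison explicit. The function name `outcome` and the numeric values are assumptions chosen for illustration.

```python
def outcome(timeout: float, service_time: float, b_failed: bool) -> str:
    """What A concludes about one request to B (all times in the same unit)."""
    if b_failed:
        return "true error"   # B never replies, so the timeout fires correctly
    if service_time < timeout:
        return "ok"           # stable case: B replies before the timeout expires
    return "false error"      # unstable case: the timeout expires while B is still working


# Stable: the timeout of A is greater than the service time of B.
print(outcome(timeout=100, service_time=40, b_failed=False))   # ok

# Unstable: the timeout of A is less than the service time of B.
print(outcome(timeout=30, service_time=40, b_failed=False))    # false error

# A real failure of B is detected in either configuration.
print(outcome(timeout=100, service_time=40, b_failed=True))    # true error
```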

Stability Problems

The stability problems inherent in complex computer systems using timeouts are difficult to predict or prove mathematically because of the multivariate and nonlinear nature of these systems.

Timeouts that are too long increase the time to detect errors. Timeouts that are too short generate false error conditions. False error conditions cause unnecessary failovers.
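To make the trade-off concrete, the following sketch sweeps several candidate timeout values over a set of service times for a healthy component. The service times, the candidate timeouts, and the function name `sweep` are hypothetical values invented for this example, not measurements.

```python
from typing import List, Tuple


def sweep(
    service_times: List[float], timeouts: List[float]
) -> List[Tuple[float, float, float]]:
    """For each candidate timeout, return
    (timeout, false_error_rate, detection_time).

    false_error_rate: fraction of healthy requests that would be
    reported as failures because they take longer than the timeout.
    detection_time: how long a real failure goes unnoticed when the
    component never replies, which is simply the timeout itself.
    """
    rows = []
    for timeout in timeouts:
        false_errors = sum(1 for s in service_times if s > timeout)
        rows.append((timeout, false_errors / len(service_times), timeout))
    return rows


# Hypothetical service times of a healthy component, in milliseconds.
service_times_ms = [20, 35, 50, 80, 120]

for timeout, rate, detect in sweep(service_times_ms, [40, 100, 500]):
    print(f"timeout={timeout} ms  false errors={rate:.0%}  detection time={detect} ms")
```

With these numbers, the 40 ms timeout reports 60% of healthy requests as failures, the 100 ms timeout reports 20%, and the 500 ms timeout reports none but takes five times longer than the 100 ms setting to notice a real failure.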

Timeouts are a source of systems integration errors. For components that use timeouts, the default timeout values are set according to the expected uses of each component. When components are combined to build a larger system, their timeouts can interact and cause system instability. You must understand the component timeouts and their effect on event detection.
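One way to review component timeouts during integration is to check that each layer waits longer than the worst case of the layer beneath it. The sketch below assumes a simple model in which a layer's worst case is its timeout multiplied by its number of attempts. That model, the layer names, and the values are assumptions for illustration and do not describe any particular product.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Layer:
    name: str
    timeout: float  # seconds this layer waits for the layer below it
    attempts: int   # tries before reporting failure upward (1 means no retry)

    def worst_case(self) -> float:
        # Longest time this layer can take before it reports a failure.
        return self.timeout * self.attempts


def check_nesting(layers: List[Layer]) -> List[str]:
    """Layers are ordered outermost first. Returns one warning for each
    outer timeout that can expire before the inner layer has finished."""
    warnings = []
    for outer, inner in zip(layers, layers[1:]):
        if outer.timeout <= inner.worst_case():
            warnings.append(
                f"{outer.name} timeout ({outer.timeout}s) can expire before "
                f"{inner.name} finishes its worst case ({inner.worst_case()}s)"
            )
    return warnings


# Hypothetical stack with default values chosen by each component in isolation.
stack = [
    Layer("cluster monitor", timeout=30, attempts=1),
    Layer("volume manager", timeout=20, attempts=1),
    Layer("disk driver", timeout=10, attempts=3),
]

for warning in check_nesting(stack):
    print(warning)
```

In this hypothetical stack, each default looks reasonable alone, yet the volume manager can give up after 20 seconds while the disk driver may still be retrying for up to 30 seconds.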
