Circuit breaker pattern

The circuit breaker pattern can prevent an application from repeatedly trying an operation that is likely to fail. The circuit breaker wraps the calls to a service. It can handle faults that might take a variable amount of time to recover from when connecting to a remote service or resource. This can improve the stability and resiliency of an application.

The problem description: Remote connectivity is common in a distributed application. Calls to remote application services can fail because of transient faults such as slow network connections, timeouts, service unavailability, or heavy load on the service. Being transient, these faults typically correct themselves after a short period of time. The retry pattern suggests that a robust cloud application can handle these transient faults by retrying the failed operation, so that service requests are still fulfilled.

However, there can also be situations in which the faults are due to bigger issues. The severity ranges from a temporary loss of connectivity to the complete failure of the service. Here, it is pointless to keep retrying an operation that is unlikely to succeed. Instead, the application has to accept the failure quickly and handle it in a graceful manner.

If the requested service is very busy, a failure in one part of the system might lead to cascading failures, and the whole system could break down.

Generally, an operation that invokes a service is configured to implement a timeout and to reply with a failure message if the service fails to respond within the indicated time period. However, this strategy could cause many concurrent requests to the same operation to be blocked until the timeout period expires. These blocked requests might hold critical system resources such as memory, threads, database connections, and so on. Eventually, the resources could become exhausted, causing failure of other associated and even unrelated system components. It is preferable for the operation to fail immediately and to attempt to invoke the service again only if it is likely to succeed. Setting a shorter timeout might help to resolve this problem, but the timeout has to be chosen carefully: if it is too short, the operation may fail most of the time, even when the request to the service would eventually have succeeded.

The solution approach: The solution is the proven circuit breaker pattern, which can prevent an application from repeatedly trying to execute an operation that's likely to fail. This allows the application to continue without waiting for the fault to be fixed or wasting CPU cycles while it determines that the fault is long lasting. The circuit breaker pattern also enables an application to detect whether the fault has been resolved. If the problem appears to have been fixed, the application can try to invoke the operation again.

The retry pattern enables an application to retry an operation in the expectation that it will succeed. On the other hand, the circuit breaker pattern prevents an application from performing an operation that is likely to fail. An application can combine these two patterns by using the retry pattern to invoke an operation through a circuit breaker. However, the retry logic should be highly sensitive to any exceptions returned by the circuit breaker and abandon retry attempts if the circuit breaker indicates that a fault is not transient. Also, a circuit breaker acts as a proxy for operations that might fail. The proxy should monitor the number of recent failures that have occurred, and use this information to decide whether to allow the operation to proceed, or simply return an exception immediately. The proxy can be implemented as a state machine with the following states:

  • Closed: This is the initial state of the circuit breaker. In this state, the circuit breaker passes requests through to the service, and a counter tracks the number of recent failures. If the failure count exceeds the threshold within a given time period, the circuit breaker switches to the open state.
  • Open: In this state, the circuit breaker fails all requests immediately without calling the service. The application instead has to make use of a mitigation path such as reading data from a replica database or simply returning an error to the user. When the circuit breaker switches to the open state, it starts a timer. When the timer expires, the circuit breaker switches to the half-open state.
  • Half-open: In this state, the circuit breaker lets a limited number of requests go through to the service. If they succeed, the service is assumed to be recovered and the circuit breaker switches back to the original closed state. Otherwise, it reverts to the open state. The half-open state prevents a recovering service from suddenly being inundated with a series of service requests.
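
The three states above can be sketched in a few lines of Python. This is only an illustrative sketch, not a production implementation: the names CircuitBreaker, CircuitOpenError, failure_threshold, and recovery_timeout are hypothetical, the counter tracks consecutive failures rather than failures within a time window, the half-open state admits a single trial call, and the class is not thread-safe.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without invoking the service."""


class CircuitBreaker:
    """Hypothetical proxy that wraps calls to an unreliable operation."""

    def __init__(self, failure_threshold=3, recovery_timeout=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold  # failures before tripping
        self.recovery_timeout = recovery_timeout    # seconds to stay open
        self.clock = clock                          # injectable for testing
        self.state = "closed"
        self.failures = 0
        self.opened_at = None

    def call(self, operation, *args, **kwargs):
        if self.state == "open":
            if self.clock() - self.opened_at < self.recovery_timeout:
                # Fail fast: the service is not called at all.
                raise CircuitOpenError("circuit is open; call rejected")
            # The timer has expired, so let one trial call through.
            self.state = "half-open"
        try:
            result = operation(*args, **kwargs)
        except Exception:
            self._on_failure()
            raise
        self._on_success()
        return result

    def _on_success(self):
        # Any success (including a half-open trial) closes the circuit.
        self.failures = 0
        self.state = "closed"

    def _on_failure(self):
        self.failures += 1
        # A failed trial call, or too many failures, opens the circuit.
        if self.state == "half-open" or self.failures >= self.failure_threshold:
            self.state = "open"
            self.opened_at = self.clock()
```

A caller would wrap each remote invocation, for example breaker.call(fetch_profile, user_id) where fetch_profile is a placeholder for the real service call, and treat CircuitOpenError as the signal to take the mitigation path.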

The circuit breaker pattern helps to keep the system stable while it recovers from a failure and minimizes the impact on the system's performance. It can help to maintain the response time of the system by quickly rejecting a request for an operation that is likely to fail rather than waiting for the operation to time out. If the circuit breaker raises an event each time it changes state, this information can be used to monitor the health of the part of the system protected by the circuit breaker or to alert an administrator when a circuit breaker trips to the open state.
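
One way to raise an event on each state change is a simple listener list, as in the Python sketch below. The names ObservableState, subscribe, transition, and alert_on_open are hypothetical, and the logging call merely stands in for whatever monitoring or alerting channel an application actually uses.

```python
import logging

logger = logging.getLogger("circuit_breaker")


class ObservableState:
    """Holds the breaker state and notifies listeners on every transition."""

    def __init__(self, initial="closed"):
        self.state = initial
        self.listeners = []

    def subscribe(self, listener):
        # A listener is any callable taking (old_state, new_state).
        self.listeners.append(listener)

    def transition(self, new_state):
        if new_state == self.state:
            return  # not a change, so no event is raised
        old_state, self.state = self.state, new_state
        for listener in self.listeners:
            listener(old_state, new_state)


def alert_on_open(old_state, new_state):
    # Stand-in for alerting an administrator when the breaker trips.
    if new_state == "open":
        logger.warning("circuit tripped: %s -> %s", old_state, new_state)


history = []
tracker = ObservableState()
tracker.subscribe(alert_on_open)
tracker.subscribe(lambda old, new: history.append((old, new)))

tracker.transition("open")
tracker.transition("open")       # ignored: the state did not change
tracker.transition("half-open")
tracker.transition("closed")
```

After these calls, history holds the three transitions in order, which is the kind of record a health dashboard could consume.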

The pattern is highly customizable and can be adapted according to the type of the possible failure. For example, it is possible to apply an increasing timeout timer to a circuit breaker. We can place the circuit breaker in the open state for a few seconds initially and, if the failure hasn't been resolved by then, increase the timeout to a few minutes, and so on. In some cases, rather than having the open state return a failure and raise an exception, it could be useful to return a default value that is meaningful to the application.
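
Both customizations can be illustrated with two small Python helpers. This is a sketch under assumed values: the function names, the 5-second base, the doubling factor, and the 300-second cap are all hypothetical choices, not values prescribed by the pattern.

```python
def open_duration(consecutive_trips, base=5.0, factor=2.0, cap=300.0):
    """Seconds to stay open after the Nth consecutive trip (N starts at 1).

    The duration grows geometrically from a few seconds and is capped
    at a few minutes.
    """
    return min(cap, base * factor ** (consecutive_trips - 1))


def call_with_default(operation, is_open, default):
    """Return a meaningful default instead of failing while the circuit is open."""
    if is_open():
        return default  # the service is not called
    return operation()


print(open_duration(1))   # 5.0
print(open_duration(3))   # 20.0
print(open_duration(10))  # 300.0 (capped)

# While the circuit is open, the caller receives the cached value.
print(call_with_default(lambda: "live data", lambda: True, "cached data"))
```

The cap matters because an uncapped geometric series would soon keep the breaker open far longer than any realistic recovery time.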

In summary, this pattern is used to prevent an application from trying to invoke a remote service or access a shared resource if this operation is highly likely to fail. This pattern is not recommended:

  • For handling access to local private resources in an application, such as an in-memory data structure
  • As a substitute for handling exceptions in the business logic of your applications

The circuit breaker pattern is becoming very common with microservices, which are emerging as a preferred way of partitioning massive applications and presenting each one as an organized collection of services.
