Monitoring and diagnostics

Continuous, tool-assisted monitoring of applications is crucial for achieving resiliency. If something drags, lags, or fails, the operations team has to be informed immediately, with all the relevant details, so that it can decide on and take the correct course of action. Monitoring a large-scale distributed system is especially challenging. With the widespread acceptance of the divide-and-conquer technique, the number of moving parts in any enterprise-scale application has grown steadily and sharply. Today, as part of this compartmentalization, virtualization and containerization are widely accepted and adopted. The number of VMs in any IT environment is growing. Furthermore, because containers are lightweight, the number of containers used to run any mission-critical application has escalated rapidly. In short, monitoring bare metal servers, VMs, and containers precisely is a real challenge for operations teams. Also, every kind of software and hardware generates log files, resulting in massive volumes of operational data. It has become common to subject all sorts of operational data to analytics in order to extract actionable insights. Not only are IT systems distributed, but they are also extremely dynamic. The monitoring, measurement, and management complexities of tomorrow's data centers and server farms are steadily climbing.

Monitoring is not the same as failure detection. For example, our application might detect a transient error and retry, resulting in no downtime. But it should also log the retry operation so that we can monitor the error rate, in order to get an overall picture of application health.
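The retry-and-log idea can be illustrated with a minimal Python sketch. It is only one possible shape, not a prescribed implementation: the names TransientError, RetryStats, and call_with_retry are hypothetical and introduced here for illustration, and the error rate is assumed to mean transient errors divided by total attempts.

```python
import logging
import time

logger = logging.getLogger("resilience")


class TransientError(Exception):
    """Hypothetical marker for errors that are worth retrying."""


class RetryStats:
    """Counters that a monitoring system could scrape or export."""

    def __init__(self):
        self.attempts = 0
        self.transient_errors = 0

    def error_rate(self):
        # Assumed definition: share of attempts that hit a transient error.
        if self.attempts == 0:
            return 0.0
        return self.transient_errors / self.attempts


def call_with_retry(operation, stats, max_attempts=3, delay_seconds=0.0):
    """Run operation, retrying on TransientError and logging every retry."""
    for attempt in range(1, max_attempts + 1):
        stats.attempts += 1
        try:
            return operation()
        except TransientError as exc:
            # The retry hides the failure from users, so record it
            # explicitly; otherwise monitoring never sees the error.
            stats.transient_errors += 1
            logger.warning(
                "transient error on attempt %d/%d: %s",
                attempt, max_attempts, exc,
            )
            if attempt == max_attempts:
                raise  # retries exhausted: surface the failure
            time.sleep(delay_seconds)
```

With this shape, an operation that fails twice and then succeeds causes no visible downtime, yet the counters and the log still show two transient errors in three attempts, which is the signal the operations team needs to judge application health.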

A resiliency strategy is essential to ensure the service resiliency of IT systems and business applications. As enterprises increasingly embrace the cloud model, cloud service providers are focusing on enhancing the resiliency of their cloud servers, storage, and networks. Application developers are also quickly learning the tricks and techniques needed to build resilient applications. With the combination of resilient infrastructures, platforms, and applications, the days of resilient IT, which is mandatory for agile, dynamic, productive, and adaptive businesses, are not too far away.
