Monitoring Your Services

In the previous chapter, we tested services that interact with each other in isolation. But when something goes wrong in a real deployment, we need a global overview of what's going on. For example, when a microservice calls another one, which in turn calls a third one, it can be hard to understand which one failed. We need to be able to track down all the interactions a particular user had with the system that led to the problem.

Python applications can emit logs to help you debug issues, but jumping from one server to another to gather all the information you need to understand a problem is tedious. Thankfully, we can centralize all the logs to monitor a distributed deployment.
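For instance, the standard library's logging module can ship records to a remote collector instead of a local file. The following is only a minimal sketch of the idea; the host logs.example.com and port 514 are placeholders, and the actual tooling we use for centralization is introduced later in this chapter:

    import logging
    import logging.handlers

    # Send log records to a central syslog collector instead of a local file.
    # "logs.example.com" and port 514 are placeholders for your own log host.
    handler = logging.handlers.SysLogHandler(address=("logs.example.com", 514))
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
    )

    logger = logging.getLogger("myservice")
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)

    logger.info("User %s triggered a payment", "alice")

With every service configured this way, all records end up in one place, where they can be searched and correlated.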

Continuously monitoring services is also important to assess the health of the whole system and to follow how everything behaves. This involves answering questions such as: Is there a service that's dangerously approaching 100% RAM usage? How many requests per minute is that particular microservice handling? Do we have too many servers deployed for that API, and can we remove a few boxes to reduce costs? Did a change we just deployed affect performance adversely?

To be able to answer questions like these continuously, every microservice we deploy needs to be instrumented to report its key metrics to a monitoring system.
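As an illustration, here is a minimal sketch of what such instrumentation can look like, using the third-party prometheus_client package. This package is an assumption for the example only; the tools used in this chapter are presented later:

    from prometheus_client import Counter, start_http_server

    # Counter tracking how many requests this microservice has handled.
    REQUESTS = Counter("myservice_requests_total", "Total requests handled")

    def handle_request():
        # Increment the metric every time a request is processed.
        REQUESTS.inc()

    if __name__ == "__main__":
        # Expose the metrics on port 8000 so a monitoring system can scrape them.
        start_http_server(8000)

A monitoring system can then collect these numbers from every instance and aggregate them into dashboards and alerts.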

This chapter is organized into two main sections:

  • Centralizing logs
  • Performance metrics

By the end of the chapter, you will have a full understanding of how to set up your microservices so that you can monitor them.
