Monitoring

Next, we are going to take a look at monitoring our containers and also our Docker hosts. In Chapter 4, Managing Containers, we discussed the docker container top and docker container stats commands. You may recall that both of these commands show only real-time information; no historical data is kept.

While this is great if you are trying to debug a problem as it is happening, or want to quickly get an idea of what is going on inside your containers, it is not much help if you need to look back at a problem after the fact. Maybe you have configured your containers to restart if they become unresponsive; while that will help with the availability of your application, it won't help you work out why your container became unresponsive in the first place.

We are going to be looking at quite a complex setup. There are alternatives to this configuration, and also, there is ongoing development within the Docker community to make the configuration that we are going to be looking at more straightforward over the next few releases of Docker Engine, but more on that later.

In the GitHub repository at https://github.com/russmckendrick/mastering-docker, in the /chapter12 folder, there is a folder called prometheus containing a Docker Compose file that launches four different containers. Together, these containers provide the services we need to monitor not only our containers but also our Docker hosts.

Rather than looking at the Docker Compose file itself, let's take a look at the visualization:

As you can see, there is a lot going on; the four services we are running are:

- cadvisor (https://github.com/google/cadvisor)
- node-exporter (https://github.com/prometheus/node_exporter)
- prometheus (https://prometheus.io/)
- grafana (https://grafana.com/)

Before we launch and configure our Docker Compose services, we should talk about why each one is needed, starting with cadvisor.

As you may have noticed from the URL, cadvisor is a project released by Google. The service section in the Docker Compose file looks like the following:

cadvisor:
  image: google/cadvisor:latest
  container_name: cadvisor
  volumes:
    # mount the host's filesystem, run state, and Docker
    # installation (read-only apart from /var/run) so cadvisor
    # can gather stats about the containers running on the host
    - /:/rootfs:ro
    - /var/run:/var/run:rw
    - /sys:/sys:ro
    - /var/lib/docker/:/var/lib/docker:ro
  restart: unless-stopped
  expose:
    # reachable by other containers in the stack, not published
    - 8080

As you can see, we are mounting the various parts of our host's filesystem to allow cadvisor access to our Docker installation in much the same way as we did in Chapter 8, Portainer. The reason for this is that in our case, we are going to be using cadvisor to collect stats on our containers. While it can be used as a standalone container-monitoring service, we do not want to publicly expose the cadvisor container; instead, we are just making it available to other containers within our Docker Compose stack.

While cadvisor is a self-contained web frontend for the docker container stats command, displaying graphs and allowing you to drill down from your Docker host into your containers through an easy-to-use interface, it doesn't keep more than five minutes' worth of metrics. As we want to be able to look at metrics hours or even days later, we are going to need additional tools to record the metrics it gathers.

cadvisor exposes the information we want to record about our containers as structured data at the following endpoint: http://cadvisor:8080/metrics/. We will look at why this is important in a moment.
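
If you are curious about what that structured data looks like, the following is a representative few lines in the Prometheus exposition format; the exact metrics, labels, and values will of course differ on your host:

# HELP container_cpu_usage_seconds_total Cumulative cpu time consumed in seconds.
# TYPE container_cpu_usage_seconds_total counter
container_cpu_usage_seconds_total{id="/docker/d41d8cd9...",name="cadvisor"} 24.61

Each line pairs a metric name and a set of labels with a value; this is the raw material that the services we launch next will collect and store.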

The next service we are launching is node-exporter. Its service definition in the Docker Compose file looks like the following:

node-exporter:
  container_name: node-exporter
  image: prom/node-exporter
  volumes:
    # read-only views of the host's process, system, and root
    # filesystems so node-exporter can report on the host itself
    - /proc:/host/proc:ro
    - /sys:/host/sys:ro
    - /:/rootfs:ro
  command: '-collector.procfs=/host/proc -collector.sysfs=/host/sys -collector.filesystem.ignored-mount-points="^(/rootfs|/host|)/(sys|proc|dev|host|etc)($$|/)" -collector.filesystem.ignored-fs-types="^(sys|proc|auto|cgroup|devpts|ns|au|fuse.lxc|mqueue)(fs|)$$"'
  expose:
    - 9100

Again, we are mounting various parts of the Docker host's filesystem, but this time you will notice that we are not mounting /var/lib/docker/: we are not using node-exporter to monitor our containers, so all we need is information about the CPU, RAM, and disk utilization of the host machine itself. The command key passes flags to node-exporter as the process starts, saving us from having to bake a configuration file into the container or mount one in.

Again, we are only exposing the node-exporter port to our other Docker Compose services. Like cadvisor, it acts as nothing more than an endpoint, http://node-exporter:9100/metrics/, that exposes the stats from our Docker host.
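
Purely as an illustration of what node-exporter publishes at that endpoint, here are a few representative lines in the same exposition format; metric names can vary between node-exporter releases:

# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.21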

Both the cadvisor and node-exporter endpoints are scraped automatically by our next service, prometheus. This is where most of the heavy lifting happens: Prometheus is a monitoring tool written and open sourced by SoundCloud (https://soundcloud.com/). Its service definition in the Docker Compose file looks like the following:

prometheus:
  image: prom/prometheus
  container_name: prometheus
  volumes:
    - ./prometheus.yml:/etc/prometheus/prometheus.yml
    - prometheus_data:/prometheus
  restart: unless-stopped
  expose:
    - 9090
  depends_on:
    - cadvisor
    - node-exporter

As you can see from the preceding service definition, we are mounting a configuration file and also have a volume called prometheus_data. The configuration file contains information about the sources we want to scrape, as you can see from the following configuration:

global:
  scrape_interval: 15s
  evaluation_interval: 15s
  external_labels:
    monitor: 'monitoring'

rule_files:

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'cadvisor'
    static_configs:
      - targets: ['cadvisor:8080']

  - job_name: 'node-exporter'
    static_configs:
      - targets: ['node-exporter:9100']

We are instructing Prometheus to scrape data from our endpoints every 15 seconds. The endpoints are defined in the scrape_configs section; as you can see, we have cadvisor and node-exporter defined in there, as well as Prometheus itself. The reason we are creating and mounting the prometheus_data volume is that Prometheus is going to store all of our metrics in it, so we need that data to persist beyond the life of the container.
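
Once the stack is up, a quick way to sanity-check the scrape jobs is to query the up metric, which returns 1 for every target whose last scrape succeeded. The sketch below assumes the default Compose network is named prometheus_default, based on the folder name (check with docker network ls), as port 9090 is only exposed within the stack:

$ docker run --rm --network prometheus_default appropriate/curl \
    -s 'http://prometheus:9090/api/v1/query?query=up'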

At its core, Prometheus is a time-series database. It takes the data it has scraped, processes it to find the metric name and value, and then stores it along with a timestamp.

For more information on the data model used by Prometheus, visit https://prometheus.io/docs/concepts/data_model/.

Prometheus also comes with a powerful query engine and API, making it the perfect database for this kind of data. While it does come with basic graphing capabilities, it is recommended that you use Grafana, which is our final service and also the only one to be exposed publicly.
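
To give you a flavour of its query language, PromQL, the following sketch reuses the pattern from the previous check (again, the network name is an assumption) to ask the HTTP API for each container's CPU usage rate averaged over the last five minutes; the -g flag stops curl from treating the square brackets as a globbing pattern:

$ docker run --rm --network prometheus_default appropriate/curl \
    -sg 'http://prometheus:9090/api/v1/query?query=rate(container_cpu_usage_seconds_total[5m])'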

Grafana is an open source tool for displaying monitoring graphs and metric analytics, which allows you to create dashboards using time-series databases such as Graphite (https://graphiteapp.org), InfluxDB (https://www.influxdata.com), and also Prometheus. There are also further backend database options that are available as plugins.

The Docker Compose definition for grafana follows a similar pattern to our other services:

grafana:
  image: grafana/grafana
  container_name: grafana
  volumes:
    - grafana_data:/var/lib/grafana
  env_file:
    - grafana.config
  restart: unless-stopped
  ports:
    - 3000:3000
  depends_on:
    - prometheus

We are using the grafana_data volume to store Grafana's own internal configuration database, and rather than storing the environment variables in the Docker Compose file, we are loading them from an external file called grafana.config.

The variables are as follows:

GF_SECURITY_ADMIN_USER=admin
GF_SECURITY_ADMIN_PASSWORD=password
GF_USERS_ALLOW_SIGN_UP=false

As you can see, we are setting the username and password here, so having them in an external file means that you can change these values without editing the core Docker Compose file.
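
These variables follow Grafana's standard GF_<SECTION>_<KEY> naming convention, which maps directly onto the sections and keys of its grafana.ini configuration file. For example, you could add a line such as the following, shown here purely as an illustration, to the same grafana.config file:

GF_SERVER_ROOT_URL=http://localhost:3000/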

Now that we know the role each of the four services fulfills, let's launch them. To do this, simply run the following command from the /chapter12/prometheus/ folder:

$ docker-compose up -d

This will create a network and the volumes and pull the images from the Docker Hub. It will then go about launching the four services:
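
You should see output along the following lines; the exact names assume your project folder is called prometheus, and the format varies slightly between Docker Compose releases:

Creating network "prometheus_default" with the default driver
Creating volume "prometheus_prometheus_data" with default driver
Creating volume "prometheus_grafana_data" with default driver
Creating cadvisor ... done
Creating node-exporter ... done
Creating prometheus ... done
Creating grafana ... done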

You may be tempted to go straight to http://localhost:3000/; if you do, you won't see anything yet, as Grafana takes a few minutes to initialize itself. You can follow its progress by following the logs:

$ docker-compose logs -f grafana

The output of the command is given here:

Once you see the "Initializing HTTP Server" message, Grafana will be available. There is, however, one more thing we need to do before we access Grafana, and that is to configure our data source. We can do this by running the following command, which is for macOS and Linux only, to set up the data source using the API:

$ curl 'http://admin:password@localhost:3000/api/datasources' \
    -X POST \
    -H 'Content-Type: application/json;charset=UTF-8' \
    --data-binary '{"name":"Prometheus","type":"prometheus","url":"http://prometheus:9090","access":"proxy","isDefault":true}'

The preceding command is available in the repository's README.md file, so you don't have to type it out; you can copy and paste it instead.
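
If you would like to confirm that the data source was created, you can list the configured data sources using the same API:

$ curl 'http://admin:password@localhost:3000/api/datasources'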

If you are unable to configure the data source using the API, then don't worry; Grafana will help you add one when you log in, which is what we are now going to do. Open your browser and enter http://localhost:3000/, and you should be greeted with a login screen:

Enter the username admin and the password password. Once logged in, if you have configured the data source, you should see the following page:

If you haven't added the data source, you will be prompted to do so; follow the onscreen instructions, using the details from the curl command further up. Once the data source has been added, you will be taken to the New Dashboard screen.

Click on Home in the top left and you will be shown a menu. On the right-hand side is an Import Dashboard button; click on it to be taken to a page that will ask you to Upload .json File, enter a Grafana.com Dashboard or paste JSON. Enter the number 893 into the Grafana.com Dashboard box and click on the Load button. You will be taken to the import options:

Leave everything at its default options and select your data source, which should be Prometheus, the last option. Once this is complete, click on the Import button.

This will import the Docker and system monitoring dashboard prepared by Thibaut Mottet, which has been built with cadvisor, node-exporter, and Prometheus in mind. Once imported, you should see something similar to the following:

As you can see, I have over 2 hours of metrics stored, which I can explore by clicking on the various graphs as well as the display options in the top right of the screen. For more information on the Grafana dashboard, go to the dashboards project page at https://grafana.com/dashboards/893/.

I have already mentioned that this is a complex solution; eventually, Docker will expand the recently released built-in endpoint, which presently only exposes information about the Docker Engine and not the containers themselves. For more information on the built-in endpoint, check out the official Docker documentation, which can be found at https://docs.docker.com/engine/admin/prometheus/.
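
If you would like to experiment with the built-in endpoint, it is currently gated behind an experimental flag in the daemon configuration. A minimal sketch of /etc/docker/daemon.json would look like the following; the address and port are configurable, you will need to restart the Docker daemon for the change to take effect, and the metrics then become available at http://127.0.0.1:9323/metrics:

{
  "metrics-addr": "127.0.0.1:9323",
  "experimental": true
}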

There are other monitoring solutions out there; most of them take the form of third-party Software as a Service (SaaS) products:

There are also other self-hosted options, such as:

As you can see, there are a few well-established monitoring solutions listed; in fact, you may already be using one of them, so it would be easy to take your containers into account when expanding your existing monitoring configuration.

Once you have finished exploring your Prometheus installation, don't forget to remove it by running:

$ docker-compose down --volumes --rmi all

This removes all of the containers, volumes, images, and network.
