Chapter 5: Using Multi-Container Pods and Design Patterns

Running complex applications on Kubernetes often requires you to run not one but several containers in the same Pod. The strength of Kubernetes lies in its ability to create Pods made up of several containers, managed as a single unit. We will focus on those Pods in this chapter by studying the different aspects of hosting several containers in the same Pod, as well as having these containers communicate with each other.

So far, we've only created Pods running a single container: those were the simplest forms of Pods, and you'll use them to manage the simplest of applications. We also discovered how to update and delete them by running simple Create, Read, Update, Delete (CRUD) operations against those Pods using the kubectl command-line tool.

Besides mastering the basics of CRUD operations, you also learned how to access a running Pod inside a Kubernetes cluster.

In this chapter, we will push all of this one step further and discover how to manage Pods when they are meant to launch not one but several containers. The good news is that everything you learned previously will also be valid for multi-container Pods: things won't differ much in terms of raw Pod management because updating and deleting Pods works the same, no matter how many containers the Pod contains.

Besides those basic operations, we are also going to cover how to access a specific container inside a multi-container Pod and how to access its logs. When a given Pod contains more than one container, you'll have to run some specific commands with specific arguments to access it, and that's something we are going to cover in this chapter.

We will also discover some important design patterns such as ambassador, sidecar, and adapter containers. You'll need to learn these architectures to effectively manage multi-container Pods. You'll also learn how to deal with volumes from Kubernetes. Docker also provides volumes, but in Kubernetes, they are used to share data between containers launched by the same Pod, and this is going to be an important part of this chapter. After this chapter, you're going to be able to launch complex applications inside Kubernetes Pods.

In this chapter, we're going to cover the following main topics:

  • Understanding what multi-container Pods are
  • Sharing volumes between containers in the same Pod
  • The ambassador design pattern
  • The sidecar design pattern
  • The adapter design pattern

Technical requirements

You will require the following prerequisites for this chapter:

  • A working kubectl command-line utility
  • A local or cloud-based Kubernetes cluster to practice with

Understanding what multi-container Pods are

In this section, we'll learn about the core concepts of Pods for managing several containers at once by discussing some concrete examples of multi-container Pods.

Then, we will create and delete a Pod made up of at least two containers. After that, we'll learn how to access a specific container within a multi-container Pod. Finally, we will learn how to access the logs of a specific container within a running Pod.

Concrete scenarios where you need multi-container Pods

You should group your containers into a Pod when they need to be tightly linked. More broadly, a Pod must correspond to an application or a process running in your Kubernetes cluster. If your application requires multiple containers to function properly, then those containers should be launched and managed through a single Pod.

When the containers are supposed to work together, you should group them into a single Pod. Keep in mind that a Pod cannot span across multiple worker nodes. So, if you create a Pod containing several containers, then all these containers will be created on the same worker node and the same Docker daemon installation.

To understand where and when to use multi-container Pods, take the example of two simple applications:

  • A log forwarder: In this example, imagine that you have deployed a web server such as NGINX that stores its logs in a dedicated directory. You might want to collect and forward these logs. For that, you could deploy something like a Splunk forwarder as a container within the same Pod as your NGINX server. These log forwarding tools forward logs from a source to a destination, and it is very common to deploy agents such as Splunk, Fluentd, or Filebeat to grab logs from a container and forward them to a central location such as an Elasticsearch cluster. In the Kubernetes world, this is generally achieved by running a multi-container Pod with one container dedicated to running the application and another one dedicated to grabbing the logs and sending them elsewhere. Having these two containers managed by the same Pod ensures that the application and its log forwarder are launched on the same node and at the same time.
  • A proxy server: Another typical use case of a multi-container Pod would be an application where you have an NGINX web server acting as a reverse proxy in front of an application. It is very common to use middleware such as NGINX to route web traffic to your actual web application by following some custom rules. By bundling the two containers in the same Pod, you'll get two containers running on the same node. You could also run a third container in the same Pod to forward the logs that are emitted by the two others to a central logging location! This is possible because Kubernetes places no limit on the number of containers you can have in the same Pod, so long as you have enough computing resources to run them all.

In general, every time several of your containers work together and are tightly coupled, you should group them in a multi-container Pod. Just with these two examples, it's easy to understand why such Pods are so powerful. Most of the Pods you'll launch while working with Kubernetes will probably handle more than one container.

Now, let's discover when to not create a multi-container Pod.

When not to create a multi-container Pod

Pods are especially useful when they are managing several containers, but they should not be seen as the go-to solution every time you need to set up a container.

The basic golden rule when designing a multi-container Pod is to keep in mind that all the containers declared in the Pod will be scheduled on the same worker node. If your containers do not need to be tightly coupled and can be deployed and scaled independently of each other, they should not share a Pod: give each of them its own Pod instead.

Now, let's discover how to create multi-container Pods. As you can imagine, we will have to use the kubectl command-line tool!

Creating a Pod made up of two containers

In the previous chapter, we discovered two syntaxes for manipulating Kubernetes objects:

  • The imperative syntax
  • The declarative syntax

Most of the Kubernetes objects we are going to discover in this book can be created or updated using these two methods, but unfortunately, this is not the case for multi-container Pods.

When you need to create a Pod containing multiple containers, you will need to go through the declarative syntax. This means that you will have to create a YAML file containing the declaration of your Pod and all the containers it will manage, and then apply it through kubectl create -f file.yaml or kubectl apply -f file.yaml.

You cannot create Pods with multiple containers imperatively with kubectl run: the declarative YAML file is required.

Consider the following YAML manifest file stored in ~/multi-container-Pod.yaml:

# ~/multi-container-Pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: multi-container-pod
spec:
  containers:
    - name: nginx-container
      image: nginx:latest
    - name: busybox-container
      image: busybox:latest

This YAML manifest will create a Kubernetes Pod made up of two containers: one based on the nginx:latest image and the other one based on the busybox:latest image.

As you can see, there is no dedicated kind: resource for multi-container Pods – just like when we created a single-container Pod, we are only using kind: Pod.

To create it, use the following command:

$ kubectl create -f multi-container-Pod.yaml

pod/multi-container-pod created

This will result in the Pod being created. The kubelet on the elected worker node will instruct the Docker daemon to pull both images and instantiate the two containers.

To check if the Pod was correctly created, we can run kubectl get Pods:

$ kubectl get Pods
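
You should see an output similar to the following (the AGE and RESTARTS values will differ on your cluster):

NAME                  READY   STATUS    RESTARTS   AGE
multi-container-pod   2/2     Running   0          30s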

Do you remember the role of kubelet from Chapter 2, Kubernetes Architecture – From Docker Images to Running Pods? This component runs on each node that is part of your Kubernetes cluster and is responsible for converting pod manifests received from kube-apiserver into actual containers.

This kubelet launches your actual Docker containers on the worker nodes, and it is the only component of Kubernetes that is directly interacting with the Docker daemon.

Important Note

Keep this note in mind because it is extremely important: all the containers that are declared in the same Pod will be scheduled, or launched, on the same worker node and the same Docker daemon. Pods cannot span multiple machines. All containers that are part of a Pod will be launched on the same worker node!

This is something extremely important: containers in the same Pod are meant to live together. If you terminate a Pod, all its containers will be killed together, and when you create a Pod, Kubelet will, at the very least, attempt to create all its containers together.

High availability is generally achieved by replicating multiple Pods over multiple nodes.

From a Kubernetes perspective, applying this file results in a fully working multi-container Pod made up of two containers, and we can make sure that the Pod is running with the two containers by running a standard kubectl get Pods command to fetch the Pod list from kube-apiserver.

Do you see the column that states 2/2? This is the number of containers inside the Pod. Here, this is saying that the two containers that are part of this Pod were successfully launched!

What happens when Kubernetes fails to launch one container in a Pod?

Kubernetes keeps track of all the containers that are launched in the same Pod. But it often happens that a specific container cannot be launched. One of the most common causes of this issue is a typo in one of the Docker images or tags specified in the Pod definition. Let's introduce such a typo in the YAML manifest to demonstrate how Kubernetes reacts when some containers of a specific Pod cannot be launched.

In the following example, I have defined a Docker image that does not exist at all for the NGINX container; note the nginx:i-do-not-exist Docker tag:

# ~/failed-multi-container-Pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: failed-multi-container-pod
spec:
  containers:
    - name: nginx-container
      image: nginx:i-do-not-exist
    - name: busybox-container
      image: busybox:latest

Now, we can apply this manifest using the kubectl create -f failed-multi-container-Pod.yaml command:

$ kubectl create -f failed-multi-container-Pod.yaml

pod/failed-multi-container-pod created

Here, you can see that the Pod was effectively created. This is because even though the image tag does not exist, the YAML remains valid from a Kubernetes perspective. So, Kubernetes simply creates the Pod and persists the entry into etcd, but we can easily imagine that kubelet will encounter an error when it launches the docker pull command to retrieve the image from Docker Hub.

Let's check the status of the Pod using kubectl get Pods:

$ kubectl get Pods

NAME                         READY   STATUS             RESTARTS   AGE

failed-multi-container-pod   0/2     CrashLoopBackOff   4          2m23s

As you can see, the status of the Pod is CrashLoopBackOff (depending on when you look, you may also see ErrImagePull or ImagePullBackOff for the failing image). This means that Kubernetes repeatedly fails to get the Pod's containers running and retries again and again, backing off between attempts. To find out why it's failing, you have to describe the Pod using the kubectl describe Pods/failed-multi-container-pod command:

$ kubectl describe Pods/failed-multi-container-pod

Warning  Failed     3m59s (x2 over 4m19s)  kubelet            Failed to pull image "nginx:i-do-not-exist": rpc error: code = Unknown desc = Error response from daemon: manifest for nginx:i-do-not-exist not found: manifest unknown: manifest unknown

  Warning  Failed     3m59s (x2 over 4m19s)  kubelet            Error: ErrImagePull

  Normal   Pulling    3m59s (x3 over 4m19s)  kubelet            Pulling image "busybox:latest"

  Normal   Created    3m58s (x3 over 4m17s)  kubelet            Created container busybox-container

It's a little bit hard to read, but by following this log, you can see that busybox-container is okay since kubelet has succeeded in creating it, as shown by the last line of the preceding output. But there's a problem with the other container; that is, nginx-container.

Here, you can see that the output error is ErrImagePull and, as you can guess, it's saying that the container cannot be launched because the docker pull command fails to retrieve the nginx:i-do-not-exist tag.

So, Kubernetes does the following:

  1. First, it creates the entry in etcd if the Pod described in the YAML file is valid.
  2. Then, it simply tries to launch the container.
  3. If an error is encountered, it will try to launch the failing container again and again.

The other containers may work properly, but your Pod will never be fully ready because of the failing container. After all, your app certainly needs that failing container to work properly; otherwise, it should not be there at all!

Now, let's learn how to delete a multi-container Pod.

Deleting a multi-container Pod

When you want to delete a Pod containing multiple containers, you have to go through the kubectl delete command, just like you would for a single-container Pod.

Then, you have two choices:

  • You specify the path to the YAML manifest file using the -f option.
  • You delete the Pod by its name, without using its YAML path.

The first way consists of specifying the path to the YAML manifest file. You can do so using the following command:

$ kubectl delete -f multi-container-Pod.yaml

Otherwise, if you already know the Pod's name, you can do this as follows:

$ kubectl delete Pods/multi-container-pod

$ # or equivalent

$ kubectl delete Pods multi-container-pod

To figure out the names of the Pods, you can use the kubectl get Pods command:

$ kubectl get Pods

NAME                         READY   STATUS             RESTARTS   AGE

failed-multi-container-pod   0/2     CrashLoopBackOff   9          22m

When I ran this, only failed-multi-container-pod existed in my cluster, which is why you can see just one line in my output. Keep in mind that these commands work at the Pod level, not the container level. Do not pass the name of a container, as this wouldn't work at all.

Here is how you can delete failed-multi-container-pod imperatively, without specifying the YAML file that created it:

$ kubectl delete Pods/failed-multi-container-pod

pod "failed-multi-container-pod" deleted

After a few seconds, the Pod is removed from the Kubernetes cluster and all its containers are removed from the Docker daemon and the worker node.

The amount of time between the moment the command is issued and the moment the Pod's name is deleted and released is called the grace period. Let's discover how to deal with it!

Understanding the Pod deletion grace period

One important concept related to deleting Pods is what is called the grace period. This is the time Kubernetes gives a Pod's containers to shut down gracefully before it removes the Pod and releases its name. Both single-container Pods and multi-container Pods have this grace period, which can be observed when you delete them. This grace period can be skipped by passing the --grace-period=0 --force option to the delete command.

The whole idea is that you cannot have two Pods with the same name running at the same time on your cluster, because the Pods' names are unique identifiers: that's why we use this as a parameter to identify a specific Pod when running the kubectl delete command, for example.

When the deletion is forced by setting a grace period of 0 with the --force flag, the Pod's name is immediately released and becomes available for another Pod to take. During an unforced deletion, by contrast, the grace period is respected, and the Pod's name is released only after the Pod is effectively deleted:

$ kubectl delete Pods/multi-container-pod --grace-period=0 --force

warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.

pod "multi-container-pod" force deleted

Keep in mind that this command should be used carefully. Forcefully deleting a Pod shouldn't be seen as the norm because, as the output states, you cannot be sure that the Pod was effectively deleted. If, for some reason, the Pod could not be deleted, it might keep running indefinitely, so do not run this command if you are not sure of what you're doing.
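
The default grace period is 30 seconds. You can also tune it per Pod in the YAML manifest through the optional terminationGracePeriodSeconds key; here is a minimal sketch reusing our usual two containers (the Pod name is just an illustration):

# ~/multi-container-pod-with-grace-period.yaml
apiVersion: v1
kind: Pod
metadata:
  name: multi-container-pod-with-grace-period
spec:
  terminationGracePeriodSeconds: 10 # Wait at most 10 seconds before killing the containers
  containers:
    - name: nginx-container
      image: nginx:latest
    - name: busybox-container
      image: busybox:latest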

Now, let's discover how to access a specific container that is running inside a multi-container Pod.

Accessing a specific container inside a multi-container Pod

When you have several containers in the same Pod, you can access each of them individually. Here, we will access the NGINX container of our multi-container Pod. Let's start by recreating it because we deleted it in our previous example:

$ kubectl create -f multi-container-Pod.yaml

pod/multi-container-pod created

To access a running container, you need to use the kubectl exec command, just like you need to use docker exec to launch a command in an already created container when using Docker without Kubernetes.

This command will ask for two important parameters:

  • The Pod that wraps the container you want to target
  • The name of the container itself, as entered in the YAML manifest file

We already know the name of the Pod because we can easily retrieve it with the kubectl get command. In our case, the Pod is named multi-container-pod.

However, we don't have the container's name because there is no kubectl get containers command that would allow us to list the running containers. This is why we will have to use the kubectl describe Pods/multi-container-pod command to find out what is contained in this Pod:

$ kubectl describe Pods/multi-container-pod

This command will show the names of all the containers contained in the targeted Pod. Here, we can see that our Pod is running two containers: one called busybox-container and another called nginx-container. The one we need is nginx-container.

Additionally, the following is a little command for listing all the container names contained in a dedicated Pod:

$ kubectl get Pods/multi-container-pod -o jsonpath="{.spec.containers[*].name}"

This command will spare you from using the describe command. However, it makes use of jsonpath, which is an advanced feature of kubectl: this command might look strange, but it mostly consists of a sort of filter that's applied to the command's output.

jsonpath filters are not easy to get right, so feel free to add this command as a bash alias or note it somewhere because it's a useful one.
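
Since an alias cannot take the Pod's name in the middle of the command, a small bash function works better. Here is a hypothetical one you could add to your ~/.bashrc (the list-containers name is just an illustration):

# List the container names of a given Pod
list-containers() {
  kubectl get "pods/$1" -o jsonpath='{.spec.containers[*].name}'
  echo # jsonpath does not print a trailing newline
}

You would then call it with the Pod's name, as in list-containers multi-container-pod.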

In any case, we can now see that we have those two containers inside the multi-container-pod Pod:

  • nginx-container
  • busybox-container

Now, let's access nginx-container. You have the name of the targeted container and the targeted Pod, so use the following command to access the container:

$ kubectl exec -ti multi-container-pod --container nginx-container -- /bin/bash

#

After running this command, you'll be inside nginx-container. Let's explain this command a little bit. kubectl exec does the same as docker exec.

When you run this command, you get the shell of the container, called nginx-container, inside the multi-container Pod, at which point you will be ready to run commands inside this very specific container on your Kubernetes cluster.

The main difference from the single container Pod situation is the --container option (the -c short option works too). You need to pass this option to tell kubectl what container you want to reach.

Now, let's discover how to run commands in the containers running in your Pods!

Running commands in containers

One powerful aspect of Kubernetes is that you can, at any time, access the containers running on your Pods to execute some commands. We did this previously, but did you know you can also execute any command you want directly from the kubectl command-line tool?

First, we are going to recreate the Pod containing the NGINX and Busybox containers:

$ kubectl create -f multi-container-Pod.yaml

pod/multi-container-pod created

To run a command in a container, you need to use kubectl exec, just like we did previously. But this time, you can drop the -ti parameter since you don't need kubectl to attach an interactive terminal session.

Here, I'm running the ls command to list files in nginx-container from the multi-container-pod Pod:

$ kubectl exec Pods/multi-container-pod --container nginx-container -- ls

This command will ask for two important parameters:

  • The name of the container, as specified in the YAML file
  • The name of the Pod that contains it

You can omit the container name, but if you do so, kubectl will default to the first container declared in the manifest. In our case, that is nginx-container because it was the first one to be declared in the YAML manifest file.

Once you have entered all these parameters, you have to separate the command you want to run from the rest of the arguments with a double dash (--).
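
For example, here is what omitting the container name could look like. Depending on your kubectl version, a notice tells you which container was selected by default (the exact wording varies):

$ kubectl exec Pods/multi-container-pod -- ls /usr/share/nginx/html
Defaulted container "nginx-container" out of: nginx-container, busybox-container
50x.html
index.html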

We will now discover how to override the commands that are run by your containers.

Overriding the default commands run by your containers

When using Docker, you have the opportunity to write files called Dockerfiles to build Docker images. A Dockerfile makes use of two keywords to define the command and arguments that containers built from the image will launch when they're created using the docker run command.

These two keywords are ENTRYPOINT and CMD:

  • ENTRYPOINT is the main command the Docker container will launch.
  • CMD provides the default arguments that are passed to the ENTRYPOINT command; these can be overridden at run time.

For example, a classic Dockerfile for an image that runs the sleep command for 30 seconds by default would be written like this:

# ~/Dockerfile
FROM busybox:latest
ENTRYPOINT ["sleep"]
CMD ["30"]

This is just plain old Docker, and you should be familiar with these concepts. As you may already know, the CMD arguments are the ones you can override when invoking the docker run command. If you build this image with the docker build command, you'll end up with a Busybox-based image that runs the sleep command (the ENTRYPOINT) for 30 seconds (the default CMD argument) when launched with docker run.

Thanks to the CMD instruction, you can override the default 30 seconds like so:

$ docker run my-custom-busybox:latest 60 # Sleep for 60 seconds

$ docker run my-custom-busybox:latest # Just sleep for the default 30 seconds

Kubernetes, on the other hand, allows us to override both ENTRYPOINT and CMD thanks to YAML pod definition files. To do so, you must append two optional keys to your YAML configuration file: command and args.

This is a very big benefit that Kubernetes brings you because you can decide to append arguments to the command that's run by your container's image, just like the CMD arguments do with bare Docker, or completely override its ENTRYPOINT!

Here, I'm going to write a new manifest file that will override the default ENTRYPOINT and CMD parameters of the busybox image to make the busybox container sleep for 60 seconds. Here is how to proceed:

# ~/nginx-busybox-with-custom-command-and-args.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-busybox-with-custom-command-and-args
spec:
  containers:
    - name: nginx-container
      image: nginx:latest
    - name: busybox-container
      image: busybox:latest
      command: ["sleep"] # Corresponds to the ENTRYPOINT
      args: ["60"] # Corresponds to CMD

This is a bit tricky to understand: what the Dockerfile calls ENTRYPOINT corresponds to the command key in the YAML manifest file, and what the Dockerfile calls CMD corresponds to the args key in the YAML manifest file.

What if you omit one of them? Kubernetes will default to what is inside the Docker image. If you omit the args key in the YAML, then Kubernetes will go for the CMD provided in the Dockerfile, while if you omit the command key, Kubernetes will go for the ENTRYPOINT declared in the Dockerfile. Most of the time, or at least if you're comfortable with your container's ENTRYPOINT, you're just going to override the args key (the CMD Dockerfile instruction).
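
As a minimal sketch (assuming the my-custom-busybox image built from the Dockerfile above, with ENTRYPOINT sleep and CMD 30), overriding only the args key keeps the image's ENTRYPOINT and replaces its default CMD:

  containers:
    - name: busybox-container
      image: my-custom-busybox:latest
      args: ["120"] # The ENTRYPOINT "sleep" is kept; the default CMD "30" becomes "120"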

Now, let's discover another feature: initContainers! In the next section, you'll see another way to execute some additional side containers in your Pod to configure the main ones.

Introducing initContainers

initContainers is a feature provided by Kubernetes Pods to run setup scripts before the actual containers start. You can think of them as additional side containers you can define in your Pod YAML manifest file: they will run first when the Pod is created. Then, once they complete, the Pod starts creating its main containers.

You can execute not one but several initContainers in the same Pod, but when you define lots of them, keep in mind that they will run one after another, not in parallel. Once an initContainer completes, the next one starts, and so on. In general, initContainers are used to pull application code from a Git repository and expose it to the main containers using volume mounts or to run start-up scripts.

Since initContainers can have their own Docker images, you can offload some configuration work to them and keep your main container images as small as possible, thus increasing the overall security of your setup by removing unnecessary tools from your main container images. Here is a YAML manifest that introduces an initContainer:

# ~/nginx-with-init-container.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-with-init-container
spec:
  initContainers:
  - name: my-init-container
    image: busybox:latest
    command: ["sleep", "15"]
  containers:
    - name: nginx-container
      image: nginx:latest

As you can see from this YAML file, the initContainer runs the busybox:latest image, which will sleep for 15 seconds before completing. Once the execution of the initContainer is complete, Kubernetes will create the NGINX container.

Important Note

Note that Kubernetes will not launch the main containers if an initContainer fails. That's why it is really important not to see initContainers as something optional or that is allowed to fail. They will always be executed if they are specified in the YAML manifest file, and if they keep failing, the main containers will never be launched!

Let's create the Pod. After that, we will run the kubectl get Pods -w command so that kubectl watches for changes in the Pod list. The output of the command will be updated regularly, showing the changes in the Pod's status. Please note the STATUS column, which shows that an initContainer is running:

$ kubectl create -f nginx-with-init-container.yaml

pod/nginx-with-init-container created

$ kubectl get Pods -w

NAME                        READY   STATUS     RESTARTS   AGE

nginx-with-init-container   0/1     Init:0/1   0          3s

nginx-with-init-container   0/1     PodInitializing   0          17s

nginx-with-init-container   1/1     Running           0          19s

As you can see, Init:0/1 indicates that initContainer is being launched. After its completion, the Init: prefix disappears for the next statuses, indicating that we are done with initContainer and that Kubernetes is now creating the main container – in our case, the NGINX one!

Use initContainer wisely when you're building your Pods! And remember: if you can avoid using them, do so. You are not forced to use them, but they can be really helpful for running configuration scripts or to pull something from external servers before you launch your actual containers! Now, let's learn how to access the logs of a specific container inside a running Pod!

Accessing the logs of a specific container

When using multiple containers in a single Pod, you can retrieve the logs of a dedicated container inside the Pod. The proper way to proceed is by using the kubectl logs command.

The most common way a containerized application exposes its logs is by writing them to stdout, which is basically what the kubectl logs command displays.

The kubectl logs command is capable of streaming the stdout of a dedicated container in a dedicated Pod and retrieving the application logs from it. For it to work, you will need to know the name of both the precise container and its parent Pod, just like when we used kubectl exec to access a specific container.

Please read the previous section, Accessing a specific container inside a multi-container Pod, to discover how to find these names:

$ kubectl logs -f Pods/multi-container-pod --container nginx-container

Please note the --container option (the -c short option works too), which specifies the container you want to retrieve the logs for. Note that it also works the same for initContainers: you have to pass its name to this option to retrieve its logs.
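
For example, to retrieve the logs of the initContainer from the nginx-with-init-container Pod we created earlier, you would pass its name the same way (our sleep-based initContainer emits no output, but the command is identical for a real one):

$ kubectl logs Pods/nginx-with-init-container --container my-init-container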

Important Note

Remember that if you do not pass the --container option, you will retrieve all the logs from all the containers that have been launched inside the Pod. Not passing this option is useful in the case of a single-container Pod, but you should consider this option every time you use a multi-container Pod.

There are multiple other useful options you need to be aware of when it comes to accessing the logs of a container in a Pod. You can decide to retrieve only the logs written in the last 2 hours by using the following command:

$ kubectl logs --since=2h Pods/multi-container-pod --container nginx-container

Also, you can use the --tail option to retrieve the most recent lines of a log's output. Here's how to do this:

$ kubectl logs --tail=30 Pods/multi-container-pod --container nginx-container

Here, we are retrieving the 30 most recent lines in the log output of nginx-container.

Now, you are ready to read and retrieve the logs from your Kubernetes Pods, regardless of whether they are made up of one or several containers!

In this section, we discovered how to create, update, and delete multi-container Pods. We also discovered how to force the deletion of a Pod. We then discovered how to access a specific container in a Pod, as well as how to retrieve the logs of a specific container in a Pod. Though we created an NGINX and a Busybox container in our Pod, they are relatively poorly linked since they don't do anything together. To remediate that, we will now learn how to deal with volumes so that we can share files between our two containers.

Sharing volumes between containers in the same Pod

In this section, we'll learn what volumes are from a Kubernetes point of view and how to use them. Docker also has a notion of volumes but it differs from Kubernetes volumes: they answer the same need but they are not the same.

In this section, we will discover what Kubernetes volumes are, why they are useful, and how they can help us share data between containers.

What are Kubernetes volumes?

We are going to solve a simple problem. Our multi-container Pod is currently made up of two containers: an NGINX one and a Busybox one. We are going to share the log directory of the NGINX container with the Busybox container by mounting a common directory in both containers. This way, we will create a relationship between the two containers by having them share a directory.

Kubernetes has two kinds of volumes:

  • Volumes, which we will discuss here.
  • PersistentVolume, which is a more advanced feature we will discuss later in Chapter 9, Persistent Storage in Kubernetes.

Keep in mind that these two are not the same. PersistentVolume is a resource of its own, whereas "volumes" is a Pod configuration. As the name suggests, PersistentVolume is persistent, whereas volumes are not supposed to be. But keep in mind that this is not always the case!

Simply put, volumes are storage-bound to the Pod's life cycle: when you create a Pod, you'll have the opportunity to create volumes and attach them to the container(s) inside the Pods. Volumes are nothing more than storage attached to the life cycle of the Pod. As soon as the Pod is deleted, the volumes that were created with it are deleted too.

Even though they are not limited to this use case and this is not always true, you can consider volumes as a particularly great way to share a directory and files between containers running in the same Pod.

Important Note

Remember that volumes are bound to the Pod's life cycle, not the container's life cycle. If a container crashes, the volume would survive because if a container crashes, it won't cause its parent Pod to crash, and thus, no volume will be deleted. So long as a Pod is alive, its volumes are too.

When Docker introduced the concept of volumes, it was just shared directories you could mount to a container. Kubernetes also built its volume feature around this idea and used volumes as shared directories.

But Kubernetes also brought support for a lot of drivers, which helps integrate Pods' volumes with external solutions. For example, an AWS EBS volume can be used as a Kubernetes volume. Here are some solutions among the most common ones that can be used as Kubernetes volumes:

  • awsElasticBlockStore
  • azureDisk
  • gcePersistentDisk
  • glusterfs
  • hostPath
  • emptyDir
  • nfs
  • persistentVolumeClaim (when you need to use a PersistentVolume, which is outside the scope of this chapter)

That is why I said it is not true to say that a volume is fully bound to the life cycle of a Pod. For example, a pod volume backed by an AWS EBS could survive a Pod being deleted because the backend provider (here, this is AWS) might have its own way of managing the storage life cycle. That is why we are going to purely focus on the simplest form of volumes for now.

Important Note

Please note that using external solutions to manage your Kubernetes volumes will require you to follow those external solutions' requirements. For example, using an AWS EBS volume as a Kubernetes volume will require your Pods to be executed on a Kubernetes worker node, which would be an EC2 instance. The reason for this is that AWS EBS volumes can only be attached to EC2 instances. Thus, a Pod exploiting such a volume would need to be launched on an EC2 instance.
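
To give you an idea of the syntax, here is a minimal sketch of how the in-tree awsElasticBlockStore driver is declared in a Pod's volumes section; the volume ID shown is a placeholder you would replace with a real EBS volume ID:

  volumes:
  - name: ebs-volume
    awsElasticBlockStore:
      volumeID: "vol-0123456789abcdef0" # Hypothetical EBS volume ID
      fsType: ext4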

We are going to discover the two most common volume drivers here: emptyDir and hostPath. We will also mention persistentVolumeClaim because this one is a little special compared to the other volumes and will be fully discovered in Chapter 9, Persistent Storage in Kubernetes.

Now, let's start discovering how to share files between containers in the same Pod using volumes with the emptyDir volume type!

Creating and mounting an emptyDir volume

The emptyDir volume type is certainly the most commonly used volume type. As the name suggests, it is simply an empty directory that is initialized at Pod creation and that you can mount at a location of your choice in each container running in the Pod.

It is certainly the easiest and simplest way to have your containers share data between them. Let's create a Pod that will manage two containers.

In the following example, I am creating a Pod that will launch two containers, and just like we had previously, it's going to be an NGINX container and a Busybox container. I'm going to override the command that's run by the Busybox container when it starts to prevent it from completing. That way, we will get it running indefinitely as a long process and we will be able to launch additional commands to check if our emptyDir has been initialized correctly.

Both containers will have a common volume mounted at /var/i-am-empty-dir-volume/, which will be our emptyDir volume, initialized in the same Pod. Here is the YAML file for creating the Pod:

# ~/two-containers-with-emptydir-Pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: two-containers-with-empty-dir
spec:
  containers:
    - name: nginx-container
      image: nginx:latest
      volumeMounts:
      - mountPath: /var/i-am-empty-dir-volume
        name: empty-dir-volume
    - name: busybox-container
      image: busybox:latest
      command: ["/bin/sh"]
      args: ["-c", "while true; do sleep 30; done;"] # Prevents busybox from exiting immediately
      volumeMounts:
      - mountPath: /var/i-am-empty-dir-volume
        name: empty-dir-volume
  volumes:
  - name: empty-dir-volume # Name of the volume
    emptyDir: {} # Initialize an empty directory

Note that the objects we create in our Kubernetes cluster will become more and more complex as we progress, and as you can imagine, the most complex things cannot be achieved with imperative commands alone. That's why you are going to see more and more examples relying on YAML manifest files: you should get into the habit of reading them to figure out what they do.

That being said, we can now apply the manifest file using the following kubectl create -f command:

$ kubectl create -f two-containers-with-emptydir-Pod.yaml

pod/two-containers-with-empty-dir created

Now, we can check that the Pod is successfully running by issuing the kubectl get Pods command:

$ kubectl get Pods

NAME                            READY   STATUS    RESTARTS   AGE

two-containers-with-empty-dir   2/2     Running   0          47s

Now that we are sure the Pod is running and that both the NGINX and Busybox containers have been launched, we can check that the directory can be accessed in both containers by issuing the ls command.

As we saw previously, we can run the ls command in the containers by simply running the kubectl exec command. As you may recall, the command takes the Pod's name and the container's name as arguments. We are going to run it twice to make sure the volume is mounted in both containers:

$ kubectl exec two-containers-with-empty-dir --container nginx-container -- ls /var

i-am-empty-dir-volume

$ kubectl exec two-containers-with-empty-dir --container busybox-container -- ls /var

i-am-empty-dir-volume

As you can see, the ls /var command is showing the i-am-empty-dir-volume directory in both containers! This means that emptyDir was initialized and mounted in both containers correctly.

Now, let's create a file in one of the two containers. The file should be immediately visible in the other container, proving that the volume mount is working properly!

In the following command, I am simply creating a .txt file called hello-world.txt in the mounted directory:

$ kubectl exec two-containers-with-empty-dir --container nginx-container -- /bin/sh -c "echo 'hello world' >> /var/i-am-empty-dir-volume/hello-world.txt"

$ kubectl exec two-containers-with-empty-dir --container nginx-container -- cat /var/i-am-empty-dir-volume/hello-world.txt

hello world

$ kubectl exec two-containers-with-empty-dir --container busybox-container -- cat /var/i-am-empty-dir-volume/hello-world.txt

hello world

As you can see, I used nginx-container to create the /var/i-am-empty-dir-volume/hello-world.txt file, which contains the "hello world" string. Then, I simply used the cat command to access the file from both containers; you can see that the file is accessible in both cases. Again, remember that emptyDir volumes are completely tied to the life cycle of the Pod. If the Pod declaring it is destroyed, then the volume is destroyed too, along with all its content, and it will become impossible to recover!

Now, we will discover another volume type: the hostPath volume. As you can imagine, it's going to be a directory that you can mount on your containers that is backed by a path on the host machine – the worker node running the Pod!

Creating and mounting a hostPath volume

The hostPath volume is also a common volume type. As its name suggests, it will allow you to mount a directory in the host machine to containers in your Pod! The host machine is the Kubernetes worker node executing the Pod. Here are some examples:

  • If your cluster is based on Minikube (a single-node cluster), the host is the Minikube node running on your local machine.
  • On Amazon EKS, the host machine will be an EC2 instance.
  • In a Kubeadm cluster, the host machine is generally a standard Linux machine.

The host machine is the machine running the Pod, and you can mount a directory from the filesystem of the host machine to the Kubernetes Pod!

In the following example, I'll be working on a Kubernetes cluster based on Minikube, so hostPath will be a directory that's been created on my computer that will then be mounted in a Kubernetes Pod.

Important Note

Using the hostPath volume type can be useful, but you have to keep in mind that it creates a strong relationship between the worker node and the Pods running on top of it. In the Kubernetes world, you can consider it as an anti-pattern.

The whole idea behind Pods is that they are supposed to be easy to delete and rescheduled on another worker node without problems. Using hostPath will create a tight relationship between the Pod and the worker node, and that could lead to major issues if your Pod were to fail and be rescheduled on a node where the required path on the host machine is not present.

Now, let's discover how to create hostPath.

Let's imagine that I have a file in the /tmp directory of my worker node and I want to make it available at /var/config in the nginx container.

Here is the YAML file to create the setup. As you can see, I declared a hostPath volume at the bottom of the file that defines a path that should be present on my host machine. Now, I can mount it on any container that needs to deal with the volume in the containers block:

# ~/multi-container-Pod-with-host-path.yaml
apiVersion: v1
kind: Pod
metadata:
  name: multi-container-pod-with-host-path
spec:
  containers:
    - name: nginx-container
      image: nginx:latest
      volumeMounts:
      - mountPath: /var/config
        name: my-host-path-volume
    - name: busybox-container
      image: busybox:latest
      command: ["/bin/sh"]
      args: ["-c", "while true; do sleep 30; done;"] # Prevents busybox from exiting immediately
  volumes:
  - name: my-host-path-volume
    hostPath:
      path: /tmp # The path on the worker node

As you can see, mounting the volume is just like what we did in the previous section with the emptyDir volume type. By using a combination of volumes at the Pod level and volumeMounts at the container level, you can mount a volume on your containers.

You can also mount the directory on the busybox container so that it gets access to the directory on the host, as shown in the snippet after this paragraph.
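
Here is a minimal sketch of the volumeMounts block you would add under busybox-container to do so:

    - name: busybox-container
      image: busybox:latest
      command: ["/bin/sh"]
      args: ["-c", "while true; do sleep 30; done;"]
      volumeMounts:
      - mountPath: /var/config
        name: my-host-path-volume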

Before applying the YAML manifest file, though, you need to create the necessary file on the host (if your Minikube cluster runs in a VM or a container, run this from inside the node, for example through minikube ssh):

$ echo "Hello World" >> /tmp/hello-world.txt

Now that the path exists on the host machine, we can apply the YAML file to our Kubernetes cluster and, immediately after, launch a kubectl get Pods command to check that the Pod was created correctly:

$ kubectl create -f multi-container-Pod-with-host-path.yaml

pod/multi-container-pod-with-host-path created

$ kubectl get Pods

NAME                                 READY   STATUS    RESTARTS   AGE

multi-container-pod-with-host-path   2/2     Running   0          92s

Everything seems good! Now, let's cat the file, which should be visible at /var/config/hello-world.txt inside the nginx container.
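
Here is what this check could look like, reusing kubectl exec as before:

$ kubectl exec multi-container-pod-with-host-path --container nginx-container -- cat /var/config/hello-world.txt
Hello World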

In the first part of this chapter, we discovered the different aspects of multi-container Pods! We discovered how to create, update, and delete multi-container Pods, how to use initContainers, how to access logs, how to override Docker commands directly from the Pod's resources, and how to share directories between containers using the two most basic volume types. Now, we are going to put a few architecting principles together and discover some notions related to multi-container Pods called "patterns."

The ambassador design pattern

When designing a multi-container Pod, you can decide to follow some architectural principles to build your Pod. Some typical needs are answered by these design principles, and the ambassador pattern is one of them.

Here, we are going to discover what the ambassador design pattern is, how to build an ambassador container in Kubernetes Pods, and look at a concrete example of them.

What is the ambassador design pattern?

In essence, the ambassador design pattern applies to multi-container Pods. We can define two containers in the same Pod:

  • The first container will be called the main container.
  • The other container will be called the ambassador container.

In this design pattern, we assume that the main container might have to access external services to communicate with them. For example, you can have an application that must interact with a SQL database that is living outside of your Pod, and you need to reach this database to retrieve data from it.

This is the typical use case where you can deploy an ambassador container alongside the main container, in the same Pod. The whole idea is to get the ambassador to proxy the requests run by the main container to the database server. The ambassador container will essentially be a SQL proxy. Every time the main container wants to access the database, it won't access it directly but rather create a connection to the ambassador container, which will play the role of a SQL proxy.

Important Note

Running an ambassador container is fine, but only if the external service is not living in the same Kubernetes cluster. To run requests against another Pod, Kubernetes provides a strong mechanism called services. We will have the opportunity to discover them in Chapter 7, Exposing Your Pods with Services.

But why would you need a proxy to access external databases? Here are some concrete benefits this design pattern can bring you:

  • Offloading SQL configuration
  • Management of SSL/TLS certificates

Please note that an ambassador is not limited to being a SQL proxy, but this example is representative of what this design pattern can bring you. Note that the ambassador proxy is only supposed to be used for outbound connections from your main container to something else, such as data storage or an external API. It should not be seen as an entry point to your cluster! Now, let's quickly discover how to create a SQL ambassador with a YAML file.

A simple example of an ambassador multi-container Pod

Now that we know about ambassador containers, let's learn how to create one with Kubernetes. The following YAML manifest file creates a Pod made up of two containers:

  • nginx-container, derived from the nginx:latest image
  • mysql-proxy-ambassador-container, created from the mysql-proxy:latest Docker image:

# ~/nginx-with-ambassador.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-with-ambassador
spec:
  containers:
    - name: mysql-proxy-ambassador-container
      image: mysql-proxy:latest
      ports:
      - containerPort: 3306
      env:
      - name: DB_HOST
        value: mysql.xxx.us-east-1.rds.amazonaws.com
    - name: nginx-container
      image: nginx:latest

As you can imagine, it's going to be the developer's job to get the application code running in the NGINX container to query the ambassador instead of the Amazon RDS endpoint directly. As the ambassador container can be configured through environment variables, it's going to be easy for you to just input the configuration variables in the ambassador container.

Important Note

Do not get tricked by the order of the containers in the YAML file. The fact that the ambassador container appears first does not make it the main container of the Pod. This notion of the main container does not exist at all from a Kubernetes perspective – both are plain Docker containers that run in parallel with no concept of a hierarchy between them. Here, we just access the Pod from the NGINX container, which makes it the most important one.

Remember that the ambassador running in the same Pod as the NGINX container makes it accessible from NGINX on localhost:3306!
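
For instance, the application served by NGINX could be configured with hypothetical settings such as these, pointing at the ambassador rather than at the real database host:

    - name: nginx-container
      image: nginx:latest
      env:
      - name: DB_HOST
        value: "127.0.0.1" # The ambassador container, reachable on the Pod's localhost
      - name: DB_PORT
        value: "3306"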

The sidecar design pattern

The sidecar design pattern is an extremely useful one. It is good for when you want to extend your main container with features it would normally not be able to achieve on its own.

Just like we did for the ambassador container, we're going to explain exactly what it is and then cover a concrete example.

What is the sidecar design pattern?

Think of the sidecar container as an extension or a helper for your main container. Its main purpose is to extend the main container by bringing it a new feature without changing anything about it. Unlike with the ambassador design pattern, the main container may not even be aware of the sidecar's presence.

Just like the ambassador design pattern, the sidecar design pattern makes use of at least two containers:

  • The main container, the one that is running the application
  • The sidecar container, the one that is bringing something additional to the first one

You may have already guessed, but this pattern is especially useful when you want to run monitoring or log forwarder agents.

There are three things to understand when you want to build a sidecar that is going to forward your logs to another location:

  • You must locate the directory where your main containers write their logs.
  • You must create a volume to make this directory accessible to the log forwarder sidecar.
  • You must launch the sidecar container with the proper configuration.

Based on these concepts, the main container remains unchanged, and even if the sidecar fails, it wouldn't have an impact on the main container, which could continue to work.

Now, we're going to use an example YAML file for the sidecar design pattern.

A simple example of a sidecar multi-container Pod

Just like the ambassador design pattern, the sidecar makes use of multi-container Pods. We will define two containers in the same Pod: the main NGINX container and a log forwarder sidecar, sharing a log directory through a volume, as shown in the sketch below.
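
Since the exact image depends on your logging stack, here is a minimal sketch using busybox as a stand-in forwarder that simply streams the NGINX access log from a shared emptyDir volume; a real setup would run a Fluentd, Filebeat, or Splunk forwarder image with its own configuration:

# ~/nginx-with-sidecar.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-with-sidecar
spec:
  containers:
    - name: nginx-container # The main container
      image: nginx:latest
      volumeMounts:
      - mountPath: /var/log/nginx # Where NGINX writes its logs
        name: log-volume
    - name: log-forwarder-sidecar # The sidecar container
      image: busybox:latest
      command: ["/bin/sh"]
      # Wait for the log file to appear, then stream it (stand-in for a real forwarder)
      args: ["-c", "while [ ! -f /var/log/nginx/access.log ]; do sleep 1; done; tail -f /var/log/nginx/access.log"]
      volumeMounts:
      - mountPath: /var/log/nginx # The same directory, shared through the volume
        name: log-volume
  volumes:
  - name: log-volume
    emptyDir: {}

Note that the main container remains completely unchanged: the sidecar gets access to the logs purely through the shared volume.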

The adapter design pattern

The last design pattern that we are going to discover here is the adapter design pattern. As its name suggests, it's going to adapt an entry from a source format to a target format.

What is the adapter design pattern?

The adapter design pattern is the last paradigm we are going to discover in this chapter. As with the ambassador and sidecar design patterns, this one expects that you run at least two containers:

  • The first one is the main container.
  • The second one is the adapter container.

This design pattern is helpful and should be used whenever the main containers emit data in a format, A, that should be sent to another application that is expecting the data in another format, B. As the name suggests, the adapter container is here to adapt.

Again, this design pattern is especially well-suited for log or monitoring management. Imagine a Kubernetes cluster where you have dozens of applications running; they are writing logs in Apache format that you need to convert into JSON so that they can be indexed by a search engine. This is exactly where the adapter design pattern comes into play. Running an adapter container next to the application containers will help you get these logs adapted to the target format before they are sent somewhere else.

Just like for the sidecar design pattern, this one can only work if both the containers in your Pod are accessing the same directory using volumes.

A simple example of an adapter multi-container Pod

In this example, I'm going to use a Pod that runs an adapter container with a shared directory mounted as a Kubernetes volume.

This Pod is going to run two containers:

  • nginx-container, derived from the nginx:latest image
  • adapter-container, created from the ubuntu:latest Docker image:

# ~/nginx-with-adapter.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-with-adapter
spec:
  containers:
    - name: nginx-container
      image: nginx:latest
      volumeMounts:
      - mountPath: /var/log/nginx # Where NGINX writes its logs
        name: log-volume
    - name: adapter-container
      image: ubuntu:latest
      command: ["/bin/bash"]
      # A placeholder transformation: a real adapter would convert the log
      # format here (for example, from Apache format to JSON)
      args: ["-c", "while true; do cat /var/log/nginx/access.log > /var/log/nginx/adapted.log 2>/dev/null; sleep 10; done;"]
      volumeMounts:
      - mountPath: /var/log/nginx # The same directory, shared through the volume
        name: log-volume
  volumes:
  - name: log-volume
    emptyDir: {}

Please note that both containers mount the same volume, which provides access to the shared directory from both containers.

Summary

This chapter was quite a big one, but you should now have a good understanding of what Pods are and how to use them, especially when it comes to managing multiple Docker containers in the same Pod. Since microservice applications are often made up of several containers and not just one, you will rarely get by with single-container Pods alone in your cluster.

I recommend that you focus on mastering the declarative way of creating Kubernetes resources. As you have noticed in this chapter, the key to achieving the most complex things with Kubernetes resides in writing YAML files. One example is that you simply cannot create a multi-container Pod without writing YAML files.

This chapter completes the previous one: Chapter 4, Running Your Docker Containers. You need to understand that everything we will do with Kubernetes will be Pod management because everything in Kubernetes revolves around them. Keep in mind that containers are never created directly, but always through a pod object, and that all the containers within the same Pod are created on the same worker node. If you understand that, then you can continue to the next chapter!

The next chapter is going to introduce two of the most important objects, ConfigMaps and Secrets, as we continue to dig into the core concepts of Kubernetes. In Kubernetes, we consider that applications and their configurations should be treated as two completely distinct things to improve application portability: that is why we have the pod resource, which lets us create the application container, and the ConfigMaps and Secrets objects, which are there to help us inject configuration data into our pods.
