Chapter 14. Init Container

Init Containers enable separation of concerns by providing a separate lifecycle for initialization-related tasks distinct from the main application containers. In this chapter, we look closely at this fundamental Kubernetes concept that is used in many other patterns when initialization logic is required.

Problem

Initialization is a widespread concern in many programming languages. Some languages have it covered as part of the language, and some use naming conventions and patterns to indicate a construct as the initializer. For example, in the Java programming language, to instantiate an object that requires some setup, we use the constructor (or static blocks for fancier use cases). Constructors are guaranteed to run as the first thing within the object, and they are guaranteed to run only once by the managing runtime (this is just an example; we don’t go into detail here on the different languages and corner cases). Moreover, we can use the constructor to validate preconditions such as mandatory parameters. We also use constructors to initialize the instance fields with incoming arguments or default values.

Init Containers are similar, but at the Pod level rather than the class level. So if you have one or more containers in a Pod that represent your main application, these containers may have prerequisites before starting up, such as setting up special permissions on the filesystem, setting up a database schema, or installing application seed data. Also, this initializing logic may require tools and libraries that cannot be included in the application image, or, for security reasons, the application image may not have permissions to perform the initializing activities. Alternatively, you may want to delay the startup of your application until an external dependency is satisfied. For all these kinds of use cases, Kubernetes uses init containers as the implementation of this pattern, which allow separation of initializing activities from the main application duties.

Solution

Init Containers in Kubernetes are part of the Pod definition, and they separate all containers in a Pod into two groups: init containers and application containers. All init containers are executed in a sequence, one by one, and all of them have to terminate successfully before the application containers are started up. In that sense, init containers are like constructor instructions in a Java class that help object initialization. Application containers, on the other hand, run in parallel, and the startup order is arbitrary. The execution flow is demonstrated in Figure 14-1.

Figure 14-1. Init and application containers in a Pod

Typically, init containers are expected to be small, run quickly, and complete successfully, except when an init container is used to delay the start of a Pod while waiting for a dependency, in which case it may not terminate until the dependency is satisfied. If an init container fails, the whole Pod is restarted (unless it is marked with a restartPolicy of Never), causing all init containers to run again. Thus, to prevent any side effects, making init containers idempotent is a good practice.
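For instance, a dependency-waiting init container typically polls until the dependency becomes reachable. The following fragment is a minimal sketch, assuming a hypothetical Service named my-database and a busybox image that provides nslookup:

   initContainers:
   - name: wait-for-database
     image: busybox
     command:
     - sh
     - -c
     - "until nslookup my-database; do echo waiting for my-database; sleep 2; done"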

On one hand, init containers have all the same capabilities as application containers: all the containers are part of the same Pod, so they share resource limits, volumes, and security settings and end up placed on the same node. On the other hand, they have slightly different health-checking and resource-handling semantics. There is no readiness check for init containers, as they must run to completion before the Pod startup can continue with the application containers.

Init containers also affect the way Pod resource requirements are calculated for scheduling, autoscaling, and quota management. Given the ordering of execution in a Pod (first, the init containers run in a sequence, then all application containers run in parallel), the effective Pod-level request and limit values become the higher of the following two values:

  • The highest init container request/limit value

  • The sum of all application container values for request/limit

A consequence of this behavior is that if you have init containers with high resource demands and application containers with low resource demands, the Pod-level request and limit values that affect scheduling are based on the higher value of the init containers. This setup is not resource-efficient: the init containers run only for a short period, and the capacity they reserve on the node sits unused for the majority of the time, yet no other Pod can use it.
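To put numbers on it, consider a hypothetical Pod whose init container requests one CPU core while its two application containers request 200 millicores each. The effective request used for scheduling is max(1000m, 200m + 200m) = 1000m, even though only 400m are needed once initialization completes. The container names and values below are illustrative only:

   initContainers:
   - name: init
     resources:
       requests:
         cpu: "1"        # highest init container request: 1000m
   containers:
   - name: app-1
     resources:
       requests:
         cpu: 200m       # application containers together request 400m
   - name: app-2
     resources:
       requests:
         cpu: 200m
   # effective Pod request for scheduling: max(1000m, 400m) = 1000m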

Moreover, init containers enable separation of concerns and allow keeping containers single-purposed. An application container can be created by the application engineer and focus on the application logic only. A deployment engineer can author an init container and focus on configuration and initialization tasks only. We demonstrate this in Example 14-1, which has one application container based on an HTTP server that serves files.

The container provides a generic HTTP-serving capability and does not make any assumptions about where the files to serve might come from for the different use cases. In the same Pod, an init container provides Git client capability, and its sole purpose is to clone a Git repo. Since both containers are part of the same Pod, they can access the same volume to share data. We use the same mechanism to share the cloned files from the init container to the application container.

Example 14-1 shows an init container that copies data into an empty volume.

Example 14-1. Init Container
apiVersion: v1
kind: Pod
metadata:
  name: www
  labels:
    app: www
spec:
  initContainers:
  - name: download
    image: axeclbr/git
    command:                                             1
    - git
    - clone
    - https://github.com/mdn/beginner-html-site-scripted
    - /var/lib/data
    volumeMounts:                                        2
    - mountPath: /var/lib/data
      name: source
  containers:
  - name: run
    image: docker.io/centos/httpd
    ports:
    - containerPort: 80
    volumeMounts:                                        2
    - mountPath: /var/www/html
      name: source
  volumes:                                               3
  - emptyDir: {}
    name: source
1. Clone an external Git repository into the mounted directory.

2. Shared volume used by both the init container and the application container.

3. Empty directory used on the node for sharing data.

We could have achieved the same effect by using a ConfigMap or a PersistentVolume, but we wanted to demonstrate how init containers work here. This example illustrates a typical usage pattern of an init container sharing a volume with the main container.
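While initialization is in progress, the Pod status reports the init phase, and the init container's logs can be fetched by container name. Assuming the Pod from Example 14-1, you can observe this as follows:

   kubectl get pod www            # STATUS shows Init:0/1 until the init container finishes
   kubectl logs www -c download   # logs of the init container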

Keep a Pod running

For debugging the outcome of init containers, it helps if the command of the application container is replaced temporarily with a dummy sleep command so that you have time to examine the situation. This trick is particularly useful if your init container fails to start up and your application fails to start because the configuration is missing or broken. The following command within the Pod declaration gives you an hour to debug the volumes mounted by entering the Pod with kubectl exec -it <pod> -- sh:

   command:
   - /bin/sh
   - "-c"
   - "sleep 3600"

A similar effect can be achieved by using a Sidecar as described next in Chapter 15, where the HTTP server container and the Git container are running side by side as application containers. But with the Sidecar approach, there is no way of knowing which container will run first, and Sidecar is meant to be used when containers run side by side continuously (as in Example 15-1, where the Git synchronizer container continuously updates the local folder). We could also use a Sidecar and Init Container together if both a guaranteed initialization and a constant update of the data are required.
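As a rough sketch of that combination, reusing the images from Example 14-1 and assuming the Git image also provides a shell for a periodic pull loop, the container list might look like this: the init container performs the initial clone, and a sidecar keeps the checkout up to date.

   initContainers:
   - name: download
     image: axeclbr/git
     command: ["git", "clone", "https://github.com/mdn/beginner-html-site-scripted", "/var/lib/data"]
     volumeMounts:
     - mountPath: /var/lib/data
       name: source
   containers:
   - name: run
     image: docker.io/centos/httpd
     volumeMounts:
     - mountPath: /var/www/html
       name: source
   - name: sync
     image: axeclbr/git
     command: ["sh", "-c", "while true; do git -C /var/lib/data pull; sleep 60; done"]
     volumeMounts:
     - mountPath: /var/lib/data
       name: source
   volumes:
   - emptyDir: {}
     name: source

The init container guarantees that the files are present before the HTTP server starts, while the sidecar keeps them up to date afterward.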

Discussion

So why separate containers in a Pod into two groups? Why not just use an application container in a Pod for initialization if required? The answer is that these two groups of containers have different lifecycles, purposes, and even authors in some cases.

Having init containers run before application containers, and more importantly, having init containers run in stages that progress only when the current init container completes successfully, means you can be sure at every step of the initialization that the previous step has completed successfully, and you can progress to the next stage. Application containers, in contrast, run in parallel and do not provide the same guarantees as init containers. With this distinction in hand, we can create containers focused on a single initialization or application-focused task, and organize them in Pods.
