Chapter 22. Controller

A Controller actively monitors and maintains a set of Kubernetes resources in a desired state. The heart of Kubernetes itself consists of a fleet of controllers that regularly watch and reconcile the current state of applications with the declared target state. In this chapter, we see how to leverage this core concept for extending the platform for our needs.

Problem

You already have seen that Kubernetes is a sophisticated and comprehensive platform that provides many features out of the box. However, it is a general-purpose orchestration platform that does not cover all application use cases. Luckily, it provides natural extension points where specific use cases can be implemented elegantly on top of proven Kubernetes building blocks.

The main question that arises here is how to extend Kubernetes without changing or breaking it, and how to use its capabilities for custom use cases.

By design, Kubernetes is based on a declarative resource-centric API. What exactly do we mean by declarative? As opposed to an imperative approach, a declarative approach does not tell Kubernetes how it should act, but instead describes how the target state should look. For example, when we scale up a Deployment, we do not actively create new Pods by telling Kubernetes to “create a new Pod.” Instead, we change the Deployment resource’s replicas property via the Kubernetes API to the desired number.
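As a minimal sketch, assuming a Deployment named webapp (a hypothetical name), scaling it declaratively only means changing the desired replica count in its specification and leaving the rest to Kubernetes:

# Declare the new target state by patching the replicas field; the
# Deployment and ReplicaSet controllers create or remove Pods to match it.
kubectl patch deployment webapp --type merge -p '{"spec":{"replicas":3}}'

# Shortcut that also just updates the replicas field of the resource
kubectl scale deployment webapp --replicas=3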

So, how are the new Pods created? This is done internally by controllers. For every change in a resource (like changing the replicas property of a Deployment), Kubernetes creates an event and broadcasts it to all interested listeners. These listeners can then react by modifying, deleting, or creating new resources, which in turn creates other events, such as Pod-created events. These events are then potentially picked up again by other controllers, which perform their own specific actions.

The whole process is also known as state reconciliation, where a target state (the number of desired replicas) differs from the current state (the actual running instances), and it is the task of a controller to reconcile and reach the desired target state again. When looked at from this angle, Kubernetes essentially represents a distributed state manager. You give it the desired state for a component instance, and it attempts to maintain that state should anything change.

How can we now hook into this reconciliation process without modifying Kubernetes code and create a controller customized for our specific needs?

Solution

Kubernetes comes with a collection of built-in controllers that manage standard Kubernetes resources like ReplicaSets, DaemonSets, StatefulSets, Deployments, or Services. These controllers run as part of the Controller Manager, which is deployed (as a standalone process or a Pod) on the master node. The controllers are not aware of one another. They run in an endless reconciliation loop, monitoring the actual and desired state of their resources and acting to bring the actual state closer to the desired state.

However, in addition to these out-of-the-box controllers, the Kubernetes event-driven architecture allows us to natively plug in other, custom controllers. Custom controllers can add extra functionality by reacting to state-change events, the same way the internal controllers do. A common characteristic of controllers is that they are reactive and respond to events in the system to perform their specific actions. At a high level, this reconciliation process consists of the following main steps:

Observe

Discover the actual state by watching for events issued by Kubernetes when an observed resource changes.

Analyze

Determine the differences from the desired state.

Act

Perform operations to drive the actual to the desired state.

For example, the ReplicaSet controller watches for ReplicaSet resource changes, analyzes how many Pods need to be running, and acts by submitting Pod definitions to the API Server. Kubernetes’ backend is then responsible for starting up the requested Pod on a node.

Figure 22-1 shows how a controller registers itself as an event listener for detecting changes on the managed resources. It observes the current state and changes it by calling out to the API Server to get closer to the target state (if necessary).

Figure 22-1. Observe-Analyze-Act cycle

Controllers are part of Kubernetes’ control plane, and it became clear early on that they would also allow extending the platform with custom behavior. Moreover, they have turned into the standard mechanism for extending the platform and enabling complex application lifecycle management. As a result, a new generation of more sophisticated controllers was born, called Operators. From an evolutionary and complexity point of view, we can classify the active reconciliation components into two groups:

Controllers

A simple reconciliation process that monitors and acts on standard Kubernetes resources. Most often, these controllers enhance platform behavior and add new platform features.

Operators

A sophisticated reconciliation process that interacts with CustomResourceDefinitions (CRDs), which are at the heart of the Operator pattern. Typically, these Operators encapsulate complex application domain logic and manage the full application lifecycle. We discuss Operators in depth in Chapter 23.

As stated previously, these classifications help introduce new concepts gradually. Here, we focus on the simpler Controllers, and in the next chapter, we introduce CRDs and build up to the Operator pattern.

To avoid having multiple controllers acting on the same resources simultaneously, controllers use the Singleton Service pattern explained in Chapter 10. Most controllers are deployed just as Deployments, but with a single replica, and Kubernetes additionally uses optimistic locking at the resource level to prevent concurrency issues when changing resource objects. In the end, a controller is nothing more than an application that runs permanently in the background.
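This optimistic locking is based on the metadata.resourceVersion field that every resource object carries. As a minimal sketch, using the webapp-config ConfigMap introduced later in this chapter:

# Every object carries a resourceVersion used for optimistic locking
kubectl get configmap webapp-config -o jsonpath='{.metadata.resourceVersion}'

# An update that sends back a stale resourceVersion is rejected by the
# API Server with a 409 Conflict, so concurrent writers cannot silently
# overwrite each other's changes.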

Because Kubernetes itself is written in Go, and a complete client library for accessing Kubernetes is also written in Go, many controllers are written in Go too. However, you can write controllers in any programming language by sending requests to the Kubernetes API Server. We see a controller written in pure shell script later in Example 22-2.

The most straightforward kind of controllers extend the way Kubernetes manages its resources. They operate on the same standard resources and perform tasks similar to those of the internal Kubernetes controllers, which work invisibly for the cluster user. Controllers evaluate resource definitions and conditionally perform some actions. Although they can monitor and act upon any field in the resource definition, metadata and ConfigMaps are most suitable for this purpose. The following are a few considerations to keep in mind when choosing where to store controller data:

Labels

Labels as part of a resource’s metadata can be watched by any controller. They are indexed in the backend database and can be efficiently searched for in queries. We should use labels when selector-like functionality is required (e.g., to match the Pods of a Service or a Deployment), as shown in the snippet after this list. A limitation of labels is that keys and values are restricted to alphanumeric characters plus a limited set of punctuation. See the Kubernetes documentation for which syntax and character sets are allowed for labels.

Annotations

Annotations are an excellent alternative to labels. They have to be used instead of labels if the values do not conform to the syntax restrictions of label values. Annotations are not indexed, so we use annotations for nonidentifying information not used as keys in controller queries. Preferring annotations over labels for arbitrary metadata also has the advantage that it does not negatively impact the internal Kubernetes performance.

ConfigMaps

Sometimes controllers need additional information that does not fit well into labels or annotations. In this case, ConfigMaps can be used to hold the target state definition. These ConfigMaps are then watched and read by the controllers. However, CRDs are much better suited for designing the custom target state specification and are recommended over plain ConfigMaps. For registering CRDs, however, you need elevated cluster-level permissions. If you don’t have these, ConfigMaps are still the best alternative to CRDs. We will explain CRDs in detail in Chapter 23, Operator.
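As a quick illustration of the first two options, here is a minimal sketch of how labels and annotations are set and how labels can be queried (webapp-config and app=webapp are taken from the example later in this chapter; the Pod name is hypothetical):

# Labels are indexed and can be used in selector queries
kubectl label pod my-webapp-pod app=webapp
kubectl get pods -l app=webapp

# Annotations hold arbitrary, nonidentifying metadata and
# cannot be used as selectors
kubectl annotate configmap webapp-config \
  k8spatterns.io/podDeleteSelector="app=webapp"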

Here are a few reasonably simple example controllers you can study as a sample implementation of this pattern:

jenkins-x/exposecontroller

This controller watches Service definitions, and if it detects an annotation named expose in the metadata, it automatically creates an Ingress object for external access to the Service. It also removes the Ingress object when someone removes the Service.

fabric8/configmapcontroller

This is a controller that watches ConfigMap objects for changes and performs rolling upgrades of their associated Deployments. We can use this controller with applications that are not capable of watching the ConfigMap and updating themselves with new configurations dynamically. That is particularly true when a Pod consumes this ConfigMap as environment variables or when your application cannot quickly and reliably update itself on the fly without a restart. In Example 22-2, we implement such a controller with a plain shell script.

Container Linux Update Operator

This is a controller that reboots a Kubernetes node when it detects a particular annotation on the node.

Now let’s take a look at a concrete example: a controller that consists of a single shell script and that watches the Kubernetes API for changes on ConfigMap resources. If we annotate such a ConfigMap with k8spatterns.io/podDeleteSelector, all Pods selected by the given annotation value are deleted when the ConfigMap changes. Assuming we back these Pods with a higher-level resource like a Deployment or ReplicaSet, these Pods are restarted and pick up the changed configuration.

For example, the following ConfigMap would be monitored by our controller, and any change to it would restart all Pods that have a label app with value webapp. The ConfigMap in Example 22-1 is used in our web application to provide a welcome message.

Example 22-1. ConfigMap use by web application
apiVersion: v1
kind: ConfigMap
metadata:
  name: webapp-config
  annotations:
    k8spatterns.io/podDeleteSelector: "app=webapp"    1
data:
  message: "Welcome to Kubernetes Patterns !"
1

Annotation used as selector for the controller in Example 22-2 to find the application Pods to restart

Our controller shell script now evaluates this ConfigMap. You can find the source in its full glory in our Git repository. In short, the controller starts a hanging GET HTTP request, opening an endless HTTP response stream to observe the lifecycle events pushed to us by the API Server. These events are plain JSON objects, which we then analyze to detect whether a changed ConfigMap carries our annotation. As events arrive, we act by deleting all Pods matching the selector provided as the value of the annotation. Let’s have a closer look at how the controller works.

The main part of this controller is the reconciliation loop, which listens on ConfigMap lifecycle events, as shown in Example 22-2.

Example 22-2. Controller script
namespace=${WATCH_NAMESPACE:-default}      1

base=http://localhost:8001                 2
ns=namespaces/$namespace

curl -N -s $base/api/v1/${ns}/configmaps?watch=true | 
while read -r event                        3
do
   # ...
done
1

Namespace to watch (or default if not given)

2

Access to the Kubernetes API via an Ambassador proxy running in the same Pod

3

Loop with watches for events on ConfigMaps

The environment variable WATCH_NAMESPACE specifies the namespace in which the controller should watch for ConfigMap updates. We can set this variable in the Deployment descriptor of the controller itself. In our example, we use the Downward API described in Chapter 13, Self Awareness, to monitor the namespace in which we have deployed the controller, as configured in Example 22-3 as part of the controller Deployment.

Example 22-3. WATCH_NAMESPACE extracted from the current namespace
env:
 - name: WATCH_NAMESPACE
   valueFrom:
     fieldRef:
       fieldPath: metadata.namespace

With this namespace, the controller script constructs the URL to the Kubernetes API endpoint to watch the ConfigMaps.

Note

Note the watch=true query parameter in Example 22-2. This parameter tells the API Server not to close the HTTP connection but to send events along the response channel as soon as they happen (hanging GET or Comet are other names for this kind of technique). The loop reads each arriving event as a single item to process.
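Each line of that stream is a watch event wrapping the changed object. Trimmed down to the fields our script uses later, such an event looks roughly like this (a sketch, not a complete API response):

{
  "type": "MODIFIED",
  "object": {
    "kind": "ConfigMap",
    "metadata": {
      "name": "webapp-config",
      "annotations": {
        "k8spatterns.io/podDeleteSelector": "app=webapp"
      }
    },
    "data": { "message": "..." }
  }
}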

As you can see, our controller contacts the Kubernetes API Server via localhost. We won’t deploy this script directly on the Kubernetes API master node, so how can we use localhost in the script? As you have probably guessed, another pattern kicks in here. We deploy this script in a Pod together with an Ambassador container that exposes port 8001 on localhost and proxies it to the real Kubernetes Service. See Chapter 17 for more details on this pattern. We see the actual Pod definition with this Ambassador in detail later in this chapter.
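Conceptually, the Ambassador side of this setup is nothing more than the following command (a sketch; the actual k8spatterns/kubeapi-proxy image additionally wires up the mounted ServiceAccount token and CA for kubectl):

# Authenticate against the API Server and expose it without
# authentication on localhost:8001. Containers in the same Pod share
# the network namespace, so the script can use http://localhost:8001.
kubectl proxy --port=8001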

Watching events this way is not very robust, of course. The connection can stop anytime, so there should be a way to restart the loop. Also, one could miss events, so production-grade controllers should not only watch for events but from time to time also query the API Server for the entire current state and use that as the new base. For the sake of demonstrating the pattern, this is good enough.
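Such a periodic resynchronization could be as simple as listing the full current state and reconciling against it, along these lines (a sketch only):

# From time to time, list all ConfigMaps instead of relying
# solely on the event stream
configmaps=$(curl -s $base/api/v1/${ns}/configmaps | \
             jq -r '.items[].metadata.name')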

Within the loop, the logic shown in Example 22-4 is performed.

Example 22-4. Controller reconciliation loop
curl -N -s $base/api/v1/${ns}/configmaps?watch=true | 
while read -r event
do
  type=$(echo "$event"        | jq -r '.type')                       1
  config_map=$(echo "$event"  | jq -r '.object.metadata.name')
  annotations=$(echo "$event" | jq -r '.object.metadata.annotations')

  if [ "$annotations" != "null" ]; then
    selector=$(echo $annotations |                                2
     jq -r "
        to_entries                                           |
        .[]                                                  |
        select(.key == "k8spatterns.io/podDeleteSelector") |
        .value                                               |
         @uri                                                 
     ")
  fi

  if [ $type = "MODIFIED" ] && [ -n "$selector" ]; then            3
    pods=$(curl -s $base/api/v1/${ns}/pods?labelSelector=$selector |
           jq -r .items[].metadata.name)

    for pod in $pods; do                                           4
      curl -s -X DELETE $base/api/v1/${ns}/pods/$pod
    done
  fi
done
1

Extract the type and name of the ConfigMap from the event.

2

Extract all annotations on the ConfigMap with key k8spatterns.io/podDeleteSelector. See “Some jq Fu” for an explanation of this jq expression.

3

If the event indicates an update of the ConfigMap and our annotation is attached, then find all Pods matching this label selector.

4

Delete all Pods that match the selector.

First, the script extracts the event type that specifies what action happened to the ConfigMap. After extracting the ConfigMap’s name, we derive the annotations with jq. jq is an excellent tool for parsing JSON documents from the command line, and the script assumes it is available in the container the script is running in.

If the ConfigMap has annotations, we check for the annotation k8spatterns.io/podDeleteSelector by using a more complex jq query. The purpose of this query is to convert the annotation value to a Pod selector that can be used in an API query option in the next step: an annotation value of "app=webapp" is transformed to app%3Dwebapp, which is then used as a Pod selector. This conversion is performed with jq and is explained in “Some jq Fu” if you are interested in how this extraction works.
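You can try this transformation in isolation; fed with the annotations extracted above, the pipeline boils down to the following:

echo '{"k8spatterns.io/podDeleteSelector":"app=webapp"}' | \
  jq -r 'to_entries | .[]
         | select(.key == "k8spatterns.io/podDeleteSelector")
         | .value | @uri'
# prints: app%3Dwebapp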

If the script can extract a selector, we can use it now directly to select the Pods to delete. First, we look up all Pods that match the selector, and then we delete them one by one with direct API calls.

This shell script-based controller is, of course, not production-grade (e.g., the event loop can stop any time), but it nicely reveals the base concepts without too much boilerplate code for us.

The remaining work is about creating the resource objects and container images. The controller script itself is stored in a ConfigMap named config-watcher-controller and can easily be edited later if required.
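One way to create this ConfigMap is directly from the script file, assuming it is stored locally as config-watcher-controller.sh (the file name matters, because it becomes the key and thus the file name under the mount path used in Example 22-5):

kubectl create configmap config-watcher-controller \
  --from-file=config-watcher-controller.sh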

We use a Deployment to create a Pod for our controller with two containers:

  • One Kubernetes API Ambassador container that exposes the Kubernetes API on localhost on port 8001. The image k8spatterns/kubeapi-proxy is an Alpine Linux image with a local kubectl installed and kubectl proxy started with the proper CA and token mounted. The original version, kubectl-proxy, was written by Marko Lukša, who introduced this proxy in Kubernetes in Action.

  • The main container that executes the script contained in the just-created ConfigMap. We use here an Alpine base image with curl and jq installed.

You can find the Dockerfiles for the k8spatterns/kubeapi-proxy and k8spatterns/curl-jq images in our example Git repository.

Now that we have the images for our Pod, the final step is to deploy the controller by using a Deployment. We can see the main parts of the Deployment in Example 22-5 (the full version can be found in our example repository).

Example 22-5. Controller Deployment
apiVersion: apps/v1
kind: Deployment
# ....
spec:
  template:
    # ...
    spec:
      serviceAccountName: config-watcher-controller     1
      containers:
      - name: kubeapi-proxy                             2
        image: k8spatterns/kubeapi-proxy
      - name: config-watcher                            3
        image: k8spatterns/curl-jq
        # ...
        command:                                        4
        - "sh"
        - "/watcher/config-watcher-controller.sh"
        volumeMounts:                                   5
        - mountPath: "/watcher"
          name: config-watcher-controller
      volumes:
      - name: config-watcher-controller                 6
        configMap:
          name: config-watcher-controller
1

ServiceAccount with proper permissions for watching events and restarting Pods

2

Ambassador container for proxying localhost to the Kubernetes API Server

3

Main container holding all tools and mounting the controller script

4

Startup command calling the controller script

5

Mount of the ConfigMap-backed volume into the main container at /watcher

6

Volume backed by the ConfigMap that holds our controller script

As you can see, we mount the controller script from the ConfigMap config-watcher-controller we created previously and use it directly as the startup command for the primary container. For simplicity reasons, we omitted any liveness and readiness checks as well as resource limit declarations. Also, we need a ServiceAccount config-watcher-controller that is allowed to watch ConfigMaps and delete Pods. Refer to the example repository for the full security setup.
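The required permissions boil down to reading and watching ConfigMaps and listing and deleting Pods in the watched namespace. A minimal, namespace-scoped sketch of such a setup could look like the following (the actual manifests in the repository may differ in detail):

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: config-watcher-controller
rules:
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: config-watcher-controller
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: config-watcher-controller
subjects:
- kind: ServiceAccount
  name: config-watcher-controller
  namespace: default   # adjust to the namespace the controller runs in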

Let’s see the controller in action now. For this, we are using a straightforward web server, which serves the value of an environment variable as the only content. The base image uses plain nc (netcat) for serving the content. You can find the Dockerfile for this image in our example repository.
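Conceptually, serving the message with netcat does not require more than a loop like the following (a sketch assuming a BusyBox-style nc; the real image in the repository may differ):

# Answer every connection on port 8080 with the value of $MESSAGE
while true; do
  echo -e "HTTP/1.1 200 OK\r\n\r\n${MESSAGE}" | nc -l -p 8080
done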

We deploy the HTTP server with a ConfigMap and Deployment as is sketched in Example 22-6.

Example 22-6. Sample web app with Deployment and ConfigMap
apiVersion: v1
kind: ConfigMap                                         1
metadata:
  name: webapp-config
  annotations:
    k8spatterns.io/podDeleteSelector: "app=webapp"      2
data:
  message: "Welcome to Kubernetes Patterns !"           3
---
apiVersion: apps/v1
kind: Deployment                                        4
# ...
spec:
  # ...
  template:
    spec:
      containers:
      - name: app
        image: k8spatterns/mini-http-server             5
        ports:
        - containerPort: 8080
        env:
        - name: MESSAGE                                 6
          valueFrom:
            configMapKeyRef:
              name: webapp-config
              key: message
1

ConfigMap for holding the data to serve

2

Annotation that triggers a restart of the web app’s Pod

3

Message used in web app in HTTP responses

4

Deployment for the web app

5

Simplistic image for HTTP serving with netcat

6

Environment variable used as an HTTP response body and fetched from the watched ConfigMap
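Once everything is deployed, you can watch the controller do its work by changing the ConfigMap and observing the resulting Pod restarts, for example like this (a sketch; resource names as defined above):

# Change the message in the watched ConfigMap ...
kubectl patch configmap webapp-config --type merge \
  -p '{"data":{"message":"Hello again from the controller!"}}'

# ... and watch the controller delete the matching Pods, which the
# Deployment recreates with the new environment variable
kubectl get pods -l app=webapp --watch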

This concludes our example of our ConfigMap controller implemented in a plain shell script. Although this is probably the most complex example in this book, it also shows that it does not take much to write a basic controller.

Obviously, for real-world scenarios, you would write this sort of controller in a real programming language that provides better error-handling capabilities and other advanced features.

Discussion

To sum up, a Controller is an active reconciliation process that monitors objects of interest for the world’s desired state and the world’s actual state. Then, it sends instructions to try to change the world’s current state to be more like the desired state. Kubernetes uses this mechanism with its internal controllers, and you can also reuse the same mechanism with custom controllers. We demonstrated what is involved in writing a custom controller and how it functions and extends the Kubernetes platform.

Controllers are possible because of the highly modular and event-driven nature of the Kubernetes architecture. This architecture naturally leads to a decoupled and asynchronous approach for controllers as extension points. The significant benefit here is that we have a precise technical boundary between Kubernetes itself and any extensions. However, one issue with the asynchronous nature of controllers is that they are often hard to debug because the flow of events is not always straightforward. As a consequence, you can’t easily set breakpoints in your controller to stop everything to examine a specific situation.

In Chapter 23, you’ll learn about the related Operator pattern, which builds on this Controller pattern and provides an even more flexible way to configure operations.
