Hardening Kubernetes

The previous section cataloged the variety of security challenges facing developers and administrators deploying and maintaining Kubernetes clusters. In this section, we will home in on the design aspects, mechanisms, and features offered by Kubernetes to address some of these challenges. You can get to a pretty good state of security by judicious use of capabilities such as service accounts, network policies, authentication, authorization, AppArmor, and secrets.

Remember that a Kubernetes cluster is one part of a bigger system that includes other software systems, people, and processes. Kubernetes can't solve all problems. You should always keep in mind general security principles, such as defense in depth, a need-to-know basis, and the principle of least privilege. In addition, log everything you think may be useful in the event of an attack, and set up alerts for early detection when the system deviates from its expected state. It may be just a bug or it may be an attack. Either way, you want to know about it and respond.

Understanding service accounts in Kubernetes

Kubernetes has regular users managed outside the cluster for humans connecting to the cluster (for example, via the kubectl command), and it has service accounts.

Regular users are global and can access multiple namespaces in the cluster. Service accounts are constrained to one namespace. This is important. It ensures namespace isolation, because whenever the API server receives a request from a pod, its credentials will apply only to its own namespace.

Kubernetes manages service accounts on behalf of the pods. Whenever Kubernetes instantiates a pod it assigns the pod a service account. The service account identifies all the pod processes when they interact with the API server. Each service account has a set of credentials mounted in a secret volume. Each namespace has a default service account called default. When you create a pod, it is automatically assigned the default service account unless you specify a different service account.

You can create additional service accounts. Create a file called custom-service-account.yaml with the following content:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: custom-service-account

Now type the following:

kubectl create -f custom-service-account.yaml

That will result in the following output:

serviceaccount "custom-service-account" created

Here is the service account listed alongside the default service account:

> kubectl get serviceAccounts
NAME                     SECRETS   AGE
custom-service-account   1         3m
default                  1         29d

Note

Note that a secret was created automatically for your new service account.

To get more detail, type the following:

> kubectl get serviceAccounts/custom-service-account -o yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  creationTimestamp: 2016-12-04T19:27:59Z
  name: custom-service-account
  namespace: default
  resourceVersion: "1243113"
  selfLink: /api/v1/namespaces/default/serviceaccounts/custom-service-account
  uid: c3cbec89-ba57-11e6-87e3-428251643d3a
secrets:
- name: custom-service-account-token-pn3lt

You can see the secret itself, which includes a ca.crt file and a token, by typing the following:

kubectl get secrets/custom-service-account-token-pn3lt -o yaml
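
To have a pod run under the new service account instead of default, set serviceAccountName in the pod spec. Here is a minimal sketch (the pod name and container image are just placeholders):

apiVersion: v1
kind: Pod
metadata:
  name: custom-sa-pod
spec:
  serviceAccountName: custom-service-account
  containers:
    - name: main
      image: nginx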

How does Kubernetes manage service accounts?

The API server has a dedicated component called the service account admission controller. It is responsible for checking, at pod creation time, whether the pod specifies a custom service account and, if it does, that the custom service account exists. If no service account is specified, it assigns the default service account.

It also ensures the pod has ImagePullSecrets, which are necessary when images need to be pulled from a remote image registry. If the pod spec doesn't specify any, it uses the service account's ImagePullSecrets.

Finally, it adds a volume containing an API token for API access, mounted at /var/run/secrets/kubernetes.io/serviceaccount.
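
You can see these mounted credentials from inside any running pod; for example (assuming a pod named some-pod is running in the namespace):

> kubectl exec some-pod ls /var/run/secrets/kubernetes.io/serviceaccount
ca.crt
namespace
token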

The API token is created and added to the secret by another component called the Token Controller whenever a service account is created. The Token Controller also monitors secrets and adds or removes tokens whenever secrets are added to or removed from a service account.

The service account controller ensures the default service account exists for every namespace.

Accessing the API server

Accessing the API requires a chain of steps that include authentication, authorization, and admission control. At each stage the request may be rejected. Each stage consists of multiple plugins that are chained together. The following diagram illustrates this:

(Figure: the API server access pipeline - authentication, authorization, and admission control plugin chains)

Authenticating users

When you first create the cluster, a client certificate and key are created for you. Kubectl uses them to authenticate itself to the API server and vice versa over TLS on port 443 (an encrypted HTTPS connection). You can find your client key and certificate by checking your .kube/config file:

> cat C:\Users\the_g\.kube\config | grep client

    client-certificate: C:\Users\the_g\.minikube\apiserver.crt
    client-key: C:\Users\the_g\.minikube\apiserver.key

Note

Note that if multiple users need to access the cluster, the creator should provide the client certificate and key to the other users in a secure manner.

This is just establishing basic trust with the Kubernetes API server itself. You're not authenticated yet. Various authentication modules may look at the request and check for additional client certificates, passwords, bearer tokens, and JWT tokens (for service accounts). Most requests require an authenticated user (either a regular user or a service account), although there are some anonymous requests too. If a request fails to authenticate with all the authenticators, it will be rejected with a 401 HTTP status code (unauthorized, which is a bit of a misnomer).

The cluster administrator determines what authentication strategies to use by providing various command-line arguments to the API server:

  • --client-ca-file=<filename> (for x509 client certificates specified in a file)
  • --token-auth-file=<filename> (for bearer tokens specified in a file)
  • --basic-auth-file=<filename> (for user/password pairs specified in a file)
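
For example, the token file is a simple CSV file where each line maps a bearer token to a username, a UID, and optional groups (all the values here are made up):

s0m3r4nd0mt0k3n,alice,1001,"developers,admins"
an0th3rt0k3n,bob,1002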

Service accounts use an automatically loaded authentication plugin. The administrator may provide two optional flags:

  • --service-account-key-file=<filename> (PEM encoded key for signing bearer tokens. If unspecified, the API server's TLS private key will be used.)
  • --service-account-lookup (If enabled, tokens that are deleted from the API will be revoked.)

There are several other methods, such as OpenID Connect, webhook, Keystone (the OpenStack identity service), and authenticating proxy. The main theme is that the authentication stage is extensible and can support any authentication mechanism.

The various authentication plugins will examine the request and, based on the provided credentials, will associate the following attributes: username (user-friendly name), UID (unique identifier and more consistent than the username), and groups (a set of group names the user belongs to). There may also be extra fields, which are just maps of string keys to string values.

The authenticators have no knowledge whatsoever of what a particular user is allowed to do. They just map a set of credentials to a set of identities. It is the job of the authorizers to figure out whether the request is valid for the authenticated user.

Authorizing requests

Once a user is authenticated, authorization commences. Kubernetes has generic authorization semantics. A set of authorization plugins receives the request, which includes information such as the authenticated username and the request's verb (list, get, watch, create, and so on). If any authorization plugin authorizes the request, it may continue. If all authorizers reject the request, it will be rejected with a 403 HTTP status code (forbidden).

The cluster administrator determines what authorization plugins to use by specifying the --authorization-mode command-line flag, which is a comma-separated list of plugin names. The following modes are supported:

  • --authorization-mode=AlwaysDeny blocks all requests (used in tests).
  • --authorization-mode=AlwaysAllow allows all requests; use if you don't need authorization.
  • --authorization-mode=ABAC allows for a simple local-file-based, user-configured authorization policy. ABAC stands for Attribute-Based Access Control.
  • --authorization-mode=RBAC is an experimental implementation that allows authorization to be driven by the Kubernetes API. RBAC stands for Role-Based Access Control (see the sketch after this list).
  • --authorization-mode=Webhook allows for authorization to be driven by a remote service using REST.
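
To give a flavor of what RBAC looks like, here is a minimal sketch of a role that allows read access to pods in the default namespace, and a binding that grants it to a hypothetical user called jane (the exact API group and version depend on your Kubernetes release):

kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  namespace: default
  name: pod-reader
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  namespace: default
  name: read-pods
subjects:
  - kind: User
    name: jane
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io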

You can add your own custom authorization plugin by implementing the following straightforward Go interface:

type Authorizer interface {
  Authorize(a Attributes) (authorized bool, reason string, err error)
}

The Attributes input argument is also an interface that provides all the information you need to make an authorization decision:

type Attributes interface {
  GetUser() user.Info
  GetVerb() string
  IsReadOnly() bool
  GetNamespace() string
  GetResource() string
  GetSubresource() string
  GetName() string
  GetAPIGroup() string
  GetAPIVersion() string
  IsResourceRequest() bool
  GetPath() string
}

Using admission control plugins

OK. The request was authorized, but there is one more step before it can be executed: the request must go through a gauntlet of admission control plugins. Unlike the authenticators and authorizers, where a single approving plugin is enough, if a single admission controller rejects a request, the request is denied.

Admission controllers are a neat concept. The idea is that there may be global cluster concerns that could be grounds for rejecting a request. Without admission controllers, all authorizers would have to be aware of these concerns and reject the request. But, with admission controllers, this logic can be performed once. In addition, an admission controller may modify the request. As usual, the cluster administrator decides which admission control plugins run by providing the --admission-control command-line argument, whose value is a comma-separated, ordered list of plugins.
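
For example, a fairly common invocation looks like the following (the exact list and order depend on your cluster and Kubernetes version):

kube-apiserver --admission-control=NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,ResourceQuota ...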

Let's look at what plugins are available:

  • AlwaysAdmit: Passthrough (I'm not sure why it's needed)
  • AlwaysDeny: Reject everything (useful for testing)
  • ImagePolicyWebhook: This complicated plugin connects to an external backend to decide whether a request should be rejected based on the image
  • ServiceAccount: Automation for service accounts
  • ResourceQuota: Reject requests that violate the namespace's resource quota
  • LimitRanger: Reject requests that violate resource limits
  • InitialResources (experimental): Assigns compute resource requests and limits based on historical usage, if not specified
  • NamespaceLifecycle: Reject requests for creating objects in terminating or non-existing namespaces
  • DefaultStorageClass: Adds a default storage class to requests for the creation of a PersistentVolumeClaim that doesn't specify a storage class

As you can see, the admission control plugins have very diverse functionality. They support namespace-wide policies and enforce the validity of requests, mostly from a resource management point of view. This frees the authorization plugins to focus on valid operations. ImagePolicyWebhook is the gateway to validating images, which is a big challenge.

The division of responsibility for validating an incoming request through the separate stages of authentication, authorization, and admission, each with its own plugins, makes a complicated process much more manageable to understand and use.

Securing pods

Pod security is a major concern, since Kubernetes schedules the pods and lets them run. There are several independent mechanisms for securing pods and containers. Together, these mechanisms support defense in depth, where, even if an attacker (or a mistake) bypasses one mechanism, it will be blocked by another.

Using a private image repository

This approach gives you a lot of confidence that your cluster will only pull images that you have previously vetted, and you can manage upgrades better. You can configure your $HOME/.dockercfg or $HOME/.docker/config.json on each node. But, on many cloud providers, you can't do it because nodes are provisioned automatically for you.
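
The config.json file is a small JSON document that maps each registry to an auth entry containing the base64-encoded username:password pair. Here is a sketch with placeholder values (dXNlcjpwYXNzd29yZA== is just user:password encoded):

{
  "auths": {
    "https://index.docker.io/v1/": {
      "auth": "dXNlcjpwYXNzd29yZA=="
    }
  }
}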

ImagePullSecrets

This approach is recommended for clusters on cloud providers. The idea is that the credentials for the registry will be provided by the pod, so it doesn't matter what node it is scheduled to run on. This circumvents the problem with .dockercfg at the node level.

First, you need to create a secret object for the credentials:

> kubectl create secret docker-registry the-registry-secret \
  --docker-server=<docker registry server> \
  --docker-username=<username> \
  --docker-password=<password> \
  --docker-email=<email>
secret "the-registry-secret" created

You can create secrets for multiple registries (or multiple users for the same registry) if needed. The kubelet will combine all the ImagePullSecrets.

But, since pods can access secrets only in their own namespace, you must create a secret on each namespace where you want the pod to run.
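
Alternatively, instead of listing the secret in every pod spec, you can attach it to the namespace's default service account, and every pod that uses that service account will pull images with those credentials:

kubectl patch serviceaccount default -p '{"imagePullSecrets": [{"name": "the-registry-secret"}]}'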

Once the secret is defined, you can add it to the pod spec and run some pods on your cluster. The pod will use the credentials from the secret to pull images from the target image registry:

apiVersion: v1
kind: Pod
metadata:
  name: cool-pod
  namespace: the-namespace
spec:
  containers:
    - name: cool-container
      image: cool/app:v1
  imagePullSecrets:
    - name: the-registry-secret

Specifying a security context

A security context is a set of operating-system-level security settings, such as UID, GID, capabilities, and SELinux role. These settings are applied at the container level as a container security context. You can specify a pod security context that will apply to all the containers in the pod. The pod security context can also apply its security settings (in particular, fsGroup and seLinuxOptions) to volumes.

Here is a sample pod security context:

apiVersion: v1
kind: Pod
metadata:
  name: hello-world
spec:
  containers:
    ...
  securityContext:
    fsGroup: 1234
    supplementalGroups: [5678]
    seLinuxOptions:
      level: "s0:c123,c456"

The container security context is applied to each container and it overrides the pod security context. It is embedded in the containers section of the pod manifest. Container context settings can't be applied to volumes, which remain at the pod level.

Here is a sample container security context:

apiVersion: v1
kind: Pod
metadata:
  name: hello-world
spec:
  containers:
    - name: hello-world-container
      # The container definition
      # ...
      securityContext:
        privileged: true
        seLinuxOptions:
          level: "s0:c123,c456"

Protecting your cluster with AppArmor

AppArmor is a Linux kernel security module. With AppArmor, you can restrict a process running in a container to a limited set of resources such as network access, Linux capabilities, and file permissions. You configure AppArmor through profiles.

Requirements

AppArmor support was added as Beta in Kubernetes 1.4. It is not available for every operating system, so you must choose a supported OS distribution in order to take advantage of it. Ubuntu and SUSE Linux support AppArmor and enable it by default. Other distributions have optional support. To check if AppArmor is enabled, type the following:

cat /sys/module/apparmor/parameters/enabled
 Y

If the result is Y then it's enabled.

The profile must be loaded into the kernel. Check the following file:

/sys/kernel/security/apparmor/profiles

Also, only the Docker runtime supports AppArmor at this time.

Securing a pod with AppArmor

Since AppArmor is still in Beta, you specify the metadata as annotations and not as bona fide fields. When it gets out of Beta, that will change.

To apply a profile to a container, add the following annotation:

container.apparmor.security.beta.kubernetes.io/<container-name>: <profile-ref>

The profile reference can be either the default profile, runtime/default, or a profile loaded on the host, referenced as localhost/<profile-name>.
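
For example, a pod that runs its single container under a profile named k8s-apparmor-example-deny-write (defined next) might look like this minimal sketch:

apiVersion: v1
kind: Pod
metadata:
  name: hello-apparmor
  annotations:
    container.apparmor.security.beta.kubernetes.io/hello: localhost/k8s-apparmor-example-deny-write
spec:
  containers:
    - name: hello
      image: busybox
      command: ["sh", "-c", "echo 'Hello AppArmor!' && sleep 1h"]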

Here is a sample profile that prevents writing to files:

#include <tunables/global>

profile k8s-apparmor-example-deny-write flags=(attach_disconnected) {
  #include <abstractions/base>

  file,

  # Deny all file writes.
  deny /** w,
}

AppArmor is not a Kubernetes resource, so the format is not the YAML or JSON you're familiar with.

To verify the profile was attached correctly, check the attributes of process 1:

kubectl exec <pod-name> cat /proc/1/attr/current

Pods can be scheduled on any node in the cluster by default. This means the profile should be loaded into every node. This is a classic use case for DaemonSet.
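
A minimal sketch of such a profile-loader DaemonSet follows. Everything here is an assumption for illustration: the image, the ConfigMap name (apparmor-profiles), and the apiVersion (which varies with your Kubernetes version); the idea is just to run apparmor_parser on every node against profiles shipped via a ConfigMap:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: apparmor-loader
spec:
  selector:
    matchLabels:
      app: apparmor-loader
  template:
    metadata:
      labels:
        app: apparmor-loader
    spec:
      containers:
        - name: loader
          # hypothetical image; any image that ships apparmor_parser will do
          image: ubuntu:16.04
          command: ["sh", "-c", "apparmor_parser -r /profiles/* && sleep infinity"]
          securityContext:
            # loading profiles into the kernel requires elevated privileges
            privileged: true
          volumeMounts:
            - name: profiles
              mountPath: /profiles
            - name: securityfs
              mountPath: /sys/kernel/security
      volumes:
        - name: profiles
          configMap:
            name: apparmor-profiles   # hypothetical ConfigMap holding the profile files
        - name: securityfs
          hostPath:
            path: /sys/kernel/security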

Writing AppArmor profiles

Writing profiles for AppArmor by hand is not trivial. There are some tools that can help: aa-genprof and aa-logprof can generate a profile for you and assist in fine-tuning it by running your application with AppArmor in complain mode. The tools keep track of your application's activity and AppArmor warnings, and create a corresponding profile. This approach works, but it feels clunky.

My favorite tool is bane (https://github.com/jessfraz/bane), which generates AppArmor profiles from a simpler profile language based on TOML syntax. Bane profiles are very readable and easy to grasp. Here is a snippet from a bane profile:

Name = "nginx-sample"
[Filesystem]
# read only paths for the container
ReadOnlyPaths = [
  "/bin/**",
  "/boot/**",
  "/dev/**",
]

# paths where you want to log on write
LogOnWritePaths = [
  "/**"
]


# allowed capabilities
[Capabilities]
Allow = [
  "chown",
  "setuid",
]

[Network]
Raw = false
Packet = false
Protocols = [
  "tcp",
  "udp",
  "icmp"
]

The generated AppArmor profile is pretty gnarly.

Pod security policies

Pod security policy (PSP) is available as Beta in Kubernetes 1.4. It must be enabled, and you must also enable the PSP admission controller to use it. A PSP is defined at the cluster level and defines the security context for pods. There are a few differences between using a PSP and directly specifying a security context in the pod manifest as we did earlier:

  • Apply the same policy to multiple pods or containers
  • Let the administrator control pod creation so users don't create pods with inappropriate security contexts
  • Dynamically generate a different security context for a pod via the admission controller

PSPs really scale the concept of security contexts. Typically, you'll have a relatively small number of security policies compared to the number of pods (or rather, pod templates). This means that many pod templates and containers will have the same security policy. Without PSP, you have to manage it individually for each pod manifest.

Here is a sample PSP that allows everything:

{
  "kind": "PodSecurityPolicy",
  "apiVersion":"extensions/v1beta1",
  "metadata": {
    "name": "permissive"
  },
  "spec": {
      "seLinux": {
          "rule": "RunAsAny"
      },
      "supplementalGroups": {
          "rule": "RunAsAny"
      },
      "runAsUser": {
          "rule": "RunAsAny"
      },
      "fsGroup": {
          "rule": "RunAsAny"
      },
      "volumes": ["*"]
  }
}

Managing network policies

Node, pod, and container security is imperative, but it's not enough. Network segmentation is critical for designing secure Kubernetes clusters that allow multi-tenancy, as well as for minimizing the impact of security breaches. Defense in depth mandates that you compartmentalize parts of the system that don't need to talk to each other, and that you carefully manage the direction, protocols, and ports of traffic.

Network policies give you fine-grained control and proper network segmentation of your cluster. At its core, a network policy is a set of firewall rules applied to a set of namespaces and pods selected by labels. This is very flexible because labels can define virtual network segments and be managed as a Kubernetes resource.

Choosing a supported networking solution

Some networking backends don't support network policies. For example, the popular Flannel can't be used to apply policies.

Here is a list of supported network backends:

  • Calico
  • WeaveNet
  • Canal
  • Romana

Defining a network policy

You define a network policy using a standard YAML manifest.

Here is a sample policy:

apiVersion: extensions/v1beta1
kind: NetworkPolicy
metadata:
 name: the-network-policy
 namespace: default
spec:
 podSelector:
  matchLabels:
    role: db
 ingress:
  - from:
     - namespaceSelector:
        matchLabels:
         project: cool-project
     - podSelector:
        matchLabels:
         role: frontend
    ports:
     - protocol: TCP
       port: 6379

The spec part has two important parts, the podSelector and the ingress. The podSelector governs which pods this network policy applies to. The ingress governs which namespaces and pods can access these pods and which protocols and ports they can use.

In the sample network policy, the pod selector specifies the target for the network policy to be all the pods that are labeled role: db. The ingress section has a from sub-section with a namespace selector and a pod selector. All the namespaces in the cluster that are labeled project: cool-project, and within these namespaces, all the pods that are labeled role: frontend, can access the target pods labeled role: db. The ports section defines a list of pairs (protocol and port) that further restrict what protocols and ports are allowed. In this case, the protocol is TCP and the port is 6379 (the standard Redis port).

Note

Note that the network policy is cluster-wide, so pods from multiple namespaces in the cluster can access the target namespace. The current namespace is always included, so even if it doesn't have the project: cool-project label, pods with role: frontend can still have access.

It's important to realize that the network policy operates in whitelist fashion. By default, all access to the selected pods is forbidden, and the network policy can open certain protocols and ports to certain pods that match the labels. Keep in mind that, if your networking solution doesn't support network policies, the policies you define simply won't be enforced.

Another implication of the whitelist nature is that, if multiple network policies exist, the union of all their rules applies. If one policy gives access to port 1234 and another gives access to port 5678 for the same set of pods, then a pod may access them through either 1234 or 5678.
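
For example, a second policy like the following (a sketch using the same labels as the earlier sample) would open port 5678 in addition to whatever the first policy allows:

apiVersion: extensions/v1beta1
kind: NetworkPolicy
metadata:
 name: another-network-policy
 namespace: default
spec:
 podSelector:
  matchLabels:
    role: db
 ingress:
  - from:
     - podSelector:
        matchLabels:
         role: frontend
    ports:
     - protocol: TCP
       port: 5678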

Using secrets

Secrets are paramount in secure systems. They can be credentials such as usernames and passwords, access tokens, API keys, or crypto keys. Secrets are typically small. If you have large amounts of data you want to protect, you should encrypt it and keep the encryption/decryption keys as secrets.

Storing secrets in Kubernetes

Kubernetes stores secrets in etcd as plaintext. This means that direct access to etcd should be limited and carefully guarded. Secrets are managed at the namespace level. Pods can mount secrets either as files via secret volumes or as environment variables. From a security standpoint, this means that any user or service that can create a pod in a namespace can have access to any secret managed for that namespace. If you want to limit access to a secret, put it in a namespace accessible to a limited set of users or services.

When a secret is mounted into a pod, it is never written to disk. It is stored in tmpfs. When the kubelet communicates with the API server, it normally uses TLS, so the secret is protected in transit.

Creating secrets

Secrets must be created before you try to create a pod that requires them. The secret must exist, otherwise the pod creation will fail.

You can create secrets with the following command:

kubectl create secret

Here I create a generic secret called hush-hush, which contains two keys, username and password:

kubectl create secret generic hush-hush --from-literal=username=tobias --from-literal=password=cutoffs

The resulting secret is opaque:

> kubectl describe secrets/hush-hush
Name:           hush-hush
Namespace:      default
Labels:         <none>
Annotations:    <none>

Type:   Opaque

Data
====
password:       7 bytes
username:       6 bytes

You can create secrets from files using --from-file instead of --from-literal, and you can also create secrets manually if you encode the secret value as base64.
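
For example, the same hush-hush secret created manually from a manifest looks like this (the data values are the base64 encodings of tobias and cutoffs):

apiVersion: v1
kind: Secret
metadata:
  name: hush-hush
type: Opaque
data:
  username: dG9iaWFz
  password: Y3V0b2Zmcw==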

Key names inside a secret must follow the rules for DNS subdomains (without the leading dot).

Decoding secrets

To get the content of a secret you can use kubectl get secret:

> kubectl get secrets/hush-hush -o yaml
apiVersion: v1
data:
  password: Y3V0b2Zmcw==
  username: dG9iaWFz
kind: Secret
metadata:
  creationTimestamp: 2016-12-06T22:42:54Z
  name: hush-hush
  namespace: default
  resourceVersion: "1450109"
  selfLink: /api/v1/namespaces/default/secrets/hush-hush
  uid: 537bd4d6-bc05-11e6-927a-26f559225611
type: Opaque

The values are base64-encoded. You need to decode them yourself:

> echo "Y3V0b2Zmcw==" | base64 --decode
cutoffs

Using secrets in a container

Containers can access secrets as files by mounting volumes from the pod. Another approach is to access the secrets as environment variables. Finally, a container can access the Kubernetes API directly or use kubectl get secret.
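
Here is a quick sketch of the environment-variable approach, exposing the two keys of hush-hush to a container; the volume approach is covered in detail next:

apiVersion: v1
kind: Pod
metadata:
  name: pod-with-secret-env
spec:
  containers:
    - name: the-container
      image: redis
      env:
        - name: SECRET_USERNAME
          valueFrom:
            secretKeyRef:
              name: hush-hush
              key: username
        - name: SECRET_PASSWORD
          valueFrom:
            secretKeyRef:
              name: hush-hush
              key: password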

To use a secret mounted as a volume, the pod manifest should declare the volume and it should be mounted in the container's spec:

{
 "apiVersion": "v1",
 "kind": "Pod",
  "metadata": {
    "name": "pod-with-secret",
    "namespace": "default"
  },
  "spec": {
    "containers": [{
      "name": "the-container",
      "image": "redis",
      "volumeMounts": [{
        "name": "secret-volume",
        "mountPath": "/mnt/secret-volume",
        "readOnly": true
      }]
    }],
    "volumes": [{
      "name": "secret-volume",
      "secret": {
        "secretName": "hush-hush"
      }
    }]
  }
}

The volume name (secret-volume) binds the pod volume to the mount in the container. Multiple containers can mount the same volume.

When this pod is running, the username and password are available as files under /mnt/secret-volume:

> kubectl exec pod-with-secret cat /mnt/secret-volume/username
tobias

> kubectl exec pod-with-secret cat /mnt/secret-volume/password
cutoffs