10

Managing a Windows Pod

In this chapter, we’ll dive deep into scheduling Windows Pods on an Amazon EKS cluster. We will first learn about Pod resource control, then how to use RuntimeClass to ease toleration and node selector usage. Next, we will understand how to achieve Active Directory integration on Windows Pods. Finally, we will deploy a Windows Pod on the Amazon EKS cluster.

The chapter will cover the following topics:

  • Exploring Windows host and Pod resource management
  • Understanding the RuntimeClass use case
  • Understanding Active Directory integration on Kubernetes
  • Deploying a Windows Pod on Amazon EKS

Technical requirements

In the Deploying a Windows Pod on Amazon EKS section, you will need to have the following:

  • AWS CLI
  • kubectl
  • The same IAM profile used to deploy the Amazon EKS cluster in Chapter 9

To have access to the source code used in this chapter, access the following GitHub repository: https://github.com/PacktPublishing/Running-Windows-Containers-on-AWS/tree/main/eks-windows.

Exploring Windows host and Pod resource management

In Chapter 1, Windows Container 101, we learned how Windows Server implements resource management on Windows containers and how the Host Compute Service (HCS) governs these resources.

In Kubernetes, you can specify how much of a resource (such as memory and CPU) a Pod requests and the limit it may consume. kube-scheduler uses this request/limit information to decide which node to schedule the Pod on.

We learned that the Windows Server OS is entirely different from Linux, which changes how a Windows Pod performs and how memory and CPU are allocated per Pod. I often see customers overlook Amazon EC2 Windows node capacity planning, usually figuring out they could have done better only after the problem has already happened. The beauty of the cloud is that it allows quick changes, but imagine if it didn’t.

Understanding how resource management works gives you more insight into which Amazon EC2 instance types should be part of your Windows node group and how to set resource boundaries between Pods and nodes adequately.

Pod memory management

Windows Server doesn’t have an out-of-memory process killer as Linux does. Windows treats all user-mode memory allocation as virtual and won’t overcommit memory for a process, so if a Windows Pod reaches the 800Mi limit specified in the following example, the container won’t exit. Instead, it will start using the paging file, which degrades disk performance.

In the following example, we limit the Windows Pod to consuming up to 800Mi of memory, with an initial request of 128Mi:

spec:
  containers:
  - name: iis
    image: mcr.microsoft.com/windows/servercore/iis:windowsservercore-ltsc2019
    resources:
      limits:
        cpu: 2
        memory: 800Mi
      requests:
        cpu: 0.1
        memory: 128Mi

With this memory behavior in mind, you should choose Amazon EC2 nodes with approximately 15% more memory than what your Pods need, to avoid performance degradation if a Pod reaches the out-of-memory (OOM) state.

Pod CPU management

Pod CPU management is tricky because CPU usage isn’t easy to predict. It depends on factors such as application behavior during peak and off-peak hours and how well the applications were written. The recommendation is to set limits of up to approximately 80% of the available cores, monitor for some time, and keep adjusting over time.

For instance, if your Amazon EC2 Windows node has 4 vCPUs, set CPU limits of up to 3 CPUs in your Pod specifications, as sketched in the following snippet.
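As a minimal sketch of this guidance (the request values here are illustrative, not prescribed), a container spec on such a node could look like the following:

spec:
  containers:
  - name: iis
    image: mcr.microsoft.com/windows/servercore/iis:windowsservercore-ltsc2019
    resources:
      limits:
        cpu: 3          # ~80% of the node's 4 vCPUs
        memory: 800Mi
      requests:
        cpu: 0.5
        memory: 128Mi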

Host CPU management

By default, Windows Server can limit the amount of CPU time allocated for a process, but it cannot guarantee a minimum amount of CPU time (request). Due to this condition, kubelet supports a flag to set the scheduling priority.

On Windows Server, threads are scheduled to run based on their scheduling priority. By default, the system assigns time slices in a round-robin fashion to all threads. However, if a higher-priority thread becomes available, the system stops executing the lower-priority thread and assigns a full time slice to the higher-priority thread, which, in our case, is kubelet.

To run kubelet threads at elevated priority, you can set "--windows-priorityclass=ABOVE_NORMAL_PRIORITY_CLASS" in KubeletExtraArgs in the Amazon EC2 Windows node bootstrap. Doing so ensures kubelet threads run at above-normal priority and aren’t starved by the default round-robin assignment of CPU time slices.

The following example is an Amazon EC2 Windows node kubelet bootstrap with the priority class as an extra argument:

<powershell>
[string]$EKSBootstrapScriptFile = "$env:ProgramFiles\Amazon\EKS\Start-EKSBootstrap.ps1"
& $EKSBootstrapScriptFile -EKSClusterName "eks-cluster" -APIServerEndpoint "https://EC8BCC9AD1F41CBEB61EB5927CB068C7.yl4.us-east-1.eks.amazonaws.com" -Base64ClusterCA "LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUM1ekNDQWMrZ0F3SUJBZ0lCLS0tCg…==" -DNSClusterIP "10.100.0.10" -KubeletExtraArgs "--windows-priorityclass=ABOVE_NORMAL_PRIORITY_CLASS" 3>&1 4>&1 5>&1 6>&1
</powershell>

System resource reservations

Kubernetes supports two flags that can be used to reserve resources, such as memory and CPU, for optimal node performance:

  • --kube-reserved is used to reserve memory (RAM) and CPU for Kubernetes daemons such as kubelet, container runtime, and so on
  • --system-reserved is used to reserve memory (RAM) and CPU for OS daemons such as udev, sshd, and so on

When these values are set on a Windows Server OS, kube-reserved and system-reserved cannot actually enforce memory (RAM) or CPU limits on those services in the kernel. However, you can still use these flags: the reserved resources are subtracted from NodeAllocatable, which stops kube-scheduler from scheduling Pods into the capacity the node itself needs. On Amazon EC2 Windows nodes, a best practice is to subtract 2.0 GiB from the total EC2 memory in order to reserve the required 1.5 GiB for the Windows Server OS and around 0.5 GiB for kubelet.

The following example is a kubelet bootstrap for an Amazon EC2 Windows node with kube-reserved and system-reserved as extra arguments:

<powershell>
[string]$EKSBootstrapScriptFile = "$env:ProgramFiles\Amazon\EKS\Start-EKSBootstrap.ps1"
& $EKSBootstrapScriptFile -EKSClusterName "eks-cluster" -APIServerEndpoint "https://EC8BCC9AD1F41CBEB61EB5927CB068C7.yl4.us-east-1.eks.amazonaws.com" -Base64ClusterCA "LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUM1ekNDQWMrZ0F3SUJBZ0lCLS0tCg…==" -DNSClusterIP "10.100.0.10" -KubeletExtraArgs "--kube-reserved=memory=0.5Gi --system-reserved=memory=1.5Gi --windows-priorityclass=ABOVE_NORMAL_PRIORITY_CLASS" 3>&1 4>&1 5>&1 6>&1
</powershell>
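To verify the effect of these reservations, you can compare the Capacity and Allocatable sections reported by kubectl describe node. The following output is illustrative only, assuming a node with roughly 16 GiB of RAM and the 2.0 GiB reservation above; your node name and exact figures will differ:

kubectl describe node <windows-node-name>
...
Capacity:
  cpu:     4
  memory:  16383516Ki
Allocatable:
  cpu:     4
  memory:  14286364Ki
...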

In this section, we dove deep into running a Windows Pod and how to correctly set up kubelet on the Amazon EC2 Windows node. These are best-practice guidelines and won’t block you from running Windows Pods on Amazon EKS if not implemented.

Understanding the RuntimeClass use case

RuntimeClass is one of those features that usually doesn’t get much attention, but it is crucial for running heterogeneous Kubernetes clusters. RuntimeClass allows you to centralize nodeSelector and tolerations configurations in the same way you would offload application configuration from a manifest file using a ConfigMap.

Assume you have two Kubernetes deployments. Deployment 1 deploys the frontend, and Deployment 2 deploys the backend; however, both need to be scheduled on an Amazon EC2 Windows node and need to tolerate the os=windows taint:

Figure 10.1 – Kubernetes deployment manifests

Both deployment specifications are almost identical; the only differences are the deployment names and labels, which distinguish the frontend from the backend. In addition, you can see that both have nodeSelector and tolerations entries, ensuring that both deployments will be scheduled on Windows-based node groups and tolerate the os=windows taint.

Now, suppose you were asked to update the taint value on the Amazon EC2 Windows node fleet from os=windows to os=windowsserver2019. What would be your next move to ensure that new Pods from these two deployments are correctly scheduled on the Amazon EC2 Windows nodes with the new taint? Probably changing it directly in the deployment manifests (automated or not), right?

But if you had 100 deployments, you would have to update 100 files. With RuntimeClass, you change one file once, and the change automatically applies to every deployment that references the RuntimeClass object.

In the following example, I’m using a RuntimeClass object to specify the container runtime handler, the nodeSelector, and the toleration that will be applied to Windows Pods:

apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: windows-2019
handler: 'docker'
scheduling:
  nodeSelector:
    kubernetes.io/os: 'windows'
    node.kubernetes.io/windows-build: '10.0.17763'
  tolerations:
  - effect: NoSchedule
    key: os
    operator: Equal
    value: "windows"

The next step is to replace the nodeSelector and tolerations configurations in the Windows Pod spec with a reference to the RuntimeClass:

Figure 10.2 – Using the RuntimeClass object to specify nodeSelector and tolerations
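Since the figure only shows the resulting manifest, here is a minimal sketch of the frontend Deployment after the change (names and labels are illustrative); runtimeClassName replaces the nodeSelector and tolerations blocks:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend
spec:
  replicas: 1
  selector:
    matchLabels:
      app: frontend
  template:
    metadata:
      labels:
        app: frontend
    spec:
      runtimeClassName: windows-2019   # pulls in nodeSelector and tolerations
      containers:
      - name: frontend
        image: mcr.microsoft.com/windows/servercore/iis:windowsservercore-ltsc2019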

As you can see, we decoupled nodeSelector and tolerations from the Deployment manifest, and with RuntimeClass, we now have a central place to manage these configurations. The change will only apply to new Windows Pods; existing ones will continue to run because the taint’s NoSchedule effect doesn’t evict Pods that are already scheduled.

Note

You can learn more about RuntimeClass at the following URL: https://kubernetes.io/docs/concepts/containers/runtime-class/.

In this section, we learned that RuntimeClass makes it easier to manage nodeSelector and tolerations. However, RuntimeClass doesn’t apply only to heterogeneous Amazon EKS clusters; it should be used as much as possible to decouple scheduling configuration from Pod specifications.

Understanding Active Directory integration on Kubernetes

In Chapter 5, Deploying an EC2 Windows-Based Task, in the Setting up Active Directory integration section, we dove deep into the use case and the methods available to set up Active Directory integration. The concept and use case remain the same here; the only difference is how to implement it in Kubernetes.

In Kubernetes, two Webhook admission controllers (open sourced by Kubernetes-SIG) are required to support Active Directory integration with the Kerberos protocol:

  • A mutating Webhook is responsible for modifying objects sent to the API server; it expands the gMSA account reference in the Pod spec into the full credential spec JSON
  • A validating Webhook ensures that the Pod’s service account is authorized to use the referenced gMSA account

Important note

The gMSA admission Webhook can be found at https://github.com/kubernetes-sigs/windows-gmsa.

Installing the gMSA Webhook admission controller is easy, but it requires changing the signer in the create-signed-cert.sh script to beta.eks.amazonaws.com/app-serving in order to make it functional on Amazon EKS. A Kubernetes signer is responsible for signing certificates for Webhook or Kubernetes operator requests in order to allow API access control.

The following steps will help you successfully install the admission controllers:

git clone https://github.com/kubernetes-sigs/windows-gmsa.git
cd windows-gmsa/admission-webhook/deploy
sed -i.back 's|signerName: kubernetes.io/kubelet-serving|signerName: beta.eks.amazonaws.com/app-serving|g' create-signed-cert.sh
K8S_GMSA_DEPLOY_DOWNLOAD_REV='v0.4.0' ./deploy-gmsa-webhook.sh --file ./gmsa-manifests --image sigwindowstools/k8s-gmsa-webhook:v0.4.0 --overwrite

Important note

You can find a detailed step-by-step guide on implementing Active Directory integration, including the signer modification needed to make it functional on Amazon EKS, in the following AWS blog post: https://aws.amazon.com/blogs/containers/windows-authentication-on-amazon-eks-windows-pods/.

Credential specs

As discussed in Chapter 5, Deploying an EC2 Windows-Based Task, in the gMSA using portable user identity section, a credential spec (CredSpec) file is a JSON document, consumed by the container runtime (Docker), that specifies which gMSA account a Windows container will use when it needs Active Directory integration. On Amazon ECS, this credential spec file can live on the Amazon EC2 container instance or in an S3 bucket (a decoupled strategy).

On Kubernetes, the CredSpec file is decoupled from the Amazon EC2 Windows node; its contents are stored in Kubernetes etcd as a custom resource, defined by a Custom Resource Definition (CRD). The benefit of this approach is that you are not required to keep a copy of the file in the Docker path on each Amazon EC2 Windows node.
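As an illustration, a credential spec stored this way looks roughly like the following; the apiVersion can vary with the Webhook release, and every domain and account value below is hypothetical. A Pod then references the resource by name through securityContext.windowsOptions.gmsaCredentialSpecName:

apiVersion: windows.k8s.io/v1
kind: GMSACredentialSpec
metadata:
  name: gmsa-webapp1                     # referenced by Pods
credspec:
  CmsPlugins:
  - ActiveDirectory
  DomainJoinConfig:
    DnsName: contoso.com                 # hypothetical domain
    DnsTreeName: contoso.com
    Guid: 244818ae-87ac-4fcd-92ec-e79e5252348a
    MachineAccountName: WebApp1          # hypothetical gMSA account
    NetBiosName: CONTOSO
    Sid: S-1-5-21-2126449477-2524075714-3094792973
  ActiveDirectoryConfig:
    GroupManagedServiceAccounts:
    - Name: WebApp1
      Scope: CONTOSO
    - Name: WebApp1
      Scope: contoso.com
---
apiVersion: v1
kind: Pod
metadata:
  name: iis-gmsa
spec:
  securityContext:
    windowsOptions:
      gmsaCredentialSpecName: gmsa-webapp1   # name of the custom resource above
  containers:
  - name: iis
    image: mcr.microsoft.com/windows/servercore/iis:windowsservercore-ltsc2019
  nodeSelector:
    kubernetes.io/os: windows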

This section taught us the differences between Active Directory integration on Amazon ECS and Amazon EKS. As mentioned, the use cases and concepts covered in Chapter 5 remain the same; the only differences are how Kubernetes stores the CredSpec file, which is a required configuration, and the admission Webhooks that validate the request and present the CredSpec file to the requesting Windows Pod.

Deploying a Windows Pod on Amazon EKS

In Chapter 9, Deploying a Windows Node Group, in the Deploying a Windows node group with Terraform section, we covered and deployed a heterogeneous Amazon EKS cluster with a Windows node group.

In this chapter, we will deploy a Windows Pod running an IIS container image and expose it via a Kubernetes Service that automatically creates a Classic Load Balancer (ELB).

Important note

You will see code snippets for the remainder of this section. The full Terraform code for this chapter can be found at https://github.com/PacktPublishing/Running-Windows-Containers-on-AWS/tree/main/eks-windows.

Unlike the previous chapters, we won’t use Terraform to deploy the Windows Pod; instead, we will connect to the cluster we created and deploy the Windows Pod using kubectl.

kubectl is a command-line tool that interfaces with the Kubernetes control plane through the Kubernetes API, letting you perform manual actions from deploying Pods to troubleshooting. However, in a production environment, it is common to see customers not using kubectl for deployments at all; instead, they leverage GitOps, using ArgoCD as the continuous deployment tool. That may be a topic for another book.

At the same time, Terraform is commonly used as an Infrastructure as Code (IaC) tool to provision the Amazon EKS cluster infrastructure.

Connecting to the Amazon EKS cluster

The first step is to connect to the Amazon EKS cluster, so first, you need to configure kubeconfig by running the following command:

aws eks --region region-code update-kubeconfig --name eks-windows

Change region-code to the one you used to deploy the cluster using Terraform. You can check whether the connection was successful by running the following:

kubectl get pods -n kube-system

If Pods are listed, you are connected to the cluster. Before deploying, it is also worth confirming that the Windows nodes are registered and Ready, as shown in the following check.
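The following command filters nodes by the built-in OS label:

kubectl get nodes -l kubernetes.io/os=windows

Once the Windows nodes show a Ready status, let’s analyze the deployment manifest that will be used to deploy the Windows Pod.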

Deploying the Windows Pod

We will be using a Kubernetes manifest, a YAML file that serves as a blueprint, to deploy the Windows Pod.

First, we will create a Deployment in the default namespace, with replicas set to 2:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: windows-server-iis-ltsc2019
  namespace: default
spec:
  selector:
    matchLabels:
      app: windows-server-iis-ltsc2019
      tier: backend
      track: stable
  replicas: 2
  template:
    metadata:
      labels:
        app: windows-server-iis-ltsc2019
        tier: backend
        track: stable

The continuation of the file is where we specify the container configuration. As you can see, I’m using a public Windows container image from the Microsoft Artifact Registry, and for nodeSelector, I’m using kubernetes.io/os: windows:

    spec:
      containers:
      - name: windows-server-iis-ltsc2019
        image: mcr.microsoft.com/windows/servercore/iis:windowsservercore-ltsc2019
        ports:
        - name: http
          containerPort: 80
        imagePullPolicy: IfNotPresent
      nodeSelector:
        kubernetes.io/os: windows

Next, we will create a Kubernetes Service of the LoadBalancer type, which will expose the Windows Pods to the internet on port 80:

---
apiVersion: v1
kind: Service
metadata:
  name: windows-server-iis-ltsc2019-service
  namespace: default
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: windows-server-iis-ltsc2019
    tier: backend
    track: stable
  sessionAffinity: None
  type: LoadBalancer

To deploy the Pod, run the following kubectl command:

kubectl create -f iis-deployment.yaml

Wait until the Pods reach the Running status:

kubectl get pods | grep windows
windows-server-iis-ltsc2019-6869495977-bbsb8        1/1     Running   0               3d13h
windows-server-iis-ltsc2019-6869495977-kpzpx        1/1     Running   0               3d13h

At this point, you will be able to access the IIS Pods through the Amazon Elastic Load Balancer (ELB) that was provisioned on your behalf. Run the following kubectl command to identify the ELB address:

kubectl get service | grep windows

The output should be similar to the following:

windows-server-iis-ltsc2019-service   LoadBalancer    172.20.105.147   a5922a91b57054e9f810bd8785a1284b-234420727.us-east-1.elb.amazonaws.com   80:30777/TCP   4d
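Provisioning the load balancer and its DNS record can take a few minutes. Once the address resolves, you can test it from the command line; the HTML of the default IIS welcome page should be returned:

curl http://a5922a91b57054e9f810bd8785a1284b-234420727.us-east-1.elb.amazonaws.com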

You can now see the Windows Pod running the IIS site:

Figure 10.3 – Windows Pod running the IIS site

Following our previous architecture, where we deployed the heterogeneous Amazon EKS cluster with Amazon EC2 Windows nodes, we have incrementally added more resources, such as the Windows Pod and the LoadBalancer Service:

Figure 10.4 – Heterogeneous Amazon EKS cluster with a Windows node group

With this, we have concluded a bigger step: successfully deploying an entire heterogeneous Amazon EKS cluster, with Amazon EC2 Windows nodes and a Windows Pod, all fully operational.

Summary

In this chapter, we started by learning how to correctly apply resource management at the host and Pod levels; then, we dove deep into using RuntimeClass to centralize nodeSelector and tolerations configurations. Next, we covered the specifics of Active Directory integration with Amazon EKS and the differences compared to Amazon ECS.

Finally, we went through the code snippets for deploying two Windows Pods, setting replicas to 2 and exposing them to the internet through a LoadBalancer Service.

In this chapter, we closed the loop on how to run Windows containers in three different AWS orchestrators. In the next chapter, we will learn about monitoring and logging to have an observability solution.
