Affinity and anti-affinity

Even though nodeSelector is simple and easy to use, it isn't expressive enough for the complicated needs of real-world applications. For example, we usually don't want the pods of a StatefulSet to be placed in the same availability zone, so that cross-zone redundancy is preserved. Requirements like this are difficult to express with node selectors alone. For this reason, the concept of scheduling under constraints with labels has been extended to include affinity and anti-affinity.

Affinity comes into play in two different scenarios: pods-to-nodes and pods-to-pods. It is configured under the .spec.affinity path of a pod. The first option, nodeAffinity, serves much the same purpose as nodeSelector, but formulates the relation between pods and nodes in a more expressive manner. The second scenario is covered by two options, podAffinity and podAntiAffinity, which constrain where a pod may run relative to other pods. For both node affinity and inter-pod affinity, there are two different degrees of requirements (a skeleton manifest follows the list):

  • requiredDuringSchedulingIgnoredDuringExecution 
  • preferredDuringSchedulingIgnoredDuringExecution
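
To make the layout concrete, here is a minimal sketch of where these fields live in a pod manifest. The pod name, label keys, and label values are illustrative assumptions, not taken from the original example:

apiVersion: v1
kind: Pod
metadata:
  name: affinity-demo          # hypothetical name for illustration
spec:
  affinity:
    nodeAffinity:
      # hard constraint: the pod may only run on nodes matching these terms
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/os
            operator: In
            values: ["linux"]
    podAntiAffinity:
      # hard constraint: avoid zones that already run a pod labeled app=web
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: web
        topologyKey: topology.kubernetes.io/zone
  containers:
  - name: main
    image: nginx

The preferred variants sit under the same nodeAffinity and podAffinity/podAntiAffinity paths, with a weight attached to each term, as the next example shows.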

As can be seen from their names, both requirements take effect during scheduling, not execution. That is, if a pod has already been scheduled on a node, it keeps running even if the node later becomes ineligible for scheduling that pod. As for required and preferred, these represent the notion of hard and soft constraints, respectively. For a pod with required criteria, Kubernetes will only place it on a node that satisfies all of the requirements; if no such node exists, the pod remains unscheduled. For preferred criteria, Kubernetes will try to place the pod on the node with the highest preference score, but the pod can still be scheduled even when no node matches any preference. The preference score is calculated from a configurable weight attached to each term of the requirement: for every node that already satisfies all required conditions, Kubernetes iterates through the preferred terms and sums the weights of the matched terms to get that node's score. Take a look at the following example:
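The example referred to above appears as a figure in the original. A manifest consistent with the scoring described in the next paragraph could look like the following sketch; the three preference terms, their weights (20, 10, and 5), and the label values are assumptions chosen so that a node labeled instance_type=medium and region=NA scores 10 + 5 = 15:

apiVersion: v1
kind: Pod
metadata:
  name: prefer-medium-na       # hypothetical name for illustration
spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      # term 1: assumed weight for large instances
      - weight: 20
        preference:
          matchExpressions:
          - key: instance_type
            operator: In
            values: ["large"]
      # term 2: assumed weight for medium instances
      - weight: 10
        preference:
          matchExpressions:
          - key: instance_type
            operator: In
            values: ["medium"]
      # term 3: assumed weight for the NA region
      - weight: 5
        preference:
          matchExpressions:
          - key: region
            operator: In
            values: ["NA"]
  containers:
  - name: main
    image: nginx

A node carrying instance_type=medium and region=NA matches the second and third terms, so its score is 10 + 5 = 15; a node with only region=NA would score 5.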

The pod has three weighted preferences on two keys: instance_type and region. When scheduling the pod, the scheduler matches these preferences against the labels on each candidate node. In this example, since Node 2 has the instance_type=medium and region=NA labels, it gets a score of 15, which is the highest score of all nodes. For this reason, the scheduler places the pod on Node 2.

There are differences between the configuration for node affinity and inter-pod affinity. Let's discuss these separately.
