Load balancing options

Load balancing is a critical capability in dynamic systems such as a Kubernetes cluster. Nodes, VMs, and pods come and go, but the clients can't keep track of which individual entities can service their requests. Even if they could, it would require a complicated dance of managing a dynamic map of the cluster, refreshing it frequently, and handling disconnected, unresponsive, or just slow nodes. Load balancing is a battle-tested and well-understood mechanism that adds a layer of indirection that hides the internal turmoil from the clients or consumers outside the cluster. There are options for external as well as internal load balancers. You can also mix and match and use both. The hybrid approach has its own particular pros and cons, such as performance versus flexibility.

External load balancer

An external load balancer runs outside the Kubernetes cluster. For Kubernetes to make use of one, there must be an external load balancer provider that Kubernetes can interact with to configure the load balancer with health checks and firewall rules, and to retrieve the load balancer's external IP address.

The following diagram shows the connection between the load balancer (in the cloud), the Kubernetes API server, and the cluster nodes. The external load balancer has an up-to-date picture of which pods run on which nodes and it can direct external service traffic to the right pods:

External load balancer

Configuring an external load balancer

The external load balancer is configured via the service configuration file or directly through kubectl. We use a service type of LoadBalancer instead of a service type of ClusterIP, which only exposes the service inside the cluster. The LoadBalancer type depends on an external load balancer provider being properly installed and configured in the cluster. Google's GKE is the most well-tested provider, but other cloud platforms provide their own integrated solution on top of their cloud load balancer.

Via configuration file

Here is an example service configuration file that accomplishes this goal:

{
  "kind": "Service",
  "apiVersion": "v1",
  "metadata": {
    "name": "example-service"
  },
  "spec": {
    "ports": [{
      "port": 8765,
      "targetPort": 9376
    }],
    "selector": {
      "app": "example"
    },
    "type": "LoadBalancer"
  }
}
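
Assuming the file is saved as example-service.json (the filename is illustrative), the service can be created with:

> kubectl create -f example-service.json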

Via kubectl

You may also accomplish the same result using a direct kubectl command:

> kubectl expose rc example --port=8765 --target-port=9376 \
--name=example-service --type=LoadBalancer

The decision whether to use a service configuration file or a kubectl command is usually determined by the way you set up the rest of your infrastructure and deploy your system. Configuration files are more declarative and arguably more appropriate for production usage, where you want a versioned, auditable, and repeatable way to manage your infrastructure.

Finding the load balancer IP addresses

The load balancer will have two IP addresses of interest. The internal IP address can be used inside the cluster to access the service. The external IP address is the one clients outside the cluster will use. It's a good practice to create a DNS entry for the external IP address. To get both addresses, use the kubectl describe command. IP will denote the internal IP address and LoadBalancer Ingress will denote the external IP address:

> kubectl describe services example-service
    Name:  example-service
    Selector:   app=example
    Type:     LoadBalancer
    IP:     10.67.252.103
    LoadBalancer Ingress: 123.45.67.89
    Port:     <unnamed> 80/TCP
    NodePort:   <unnamed> 32445/TCP
    Endpoints:    10.64.0.4:80,10.64.1.5:80,10.64.2.4:80
    Session Affinity: None
    No events.
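
If you only need the external IP address, for example in a script, you can extract it with a jsonpath query (a minimal sketch; the field path assumes your cloud provider reports a plain IP rather than a hostname):

> kubectl get service example-service \
-o jsonpath='{.status.loadBalancer.ingress[0].ip}'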

Identifying client IP addresses

Sometimes, the service may be interested in the source IP address of the clients. Up until Kubernetes 1.5, this information wasn't available. In Kubernetes 1.5, there is a beta feature available only on GKE via an annotation to get the source IP address. In future versions, the capability will be added to other cloud platforms.

Annotating the load balancer for client IP address preservation

Here's how to annotate a service configuration file with the OnlyLocal annotation that triggers the preservation of the client source IP address:

{
  "kind": "Service",
  "apiVersion": "v1",
  "metadata": {
    "name": "example-service",
    "annotations": {
        "service.beta.kubernetes.io/external-traffic": "OnlyLocal"
    }
  },
  "spec": {
    "ports": [{
      "port": 8765,
      "targetPort": 9376
    }],
    "selector": {
      "app": "example"
    },
    "type": "LoadBalancer"
  }
}
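
Alternatively, you can annotate an existing service in place with a kubectl command (using the same service name as above):

> kubectl annotate service example-service \
service.beta.kubernetes.io/external-traffic=OnlyLocal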

Understanding the potential for uneven external load balancing

External load balancers operate at the node level. While the traffic ultimately reaches a particular pod, the load distribution is done at the node level. That means that if your service has four pods, three on node A and one on node B, an external load balancer is likely to divide the load evenly between node A and node B. The three pods on node A will then handle half of the load (1/6 each), while the single pod on node B handles the other half of the load on its own. Weights may be added in the future to address this issue.

Service load balancer

Service load balancing is designed for funneling internal traffic within the Kubernetes cluster, not for external load balancing. This is done by using a service type of ClusterIP. It is possible to expose a service load balancer directly via a preallocated port by using a service type of NodePort and use it as an external load balancer, but it wasn't designed for that use case. For example, desirable features such as SSL negotiation and HTTP caching will not be readily available.
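
As a minimal sketch of that (not recommended) approach, here is a NodePort service that pins a specific node port; the port numbers and the label selector are illustrative:

apiVersion: v1
kind: Service
metadata:
  name: example-nodeport
spec:
  type: NodePort
  selector:
    app: example
  ports:
  - port: 8765
    targetPort: 9376
    nodePort: 30080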

The following diagram shows how the service load balancer (the yellow clouds) can route traffic to one of the backend pods it manages (via labels of course):

Service load balancer

Ingress

Ingress in Kubernetes is at its core a set of rules that allow inbound connections to reach cluster services. In addition, some ingress controllers support the following:

  • Connection algorithms
  • Request limits
  • URL rewrites and redirects
  • TCP/UDP load balancing
  • Access control and authorization

Ingress is specified using an ingress resource and serviced by an ingress controller. It's important to note that ingress is still in beta and doesn't yet surface all the necessary capabilities. Here is an example of an ingress resource that manages traffic into two services. The rules map the externally visible http://foo.bar.com/foo to the s1 service and http://foo.bar.com/bar to the s2 service:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: test
spec:
  rules:
  - host: foo.bar.com
    http:
      paths:
      - path: /foo
        backend:
          serviceName: s1
          servicePort: 80
      - path: /bar
        backend:
          serviceName: s2
          servicePort: 80
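
Assuming the resource is saved as test-ingress.yaml (the filename is illustrative), you can create it and then check which address the ingress controller assigned to it:

> kubectl create -f test-ingress.yaml
> kubectl get ingress test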

There are two ingress controllers right now. One of them is an L7 ingress controller for GCE only. The other is a more general-purpose Nginx ingress controller that lets you configure Nginx via a ConfigMap. The Nginx ingress controller is very sophisticated and brings to bear a lot of features that are not available yet via the ingress resource directly. It uses the endpoints API to directly forward traffic to pods. For a detailed review, check out https://github.com/kubernetes/ingress/tree/master/controllers/nginx.
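
For example, the Nginx ingress controller reads its tuning options from a ConfigMap along these lines (a sketch; the ConfigMap name and the exact keys supported depend on the controller version):

apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-load-balancer-conf
data:
  proxy-connect-timeout: "15"
  proxy-read-timeout: "600"
  proxy-send-timeout: "600"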

HAProxy

We discussed using a cloud provider external load balancer (service type LoadBalancer) and using the internal service load balancer inside the cluster (service type ClusterIP). If we want a custom external load balancer, we can create a custom external load balancer provider and use LoadBalancer, or use the third service type, NodePort. High Availability (HA) Proxy is a mature and battle-tested load balancing solution. It is considered the best choice for implementing external load balancing with on-premises clusters. This can be done in several ways:

  • Utilize NodePort and carefully manage port allocations
  • Implement custom load balancer provider interface
  • Run HAProxy inside your cluster as the only target of your frontend servers at the edge of the cluster (load balanced or not)

You can use all of these approaches with HAProxy. Regardless, it is still recommended to use ingress objects. The service-loadbalancer project is a community project that implements a load balancing solution on top of HAProxy. You can find it here: https://github.com/kubernetes/contrib/tree/master/service-loadbalancer.

Utilizing the NodePort

Each service will be allocated a dedicated port from a predefined range. This is usually a high range, 30,000 and up (Kubernetes defaults to 30000-32767), to avoid clashing with other applications using low, well-known ports. HAProxy will run outside the cluster in this case, configured with the correct port for each service. It can then forward traffic to any node, and Kubernetes, via the internal service load balancer, will route it to a proper pod (double load balancing). This is of course sub-optimal, because it introduces another hop. The way to circumvent it is to query the Endpoints API, dynamically maintain the list of backend pods for each service, and forward traffic directly to the pods.
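
To illustrate, the backend pod addresses of a service can be listed directly from the Endpoints API (a sketch, reusing the example-service name from earlier):

> kubectl get endpoints example-service \
-o jsonpath='{range .subsets[*].addresses[*]}{.ip}{"\n"}{end}'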

Custom load balancer provider using HAProxy

This approach is a little more complicated, but the benefit is that it is better integrated with Kubernetes and can make the transition between on-premises deployments and the cloud easier.

Running HAProxy inside the Kubernetes cluster

In this approach, we use the internal HAProxy load balancer inside the cluster. There may be multiple nodes running HAProxy and they will share the same configuration to map incoming requests and load balance them across the backend servers (the Apache servers in the following diagram):

Running HAProxy inside the Kubernetes cluster
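
As a minimal sketch of such a shared HAProxy configuration (the backend addresses are borrowed from the earlier endpoints output and are purely illustrative):

frontend http-in
    bind *:80
    default_backend pods

backend pods
    balance roundrobin
    server pod1 10.64.0.4:80 check
    server pod2 10.64.1.5:80 check
    server pod3 10.64.2.4:80 check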

Keepalived VIP

Keepalived Virtual IP (VIP) is not necessarily a load balancing solution of its own.

It can be a complement to the Nginx ingress controller or the HAProxy-based service load balancer. The main motivation is that pods move around in Kubernetes, including your load balancer(s). That creates a problem for clients outside the network that require a stable endpoint. DNS is often not good enough due to performance issues. Keepalived provides a high-performance virtual IP address that can serve as the address to the Nginx ingress controller or the HAProxy load balancer. Keepalived utilizes core Linux networking facilities such as IPVS (IP Virtual Server) and implements high availability via the Virtual Router Redundancy Protocol (VRRP). Everything runs at layer 4 (TCP/UDP). It takes some effort and attention to detail to configure it. Luckily, there is a Kubernetes contrib project that can get you started: https://github.com/kubernetes/contrib/tree/master/keepalived-vip.
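
The contrib project is itself driven by a ConfigMap that maps each virtual IP address to the service it should front. A minimal sketch, where the VIP and the service name are illustrative:

apiVersion: v1
kind: ConfigMap
metadata:
  name: vip-configmap
data:
  10.4.0.50: default/example-service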
