Setting up cluster federation from the ground up

To set up a Kubernetes cluster federation we need to run the components of the control plane, which are as follows:

  • etcd
  • federation-apiserver
  • federation-controller-manager

One of the easiest ways to do that is to use the all-in-one hyperkube image:

https://github.com/kubernetes/kubernetes/tree/master/cluster/images/hyperkube.

The federation API server and the federation controller manager can be run as pods in an existing Kubernetes cluster but, as discussed earlier, it is better from a fault-tolerance and high-availability point of view to run them in their own cluster.

Initial setup

First, you must have Docker running and a Kubernetes release that contains the scripts we will use in this guide. The release used here is 1.5.3, but you can download the latest available version instead:

> curl -L https://github.com/kubernetes/kubernetes/releases/download/v1.5.3/kubernetes.tar.gz | tar xvzf -
> cd kubernetes

We need to create a directory for the federation config files and set the FEDERATION_OUTPUT_ROOT environment variable to that directory. For easy cleanup, it's best to create a new directory:

> export FEDERATION_OUTPUT_ROOT="${PWD}/output/federation"
> mkdir -p "${FEDERATION_OUTPUT_ROOT}"

Now, we can initialize the federation:

> federation/deploy/deploy.sh init

Using the official hyperkube image

As part of every Kubernetes release, official release images are pushed to gcr.io/google_containers. To use the images in this repository, you can set the container image fields in the config files in ${FEDERATION_OUTPUT_ROOT} to point to the gcr.io/google_containers/hyperkube image, which includes both the federation-apiserver and federation-controller-manager binaries.
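For illustration, here is a minimal sketch of what the container section of the federation-apiserver config might look like after pointing it at the official image (assuming version v1.5.3; the exact file names, surrounding fields, and command-line flags depend on what deploy.sh generated in ${FEDERATION_OUTPUT_ROOT}):

# Hypothetical fragment -- only the image field matters here; keep the
# command and flags that deploy.sh generated for your environment.
containers:
- name: federation-apiserver
  image: gcr.io/google_containers/hyperkube:v1.5.3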

Running the federation control plane

We're ready to deploy the federation control plane by running the following command:

> federation/deploy/deploy.sh deploy_federation

The command will launch the control plane components as pods and create a service of type LoadBalancer for the federation API server, as well as a persistent volume claim backed by a dynamically provisioned persistent volume for etcd.
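If you'd like to confirm the service and the claim as well, you can list them in the same namespace (the exact resource names depend on the deployment scripts):

> kubectl get services --namespace=federation
> kubectl get pvc --namespace=federation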

To verify everything was created correctly in the federation namespace, type the following:

> kubectl get deployments --namespace=federation

You should see this:

NAME                            DESIRED   CURRENT   UP-TO-DATE
federation-apiserver            1         1         1
federation-controller-manager   1         1         1

You can also check your kubeconfig file for new entries via kubectl config view. Note that dynamic provisioning works only for AWS and GCE at the moment.

Registering Kubernetes clusters with federation

To register a cluster with the federation, we need a secret to talk to the cluster. Let's create the secret in the host Kubernetes cluster. Suppose the kubeconfig of the target cluster is at /cluster-1/kubeconfig. You can run the following command to create the secret:

> kubectl create secret generic cluster-1 --namespace=federation \
  --from-file=/cluster-1/kubeconfig

The configuration for the cluster looks like this:

apiVersion: federation/v1beta1
kind: Cluster
metadata:
  name: cluster-1
spec:
  serverAddressByClientCIDRs:
  - clientCIDR: <client-cidr>
    serverAddress: <apiserver-address>
  secretRef:
    name: <secret-name>

We need to set <client-cidr>, <apiserver-address>, and <secret-name>. <secret-name> here is name of the secret that you just created. serverAddressByClientCIDRs contains the various server addresses that clients can use as per their CIDR. We can set the server's public IP address with CIDR 0.0.0.0/0, which all clients will match. In addition, if you want internal clients to use the server's clusterIP, you can set that as serverAddress. The client CIDR in that case will be a CIDR that only matches IPs of pods running in that cluster.
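As an illustration, a filled-in /cluster-1/cluster.yaml might look like the following; the API server address is a made-up public endpoint and the secret name matches the secret created above:

apiVersion: federation/v1beta1
kind: Cluster
metadata:
  name: cluster-1
spec:
  serverAddressByClientCIDRs:
  - clientCIDR: 0.0.0.0/0
    serverAddress: https://1.2.3.4    # illustrative public address of cluster-1's API server
  secretRef:
    name: cluster-1                   # the secret created earlier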

Let's register the cluster:

> kubectl create -f /cluster-1/cluster.yaml --context=federation-cluster

Let's see if the cluster has been registered properly:

> kubectl get clusters --context=federation-cluster
NAME       STATUS    VERSION   AGE
cluster-1   Ready               1m

Updating KubeDNS

The cluster is registered with the federation. It's time to update kube-dns so that your cluster can route federation service requests. As of Kubernetes 1.5, this is done by passing the --federations flag to kube-dns via the kube-dns ConfigMap:

--federations=${FEDERATION_NAME}=${DNS_DOMAIN_NAME}

Here is what the ConfigMap looks like:

apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-dns
  namespace: kube-system
data:
  federations: <federation-name>=<federation-domain-name>

Replace <federation-name> and <federation-domain-name> with the correct values.
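For example, with a federation named federation and the domain kubernetes-ftw.com (the names used later in this chapter), the data entry becomes federations: federation=kubernetes-ftw.com. Assuming you saved the updated ConfigMap as kube-dns-configmap.yaml (a hypothetical file name), you would apply it to each cluster in the federation:

> kubectl apply -f kube-dns-configmap.yaml --context=cluster-1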

Shutting down the federation

If you want to shut down the federation, just run the following command:

federation/deploy/deploy.sh destroy_federation

Setting up cluster federation with Kubefed

Kubernetes 1.5 has a new command-line tool (still in alpha) called Kubefed to help you administer your federated clusters. The job of Kubefed is to make it easy to deploy a new Kubernetes cluster federation control plane, and to add or remove clusters from an existing federation control plane.

Getting Kubefed

Kubefed is part of the Kubernetes client binaries. You can get them here:

https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG.md

You'll get the latest kubectl and kubefed. Here are the instructions for downloading and installing version 1.5.3 on Linux:

curl -O https://storage.googleapis.com/kubernetes-release/release/v1.5.3/kubernetes-client-linux-amd64.tar.gz
tar -xzvf kubernetes-client-linux-amd64.tar.gz
sudo cp kubernetes/client/bin/kubefed /usr/local/bin
sudo chmod +x /usr/local/bin/kubefed
sudo cp kubernetes/client/bin/kubectl /usr/local/bin
sudo chmod +x /usr/local/bin/kubectl
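To confirm that the binaries are installed and on your PATH, you can run a couple of quick checks:

kubectl version --client
kubefed --help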

Make the necessary adjustments if you're using a different OS or want to install a different version.

Choosing a host cluster

The federation control plane can be its own dedicated cluster or hosted with an existing cluster. You need to make this decision. The host cluster hosts the components that make up your federation control plane. Ensure that you have a kubeconfig entry in your local kubeconfig that corresponds to the host cluster. To verify that you have the required kubeconfig entry, type the following:

> kubectl config get-contexts

You should see something like this:

CURRENT   NAME      CLUSTER   AUTHINFO  NAMESPACE
          cluster-1 cluster-1 cluster-1

The context name cluster-1 will be provided later when deploying the federation control plane.

Deploying a federation control plane

It's time to start using Kubefed. The kubefed init command requires three arguments:

  • The federation name
  • Host cluster context
  • A domain name suffix for your federated services

The following example command deploys a federation control plane with the name federation; a host cluster context, cluster-1; and the domain suffix kubernetes-ftw.com:

> kubefed init federation --host-cluster-context=cluster-1 --dns-zone-name="kubernetes-ftw.com"

The DNS suffix should be for a DNS domain you manage, of course.

kubefed init sets up the federation control plane in the host cluster and adds an entry for the federation API server in your local kubeconfig. In the alpha release of Kubernetes 1.5, it doesn't set the current context to the newly deployed federation. You'll have to do it yourself. Type the following command:

kubectl config use-context federation
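You can verify that the switch took effect; the command should now print federation:

kubectl config current-context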

Adding a cluster to a federation

Once the control plane has been deployed successfully, we should add some Kubernetes clusters to the federation. Kubefed provides the join command exactly for this purpose. The kubefed join command requires the following arguments:

  • The name of the cluster to add
  • Host cluster context

For example, to add a new cluster called cluster-2 to the federation, type the following:

kubefed join cluster-2 --host-cluster-context=cluster-1

Naming rules and customization

The cluster name you supply to kubefed join must be a valid RFC 1035 label. RFC 1035 allows only letters, digits, and hyphens, and the label must start with a letter.

Furthermore, the federation control plane requires credentials of the joined clusters to operate on them. These credentials are obtained from the local kubeconfig. The kubefed join command uses the cluster name specified as the argument to look for the cluster's context in the local kubeconfig. If it fails to find a matching context, it exits with an error.

This might cause issues in cases where context names for each cluster in the federation don't follow RFC 1035 label naming rules. In such cases, you can specify a cluster name that conforms to the RFC 1035 label naming rules and specify the cluster context using the --cluster-context flag. For example, if the context of the cluster you are joining is cluster_3 (the underscore is not allowed in a cluster name), you can join the cluster by running this:

kubefed join cluster-3 --host-cluster-context=cluster-1 --cluster-context=cluster_3

Secret name

The cluster credentials required by the federation control plane, as described in the previous section, are stored as a secret in the host cluster. The name of the secret is also derived from the cluster name.

However, the name of a secret object in Kubernetes should conform to the DNS subdomain name specification described in RFC 1123. If this isn't the case, you can pass the secret name to kubefed join using the --secret-name flag. For example, if the cluster name is cluster-4 and the secret name is 4secret (starting with a digit is not allowed), you can join the cluster by running this:

kubefed join cluster-4 --host-cluster-context=cluster-1 --secret-name=4secret

The kubefed join command automatically creates the secret for you.
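If you want to see the secret that kubefed join created, you can look for it in the host cluster; with kubefed, the control plane components live in the federation-system namespace:

kubectl get secrets --namespace=federation-system --context=cluster-1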

Removing a cluster from a federation

To remove a cluster from a federation, run the kubefed unjoin command with the cluster name and the federation's host cluster context:

kubefed unjoin cluster-2 --host-cluster-context=cluster-1

Shutting down the federation

Proper cleanup of the federation control plane is not fully implemented in this alpha release of Kubefed. However, for the time being, deleting the federation system namespace should remove all the resources except the persistent storage volume dynamically provisioned for the federation control plane's etcd. You can delete the federation namespace by running the following command:

> kubectl delete ns federation-system
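The dynamically provisioned persistent volume that backed the control plane's etcd may be left behind. You can find it and remove it manually (the volume name will differ in your environment):

> kubectl get pv
> kubectl delete pv <pv-name>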

Cascading delete of resources

The Kubernetes cluster federation often manages a federated object in the control plane, as well as corresponding objects in each member Kubernetes cluster. A cascading delete of a federated object means that the corresponding objects in the member Kubernetes clusters will also be deleted.

This doesn't happen automatically. By default, only the federation control plane object is deleted. To activate cascading delete, you need to set the following option:

DeleteOptions.orphanDependents=false
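One way to set this option is to send the delete request directly to the federation API server, for example through kubectl proxy. The following is a rough sketch, assuming a federation context named federation and a federated Deployment called nginx in the default namespace:

> kubectl proxy --context=federation --port=8080 &
> curl -X DELETE -H "Content-Type: application/json" \
  -d '{"kind": "DeleteOptions", "apiVersion": "v1", "orphanDependents": false}' \
  http://localhost:8080/apis/extensions/v1beta1/namespaces/default/deployments/nginx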

The following federated objects support cascading delete:

  • Deployments
  • DaemonSets
  • Ingress
  • Namespaces
  • ReplicaSets
  • Secrets

For other objects, you'll have to go into each cluster and delete them explicitly.

Load balancing across multiple clusters

Dynamic load balancing across clusters is not trivial. The simplest solution is to just say that it is not Kubernetes' responsibility. Load balancing will be performed outside the Kubernetes cluster federation. But given the dynamic nature of Kubernetes, even an external load balancer will have to gather a lot of information about which services and backend pods are running on each cluster. An alternative solution is for the federation control plane to implement an L7 load balancer that serves as traffic director for the entire federation. In one of the simpler use cases, each service runs on a dedicated cluster and the load balancer simply routes all traffic to that cluster. In case of cluster failure, the service is migrated to a different cluster and the load balancer now routes all traffic to the new cluster. This provides a coarse fail-over and high availability solution at the cluster level.

The optimal solution will be able to support federated services and take into account additional factors, such as the following:

  • Geo-location of client
  • Resource utilization of each cluster
  • Resource quotas and auto-scaling

The following diagram shows how an L7 load balancer on GCE distributes client requests to the closest cluster:

(Figure: Load balancing across multiple clusters)

Failing over across multiple clusters

Federated failover is tricky. Suppose a cluster in the federation fails; one option is to just have other clusters pick up the slack. Now, the question is, how do you distribute the load across other clusters:

  • Uniformly?
  • Launch a new cluster?
  • Pick an existing cluster as close as possible (maybe in the same region)?

Each of these solutions has subtle interactions with federated load balancing, geo-distributed high availability, cost management across different clusters, and security.

Now, the failed cluster comes back online. Should it gradually take over its original workload again? What if it comes back but with reduced capacity or sketchy networking? There are many combinations of failure modes that could make recovery complicated.

Federated service discovery

Federated service discovery is tightly coupled with federated load balancing. A pragmatic setup includes a global L7 load balancer that distributes requests to federated ingress objects in the federation clusters.

The benefit of this approach is that the control stays with the Kubernetes federation, which over time will be able to work with more cluster types (currently just AWS and GCE) and understand cluster utilization and other constraints.

The alternative, having a dedicated lookup service and letting clients connect directly to services on individual clusters, loses all these benefits.

Federated migration

Federated migration is related to several topics we discussed, such as location affinity, federated scheduling, and high availability. At its core, federated migration means moving a whole application or some part of it from one cluster to another (and more generally from M clusters to N clusters). Federated migration can happen in response to various events, such as the following:

  • A low capacity event in a cluster (or a cluster failure)
  • A change of scheduling policy (we no longer use cloud provider X)
  • A change of resource pricing (cloud provider Y dropped their prices - let's migrate there)
  • A new cluster was added to or removed from the federation (let's rebalance the pods of the application)

Strictly-decoupled applications can be trivially moved, in part or in whole, one pod at a time, to one or more clusters (within applicable policy constraints, for example PrivateCloudOnly).

For preferentially-coupled applications, the federation system must first locate a single cluster with sufficient capacity to accommodate the entire application, then reserve that capacity and incrementally move the application, one (or more) resource at a time, over to the new cluster within some bounded time period (and possibly within a predefined maintenance window).

Strictly-coupled applications (with the exception of those deemed completely immovable) require the federation system to do the following:

  • Start up an entire replica application in the destination cluster
  • Copy persistent data to the new application instance (possibly before starting pods)
  • Switch user traffic across
  • Tear down the original application instance