To set up a Kubernetes cluster federation we need to run the components of the control plane, which are as follows:
etcd
federation-apiserver
federation-controller-manager
One of the easiest ways to do that is to use the all-in-one hyperkube image:
https://github.com/kubernetes/kubernetes/tree/master/cluster/images/hyperkube.
The federation API server and the federation controller manager can be run as pods in an existing Kubernetes cluster, but as discussed earlier it is better from a fault tolerance and high availability point of view to run them in their own cluster.
First, you must have Docker running, and you'll need a Kubernetes release that contains the scripts we will use in this guide. The current release is 1.5.3, but you can download the latest available version instead:
> curl -L https://github.com/kubernetes/kubernetes/releases/download/v1.5.3/kubernetes.tar.gz | tar xvzf -
> cd kubernetes
We need to create a directory for the federation config files and set the FEDERATION_OUTPUT_ROOT environment variable to that directory. For easy cleanup, it's best to create a new directory:
> export FEDERATION_OUTPUT_ROOT="${PWD}/output/federation"
> mkdir -p "${FEDERATION_OUTPUT_ROOT}"
Now, we can initialize the federation:
> federation/deploy/deploy.sh init
As part of every Kubernetes release, official release images are pushed to gcr.io/google_containers. To use the images in this repository, you can set the container image fields in the config files in ${FEDERATION_OUTPUT_ROOT} to point to the gcr.io/google_containers/hyperkube image, which includes both the federation-apiserver and federation-controller-manager binaries.
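For example, here is a hypothetical excerpt from a federation-apiserver deployment manifest under ${FEDERATION_OUTPUT_ROOT} (the exact file layout, entry point, and image tag depend on your release, so treat this as a sketch):

# Hypothetical excerpt; adjust the image tag to match your release.
spec:
  containers:
  - name: federation-apiserver
    image: gcr.io/google_containers/hyperkube:v1.5.3
    command:
    - /hyperkube
    - federation-apiserver    # the hyperkube image bundles this binary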
We're ready to deploy the federation control plane by running the following command:
> federation/deploy/deploy.sh deploy_federation
The command will launch the control plane components as pods and create a service of type LoadBalancer for the federation API server, as well as a persistent volume claim backed by a dynamically provisioned persistent volume for etcd.
To verify everything was created correctly in the federation namespace, type the following:
> kubectl get deployments --namespace=federation
You should see this:
NAME                            DESIRED   CURRENT   UP-TO-DATE
federation-apiserver            1         1         1
federation-controller-manager   1         1         1
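To inspect the LoadBalancer service and the etcd volume claim as well (the exact resource names may vary between releases), list them with the following:

> kubectl get services,pvc --namespace=federation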
You can also check your kubeconfig file for new entries via kubectl config view. Note that dynamic provisioning works only for AWS and GCE at the moment.
To register a cluster with the federation, we need a secret to talk to the cluster. Let's create the secret in the host Kubernetes cluster. Suppose the kubeconfig of the target cluster is at /cluster-1/kubeconfig. You can run the following command to create the secret:
> kubectl create secret generic cluster-1 --namespace=federation --from-file=/cluster-1/kubeconfig
The configuration for the cluster looks like this:
apiVersion: federation/v1beta1
kind: Cluster
metadata:
  name: cluster1
spec:
  serverAddressByClientCIDRs:
  - clientCIDR: <client-cidr>
    serverAddress: <apiserver-address>
  secretRef:
    name: <secret-name>
We need to set <client-cidr>, <apiserver-address>, and <secret-name>. The <secret-name> here is the name of the secret that you just created. serverAddressByClientCIDRs contains the various server addresses that clients can use, as per their CIDR. We can set the server's public IP address with the CIDR 0.0.0.0/0, which all clients will match. In addition, if you want internal clients to use the server's clusterIP, you can set that as the serverAddress. The client CIDR in that case will be a CIDR that only matches the IPs of pods running in that cluster.
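For example, a filled-in cluster definition might look like this (all of the addresses and CIDRs here are hypothetical):

apiVersion: federation/v1beta1
kind: Cluster
metadata:
  name: cluster1
spec:
  serverAddressByClientCIDRs:
  # External clients fall through to the catch-all CIDR and use the public address.
  - clientCIDR: "0.0.0.0/0"
    serverAddress: "104.196.1.1"    # hypothetical public IP of the API server
  # Pods inside the cluster match their pod CIDR and use the internal clusterIP.
  - clientCIDR: "10.244.0.0/16"     # hypothetical pod CIDR
    serverAddress: "10.0.0.1"       # hypothetical clusterIP
  secretRef:
    name: cluster-1

With the placeholders filled in, create the cluster object: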
> kubectl create -f /cluster-1/cluster.yaml --context=federation-cluster
Let's see if the cluster has been registered properly:
> kubectl get clusters --context=federation-cluster
NAME        STATUS    VERSION   AGE
cluster-1   Ready               1m
The cluster is registered with the federation. It's time to update kube-dns so that your cluster can route federation service requests. As of Kubernetes 1.5, this is done by passing the --federations flag to kube-dns via the kube-dns ConfigMap:
--federations=${FEDERATION_NAME}=${DNS_DOMAIN_NAME}
Here is what the ConfigMap looks like:
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-dns
  namespace: kube-system
data:
  federations: <federation-name>=<federation-domain-name>
Replace <federation-name> and <federation-domain-name> with the correct values.
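For example, with a federation named federation and the kubernetes-ftw.com domain used later in this chapter, the data entry would read:

federations: federation=kubernetes-ftw.com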
If you want to shut down the federation, just run the following command:
> federation/deploy/deploy.sh destroy_federation
Kubernetes 1.5 has a new command-line tool (still in alpha) called Kubefed to help you administer your federated clusters. The job of Kubefed is to make it easy to deploy a new Kubernetes cluster federation control plane, and to add or remove clusters from an existing federation control plane.
Kubefed is part of the Kubernetes client binaries. You can get them here:
https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG.md
You'll get the latest Kubectl and Kubefed. Here are the instructions for downloading and installing version 1.5.3 on Linux:
> curl -O https://storage.googleapis.com/kubernetes-release/release/v1.5.3/kubernetes-client-linux-amd64.tar.gz
> tar -xzvf kubernetes-client-linux-amd64.tar.gz
> sudo cp kubernetes/client/bin/kubefed /usr/local/bin
> sudo chmod +x /usr/local/bin/kubefed
> sudo cp kubernetes/client/bin/kubectl /usr/local/bin
> sudo chmod +x /usr/local/bin/kubectl
Make the necessary adjustments if you're using a different OS or want to install a different version.
The federation control plane can be its own dedicated cluster or hosted with an existing cluster. You need to make this decision. The host cluster hosts the components that make up your federation control plane. Ensure that you have a kubeconfig entry in your local kubeconfig that corresponds to the host cluster. To verify that you have the required kubeconfig entry, type the following:
> kubectl config get-contexts
You should see something like this:
CURRENT   NAME        CLUSTER     AUTHINFO    NAMESPACE
          cluster-1   cluster-1   cluster-1
The context name cluster-1 will be provided later when deploying the federation control plane.
It's time to start using Kubefed. The kubefed init command requires three arguments:
The federation name
The host cluster context
A DNS zone name for the federated services
The following example command deploys a federation control plane with the name federation; a host cluster context, cluster-1; and the domain suffix kubernetes-ftw.com:
> kubefed init federation --host-cluster-context=cluster-1 --dns-zone-name="kubernetes-ftw.com"
The DNS suffix should be for a DNS domain you manage, of course.
kubefed init sets up the federation control plane in the host cluster and adds an entry for the federation API server in your local kubeconfig. In the alpha release of Kubernetes 1.5, it doesn't set the current context to the newly deployed federation. You'll have to do it yourself. Type the following command:
> kubectl config use-context federation
Once the control plane has been deployed successfully, we should add some Kubernetes clusters to the federation. Kubefed provides the join command exactly for this purpose. The kubefed join command requires the following arguments:
The name of the cluster to add
The host cluster context
For example, to add a new cluster called cluster-2 to the federation, type the following:
> kubefed join cluster-2 --host-cluster-context=cluster-1
The cluster name you supply to kubefed join must be a valid RFC 1035 label. RFC 1035 allows only letters, digits, and hyphens, and the label must start with a letter.
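If you want a quick sanity check before joining, the following pattern mirrors the lowercase RFC 1035 label rules that Kubernetes enforces (the cluster name here is just an example):

# Prints "valid" only for a lowercase RFC 1035 label: a letter first, then letters, digits, or hyphens.
> echo "cluster-2" | grep -qE '^[a-z]([-a-z0-9]*[a-z0-9])?$' && echo valid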
Furthermore, the federation control plane requires credentials of the joined clusters to operate on them. These credentials are obtained from the local kubeconfig. The kubefed join command uses the cluster name specified as the argument to look for the cluster's context in the local kubeconfig. If it fails to find a matching context, it exits with an error.
This might cause issues in cases where the context names of the clusters in the federation don't follow RFC 1035 label naming rules. In such cases, you can specify a cluster name that conforms to the RFC 1035 label naming rules and specify the cluster context using the --cluster-context flag. For example, if the context of the cluster you are joining is cluster_3 (an underscore is not allowed in a cluster name), you can join the cluster by running this:
> kubefed join cluster-3 --host-cluster-context=cluster-1 --cluster-context=cluster_3
Cluster credentials required by the federation control plane as described in the previous section are stored as a secret in the host cluster. The name of the secret is also derived from the cluster name.
However, the name of a secret object in Kubernetes should conform to the DNS subdomain name specification described in RFC 1123. If the name derived from the cluster name doesn't conform, you can pass a valid secret name to kubefed join using the --secret-name flag. For example, if the cluster name is cluster-4 and the secret name is 4secret (valid under RFC 1123, which, unlike RFC 1035, allows a name to start with a digit), you can join the cluster by running this:
> kubefed join cluster-4 --host-cluster-context=cluster-1 --secret-name=4secret
The kubefed join command automatically creates the secret for you.
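If you want to verify this, the secret is stored in the federation system namespace of the host cluster (the federation-system namespace name here assumes the default kubefed layout):

> kubectl --context=cluster-1 get secrets --namespace=federation-system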
To remove a cluster from a federation, run the kubefed unjoin command with the cluster name and the federation's host cluster context:
> kubefed unjoin cluster-2 --host-cluster-context=cluster-1
Proper cleanup of the federation control plane is not fully implemented in this alpha release of Kubefed. However, for the time being, deleting the federation system namespace should remove all the resources except the persistent storage volume dynamically provisioned for the federation control plane's etcd. You can delete the federation namespace by running the following command:
> kubectl delete ns federation-system
The Kubernetes cluster federation often manages a federated object in the control plane, as well as corresponding objects in each member Kubernetes cluster. A cascading delete of a federated object means that the corresponding objects in the member Kubernetes clusters will also be deleted.
This doesn't happen automatically. By default, only the federation control plane object is deleted. To activate cascading delete, you need to set the following option:
DeleteOptions.orphanDependents=false
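As a sketch of what this looks like at the API level (the server address and the my-secret name are placeholders, and you would add whatever authentication your setup requires), you can pass DeleteOptions in the body of a DELETE request sent directly to the federation API server:

# Hypothetical example: cascading delete of a federated secret named my-secret.
> curl -X DELETE \
    -H "Content-Type: application/json" \
    -d '{"kind": "DeleteOptions", "apiVersion": "v1", "orphanDependents": false}' \
    https://federation-apiserver.example.com/api/v1/namespaces/default/secrets/my-secret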
The following federated objects support cascading delete:
Secrets
For other objects, you'll have to go into each cluster and delete them explicitly.

Dynamic load balancing across clusters is not trivial. The simplest solution is to say that it is not Kubernetes' responsibility; load balancing will be performed outside the Kubernetes cluster federation. But given the dynamic nature of Kubernetes, even an external load balancer will have to gather a lot of information about which services and backend pods are running on each cluster. An alternative solution is for the federation control plane to implement an L7 load balancer that serves as a traffic director for the entire federation. In one of the simpler use cases, each service runs on a dedicated cluster and the load balancer simply routes all traffic to that cluster. In the case of a cluster failure, the service is migrated to a different cluster and the load balancer then routes all traffic to the new cluster. This provides a coarse failover and high-availability solution at the cluster level.
The optimal solution will be able to support federated services and take into account additional factors, such as the following:
The following diagram shows how an L7 load balancer on GCE distributes client requests to the closest cluster:
Federated failover is tricky. Suppose a cluster in the federation fails; one option is to just have the other clusters pick up the slack. The question then is how to distribute the load across the other clusters:
Each of these solutions has subtle interactions with federated load balancing, geo-distributed high availability, cost management across different clusters, and security.
Now, the failed cluster comes back online. Should it gradually take over its original workload again? What if it comes back but with reduced capacity or sketchy networking? There are many combinations of failure modes that could make recovery complicated.
Federated service discovery is tightly coupled with federated load balancing. A pragmatic setup includes a global L7 load balancer that distributes requests to federated ingress objects in the federation clusters.
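As a minimal sketch (the names here are hypothetical, and this assumes a federated service called my-service already exists), a federated ingress is an ordinary ingress object created through the federation API server:

# ingress.yaml -- hypothetical federated ingress backed by a federated service
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: my-service-ingress
spec:
  backend:
    serviceName: my-service
    servicePort: 80

Creating it with kubectl --context=federation create -f ingress.yaml should cause the federation control plane to create matching ingress objects in the member clusters and program the global load balancer accordingly.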
The benefit of this approach is that the control stays with the Kubernetes federation, which over time will be able to work with more cluster types (currently just AWS and GCE) and understand cluster utilization and other constraints. The alternative, having a dedicated lookup service and letting clients connect directly to services on individual clusters, loses all these benefits.
Federated migration is related to several topics we discussed, such as location affinity, federated scheduling, and high availability. At the core, federated migration means moving a whole application or some part of it from one cluster to another (and more generally from M clusters to N clusters). Federation migration can happen in response to various events, such as the following:
Strictly-decoupled applications can be trivially moved, in part or in whole, one pod at a time, to one or more clusters (within applicable policy constraints, for example, PrivateCloudOnly).
For preferentially-coupled applications, the federation system must first locate a single cluster with sufficient capacity to accommodate the entire application, then reserve that capacity and incrementally move the application, one (or more) resource at a time, over to the new cluster within some bounded time period (and possibly within a predefined maintenance window).
Strictly-coupled applications (with the exception of those deemed completely immovable) require the federation system to do the following:
Start up a whole replica application in the destination cluster
Copy the persistent data to the new application instance
Switch user traffic
Tear down the original application instance