The master node serves as a kernel component in the Kubernetes system. Its duties include the following:
Three major daemons support the master in fulfilling the preceding duties; they are numbered in the following image:
As you can see, the master is the communicator between workers and clients, so a crashed master node is a serious problem. A multiple-master Kubernetes system is not only fault tolerant but also workload balanced: there is no longer a single API server handling requests from nodes and clients. Several API server daemons on separate master nodes can serve requests simultaneously and shorten the response time.
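To illustrate the load-balancing idea (a sketch only, not part of the deployment; the hostnames are made-up placeholders), a load balancer in front of several API servers can simply rotate incoming requests among them:

```python
from itertools import cycle

# Round-robin distribution of client requests across API servers.
# The hostnames below are illustrative placeholders, not from this recipe.
api_servers = cycle(["kube-master1:8080", "kube-master2:8080"])

# Four consecutive requests alternate between the two masters.
targets = [next(api_servers) for _ in range(4)]
print(targets)
```

With two masters, consecutive requests alternate between them, so neither master handles the whole load.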
The brief concepts for building a multiple-master system are listed here:
Pod master
is a new daemon installed on every master. It runs an election to decide which master runs the scheduler daemon and which master runs the controller manager daemon; both daemons could end up on the same master. In this recipe, we are going to build a two-master system; the same method applies when scaling to more masters.
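The election idea can be sketched as follows. This is a simplified stand-in, not the real podmaster code: a plain dictionary takes the place of the etcd store, and the master names are illustrative:

```python
# Sketch of the pod-master election: each master tries to claim a key in a
# shared store; whoever claims it first runs that daemon. A dict stands in
# for the real etcd cluster.

def elect(store, key, candidate):
    """Claim `key` for `candidate` if nobody holds it yet.
    Returns the name of the current holder."""
    store.setdefault(key, candidate)
    return store[key]

store = {}
# kube-master1 checks in first for the scheduler,
# kube-master2 first for the controller manager.
scheduler_master = elect(store, "scheduler", "kube-master1")
controller_master = elect(store, "controller", "kube-master2")

# A later attempt by the other master does not change the holder.
assert elect(store, "scheduler", "kube-master2") == "kube-master1"

print(scheduler_master, controller_master)
```

Note that nothing prevents the same master from winning both keys; the real podmaster additionally attaches a lease to the key so the claim expires if the holder dies.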
Now, we will guide you step by step in building a multiple-master system. Before this, you have to deploy a load balancer server for masters.
To learn about deploying the load balancer and to build the system on AWS, please check the Building the Kubernetes infrastructure in AWS recipe in Chapter 6, Building Kubernetes on AWS on how to build a master load balancer.
First, install another master node in your previous Kubernetes system, in the same environment as the original master. Then, stop the scheduler and controller manager daemon services on both masters:
```
# systemctl stop kube-scheduler
# systemctl stop kube-controller-manager
```
```
// Checking current daemon processes on master server
# service kubernetes-master status
kube-apiserver (pid 3137) is running...
kube-scheduler (pid 3138) is running...
kube-controller-manager (pid 3136) is running...
# service kubernetes-master stop
Shutting down /usr/local/bin/kube-controller-manager:  [  OK  ]
Shutting down /usr/local/bin/kube-scheduler:           [  OK  ]
Shutting down /usr/local/bin/kube-apiserver:           [  OK  ]

// Or, for the "hyperkube" command with an init script, we comment out the
// scheduler and controller manager daemons, leaving only the apiserver
// daemon on the master node.
// The variable $prog is /usr/local/bin/hyperkube
# cat /etc/init.d/kubernetes-master
(ignored above parts)
  # Start daemon.
  echo $"Starting apiserver: "
  daemon $prog apiserver \
    --service-cluster-ip-range=${CLUSTER_IP_RANGE} \
    --insecure-port=8080 \
    --secure-port=6443 \
    --address=0.0.0.0 \
    --etcd_servers=${ETCD_SERVERS} \
    --cluster_name=${CLUSTER_NAME} \
    > ${logfile}-apiserver.log 2>&1 &

#  echo $"Starting controller-manager: "
#  daemon $prog controller-manager \
#    --master=${MASTER} \
#    > ${logfile}-controller-manager.log 2>&1 &
#
#  echo $"Starting scheduler: "
#  daemon $prog scheduler \
#    --master=${MASTER} \
#    > ${logfile}-scheduler.log 2>&1 &
(ignored below parts)
# service kubernetes-master start
Starting apiserver:
```
At this step, you have two masters serving the system, with two API server processes running.
Because we are going to install the daemons' scheduler and controller manager as pods, a kubelet process is a must-have daemon. Download the latest (version 1.1.4
) kubelet binary file (https://storage.googleapis.com/kubernetes-release/release/v1.1.4/bin/linux/amd64/kubelet) and put it under the directory of the system's binary files:
```
# wget https://storage.googleapis.com/kubernetes-release/release/v1.1.4/bin/linux/amd64/kubelet
# chmod 755 kubelet
# mv kubelet /usr/local/bin/
```
Alternatively, for the RHEL system, you can download kubelet
from the YUM repository:
# yum install kubernetes-node
Later, we will configure the kubelet
daemon with specific parameters and values:
Tag Name | Value | Purpose
---|---|---
`--api-servers` | `127.0.0.1:8080` | To communicate with the API server locally.
`--register-node` | `false` | To avoid registering this master, the local host, as a node.
`--allow-privileged` | `true` | To allow containers to request the privileged mode, which means containers have the ability to access the host device, especially the network device in this case.
`--config` | `/etc/kubernetes/manifests` | To manage local containers using the template files under this specified directory.
If your system is managed by systemd (systemctl), put the preceding parameters in the configuration files.

In `/etc/kubernetes/config`, set `KUBE_MASTER` to `--master=127.0.0.1:8080`:

```
KUBE_LOGTOSTDERR="--logtostderr=true"
KUBE_LOG_LEVEL="--v=0"
KUBE_ALLOW_PRIV="--allow_privileged=false"
KUBE_MASTER="--master=127.0.0.1:8080"
```

In `/etc/kubernetes/kubelet`, put `--api-servers` in the variable `KUBELET_API_SERVER`, and the remaining tags in `KUBELET_ARGS`:

```
KUBELET_ADDRESS="--address=0.0.0.0"
KUBELET_HOSTNAME="--hostname_override=127.0.0.1"
KUBELET_API_SERVER="--api_servers=127.0.0.1:8080"
KUBELET_ARGS="--register-node=false --allow-privileged=true --config /etc/kubernetes/manifests"
```
On the other hand, if you use init service management, modify your script file and append the tags to the kubelet daemon command. For example, we have the following settings in /etc/init.d/kubelet:
```
# cat /etc/init.d/kubelet
prog=/usr/local/bin/kubelet
lockfile=/var/lock/subsys/`basename $prog`
hostname=`hostname`
logfile=/var/log/kubernetes.log

start() {
  # Start daemon.
  echo $"Starting kubelet: "
  daemon $prog \
    --api-servers=127.0.0.1:8080 \
    --register-node=false \
    --allow-privileged=true \
    --config=/etc/kubernetes/manifests \
    > ${logfile} 2>&1 &
(ignored)
```
It is fine to keep your kubelet service in the stopped state, since we will start it after the configuration files of scheduler and the controller manager are ready.
We need three templates as configuration files: pod master, scheduler, and controller manager. These files should be put at specified locations.
Pod master handles the elections to decide which master runs the scheduler daemon and which master runs the controller manager daemon. The result will be recorded in the etcd servers. The template of pod master is put in the kubelet config directory, making sure that the pod master is created right after kubelet starts running:
```
# cat /etc/kubernetes/manifests/podmaster.yaml
apiVersion: v1
kind: Pod
metadata:
  name: podmaster
  namespace: kube-system
spec:
  hostNetwork: true
  containers:
  - name: scheduler-elector
    image: gcr.io/google_containers/podmaster:1.1
    command: ["/podmaster", "--etcd-servers=<ETCD_ENDPOINT>",
      "--key=scheduler",
      "--source-file=/kubernetes/kube-scheduler.yaml",
      "--dest-file=/manifests/kube-scheduler.yaml"]
    volumeMounts:
    - mountPath: /kubernetes
      name: k8s
      readOnly: true
    - mountPath: /manifests
      name: manifests
  - name: controller-manager-elector
    image: gcr.io/google_containers/podmaster:1.1
    command: ["/podmaster", "--etcd-servers=<ETCD_ENDPOINT>",
      "--key=controller",
      "--source-file=/kubernetes/kube-controller-manager.yaml",
      "--dest-file=/manifests/kube-controller-manager.yaml"]
    terminationMessagePath: /dev/termination-log
    volumeMounts:
    - mountPath: /kubernetes
      name: k8s
      readOnly: true
    - mountPath: /manifests
      name: manifests
  volumes:
  - hostPath:
      path: /srv/kubernetes
    name: k8s
  - hostPath:
      path: /etc/kubernetes/manifests
    name: manifests
```
In the configuration file of the pod master, we deploy a pod with two containers: the electors for the two daemons. The pod podmaster is created in a new namespace called kube-system in order to separate the daemons' pods from the applications' pods; we will need to create this namespace before creating resources from the templates. It is also worth mentioning that the path /srv/kubernetes is where we put the daemons' configuration files. The content of those files is as follows:
```
# cat /srv/kubernetes/kube-scheduler.yaml
apiVersion: v1
kind: Pod
metadata:
  name: kube-scheduler
  namespace: kube-system
spec:
  hostNetwork: true
  containers:
  - name: kube-scheduler
    image: gcr.io/google_containers/kube-scheduler:34d0b8f8b31e27937327961528739bc9
    command:
    - /bin/sh
    - -c
    - /usr/local/bin/kube-scheduler --master=127.0.0.1:8080 --v=2
      1>>/var/log/kube-scheduler.log 2>&1
    livenessProbe:
      httpGet:
        path: /healthz
        port: 10251
      initialDelaySeconds: 15
      timeoutSeconds: 1
    volumeMounts:
    - mountPath: /var/log/kube-scheduler.log
      name: logfile
    - mountPath: /usr/local/bin/kube-scheduler
      name: binfile
  volumes:
  - hostPath:
      path: /var/log/kube-scheduler.log
    name: logfile
  - hostPath:
      path: /usr/local/bin/kube-scheduler
    name: binfile
```
There are some special items set in the template, such as the namespace and two mounted files. One is the log file, so that the streaming output can be accessed and saved on the local host; the other is the execution file, so that the container can make use of the latest kube-scheduler binary on the local host:
```
# cat /srv/kubernetes/kube-controller-manager.yaml
apiVersion: v1
kind: Pod
metadata:
  name: kube-controller-manager
  namespace: kube-system
spec:
  containers:
  - command:
    - /bin/sh
    - -c
    - /usr/local/bin/kube-controller-manager --master=127.0.0.1:8080
      --cluster-cidr=<KUBERNETES_SYSTEM_CIDR> --allocate-node-cidrs=true --v=2
      1>>/var/log/kube-controller-manager.log 2>&1
    image: gcr.io/google_containers/kube-controller-manager:fda24638d51a48baa13c35337fcd4793
    livenessProbe:
      httpGet:
        path: /healthz
        port: 10252
      initialDelaySeconds: 15
      timeoutSeconds: 1
    name: kube-controller-manager
    volumeMounts:
    - mountPath: /srv/kubernetes
      name: srvkube
      readOnly: true
    - mountPath: /var/log/kube-controller-manager.log
      name: logfile
    - mountPath: /usr/local/bin/kube-controller-manager
      name: binfile
  hostNetwork: true
  volumes:
  - hostPath:
      path: /srv/kubernetes
    name: srvkube
  - hostPath:
      path: /var/log/kube-controller-manager.log
    name: logfile
  - hostPath:
      path: /usr/local/bin/kube-controller-manager
    name: binfile
```
The configuration file of the controller manager is similar to the one of the scheduler. Remember to provide the CIDR range of your Kubernetes system in the daemon command.
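The `<KUBERNETES_SYSTEM_CIDR>` placeholder is yours to fill in. As an illustration only (192.168.0.0/16 is an assumed value, not taken from this recipe), the following sketch shows what `--cluster-cidr` combined with `--allocate-node-cidrs=true` implies: per-node pod subnets are carved out of one big range:

```python
import ipaddress

# Assumed cluster CIDR for illustration; substitute your own range.
cluster_cidr = ipaddress.ip_network("192.168.0.0/16")

# The controller manager hands each node its own slice of this range,
# e.g. one /24 subnet per node.
node_subnets = list(cluster_cidr.subnets(new_prefix=24))
print(len(node_subnets))  # number of possible node subnets
print(node_subnets[0])    # the first node's pod subnet
```

The exact subnet size per node is decided by the controller manager; the point is that the cluster CIDR must be large enough to cover all nodes' pod networks.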
For the purpose of having your templates work successfully, there are still some preconfigurations required before you start the pod master:
```
// execute these commands on each master
# touch /var/log/kube-scheduler.log
# touch /var/log/kube-controller-manager.log
```
```
// Just execute this command on one master; the other masters share this update.
# kubectl create namespace kube-system
// Or
# curl -XPOST -d '{"apiVersion":"v1","kind":"Namespace","metadata":{"name":"kube-system"}}' "http://127.0.0.1:8080/api/v1/namespaces"
```
Before starting kubelet for our pod master and two master-owned daemons, please make sure you have Docker and flanneld started first:
```
# Now, it is good to start kubelet on every master
# service kubelet start
```
Wait for a while; you will get a pod master running on each master, and finally one scheduler and one controller manager between them:
```
# Check pods at namespace "kube-system"
# kubectl get pod --namespace=kube-system
NAME                                   READY     STATUS    RESTARTS   AGE
kube-controller-manager-kube-master1   1/1       Running   0          3m
kube-scheduler-kube-master2            1/1       Running   0          3m
podmaster-kube-master1                 2/2       Running   0          1m
podmaster-kube-master2                 2/2       Running   0          1m
```
Congratulations! You have built your multiple-master Kubernetes system successfully. The structure of the machines looks like the following image:
You can see that a single node no longer has to handle the entire request load. Moreover, the daemons are not crowded onto one master; they can be distributed across masters, and every master is able to take over. Try shutting down one master; you will find that your scheduler and controller manager still provide services.
Check the log of the pod master container; you will get two kinds of messages: one from the elector that holds the key, and one from the elector without the key:
```
// Get the log with specified container name
# kubectl logs podmaster-kube-master1 -c scheduler-elector --namespace=kube-system
I0211 15:13:46.857372    1 podmaster.go:142] --whoami is empty, defaulting to kube-master1
I0211 15:13:47.168724    1 podmaster.go:82] key already exists, the master is kube-master2, sleeping.
I0211 15:13:52.506880    1 podmaster.go:82] key already exists, the master is kube-master2, sleeping.
(ignored)
# kubectl logs podmaster-kube-master1 -c controller-manager-elector --namespace=kube-system
I0211 15:13:44.484201    1 podmaster.go:142] --whoami is empty, defaulting to kube-master1
I0211 15:13:50.078994    1 podmaster.go:73] key already exists, we are the master (kube-master1)
I0211 15:13:55.185607    1 podmaster.go:73] key already exists, we are the master (kube-master1)
(ignored)
```
The master holding the key takes charge of the corresponding daemon, that is, the scheduler or the controller manager. This high-availability solution for the master is realized by the lease-lock method in etcd:
The preceding loop image indicates the progress of the lease-lock method. Two time periods are important in this method: SLEEP is the period for checking the lock, and Time to Live (TTL) is the period of lease expiration. If the daemon-running master crashes, the worst case for another master taking over its job requires the time SLEEP + TTL. By default, SLEEP is 5 seconds and TTL is 30 seconds.
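The timing arithmetic above can be written out with the default values from this recipe:

```python
SLEEP = 5   # seconds between lock checks (default)
TTL = 30    # seconds before an unrefreshed lease expires (default)

# Worst case: the active master dies right after refreshing its lease.
# The lease survives for TTL more seconds, and the standby master may have
# just checked the lock, so it waits up to one more SLEEP before noticing.
worst_case_takeover = SLEEP + TTL
print(worst_case_takeover)  # 35
```

So with the defaults, a standby master may need up to 35 seconds to take over a crashed master's daemon; shrinking SLEEP or TTL trades faster failover for more etcd traffic and a higher risk of spurious lease expiry.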
You can still take a look at the source code of pod master for more concepts (podmaster.go
: https://github.com/kubernetes/contrib/blob/master/pod-master/podmaster.go).
Before you read this recipe, you should understand the basic concepts of a single-master installation. Refer to the related recipes mentioned here to get an idea of how to build a multiple-master system automatically: