Chapter 3: Creating a Single Node Rancher

This chapter will cover the process of installing Rancher as a single Docker container. This is an excellent option for proof of concept, development, or testing purposes. This chapter will cover the requirements and limitations of a single-node Rancher and the core architecture rules needed to create a proper enterprise solution. Finally, it will cover migrating to a High Availability (HA) cluster.

In this chapter, we’re going to cover the following main topics:

  • What is a single-node Rancher installation?
  • Requirements and limitations
  • Rules for architecting a solution
  • Installation steps
  • Migration to an HA setup

What is a single-node Rancher installation?

Rancher can be installed by running a single Docker container. This process goes back to the roots of Rancher v1.6, when the Rancher server was a Java-based application that ran as a Docker container, using either an external MySQL database or, in single-node mode, a MySQL server running inside the Rancher server container. With the move to Rancher v2.x, everything in Rancher moved to the kube-apiserver and Custom Resource Definitions (CRDs). Because of this, Rancher needs a Kubernetes cluster to work correctly. In the earlier Rancher v2.x releases, this was done by embedding the Kubernetes services, such as etcd, kube-apiserver, kube-scheduler, kube-controller-manager, kubelet, and kube-proxy, into the Rancher server code.

When Rancher first starts, it tries to detect whether the environment variable KUBECONFIG is set. Kubernetes sets this variable by default in all pods, so if it’s missing, Rancher knows that it must be running in single-node mode. At that point, the Rancher server process checks whether the SSL certificates for the Kubernetes components exist; if they are missing or expired, Rancher will handle creating them. Next, Rancher starts etcd as a cluster of one and then starts kube-apiserver and the required controllers. The big note here is that this cluster is very stripped down. For example, it does not have CoreDNS, an ingress controller, or even a Container Network Interface (CNI) such as Canal, simply because Rancher doesn’t need them; this setup was not a true cluster and was not any kind of standard configuration.

Several problems came up with the earlier versions. For example, before Rancher v2.3.x, there was no way to rotate the certificates inside the Rancher server container, and initially, Rancher would create certificates with an expiration of 1 year. This meant that after a year, your Rancher server would crash and wouldn’t start up, because none of the Kubernetes components work with expired certificates. This is a safety measure in the Go libraries, which will not allow an HTTPS connection to an endpoint without a valid certificate, and an expired certificate is, of course, not a valid certificate. In Rancher v2.3.x, a process was added to Rancher to look for expired or expiring certificates and rotate them. This was done by spinning up a dedicated K3s cluster inside the Docker container and deploying the Rancher server as a pod.
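The detection rule described above can be sketched in shell. This is only an illustration of the logic (Rancher’s actual check is implemented in Go), not Rancher’s code:

```shell
# If KUBECONFIG is not set, Rancher concludes it is running as a
# standalone Docker container and starts its embedded cluster;
# otherwise, it assumes it is running as a pod in a real cluster.
if [ -z "${KUBECONFIG:-}" ]; then
  MODE="single-node"
else
  MODE="in-pod"
fi
echo "detected mode: ${MODE}"
```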

Requirements and limitations

The following items are requirements for a single-node Rancher:

  • A Linux host running Docker 18.06.3, 18.09.x, 19.03.x, or 20.10.x
  • A minimum of two cores, though four cores is highly recommended
  • 8 GB of RAM
  • 10 GB of SSD storage with a latency under 10ms
  • Inbound TCP ports 80 and 443 between the end users and the managed clusters

The following items are not required but are highly recommended:

  • A DNS record such as rancher.example.com in place of using the server hostname
  • A certificate signed by a recognized Certificate Authority (CA), such as DigiCert or GoDaddy
  • An HTTP or TCP load balancer placed in front of the Rancher server
  • Server backups, which can be a file or snapshot-level backups
  • A dedicated filesystem/disk for the Docker filesystem /var/lib/docker
  • A dedicated filesystem/disk for the Rancher persistent data /var/lib/rancher
  • The Linux host should be a virtual machine (VM) where the hypervisor or cloud provider will be providing redundancy in the event of hardware failure.

The following items are the known limitations of a single-node Rancher:

  • A single-node Rancher is recommended only for development and testing purposes. Do not use it in production.
  • Only the Rancher server should be installed on this host. This server should not be hosting any other applications.
  • A single-node Rancher is not designed for HA.
  • Migrating from a single node to HA is not officially supported and is not guaranteed to work.
  • The single-node Rancher feature will be removed in a future release, at which point it will no longer be available.
  • Rancher v2.5.x and higher requires the privileged option, so you cannot run Docker in rootless mode.
  • A single-node Rancher can be installed on a desktop/laptop, but there are issues when the IP address of the host changes, and you will still need to create DNS records.

Rules for architecting a solution

The pros are as follows:

  • A single-node Rancher is very simple to set up as you just need to deploy a single container.
  • It’s very fast to spin up. A single-node Rancher only takes a few minutes to start compared with RKE, which can take 10-15 minutes to start.
  • It has low resource utilization; compared to RancherD and a complete RKE cluster, a single-node Rancher uses a lot less CPU, memory, and storage.
  • There is no need for a load balancer or DNS if you are fine with using just the server hostname or IP address.
  • A single-node Rancher can be run on a laptop. (Note: Rancher Desktop is a better product for this solution.)

The cons are as follows:

  • A single-node Rancher is not designed for production.
  • Rancher official and community support is very limited.
  • There are limited troubleshooting options as the K3s settings are baked into the code and cannot be changed without building a new release.
  • The long-term future of single-node Rancher is uncertain, and it will be removed in a future release.
  • There’s no scalability or redundancy; if the host goes offline, Rancher is down.
  • By default, a single-node Rancher stores its data inside the container, and if that container is lost, the data will be lost.
  • There’s no built-in backup solution, unlike RKE, RKE2, K3s, and RancherD, which can back up to local disk or S3.

The architecture rules are as follows:

  • You should plan for migrating from a single-node Rancher to HA.
  • Rancher requires an SSL certificate and will not work without a certificate.
  • Using publicly signed certificates can make scripts and tools easier as the Rancher URL will be trusted by default.
  • All clusters/nodes that Rancher will be managing need to connect to the Rancher URL over SSL.
  • Rancher does support air-gapped environments, but this requires additional steps to provide proxied access to the internet, or you will need to provide Docker images and catalogs via internally hosted services.

Installation steps

We are going to assume the following:

  • That you already have a Linux VM that has been created and patched (in this example, we’ll be using a VMware VM running Ubuntu 20.04).
  • That the Linux VM has internet access and doesn’t require an HTTP proxy for access. Note, if you do not have internet access, please see the air-gap steps located at https://rancher.com/docs/rancher/v2.5/en/installation/other-installation-methods/air-gap/.
  • That you have SSH and root access to the Linux VM.
  • That you are installing Docker using a default configuration and storage location.
  • That the filesystems /var/lib/docker and /var/lib/rancher have already been created and mounted.
  • That you have already created a DNS record for Rancher. In this example, we’ll be using rancher.support.tools and an associated SSL certificate signed by a recognized CA.

Installing Docker

In this section, we’ll be installing and configuring Docker:

  1. SSH into the Linux VM and become root using the sudo su - command.
  2. Run the curl https://releases.rancher.com/install-docker/20.10.sh | bash command to install Docker.
  3. Set Docker to start at system boot by running systemctl enable docker.
  4. Verify Docker is running by running docker info. The output should look like the following:
Figure 3.1 – Docker information output

Text output: https://raw.githubusercontent.com/PacktPublishing/Rancher-Deep-Dive/main/ch03/install_steps/01_installing_docker/example_output.txt

  5. Configure log rotation – we’ll want to enable log rotation of the Docker logs. Create or edit the /etc/docker/daemon.json file to have the following content:
Figure 3.2 – Enabling log rotation of the Docker logs

Text version: https://raw.githubusercontent.com/PacktPublishing/Rancher-Deep-Dive/main/ch03/install_steps/02_configure-log-rotation/daemon.json
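As a sketch of what such a daemon.json might look like – the max-size and max-file values below are illustrative defaults, not the book’s exact settings, and the file is written to the current directory here, whereas on the real host the path is /etc/docker/daemon.json:

```shell
# Example daemon.json enabling rotation of container logs via the
# json-file log driver (values are illustrative assumptions).
# On the Rancher host, write this to /etc/docker/daemon.json.
cat > daemon.json <<'EOF'
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m",
    "max-file": "3"
  }
}
EOF
```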

  6. Restart Docker to apply the change using the systemctl restart docker command.

Prepping the SSL certificates

In this section, we’ll be preparing the SSL certificate and key for use by the Rancher server. These files will be called tls.crt and tls.key. The steps are as follows:

  1. To create tls.crt, we’ll need a full certificate chain. This includes the root and intermediate certificates. Most public root authorities publish these certificates on their website.
  2. We’ll want all certificate files to be in the Privacy Enhanced Mail (PEM) format. Note that this is sometimes called Base64. If your certificate is in a different format, you should go to https://knowledge.digicert.com/solution/SO26449.html for more details about converting between formats.
  3. Once all files are in the PEM format, we’ll want to create a file with the content of each certificate in the following order. Note that some certificates might have multiple intermediate certificates. If you have questions, please work with your CA. Also, if you are using an internal CA, you might not have an intermediate certificate. You might just have the root and server certificate.
Figure 3.3 – Creating a file to store the certificates

Text example: https://raw.githubusercontent.com/PacktPublishing/Rancher-Deep-Dive/main/ch03/install_steps/03_prepping_ssl_certs/example_certs/tls.pem
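The concatenation itself can be sketched as follows. The file names here are assumptions – use whatever names your CA delivered, and include every intermediate certificate in order. The conventional chain order is server certificate first, then intermediate(s), then root:

```shell
# Stand-in certificate files; in reality, each of these comes from
# your CA and holds a full PEM block for that certificate.
printf -- '-----BEGIN CERTIFICATE-----\n(server cert body)\n-----END CERTIFICATE-----\n' > server.crt
printf -- '-----BEGIN CERTIFICATE-----\n(intermediate cert body)\n-----END CERTIFICATE-----\n' > intermediate.crt
printf -- '-----BEGIN CERTIFICATE-----\n(root cert body)\n-----END CERTIFICATE-----\n' > root.crt

# Order matters: server certificate first, then any intermediates,
# then the root certificate last.
cat server.crt intermediate.crt root.crt > tls.crt
```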

  4. For the private key, we’ll want to make sure it does not have a passphrase. We do this by reviewing the top of the file; see the following examples for details.

These are examples of keys that have a passphrase:

Figure 3.4 – Example 1 of a passphrase

Figure 3.5 – Example 2 of a passphrase

This is an example of a key that does not have a passphrase:

Figure 3.6 – Example of a key without a passphrase

  5. If your key has a passphrase, you’ll need to remove it using the openssl rsa -in original.key -out tls.key command and enter your passphrase at the prompt.
  6. Once this process is done, you should have two files, tls.crt and tls.key.
  7. You’ll want to create the /etc/rancher/ssl/ directory using the mkdir -p /etc/rancher/ssl/ command and place both files in this directory. Note that these files should be owned by root.
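The passphrase check and removal can be sketched end to end. The demonstration generates a throwaway passphrase-protected key; in practice, you would start from the original.key that your CA issued, and interactively you would omit -passin and answer the prompt instead:

```shell
# Generate a throwaway passphrase-protected key (demo only; use
# your CA-issued original.key in practice).
openssl genrsa -aes256 -passout pass:demo123 -out original.key 2048

# A protected key shows ENCRYPTED in its header, either as
# "Proc-Type: 4,ENCRYPTED" or "BEGIN ENCRYPTED PRIVATE KEY".
grep "ENCRYPTED" original.key

# Strip the passphrase, producing the tls.key that Rancher needs.
openssl rsa -in original.key -passin pass:demo123 -out tls.key
```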

Starting the Rancher server

In this section, we’ll create the docker run command and start the Rancher server.

The following is an example command:

Figure 3.7 – Example of docker run command

Text version: https://raw.githubusercontent.com/PacktPublishing/Rancher-Deep-Dive/main/ch03/install_steps/04_rancher_run_command/example01.txt
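A copy/paste-friendly form of the command, reconstructed from the flag-by-flag breakdown that follows, is sketched here. It is echoed for review rather than executed; run the printed line on the actual Rancher host, and adjust the image tag to the Rancher version you intend to run:

```shell
# tls.crt and tls.key were staged in /etc/rancher/ssl/ in the
# previous section. The string below is the full docker run
# command; echo it for review, then run it on the Rancher host.
RANCHER_CMD="docker run -d --name rancher_server \
  --restart=unless-stopped \
  -p 80:80 -p 443:443 \
  -v /etc/rancher/ssl/tls.crt:/etc/rancher/ssl/cert.pem \
  -v /etc/rancher/ssl/tls.key:/etc/rancher/ssl/key.pem \
  -v /var/lib/rancher:/var/lib/rancher \
  --privileged \
  rancher/rancher:v2.5.8 --no-cacerts"
echo "${RANCHER_CMD}"
```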

We’ll now break down this command:

  1. docker run -d will create a new container and start it in detached mode.
  2. --name rancher_server will set the name of the container to rancher_server. This makes future commands easier because, without it, Docker will generate a random name.
  3. --restart=unless-stopped will tell Docker to make sure this container stays running unless you manually stop it.
  4. -p 80:80 will map port 80 (HTTP) on the host to port 80 inside the container.
  5. -p 443:443 will map port 443 (HTTPS) on the host to port 443 inside the container. Note that if you are doing SSL offloading at the load balancer, this is not needed.
  6. The -v /etc/rancher/ssl/tls.crt:/etc/rancher/ssl/cert.pem and -v /etc/rancher/ssl/tls.key:/etc/rancher/ssl/key.pem flags will pass the certificate files we created earlier into the Rancher server.
  7. The -v /var/lib/rancher:/var/lib/rancher flag will bind the data directory for Rancher to the host filesystem.
  8. --privileged will give the Rancher server container root capabilities on the host. This is needed because we’ll be running K3s inside the container, which itself runs additional containers.
  9. rancher/rancher:v2.5.8 will set the Rancher image, which will set the Rancher server version.
  10. --no-cacerts will disable the certificate generation process in the Rancher server, as we will be bringing our own. Note that this flag comes after the image name because it is an argument to the Rancher server process, not to Docker.
  11. Finally, once we start the Rancher server, we’ll need to wait a few minutes for Rancher to start fully.
  12. You can watch the server start by running the docker logs -f rancher_server command.
  13. You’ll then need to open your Rancher URL in a browser. Note that if you are planning to use a CNAME or load balancer, you should be using that URL instead of using the IP/hostname of the host.
  14. To get the admin password, you’ll need to run the docker logs rancher_server 2>&1 | grep "Bootstrap Password:" command:
Figure 3.8 – Snippet of step 14

  15. Once you have logged into Rancher, you’ll want to set the password and URL:
Figure 3.9 – Rancher login page

It is important that you set the URL to something you would like to keep, as changing the URL later is difficult and time-consuming. The password is for the local admin, which is a root-level account with full access to Rancher, so it should be a secure password.

Migration to an HA setup

To migrate from a single-node Rancher to an HA installation, we’ll need the following:

  • You should be running the latest version of Rancher and RKE.
  • You already have three new Linux VMs (in this example, we’ll be using VMware VMs running Ubuntu 20.04). Note that at the end of this process, the original VM can be reclaimed.
  • We will assume that the Linux VMs have internet access and don’t require an HTTP proxy for access. Note that if you do not have internet access, please see the air-gap steps located at https://rancher.com/docs/rancher/v2.5/en/installation/other-installation-methods/air-gap/.
  • SSH and root access to the Linux VMs.
  • Docker installed on the three new VMs.
  • A DNS record for Rancher that is not a server hostname/IP address.
  • You will need a maintenance window of 30-60 minutes. During this window, Rancher and its API will be down. This may mean that CI/CD pipelines will not work, and application teams may not be able to manage their applications. Note that this does not impact downstream applications; the only impact is on management.
  • We’ll assume the single-node Rancher server container is called rancher_server during the following steps. If the name is different, please update the commands listed in the following steps.
  • This section will assume you know what RKE is and how to use it. Note that we will cover RKE in much more detail in the next chapter.

Backing up the current Rancher server

During this section, we’ll take a backup of the current Rancher single node server, including the Kubernetes certificates, etcd, and the SSL certs for the Rancher URL. The steps are as follows:

  1. SSH into the current Rancher server node.
  2. Become root using the sudo su - command.
  3. Stop the current Rancher server using the docker stop rancher_server command.
  4. Create a volume from the current server using the docker create --volumes-from rancher_server --name rancher-data-<DATE> rancher/rancher:<RANCHER_CONTAINER_TAG> command. Please replace the date and tag placeholder values.
  5. Create a tar.gz file backup using the docker run --volumes-from rancher-data-<DATE> -v $PWD:/backup:z busybox tar pzcvf /backup/rancher-data-backup-<RANCHER_VERSION>-<DATE>.tar.gz /var/lib/rancher command. Please replace the date and version placeholder values.
  6. Verify the backup file has been created using the ls -lh command. The backup file will be created in the current directory.
  7. Restart the Rancher server backup using the docker start rancher_server command.
  8. Open a shell into the Rancher server container using the docker exec -it rancher_server /bin/bash command.
  9. Backup the current certificated using the tar -zcvf pki.bundle.tar.gz /etc/kubernetes/ssl command.
  10. Leave the shell using the exit command.
  11. Copy the backup file out of the Rancher server using the docker cp rancher_server:/var/lib/rancher/pki.bundle.tar.gz . command. Note the trailing dot, which copies the file into the current directory.
  12. Create a temporary container using the docker run --net=container:rancher_server -it -v $(pwd):/cwd --name etcd-utility rancher/rke-tools:v0.1.20 command.
  13. Set up certificates for etcd using the mkdir ssl; cd ssl; cp /cwd/pki.bundle.tar.gz .; tar -zxvf pki.bundle.tar.gz --strip-components 3 commands.
  14. Take an etcd backup using the cd /; ETCDCTL_API=3 etcdctl snapshot save --cacert=/ssl/kube-ca.pem --cert=/ssl/kube-etcd-127-0-0-1.pem --key=/ssl/kube-etcd-127-0-0-1-key.pem single-node-etcd-snapshot command.
  15. Exit the shell using the exit command.
  16. Copy the etcd backup out of the temporary container using the docker cp etcd-utility:/single-node-etcd-snapshot . command.
  17. Stop the current Rancher server using the docker stop rancher_server command.
  18. Copy the pki.bundle.tar.gz and single-node-etcd-snapshot files to whatever server/workstation you will be using to run the rke commands from. Some people will use the first node in the cluster for this task.
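As a sketch of how the <DATE> and <RANCHER_CONTAINER_TAG> placeholders in steps 4 and 5 might be filled in – the tag value below is an assumption, so use the tag your container actually runs. The commands are echoed for review rather than executed; remove the echos to run them on the Rancher host:

```shell
RANCHER_TAG="v2.5.8"              # assumption - match your running tag
DATE="$(date +%Y-%m-%d)"

# Step 4: create the data-volume container from the stopped server.
echo docker create --volumes-from rancher_server \
  --name "rancher-data-${DATE}" "rancher/rancher:${RANCHER_TAG}"

# Step 5: write a dated tar.gz backup of /var/lib/rancher to $PWD.
echo docker run --volumes-from "rancher-data-${DATE}" -v "$PWD:/backup:z" \
  busybox tar pzcvf "/backup/rancher-data-backup-${RANCHER_TAG}-${DATE}.tar.gz" \
  /var/lib/rancher
```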

Starting cutover to new cluster

At this point, we’re going to start restoring the backup into the new cluster and migrate over the Rancher URL:

  1. You should update the DNS or your load balancer to redirect traffic from the old single-node Rancher server to the new cluster. Note, the DNS might take some time to propagate fully, so we’ll want to do that now.
  2. We want to modify the cluster.yaml file to include only the first node. We’re going to assume that the hostname is node01 for the rest of the steps. You can use the example located at https://raw.githubusercontent.com/PacktPublishing/Rancher-Deep-Dive/main/ch03/migrating_from_single_to_ha/cluster.yml.
  3. SCP over the single-node-etcd-snapshot and pki.bundle.tar.gz files to node01.
  4. SSH into the node01 host and become root using the sudo su - command.
  5. Create the snapshot directory using the mkdir -p /opt/rke/etcd-snapshots command.
  6. Move the two backup files into the /opt/rke/etcd-snapshots directory. These files should be owned by root.
  7. You can now leave the SSH terminal on node01.
  8. Start the restore using the rke etcd snapshot-restore --name single-node-etcd-snapshot --config cluster.yaml command.

This process might take 5-10 minutes to complete. Example command output is located at https://raw.githubusercontent.com/PacktPublishing/Rancher-Deep-Dive/main/ch03/migrating_from_single_to_ha/restore_command_output.txt.

  9. At this point, you can start the cluster using the rke up --config cluster.yaml command.
  10. If you run into any errors, try running the rke up command a second time.
  11. Once it completes successfully, you can edit the cluster.yaml file to include node02 and node03.
  12. Run rke up again to add the additional nodes.
  13. At this point, you can log in to Rancher and verify everything is accessible and healthy. Note that it might take a few minutes for downstream clusters to reconnect and become active in the UI.
  14. It is important to note that any cluster-level changes, such as creating a new cluster, editing an existing cluster, or deleting a cluster, should be avoided until you are sure that you will not be rolling back to the old server.

Cleaning up/rolling back

In the event of an unsuccessful migration, use the following steps:

  1. Change the DNS record for the Rancher URL back to the original server.
  2. Start the Rancher single node container using the docker start rancher_server command.
  3. It is important to note that it is currently not possible to migrate changes made in HA back to a single node. So, in the event of a rollback, all changes made since shutting down the single-node Rancher server will be lost.
  4. After a few days of burn-in, you can delete the old VM or clean up the old server using the script at https://raw.githubusercontent.com/rancherlabs/support-tools/master/extended-rancher-2-cleanup/extended-cleanup-rancher2.sh.

Summary

In this chapter, we learned about how a single-node Rancher works and its pros and cons. We then went over the steps and commands needed to install Rancher in single node mode. We finally went into detail about migrating to an HA setup, including backing up the current data and restoring it.

In the next chapter, we will cover RKE and RKE2, including where they came from, how they work, and how to design a solution using them.
