One of the great things about Rancher is that it can be deployed on any certified Kubernetes cluster. This means that Rancher can be installed on a hosted Kubernetes service such as Google Kubernetes Engine (GKE), Amazon Elastic Kubernetes Service (EKS), Azure Kubernetes Service (AKS), or DigitalOcean Kubernetes (DOKS). This can simplify the management of Rancher, but hosted Kubernetes solutions do come with some limitations. In this chapter, we'll first look at what a hosted Kubernetes cluster is, then cover the rules for designing the hosted Kubernetes cluster along with some standard designs. From there, we'll create the cluster and use Helm to install the Rancher server workload on it. Finally, we'll cover how to back up Rancher on a hosted Kubernetes cluster.
In this chapter, we're going to cover the following main topics:
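Understanding hosted Kubernetes
Requirements, limitations, and design considerations
Architecting your cluster
Creating your hosted Kubernetes cluster
Installing and upgrading Rancher
Backing up Rancher with the Rancher Backup Operator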
Let's dive in!
One of the questions that always comes up when deploying Kubernetes in the cloud is not just whether to use a hosted Kubernetes cluster, but what a hosted Kubernetes cluster actually is. In short, it's a cluster that is deployed and managed by an outside party, usually provided as a service by a cloud provider such as Amazon's AWS, Google's GCP, or Microsoft's Azure. This kind of offering is sometimes called Kubernetes as a Service (KaaS). As a consumer, there are some limitations with a hosted Kubernetes cluster versus one you build yourself:
Note
All major cloud providers allow you to set up a preferred maintenance window, but they can do emergency maintenance outside that window if needed.
This is generally for tasks such as replacing a failed node or applying a critical security fix.
Note
The preceding statement is valid as of writing, but Azure has stated this is on the roadmap, so this might change in the future.
Now that we understand what a hosted Kubernetes cluster is, we can move on to the requirements and limitations of some of the most popular cloud providers.
In this section, we'll be discussing the basic requirements of Rancher on various clusters along with their limitations and design considerations.
The basic requirements for Amazon EKS are as follows:
Note
Port 80 only redirects end users to the HTTPS URL, so it is not strictly required, but it is recommended for end users' convenience.
The design limitations and considerations are as follows:
The basic requirements for GKE are as follows:
The design limitations and considerations are as follows:
The basic requirements for AKS are as follows:
The design limitations and considerations are as follows:
Note
The Rancher server does not work on Windows nodes.
We now understand the limitations of running Rancher on a hosted Kubernetes cluster. Next, we'll use these limitations, along with a set of rules and examples, to help us design a solution on each of the major cloud providers.
In this section, we'll cover some standard designs and the pros and cons of each. It is important to note that each environment is unique and will require tuning for the best performance and experience. Also, all CPU, memory, and storage sizes given here are recommended starting points and may need to be increased or decreased based on the number of nodes and clusters that Rancher will manage.
Before designing a solution, you should be able to answer the following questions:
Note
Rancher's official server sizing guide can be found at https://rancher.com/docs/rancher/v2.5/en/installation/requirements/#rke-and-hosted-kubernetes.
In this section, we're going to cover some of the major cluster designs for EKS clusters.
In this design, we will be deploying the smallest EKS cluster that can still run Rancher. Note that this design is only for testing or lab environments, is not recommended for production deployments, and can only handle a couple of clusters with a dozen or so nodes each.
The pros are as follows:
The cons are as follows:
Note
During node group upgrades, Amazon will add a new node before removing the old one.
The node sizing requirements are as follows:
In this design, we will expand upon the EKS small design by adding a worker, giving us three worker nodes. We'll also leverage AWS's Availability Zone (AZ) redundancy by placing a worker node in each of three AZs. By doing this, the cluster can handle the failure of an AZ without impacting Rancher. We will also increase the size of the worker nodes so the cluster can manage up to 300 clusters with 3,000 nodes.
The pros are as follows:
The cons are as follows:
The node sizing requirements are as follows:
In this section, we're going to cover some of the major cluster designs for GKE clusters.
In this design, we will be deploying the smallest GKE cluster that can still run Rancher. Note that this design is only for testing or lab environments, is not recommended for production deployments, and can only handle a couple of clusters with a dozen or so nodes each.
The pros are as follows:
The cons are as follows:
Note
During cluster upgrades, Google will add a new node before removing the old one.
The node sizing requirements are as follows:
In this design, we will expand upon the GKE small design by adding a worker, giving us three worker nodes. We'll also leverage GCP's zone redundancy by placing a worker node in each of three zones. By doing this, the cluster can handle the failure of a zone without impacting Rancher. We will also increase the size of the worker nodes so the cluster can manage up to 300 clusters with 3,000 nodes.
The pros are as follows:
The cons are as follows:
The node sizing requirements are as follows:
In this section, we're going to cover some of the major cluster designs for AKS clusters.
In this design, we will be deploying the smallest AKS cluster that can still run Rancher. AKS is a little special in that it supports clusters with only one node. As mentioned earlier, this design is only for testing or lab environments, is not recommended for production deployments, and can only handle a couple of clusters with a dozen or so nodes each. It is important to note that AKS does support Windows node pools, but Rancher must run on a Linux node.
The pros are as follows:
The cons are as follows:
Note
During cluster upgrades, Azure will add a new node before removing the old one.
The node sizing requirements are as follows:
In this design, we will expand upon the AKS single-node design by adding two workers, giving us three worker nodes. We'll also leverage Azure's availability zone redundancy by placing a worker node in each of three zones. By doing this, the cluster can handle the failure of a zone without impacting Rancher. We will also increase the size of the worker nodes so the cluster can manage up to 300 clusters with 3,000 nodes.
The pros are as follows:
The cons are as follows:
The node sizing requirements are as follows:
Now that we have a design for our cluster, in the next section, we'll be covering the steps for creating each of the major cluster types.
In this section, we are going to walk through the commands for creating each of the hosted Kubernetes clusters.
This section will cover creating an EKS cluster with an ingress by using command-line tools.
Note
The following steps are general guidelines. Please refer to https://aws-quickstart.github.io/quickstart-eks-rancher/ for more details.
You should already have an AWS account with admin permissions along with a VPC and subnets created.
The following tools should be installed on your workstation:
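eksctl
kubectl
helm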
eksctl create cluster --name rancher-server --version 1.21 --without-nodegroup
Note
In this example, we'll be making a standard three-node cluster with one node in each AZ.
eksctl create nodegroup --cluster=rancher-server --name=rancher-us-west-2a --region=us-west-2 --zones=us-west-2a --nodes 1 --nodes-min 1 --nodes-max 2
This will create the node pool in us-west-2a.
eksctl create nodegroup --cluster=rancher-server --name=rancher-us-west-2b --region=us-west-2 --zones=us-west-2b --nodes 1 --nodes-min 1 --nodes-max 2
This will create the node pool in us-west-2b.
eksctl create nodegroup --cluster=rancher-server --name=rancher-us-west-2c --region=us-west-2 --zones=us-west-2c --nodes 1 --nodes-min 1 --nodes-max 2
This will create the node pool in us-west-2c.
eksctl get cluster
Note
It might take 5 to 10 minutes for the cluster to come online.
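Once the cluster reports as active, a quick sanity check is to list the nodes along with their zones. This assumes eksctl has already written your kubeconfig, which it does by default:

kubectl get nodes -L topology.kubernetes.io/zone

You should see three Ready nodes, one in each of the three AZs.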
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update
helm upgrade --install ingress-nginx ingress-nginx/ingress-nginx \
--namespace ingress-nginx \
--set controller.service.type=LoadBalancer \
--version 3.12.0 \
--create-namespace
If you are just testing, you can run the kubectl get service ingress-nginx-controller -n ingress-nginx command to capture the load balancer's external DNS name. You can then create a CNAME record pointing to that name.
Note
This should not be used for production environments.
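If you want to script that lookup, a minimal sketch using kubectl's jsonpath output (assuming the default service name used above) is as follows:

kubectl get service ingress-nginx-controller -n ingress-nginx \
-o jsonpath='{.status.loadBalancer.ingress[0].hostname}'

This prints just the load balancer's DNS name, which is the value your CNAME record should point to.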
For creating the frontend load balancer, please see https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/routing-to-elb-load-balancer.html for more details.
At this point, the cluster is ready for Rancher to be installed. We'll cover this step in the next section.
This section will cover creating a GKE cluster with an ingress by using command-line tools.
Note
The following steps are general guidelines. Please refer to https://cloud.google.com/kubernetes-engine/docs/quickstart for more details.
You should already have a GCP account with admin permissions. This section will use Cloud Shell, which has most of the tools already installed.
gcloud container clusters create rancher-server --zone us-central1-a --node-locations us-central1-a,us-central1-b,us-central1-c --num-nodes=3
gcloud container clusters get-credentials rancher-server
Note
It might take 5 to 10 minutes for the cluster to come online.
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update
helm upgrade --install ingress-nginx ingress-nginx/ingress-nginx \
--namespace ingress-nginx \
--set controller.service.type=LoadBalancer \
--version 3.12.0 \
--create-namespace
If you are just testing, you can run the kubectl get service ingress-nginx-controller -n ingress-nginx command to capture the external IP. Then you can create a DNS record to point to this IP.
Note
This should not be used for production environments.
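As with EKS, you can script the lookup; on GKE, the service exposes an IP address rather than a hostname (again, assuming the default service name):

kubectl get service ingress-nginx-controller -n ingress-nginx \
-o jsonpath='{.status.loadBalancer.ingress[0].ip}'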
For creating the frontend load balancer, please see https://cloud.google.com/kubernetes-engine/docs/concepts/ingress for more details.
At this point, the cluster is ready for Rancher to be installed. We'll cover this step in the next section.
This section will cover creating an AKS cluster with an ingress by using command-line tools.
Note
The following steps are general guidelines. Please refer to https://docs.microsoft.com/en-us/azure/aks/kubernetes-walkthrough-portal for more details.
You should already have an Azure account with admin permissions.
The following tools should be installed on your workstation:
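az (the Azure CLI)
kubectl
helm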
Run the az login command to log in to Azure.
Note
You might need to log in to a web browser if you are using two-factor authentication (2FA).
az group create --name rancher-server --location eastus
az aks create --resource-group rancher-server --name rancher-server --kubernetes-version 1.22.0 --node-count 3 --node-vm-size Standard_D2_v3
az aks get-credentials --resource-group rancher-server --name rancher-server
Note
It might take 5 to 10 minutes for the cluster to come online.
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update
helm upgrade --install ingress-nginx ingress-nginx/ingress-nginx \
--namespace ingress-nginx \
--set controller.service.type=LoadBalancer \
--version 3.12.0 \
--create-namespace
If you are just testing, you can run the kubectl get service ingress-nginx-controller -n ingress-nginx command to capture the external IP. Then you can create a DNS record to point to this IP.
Note
This should not be used for production environments.
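If your DNS zone happens to be hosted in Azure DNS, a minimal sketch of capturing the IP and creating the A record from the CLI looks like the following; the resource group, zone, and record names here are hypothetical:

IP=$(kubectl get service ingress-nginx-controller -n ingress-nginx \
-o jsonpath='{.status.loadBalancer.ingress[0].ip}')
az network dns record-set a add-record \
--resource-group rancher-server \
--zone-name example.com \
--record-set-name rancher \
--ipv4-address "$IP"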
For creating the frontend load balancer, please see https://docs.microsoft.com/en-us/azure/aks/load-balancer-standard for more details.
At this point, the cluster is ready for Rancher to be installed. We'll cover this step in the next section.
In this section, we are going to cover installing and upgrading Rancher on a hosted cluster. This process is very similar to installing Rancher on an RKE cluster, the main difference being the need for the Rancher Backup Operator, which we will cover in the next section.
helm repo add rancher-latest https://releases.rancher.com/server-charts/latest
Note
The namespace name should always be cattle-system and cannot be changed without breaking Rancher.
helm upgrade --install rancher rancher-latest/rancher \
--namespace cattle-system \
--create-namespace \
--set hostname=rancher.example.com \
--set ingress.tls.source=external \
--set replicas=3 \
--version 2.6.2
Please see https://rancher.com/docs/rancher/v2.5/en/installation/install-rancher-on-k8s/#install-the-rancher-helm-chart for more details and options for installing Rancher.
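To confirm that the installation has finished rolling out, you can watch the deployment. This is a quick check, assuming the chart's default deployment name of rancher:

kubectl -n cattle-system rollout status deploy/rancher
kubectl -n cattle-system get pods

All three replicas should report as Ready before you move on.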
Before starting an upgrade, you should take a backup using the steps covered in the next section. Then, retrieve the values used for your current installation:
helm get values rancher -n cattle-system
Note
If you saved your install command, you can simply reuse it, as it uses the upgrade --install flags, which tell the Helm CLI to upgrade the deployment if it exists and install it if it is missing. The only thing you need to change is the --version flag.
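As a minimal sketch, an upgrade that reuses the values from the install example above would look like the following, with 2.6.3 standing in for whatever release you are targeting:

helm repo update
helm upgrade --install rancher rancher-latest/rancher \
--namespace cattle-system \
--set hostname=rancher.example.com \
--set ingress.tls.source=external \
--set replicas=3 \
--version 2.6.3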
Please see https://rancher.com/docs/rancher/v2.5/en/installation/install-rancher-on-k8s/upgrades/ for more details and options for upgrading Rancher.
At this point, we have Rancher up and running. In the next section, we'll be going into some common tasks such as backing up Rancher using the Rancher Backup Operator.
Because we don't have access to the etcd database with hosted Kubernetes clusters, we need to back up Rancher's data differently. This is where the Rancher Backup Operator comes into the picture. This tool provides the ability to back up and restore Rancher's data on any Kubernetes cluster. It accepts a list of resources that need to be backed up for a particular application. It then gathers these resources by querying the Kubernetes API server, packages them into a tarball, and pushes it to the configured backup storage location. Since it gathers resources by querying the API server, it can back up applications from any type of Kubernetes cluster.
Let's look at the steps to install this tool:
helm repo add rancher-charts https://raw.githubusercontent.com/rancher/charts/release-v2.5/
helm install --wait --create-namespace -n cattle-resources-system rancher-backup-crd rancher-charts/rancher-backup-crd
helm install --wait -n cattle-resources-system rancher-backup rancher-charts/rancher-backup
To configure the backup schedule, encryption, and storage location, please see the documentation located at https://rancher.com/docs/rancher/v2.5/en/backups/configuration/backup-config/.
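As a minimal sketch, a recurring backup with a retention policy can be declared as follows; the name and schedule are examples, and the storage location and encryption settings are omitted (see the documentation above for those fields):

apiVersion: resources.cattle.io/v1
kind: Backup
metadata:
  name: nightly-rancher-backup
spec:
  resourceSetName: rancher-resource-set
  schedule: "0 2 * * *"
  retentionCount: 7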
Take a one-time backup: before doing maintenance tasks such as upgrading Rancher, you should take a backup, as shown in the following example:
apiVersion: resources.cattle.io/v1
kind: Backup
metadata:
  name: pre-rancher-upgrade
spec:
  resourceSetName: rancher-resource-set
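Assuming you save the preceding manifest as pre-rancher-upgrade.yaml, apply it and confirm that the backup has completed:

kubectl apply -f pre-rancher-upgrade.yaml
kubectl get backups.resources.cattle.io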
You can find additional examples at https://github.com/rancher/backup-restore-operator/tree/master/examples.
In this chapter, we learned about hosted Kubernetes clusters such as EKS, GKE, and AKS, including the requirements and limitations of each. We then covered the rules of architecting each type of cluster, including some example designs and the pros and cons of each solution. We finally went into detail about the steps for creating each type of cluster using the design we made earlier. We ended the chapter by installing and configuring the Rancher server and Rancher Backup Operator. At this point, you should have Rancher up and ready to start deploying downstream clusters for your application workloads.
The next chapter will cover creating a managed RKE cluster using Rancher, that is, a downstream cluster. We will cover how Rancher creates these clusters and what their limitations are.