This chapter will focus on the history of Rancher and Kubernetes. We will cover what products and solutions came before Rancher and Kubernetes and how they have evolved into what they are today. At the end of this chapter, you should have a good understanding of the origins of Rancher and Kubernetes and their core concepts. This knowledge is essential for you to understand why Rancher and Kubernetes are what they are.
In this chapter, we're going to cover the following main topics:
Rancher Labs was founded in 2014 in Cupertino, California, by Sheng Liang, Shannon Williams, Darren Shepherd, and Will Chanas. It was a container management platform before Kubernetes was a thing. From the beginning, Rancher was built on the idea that everything should be open source and community-driven. With Rancher being an open source company, all of the products they have released (including Rancher, RancherOS, RKE, K3s, Longhorn, and more) have been 100% open source. Rancher Lab's flagship product is Rancher. Primarily, Rancher is a management and orchestration platform for containerized workloads both on-premises and in the cloud. Rancher can do this because it has always been vendor-neutral; that is, Rancher can deploy a workload using physical hardware in your data center to cloud VMs in AWS to even a Raspberry Pi in a remote location.
When Rancher v1.0 was released in March of 2016, it only supported Docker Swarm and Rancher Cattle clusters. Docker Swarm was the early cluster orchestration tool that created a number of the core concepts that we still use today; for instance, the idea that an application should be defined as a group of containers that can be created and destroyed at any time. Another concept is that containers should live on a virtual network that is accessible on all nodes in a cluster. You can expose your containers via a load balancer which, in the case of Docker Swarm, is just a basic TCP load balancer.
While the Rancher server was being created, Rancher Labs was working on their own Docker cluster software, called Cattle, which is when Rancher went General Availability (GA) with the launch of Rancher v1.0. Cattle was designed to address the limitations of Docker Swarm, which spanned several different areas.
The first was networking. Originally, Docker Swarm's networking overlay was built on Internet Protocol Security (IPsec) with the idea that each node in the cluster would be assigned a subnet; that is, a class C subnet by default. Each node would create an IPsec tunnel to all other nodes in the cluster. It would then use basic routing rules to direct traffic to the node where that container was hosted. For example, let's say a container on node01 with an IP address of 192.168.11.22 wants to connect to another container hosted on node02 with an IP address of 192.168.12.33. The networking swarm uses basic Linux routing to route anything inside the 192.168.12.0/24 subnet to node02 over the IPsec tunnel. This core concept is still in use today by the majority of Kubernetes's CNI providers. The main issue is in managing the health of these tunnels over time and dealing with compatibility issues between the nodes. Cattle addressed this issue by moving IPsec into a container and then wrapping a management layer to handle the creation, deletion, and monitoring of the tunnels.
The second main issue was to do with load balancing. With Docker Swarm, we were limited to very basic TCP/layer4 load balancing. We didn't have sessions, SSL, or connection management. This is because load balancing was all done by iptable rules. Cattle addressed this issue by deploying HAProxy on all nodes in the cluster. Following this, Cattle used a custom container, called rancher-metadata, to dynamically build HAProxy's config every time a container was created or deleted.
The third issue was storage. With Docker Swarm, there weren't any storage options outside bind mounting to a host filesystem. This meant that you had to create a clustered filesystem or shared network and then manually map them to all of your Docker hosts. Cattle addressed this by creating rancher-nfs, which is a tool that can mount NFS shares inside a container and create a bind mount. As Rancher went on, other storage providers were added, such as AWS and VMware.
Then, as time moved forward at Rancher, the next giant leap was when authentication providers were added, because Rancher provides access to the clusters that Rancher manages by integrating external authentication providers such as Active Directory, LDAP, and GitHub. This is unique to Rancher, as Kubernetes still doesn't integrate very well with external authentication providers.
Rancher is built around several core design principles:
The name Kubernetes originates from Greek, meaning helmsman or pilot. Kubernetes is abbreviated to k8s due to the number of letters between the K and S. Initially, engineers created Kubernetes at Google from an internal project called Borg. Google's Borg system is a cluster manager that was designed to run Google's internal applications. These applications are made up of tens of thousands of microservices hosted on clusters worldwide, with each cluster being made up of tens of thousands of machines. Borg provided three main benefits. The first benefit was the abstraction of resource and failure management, so application designers could focus on application development. The second benefit was its high reliability and availability by design. All parts of Borg were designed, from the beginning, to be highly available. This was done by making applications stateless. This was done so that any component could be destroyed at any time for any reason without impacting availability and, at the same time, could be scaled horizontally to hundreds of instances across clusters. The third benefit was an effective workload; Borg was designed to have a minimal overhead on the compute resources being managed.
Kubernetes can be traced directly back to Borg, as many of the developers at Google that worked on Kubernetes were formerly developers on the Borg project. Because of this, many of its core concepts were incorporated into Kubernetes, with the only real difference being that Borg was custom-made for Google, and its requirements for Kubernetes need to be more generalized and flexible. However, there are four main features that have been derived from Borg:
An example of this is a web server: the server creates logs, but how do you ship those logs to a log server like Splunk? One option is to add a custom agent to your application pod which is easy. But now that you manage more than one process inside a container, you'll have duplicate code in your environment, and most importantly, you now have to do error handling for both your main application and this additional logging agent. This is where sidecars come into play and allow you to bolt containers together inside a pod in a repeatable and consistent manner.
Kubernetes was designed to solve several problems. The primary areas are as follows:
Let's say that an application currently has two web servers, and you want to add a pod to handle the load. Just change the number of replicas to three because the current state doesn't match the desired state. The controllers kick up and start spinning up a new pod. This can be automated using Kubernetes' built-in horizontal pod autoscaler (HPA), which uses several metrics ranging from simple metrics such as CPU and memory to custom metrics such as overall application response times. Additionally, Kubernetes can use its vertical pod autoscaler (VPA) to automatically tune your CPU and memory limits over time. Following this, Kubernetes can use node scaling to dynamically add and remove nodes to your clusters as resources are required. This means your application might have 10 pods with 10 worker nodes during the day, but it might drop to only 1 pod with 1 worker node after hours. This means you can save the cost of 9 nodes for 16 hours per day plus the weekends; all of this without your application having to do anything.
We will compare both of these in the following section.
Kubernetes and Docker Swarm are open source container orchestration platforms that have several identical core functions but significant differences.
Kubernetes is a complex system with several components that all need to work together to make the cluster operate, making it more challenging to set up and administrate. Kubernetes requires you to manage a database (etcd), including taking backups and creating SSL certificates for all of the different components.
Docker Swarm is far simpler, with everything just being included in Docker. All you need to do is create a manager and join nodes to the swarm. However, because everything is baked-in, you don't get the higher-level features such as autoscaling, node provisioning, and more.
Kubernetes uses a flat network model with all pods sharing a large network subnet and another network for creating services. Additionally, Kubernetes allows you to customize and change network providers. For example, if you don't like a particular canal, or can't do network-level encryption, you can switch to another provider, such as Weave, which can do network encryption.
Docker Swarm networking is fundamental. By default, Docker Swarm creates IPsec tunnels between all nodes in the cluster using IPsec for the encryption. This speed can be good because modern CPUs provide hardware acceleration for AES; however, you can still take a performance hit depending on your hardware and workload. Additionally, with Docker Swarm, you can't switch network providers as you only get what is provided.
Kubernetes uses YAML and its API to enable users to define applications and their resources. Because of this, there are tools such as Helm that allow application owners to define their application in a templatized format, making it very easy for applications to be published in a user-friendly format called Helm charts.
Docker Swarm is built on the Docker CLI with a minimal API for management. The only package management tool is Docker Compose, which hasn't been widely adopted due to its limited customization and the high degree of manual work required to deploy it.
Kubernetes has been built from the ground up to be highly available and to have the ability to handle a range of failures, including pods detecting unhealthy pods using advanced features such as running commands inside the pods to verify their health. This includes all of the management components such as Kube-scheduler, Kube-apiserver, and more. Each of these components is designed to be stateless with built-in leader election and failover management.
Docker Swarm is highly available mainly by its ability to clone services between nodes, with the Swarm manager nodes being in an active-standby configuration in the case of a failure.
Kubernetes pods can be exposed using superficial layer 4 (TCP/UDP mode) load balancing services. Then, for external access, Kubernetes has two options. The first is node-port, which acts as a simple method of port-forwarding from the node's IP address to an internal service record. The second is for more complex applications, where Kubernetes can use an ingress controller to provide layer 7 (HTTP/HTTPS mode) load balancing, routing, and SSL management.
Docker Swarm load balancing is DNS-based, meaning Swarm uses round-robin DNS to distribute incoming requests between containers. Because of this, Docker Swarm is limited to layer 4 only, with no option to use any of the higher-level features such as SSL and host-based routing.
Kubernetes provides several tools in which to manage the cluster and its applications, including kubectl for command-line access and even a web UI via the Kubernetes dashboard service. It even offers higher-level UIs such as Rancher and Lens. This is because Kubernetes is built around a REST API that is highly flexible. This means that applications and users can easily integrate their tools into Kubernetes.
Docker Swarm doesn't offer a built-in dashboard. There are some third-party dashboards such as Swarmpit, but there hasn't been very much adoption around these tools and very little standardization.
Kubernetes provides a built-in RBAC model allowing fine-grained control for Kubernetes resources. For example, you can grant pod permission to just one secret with another pod being given access to all secrets in a namespace. This is because Kubernetes authorization is built on SSL certifications and tokens for authentication. This allows Kubernetes to simply pass the certificate and token as a file mounted inside a pod. This makes it straightforward for applications to gain access to the Kubernetes API.
The Docker Swarm security model is primarily network-based using TLS (mTLS) and is missing many fine-grained controls and integrations, with Docker Swarm only having the built-in roles of none, view only, restricted control, scheduler, and full control. This is because the access model for Docker Swarm was built for cluster administration and not application integration. In addition to this, originally, the Docker API only supported basic authentication.
Both Kubernetes and OpenShift share a lot of features and architectures. Both follow the same core design practices, but they differ in terms of how they are executed.
Kubernetes lacks a built-in networking solution and relies on third-party plug-ins such as canal, flannel, and Weave to provide networking for the cluster.
OpenShift provides a built-in network solution called Open vSwitch. This is a VXLAN--based software-defined network stack that can easily be integrated into RedHat's other products. There is some support for third-party network plugins, but they are limited and much harder to support.
Kubernetes takes the approach of being as flexible as possible when deploying applications to the cluster, allowing users to deploy any Linux distribution they choose, including supporting Windows-based images and nodes. This is because Kubernetes is vendor-agnostic.
OpenShift takes the approach of standardizing the whole stack on RedHat products such as RHEL for the node's operating system. Technically, there is little to nothing to stop OpenShift from running on other Linux distributions such as Ubuntu. Additionally, Openshift puts limits on the types of container images that are allowed to run inside the cluster. Again, technically, there isn't much preventing a user from deploying an Ubuntu image on the Openshift cluster, but they will most likely run into issues around supportably.
Kubernetes had a built-in tool for pod-level security called Pod Security Policies (PSPs). PSPs were used to enforce limits on pods such as blocking a pod from running as root or binding to a host's filesystem. PSPs were deprecated in v1.21 due to several limitations of the tool. Now, PSPs are being replaced by a third-party tool called OPA Gatekeeper, which allows all of the same security rules but with a different enforcement model.
OpenShift has a much stricter security mindset, with the option to be secure as a default, and it doesn't require cluster hardening like Kubernetes.
In this chapter, we learned about Rancher's history and how it got its start. Following this, we went over Rancher's core philosophy and how it was designed around Kubernetes. Then, we covered where Kubernetes got its start and its core philosophy. We then dived into what the core problems are that Kubernetes is trying to solve. Finally, we examined the pros and cons of Kubernetes, Docker Swarm, and OpenShift.
In the next chapter, we will cover the high-level architecture and processes of Rancher and its products, including RKE, K3s, and RancherD.