8

Deploying GKE Using Public Modules

In Chapter 4, Writing Reusable Code Using Modules, we introduced modules and public registries. In this chapter, we will learn how to effectively use modules from the public Terraform Registry to build a complete architecture. We will see how we can deploy complex architectures quickly by utilizing public modules. In particular, we use two of the most popular Google Cloud modules from the Terraform Registry – the network and kubernetes-engine modules.

In this chapter, we will learn how to use modules from the public registry and use Terraform workspaces to provision two independent environments by doing the following:

  • Developing a variable strategy
  • Provisioning a network using a public Terraform module
  • Provisioning a GKE cluster using a public Terraform module
  • Using workspaces to deploy development and production environments

Technical requirements

In this chapter, we are provisioning a Google Kubernetes Engine (GKE) cluster, so it helps to have a basic understanding of Kubernetes and GKE. We will deploy two independent GKE clusters in two different environments. We recommend using two projects for the two environments; however, you can also use a single project if you prefer. As in the previous two chapters, running Terraform in the cloud shell is easier, as we don’t need to set up a Terraform service account and its IAM permissions to provision the resources in both projects.

Please note that we will provision two GKE clusters with several VMs that incur costs. Depending on your quotas, you might not be able to have both GKE clusters running simultaneously in a single project.

After the exercise, it is important to destroy all the resources in both environments and access the web console to confirm that all VMs are removed.

The code for this chapter can be found here: https://github.com/PacktPublishing/Terraform-for-Google-Cloud-Essential-Guide/tree/main/chap08

Overview

As we mentioned in Chapter 5, Managing Environments, one of the fundamental principles in modern application development is maintaining dev/prod parity (https://12factor.net/dev-prod-parity). That is, different environments, such as development, testing, and production environments, should be as similar as possible. Terraform is the ideal tool to set up any number of environments and then delete them once they’ve been used. To minimize the cost, it is a common practice to provision smaller instance sizes or zonal resources instead of larger and regional resources in development environments. As we have discussed previously, by combining modules and workspaces, we can use Terraform to provision equivalent environments quickly and with minimal repeated effort.

In this exercise, we want to deploy two GKE clusters, one for development and one for production. The two clusters should be nearly identical, except that we want to save costs for the development cluster. Thus, we set up a regional high-availability cluster for production but a zonal cluster for development. In addition, we use small spot instances for the nodes in the development cluster but medium-sized regular instances for the production cluster. As the two environments are nearly identical, we use Terraform workspaces to manage the two environments.

Rather than starting from scratch, we utilize public modules. Both the Google blueprints and Terraform Registry contain many public modules. However, it’s up to you to decide whether a public module actually adds value or whether you are better off with your own code.

For example, we already have very efficient code to enable the necessary Google Cloud APIs and provision a service account, so we use our previous code. Ideally, we would have stored the code in a private module repository, but for simplicity’s sake, we will copy the code from the previous chapter.

Following Google’s security guidelines, we will deploy the GKE cluster in a custom VPC. Looking through the registry, we decided on two public modules: the network module to provision the custom VPC and the kubernetes-engine module to provision the GKE cluster.

Developing a variable strategy

Our objective is to create GKE clusters for multiple independent environments. In our example, we provision two environments – development and production. However, we want to keep it flexible enough to provide additional environments, such as testing or staging, when required. We also want to balance flexibility with ease of use. For this, we need to decide on a variable strategy. First, we must determine which values might differ from environment to environment and which should remain the same regardless of the environment. For example, we know we want to have different values for the node pool configurations, such as the initial and maximum number of nodes, but cluster configuration attributes such as network policy and HTTP load balancing are the same regardless of the environment. Thus, we need to define variables for the node pools but not for cluster configurations.

Second, we must decide which variables to make optional and which ones should be required. The three main components that we want to parameterize are the network details, the node pool of the cluster, and the cluster itself, so we define each of these variables as an object to provide a type constraint. In Terraform version 1.3, HashiCorp introduced default values for optional object type attributes, which can be very useful. Thus, in our variable declaration, we make several attributes optional. For example, the name and the type of GKE cluster (zonal or regional) and whether to use spot instances are required, whereas the machine type for the node pool, among other things, is optional. We follow a similar strategy for the network, and thus we declare the variables as follows:

chap08/variables.tf

variable "network" {
  type = object({
    name                = string
    subnetwork_name     = string
    nodes_cidr_range    = optional(string, "10.128.0.0/20")
    pods_cidr_range     = optional(string, "10.4.0.0/14")
    services_cidr_range = optional(string, "10.8.0.0/20")
  })
}
variable "gke" {
  type = object({
    name     = string
    regional = bool
    zones    = list(string)
  })
}
variable "node_pool" {
  type = object({
    name               = string
    machine_type       = optional(string, "e2-small")
    spot               = bool
    initial_node_count = optional(number, 2)
    max_count          = optional(number, 4)
    disk_size_gb       = optional(number, 10)
  })
}

Now that we have defined our variables, let’s look at the public modules to provision the network and the GKE cluster.

Provisioning a network using the public module

Google maintains the Google Cloud Terraform network module (https://registry.terraform.io/modules/terraform-google-modules/network/google/latest). It is part of the Cloud Foundation Toolkit (https://cloud.google.com/foundation-toolkit), a set of Terraform modules that follow Google Cloud best practices.

The starting point for using any public module is documentation. A well-written module includes clear and concise documentation and examples for various use cases. For example, the network module contains sections for inputs, outputs, dependencies, resources used, and examples of common use cases. These examples are stored in a public GitHub repository so that we can examine them. Public modules are written to cater to many different requirements and can contain many variables and configurations. Thus, it’s important to study the different configurations and input requirements. Furthermore, some modules have submodules that can be used independently. For example, the network module contains submodules for subnets and firewall rules that you can use in your existing VPC network.

We are using the example in the README as a basis for our configuration. Our VPC runs the GKE cluster, so we want to define the secondary IP CIDR ranges for the pods and services. In addition, we might want to SSH into the nodes for testing purposes, but only via the Identity-Aware Proxy (IAP), so we add a corresponding firewall rule to the module.

The official Terraform registry supports module versioning, and it is good practice to use this feature and enforce version constraints. Modules published in the official registry must follow semantic versioning (https://semver.org/). That is, the version number follows the pattern of major.minor.patch. For example, 4.1.2 indicates major version 4, minor version 1, and patch level 2. When using public modules, it is recommended to use specific versions so that you control the updates. We are using version 5.2.0 for the network module.
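For instance, the version argument below pins the network module to exactly 5.2.0; the commented-out alternative is a pessimistic constraint that would also accept future 5.2.x patch releases but not 5.3.0 or 6.0.0. This fragment only illustrates the constraint syntax; the complete module call follows in the next listing:

module "vpc" {
  source  = "terraform-google-modules/network/google"
  version = "= 5.2.0"    # exact pin: updates only when we edit this line
  # version = "~> 5.2.0" # pessimistic constraint: accepts 5.2.x patch releases
  # ... required inputs omitted in this fragment ...
}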

In addition, this particular module has three required inputs: project_id, network_name, and a list of subnets. Looking at the documentation, the subnets are defined as a list of maps of strings. Further down in the documentation, we see more details on the required and optional inputs for subnets.

While the main documentation clearly spells out the format for subnets and secondary_ranges, it only declares the firewall rules as any. In this case, we must consult the submodule documentation to determine the required inputs. The page at https://registry.terraform.io/modules/terraform-google-modules/network/google/latest/submodules/firewall-rules gives a detailed list of the required and optional inputs for the firewall rules. With this in mind, we define our VPC module as follows:

chap08/vpc.tf

module "vpc" {
  source  = "terraform-google-modules/network/google"
  version = "= 5.2.0"
  depends_on = [google_project_service.this["compute"]]
  project_id   = var.project_id
  network_name = var.network.name
  subnets = [
    {
      subnet_name           = var.network.subnetwork_name
      subnet_ip             = var.network.nodes_cidr_range
      subnet_region         = var.region
      subnet_private_access = "true"
    },
  ]
  secondary_ranges = {
    (var.network.subnetwork_name) = [
      {
        range_name    = "${var.network.subnetwork_name}-pods"
        ip_cidr_range = var.network.pods_cidr_range
      },
      {
        range_name    = "${var.network.subnetwork_name}-services"
        ip_cidr_range = var.network.services_cidr_range
      },
    ]
  }
  firewall_rules = [
    {
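      # Allow SSH ingress only from Google's Identity-Aware Proxy (IAP) range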
      name      = "${var.network.name}-allow-iap-ssh-ingress"
      direction = "INGRESS"
      ranges    = ["35.235.240.0/20"]
      allow = [{
        protocol = "tcp"
        ports    = ["22"]
      }]
    },
  ]
}

Now that we have defined the network, let’s turn our attention to the provisioning of the GKE cluster.

Provisioning a GKE cluster using the public module

The Terraform kubernetes-engine module – https://registry.terraform.io/modules/terraform-google-modules/kubernetes-engine/google/latest – is one of the more comprehensive (complex) Google Cloud Terraform modules. Our architecture calls for a zonal GKE cluster with a single configurable node pool for development and an equivalent but regional cluster for production. Thus, we start with the example in the README and modify it as per our needs. First, we remove features we don’t require, such as node pool labels and metadata. Next, we parameterize several attributes that differ depending on the environment. This includes the network and the configuration of the nodes in the node pool. We also include a Boolean variable, which indicates whether the cluster is a zonal or a regional cluster.

Finally, we set several fixed attributes that remain constant regardless of the environment in which the cluster is deployed. One note regarding this module: it first creates a GKE cluster with a default node pool. Only after the cluster with the default node pool has been created does the module add the defined node pool. Thus, the last two attributes specify the number of nodes in the initial (default) node pool and whether to remove that default node pool. In our case, we don’t care about the default node pool, so we set initial_node_count to 1 and remove_default_node_pool to true:

chap08/gke.tf

# google_client_config and kubernetes provider must be
# explicitly specified like the following.
data "google_client_config" "default" {
}
provider "kubernetes" {
  host                   = "https://${module.gke.endpoint}"
  token                  = data.google_client_config.default.access_token
  cluster_ca_certificate = base64decode(module.gke.ca_certificate)
}
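# The vpc module exposes its subnets as a map keyed by "<region>/<subnet name>"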
locals {
  subnetwork_name = module.vpc.subnets["${var.region}/${var.network.subnetwork_name}"].name
}
module "gke" {
  source     = "terraform-google-modules/kubernetes-engine/google"
  version    = "23.1.0"
  project_id = var.project_id
  region     = var.region
  name     = var.gke.name
  regional = var.gke.regional
  zones    = var.gke.zones
  network           = module.vpc.network_name
  subnetwork        = local.subnetwork_name
  ip_range_pods     = "${local.subnetwork_name}-pods"
  ip_range_services = "${local.subnetwork_name}-services"
  service_account = google_service_account.this.email
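  # The environment-specific node pool, added after the cluster is created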
  node_pools = [
    {
      name               = var.node_pool.name
      machine_type       = var.node_pool.machine_type
      disk_size_gb       = var.node_pool.disk_size_gb
      spot               = var.node_pool.spot
      initial_node_count = var.node_pool.initial_node_count
      max_count          = var.node_pool.max_count
      disk_type          = "pd-ssd"
    },
  ]
  # Fixed values
  network_policy             = true
  horizontal_pod_autoscaling = true
  http_load_balancing        = true
  create_service_account     = false
  initial_node_count       = 1
  remove_default_node_pool = true
}
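Although the chapter’s listing does not define any outputs, a small, hypothetical outputs.tf (not part of the code shown here) can make the cluster easier to inspect after an apply. module.gke.endpoint is the same output we already feed to the kubernetes provider; we mark our output as sensitive since it exposes the control plane address:

output "cluster_name" {
  description = "Name of the GKE cluster"
  value       = var.gke.name
}
output "cluster_endpoint" {
  description = "Endpoint of the GKE control plane"
  value       = module.gke.endpoint
  sensitive   = true
}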

Now that we have defined the variable definitions and the modules, let’s focus on the variable assignments.

Using workspaces to deploy to development and production environments

In Chapter 5, Managing Environments, we discussed the two main methods to support multiple environments using Terraform – workspaces and directory structure. In this case, we decide to use workspaces, as our two environments are very similar.

Thus, first, create two workspaces named dev and prod:

$ terraform workspace new prod
$ terraform workspace new dev
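The workspace created last becomes the active one. You can confirm which workspace is selected at any time; terraform workspace list marks the active workspace with an asterisk:

$ terraform workspace list
  default
* dev
  prod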

Once we have created the two workspaces, we write a variable definition file (.tfvars) for each of the two environments. Since we declared many of our variables as optional, the development variable definition file is quite short. It consists mainly of the zone in which we want to deploy the cluster and the names of the various cloud resources:

chap08/dev.tfvars

project_id = "<PROJECT_ID>"
region     = "us-west1"
zone       = "us-west1-a"
network = {
  name            = "dev-gke-network"
  subnetwork_name = "us-west1"
}
gke = {
  name     = "dev-gke-cluster"
  regional = false
  zones    = ["us-west1-a"]
}
node_pool = {
  name = "dev-node-pool"
  spot = true
}
service_account = {
  name  = "dev-sa"
  roles = []
}

Now, we can deploy the development cluster by selecting the dev workspace and running Terraform with the appropriate .tfvars file, as follows:

$ terraform workspace select dev
$ terraform apply -var-file=dev.tfvars

Note

Please note that this GKE module requires the Compute Engine API and the Kubernetes Engine API to be enabled before Terraform can run a plan. Hence, we included a small script to enable those APIs. Alternatively, we can enable them via the web console.
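The exact script in the chapter’s repository may differ, but enabling both APIs from the command line amounts to something like the following:

$ gcloud services enable compute.googleapis.com \
    container.googleapis.com --project <PROJECT_ID>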

Creating a GKE cluster with a defined node pool does take some time. As mentioned, Terraform first creates a GKE cluster with a default node pool, and then creates a separately managed node pool. Thus, it can take up to 20 minutes for the cluster to be provisioned entirely. We can observe the nodes being created and deleted by using the console.

We include a sample file for the production cluster. Feel free to modify the file. We can deploy the production cluster in the same or a separate project. We need to update the project ID in the prod.tfvars file to deploy it into a separate project. If we use a service account rather than the cloud shell, we need to ensure that the service account has the right IAM permissions in both projects. We do not need to modify the backend location as Terraform writes the state file in the same bucket but under the appropriate workspace name.
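As a point of reference, a production variable definition file could look like the following sketch. The concrete values are illustrative only (in particular, e2-medium as the “medium-sized” machine type and the list of zones); the prod.tfvars included in the repository may differ:

project_id = "<PROJECT_ID>"
region     = "us-west1"
zone       = "us-west1-a"
network = {
  name            = "prod-gke-network"
  subnetwork_name = "us-west1"
}
gke = {
  name     = "prod-gke-cluster"
  regional = true
  zones    = ["us-west1-a", "us-west1-b", "us-west1-c"]
}
node_pool = {
  name         = "prod-node-pool"
  machine_type = "e2-medium"
  spot         = false
}
service_account = {
  name  = "prod-sa"
  roles = []
}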

To provision the production cluster, apply the equivalent commands for production:

$ terraform workspace select prod
$ terraform apply -var-file=prod.tfvars

Once the cluster is deployed, we can test it with any Kubernetes configurations. An interesting application is the Online Boutique sample application from Google at https://github.com/GoogleCloudPlatform/microservices-demo.

To use it, clone the repository and connect to our GKE cluster using the gcloud container clusters get-credentials command before applying the configuration file, as follows:

$ git clone https://github.com/GoogleCloudPlatform/microservices-demo.git
$ gcloud container clusters get-credentials dev-gke-cluster \
    --zone us-west1-a --project <PROJECT_ID>
$ kubectl apply -f \
    ./microservices-demo/release/kubernetes-manifests.yaml

It takes a few minutes for the application to be fully deployed. Once it is, we can retrieve the public IP address using the following command and access it by using any browser:

$ kubectl get service frontend-external | awk '{print $4}'
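The awk filter also prints the EXTERNAL-IP column header; if you prefer just the address, kubectl can extract it directly with a JSONPath expression:

$ kubectl get service frontend-external \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}'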

If you get a 500 error in your browser, wait a few minutes, and reload it.

Summary

In this chapter, we learned how to utilize public modules. Public modules can help us create complex systems more rapidly than building them from scratch. Public modules are also an excellent resource for learning Terraform language tricks. All public modules reside in GitHub repositories and are free to explore.

Furthermore, we used workspaces to create two independent environments by writing two separate variable definition files and creating two workspaces. Creating a third environment is as easy as creating a new variable definition file and workspace and running Terraform in that workspace. That is the power of Terraform.

Now that we have demonstrated different approaches to provision three very different architectures, we will introduce some tools to help us develop Terraform code more efficiently. Before we go on, be sure to delete the GKE clusters, as they incur considerable costs due to the number of VMs provisioned.
