2 Writing infrastructure as code

This chapter covers

  • How the current infrastructure state affects the reproducibility of infrastructure
  • Detecting and remediating infrastructure drift due to mutable changes
  • Implementing best practices for writing reproducible infrastructure as code

Imagine you’ve created a development environment for a hello-world application. You built it organically, adding new components as you needed them. Eventually, you need to reproduce the configuration for production use, which people can publicly access. You also need to scale production across three geographic regions for high availability.

To do this, you must create and update firewalls, load balancers, servers, and databases in new networks for the production environment. Figure 2.1 shows the complexity of the development environment with the firewall, load balancer, server, and database and the components you need to reproduce in production.

The figure also outlines the differences between development and production. The production configuration needs three servers for high availability, expanded firewall rules to allow all HTTP traffic, and stricter firewall rules for the servers to connect to the database. After reviewing all of the differences, you might have a lot of questions about the best and easiest way to make the changes.

Figure 2.1 When you create a production environment based on the development environment, you must answer many questions about configurations for new infrastructure and reverse engineer the functionality of the development environment.

You might wonder, for example, why the lack of infrastructure as code for the development environment affects your ability to create the production one. The first reason is that you cannot easily reproduce the infrastructure resources. You have to reverse engineer a week’s worth of manual configuration! With IaC, you can instead copy and paste some configuration and modify it for the production environment.

Second, you cannot easily compose the infrastructure resources with new ones. You need a pool of servers for production instead of a single server. If you built an infrastructure module, you could use that building block to create multiple servers without updating the configuration from scratch.

Finally, you cannot easily evolve the production environment with its specific requirements. The production environment requires some different infrastructure resources, like secure operating systems and a larger database. You’ll have to manually tweak configuration that you’ve never run in the development environment.

You can solve these challenges and improve reproducibility, composability, and evolvability in two ways. First, you need a way to migrate manually configured infrastructure to IaC. Second, you need to write clean IaC to promote reproducibility and evolvability.

The first part of this chapter outlines fundamental concepts for writing IaC and migrating existing infrastructure to code. The second part of this chapter applies code hygiene practices to infrastructure. The combination of these practices will help you write reproducible IaC and set the stage for future composition and evolution of your system.

2.1 Expressing infrastructure change

I mentioned in chapter 1 that IaC automates changes. It turns out that reproducing and automating many changes over time takes effort. For example, if you want to provision and manage a server on GCP, you’ll usually make the following changes over time:

  1. Create the server in GCP by using the console, terminal, or code.

  2. Read the server in GCP to check that you created the server with the correct specifications—for example, Ubuntu 18.04 as the operating system.

  3. Update the server in GCP with a publicly accessible network address to log into it.

  4. Delete the server in GCP because you no longer require it.

To make more complex updates or reproduce the server in another environment, you take the following steps:

  1. Create the server.

  2. Check if it exists by using a read command.

  3. Update it if you need to log in.

  4. Delete the server if you no longer need it.

No matter which resource you automate, you can always break down your changes to create, read, update, and delete (CRUD). You create an infrastructure resource, search for its metadata, update its properties, and delete it when you no longer need it.

Note You wouldn’t usually have a change record that explicitly states “read the server.” The record usually implies a read step to verify that a resource is created or updated.

CRUD allows you to automate your infrastructure step-by-step in a specific order. This approach, called the imperative style, describes how to configure infrastructure. You can think of it as an instruction manual.

Definition The imperative style of IaC describes how to configure an infrastructure resource step-by-step.
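As a minimal sketch of the imperative style, the following Python walks through the four CRUD steps in order. The FakeCloud class, its method names, and the server attributes are hypothetical stand-ins for a real provider API, not part of any actual SDK.

```python
class FakeCloud:
    """Hypothetical in-memory stand-in for an infrastructure provider API."""

    def __init__(self):
        self.servers = {}

    def create_server(self, name, os):
        self.servers[name] = {"os": os, "public_ip": None}

    def read_server(self, name):
        # Returns None when the server does not exist.
        return self.servers.get(name)

    def update_server(self, name, **changes):
        self.servers[name].update(changes)

    def delete_server(self, name):
        del self.servers[name]


cloud = FakeCloud()

# Imperative style: each line states how to change the server, in order.
cloud.create_server("hello-world", os="ubuntu-18.04")
assert cloud.read_server("hello-world")["os"] == "ubuntu-18.04"
cloud.update_server("hello-world", public_ip="203.0.113.10")
server_before_delete = dict(cloud.read_server("hello-world"))
cloud.delete_server("hello-world")
```

Notice that the script only works if you run every step in sequence from a known starting point. Reproducing the server elsewhere means replaying the whole history.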

While it seems intuitive, the imperative style does not scale as you make more changes to the system. I once had to create a new database environment based on a development environment. I started reconstructing the 200 change requests submitted to the development environment over two years. Each change request became a series of steps creating, updating, and deleting resources. It took me a month and a half to complete an environment that still didn’t match the existing development one!

Rather than painstakingly re-create every step, I wished I could just describe the new database environment based on the running state of the development environment and let a tool figure out how to achieve the state. With most IaC, you will find it easier to reproduce environments and make changes in the declarative style. The declarative style describes the desired end state of an infrastructure resource. The tool decides the steps it needs to take to configure the infrastructure resource.

Definition The declarative style of IaC describes the desired end state of an infrastructure resource. Automation and tooling decide how to achieve the end state, so you do not need to specify the individual steps.

This process with IaC takes a few steps. First, you need to search inventory sources for information on the database servers. Next, you get the database IP addresses. Finally, you write a configuration based on the information you’ve collected.

Your configuration in version control becomes the infrastructure source of truth. You declare the new database environment’s desired state instead of describing a set of steps that may not end in the same result.

Definition An infrastructure source of truth structures information about the state of your infrastructure system consistently and singularly.
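To make the declarative idea concrete, here is a rough sketch of how a tool might derive steps from a desired state. The plan function, resource names, and attribute values are assumptions for illustration; real provisioning tools perform a far more detailed comparison.

```python
def plan(desired, actual):
    """Return the CRUD steps needed to move `actual` to `desired`.

    Both arguments map a resource name to a dict of attributes.
    """
    steps = []
    for name, attributes in desired.items():
        if name not in actual:
            steps.append(("create", name))
        elif actual[name] != attributes:
            steps.append(("update", name))
    for name in actual:
        if name not in desired:
            steps.append(("delete", name))
    return steps


# Desired state, as it might be stored in version control (hypothetical values).
desired = {
    "dev-database": {"ip": "10.0.1.5", "memory_gb": 10},
    "dev-database-replica": {"ip": "10.0.1.6", "memory_gb": 10},
}
# Actual state, as reported by an inventory source (hypothetical values).
actual = {
    "dev-database": {"ip": "10.0.1.5", "memory_gb": 8},
    "old-database": {"ip": "10.0.1.9", "memory_gb": 4},
}

steps = plan(desired, actual)
```

You describe only the end state. The tool works out that it must update one database, create another, and delete a third.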

You make all changes on the infrastructure source of truth. However, even in ideal circumstances (such as with GitOps in chapter 7), you probably have some configuration drift from manual changes over time. If you use the declarative style and create a source of truth, you can use immutability to change infrastructure and lower the risk of failure.

Exercise 2.1

Does the following use the imperative or declarative style of configuring infrastructure?

if __name__ == "__main__":
    update_packages()
    read_ssh_keys()
    update_users()
    if enable_secure_configuration:
        update_ip_tables()

See appendix B for answers to exercises.

2.2 Understanding immutability

How do you prevent configuration drift and quickly reproduce your infrastructure? It starts with changing the way you think about change. Imagine you create a server with Python version 2. You could update your scripts to log into the server and upgrade Python without restarting the server. You can treat the server as mutable infrastructure because you update the server in place without restarting it.

Definition Mutable infrastructure means that you can update the infrastructure resource in place without re-creating or restarting it.

However, treating the server as mutable infrastructure raises an issue. Other packages on the server do not work with Python 3. Rather than update every other package and break the server, you can change your update scripts to create a new server with Python version 3 and compatible dependencies. Then you can delete the old server with Python 2. Figure 2.2 shows how to do this.

Figure 2.2 You treat the server mutably by logging in and updating the Python package version. By comparison, you treat the server immutably by replacing the old server with a new one upgraded to Python 3.

Your new scripts treat the server as immutable infrastructure, in which you replace the existing infrastructure with a new resource that includes the changes. You do not update the infrastructure in place. Immutability means that after you create a resource, you do not change its configuration.

Definition Immutable infrastructure means you must create a new resource for any changes to infrastructure configuration. You do not modify the resource after creating it.
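The difference between the two approaches can be sketched with plain Python dictionaries. The server structure, identifiers, and function names here are hypothetical and only illustrate the contrast.

```python
import copy


def update_in_place(server, python_version):
    """Mutable change: modify the existing server and keep its identity."""
    server["packages"]["python"] = python_version
    return server


def replace(server, python_version, new_id):
    """Immutable change: create a new server with the change.

    The old server is left untouched so you can delete it afterward.
    """
    new_server = copy.deepcopy(server)
    new_server["id"] = new_id
    new_server["packages"]["python"] = python_version
    return new_server


old_server = {"id": "server-1", "packages": {"python": "2"}}
new_server = replace(old_server, "3", new_id="server-2")
```

After the replacement, the old server still runs Python 2 until you delete it, which gives you a working fallback if the new server fails.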

Why treat the server’s update in two different ways? Some changes will break the resource if you do them mutably. With immutability, you mitigate the risk of failure by creating a whole new resource with the updates and removing the old one.

Immutability relies on a series of creation and deletion changes. Creating a new resource alleviates drift (difference in actual versus expected configuration) because the new resource aligns with the IaC you use to create it. You can expand this beyond server resources to even serverless functions or entire infrastructure clusters. You choose to create a new resource with changes instead of updating the existing one.

Note Machine image builders work with the concept of immutable infrastructure. Any updates to a server require a new machine image, which the builder generates and provides. Modifications to the server, such as IP address registration, should be passed as parameters to a startup script defined by the image builder.

The enforcement of immutability affects the way you make changes. Creating a new resource requires the principle of reproducibility. As a result, IaC lends itself well to enforcing immutability as you make changes. For example, you might create a new firewall each time you need to update it. The new firewall overrides any manual rules someone added outside of IaC, facilitating security and reducing drift.

Immutability also promotes system availability and mitigates any failures to mission-critical applications. Instead of updating an existing resource in place, creating a new one isolates changes to the new resource, limiting the blast radius if something goes wrong. I discuss more on this in chapter 9.

However, immutability sometimes comes at the cost of time and effort. Figure 2.3 compares the effect of mutable versus immutable infrastructure. When you treat the server as a mutable resource, you localize the effect of the Python in-place update. Updating Python affects a small part of the server’s overall state. When you treat a server immutably, you replace the entire server’s state, affecting any resources dependent on that server.

Here, replacing the entire state immutably can take longer than changing mutable infrastructure in place! You cannot expect to treat all infrastructure immutably all the time. If you change tens of thousands of servers immutably, you spend a few days re-creating all of them. An in-place update may take only a day if you don’t break anything.

Figure 2.3 Changes to mutable resources affect a small portion of the infrastructure state, while immutable resources replace the entire resource state.

You’ll find that you will switch between treating infrastructure as mutable and immutable, depending on the circumstance. Immutable infrastructure helps mitigate potential risk of failure across your system, while mutable infrastructure facilitates faster changes. You often treat infrastructure as mutable when you need to fix the system. How do you migrate between mutable and immutable infrastructure?

2.2.1 Remediating out-of-band changes

You cannot expect to deploy a new resource every time you make a change. Sometimes changes might seem minor in scope and impact. As a result, you decide to make the change mutably.

Imagine you and your friend meet at a coffee shop. Your friend orders a cappuccino with a nondairy alternative. However, the barista adds milk. The barista then needs to make a new cappuccino for your friend because the milk affects the whole cup. Your friend waits for another 5 to 10 minutes. You get a cup of coffee instead and add milk and sugar to taste. If you don’t have enough sugar, you just add more.

You take less time to change your mutable coffee than your friend takes to replace their immutable cappuccino. Similarly, it takes far less time, effort, and cost to execute changes to a mutable resource. When you temporarily treat infrastructure as a mutable resource, you make an out-of-band change.

Definition An out-of-band change is a quickly implemented change that temporarily treats immutable infrastructure as mutable infrastructure.

When you break immutability with an out-of-band change, you reduce the change time but increase the risk of affecting another change in the future. After you make the out-of-band change, you need to update your source of truth to return to immutable infrastructure. How do you start this remediation process?

You must reconcile the actual state and desired configuration when making an out-of-band change. Let’s apply this to my server example in figure 2.4. First, you log into the server and upgrade Python to version 3. Then, you change the configuration in version control, so new servers install Python version 3. The source of truth in version control now matches the server’s state.

Figure 2.4 After updating a mutable resource, you need to update version control to account for the out-of-band change.

Why should you update IaC for the out-of-band change? Remember from chapter 1 that manual changes may affect reproducibility. Making sure that you transition a change made to mutable infrastructure to future immutable infrastructure preserves reproducibility. After remediating the out-of-band change and adding it to IaC, you can redeploy changes repeatedly to my server, and nothing should change. This behavior conforms to idempotency!
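The remediation and the idempotency check can be sketched as follows. The detect_drift and apply functions, along with the attribute names, are assumptions for illustration rather than the behavior of any particular tool.

```python
def detect_drift(config, state):
    """Return {attribute: (expected, actual)} for every mismatch."""
    return {
        key: (expected, state.get(key))
        for key, expected in config.items()
        if state.get(key) != expected
    }


def apply(config, state):
    """Make the state match the configuration; return the number of changes."""
    drift = detect_drift(config, state)
    for key, (expected, _actual) in drift.items():
        state[key] = expected
    return len(drift)


# The source of truth still says Python 2; someone upgraded the server by hand.
config = {"python": "2", "os": "ubuntu-18.04"}
state = {"python": "3", "os": "ubuntu-18.04"}
drift_before = detect_drift(config, state)

# Remediate by updating the source of truth to match the out-of-band change.
config["python"] = "3"

# Idempotency: the first run on a new server makes changes, the second does not.
new_server = {}
first_run = apply(config, new_server)
second_run = apply(config, new_server)
```

Once the configuration includes the out-of-band change, applying it to the existing server reports no drift, and applying it twice to a new server changes nothing on the second run.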

You will continuously reconcile state and source of truth if you make many mutable changes. You should prioritize immutability to promote reproducibility. A barista can always replace a drink in my coffee example, even if you spill the sugar container into your mutable coffee. I recommend using your organization’s change procedures to limit out-of-band changes and ensure that the updates align with the configuration in IaC. You can always use the immutable infrastructure configuration to fix a failed mutable change.

Exercise 2.2

Which of the following changes benefit from the principle of immutability? (Choose all that apply.)

A) Reducing a network to have fewer IP addresses

B) Adding a column to a relational database

C) Adding a new IP address to an existing DNS entry

D) Updating a server’s packages to backward-incompatible versions

E) Migrating infrastructure resources to another region

See appendix B for answers to exercises.

2.2.2 Migrating to infrastructure as code

Immutability through IaC allows version control to manage infrastructure configuration as a source of truth and facilitate future reproduction. In fact, conforming to immutability means that you create new resources all the time. It works well for greenfield environments, which do not have active resources.

However, most organizations have a brownfield environment, an existing environment with active servers, load balancers, and networks. Recall that the chapter example includes a brownfield development environment called hello-world. You went into your infrastructure provider and manually created a set of resources.

In general, a brownfield environment treats infrastructure as mutable. You need a way to change your practice of manually changing mutable infrastructure to automatically updating immutable IaC. How do you migrate the environment’s infrastructure resources to immutability?

Let’s migrate the hello-world development environment to immutable IaC. Before you begin, you make a list of infrastructure resources in the environment. It contains networks, servers, load balancers, firewalls, and Domain Name System (DNS) entries.

Base infrastructure

To start, you find the base infrastructure resource that other resources need to use. For example, every infrastructure resource depends on the network in the development environment. You start writing IaC for the database and development networks because the server, load balancer, and database run on them. You cannot reconstruct any resources that run on the networks until the networks exist as code.

Figure 2.5 Reverse engineer the networks for the database and server first and write their configurations as code.

In figure 2.5, you use your terminal to access the infrastructure provider application programming interface (API). Your terminal command prints out the name and IP address range (classless inter-domain routing, or CIDR, block) of the development database and development networks. You reconstruct each network by copying the name and CIDR block of each network into IaC.

Why reverse engineer and reproduce the network in IaC? You must match the IaC with the network’s actual resource state exactly. If you have a mismatch, called drift, you’ll find that your IaC may break your network (and anything on it) by accident!

If possible, import the resource into an IaC state. The resource already exists, and you need your provisioning tool to recognize that. The import step migrates the existing resource to IaC management. To complete the network resource migration, run the IaC again and check that you don’t have drift.

Many provisioning tools have a function for importing resources. For example, CloudFormation uses the resource import command. Similarly, Terraform offers terraform import.

If you write IaC without a provisioning tool, you do not need a direct import capability. Instead, you write code to create a new resource. Sometimes it’s easier to use reproducibility to create a whole new resource. If you cannot easily create a new resource, write code with conditional statements that check whether the resources exist.
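A conditional check for resource existence might look like the following sketch. The ensure_network function and the dictionary of networks are hypothetical; in practice, you would query your provider's API for the existing networks.

```python
def ensure_network(existing, name, cidr):
    """Create the network only if it does not already exist.

    `existing` is a hypothetical dict of network name -> CIDR block, standing
    in for whatever your provider's API returns.
    """
    if name not in existing:
        existing[name] = cidr
        return "created"
    if existing[name] != cidr:
        # The code and the live resource disagree; reconcile before changing.
        return "drift"
    return "unchanged"


networks = {"development": "10.0.0.0/16"}  # hypothetical existing network
first = ensure_network(networks, "development-database", "10.1.0.0/16")
second = ensure_network(networks, "development-database", "10.1.0.0/16")
drifted = ensure_network(networks, "development", "10.2.0.0/16")
```

The second call leaves the network alone, and the third reports drift instead of overwriting the live CIDR block. I chose to report drift rather than fix it automatically so that you can decide which side is correct.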

Figure 2.6 The decision workflow for migration helps you decide how to import your infrastructure with a provisioning tool, re-create a resource, or build conditional statements to check for resource existence. No matter which option, you must rerun your IaC and reconcile any drift.

Figure 2.6 captures the entire decision workflow of reconstructing the network and whether you can use a provisioning tool to migrate your resource to immutability. The diagram includes the considerations for creating new resources or writing conditional statements for existing resources. As you migrate, you run your IaC multiple times to check for drift.

Why do you have so many decision workflows for migrating to immutability? All of these practices adhere to the principles of reproducibility, idempotency, and composability. You want to reproduce the resources in IaC as accurately as possible. If you can’t import the resource, you can at least reproduce a new one.

Furthermore, rerunning the code uses the principle of idempotency, which ensures that you don’t re-create the resource (unless necessary). If you reconcile drift, idempotency should not change the active network. Similarly, composability allows you to migrate each resource separately to avoid disrupting the system.

As you work on other resources, keep the decision workflow in mind. You can apply it to each resource you migrate to IaC until you complete the migration. You’ll revisit parts of this decision workflow when you refactor IaC in chapter 10.

Resources dependent on base infrastructure

After reconstructing the base network infrastructure, you can work on the servers and other components. Once again, you use your terminal to print out attributes for the hello-world server. It runs in region A with an Ubuntu operating system and one CPU. You write the server specification in its configuration, making note of its dependency on the development network. Similarly, you use your terminal to learn that the database uses 10 GB of memory. You copy this into IaC and record its use of the development database network. Figure 2.7 shows the process of migrating the server and database to code.

Next, you want to migrate the second set of resources, which use the network. Use composability to isolate these infrastructure resources and make iterative updates. Small changes to each subsequent level of infrastructure help prevent larger system failures. In chapter 7, you’ll learn more about deploying small changes to infrastructure.

Before you move to the next set of resources, complete the cycle of migration by running your IaC and checking for drift. Ensure that the network, server, and database do not show changes in their IaC. After you reconcile any new drift, you can move onto the remaining resources (DNS, firewall rules, and load balancers).

Figure 2.7 After migrating base infrastructure such as networks, migrate the server and database resources. They depend on base infrastructure but do not depend on each other.

Finally, figure 2.8 rebuilds the remaining configuration for DNS, firewall rules, and load balancers. They depend on the existing configuration of servers and databases. No other resources depend on them.

Figure 2.8 Finally, migrate resources that have the fewest dependents or that require server and database configuration.

Why go through the painstaking process of reconstructing various levels of infrastructure? Your brownfield environment did not have a consistent source of truth, so you need to build one. When you finish adding the infrastructure resources to the configuration, you reconstruct a source of truth for the environment. A source of truth with IaC allows you to treat the brownfield environment as immutable infrastructure.

Outside of the example, you’ll always migrate to immutability from base to top-level resources. Identify resources that others heavily depend upon when you start the migration. Write these low-level resources—such as networks, accounts or projects, and IAM—into IaC.

Next, choose resources such as servers, queues, or databases. Firewalls, load balancers, DNS, and alerts depend on the existence of servers, queues, and databases. You can migrate resources with the fewest dependencies at the end of the process. We’ll discuss more about infrastructure dependencies in chapter 4.

Note A dependency graph represents the dependencies among infrastructure resources. IaC tools, such as Terraform, use this concept to apply changes in a structured way. When you migrate resources, you reconstruct the dependency graph. You can investigate tooling that will map the live infrastructure state for you and highlight dependencies to make this easier.
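If you want to work out a migration order yourself, a small sketch with Python's standard graphlib module (available from Python 3.9) can help. The resource names and dependency edges are my assumptions based on the hello-world example, not output from a real tool.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each resource maps to the set of resources it depends on (assumed edges).
dependencies = {
    "network": set(),
    "server": {"network"},
    "database": {"network"},
    "load-balancer": {"server"},
    "firewall-rule": {"server", "database"},
    "dns": {"server"},
}

# static_order() yields each resource only after everything it depends on.
migration_order = list(TopologicalSorter(dependencies).static_order())
```

The resulting order always starts with the network and places the server and database before the load balancer, firewall rule, and DNS. The exact order among resources at the same level may vary.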

Migration steps

I usually follow general steps to assess dependencies and structure migrating existing resources to IaC:

  1. Migrate initial login, accounts, and provider resource isolation constructs. For example, I write the configuration for a cloud provider’s account or project and my initial service account for automation.

  2. Migrate networks, subnetworks, routing, and root DNS configuration, if applicable. The root DNS configuration can include Secure Sockets Layer (SSL) certificates. For example, I created the root domain hello-world.net and its SSL certificate to prepare for subdomains such as dev.hello-world.net.

  3. Migrate computing resources such as application servers or databases.

  4. If you use a compute orchestration platform, migrate the platform and its components. For example, I migrate my Kubernetes cluster to schedule workloads across servers.

  5. If you use a compute orchestration platform, migrate the application deployments to the compute orchestration platform. For example, I backport the configuration of the hello-world application deployed on Kubernetes.

  6. Migrate messaging queues, caches, or event-streaming platforms. These services depend on applications, which must exist before you can reconstruct the services. For example, I write the configuration for a messaging queue to communicate between hello-world and another application.

  7. Migrate DNS subdomains, load balancers, and firewalls. For example, I re-create a configuration for a firewall rule between my hello-world application and its database.

  8. Migrate alerts or monitoring related to resources. For example, I reconstruct my configuration to notify me if the hello-world application fails.

  9. Finally, migrate SaaS resources, such as data processing or repositories, that do not depend on applications. For example, this could be a data transform job on GCP that has a singular dependency on a database.

Between each step, make sure you test that you’ve correctly migrated the initial resources by rerunning the configuration. You rarely get all the parameters and dependencies you need on the first try.

Note Rerunning the migrated configuration should not change existing infrastructure because of idempotency. You should reapply the configuration and check the dry run. Changes in your dry run mean your configuration has not accurately captured the actual state of the resource.

If you run the configuration and it outputs changes, you must correct your configuration! The process requires trial and error. As a result, I recommend you test and verify each set of resources.

Migrating to immutability becomes an exercise in reducing drift. This process shows an extreme circumstance where the configuration has drifted far from the state. You work to reconcile the source of truth by updating its configuration in version control. The process of importing existing resources to a new source of truth applies to refactoring IaC, something we’ll discuss in chapter 10.

2.3 Writing clean infrastructure as code

Besides using immutability, you can promote reproducibility by writing configuration cleanly. Code hygiene refers to a set of practices to enhance the readability and structure of code.

Definition Code hygiene is a set of practices and styles to enhance the readability and maintainability of code.

IaC hygiene helps save time when you need to reuse the configuration. I often find infrastructure configuration copied, pasted, and edited with hardcoded values. Hardcoded values reduce readability and reproducibility. While many of these practices come from software development, I suggest some practices specific to infrastructure.

2.3.1 Version control communicates context

How do you use version control effectively to enable reproducibility? Structured practices around version control help you quickly reproduce configuration and make informed changes. For example, you might update a firewall rule in development that allows traffic from app-network to shared-services-network. You add the following commit message to describe why you added the allowance:

$ git commit -m "Allow app to connect to queues
app-network needs to connect to shared-services-network 
because the application uses queues. 
All ports need to be open."

A few weeks later, you reproduce the network in production. However, you forgot why you added the allowance. When you examine the commit history, you remember your descriptive message. You now have information that the application needs to access queues.

When you write commit messages for IaC, you do not need to explain the configuration. The change already captures what the configuration will be. Instead, use the commit message to explain why you want to make the change and how it will affect other infrastructure.

Note In this book, I address version control practices specific to IaC. To learn more about version control, check out the “Getting Started—About Version Control” Git tutorial at http://mng.bz/pOBR. For more on writing good commit messages, check out “Distributed Git—Contributing to a Project” at http://mng.bz/OoMj. Content for both is from Pro Git by Scott Chacon and Ben Straub (Apress, 2014).

You also might have an audit requirement to prefix an issue number or ticket number to the front of the commit message for traceability. For example, you might work on a ticket numbered TICKET-002. It contains a request to allow traffic between the application and shared services. To correlate the ticket to your commit, you add the ticket to the start of the commit message:

$ git commit -m "TICKET-002 Allow app to connect to queues
app-network needs to connect to shared-services-network 
because the application uses queues. 
All ports need to be open."

Adding the work item or ticket information to commit messages makes it easier to track changes. Configuration becomes change documentation because it is the source of truth for infrastructure resources. Version control also becomes a mechanism for documenting changes. You can reconstruct the history of changes and reproduce environments by examining version control and configuration.
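If you need to enforce the ticket prefix, a short check like this sketch could run in a Git commit-msg hook, which receives the path to the message file as its argument. The TICKET-002 pattern and the function name are assumptions; adjust the expression to your own ticketing convention.

```python
import re

# Hypothetical convention: the subject starts with a ticket ID such as TICKET-002.
TICKET_PREFIX = re.compile(r"^[A-Z]+-\d+ \S")


def has_ticket_prefix(message):
    """Check that the first line of a commit message starts with a ticket ID."""
    lines = message.splitlines()
    subject = lines[0] if lines else ""
    return bool(TICKET_PREFIX.match(subject))


with_ticket = has_ticket_prefix(
    "TICKET-002 Allow app to connect to queues\n"
    "app-network needs to connect to shared-services-network"
)
without_ticket = has_ticket_prefix("Allow app to connect to queues")
```

The check looks only at the subject line, so the body of the message stays free for explaining why you made the change.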

2.3.2 Linting and formatting

Before you commit your code, you want to lint it and format it. IaC often will not execute because you missed a space (or two) or used the wrong field name. The wrong field name could lead to an error. Misaligned code can often cause you to misread or skip a configuration line.

Imagine you configure a server, and it needs a field called ip_address. Instead, you name the field ip and later realize you cannot create the server with your IaC. How can you make sure you’ve written the field as ip_address?

You can use linting to analyze your code and flag nonstandard or incorrect configurations. Most tools offer a way to lint the configuration or code. Linting for ip_address catches the wrong field name of ip early in development.

Definition Linting automatically checks the style of your code for nonstandard configuration.

Why check for nonstandard or incorrect configuration? You want to make sure you write the proper configuration and don’t miss critical syntax. If the tool does not have a linting feature, you can always find a community extension or write your own linting rules with a programming language. You should include linting rules that address security standards, such as no secrets committed to version control (chapter 8).
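As a rough illustration of writing your own linting rule, the following sketch checks a server configuration for the ip_address field. The set of required and allowed fields is a hypothetical schema that I made up for the example.

```python
REQUIRED_FIELDS = {"name", "ip_address"}  # hypothetical server schema
ALLOWED_FIELDS = REQUIRED_FIELDS | {"region", "cpu"}


def lint_server(config):
    """Return a list of human-readable problems found in a server config."""
    fields = set(config)
    problems = []
    for field in sorted(REQUIRED_FIELDS - fields):
        problems.append(f"missing required field '{field}'")
    for field in sorted(fields - ALLOWED_FIELDS):
        problems.append(f"unknown field '{field}'")
    return problems


problems = lint_server({"name": "hello-world", "ip": "10.0.0.4"})
```

The rule reports both the missing ip_address field and the unknown ip field, which points you to the misnamed field before you try to create the server.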

Besides linting, you can use formatting to check for spacing and configuration formats. Formatting might seem obvious as a software development practice, but it becomes more critical in IaC.

Definition Formatting automatically aligns your code for correct spacing and configuration formats.

Most tools use domain-specific languages (DSLs) that offer a higher level of abstraction than a general-purpose programming language. A DSL provides a lower barrier to entry if you don’t know a programming language. Many of these languages use YAML or JSON data formats with particular format requirements. Having tools to check formatting, such as whether you missed a space in your YAML file, is helpful!
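A formatting check can be as simple as re-serializing a file and comparing the result. This sketch uses JSON because Python's standard library parses it; the two-space indentation and sorted keys are an assumed house style, not a requirement of any tool.

```python
import json


def format_json(text):
    """Re-serialize JSON with two-space indentation and sorted keys."""
    return json.dumps(json.loads(text), indent=2, sort_keys=True) + "\n"


def is_formatted(text):
    """Return True when the text already matches the canonical format."""
    return text == format_json(text)


unformatted = '{"name":"hello-world",   "cpu":1}'
formatted = format_json(unformatted)
```

Running is_formatted on each configuration file tells you which files need formatting, and format_json produces the corrected version.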

You can also add version control hooks to run formatting checks before committing your code. For example, you might create your infrastructure resources with CloudFormation in YAML data format. To validate the infrastructure resource fields and values, you use the AWS CloudFormation Linter (http://mng.bz/YGrj). You also format the YAML file with the AWS CloudFormation Template Formatter (http://mng.bz/GEVA).

Rather than remember to type these commands each time, you can add the commands as a pre-commit Git hook. Each time you run git commit, the hook checks for proper configuration and format before Git records the commit, so unchecked changes never reach the repository. You can also add them to a continuous delivery workflow, which I cover in chapter 7.

2.3.3 Naming resources

When your IaC becomes documentation, your resources, configuration, and variables need descriptive names. I once created a firewall rule to test something and called it firewall-rule-1. Two weeks later, when I wanted to reproduce it in production, I did not remember why I created the rule in development.

In retrospect, I should have named the firewall rule something more descriptive. I spent another 30 minutes tracking down the rule’s IP addresses and allowances. Naming can affect the time you spend deciphering what the infrastructure does and how it differs in another environment.

Resource names should include the environment, the infrastructure resource type, and its purpose. Figure 2.9 names the firewall rule dev-firewall-rule-allow-hello-world-to-database, which includes the environment (dev), the resource type (firewall-rule), and the purpose (allow-hello-world-to-database).

Figure 2.9 The resource name should include the environment, type, and purpose.

Why should names involve so much detail? You want to identify the resource quickly for troubleshooting, sharing, and auditing. Noticing the environment at a glance ensures that you configure the right one (and not production by accident). The purpose tells others and reminds yourself what the resource does.

Optionally, you can include the resource type. I usually omit resource type from the name because I identify it from resource metadata. Omitting the resource type allows you to conform to your cloud provider’s character limit. If you want to include more information about the purpose or type of your resource, you can always include it in the resource’s tags (chapter 8).
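You can enforce a naming convention by building names in code instead of typing them by hand. The following is a minimal sketch. The function name is hypothetical, and the default limit of 63 characters is an assumption that you should replace with the limit your provider documents for the resource.

```python
def resource_name(environment, purpose, resource_type=None, max_length=63):
    """Build a name in the form <environment>-[<resource type>-]<purpose>."""
    parts = [environment, resource_type, purpose]
    # Skip the resource type when you choose to omit it.
    name = '-'.join(part for part in parts if part)
    if len(name) > max_length:
        raise ValueError(f'{name} exceeds {max_length} characters')
    return name


if __name__ == "__main__":
    print(resource_name('dev', 'allow-hello-world-to-database',
                        resource_type='firewall-rule'))
    print(resource_name('dev', 'allow-hello-world-to-database'))
```

Raising an error for a long name, rather than truncating it, keeps the purpose readable and forces you to decide what to shorten.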

Describe the resource to someone else

When I name a resource, I try to describe it to someone else. If another person understands the resource based on the name, I know it is good. However, I know it needs more information if someone needs to ask additional questions about the environment or resource type.

This exercise can make the names a little long, but I err on the side of being more descriptive. Recognizing the resource’s purpose based on its name saves valuable time reconstructing the environment.

Besides resource names, you also want to make variables and configurations as descriptive as possible. Most infrastructure providers have specific resource attribute naming. AWS refers to a network’s IP address range as the CidrBlock, while Azure refers to it as address_space.

I lean toward using the provider’s specific naming because it makes the provider’s documentation easier to look up for later changes and reproduction. If I name the Azure configuration cidr_block, I have to remember to translate the parameter to address_space for Azure to consume it. Any time you choose a more generic field name for a variable or configuration, you take on the work of translating it for each provider or environment.

2.3.4 Variables and constants

Besides naming variables, how do you know which values should be variables? Let’s say the hello-world application always serves on port 8080. You don’t plan on changing the port often, so you set it to application_port = 8080 at the beginning of your configuration. However, you hardcode hello-world directly into the name attributes of your infrastructure resources.

One year later, you reproduce the environment for a new version of hello-world on port 3000. You want the new value of name as hello-world-v2. You update application_port at the beginning of your configuration to 3000. Putting the port in a variable allows you to reference the application_port throughout your configuration and store the value in one place. You congratulate yourself on not needing to find and replace instances of 8080 in your configuration. However, you spend an hour searching for all instances of hello-world in your infrastructure configuration to change its name.

In this example, you have two types of inputs: variables and constants. Most infrastructure values are best stored in variables and referenced by the configuration.

Definition A variable stores a value referenced by infrastructure configuration. You expect to change the value of a variable anytime you create a new resource or environment.

You should set the application’s name, hello-world, as a variable because it will change depending on the environment, version, or purpose. However, the port does not change based on environment or purpose. A constant variable sets a common value across a set of resources and rarely changes with environment or purpose.

Definition A constant variable establishes a common value across infrastructure configuration. You do not change constants often.

When deciding when to make a configuration value a variable or constant, consider the impact and security implications of changing the value. The frequency of change matters less. If changing the value affects infrastructure dependencies or compromises sensitive information, set it as a variable. You should always set names or environments as variables.

Unlike software development, which pushes for fewer constants, IaC prioritizes constants over variables. Avoid setting too many variables because they make the configuration challenging to maintain. Instead, you can set a constant by defining a local variable with a static configuration.

For example, Terraform uses local values (www.terraform.io/docs/language/values/locals.html) to store constants. Commonly defined constants include operating systems, tags, account identifiers, or domain names. Standardized values on infrastructure providers such as internal or external to describe a type of network can also be constant.
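The same distinction works when you generate configuration with Python. The following is a minimal sketch in which module-level constants hold the common values and function parameters hold the variables. The dictionary is a generic illustration and does not follow a specific provider’s schema. The constant and function names are my own.

```python
# Constants: common values that rarely change with environment or purpose.
APPLICATION_PORT = 8080
NETWORK_TYPE = 'internal'


def hello_world_config(name, environment):
    """Build a configuration; name and environment change with each use."""
    return {
        'name': f'{environment}-{name}',
        'environment': environment,
        'port': APPLICATION_PORT,
        'network_type': NETWORK_TYPE,
    }


if __name__ == "__main__":
    print(hello_world_config('hello-world', 'dev'))
    print(hello_world_config('hello-world-v2', 'prod'))
```

Changing the application name now means passing a different argument, while the port stays defined in one place.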

2.3.5 Parametrize dependencies

When you create a server, you need to specify the network it needs to use. You initially express this by hardcoding the name of the network you want, specifically development. When you read the configuration, you know precisely which network the server uses.

However, you realize that when you have to reproduce this for production, you need to search and replace every reference to development with production. Problematically, you have multiple references to development! Your search-and-replace mission takes a few tedious hours.

Code example

You decide to parametrize the GCP network as a variable so you can reproduce a server in another environment with a different network. When you pass the network name as a variable, you change the network for any server referencing it. Let’s pass the name of the network as a variable in code as follows.

Listing 2.1 Parametrize the network as a variable

import json


def hello_server(name, network):
    # Passes the name and network as parameters to the configuration
    return {
        'resource': [
            {
                # Uses Terraform's google_compute_instance resource
                # to configure a server
                'google_compute_instance': [
                    {
                        name: [
                            {
                                'allow_stopping_for_update': True,
                                'zone': 'us-central1-a',
                                'boot_disk': [
                                    {
                                        'initialize_params': [
                                            {
                                                'image': 'ubuntu-1804-lts'
                                            }
                                        ]
                                    }
                                ],
                                'machine_type': 'f1-micro',
                                'name': name,
                                'network_interface': [
                                    {
                                        # Sets the network by using the
                                        # "network" variable
                                        'network': network
                                    }
                                ],
                                'labels': {
                                    'name': name,
                                    'purpose': 'manning-infrastructure-as-code'
                                }
                            }
                        ]
                    }
                ]
            }
        ]
    }


if __name__ == "__main__":
    # Sets the network dependency as the default network when you run the script
    config = hello_server(name='hello-world', network='default')

    # Creates a JSON file with the server object for Terraform to apply
    with open('main.tf.json', 'w') as outfile:
        json.dump(config, outfile, sort_keys=True, indent=4)

AWS and Azure equivalents

In AWS, you would use the aws_instance Terraform resource with a reference to the network you want to use (http://mng.bz/z4j6). You can create this resource on the default virtual private cloud (VPC).

In Azure, you would need to create a virtual network and subnets, then create the azurerm_linux_virtual_machine Terraform resource (http://mng.bz/064E) on the network.

Why pass the name and network as variables? You often change the name and network depending on the environment. Parametrizing these values helps with reproducibility and composability. You can create new resources on different networks and build multiple resources without worrying about conflicts.
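To see how parameters help you compose resources, consider the following minimal sketch. It uses a compact variant of the server function that returns only a few server attributes, then builds one uniquely named server for each environment. The servers function and the production network name are hypothetical and exist only for illustration.

```python
import json


def hello_server(name, network):
    """Return a reduced set of server attributes for the given name and network."""
    return {
        'name': name,
        'machine_type': 'f1-micro',
        'zone': 'us-central1-a',
        'boot_disk': [{'initialize_params': [{'image': 'ubuntu-1804-lts'}]}],
        'network_interface': [{'network': network}],
    }


def servers(environments):
    """Build one server for each environment-to-network pair."""
    instances = {}
    for environment, network in environments.items():
        # Prefix the environment so that names never conflict.
        name = f'{environment}-hello-world'
        instances[name] = [hello_server(name, network)]
    return {'resource': [{'google_compute_instance': [instances]}]}


if __name__ == "__main__":
    # 'production' is an assumed network name, not one that GCP creates for you.
    config = servers({'dev': 'default', 'prod': 'production'})
    print(json.dumps(config, sort_keys=True, indent=4))
```

Because the name and network arrive as arguments, adding an environment means adding one entry to the dictionary rather than copying the server definition.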

Running the example

I’ll run the example step-by-step to celebrate our first hello-world server. Refer to chapter 1 for more information on the tools required for the examples and appendix A for detailed usage instructions. Here are the steps:

  1. Run the script in Python by entering the command in the terminal:

    $ python main.py

    The command creates a file with the extension *.tf.json. Terraform will automatically search for this file extension to create the resources.

  2. Check whether the file exists by listing files in the terminal:

    $ ls *.tf.json

    The output should be as follows:

    main.tf.json

  3. Authenticate to GCP in the terminal:

    $ gcloud auth login

  4. Set the GCP project you want to use as the CLOUDSDK_CORE_PROJECT environment variable:

    $ export CLOUDSDK_CORE_PROJECT=<your GCP project>

  5. Initialize Terraform to retrieve the GCP plugin in the terminal:

    $ terraform init

    The output should include the following:

    Initializing the backend...
     
    Initializing provider plugins...
    - Finding latest version of hashicorp/google...
    - Installing hashicorp/google v3.58.0...
    - Installed hashicorp/google v3.58.0 (signed by HashiCorp)
     
    Terraform has created a lock file .terraform.lock.hcl to 
    record the provider selections it made above.
    Include this file in your version control repository
    so that Terraform can guarantee to make the same
    selections by default when
    you run "terraform init" in the future.
     
    Terraform has been successfully initialized!
     
    You may now begin working with Terraform. Try running 
    "terraform plan" to see any changes that are
    required for your infrastructure. All Terraform commands
    should now work.
     
    If you ever set or change modules or backend configuration
    for Terraform, rerun this command to reinitialize
    your working directory. If you forget, other
    commands will detect it and remind you to do so if necessary.

  6. Apply the Terraform configuration in the terminal. Ensure that you enter yes to apply the changes and create the instance:

    $ terraform apply

    Your output should include the configuration and name of the server instance:

    Do you want to perform these actions?
      Terraform will perform the actions described above.
      Only 'yes' will be accepted to approve.
     
      Enter a value: yes
     
    google_compute_instance.hello-world: Creating...
    google_compute_instance.hello-world: Still creating... [10s elapsed]
    google_compute_instance.hello-world: Still creating... [20s elapsed]
    google_compute_instance.hello-world: Creation complete after 24s 
    [id=projects/infrastructure-as-code-book/zones
    /us-central1-a/instances/hello-world]
     
    Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

    Note I do not go through all of the nuances of Terraform in this book. For detailed information on getting started with Terraform, check out the HashiCorp “Get Started” tutorial at https://learn.hashicorp.com/terraform. You can find additional documentation on how Terraform works with GCP at http://mng.bz/Kx2g.

  7. You can examine the GCP console for the server’s network and metadata. Otherwise, you can use the Cloud SDK command-line interface (CLI) to check the network in the terminal. Enter the command to filter for the hello-world server:

    $ gcloud compute instances list --filter="name=( 'hello-world' )" \
      --format="table(name,networkInterfaces.network)"

    The output should include the GCP URL of the network:

    NAME         NETWORK
    hello-world  ['https://www.googleapis.com/compute/v1/projects/
    <your GCP project>/global/networks/default']

The GCP server uses the default network, which you passed as a variable to the example. If you want to change the network, you update the value of the network variable. Your IaC tooling will pick up the change and create a new server.

To destroy the server, you can use Terraform in the terminal. Make sure you enter yes to remove the server altogether:

$ terraform destroy

When you define dependencies as variables, you loosely couple the two infrastructure resources. Chapter 4 covers specific patterns you can use to decouple infrastructure resources and dependencies further. If possible, you should avoid hardcoding dependencies and pass them in as parameters.

2.3.6 Keeping it a secret

IaC often needs to use secrets such as tokens, passwords, or keys to execute changes to a provider.

Definition A secret is a piece of sensitive information such as a password, token, or key.

When you create servers in GCP, you need a service account key or token that accesses the project and server resources. To ensure that you can create resources, you maintain the secrets as part of the infrastructure configuration. However, secrets in configuration can be problematic. If someone can read your secret, they can use it to access your GCP account, create resources, and access restricted data!

You might also need to pass secrets as part of the configuration. For example, you use IaC to set the SSL certificate for a load balancer. The SSL certificate expires in two years. You re-create the environment two years later. However, you discover that the encrypted string of the certificate has expired. You cannot decrypt it and must now issue a new certificate.

Figure 2.10 shows how to secure your certificate and improve its evolvability. You pass the certificate as an input variable to have different certificates for each environment. Then you put the new certificates in a secrets manager, which stores and manages the certificate for you.

Figure 2.10 Retrieve sensitive information from a secrets manager to change resources with an infrastructure provider.

Anytime the certificate changes, you update it in the secrets manager. Your IaC updates its configuration when it reads the certificate from the secrets manager. Separating concerns for certificate management from configuration mitigates any problems you have later with certificate expiration.

Why store secrets outside of IaC? You just applied the principles of composability and evolvability to separating secrets from other infrastructure resources. This separation ensures that someone can’t examine your IaC to get a password or username. You also minimize the impact of failure when you rotate a secret by rerunning the IaC.

Always pass secrets as variables into IaC and use them only in memory. These include Secure Shell Protocol (SSH) keys, certificates, private keys, API tokens, passwords, and other login information. A separate entity, such as a secrets manager, should store and manage sensitive authentication data. Separate secrets management facilitates reproduction, especially when you want different passwords and tokens for each environment. You should never hardcode or commit secrets to version control in plaintext.
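As an illustration, the following minimal sketch generates Terraform JSON that refers to a certificate through input variables marked as sensitive, so the generated file never contains the certificate itself. It assumes a Terraform version that supports sensitive variables and uses the google_compute_ssl_certificate resource. The function name and the variable names certificate and private_key are my own choices.

```python
import json


def certificate(name):
    """Reference the certificate through input variables instead of hardcoding it."""
    return {
        'variable': {
            # Terraform redacts sensitive variables in its plan and apply output.
            'certificate': {'type': 'string', 'sensitive': True},
            'private_key': {'type': 'string', 'sensitive': True},
        },
        'resource': {
            'google_compute_ssl_certificate': {
                name: {
                    'name': name,
                    'certificate': '${var.certificate}',
                    'private_key': '${var.private_key}',
                }
            }
        },
    }


if __name__ == "__main__":
    # Prints the configuration; it holds references, not secret values.
    print(json.dumps(certificate('dev-hello-world-certificate'),
                     sort_keys=True, indent=4))
```

At run time, you could retrieve the values from your secrets manager and supply them through the TF_VAR_certificate and TF_VAR_private_key environment variables, which Terraform reads as input variables. Note that Terraform still records resource attributes in its state, so you need to protect the state as well.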

Summary

  • Prioritizing immutability reduces configuration drift, maintains a source of truth, and improves reproducibility.

  • To conform to immutability, changes to a resource create an entirely new resource and replace its state.

  • If you make mutable changes, you must reconcile the localized changes in the infrastructure state with your configuration.

  • When writing IaC, use commits in version control to communicate changes and context and format the code for readability.

  • Parametrize names, environments, and dependencies to other infrastructure. If you scope the configuration attributes to a resource, you can set it as a constant.

  • Secrets should always be passed as variables and never hardcoded or committed to version control in plaintext.

  • When writing scripts, always simplify actions to create, read, update, and delete commands to reproduce resources.
