The examples in this book use Python to create a JSON configuration file for you to run with HashiCorp Terraform. This appendix provides guidance on running the examples. Why bother with this manual two-step process of generating the JSON file using Python and then creating the resources using Terraform?
First, I want to ensure that anyone who wants to run the examples but cannot use Google Cloud Platform (GCP) has an opportunity to examine them further. This allows for “local” development and testing and optionally creating live infrastructure resources.
Second, JSON files have a lot of content! They turned out quite verbose in code listings. A Python wrapper allows me to provide examples of the patterns without wading through lines of JSON configuration. Adding Python code around Terraform JSON syntax offers some future-proofing in case I need to rewrite the examples in another tool.
Note Reference links, libraries, and tool syntax change. Review https://github.com/joatmon08/manning-book for the most up-to-date code.
Figure A.1 reiterates the workflow you need to run the examples. If you run python main.py, you will get a JSON file with the file extension .tf.json. Run terraform init in your CLI to initialize the tool state, and terraform apply to provision resources.
I will briefly discuss how to set up an account with various cloud providers. Then, I’ll introduce Python and the libraries I referenced throughout the examples, such as infrastructure API access and testing. Finally, I will provide a short explanation of how to use Terraform with GCP.
The examples in this book use Google Cloud Platform (GCP) as the cloud provider. If you prefer another cloud provider, many examples have sidebars on an equivalent implementation to achieve a similar architecture. Table A.1 maps the approximations for each resource type across GCP, Amazon Web Services (AWS), and Microsoft Azure.
In this section, I’ll outline some initial setup you’ll need to do for whichever cloud provider you choose.
When you start using GCP, create a new project (http://mng.bz/mOV2) and run all of the examples in that project. This allows you to delete the project and its resources when you finish the book.
Next, install the gcloud CLI (https://cloud.google.com/sdk/docs/install). The CLI will help you authenticate so Terraform can access the GCP API:
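One common way to do this (assuming a recent gcloud version) is to use application-default credentials:

```shell
# Stores application-default credentials on your machine;
# the Terraform Google provider picks these up automatically.
gcloud auth application-default login
```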
This sets up credentials on your machine so Terraform can authenticate to GCP (http://mng.bz/5Qw1).
When you start using AWS, create a new account (http://mng.bz/6XDD) and run all of the examples in that account. This allows you to delete the account and its resources when you finish the book.
Next, create a set of access keys in the AWS console (http://mng.bz/o21r). You will need to save these keys so Terraform can access the AWS API.
Copy the access key ID and secret access key and save them to environment variables:
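For example (the values shown are placeholders; substitute the keys you created in the AWS console):

```shell
# Placeholder values - substitute your own access key ID
# and secret access key from the AWS console.
export AWS_ACCESS_KEY_ID="AKIAEXAMPLEKEYID"
export AWS_SECRET_ACCESS_KEY="example-secret-access-key"
```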
Then, set the AWS region you want to use:
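For example (us-east-1 is only a placeholder; choose any region you prefer):

```shell
# Placeholder region - substitute the AWS region you want to use.
export AWS_DEFAULT_REGION="us-east-1"
```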
This sets up credentials on your machine so Terraform can authenticate to AWS (http://mng.bz/nNWg).
When you start using Azure, create a new account (http://mng.bz/v6nJ). Creating a new account gives you a subscription by default. This allows you to create resources within the subscription and group them by resource groups. After you finish the book, you can delete the resource group.
Next, install the Azure CLI (http://mng.bz/44Da). The CLI will help you authenticate so Terraform can access the Azure API.
List the subscriptions so you can get the ID for the default subscription:
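With the Azure CLI installed, one way to do this is to log in and list subscriptions in a table:

```shell
# Log in interactively, then list subscriptions with their IDs.
az login
az account list --output table
```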
Copy the subscription ID and save it to an environment variable:
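For example (the subscription ID shown is a placeholder; ARM_SUBSCRIPTION_ID is the environment variable the Terraform Azure provider reads):

```shell
# Placeholder value - substitute your default subscription's ID.
export ARM_SUBSCRIPTION_ID="00000000-0000-0000-0000-000000000000"
```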
This sets up credentials on your machine so Terraform can authenticate to Azure (http://mng.bz/QvPw). You should create an Azure resource group (http://mng.bz/XZNG) for each example. Delete the resource group to remove all infrastructure resources for the example.
Before you start running the examples, you must download Python. I used Python 3 in the code listings. You can install Python in a few ways, such as using your package manager of choice or the Python downloads page (www.python.org/downloads/). However, I prefer to use pyenv (https://github.com/pyenv/pyenv) to download and manage my Python versions. pyenv allows you to choose the Python version you need and install it into a virtual environment using Python’s venv library (https://docs.python.org/3/library/venv.html).
I use a virtual environment because I have many projects that require different Python versions. Installing different versions of each project in the same environment gets confusing and often breaks code. As a result, I want to separate each project into a development environment with its dependencies and Python version.
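As a sketch, assuming pyenv is installed, the setup might look like this (the version number is only an example):

```shell
# With pyenv (optional): install and select a Python 3 version.
# pyenv install 3.9.7
# pyenv local 3.9.7

# Create and activate a virtual environment with the built-in venv library.
python3 -m venv .venv
. .venv/bin/activate    # "source .venv/bin/activate" also works in bash
python --version
```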
After you install Python 3 into your development or virtual environment, you need to install some external libraries. In listing A.1, I capture libraries and dependencies in requirements.txt, a plaintext file with a list of packages and versions.
apache-libcloud==3.3.1             ❶
google-api-python-client==2.17.0   ❷
google-cloud-billing==1.3.3        ❷
netaddr==0.8.0                     ❸
pytest==6.2.4                      ❹
❶ Installs Apache Libcloud library
❷ Installs client libraries for GCP, including the Python client and Cloud Billing client
❸ Installs netaddr, a Python library for parsing network information
❹ Installs pytest, a Python testing framework
The example repository contains a requirements.txt file that freezes the library versions you need to install. In your Python development environment, use your CLI to install the libraries with pip, Python’s package installer:
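From the directory containing requirements.txt, a typical invocation looks like this:

```shell
# Install the pinned library versions from requirements.txt.
pip install -r requirements.txt
```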
Some of the examples require more complex automation or testing. They reference libraries that you will need to import separately. Let’s examine the libraries to download in more detail.
Apache Libcloud (https://libcloud.apache.org/) offers a Python interface to create, update, read, and delete cloud resources. It includes a single interface agnostic of the cloud service or provider. I reference this library in the early sections of the book for examples on integration and end-to-end testing. To use Apache Libcloud in the following listing, you can import the libcloud package and set up a driver to connect to GCP.
from libcloud.compute.types import Provider       ❶
from libcloud.compute.providers import get_driver ❷

ComputeEngine = get_driver(Provider.GCE)          ❸
driver = ComputeEngine(                           ❹
    credentials.GOOGLE_SERVICE_ACCOUNT,           ❹
    credentials.GOOGLE_SERVICE_ACCOUNT_FILE,      ❹
    project=credentials.GOOGLE_PROJECT,           ❹
    datacenter=credentials.GOOGLE_REGION)         ❹
❶ Imports the object to set the cloud provider, such as GCP
❷ Imports the function to initialize the driver for a cloud provider
❸ Sets up the driver that will connect to Google Cloud
❹ Passes credentials to connect to the Google Cloud API to initialize the driver
I use Apache Libcloud in the tests instead of Google Cloud’s client libraries because it provides a unified API to access any cloud. If I want to switch my examples to AWS or Azure, I need to change only the driver for the cloud provider. The tests only read information from the cloud provider and do not run any complex operations with Apache Libcloud.
Python Clients for Google Cloud
The later part of the book includes more complicated IaC and tests that I could not implement with Apache Libcloud, which does not support retrieving certain information about Google Cloud resources, such as pricing. The following listing shows how I used client libraries specific to Google Cloud in these use cases.
❶ Imports the Google Cloud Client Library for Python
❷ Imports the Python Client for Google Cloud Billing API
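As a sketch, the imports that these annotations describe look roughly like this (module paths may vary across library versions):

```python
# Sketch of the two imports the listing annotations describe.
# googleapiclient ships with google-api-python-client;
# billing_v1 ships with google-cloud-billing.
from googleapiclient import discovery    # Google Cloud Client Library for Python
from google.cloud import billing_v1      # Python Client for Google Cloud Billing API
```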
The examples use two libraries maintained by Google Cloud. The Google Cloud Client Library for Python (http://mng.bz/aJ1z) allows you to access many of the APIs on Google Cloud and create, read, update, and delete resources. However, it does not include access to Google’s Cloud Billing API.
As a result, for chapter 12 on cost, I had to import a different library maintained by Google Cloud to retrieve billing catalog information. The Python Client for Google Cloud Billing API (http://mng.bz/gwBl) allows me to read information from the Google Cloud service catalog.
When you have IaC that needs to reference specific resources or APIs not available in a unified API, like Apache Libcloud, you often need to find a separate library to retrieve the information you need. While we would like to minimize dependencies, we must recognize that not every library achieves every use case! Choose a different library if you feel that your existing one cannot accomplish the automation you need.
In chapter 5, I needed to modify an IP address block. While I entertained the possibility of mathematically calculating the correct address, I decided to use a library instead. Python does have a built-in ipaddress library, but it does not include the functionality I needed. I installed netaddr (https://netaddr.readthedocs.io/en/latest/) instead to reduce the additional code I needed to calculate the IP addresses.
Many of the tests in this book use pytest, a Python testing framework. You can use Python’s unittest module to write and run tests as well. I prefer pytest since it offers a minimal interface to write and run tests without more complicated testing features. Rather than explain pytest in depth, I will outline some of the features I use in the tests and how to run them.
Pytest searches for Python files prefixed with test_. This filename signals that the file contains Python tests. Each test function also uses the prefix test_. Pytest selects and runs the tests based on the prefix.
Many of the tests in this book include test fixtures. A test fixture captures a known object, such as a name or constant, that you can use for comparison across multiple tests. In the following listing, I use fixtures to pass commonly processed objects, like network attributes, among multiple tests.
import pytest ❶

@pytest.fixture ❷
def network(): ❸
    return 'my-network' ❸

def test_configuration_for_network_name(network): ❸
    assert network == 'my-network', 'Network name does not match expected' ❹
❶ Imports the pytest library
❷ Sets a known object, or test fixture
❸ Returns the known network name “my-network” and passes it to your first test
❹ Asserts that the network name matches the expected value and fails the test if it does not. You can also include a descriptive error message.
The most important part of the test involves checking that the expected value matches the actual value, or asserting. Pytest suggests one assert statement for each test. I follow this convention because it helps me write more descriptive, helpful tests. Your tests should describe their intent and what they test as clearly as possible.
To run a set of tests with pytest, you can pass in the directory with the tests. However, make sure your testing directory has absolute paths to any files you read in via pytest! For example, the tests in chapter 4 read external JSON files. As a result, you need to change the working directory to the chapter and section:
You can run all the tests in the directory by passing a dot (.) to pytest in the CLI:
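For example:

```shell
# Discover and run every test_*.py file under the current directory.
pytest .
```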
You can run one file by adding the filename to pytest in the CLI:
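For example (test_network.py is a hypothetical filename; substitute your own test file):

```shell
# Run only the tests in one file.
pytest test_network.py
```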
Many of the tests in this book use similar patterns of fixtures and assert statements. For more information on other pytest features, review its documentation (https://docs.pytest.org). You will run either pytest or python main.py commands in your CLI for the examples.
I separate each infrastructure resource into a Python file. Every directory contains a main.py file, as shown in listing A.5. The file always includes code that writes a Python dictionary to a JSON file. The object needs to use Terraform’s JSON configuration syntax for an infrastructure resource.
import json

if __name__ == "__main__":
    server = ServerFactoryModule(name='hello-world')              ❶
    with open('main.tf.json', 'w') as outfile:                    ❷
        json.dump(server.resources, outfile, sort_keys=True, indent=4)  ❸
❶ Generates a Python dictionary for a GCP server
❷ Creates a JSON file named “main.tf.json,” which contains Terraform-compatible JSON configuration
❸ Writes out the server dictionary to the JSON file
You can run the Python script in your terminal:
When you list the files, you will find a new JSON file named main.tf.json:
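Putting the two steps together with a minimal, hypothetical main.py (a stand-in for the book's actual module, which builds a full resource dictionary):

```shell
# Create a minimal stand-in for the book's main.py.
cat > main.py <<'EOF'
import json

if __name__ == "__main__":
    resources = {'resource': []}  # placeholder for the real resource dictionary
    with open('main.tf.json', 'w') as outfile:
        json.dump(resources, outfile, sort_keys=True, indent=4)
EOF

python3 main.py   # generate the JSON configuration
ls                # the listing now includes main.tf.json
```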
Many examples require you to run Python for main.py and generate a JSON file named main.tf.json unless otherwise noted. However, some examples use other libraries or code for automation or testing.
After you generate a main.tf.json file by using python main.py, you need to create the resources in GCP. You use HashiCorp Terraform to create, read, update, and delete the resources that the .tf.json files define.
You can download and install Terraform with the package manager of your choice (www.terraform.io/downloads.html). You run it with a set of CLI commands, so you will need to download the binary and make sure you can run it in your terminal. Terraform searches for files with the .tf or .tf.json extension within a working directory and creates, reads, updates, and deletes the resources you define in those files.
Terraform offers various interfaces for you to create infrastructure resources. Most of its documentation uses HashiCorp Configuration Language (HCL), a DSL that defines infrastructure resources for each cloud provider. For more on Terraform, review its documentation (www.terraform.io/docs/index.html).
The examples in this book do not use HCL. Instead, they use a JSON configuration syntax (www.terraform.io/docs/language/syntax/json.html) specific to Terraform. This syntax expresses the same constructs as HCL, just formatted in JSON.
Each main.py file in Python writes a dictionary out to the JSON file. Listing A.6 shows how I create a dictionary that defines a Terraform resource in JSON configuration syntax. The JSON resource references the google_compute_instance resource defined by Terraform (http://mng.bz/e71z) and sets all of the required attributes.
terraform_json = {
    'resource': [{                      ❶
        'google_compute_instance': [{   ❷
            'my_server': [{             ❸
                'allow_stopping_for_update': True,
                'boot_disk': [{
                    'initialize_params': [{
                        'image': 'ubuntu-1804-lts'
                    }]
                }],
                'machine_type': 'e2-micro',
                'name': 'my-server',
                'zone': 'us-central1-a',
            }]
        }]
    }]
}
❶ Signals to Terraform that you will define a list of resources
❷ Defines a “google_compute_instance,” a Terraform resource that will create and configure a server in GCP
❸ Defines a unique identifier for the server so Terraform can track it
The Python dictionary becomes Terraform JSON configuration syntax when you write it to a JSON file. Terraform will create the resources defined in its current working directory only with files that have the extension .tf or .tf.json. If you update the code to write the configuration to a JSON file that does not have the .tf.json extension, Terraform will not recognize the resources in the file.
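As a minimal, runnable sketch of this write step (the dictionary below is a trimmed-down version of listing A.6, not the book's full configuration), note the .tf.json extension on the output file:

```python
import json

# Trimmed-down resource dictionary in Terraform's JSON configuration syntax.
terraform_json = {
    'resource': [{
        'google_compute_instance': [{
            'my_server': [{
                'machine_type': 'e2-micro',
                'name': 'my-server',
                'zone': 'us-central1-a',
            }]
        }]
    }]
}

# Write to main.tf.json: Terraform only reads files ending in .tf or .tf.json,
# so a name like main.json would be ignored.
with open('main.tf.json', 'w') as outfile:
    json.dump(terraform_json, outfile, sort_keys=True, indent=4)
```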
After running Python and creating a JSON file, you need to initialize Terraform in the working directory. Figure A.2 outlines the commands you need to run in your terminal to initialize state and apply infrastructure changes.
In your terminal, change to a directory with a *.tf.json file. For example, I change to the directory that contains examples for section 2.3:
Initialize Terraform in your terminal:
$ terraform init

Initializing the backend...

Initializing provider plugins...
- Reusing previous version of hashicorp/google from the dependency lock file
- Using previously-installed hashicorp/google v3.86.0

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan"
to see any changes that are required for your infrastructure.
All Terraform commands should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget,
other commands will detect it and remind you to do so if necessary.
Terraform runs an initialization step that configures its state storage, called a backend, and installs plugins and modules. The initialization creates a series of files that you should not delete from your filesystem. After you initialize Terraform, you will find some hidden and new files when you list the contents of your directory:
$ ls -al
drwxr-xr-x  .terraform
-rw-r--r--  .terraform.lock.hcl
-rw-r--r--  main.py
-rw-r--r--  main.tf.json
-rw-r--r--  terraform.tfstate
-rw-r--r--  terraform.tfstate.backup
Terraform stores its tool state in a state file to quickly reconcile any changes that you make to infrastructure resources. Terraform can reference a state file stored locally or remotely on a server, in an artifact registry, in an object store, or in another backend. The examples store tool state in a local file named terraform.tfstate. If you accidentally delete this file, Terraform will no longer recognize the resources under its management! Ensure that you do not remove the local state file, or update the examples to use a remote backend instead. You may also find a terraform.tfstate.backup file, which Terraform uses to back up its tool state before it makes changes.
Initialization also installs a plugin for Terraform to communicate with Google. Terraform uses a plugin system to extend its engine and interface with cloud providers. The AWS examples use the same terraform init command to download the AWS plugin for you automatically. Plugins and modules get downloaded to the .terraform folder.
Terraform also pins the versions of plugins for you, similar to the requirements.txt file for Python. You’ll find a list of pinned versions for plugins in .terraform.lock.hcl. In the examples repository, I committed the .terraform.lock.hcl to version control so that Terraform installs only the plugins I’ve tested with at the time I generated the examples.
Most Terraform plugins read credentials for infrastructure provider APIs by using environment variables. I usually set the GCP project environment variable, so Terraform connects to the correct GCP project:
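For example (the project ID below is a placeholder; GOOGLE_PROJECT is one of the environment variables the Terraform Google provider reads, alongside GOOGLE_CLOUD_PROJECT):

```shell
# Placeholder project ID - substitute your own GCP project.
export GOOGLE_PROJECT="my-gcp-project"
```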
I also authenticate to GCP by using the gcloud CLI tool. The command automatically sets credentials for Terraform to access GCP:
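The command in question (assuming a recent gcloud version):

```shell
# Writes application-default credentials that the Terraform Google provider
# reads when calling the GCP API.
gcloud auth application-default login
```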
For other cloud providers, I recommend setting environment variables in your terminal to authenticate to your AWS or Azure account. Reference section A.1 for their configuration.
After setting your credentials, you can use Terraform to dry-run and deploy your infrastructure resources. In your terminal, you can run terraform apply to start the deployment of your changes:
$ terraform apply

Terraform used the selected providers to generate the following execution
plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # google_compute_instance.hello-world will be created
  + resource "google_compute_instance" "hello-world" {
  ... OMITTED ...

Plan: 1 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value:
The command will stop and wait for you to enter yes at Enter a value. It waits for you to review the changes and check that you want to add, change, or destroy resources. Always review changes before entering yes!
After you type yes, Terraform will start deploying the resources:
Enter a value: yes

google_compute_instance.hello-world: Creating...
google_compute_instance.hello-world: Still creating... [10s elapsed]
google_compute_instance.hello-world: Creation complete after 15s
[id=projects/infrastructure-as-code-book/zones
/us-central1-a/instances/hello-world]

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
You will find your resources in your GCP project after you use terraform apply.
Many of the examples use overlapping names or network CIDR blocks, so I recommend you clean up the resources between each chapter and section. The terraform destroy command deletes all the resources listed in terraform.tfstate from GCP. In your terminal, make sure you authenticate to GCP or your infrastructure provider.
When you run terraform destroy, it outputs the resources it will destroy. Review the list of resources and make sure you want to delete them!
$ terraform destroy

Terraform used the selected providers to generate the following execution
plan. Resource actions are indicated with the following symbols:
  - destroy

Terraform will perform the following actions:

  # google_compute_instance.hello-world will be destroyed
  ... OMITTED ...

Plan: 0 to add, 0 to change, 1 to destroy.

Do you really want to destroy all resources?
  Terraform will destroy all your managed infrastructure, as shown above.
  There is no undo. Only 'yes' will be accepted to confirm.

  Enter a value:
After you review the resources you expect to delete, enter yes at the command prompt. Terraform will delete the resources from GCP. Deletion will take some time, so expect this to run for a few minutes. Some examples will take even longer to deploy and destroy because they have many resources involved:
Enter a value: yes

google_compute_instance.hello-world: Destroying...
google_compute_instance.hello-world: Still destroying...
[id=projects/infrastructure-as-code-book/zones
/us-central1-a/instances/hello-world, 2m10s elapsed]
google_compute_instance.hello-world: Destruction complete after 2m24s

Destroy complete! Resources: 1 destroyed.
After destroying the resources, you can remove the terraform.tfstate, terraform .tfstate.backup, and .terraform files if you would like. Remember to delete your resources from GCP (or delete the entire project) each time you finish an example so you can reduce your cloud bill!