In Chapter 5 we learned how to manage, share, and use Terraform’s remote state. We also saw some of the challenges involved in using that state. We’re going to combine that state knowledge with our previous learning about Terraform itself to create a multi-environment and multi-service architecture. Importantly, we’re also going to describe a workflow for developing and deploying Terraform infrastructure and changes to that infrastructure.
Terraform users will tell you that working out how to organize and lay out your code is crucial to having a usable Terraform environment. Inside our new architecture, we’re going to create new Terraform configuration in the form of an example data center. We’ll include a file and directory layout, state management, multiple providers, and data sources.
We’re going to create a multi-environment data center to demonstrate what we’ve learned over the last few chapters. It’s going to be hosted in AWS and will look like this:
We have a development
environment and a production
environment, both built in AWS. We have two services, a web service and a backend API service, in each environment.
Our web service stack will consist of a Cloudflare website service, a load balancer, two EC2 instances running a web server, and two EC2 instances as backend application servers for persistent data.
Our API stack will consist of a load balancer with five EC2 instances.
We’re going to create a module for each service that we’re going to build. This will allow us to reuse our Terraform configuration across environments and better manage versions of infrastructure.
Let’s start by creating a directory and file structure to hold our Terraform configuration. We’ll create it under our home directory.
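The commands to do that might look like this (we also initialize a Git repository, as we’ll be committing our configuration shortly):

$ cd ~
$ mkdir -p dc/development dc/production dc/modules
$ cd dc
$ git init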
We’ve created a base directory called dc
and a directory for each environment we want to create—development
, production
—as well as a modules
directory. The modules
directory will contain each of the modules we’ll use in our environment. These directories will contain the module source code. We’ll upload the modules to GitHub and use that as their source so we can more easily reuse them.
Our new directory structure looks like this:
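With only the base directories in place, it’s simply:

dc
├── development
├── production
└── modules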
Each environment will contain environment configuration, and then each service’s module will be declared and configured. We’ll add variables and outputs for the environment, as well as for each service.
Let’s add a .gitignore
file too. We’ll exclude any state files to ensure we don’t commit any potentially sensitive variable values.
$ echo "terraform.tfstate*" >> .gitignore
$ echo ".terraform/" >> .gitignore
$ git add .gitignore
$ git commit -m "Adding .gitignore file"
We’ve also created a Git repository for our data center environment. You can find it on GitHub here.
One of the key aspects of managing infrastructure is the workflow you establish to do so. Our multiple environments provide a framework for testing changes prior to production. The terraform plan
command provides a mechanism for conducting that testing.
We’ve run the terraform plan
command before pretty much every change we’ve made in the book. This is intentional: the output of planning is critical to ensure we deploy the right changes to our environment. When we’re working with a multi-environment architecture we can make use of our framework and the planning process.
We recommend a simple workflow.
We develop our infrastructure in the development
environment. We’ll see some patterns in this chapter for how to build each environment and how to package services using modules to ensure appropriate isolation for each environment and each service we wish to deploy.
We should ensure our code is clean, simple, and well documented. There are some good guidelines to work within to help with this:

- Add description fields to all variables.
- Provide README files or documentation for your modules and their interfaces.
- Running terraform fmt and terraform validate prior to committing, or as a commit hook, is strongly recommended to ensure your code is well formatted and valid.

When you’re ready to test your infrastructure, always run terraform plan
first. Use this plan to ensure the development
version of your infrastructure is accurate and that any changes you’re proposing are viable. Between validation and the planning process you are likely to catch the vast majority of potential errors.
Remember from Chapter 2 that the terraform plan
also allows you to save the generated plan from each run. To do this we run the terraform plan
command with the -out
flag.
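For example (the timestamped filename here is illustrative):

$ terraform plan -out development-$(date +%s).plan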
This saves the generated plan, including any differences, in a file—here, development-epochtime.plan
.
We can now use this file as a blueprint for our change. This ensures that even if the configuration or state has drifted or changed since we ran the plan
command, the changes applied will be the changes we expect.
Even with a plan
things can still go wrong. The real world is always ready to disappoint us. Before you apply your configuration in your production
environment, apply it in your development
environment. Apply it using plan output files in iterative pieces to confirm it is working correctly.
You can use scaled-down versions of infrastructure—for example, in AWS we often use smaller instance types to build infrastructure in development
, in order to save money and capacity. Actually deploying your configuration further validates it and likely reduces the risk of potential errors.
If things have worked in your development
environment, then it’s time to consider promoting it to your production
environment. This doesn’t mean, however, that you shouldn’t make your changes jump through a few more hoops.
You should have your code in version control so it becomes easy to pass changes through an appropriate workflow. Many people use the GitHub Flow to review and promote code. Broadly, this entails:

1. Creating a branch for your change.
2. Committing your changes to that branch.
3. Opening a pull request.
4. Discussing, reviewing, and testing the proposed changes.
5. Merging them into master and deploying them.

As part of Step 4 above, you can also run tests, often automatically, using continuous integration tools like Jenkins or third-party services like TravisCI or CircleCI.
Once we’re satisfied that our changes are okay, any tests have passed, and our colleagues have reviewed and approved our code, then it’s time to merge and deploy. This is another opportunity to again run terraform plan
. You can never be too sure that what you’re deploying is correct.
You might even automate this process upon the merge of your code into master
—for example, by using one of the CI tools we’ve mentioned or a Git hook of some kind. This is an excellent way to make use of plan output files: each commit generates a plan output file that is applied iteratively to your environments.
If this planning reveals that everything is still okay, then it’s time to run the terraform apply
command and push your changes live!
Now that we’ve got an idea of how a workflow might operate, let’s dive into the development
directory and start to define our development
environment. We’re going to start with a variables.tf
file for the environment.
variable "region" {
type = string
description = "The AWS region."
}
variable "prefix" {
type = string
description = "The name of our org, e.g., examplecom."
}
variable "environment" {
type = string
description = "The name of our environment, e.g., development."
}
variable "key_name" {
type = string
description = "The AWS key pair to use for resources."
}
variable "vpc_cidr" {
type = string
description = "The CIDR of the VPC."
}
variable "public_subnets" {
type = list(string)
default = []
description = "The list of public subnets to populate."
}
variable "private_subnets" {
type = list(string)
default = []
description = "The list of private subnets to populate."
}
We’ve defined the basic variables required for our environment: the AWS region, a naming prefix, the environment name, the key pair, and the VPC CIDR block and subnet lists.
Let’s populate a terraform.tfvars
file to hold our initial variable values.
region = "us-east-1"
prefix = "examplecom"
environment = "development"
key_name = "james"
vpc_cidr = "10.0.0.0/16"
public_subnets = ["10.0.1.0/24", "10.0.2.0/24"]
private_subnets = ["10.0.101.0/24", "10.0.102.0/24"]
We’ve added default values for all of our development
environment variables, including the name of the environment itself. This includes the base CIDR block of our VPC, and two lists, each containing public and private subnets.
Now we’re going to define a base configuration for our environment. We’ll put it in a file called main.tf
.
provider "aws" {
region = var.region
}
module "vpc" {
source = "github.com/turnbullpress/tf_vpc.git?ref=v0.0.4"
environment = var.environment
region = var.region
key_name = var.key_name
vpc_cidr = var.vpc_cidr
public_subnets = var.public_subnets
private_subnets = var.private_subnets
}
We’ve specified the aws
provider to provide our AWS resources. We’re configuring it with variables we’ve specified, and we’ve assumed you’re using either AWS environment variables or shared credentials. We could also enable profiles in our shared credentials: allowing different users for different environments or configurations.
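A sketch of that alternative, assuming a development profile exists in our shared credentials file (the profile name is a hypothetical example):

provider "aws" {
  region  = var.region
  profile = "development"
}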
Then we add the vpc
module. We’re going to use a more sophisticated VPC module for our full environment. We’ve specified its source
as the v0.0.4
tag of the tf_vpc
repository on GitHub:
github.com/turnbullpress/tf_vpc.git?ref=v0.0.4
This new VPC module creates a VPC, public and private subnets, NAT gateways, and all the required routing and security rules for them. The VPC build is modeled on the standard AWS scenario of a VPC with public and private subnets.
You can find the updated module in the tf_vpc repository on GitHub. We’re not going to step through it in detail, as most of the code is self-explanatory and builds on the work we saw in Chapter 3.
We will, however, quickly look at the outputs of the new version of the module as we’re going to reuse several of these shortly.
output "vpc_id" {
value = aws_vpc.environment.id
}
output "vpc_cidr" {
value = aws_vpc.environment.cidr_block
}
output "bastion_host_dns" {
value = aws_instance.bastion.public_dns
}
output "bastion_host_ip" {
value = aws_instance.bastion.public_ip
}
output "public_subnet_ids" {
value = aws_subnet.public[*].id
}
output "private_subnet_ids" {
value = aws_subnet.private[*].id
}
output "public_route_table_id" {
value = aws_route_table.public.id
}
output "private_route_table_id" {
value = aws_route_table.private[*].id
}
output "default_security_group_id" {
value = aws_vpc.environment.default_security_group_id
}
We can see our vpc
module outputs the key configuration details of our VPC: ID, CIDR block, subnet and route IDs, as well as the DNS and public IP of a bastion host we can use to connect to hosts in our private subnets or without public IP addresses.
Before we use it we’ll need to initialize our environment and get the vpc
module from GitHub. Inside the ~/dc/development
directory we need to run the terraform init
command and then the terraform get
command to download our modules.
$ cd ~/dc/development
$ terraform init
. . .
$ terraform get
Get: git::https://github.com/turnbullpress/tf_vpc.git?ref=v0.0.4
This has downloaded the v0.0.4
tag of our vpc
module and installed it in the .terraform/modules
directory.
Let’s add some outputs, all based on outputs of the vpc
module, to the end of the main.tf
file in our development
environment. This will expose them for us for later use.
. . .
output "public_subnet_ids" {
value = module.vpc.public_subnet_ids
}
output "private_subnet_ids" {
value = module.vpc.private_subnet_ids
}
output "bastion_host_dns" {
value = module.vpc.bastion_host_dns
}
output "bastion_host_ip" {
value = module.vpc.bastion_host_ip
}
We’ve added four outputs from the vpc
module. For example, we’ve added an output called bastion_host_dns
with a value of module.vpc.bastion_host_dns
to expose the bastion host’s fully qualified domain name from the vpc
module.
We’ve passed all of our current variables into the module. If we now run plan
, we should see our proposed VPC and environment. Let’s try that now.
$ terraform plan
. . .
+ module.vpc.aws_eip.environment.0
allocation_id: "<computed>"
association_id: "<computed>"
domain: "<computed>"
instance: "<computed>"
network_interface: "<computed>"
private_ip: "<computed>"
public_ip: "<computed>"
vpc: "true"
. . .
Plan: 23 to add, 0 to change, 0 to destroy.
We can see that we’ll create 23 resources.
Now let’s apply our resources.
$ terraform apply
module.vpc.aws_vpc.environment: Creating...
cidr_block: "" => "10.0.0.0/16"
default_network_acl_id: "" => "<computed>"
. . .
Apply complete! Resources: 23 added, 0 changed, 0 destroyed.
. . .
Outputs:
bastion_host_dns = ec2-52-90-119-131.compute-1.amazonaws.com
bastion_host_ip = 52.90.119.131
private_subnet_ids = [
subnet-5056af6c,
subnet-f6bb74ad
]
public_subnet_ids = [
subnet-f5bb74ae,
subnet-f4bb74af
]
We can see that our 23 resources have been added and that Terraform has outputted some useful data about our new VPC, including details for the bastion host that will allow us to connect to any instances that we launch in the VPC.
Let’s configure remote state. We enable the remote state like so:
terraform {
backend "s3" {
region = "us-east-1"
bucket = "examplecom-remote-state-development"
key = "terraform.tfstate"
}
}
We’ve specified our remote state in an S3 bucket and can now run the terraform init
command to initialize our remote state backend.
Now let’s add our first service to the development
environment. We’re going to configure the web service first. We’re going to add all of the variables, the web service module, and the outputs in a single file. Let’s call it web.tf
, add it to our ~/dc/development
directory, and populate it. Note this is the first time we’re going to see a second resource provider added to our environment.
variable "cloudflare_email" {}
variable "cloudflare_token" {}
variable "domain" {
default = "turnbullpress.com"
}
variable "web_instance_count" {
default = 2
}
variable "app_instance_count" {
default = 2
}
provider "cloudflare" {
email = var.cloudflare_email
token = var.cloudflare_token
}
module "web" {
source = "github.com/turnbullpress/tf_web"
environment = var.environment
vpc_id = module.vpc.vpc_id
public_subnet_ids = module.vpc.public_subnet_ids
private_subnet_ids = module.vpc.private_subnet_ids
web_instance_count = var.web_instance_count
app_instance_count = var.app_instance_count
domain = var.domain
region = var.region
key_name = var.key_name
}
output "web_elb_address" {
value = module.web.web_elb_address
}
output "web_host_addresses" {
value = module.web.web_host_addresses
}
output "app_host_addresses" {
value = module.web.app_host_addresses
}
We’ve started by adding some new variables. These variables are specifically for our web
service, so we’re adding them in here instead of the variables.tf
file, which handles the base variables for our environment. We’ve added two variables to configure Cloudflare (more on this in a moment), and a third variable to hold the domain name of our web service. We also have two variables for the instance count for our web and app servers.
We’ve also added a second provider—because it’s specific to the web service, we’re adding it here and not in the environment’s base configuration in main.tf
. Our second provider, cloudflare
, manages Cloudflare records and websites. We’ve used the two variables we just created to specify the email address and API token of our Cloudflare account. We’ll need to specify some values for these variables. We’re going to put ours in the terraform.tfvars
file with our other credentials.
. . .
environment = "development"
cloudflare_email = "[email protected]"
cloudflare_token = "abc123"
key_name = "james"
. . .
We’ve next defined a new module, called web
, for our web service. The source
of the module is a GitHub repository:
github.com/turnbullpress/tf_web
We’re retrieving the HEAD
of the Git repository rather than a specific tag or commit.
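If we wanted to pin the module instead, we could specify a ref, just as we did with the vpc module (the v0.0.1 tag here is hypothetical):

source = "github.com/turnbullpress/tf_web.git?ref=v0.0.1"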
We’ve also passed in several variables. Some of them are defined in our variables.tf
file, like the name of our environment and the region in which to launch the service. The key variables, though, are extracted from our vpc
module.
vpc_id = module.vpc.vpc_id
The module.vpc.vpc_id
references the vpc_id
output from our VPC module.
This allows us to daisychain modules. The values created by one module can be used in another. We get another benefit from this variable reference: resource ordering. By specifying the module.vpc.vpc_id
variable, we place the vpc
module before the web
module in our dependency graph. This ensures our modules are created in the right sequence but still sufficiently isolated.
Finally, we specify a few useful outputs that will be displayed when we apply our configuration.
Let’s take a glance inside our web
module and see what it looks like.
We’re going to build the web
module inside our ~/dc/modules
directory.
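The commands might look like:

$ cd ~/dc/modules
$ mkdir -p web/files
$ cd web
$ touch interface.tf main.tf
$ git init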
We’ve created a directory, web
, and a sub-directory, files
, in our modules
directory. We’ve also touch
’ed the base files—interface.tf
and main.tf
—for our module, and initialized the directory as a Git repository.
Let’s start with our module’s variables in interface.tf
. We know we need to define a variable for each incoming variable used in our module
block.
variable "region" {}
variable "ami" {
default = {
"us-east-1" = "ami-f652979b"
"us-west-1" = "ami-7c4b331c"
}
}
variable "instance_type" {
default = "t2.micro"
}
variable "key_name" {
default = "james"
}
variable "environment" {}
variable "vpc_id" {}
variable "public_subnet_ids" {
type = list(string)
}
variable "private_subnet_ids" {
type = list(string)
}
variable "domain" {}
variable "web_instance_count" {}
variable "app_instance_count" {}
You can see we’ve defined the inputs from our module
block plus some additional variables for the Ubuntu 16.04 AMIs to use to create any EC2 instances and the instance type.
We also need to define some outputs for our module to match the outputs we specified in web.tf
. We’ll add these to the end of the interface.tf
file.
output "web_elb_address" {
value = aws_elb.web.dns_name
}
output "web_host_addresses" {
value = aws_instance.web[*].private_ip
}
output "app_host_addresses" {
value = aws_instance.app[*].private_ip
}
Finally, we’ve gotten to the heart of our module: the resources we want to create. These are contained in the main.tf
file. It’s too big to show here in its entirety, but you can find it in the book’s source code. For now, let’s look at a couple of specific resources.
The first resource we’re going to look at will show us a useful pattern for using data sources in modules.
data "aws_vpc" "environment" {
id = var.vpc_id
}
. . .
resource "aws_security_group" "web_host_sg" {
name = "${var.environment}-web_host"
description = "Allow SSH and HTTP to web hosts"
vpc_id = data.aws_vpc.environment.id
ingress {
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = [data.aws_vpc.environment.cidr_block]
}
ingress {
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = [data.aws_vpc.environment.cidr_block]
}
. . .
}
You’ll first note we haven’t specified an AWS provider or credentials. This is because the module will inherit the existing provider
configuration provided by our development
environment.
We instead have the aws_vpc
data source defined as our first resource. The aws_vpc
data source returns data from a specific VPC. In our web
module we’ve used the VPC ID we received from the vpc
module and passed into the web
module.
There’s another approach we could take to querying the data source, where we can search for the VPC using a filter. Filtering allows us to search for a specific resource within a data source.
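That approach might look like the following sketch, assuming the VPC is tagged with the environment name (as our vpc module does):

data "aws_vpc" "environment" {
  filter {
    name   = "tag:Name"
    values = [var.environment]
  }
}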
You can see we’ve specified a filter
attribute with two sub-attributes: name
and values
.
The name
attribute should be the name of a specific field to query. In the case of the aws_vpc
data source, the fields available to query are derived from the AWS DescribeVpcs API; here we query a tag called Name.
If you look at the vpc
module you’ll discover that it creates the VPC and applies a Name
tag using the name of the environment:
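A simplified sketch of that tagging (the actual module sets additional attributes):

resource "aws_vpc" "environment" {
  cidr_block = var.vpc_cidr
  tags = {
    Name = var.environment
  }
}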
The values
attribute contains a list of potential values for the specified tag. The data source will find any VPCs that match this list. Only one VPC can be returned, though, so the combination of your name
and values
attributes must be sufficiently constrained to return the correct resource.
The aws_vpc
data source returns a variety of information. Here’s an example of what the data source returns:
module.web.data.aws_vpc.environment:
id = vpc-c96f7eae
cidr_block = 10.0.0.0/16
default = false
dhcp_options_id = dopt-1ef3017b
instance_tenancy = default
state = available
tags.% = 1
tags.Name = development
. . .
There’s a lot of interesting information here, but most importantly for us, it returns the CIDR block of the VPC. We’ve updated several security groups in our main.tf file to use that CIDR block, referenced as data.aws_vpc.environment.cidr_block, as the value of the cidr_blocks attribute.
The variable is prefixed with data
, the type of our data source aws_vpc
, and the name we’ve assigned it: environment
. The suffix is the name of the value returned from the data source query: cidr_block
.
You can see we’ve used data.aws_vpc.environment.id
as a variable as well. This is the VPC ID returned by our data source. We’ve specified this for consistency to ensure all our data is coming from the same source and specified in the same form, but we could instead use our var.vpc_id
value.
Our next resources are the AWS instances that will run our web servers.
resource "aws_instance" "web" {
ami = var.ami[var.region]
instance_type = var.instance_type
key_name = var.key_name
subnet_id = var.public_subnet_ids[0]
user_data = file("${path.module}/files/web_bootstrap.sh")
vpc_security_group_ids = [
aws_security_group.web_host_sg.id
]
tags = {
Name = "${var.environment}-web-${count.index}"
}
count = var.web_instance_count
}
This is very similar to the EC2 resources we created earlier in the book. There are a couple of interesting features we should note.
The first feature is the selection of the subnet in which to run the instance. We’re taking advantage of a sorting quirk of Terraform to select a specific subnet for our instances. We’re passing in the list of public and private subnet IDs into the web
module. These are generated in the vpc
module and emitted as an output of that module. We’ve assigned them in the web
module to a variable called var.public_subnet_ids
.
The original subnets were created using values from our development
environment’s terraform.tfvars
file:
public_subnets = [ "10.0.1.0/24", "10.0.2.0/24" ]
private_subnets = [ "10.0.101.0/24", "10.0.102.0/24" ]
Each subnet is created in the order specified—here 10.0.1.0/24
and then 10.0.2.0/24
. The resulting subnet IDs are also outputted in the order they are created from the vpc
module.
We select the first element in the var.public_subnet_ids
and know we’re getting the 10.0.1.0/24
subnet and so on.
The second interesting feature is contained in the user_data
attribute for our instance:
user_data = file("${path.module}/files/web_bootstrap.sh")
The user_data
uses the file
function to point to a file inside our module. To make sure we find the right path, we’ve used the path
variable to help us locate this file. We briefly saw the path
variable in Chapter 3.
The path
variable can be suffixed with a variety of methods to select specific paths. For example:

- path.cwd: the current working directory.
- path.root: the root directory of the root module.
- path.module: the root directory of the current module.

Here we’re using path.module
, which is the root directory of the web
module. Hence our web_bootstrap.sh
script is located in the files
directory inside the root of the web
module.
We’ve also specified a resource to manage our Cloudflare record. Let’s take a look at that now.
resource "cloudflare_record" "web" {
domain = var.domain
name = "${var.environment}.${var.domain}"
value = aws_elb.web.dns_name
type = "CNAME"
ttl = 3600
}
The cloudflare_record
resource creates a DNS record—in our case a CNAME
record. We’ve created our CNAME
record by joining our environment name, development
, with the domain name, turnbullpress.com
, we specified for the var.domain
variable in our ~/dc/development/web.tf
file. Our CNAME
record points to the DNS name of our AWS load balancer provided by the variable aws_elb.web.dns_name
. Specifying this variable guarantees our Cloudflare record will be created after our load balancer in the dependency graph.
Finally, we commit and add our module to GitHub.
We would then create a GitHub repository to store our module and push it.
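Those steps might look something like this (the remote URL assumes the tf_web repository name we use as the module source):

$ cd ~/dc/modules/web
$ git add .
$ git commit -m "First commit of web module"
$ git remote add origin [email protected]:turnbullpress/tf_web.git
$ git push -u origin master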
Now that we’ve added our module to the development
environment, we need to get it.
$ cd ~/dc/development
$ terraform get
Get: git::https://github.com/turnbullpress/tf_vpc.git?ref=v0.0.4
Get: git::https://github.com/turnbullpress/tf_web.git
We’ve now downloaded the web
module to join our vpc
module.
If we again run terraform plan
, we’ll see some new resources to be added.
$ terraform plan
Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but
will not be persisted to local or remote state storage.
module.vpc.aws_vpc.environment: Refreshing state... (ID: vpc-782c281f)
module.vpc.aws_eip.environment[0]: Refreshing state... (ID: eipalloc-52336f6d)
module.vpc.aws_eip.environment[1]: Refreshing state... (ID: eipalloc-bf366a80)
. . .
Plan: 9 to add, 0 to change, 0 to destroy.
Terraform has refreshed our state to check what’s new and has identified nine resources, the contents of our web
module, that will be added to the development
environment.
When we run terraform apply
we’ll get our first service installed in the development environment.
$ terraform apply
module.vpc.aws_vpc.environment: Refreshing state... (ID: vpc-782c281f)
. . .
Apply complete! Resources: 9 added, 0 changed, 0 destroyed.
. . .
Outputs:
app_host_addresses = [
10.0.101.238,
10.0.101.174
]
bastion_host_dns = ec2-52-90-119-131.compute-1.amazonaws.com
bastion_host_ip = 52.90.119.131
private_subnet_ids = [
subnet-5056af6c,
subnet-f6bb74ad
]
public_subnet_ids = [
subnet-f5bb74ae,
subnet-f4bb74af
]
web_elb_address = development-web-elb-1020554483.us-east-1.elb.amazonaws.com
web_host_addresses = [
10.0.1.56,
10.0.1.233
]
Terraform has created our new resources and outputted their information, in addition to the previous vpc
outputs. We can also see our new Cloudflare domain record has been created.
module.web.cloudflare_record.web: Creating...
domain: "" => "turnbullpress.com"
hostname: "" => "<computed>"
name: "" => "development.turnbullpress.com"
proxied: "" => "false"
ttl: "" => "3600"
type: "" => "CNAME"
value: "" => "development-web-elb-1020554483.us-east-1.elb.amazonaws.com"
zone_id: "" => "<computed>"
module.web.cloudflare_record.web: Creation complete
This shows us that our second provider is working too.
Let’s try to use a couple of those resources now. We’ll browse to our load balancer first.
Let’s also sign in to our bastion host, 52.90.119.131
, and then bounce into one of our web server hosts.
Note that this assumes the james key pair that we used to create our instances is available locally on our Terraform host, loaded into an SSH agent.
$ ssh -A [email protected]
The authenticity of host '52.90.119.131 (52.90.119.131)' can't be established.
ECDSA key fingerprint is SHA256:g/Jfap5CgjZVEQoxDsAVDMILToEHfY/mQ13mzLjXJe8.
Are you sure you want to continue connecting (yes/no)? yes
. . .
ubuntu@ip-10-0-1-224:~$
We’ve connected to the bastion host, specifying the -A
flag on the SSH command to enable agent forwarding so we can use the james
key pair on subsequent hosts.
From the bastion host we should be able to ping
one of our web hosts and SSH into it by using its private IP address in the 10.0.1.0/24
subnet.
ubuntu@ip-10-0-1-224:~$ ping 10.0.1.56
PING 10.0.1.56 (10.0.1.56) 56(84) bytes of data.
64 bytes from 10.0.1.56: icmp_seq=1 ttl=64 time=1.17 ms
64 bytes from 10.0.1.56: icmp_seq=2 ttl=64 time=0.613 ms
64 bytes from 10.0.1.56: icmp_seq=3 ttl=64 time=0.598 ms
^C
ubuntu@ip-10-0-1-224:~$ ssh [email protected]
. . .
ubuntu@ip-10-0-1-56:~$
Excellent! It all works and we’re connected. We now have a running service inside a custom built VPC that’s secured and can be managed via a bastion host. What’s even more awesome is that this is totally repeatable in other environments.
If we want to remove the web service, all we need to do is remove the web.tf
file from the ~/dc/development
directory and run terraform apply
to remove the web service resources. If we remove the file and run terraform plan first, we can see the proposed deletions.
$ terraform plan
Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but
will not be persisted to local or remote state storage.
. . .
- module.web.aws_elb.web
- module.web.aws_instance.app[0]
- module.web.aws_instance.app[1]
- module.web.aws_instance.web[0]
- module.web.aws_instance.web[1]
- module.web.aws_security_group.app_host_sg
- module.web.aws_security_group.web_host_sg
- module.web.aws_security_group.web_inbound_sg
Plan: 0 to add, 0 to change, 8 to destroy.
But instead of removing these resources, let’s finish our development
environment by adding our second service, an API service.
Like our web service, the API service lives in a file in our ~/dc/development
directory. We’ve called this file api.tf
.
variable "api_instance_count" {
default = 5
}
module "api" {
source = "github.com/turnbullpress/tf_api"
environment = var.environment
vpc_id = module.vpc.vpc_id
public_subnet_ids = module.vpc.public_subnet_ids
private_subnet_ids = module.vpc.private_subnet_ids
region = var.region
key_name = var.key_name
api_instance_count = var.api_instance_count
}
output "api_elb_address" {
value = module.api.api_elb_address
}
output "api_host_addresses" {
value = module.api.api_host_addresses
}
In our api.tf
file we’ve specified a new variable for this module: the count of API servers we’d like to create. We’ve set a default of 5
. We’ve also specified the module that will provision our API service. Our module is sourced from the github.com/turnbullpress/tf_api
repository on GitHub. We’ve passed in an identical set of variables as we did to our web
module, plus our new API server count.
Our api
module will create five AWS instances, a load balancer in front of our instances, and appropriate security groups to allow connectivity. We’re not going to create each individual file; rather we’re going to check out the existing module from GitHub.
The resulting file tree will look like:
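Assuming the same layout as our web module, the tree would be roughly:

api
├── interface.tf
├── main.tf
└── files
    └── api_bootstrap.sh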
Let’s take a peek at the aws_instance
resource in our api
module’s main.tf
file to see how we’ve used our new count variable.
resource "aws_instance" "api" {
ami = var.ami[var.region]
instance_type = var.instance_type
key_name = var.key_name
subnet_id = var.public_subnet_ids[1]
user_data = file("${path.module}/files/api_bootstrap.sh")
vpc_security_group_ids = [
aws_security_group.api_host_sg.id,
]
tags = {
Name = "${var.environment}-api-${count.index}"
}
count = var.api_instance_count
}
Our aws_instance.api
resource looks very much like our earlier instance resources from the web
module. We’ve specified a different subnet, the second public subnet in our VPC: 10.0.2.0/24
.
We’ve also used the var.api_instance_count
variable as the value of the count
meta-argument. This allows us to avoid hard-coding the number of API servers in our configuration. Instead we can set the value of this variable, either in a terraform.tfvars file or via one of the variable population methods we explored in Chapter 3. This means we can raise and lower the number of instances in our API cluster simply and quickly.
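For example, to scale the pool to ten instances (a hypothetical count) without editing any files:

$ terraform plan -var "api_instance_count=10"
$ terraform apply -var "api_instance_count=10"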
Back in our ~/dc/development/api.tf
file we’ve also specified two outputs:
output "api_elb_address" {
value = module.api.api_elb_address
}
output "api_host_addresses" {
value = module.api.api_host_addresses
}
One will return the DNS address of the API service load balancer, and one will return the host addresses for the API servers.
In order to use our API module, we need to get it.
$ cd ~/dc/development
$ terraform get -update
Get: git::https://github.com/turnbullpress/tf_api.git (update)
Get: git::https://github.com/turnbullpress/tf_vpc.git?ref=v0.0.4 (update)
Get: git::https://github.com/turnbullpress/tf_web.git (update)
Like our web
module, we’re pulling the HEAD
of the Git repository, rather than a tag or commit.
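If we later wanted to lock the service to a known-good version, we could pin the module to a tag instead, much as the vpc module above is pinned to v0.0.4. A sketch, assuming a hypothetical v0.0.1 tag exists on the tf_api repository:

```hcl
# Hypothetical pin of the api module to a tagged release.
module "api" {
  source = "git::https://github.com/turnbullpress/tf_api.git?ref=v0.0.1"

  # ... the same variables we passed previously ...
}
```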
With the api
module in place we can then plan the service.
$ terraform plan
. . .
+ module.web.aws_elb.web
availability_zones.#: "<computed>"
connection_draining: "false"
connection_draining_timeout: "300"
cross_zone_load_balancing: "true"
. . .
Plan: 8 to add, 0 to change, 0 to destroy.
You can see we’ll add eight resources to our existing infrastructure.
Let’s do that now by applying our configuration.
$ terraform apply
module.vpc.aws_vpc.environment: Refreshing state... (ID: vpc-782c281f)
module.vpc.aws_eip.environment.1: Refreshing state... (ID: eipalloc-bf366a80)
. . .
Apply complete! Resources: 8 added, 0 changed, 0 destroyed.
. . .
Outputs:
api_elb_address = development-api-elb-1764248438.us-east-1.elb.amazonaws.com
api_host_addresses = [
10.0.2.120,
10.0.2.172,
10.0.2.64,
10.0.2.72,
10.0.2.239
]
. . .
Now we have two services running in our development
environment. We can see that our API service has created a new load balancer and five API hosts located behind it in our 10.0.2.0/24
public subnet.
Let’s test that our API is working by using curl
to hit an API endpoint on the load balancer.
$ curl development-api-elb-1764248438.us-east-1.elb.amazonaws.com/api/users/1
{
"id": 1,
"name": "user1"
}
Our API is very simple and only has one endpoint: /api/users
. With /api/users/1
we can return a single user entry, or with /api/users
we can return all of the users in our API’s database.
We can also test that the API is working by visiting the load balancer in a browser.
Finally, we can easily remove our API service by removing the api.tf
file and running terraform apply
to adjust our configuration. We can also adjust the API instance count to grow or shrink our pool of API instances.
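For example, to shrink the pool we could set a lower count in our terraform.tfvars file and re-run terraform apply (the value of 3 here is purely illustrative):

```hcl
# Scale the API pool down from five instances to three.
api_instance_count = 3
```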
Be careful, though: if we instead remove the main.tf
file containing our VPC configuration, the VPC and everything in it will be destroyed!
Now that we’ve got a functioning development
environment, we can extend this architecture to add new environments. Let’s populate a production
environment now. Our architecture makes this very easy: we can just duplicate our existing environment and update its variables.
$ cd ~/dc
$ mkdir -p production
$ cp -R development/{main.tf,variables.tf,terraform.tfvars} production/
We’ve copied the main.tf
, variables.tf
, and terraform.tfvars
into the production
directory. We haven’t included any of the services yet. Let’s update our configuration files, starting with terraform.tfvars
, for our new environment.
region = "us-east-1"
environment = "production"
key_name = "james"
vpc_cidr = "10.0.0.0/16"
public_subnets = [ "10.0.1.0/24", "10.0.2.0/24" ]
private_subnets = [ "10.0.101.0/24", "10.0.102.0/24" ]
You can see we’ve updated our environment
variable for the production
environment. We could also adjust other settings, like the region
or the public and private subnets, to suit our new environment.
We can now add services, such as our web service, to the environment by duplicating—and potentially adjusting—the web.tf
file from our development
environment. With the core of our modules driven by variables, we’re likely to only need to adjust the scale and size of the various services rather than their core configuration.
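An api.tf for the production environment might then differ from its development counterpart only in scale, along these lines (the count of 10 and the omitted variables are assumptions for illustration):

```hcl
# Hypothetical excerpt from ~/dc/production/api.tf.
module "api" {
  source = "git::https://github.com/turnbullpress/tf_api.git"

  environment        = var.environment
  api_instance_count = 10

  # ... remaining variables as in development ...
}
```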
We would wrap this whole process in the workflow we introduced at the beginning of the chapter to ensure our changes are properly managed and tested.
Another useful feature when thinking about workflow is Terraform state environments. State environments were introduced in Terraform 0.9. You can think of a state environment as branching version control for your Terraform resources.
A state environment is a namespace, much like a version control branch. State environments allow a single folder of Terraform configuration to manage multiple states of resources. They are useful for isolating a set of resources to test changes during development. Unlike version control, though, state environments do not allow merging; any changes you make in a state environment need to be re-applied to any other environments.
Right now you are operating in a state environment, a special, always present, environment called default
. If you’re running a Consul backend, you can see that by running the terraform workspace
command with the list
subcommand.
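The output might look something like this, with the asterisk marking the environment we’re currently operating in:

```
$ terraform workspace list
* default
```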
As they are pretty limited right now, we’re not going to cover them in more detail, but you can learn more about state environments in the documentation.
There are also some tools, blog posts, and resources designed to help run Terraform in a multi-environment setup.
In this chapter we’ve seen how to build a multi-environment architecture with Terraform. We’ve proposed a workflow to ensure our infrastructure is managed and changed as carefully as possible. We’ve built a framework of environments to support that workflow. We’ve also seen how to build isolated and portable services that we can deploy safely and simply using modules and configuration.
In the next chapter we’ll look at how to expand upon one section of our workflow: automated testing. We’ll learn how to write tests to help validate changes to our infrastructure.