5 Structuring and sharing modules

This chapter covers

  • Constructing module versions and tags for infrastructure changes
  • Choosing a single repository versus multiple repositories
  • Organizing shared infrastructure modules across teams
  • Releasing infrastructure modules without affecting critical dependencies

Up to this point in the book, you’ve learned practices and patterns for writing infrastructure as code and breaking it into groups of infrastructure components. However, even the most optimal configurations can become difficult to maintain and can expose your systems to a greater risk of failure. These difficulties happen when your team does not standardize collaboration practices for updating infrastructure modules.

Imagine a company, Datacenter for Veggies, starts by automating its growing operations for herbs. Applications in GCP monitor and adjust for optimal herb growth. Each team uses the singleton pattern and creates a unique infrastructure configuration.

Over time, Datacenter for Veggies becomes more popular and wants to expand to all vegetables. It hires a new application development team specializing in software for growing various vegetables, from herbs to leafy greens to root vegetables. Each team creates an infrastructure configuration independent of the others.

Datacenter for Veggies hires you to develop an application to grow fruit. You realize you cannot reuse any infrastructure configurations because they are unique to each vegetable team. The company needs a consistent, reusable way to build, secure, and manage infrastructure.

You realize that Datacenter for Veggies could use some module patterns from chapter 3 to organize infrastructure configuration into modules for composability. You sketch out a diagram, depicted in figure 5.1, to organize and coordinate groups of infrastructure for multiple teams. The teams for herbs, root vegetables, leafy green vegetables, and fruit can all use standardized configuration for networks, databases, and servers.

Figure 5.1 Datacenter for Veggies can use modules to organize and standardize infrastructure configuration across application teams.

Sharing modules across teams promotes reproducibility, composability, and evolvability. The teams do not have to spend as much time building IaC because they reproduce established configurations. Team members can choose how they compose their systems and override the configurations for their specific needs.

To fully realize the benefit of standardized modules, you need to treat them with a development life cycle outside of regular infrastructure changes. This chapter covers practices for sharing and managing infrastructure modules. You’ll learn techniques and practices to release stable modules without introducing critical failures to higher-level dependencies.

5.1 Repository structure

Imagine that each team in Datacenter for Veggies uses a singleton pattern for its infrastructure. The Herbs and Leafy Greens teams realize they use similarly configured servers, networks, and databases. Can they merge their infrastructure configuration into one module?

Rather than copy and paste each other’s configuration, the Herbs and Leafy Greens teams want to update it in one place and reference it in their configuration. Should Datacenter for Veggies put all infrastructure in one repository? Or should it divide its modules across multiple repositories?

5.1.1 Single repository

At first, each Datacenter for Veggies team stores its infrastructure configuration in a single code repository. Each team organizes its configuration into a dedicated directory to avoid mixing up configurations. If a team wants to reference a module, the team imports the module by using a local file path.

Figure 5.2 shows how Datacenter for Veggies structures its single code repository. The repository contains two folders at the top level, separating modules and environments. The company subdivides the environments directory for each team, such as the Leafy Greens team. The Leafy Greens team separates configurations by development and production environments.

Figure 5.2 The Leafy Greens team’s production and development environments use the directories containing server, network, and database factory modules in a single repository structure.

When the Leafy Greens team members want to create a database, they can use a module in the modules folder. In their IaC, they import the module by setting a local path. After importing, they can use the database factory and build the resource in production.

Datacenter for Veggies started defining infrastructure with a single repository (also known as a mono repository, or monorepo) to contain all configuration and modules for each team.

Definition A single repository structure (also known as mono repository, or monorepo) contains all IaC (configuration and modules) for a team or function.

In general, the company likes the single repository structure. All teams can reproduce their configuration by copying and pasting, and can compose new resources by adding a new folder for a module. In listing 5.1, the Leafy Greens team members build a new database with the database factory module. Using Python, they add the local path to the modules directory with sys.path.insert and then import the modules into their configuration.

Listing 5.1 Referencing infrastructure modules in a different directory

import sys
sys.path.insert(1, '../../modules/gcp')                                   
 
from database import DatabaseFactoryModule                                
from server import ServerFactoryModule                                    
from network import NetworkFactoryModule                                  
 
import json
 
 
if __name__ == "__main__":
   environment = 'production'
   name = f'{environment}-hello-world'
   network = NetworkFactoryModule(name)                                   
   server = ServerFactoryModule(name, environment, network)               
   database = DatabaseFactoryModule(name, server, network, environment)   
   resources = {                                                          
       'resource': network.build() + server.build() + database.build()    
   }                                                                      
 
   with open('main.tf.json', 'w') as outfile:                             
       json.dump(resources, outfile, sort_keys=True, indent=4)            

Imports the directory with the modules because it exists in the same repository

Imports the server, database, and network factory modules for the production environment

Uses the modules to create the JSON configuration for the network, server, and database

Writes the Python dictionary out to a JSON file to be executed by Terraform later

Using local folders to store modules helps the teams reference the infrastructure they want. Everyone can look in the same repository for modules or examine other teams’ configurations. If someone on the Herbs team wants to learn about the Fruits team’s IaC, they can use the tree command to examine the directory structure:

$ tree .
.
├── environments
│   ├── fruits
│   │   ├── development
│   │   └── production
│   ├── herbs
│   │   ├── development
│   │   └── production
│   ├── leafy-greens
│   │   ├── development
│   │   └── production
│   └── roots
│       ├── development
│       └── production
└── modules
    └── gcp
        ├── database.py
        ├── network.py
        ├── server.py
        └── tags.py

To better organize configurations, each team puts development and production environment configurations into separate folders. These directories isolate configurations and changes for each environment. Ideally, all environments should be the same. Realistically, you will have differences between environments to address cost or resource constraints.

Other tools

A single repository structure applies to many other IaC tools. For example, you can use a single repository with configuration management tools like Ansible to reuse roles and playbooks. You reference and build playbooks or configuration management modules based on their local directories in the single repository.

CloudFormation works a bit differently. You can host all of your stack definition files in a single repository. However, you must release the child template (which I consider a module) into an S3 bucket and reference it with the TemplateURL parameter in the AWS::CloudFormation::Stack resource. Later in this chapter, you’ll learn how to deliver and release changes to modules.

Datacenter for Veggies uses one infrastructure provider, GCP. In the future, teams can add new directories for different infrastructure tools or providers. These tools can update servers or networks (ansible directory), build virtual machine images (packer directory), or deploy the database to AWS (aws directory):

$ tree .
.
├── environments
│   ├── development
│   └── production
└── modules
    ├── ansible
    ├── aws
    ├── gcp
    └── packer

You may encounter the principle of don’t repeat yourself (DRY) in other IaC materials. DRY promotes reuse and composability. Infrastructure modules reduce duplication and repetition in configuration, which conforms to DRY. If you can have identical development and production environments, you could omit the development and production directories and reference one module instead of separate environment files.

You cannot fully comply with DRY in infrastructure. Depending on the infrastructure or tool’s language and syntax, you will always have repetitive configuration. As a result, you can have occasional repetition for clearer configuration or within the limitations of tools or platforms.

5.1.2 Multiple repositories

As Datacenter for Veggies grows, its infrastructure repository has hundreds of folders, each containing many more nested ones. Every week, you spend time rebasing the configuration with updates from every team. You also wait 20 minutes each time you push to production because your CI framework must recursively search for changes. The security team also expresses concern because contractors working with the Leafy Greens team have access to all of the Fruits team’s infrastructure!

You divide the network, tag, server, and database modules into individual repositories. Each repository has its own workflow for building and delivering the module, which takes less time for the CI framework to process. You can also control access to each repository, allowing contractors on the Leafy Greens team to access only the Leafy Greens configuration.

Different teams in Datacenter for Veggies can use the module’s repository or packaged version. Each team stores its configuration and modules in a separate repository. Anyone in the company can download and use the modules in their configurations.

Figure 5.3 shows the code repositories Datacenter for Veggies uses to create IaC. Each team and module gets its own code repository. When the Leafy Greens team wants to create a database, it downloads and imports the database module from a GitHub repository URL instead of the local folder. If teams have multiple environments, they subdivide their code repository into folders.

Figure 5.3 In a multiple repository structure, you store each module in its own code repository. The configuration references the repository URL to use the module.

Datacenter for Veggies has migrated from a single repository structure to a multiple repository, or multi repo, structure. The company separated modules into various repositories based on the teams.

Definition A multiple repository (also known as a multi repo) structure separates IaC (configuration or modules) into different repositories based on team or function.

Recall that a single repository pattern promotes reproducibility and composability. A multiple repository pattern helps improve the principle of evolvability. Separating the modules into their own repositories helps structure each module’s life cycle and management.

To implement a multiple repository structure, you split the modules into their own version control repository. In the following listing, you configure Python’s package manager to download each module by adding it as a library requirement in requirements.txt. Each library requirement must include a URL to the version control repository and a specific tag to download.

Listing 5.2 Python requirements.txt references module repositories

-e git+https://github.com/joatmon08/gcp-tags-module.git@v1.0.0#egg=tags
-e git+https://github.com/joatmon08/gcp-network-module.git@v1.0.0#egg=network
-e git+https://github.com/joatmon08/gcp-server-module.git@v1.0.0#egg=server
-e git+https://github.com/joatmon08/gcp-database-module.git@v1.0.0#egg=database

Downloads the prototype module for tags from a GitHub repository. Picks the module version based on the tag.

Downloads the factory module for the network, server, and database from a GitHub repository. Picks the module version based on the tag.

First, you create a repository for the production configuration of the fruit application’s infrastructure. After you create the repository, you add requirements.txt to it. Then you run Python’s package installation manager to download each module for the infrastructure configuration:

$ pip install -r requirements.txt
Obtaining tags from git+https://github.com/joatmon08/gcp-tags-module.git@v1.0.0#egg=tags
...
Successfully installed database network server tags

Rather than set a local path and import the modules, you need to run Python’s package installation manager to download them from the remote repositories first. After downloading the modules, teams can import them into environment configurations with Python, as shown in listing 5.3.

Listing 5.3 Importing the modules for use in infrastructure configuration

from tags import StandardTags                                             
from server import ServerFactoryModule                                    
from network import NetworkFactoryModule                                  
from database import DatabaseFactoryModule                                
 
import json
 
if __name__ == "__main__":
   environment = 'production'
   name = f'{environment}-hello-world'
 
   tags = StandardTags(environment)
   network = NetworkFactoryModule(name)
   server = ServerFactoryModule(name, environment, network, tags.tags)
   database = DatabaseFactoryModule(
       name, server, network, environment, tags.tags)
   resources = {
       'resource': network.build() + server.build() + database.build()    
   }
 
   with open('main.tf.json', 'w') as outfile:                             
       json.dump(resources, outfile, sort_keys=True, indent=4)            

Imports the modules downloaded by the package manager

Uses the modules to create the JSON configuration for the network, server, and database

Writes the Python dictionary out to a JSON file to be executed by Terraform later

Recall that Datacenter for Veggies separately configures the development and production environments. The teams would implement code to reference the same factory and prototype modules hosted in version control. Consistent modules for development and production environments prevent drift between environments and help you test module changes before production. You’ll learn more about testing and environments in chapter 6.

The IaC implementation for a multiple repository structure does not differ much from that of a single repository. Both structures support reproducibility and composability. However, they differ in that you independently evolve a module in an external repository.

Updating a configuration in a multiple repository structure involves re-downloading new modules with your package manager. Running the package manager to use a new module can introduce friction in your IaC workflow. Someone may update a module, and you won’t know unless you review its repository. Later in this chapter, you’ll learn about solving this problem with versioning.

Domain-specific languages

If a tool can reference modules or libraries with version control or artifact URLs, it can support a multiple repository structure.

When you adopt a multiple repository structure, you must establish a few standard practices to share and maintain modules. First, standardize a module file structure and format. It helps the teams across your organization identify and filter modules in version control. Consistent file structures and naming for module repositories also help with auditing and future automation.

For example, infrastructure modules in Datacenter for Veggies follow the same pattern and file structure. Their names include infrastructure provider, resource, and tool or purpose. In figure 5.4, the gcp-server-module describes GCP as the infrastructure provider, server as the resource type, and module as the purpose.

Figure 5.4 The repository name should include the infrastructure provider, resource type, and purpose.

If your modules use a specific tool or have a unique purpose, you can append it to the end of the repository name; including the tool in the name helps identify the module type. Similar to the practices outlined in chapter 2, you want your module name descriptive enough for a teammate to identify.

You can apply the repository naming approach to naming folders in a single repository as well. However, subdirectories in a single repository make it easier to nest and identify infrastructure provider and resource type. Depending on your organization and team’s preferences, you can always add more fields to a repository name.
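For example, a small check can enforce the naming convention as part of auditing or automation. The following sketch assumes the allowed providers and suffixes; adjust the pattern to your organization’s conventions:

# A sketch of a repository-naming check following the
# <provider>-<resource>-<purpose> convention (e.g., gcp-server-module).
# The allowed providers and purposes here are assumptions for illustration.
import re

MODULE_NAME_PATTERN = re.compile(r'^(gcp|aws)-[a-z]+-(module|packer|ansible)$')


def check_repository_name(name):
    return bool(MODULE_NAME_PATTERN.match(name))


if __name__ == "__main__":
    assert check_repository_name('gcp-server-module')
    assert not check_repository_name('servers')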

5.1.3 Choosing a repository structure

The scalability of your system and CI framework determine whether you use a single repository or multiple repositories. Datacenter for Veggies started with a single repository, which worked well because it had tens of modules and a few environments. Each team has two environments, for development and production. Each environment needs a few servers, one database, a network, and a monitoring system.

Using a single repository provides a few benefits. Figure 5.5 outlines some of the advantages and limitations. First, anyone on your team can access modules and configurations in one repository. Second, you need to go to only one place to compare and identify differences between environments. For example, you can compare two files in the repository to check whether development uses three servers and production uses five servers.
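For example, a quick diff between the two environment configurations surfaces differences such as server counts (the file paths here follow the repository tree shown earlier, and the file name is an assumption):

$ diff environments/leafy-greens/development/main.py \
    environments/leafy-greens/production/main.py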

Drawing on the IaC principles, a single repository structure still offers composability, evolvability, and reproducibility. Anyone can go into a folder and evolve a module. You can still build modules on one another because you have a singular view of all infrastructure and configuration.

Figure 5.5 A single repository offers one view for all modules and configurations but limits CI frameworks or granular access control.

On the other hand, a single repository structure has some limitations. If anyone can go and change a module, it could break the IaC that depends on it! Furthermore, your CI system might break down as it recursively checks each directory for changes.

As a result, you need to adopt practices and tools to handle single repositories. These include opinionated versioning and specialized build systems. If your organization cannot build or adopt a tool that helps alleviate single repository management, you may choose a multiple repository structure.

Note You will find a few tools that help with building and managing single repositories. They have additional code to handle nested subdirectories and individual build workflows. Some of them include Bazel, Pants, and Yarn.

The migration from single to multiple repository structure happens more than you think. I had to do it twice! One organization started with three environments and four modules. Over a few years, the IaC grew to hundreds of modules and environments.

Unfortunately, the CI framework (Jenkins) took nearly three hours to run a standard change that scaled up servers. The framework spent most of its time searching each directory and nested directory for changes! We eventually refactored the configurations and modules into multiple repositories.

Refactoring into multiple repositories alleviated some of the problems with the CI framework. A multiple repository structure also provided more granular access control to specific modules. The security team could grant module edit access to specific teams. You’ll learn more about refactoring in chapter 10.

Figure 5.6 shows the benefits and limitations of multiple repositories, including granular access control and scalable CI workflows. However, a multiple repository structure reduces your singular view of modules and configurations for your organization.

Figure 5.6 Multiple repositories help reduce the burden of running tests and configurations with CI frameworks but require constant verification of conformance to formatting and troubleshooting.

By refactoring the configuration into a multiple repository structure, you can isolate access to and evolve the infrastructure configuration for specific teams. You have greater control over the evolution and life cycle of modules. Most CI frameworks support multiple repositories and will run workflows in parallel when the framework detects changes to a given repository.

However, multiple repositories do have some downsides. Imagine Datacenter for Veggies has ten or more modules in different repositories. How do you know if they all conform to the same file standards and naming?

Figure 5.7 shows one solution to the problem of file and standard conformance. You can capture all of the tests for formatting and linting checks into a prototype module. Then, the CI framework downloads the tests and checks for README and Python files in the server, network, database, and DNS modules.

Figure 5.7 Creating a prototype module that contains all of the checks for module repository format will help fix older repositories that do not conform to new standards.

The prototype module with tests helps enforce formatting for older modules that you don’t use as often, like DNS. If you want to add a new standard, you update the prototype module with a new test. The next time someone updates a module or configuration, they need to update their module format to conform.
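A minimal sketch of one such check follows; the required file names are assumptions based on the modules shown in this chapter:

# A sketch of a conformance check the prototype module might bundle:
# every module repository must contain a README and the expected Python files.
import os

REQUIRED_FILES = ['README.md', 'setup.py', '__init__.py']


def check_module_files(path):
    # Return the required files missing from the module repository.
    return [f for f in REQUIRED_FILES
            if not os.path.isfile(os.path.join(path, f))]


if __name__ == "__main__":
    missing = check_module_files('.')
    if missing:
        raise SystemExit(f'module does not conform, missing: {missing}')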

A standardized set of checks helps alleviate the operational burden of finding and replacing files in hundreds of repositories. It distributes the responsibility of updating the module repository to the module’s maintainers. For more on module conformance testing and integrating modules in your workflow, you can apply the practices in chapters 6, 7, and 8.

A second disadvantage to a multiple repository structure involves the challenge of troubleshooting. When you reference a module in your configuration, you need to search for the module repository to identify which inputs and outputs it needs. The search adds extra effort and time when you debug failures in configuration.

If you have a build system that can handle single repository building requirements, you can use a single repository for everything. However, most build systems do not scale with recursive directory searching. To solve this problem, you can use a combination of single and multiple repositories.

Let’s apply this solution to Datacenter for Veggies. They separate each configuration for different types of fruits and vegetables. Leafy Greens uses one repository, while Fruits uses another. Both of them reference shared modules for network, tags, database, and DNS.

Figure 5.8 shows that the Fruits team needs a queue but the Leafy Greens team does not. As a result, the Fruits repository includes a local module for creating queues. The Fruits team uses a single repository for its unique configurations but references multiple repositories for common modules.

Figure 5.8 Your organization can combine multiple repositories with a single repository for application or system-specific configuration.

When you use this mix-and-match approach, recognize the kind of access control you want for individual repositories or shared configurations. If you want to improve composability and reproducibility for other teams, you might put a module in its own repository. However, if you want to maintain evolvability for a specialized configuration, you might manage the module locally with your configuration.

As you choose your repository structure, recognize the trade-off between approaches and refactor as the number of modules and configurations grows. When you add more configuration and resources into a single repository, you need to make sure the tools and processes scale with it!

5.2 Versioning

Throughout this chapter, you’ve used the practice of keeping infrastructure configuration or code in version control. For example, Datacenter for Veggies teams can always reference infrastructure based on the commit hash. One day, the security team for Datacenter for Veggies expresses concern about the age of usernames and passwords for soil-monitoring databases.

The team recommends using a secret manager to store and rotate the password every 30 days. Problematically, all teams use the soil-monitoring database module. Figure 5.9 shows that the application currently references the output of the database module. The module outputs the password for the database, which applications use to write and read data. The security team wants you to use a secrets manager instead.

Figure 5.9 The applications reference the database endpoint and password from the soil-monitoring module but should use the password from the secrets manager.

The database module output affects the secret’s evolvability and security. How can we update the database to use the secrets manager without disrupting soil data collection? The infrastructure team at Datacenter for Veggies decides to add versioning to the database module.

Definition Versioning is the process of assigning unique versions to iterations of code.

Let’s examine how the Datacenter for Veggies team implements module versions. The team uses version control to tag the current version of the database module as v1.0.0. Version v1.0.0 will output the database password for applications:

$ git tag v1.0.0

They push the tag for v1.0.0 to version control:

$ git push origin v1.0.0
Total 0 (delta 0), reused 0 (delta 0), pack-reused 0
 * [new tag]         v1.0.0 -> v1.0.0

You must refactor the configurations for Fruit, Leafy Greens, Grain, and Herb growth to use version v1.0.0 of the database module, a process called version pinning. Version pinning preserves idempotency. When you run the IaC, the configurations continue to use the database module outputs. You should not detect any drift between a pinned module and the existing infrastructure.
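For example, the Fruits team’s configuration might pin the database module in requirements.txt; the repository name follows the naming convention from section 5.1.2 and is an assumption:

-e git+https://github.com/joatmon08/gcp-database-module.git@v1.0.0#egg=database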

After all of the teams pin the versions to v1.0.0, you can rewrite the module to use a secrets manager. The database module stores the password in the secrets manager. The team tags the new database module as v2.0.0, which outputs the database endpoint and location of the password in the secrets manager:

$ git tag v2.0.0

They push the tag for v2.0.0 to version control:

$ git push origin v2.0.0
Total 0 (delta 0), reused 0 (delta 0), pack-reused 0
 * [new tag]         v2.0.0 -> v2.0.0

You can examine the difference between the two versions of the module based on the commit history:

$ git log --oneline
7157d3e (HEAD -> main, tag: v2.0.0, origin/main) 
Change database module to store password in secrets manager
5c5fd65 (tag: v1.0.0) Add database factory module

Now that you’ve created a new version of the database factory module, you ask some of the teams to try it. The Fruits team bravely volunteers. The Fruits team currently uses version 1.0.0. That module version outputs the database endpoint and password.

When updating to module version 2.0.0, as shown in figure 5.10, the Fruits team needs to account for changes in the module’s workflow. The team can no longer read the database password from the module’s output. Instead, the module outputs an API path to the database password stored in the secrets manager. As a result, the Fruits team refactors its IaC to get the database password from the secrets manager before creating the database.
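A minimal sketch of that refactor follows. It assumes the v2.0.0 module exposes a hypothetical password_secret_id output and that the configuration reads the password through a Terraform data source:

# A sketch of the Fruits team's configuration after upgrading to v2.0.0.
# password_secret_id is a hypothetical module output pointing at the
# password's location in the secrets manager.
import json

from database import DatabaseFactoryModule
from network import NetworkFactoryModule
from server import ServerFactoryModule

if __name__ == "__main__":
    environment = 'production'
    name = f'{environment}-hello-world'
    network = NetworkFactoryModule(name)
    server = ServerFactoryModule(name, environment, network)
    database = DatabaseFactoryModule(name, server, network, environment)
    resources = {
        'data': {
            'google_secret_manager_secret_version': {
                'database_password': {
                    # read the password from the secrets manager, not the module
                    'secret': database.password_secret_id
                }
            }
        },
        'resource': network.build() + server.build() + database.build()
    }

    with open('main.tf.json', 'w') as outfile:
        json.dump(resources, outfile, sort_keys=True, indent=4)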

You’ll apply a few essential practices to module versioning. First, make sure you run your IaC and eliminate any drift before you update. Second, establish a versioning approach that does not reference the latest version of the module.

Datacenter for Veggies follows semantic versioning, assigning version numbers that convey essential information about the configuration. You can specify module versions in a few ways, including tagging the commit with a number in version control or packaging and labeling the module in an artifact repository.

Figure 5.10 You can refactor the Fruit application to reference version 2.0.0 of the database module and retrieve the database password from the secrets manager.

Note I often update the major version for significant updates that remove inputs, outputs, and resources. I usually update the minor version if I update configuration values, inputs, or outputs to modules that do not affect dependencies using previous versions. Finally, I will change the patch version for minor configuration value changes scoped to the module and its resources. For additional details on semantic versioning and its approaches, you can reference https://semver.org/.

Using a consistent versioning approach, you can more effectively evolve a module without breaking the infrastructure resources that depend on it, because each configuration controls which version of its dependencies it uses. Versioning also helps with the auditing of active versions. To save resources, reduce confusion, and promote the latest changes, versioning allows you to identify and deprecate older, inactive versions of the module.

However, you must continuously remember and enforce certain versioning practices. The longer you wait to update the application to use v2.0.0 of the database module, the higher the chance the upgrade will fail. You might consider putting a timeline on how long you can use module version v1.0.0. You do not need to immediately delete v1.0.0 of the database module. Generally, I upgrade dependent modules within a few minor version changes. Trying to upgrade with a broader “jump” between versions increases the change’s risk and possible failure rate.

Note If you use feature-based development or Git Flow, you can accommodate patches or hotfixes in the same workflow as software development. You can make a branch based on the version tag, update the changes, increase the patch version, and add a new tag for the hotfix branch. You will need to keep the branch for the commit history.

This versioning process works well for a multiple repository structure. What about a single repository? You can still apply the version control tagging approach. You may want to add a prefix to the tag with the module name (module-name-v2.0.0). Then you can package and release your module to an artifact repository. Your build system packages the contents of the module subdirectory and tags the version in the artifact store. Your configuration references the remote modules in the artifact repository instead of a local file.
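For example, tagging the database module in a single repository might look like the following; the prefix format follows the module-name-v2.0.0 suggestion, and the module name is an assumption:

$ git tag gcp-database-module-v2.0.0
$ git push origin gcp-database-module-v2.0.0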

5.3 Releasing

I explained the practice of module versioning to help with module evolution and minimize disruption to your system. However, you don’t want every team to update its IaC to the newest module immediately. Instead, you want to make sure the module works and doesn’t break your infrastructure before you use it in production.

Figure 5.11 shows how you evaluate your database module update before allowing all Datacenter for Veggies teams to use it. After you update the database module to store a password in the secrets manager, you push the changes to version control. You ask the Fruits team to test the module in a separate environment, and the team confirms it works correctly. You tag the release with a new version, 2.0.0, and update the documentation on the secrets manager.

In the previous section, the Datacenter for Veggies infrastructure team updated the module and tested it with the Fruits team’s development environment first. Now that the module passed the test, other teams can use the new database module with a secret manager. The team followed a release process to certify that other teams can use the new module.

Definition Releasing is the process of distributing software to a consumer.

Figure 5.11 When you make module updates, ensure that you include a testing stage before releasing the module and updating its documentation.

A release process identifies and isolates any problems from module updates. You do not package a new module unless the tests certify that it works.

I recommend running module tests in a dedicated testing environment away from development and production workloads. A separate account or project for module testing helps you track the cost of running the tests and isolates failures away from active environments. You’ll learn more about testing and testing environments in chapter 6.

Note For a detailed code listing of a continuous delivery pipeline for releasing modules, check out http://mng.bz/PnaR. The GitHub Actions pipeline automatically builds a GitHub release when the tests succeed based on a commit message.

After testing the modules, you tag the release with a new version for your team to use. Datacenter for Veggies releases the database module as version v2.0.0 and uses the Python package manager to reference the tag. Alternatively, you can package the module and push it to an artifact repository or storage bucket.
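As a sketch, packaging the Python module and pushing it to a storage bucket might look like the following; the bucket name is hypothetical:

$ python setup.py sdist
$ gsutil cp dist/database-2.0.0.tar.gz gs://dfv-infrastructure-modules/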

For example, imagine Datacenter for Veggies has some teams that use CloudFormation. These teams prefer to reference modules (or CloudFormation stacks) stored in an Amazon S3 bucket. In figure 5.12, the teams add a step to their delivery pipeline to compress their modules and upload them to an S3 storage bucket. As a last step, they update documentation outlining the changes they made.

Figure 5.12 After testing, you optionally choose to package and push the module to an artifact repository or storage bucket.

Some organizations prefer packaging the artifact and storing it in a separate repository for additional security control. If you have a secure network that cannot access an external version control endpoint, you can reference the artifact repository instead. Just make sure to keep the tag in version control so someone can correlate the artifact to the correct code version.

After packaging and pushing the artifact, you should update documentation outlining your changes. That documentation, called release notes, outlines breaking changes to outputs and inputs. Release notes communicate a summary of changes to other teams.

Definition Release notes list changes to code for a given release. You should store them in a document in the repository, often called a changelog.

You can manually update the release notes, but I prefer an automated semantic release tool (such as semantic-release) to examine the commit history and build release notes for me. Make sure that you use the correct commit message format for the tool to match and parse changes. Chapter 2 emphasized the importance of writing descriptive commit messages. You’ll also find them helpful for an automated release tool.

For example, the database module stores the password in a secret manager. Datacenter for Veggies considers this a major feature, so you prefix the commit message with feat:

$ git log -n 1 --oneline
1b65555 (HEAD -> main, tag: v0.0.1, origin/main, origin/HEAD) 
feat(security): store password in secrets manager

A commit analyzer in an automated release tool automatically updates the major version of the tag to v2.0.0 based on this commit.

Image building

You might encounter the practice of using image-building tools to build immutable server or container images. By baking the packages you want into a server or container image, you can create new servers with updates without the problems of in-place updates. When you release immutable images, use a workflow to create a test server based on the image, check that it runs correctly, and update the version of the image tag. Chapter 7 covers some of these workflows.

Besides updating release notes, make sure you update commonly used files and documentation. Common files help your teammates use the module. For instance, Datacenter for Veggies agrees that teams must always include a README file. A README documents the purpose, inputs, and outputs of each module.

Definition A README is a document in a repository that explains usage and contribution instructions for code. For IaC, use it to document a module’s purpose, inputs, and outputs.

Use a linting rule to check for the existence of a README file. In chapter 2, I discussed some linting practices to ensure clean IaC. Applying the pattern to commonly used files and documentation helps you format and organize large amounts of IaC.

In the Python examples, the modules include common files like __init__.py for identifying the package and setup.py for module configuration. I often refer to files with configuration or metadata that help specific tools or languages as helper files. They change depending on the tool and platform you use. You will want to standardize them across your organization so you can change or search them in parallel by using automation.
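As a sketch, a setup.py helper file for the database module might look like this; the name, version, and description are assumptions:

# A minimal setup.py for packaging the database module, assuming the package
# layout used in this chapter's examples.
from setuptools import find_packages, setup

setup(
    name='database',
    version='2.0.0',
    description='Factory module for a GCP database with a secrets manager password',
    packages=find_packages(),
)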

5.4 Sharing modules

As Datacenter for Veggies grows more produce, it adds new teams that automate the growth of grains, tea, coffee, and beans. The company also creates a new team for researching wild strains of produce. Each team needs to be able to expand the existing modules but also create new ones.

For example, the Beans team needs to change a database module to use PostgreSQL version 12. Should those team members be able to edit the module with the version update? Or should they file a ticket with you, the infrastructure team, to update it?

You need to empower different teams to create and update modules with IaC. However, you want to make sure that teams do not change an attribute and compromise security or functionality. You’ll find a few practices that can help you share modules across your organization.

Imagine that all teams in Datacenter for Veggies need a database. You create a new, opinionated database module that establishes a default set of parameters to provide security and functionality. The database module uses embedded defaults for module inputs to cover many Datacenter for Veggies use cases. Even if the Coffee team doesn’t know how to create a database, that team can use the module to build a secure, working database.

As a general practice, set opinionated defaults in your module. You want to err on the prescriptive side. If a team needs more flexibility, it can update the module or override the default attributes. Preset defaults help teach secure and standard practices for deploying specific infrastructure resources.
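The following sketch shows what opinionated defaults might look like in a database factory module; the attribute names and default values are assumptions for illustration:

# A sketch of an opinionated database module: secure, working defaults cover
# the common Datacenter for Veggies case, and teams override only what they need.
class DatabaseFactoryModule:
    def __init__(self, name, network, environment,
                 version='POSTGRES_11',        # default engine version (assumption)
                 tier='db-custom-1-3840',      # small instance by default
                 deletion_protection=True):    # protect data unless overridden
        self.name = f'{name}-database'
        self.network = network
        self.environment = environment
        self.version = version
        self.tier = tier
        self.deletion_protection = deletion_protection

    def build(self):
        # Return a Terraform JSON snippet for the database instance.
        return [{
            'google_sql_database_instance': {
                self.name: [{
                    'name': self.name,
                    'database_version': self.version,
                    'deletion_protection': self.deletion_protection,
                    'settings': [{'tier': self.tier}]
                }]
            }
        }]

A team that needs more flexibility overrides only what it needs, for example version='POSTGRES_12', or updates the module’s defaults, as the Beans team does next.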

In this scenario, the Beans team expresses a need for more flexibility. The module does not use the newest version of the database, PostgreSQL 12, and no other team uses that version of PostgreSQL. The Beans team decides to update the database version and push the changes into the repository.

However, the changes do not get released immediately. The build system sends a notification to module approvers in the infrastructure team. In figure 5.13, the infrastructure team pauses the build system and reviews the changes. If the changes pass the team’s approval, the build system releases the module. The Beans team can use the new version of the database module with PostgreSQL version 12.

Why should you allow the Beans team to change the infrastructure module? Self-service of module changes empowers all teams to update their systems and reduces the burden on infrastructure and platform teams. You want to balance their development progress with security and infrastructure availability. Adding an approval before module release identifies potential failures or nonstandard changes to infrastructure.

Figure 5.13 An application team can update the database module. However, the team must wait for approval from subject-matter experts before being able to use the new release.

The practice of allowing any team to use modules and edit them with approvers works best with established module development standards and processes. If you don’t establish module standards, this approach falls apart and adds friction to delivering the infrastructure change to production.

Let’s return to the example. The infrastructure team does not have much confidence in the change, so the team asks a database administrator for additional review. The database administrator points out that if the Beans team upgrades its module version, the resulting behavior deletes the previous database and creates an empty one with the new version! This would significantly disrupt the application supporting bean growth.

In figure 5.14, the Beans team submits a request for help from the database team. An administrator recommends some practices that will help update the database without deleting data. The Beans team implements these practices and asks module approvers for a second review. Once the module gets released, the team can use the module without worrying about disrupting its applications.

Figure 5.14 For disruptive module updates, the application team submits a ticket to the database team to verify database migration steps before releasing a new module version.

If you have concerns that a change might be particularly disruptive to a system’s architecture, security, or availability, ask for review from a subject-matter expert before releasing a new version. A subject-matter expert can help identify any problems that will affect other teams using the module and advise on the best way to update it. The process of review helps you evolve your IaC and identify potential failures from infrastructure changes.

In general, you need a process that empowers your team to make infrastructure changes and provides the team the knowledge and support to complete those changes successfully without disrupting critical systems. Manual review may seem tedious but helps educate your team and prevent problems in production. Your team must find a balance between quickly deploying changes to production and waiting for manual review from a subject-matter expert, something I’ll expand on in chapter 7.

By working collaboratively on modules, you share IaC knowledge across teams and collectively identify potential disruptions to critical infrastructure. You can treat modules as artifacts for use across an organization, similar to shared application libraries, container images, or virtual machine images. Anyone in the company can use and update modules (with additional help, if needed!) to evolve infrastructure architecture, security, or availability.

Summary

  • Structure and share modules and configurations in a single repository or multiple repositories.

  • A single repository structure organizes all configuration and modules in one place, making it easier to troubleshoot and identify usable resources.

  • A multiple repository structure organizes all configuration and modules into their own code repositories, divided by business domain, function, team, or environment.

  • A multiple repository structure allows better access control for individual infrastructure configuration or modules and streamlines pipeline execution for each repository.

  • A single repository may not scale as more people collaborate on IaC and require additional resources for a build system to process changes quickly.

  • Refactor a single repository into multiple repositories, one for each module.

  • Choose a consistent versioning methodology for modules and update them using Git tags.

  • Package and release a module to an artifact repository, which will allow anyone in the organization to retrieve a specific module version.

  • When sharing modules across teams, establish opinionated default parameters in modules to maintain security and functionality.

  • Allow anyone in the organization to suggest updates to modules, but add governance to identify potentially disruptive changes to modules that affect architecture, security, or infrastructure availability.
