© Moritz Lenz 2019
Moritz Lenz, Python Continuous Integration and Delivery, https://doi.org/10.1007/978-1-4842-4281-0_9

9. Building in the Pipeline with Go Continuous Delivery

Moritz Lenz (Fürth, Bayern, Germany)

The previous chapters have demonstrated the automation of the essential steps from source code to deployment: build, distribution, and deployment. What’s missing now is the glue that holds them all together: polling the source code repositories, getting packages from the build server to the repository server and generally controlling the flow, aborting the pipeline instance when one step has failed, and so on.

We will use Go Continuous Delivery1 (GoCD or Go) by ThoughtWorks as glue.

9.1 About Go Continuous Delivery

GoCD is an open source project written in Java, with components of its web interface in Ruby on Rails. It started out as proprietary software in 2010 and was open sourced in 2014.

You can download GoCD for Windows, OSX, Debian and RPM-based Linux distributions, and Solaris. Commercial support for GoCD is available from ThoughtWorks.

It consists of a server component that holds the pipeline configuration, polls source code repositories for changes, schedules and distributes work, collects artifacts, presents a web interface to visualize and control it all, and offers a mechanism for manual approval of steps.

One or more agents connect to the server and carry out the actual jobs in the build pipeline.

Pipeline Organization

Every build, deployment, or test job that GoCD executes must be part of a pipeline. A pipeline consists of one or more linearly arranged stages. Within a stage, one or more jobs run potentially in parallel and are individually distributed to agents. Tasks are serially executed within a job.

In a task, you can rely on files that previous tasks in the same job produced, whereas between jobs and stages, you have to explicitly capture and later retrieve them as artifacts. More on that follows.

The most general task is the execution of an external program. Other tasks include the retrieval of artifacts or language-specific things such as running Ant or Rake builds.2

Pipelines can trigger other pipelines, allowing you to form an acyclic, directed graph of pipelines (Figure 9-1).

Figure 9-1. GoCD pipelines can form a graph. Pipelines consist of sequential stages in which several jobs can run in parallel. Tasks are serially executed inside a job.

Matching of Jobs to Agents

When an agent is idle, it polls the server for work. If the server has jobs to run, it uses two criteria to decide if the agent is fit for carrying out the job: environments and resources.

Each job is part of a pipeline, and if you choose to use environments, a pipeline is part of an environment. On the other hand, each agent is configured to be part of one or more environments. An agent only accepts jobs from pipelines from one of its environments.

Resources are user-defined labels that describe what an agent has to offer, and inside a pipeline configuration, you can specify what resources a job requires. For example, if you define that a job requires the phantomjs resource to test a web application, only agents that you assign this resource to will execute that job. It is a good idea to add the operating system and version as resources. In the preceding example, the agent might have the phantomjs, debian, and debian-stretch resources, offering the author of the job some choice of granularity for specifying the required operating system.
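
The resource rule can be pictured as a subset test: an agent qualifies when it offers every label the job requires. The following Bash sketch only models that rule. It is not GoCD's implementation, and the function name agent_offers is hypothetical.

```shell
#!/bin/bash
# Hypothetical model of GoCD's resource matching, not its real code.
# Usage: agent_offers "<agent resources>" "<required resources>"
# Both arguments are space-separated lists of resource labels.
agent_offers() {
    local offered=" $1 " required
    for required in $2; do
        # Every required label must appear among the offered ones.
        case "$offered" in
            *" $required "*) ;;
            *) return 1 ;;
        esac
    done
    return 0
}

agent="phantomjs debian debian-stretch"
agent_offers "$agent" "phantomjs debian" && echo "agent can run the job"
agent_offers "$agent" "phantomjs windows" || echo "agent is skipped"
```

Because labels are compared as whole words, a job that requires debian-stretch is not satisfied by an agent that only offers debian.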

A Word on Environments

GoCD makes it possible to run agents in specific environments. As an example, one can run a Go agent on each testing and on each production machine and match pipelines to agent environments, to ensure that an installation step occurs on the right machine in the right environment. If you go with this model, you can also use GoCD to copy the build artifacts to the machines for which they are needed.

I chose not to do this, because I didn’t want to have to install a GoCD agent on each machine that I want to deploy to. Instead, I use Ansible, executed on a GoCD agent, to control all machines in an environment. This requires managing the SSH keys that Ansible uses and distributing packages through a Debian repository. But because Debian requires a repository anyway, to be able to resolve dependencies, this is not much of an extra burden.

Materials

A material in GoCD serves two purposes: it triggers a pipeline, and it provides files that the tasks in the pipeline can work with.

I tend to use Git repositories as materials, and GoCD can poll these repositories, triggering the pipeline when a new version becomes available. The GoCD agent also clones the repositories into the file system in which the agent executes its jobs.

There are material plug-ins for various source control systems, such as Subversion (svn) and Mercurial, and plug-ins for treating Debian and RPM package repositories as materials.

Finally, a pipeline can serve as a material for other pipelines. Using this feature, you can build graphs of pipelines.

Artifacts

GoCD can collect artifacts, which are files or directories generated by a job. Later parts of the same pipeline, or even of other, connected pipelines, can retrieve those artifacts. Retrieval of artifacts is not limited to artifacts created on the same agent machine.

You can also retrieve artifacts from the web interface and from a REST API that the GoCD server provides.3
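
As a rough illustration of the REST route, the sketch below assembles a download URL for a single artifact. The path layout is an assumption taken from GoCD's API documentation and may differ between versions. The function name artifact_url and the host gocd.example.com are hypothetical.

```shell
#!/bin/bash
# Builds the download URL for one artifact. The path layout
# /go/files/<pipeline>/<counter>/<stage>/<stage counter>/<job>/<file>
# is an assumption based on GoCD's API documentation.
artifact_url() {
    local server=$1 pipeline=$2 counter=$3 stage=$4 stage_counter=$5
    local job=$6 path=$7
    echo "$server/go/files/$pipeline/$counter/$stage/$stage_counter/$job/$path"
}

# Hypothetical server name; pipeline, stage, and job names are examples.
artifact_url http://gocd.example.com:8153 python-matheval 12 build 1 \
    build-deb version
```

The printed URL can be handed to a tool such as curl. If authentication is enabled on the server, credentials have to be supplied as well.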

The artifact repository can be configured to discard older versions when disk space becomes scarce.

9.2 Installation

In order to use GoCD, you have to install the GoCD server on one machine and a GoCD agent on at least one machine. The agent can run on the same machine as the server or on a different one, as long as it can connect to the GoCD server on ports 8153 and 8154.

When your infrastructure and the number of pipelines grow, it is likely that you will be running several Go agents.

Installing the GoCD Server on Debian

To install the GoCD server on a Debian-based operating system, first you have to make sure you can download Debian packages via HTTPS.
$ apt-get install -y apt-transport-https
Then you have to configure the package sources.
$ echo 'deb https://download.gocd.org /' \
        > /etc/apt/sources.list.d/gocd.list
$ curl https://download.gocd.org/GOCD-GPG-KEY.asc \
        | apt-key add -
And finally install it.
$ apt-get update && apt-get install -y go-server

On Debian 9, codename Stretch, Java 8 is available out of the box. In older versions of Debian, you might have to install Java 8 from other sources, such as Debian Backports.4

When you now point your browser at port 8154 of the Go server for HTTPS (ignore the SSL security warnings), or port 8153 for HTTP, you should see the GoCD server’s web interface (Figure 9-2).

Figure 9-2. GoCD’s initial web interface

If you get a connection refused error, check the files under /var/log/go-server/ for hints of what went wrong.

To prevent unauthenticated access, you can install authentication plug-ins, for example, password file-based authentication5 or LDAP or Active Directory–based authentication.6

Installing a GoCD Agent on Debian

On one or more machines on which you want to execute the automated build and deployment steps, you must install a Go agent, which will connect to the server and poll it for work.

See Chapter 8 for an example of automatic installation of a GoCD agent. If you want to do it manually instead, you must perform the same first three steps as when installing the GoCD server, to ensure that you can install packages from the GoCD package repository. Then, of course, you install the Go agent. On a Debian-based system, this is the following:
$ apt-get install -y apt-transport-https
$ echo 'deb https://download.gocd.org /' \
    > /etc/apt/sources.list.d/gocd.list
$ curl https://download.gocd.org/GOCD-GPG-KEY.asc \
    | apt-key add -
$ apt-get update && apt-get install -y go-agent
Then edit the file /etc/default/go-agent. The first line should read
GO_SERVER_URL=https://127.0.0.1:8154/go
Change the variable to point to your GoCD server machine, then start the agent.
$ service go-agent start
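
If you provision agents from a script, the change to GO_SERVER_URL can be automated as well. The following is a minimal sketch under that assumption. The helper name set_server_url and the host gocd.example.com are hypothetical, and the demonstration works on a scratch file rather than on the real defaults file.

```shell
#!/bin/bash
# Rewrites the GO_SERVER_URL line of an agent defaults file.
# Usage: set_server_url <file> <url>
set_server_url() {
    local file=$1 url=$2
    # Write to a temporary file first, then replace the original.
    sed "s|^GO_SERVER_URL=.*|GO_SERVER_URL=$url|" "$file" > "$file.tmp" &&
        mv "$file.tmp" "$file"
}

# Demonstration on a scratch copy; on a real agent the file is
# /etc/default/go-agent, and gocd.example.com is a placeholder host.
demo=$(mktemp)
echo 'GO_SERVER_URL=https://127.0.0.1:8154/go' > "$demo"
set_server_url "$demo" 'https://gocd.example.com:8154/go'
cat "$demo"
rm -f "$demo"
```

Other lines in the file are passed through unchanged, so the helper can be run repeatedly without side effects.
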
After a few seconds, the agent will have contacted the server. When you click the Agents menu in the GoCD server’s web interface, you should see the agent (Figure 9-3).

Figure 9-3. Screenshot of GoCD’s agent management interface. (lara is the host name of the agent here.)

First Contact with GoCD’s XML Configuration

There are two ways to configure your GoCD server: through the web interface and through a configuration file in XML. You can also edit the XML config through the web interface.7

While the web interface is a good way to explore GoCD’s capabilities, it quickly becomes annoying to use, due to too much clicking. Using an editor with good XML support gets things done much faster, and it lends itself better to compact explanation, so that’s the route I’m taking here. You can also use both approaches on the same GoCD server instance.

In the Admin menu, the Config XML item lets you see and edit the server config. Listing 9-1 is what a pristine XML configuration looks like, with one agent already registered.
<?xml version="1.0" encoding="utf-8"?>
<cruise
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="cruise-config.xsd"
    schemaVersion="77">
<server artifactsdir="artifacts"
        commandRepositoryLocation="default"
        serverId="b2ce4653-b333-4b74-8ee6-8670be479df9">
</server>
<agents>
    <agent hostname="lara" ipaddress="192.168.2.43"
        uuid="19e70088-927f-49cc-980f-2b1002048e09" />
</agents>
</cruise>
Listing 9-1. Baseline GoCD XML Configuration, with One Agent Registered

The serverId and the data of the agent will differ in your installation, even if you followed the same steps.

To give the agent some resources, you can change the <agent .../> tag in the <agents> section to read as shown in Listing 9-2.
<agent hostname="lara" ipaddress="192.168.2.43"
    uuid="19e70088-927f-49cc-980f-2b1002048e09">
  <resources>
    <resource>debian-stretch</resource>
    <resource>build</resource>
    <resource>aptly</resource>
  </resources>
</agent>
Listing 9-2. GoCD XML Configuration for an Agent with Resources

Creating an SSH Key

It is convenient for GoCD to have an SSH key without a password, to be able to clone Git repositories via SSH, for example. To create one, run the following commands on the server:
$ su - go
$ ssh-keygen -t rsa -b 2048 -N '' -f ~/.ssh/id_rsa

Either copy the resulting .ssh directory and the files therein onto each agent into the /var/go directory (and remember to set owner and permissions as they were created originally) or create a new key pair on each agent.
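
When copying the directory, the permissions are easy to get wrong, and SSH refuses to use a private key that other users can read. The sketch below restores the modes that ssh-keygen normally sets. The helper name fix_ssh_permissions is hypothetical, and the chown hint in the comment assumes that the agent runs as user and group go.

```shell
#!/bin/bash
# Restores the permissions ssh-keygen normally gives a key directory.
# Usage: fix_ssh_permissions <path to .ssh directory>
fix_ssh_permissions() {
    local dir=$1
    chmod 700 "$dir"
    chmod 600 "$dir/id_rsa"
    chmod 644 "$dir/id_rsa.pub"
}

# Demonstration on a scratch directory with empty stand-in files.
# On an agent, the directory would be /var/go/.ssh, and you would
# also run (as root): chown -R go:go /var/go/.ssh
demo=$(mktemp -d)
touch "$demo/id_rsa" "$demo/id_rsa.pub"
fix_ssh_permissions "$demo"
ls -l "$demo"
rm -rf "$demo"
```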

9.3 Building in the Pipeline

Triggering the build of a Debian package requires fetching the source code from a Git repository, by configuring it as a GoCD material, then invoking the dpkg-buildpackage command with some options, and, finally, collecting the resulting files.

Here (Listing 9-3) is the first shot at building the python-matheval package, expressed in GoCD’s XML configuration.
<pipelines group="deployment">
  <pipeline name="python-matheval">
    <materials>
      <git
        url="https://github.com/python-ci-cd/python-matheval.git"
        dest="source" />
    </materials>
    <stage name="build" cleanWorkingDir="true">
      <jobs>
        <job name="build-deb" timeout="5">
          <tasks>
            <exec command="/bin/bash" workingdir="source">
              <arg>-c</arg>
              <arg>dpkg-buildpackage -b -us -uc</arg>
            </exec>
          </tasks>
          <artifacts>
            <artifact src="*.deb" dest="debian-packages/"
                type="build" />
          </artifacts>
          <resources>
            <resource>debian-stretch</resource>
            <resource>build</resource>
          </resources>
        </job>
      </jobs>
    </stage>
  </pipeline>
</pipelines>
Listing 9-3. Simple Approach to Building a Debian Package in GoCD

You can find this and all following XML configurations in the gocd directory of the deployment-utils8 repository.

The outermost tag is a pipeline group, which has a name. It can be used to categorize available pipelines and also to manage permissions.

The second level is the <pipeline> with a name, and it contains a list of materials and one or more stages.

Directory Layout

Each time a job within a stage is run, the GoCD agent that is assigned to the job prepares a directory in which it makes the materials available. On Linux, this directory defaults to /var/lib/go-agent/pipelines/, followed by the pipeline name. Paths in the GoCD configuration are relative to this path.

For example, the preceding material definition contains the attribute dest="source", so the absolute path to this Git repository’s working copy is /var/lib/go-agent/pipelines/python-matheval/source. Leaving out the dest="..." would work and give one less directory level, but it would also prevent us from using a second material in the future.
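
The path rule can be written down as a one-line helper. This is only a sketch that assumes the default Linux base directory mentioned above. The function name pipeline_path is hypothetical.

```shell
#!/bin/bash
# Resolves a path from the GoCD configuration to an absolute path on
# the agent, assuming the default Linux base directory.
pipeline_path() {
    local pipeline=$1 relative=$2
    echo "/var/lib/go-agent/pipelines/$pipeline/$relative"
}

pipeline_path python-matheval source
```

For the python-matheval pipeline and dest="source", this prints /var/lib/go-agent/pipelines/python-matheval/source.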

See the config references9 for a list of available material types and options. Plug-ins are available10 that add further material types.

Stages, Jobs, Tasks, and Artifacts

All the stages in a pipeline run serially, and each one runs only if the previous stage succeeded. Each stage has a name, which is used both in the front end and for fetching artifacts produced in that stage.

In the preceding example, I gave the stage the attribute cleanWorkingDir="true", which makes GoCD delete files created during the previous build and discard changes to files under version control. This tends to be a good option to use; otherwise, you might unknowingly slide into a situation in which a previous build affects the current build, which can be really painful to debug.

Jobs are potentially executed in parallel within a stage and have names for the same reasons that stages do. The jobs only run in parallel if several agents are available to run them.

The GoCD agent serially executes the tasks within a job. I tend to mostly use <exec> tasks (and <fetchartifact>, which you will see in the next chapter), which invoke system commands. They follow the UNIX convention of treating an exit status of zero as success and everything else as a failure.
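
The exit status convention can be illustrated with a small Bash model of a job. It is a sketch, not GoCD's scheduler: the function name run_tasks is hypothetical, and each task is simply a command string that is run in order until one of them returns a non-zero status.

```shell
#!/bin/bash
# Hypothetical model of how a job treats its tasks: run each command
# in order and stop at the first non-zero exit status.
run_tasks() {
    local task
    for task in "$@"; do
        if ! bash -c "$task"; then
            echo "task failed: $task" >&2
            return 1
        fi
    done
    echo "all tasks passed"
}

run_tasks 'true' 'echo building'          # both tasks succeed
run_tasks 'true' 'exit 3' 'echo skipped' || echo "job failed"
```

In the second call, the task that exits with status 3 stops the job, so the last task never runs.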

For more complex commands, I create shell, Perl, or Python scripts inside a Git repository and add the repository as a material to the pipeline, which makes them available during the build process, with no extra effort.

The <exec> task in our example invokes /bin/bash -c 'dpkg-buildpackage -b -us -uc'. This is a case of Cargo Cult Programming,11 because invoking dpkg-buildpackage directly works just as well. Ah well, we can revise this later…

dpkg-buildpackage -b -us -uc builds the Debian package and is executed inside the Git checkout of the source. It produces a .deb file, a .changes file, and possibly a few other files with metadata. They are created one level above the Git checkout, in the root directory of the pipeline.

Because these are the files that we want to work with later on, at least the .deb file, we let GoCD store them in an internal database called the artifact repository. That’s what the <artifact> tag in the configuration instructs GoCD to do.

The names of the generated package files depend on the version number of the built Debian package (which comes from the debian/changelog file in the Git repository), so it’s not easy to reference them by name later on. That’s where the dest="debian-packages/" comes into play: it makes GoCD store the artifacts in a directory with a fixed name. Later stages then can retrieve all artifact files from this directory by the fixed directory name.
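
To show why the fixed directory name helps, here is a minimal sketch of what a later step could do after fetching the directory: pick up every package file with a glob, without knowing the version. The function name list_packages and the file name in the demonstration are made up.

```shell
#!/bin/bash
# Lists the package files in a fixed directory without knowing the
# version number embedded in their names.
list_packages() {
    local dir=$1 file
    for file in "$dir"/*.deb; do
        # An unmatched glob stays literal, so skip non-existent names.
        [ -e "$file" ] && basename "$file"
    done
    return 0
}

# Demonstration with an empty stand-in file; the file name is made up.
demo=$(mktemp -d)
mkdir "$demo/debian-packages"
touch "$demo/debian-packages/python-matheval_0.1-3.1.1_all.deb"
list_packages "$demo/debian-packages"
rm -rf "$demo"
```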

The Pipeline in Action

If nothing goes wrong (and nothing ever does, right?), Figure 9-4 shows roughly what the web interface looks like after running the new pipeline.

Figure 9-4. Pipeline overview after a successful run of the build stage

Whenever there is a new commit in the Git repository, GoCD happily builds a Debian package and stores it for further use. Automated builds, yay!

Version Recycling Considered Harmful

When building a Debian package, the tooling determines the version number of the resulting package, by looking at the top of the debian/changelog file. This means that whenever somebody pushes code or documentation changes without a new changelog entry, the resulting Debian package has the same version number as the previous one.

Most Debian tooling assumes that the tuple of package name, version, and architecture uniquely identifies a revision of a package. Stuffing a new version of a package with an old version number into a repository is bound to cause trouble. Most repository-management software simply refuses to accept a copy of a package that recycles a version. On the target machine on which the package is to be installed, upgrading the package won’t do anything, if the version number stays the same.
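
The effect can be mimicked with a simple pre-upload check. Debian package files are conventionally named after the tuple of name, version, and architecture, so an identical file name already present in a repository directory signals a recycled version. This is only a sketch: real repository managers keep their own index, and the function name is_recycled is hypothetical.

```shell
#!/bin/bash
# Hypothetical pre-upload check: Debian package files are named
# <name>_<version>_<architecture>.deb, so an identical file name in
# the repository directory means the version is being recycled.
is_recycled() {
    local repo_dir=$1 package=$2
    [ -e "$repo_dir/$(basename "$package")" ]
}

# Demonstration with empty stand-in files and a made-up version.
repo=$(mktemp -d)
touch "$repo/python-matheval_0.1_all.deb"
if is_recycled "$repo" "build/python-matheval_0.1_all.deb"; then
    echo "version 0.1 already exists, refusing to upload"
fi
rm -rf "$repo"
```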

Constructing Unique Version Numbers

There are several sources that you can tap to generate unique version numbers.
  • Randomness (for example, in the form of UUIDs)

  • The current date and time

  • The Git repository itself

  • Several environment variables12 that GoCD exposes that can be of use

The last option is promising. GO_PIPELINE_COUNTER is a monotonic counter that increases each time GoCD runs the pipeline, so it is a good source for a version number. GoCD allows manual rerunning of stages, so it’s best to combine it with GO_STAGE_COUNTER. In terms of shell scripting, using $GO_PIPELINE_COUNTER.$GO_STAGE_COUNTER as a version string sounds like a decent approach.

But, there’s more. GoCD allows you to trigger a pipeline with a specific version of a material, so you can have a new pipeline run to build an old version of the software. If you do that, using GO_PIPELINE_COUNTER as the first part of the version string doesn’t reflect the use of the old code base.

git describe is an established way to count commits. By default, it prints the last tag in the repository, and if HEAD does not resolve to the same commit as the tag, it adds the number of commits since that tag and the abbreviated SHA1 hash prefixed by g, so, for example, 2016.04-32-g4232204 for the commit 4232204, which is 32 commits after the tag 2016.04. The option --long forces it to always print the number of commits and the hash, even when HEAD points to a tag.
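
To make the output format concrete, the following sketch splits a string in the --long format into its three parts with Bash parameter expansion. It works from the right, so tags that contain dashes are handled too. The function name parse_describe is hypothetical, and the input is the example string from the text rather than live output from git.

```shell
#!/bin/bash
# Splits "git describe --long" output of the form <tag>-<count>-g<hash>
# into its parts, working from the right so tags may contain dashes.
parse_describe() {
    local described=$1
    local hash=${described##*-g}       # text after the last "-g"
    local rest=${described%-g*}        # everything before it
    local count=${rest##*-}            # commits since the tag
    local tag=${rest%-*}               # the tag itself
    echo "tag=$tag commits=$count hash=$hash"
}

parse_describe 2016.04-32-g4232204
```

For the example string, this prints tag=2016.04 commits=32 hash=4232204.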

We don’t need the commit hash for the version number, so a shell script to construct a suitable version number looks like this.
#!/bin/bash
set -e
set -o pipefail
v=$(git describe --long |sed 's/-g[A-Fa-f0-9]*$//')
version="$v.${GO_PIPELINE_COUNTER:-0}.${GO_STAGE_COUNTER:-0}"
Bash’s ${VARIABLE:-default} syntax is a good way to make the script work outside a GoCD agent environment. This script requires a tag to be set in the Git repository. If there is none, it fails with this message from git describe:
fatal: No names found, cannot describe anything.
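
If failing on an untagged repository is not what you want, one possible variation is to fall back to a fixed base version. The sketch below wraps the same sed expression in a function that takes the git describe output as an argument. The function name build_version and the fallback value 0.0-0 are assumptions for illustration, not part of the script above.

```shell
#!/bin/bash
# Builds the version string from "git describe --long" output and the
# GoCD counters. An empty first argument (no tag found) falls back to
# a made-up base version instead of aborting; this fallback is an
# assumption, not part of the original script.
build_version() {
    local described=$1 base
    if [ -n "$described" ]; then
        base=$(echo "$described" | sed 's/-g[A-Fa-f0-9]*$//')
    else
        base="0.0-0"
    fi
    echo "$base.${GO_PIPELINE_COUNTER:-0}.${GO_STAGE_COUNTER:-0}"
}

# In a real checkout: build_version "$(git describe --long 2>/dev/null)"
GO_PIPELINE_COUNTER=7 GO_STAGE_COUNTER=1 build_version 2016.04-32-g4232204
build_version ""
```

With the counters set to 7 and 1, the first call prints 2016.04-32.7.1. Whether a silent fallback is preferable to a hard failure depends on how strictly you want to enforce tagging.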

Other Bits and Pieces Around the Build

Now that we have a unique version string, we must instruct the build system to use this version string. This works by writing a new entry in debian/changelog with the desired version number. The debchange tool automates this for us. A few options are necessary to make it work reliably.
export DEBFULLNAME='Go Debian Build Agent'
export DEBEMAIL='[email protected]'
debchange --newversion="$version" --force-distribution -b \
    --distribution="${DISTRIBUTION:-stretch}" 'New Version'
When we want to reference this version number in later stages in the pipeline (yes, there will be more), it’s handy to have it available in a file. It is also handy to have it in the output, so we need two more lines in the script.
echo $version
echo $version > ../version
And, of course, the script must trigger the actual build, as follows:
dpkg-buildpackage -b -us -uc

Plugging It into GoCD

To make the script accessible to GoCD, and also have it under version control, I put the script into a Git repository, under the name debian-autobuild, and added the repository as a material to the pipeline (Listing 9-4).
<pipeline name="python-matheval">
  <materials>
    <git
        url="https://github.com/python-ci-cd/python-matheval.git"
        dest="source" materialName="python-matheval" />
    <git
        url="https://github.com/python-ci-cd/deployment-utils.git"
        dest="deployment-utils" materialName="deployment-utils" />
  </materials>
  <stage name="build" cleanWorkingDir="true">
    <jobs>
      <job name="build-deb" timeout="5">
        <tasks>
          <exec command="../deployment-utils/debian-autobuild"
                workingdir="source" />
        </tasks>
        <artifacts>
          <artifact src="version" type="build"/>
          <artifact src="*.deb" dest="debian-packages/"
            type="build" />
        </artifacts>
        <resources>
          <resource>debian-stretch</resource>
          <resource>build</resource>
        </resources>
      </job>
    </jobs>
  </stage>
</pipeline>
Listing 9-4. GoCD Configuration for Building Packages with Distinct Version Numbers

Now, GoCD automatically builds Debian packages on each commit to the Git repository and gives each a distinct version string.

9.4 Summary

GoCD is an open source tool that can poll your Git repositories and trigger the build through dedicated agents. It is configured through a web interface, either by clicking through assistants or providing an XML configuration.

Care must be taken to construct meaningful version numbers for each build. Git tags, the number of commits since the last tag, and counters exposed by GoCD are useful components with which to construct such version numbers.
