Before we start building a Puppet server, we'll stop to review some essential considerations. Understanding these details will save unnecessary and time-consuming rebuilds later.
To properly explain the functionality provided by a server for Puppet nodes, we’ll start by reviewing what we have previously covered about how Puppet builds the resource catalog.
A node is a discrete system that could be managed by a Puppet agent. You can probably name the conventional node types quickly:
There are also many node types capable of running Puppet agents that you might not be aware of. Here's a short but incomplete list of unconventional node types:
In summary, any device or system that can run a Puppet agent can be a Puppet node. As more and more devices every day contain small computers and participate in the Internet of Things, more and more things will have Puppet agents.
The Puppet agent evaluates and applies Puppet resources (from the catalog) on a node. Following is a more detailed review of the Puppet evaluation process when puppet apply is used to evaluate resources on the node. We have presented all of this functionality in the first two parts of this book. The agent:

- Uses the environment value from the configuration file or command-line input
- Evaluates if, unless, and other conditional attributes
- Evaluates onlyif, creates, and other conditional attributes to determine which resources in the catalog should be applied to the node

Although Puppet agent is currently a Ruby application, it is planned that many agents will become native applications that do not require Ruby in the near future. At convergence time the Puppet agent processes only the Puppet catalog, and thus does not need to process Ruby source code.
When you use a Puppet master or Puppet Server (both collectively referred to as a Puppet server for the remainder of the book), the server takes over the process of evaluating the data and compiling a catalog for the node to evaluate. This moves the core processing and policy evaluation to the server, like so:
This configuration is by far the most common and well-supported configuration for a Puppet deployment. Only the server requires access to the raw Puppet modules and their data sources. This provides many advantages:
This doesn’t mean that every environment must use a Puppet server, or that it is the correct solution for every need. In Part IV we’ll discuss the pros and cons of both server and serverless Puppet environments, and why you may want to evaluate each one.
This section will discuss some important considerations when you are adding a server to a Puppet environment. There are some choices that are important to get right. Many of these are subtle, and you won’t feel the pain until your environment has grown a bit. So I’m going to share with you what I and many others have found to be best practices for creating a server-oriented Puppet environment.
If you are new to Puppet, read these as ways you can avoid some pain in the future. If you have an existing Puppet environment, you may already feel the pain around the early choices. Use these as suggestions for ways to improve your environment.
You need to select a unique name for the Puppet server. This name will be around for the entire lifetime of this Puppet environment, so it should be easy to remember and use. If you are a small company, or use a discrete domain for the nodes you will be managing, you may want to name the service puppet.example.com or something equally straightforward and easy to remember.
You will choose a node or nodes on which to run Puppet server. Do not confuse the node with the Puppet Server service. Do not name the node puppet.example.com.
This probably seems counterintuitive at the moment, so let me share some crucial information here. When you start the Puppet server service, it will create a new certificate authority with the name you assign. Every Puppet agent that uses the service must acquire a certificate signed by this authority. This means that renaming the service later will be a huge effort, involving recertification of every node in the environment.
Puppet server is an application service that can be hosted anywhere. The node on which it runs today may change, or be replaced with a cluster of nodes. If the server is not named for a specific node, upgrades can be simple application migrations with zero changes to the end nodes.
Use a globally unique name for your Puppet server that can easily be moved or shared between multiple servers.
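One straightforward way to keep the service name independent of any particular host is a DNS alias. Here is a hypothetical zone fragment (the hostnames are illustrative):

```
; puppet.example.com points at whichever node currently hosts the service
puppet    IN  CNAME  puppetserver01.example.com.
```

Migrating the service to a new node then requires only a DNS change, with no reconfiguration of the agents.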
To set the globally unique name of the server, use the following configuration settings in /etc/puppetlabs/puppet/puppet.conf:

```ini
[agent]
    server = puppet.example.com

[master]
    certname = puppet.example.com
```
A very common mistake is to set the certname of the node to be the same as the name of the Puppet server. This creates a different kind of problem. Don't do this:

```ini
[main]
    certname = puppet.example.com   # bad idea: server and agent are distinct
```
Each node that connects to the Puppet server must have a unique name, and a unique TLS key and certificate. When the node shares the server’s key and certificate, what happens when the Puppet server is migrated to a new node? You guessed it—conflict. Both of the nodes cannot use the same key. Even if the old node is shut down, the history and reports for the new node will be invalid, and could corrupt the new node by using data from the old node.
On the node that hosts the Puppet server, configure Puppet agent to use the node’s fully qualified and unique hostname.
To configure this properly, ensure that only the Puppet server uses the certificate authority’s name. Use the following configuration settings in /etc/puppetlabs/puppet/puppet.conf:
```ini
[main]
    certname = $fqdn   # default value, could be left out

[agent]
    server = puppet.example.com

[master]
    certname = puppet.example.com
```
With this configuration, the node that hosts the Puppet server is a node like any other. If the server moves to another node, there is no conflict between the node identities.
The Puppet Server runs as a nonprivileged user account. I believe it is a mistake for the Puppet Server to use the same directories and paths as the Puppet agent, which is running as a privileged user on the node. Furthermore, it is easier to migrate or duplicate the Puppet Server when the service’s files are in their own file hierarchy.
For reasons completely unclear to the author, Puppet Server stores TLS key and certificate data for its nodes within /etc/puppetlabs/. For a typical managed node, which will create a single TLS key pair, this makes sense. The /etc/ directory should contain static configuration files.
In contrast, a Puppet server will create, alter, and remove files constantly to manage keys of every authorized node. In virtualized, autoscaled environments new Puppet clients come and go. Some of my environments build and destroy 9,000 test nodes in a single day. This makes the TLS directory volatile, and completely unsuitable for placement within /etc/.
Likewise, the default settings place the highly volatile node state and report data within the /opt/ directory or filesystem, another location that is not expected to contain volatile data.
Use the following configuration settings in /etc/puppetlabs/puppet/puppet.conf:
```ini
[user]
    vardir = /var/opt/puppetlabs/server
    ssldir = $vardir/ssl

[master]
    vardir = /var/opt/puppetlabs/server
    ssldir = $vardir/ssl
```
The remainder of this book assumes that you have made this change. If you do not use these recommended settings, you’ll need to remember that the default locations are as follows:
```ini
[master]
    vardir = /opt/puppetlabs/puppet/cache
    ssldir = /etc/puppetlabs/puppet/ssl
```
Functions are executed by the process that builds the catalog. When a server builds the catalog, the functions are run by the server. This has three immediate concerns:

- The debug(), info(), notice(), warning(), and err() functions will output to the server's logs.
- If your functions read from local data sources in a puppet apply environment, you may need to take steps to centralize the data sources.
- In a puppet apply configuration it is possible to write functions that read data from the node. In a server-based environment, the only node-specific data available to the server comes from the node's facts. You must change the functions into custom facts that supply the same data.

With Puppet 4 you have two different products that can function as a Puppet server. At this point in time they provide similar feature sets. Which one should you use?
It turns out that this question is very easy to answer. Let's review the two products.
Puppet 4 includes the existing, well-known puppet master
Rack application. If you are using any prior version of Puppet with a server (or “puppetmaster”), this is what you are using today.
The Puppet master is a Rack application. It comes with a built-in application server known as WEBrick, which can accept only two concurrent connections. Any site larger than a few dozen nodes would easily hit the limitations of the built-in server.
The only scalable way to run a Puppet master was under a Rack application server such as Phusion Passenger. This allowed you to tune the platform to support a large number of nodes quickly and well. For most purposes, tuning a Puppet master was tuning the Passenger application server.
The following chapter will cover how to install, configure, and tune a Puppet master running under Phusion Passenger. Puppet 4’s all-in-one installation has made this process easier than ever before.
If you already have Puppet master deployed and working well in your environment, you’ll find it very easy to upgrade to Puppet 4’s master. You already have the knowledge and experience to maintain and run it. The fairly minor changes you’ll make to upgrade to Puppet 4 are covered in this chapter.
The new, written-from-scratch Puppet Server is a self-standing product intended to extend and expand on the services previously provided by puppet master
. Puppet Server is a drop-in replacement for a Puppet master, allowing you to use it in an existing environment with no changes to the Puppet agents.
Many of the changes in Puppet Server stem from a change in the technology stack: Puppet Server is written in Clojure and runs on the Java Virtual Machine rather than Ruby. This ties it more closely to PuppetDB and other Puppet Labs products, and will make it easier to leverage the technologies together. Puppet Labs intends to provide powerful extension points in the future based around this platform.
Puppet Server will continue to support Ruby module plugins using the JRuby interpreter. You can continue to develop facts, features, functions, and all other module plugins in Ruby.
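As a small sketch of such a plugin, the following hypothetical custom fact (the fact name, file path, and marker file are illustrative assumptions, not from this book) loads identically under Puppet Server's JRuby and the agent's Ruby:

```ruby
# <module>/lib/facter/role.rb -- a hypothetical custom fact
Facter.add(:role) do
  setcode do
    # Report the contents of a node-local marker file, if present
    File.exist?('/etc/role') ? File.read('/etc/role').chomp : 'unknown'
  end
end
```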
Puppet Server was designed with speed in mind, and boasts some impressive benchmarks when compared to the Puppet master. In addition, the new service provides significantly more visibility into the server process. The new metrics can be made available through open standards like JMX to popular analytics tools like Graphite.
You’ll notice that I didn’t provide a side-by-side feature comparison of the two choices. This is due to one factor that makes your choice self-evident:
Puppet 4 is the final version with Puppet master. If you already have a Puppet environment, the next chapter will help you upgrade a Puppet master with few changes to an existing environment. The Puppet 4 master will snap right into the Rack application environment you already know how to support.
However, all development of Puppet master has ceased. You need to plan for migration to Puppet Server, where all new features will appear.
If you are brand new to Puppet and don’t yet have a Puppet master server, ignore the Puppet master and start immediately with Puppet Server. Due to the complete change of technology stack, there is nothing about configuring or enabling the Puppet master Rack application that will help make you smarter or better with Puppet. Skip the next chapter entirely and go straight to Chapter 21.
Once you have finished testing and wish to deploy a Puppet server in production, there are two issues to consider for performance: available RAM and fast CPUs.
On a server with sufficient RAM for the Puppet server threads, the Puppet modules and Hiera data will be cached in file buffer memory. The only disk I/O will be to store Puppet node reports (if enabled), and to write out module or data changes. Disk usage will be write-heavy, generally around 10:1 writes versus reads.
Memory utilization tends to be very stable. You will have more active threads when you have more concurrent nodes connecting at the same time. If your server is running low on memory, increase it until the file cache becomes stable and consistent. As I’m sure you know, any free memory not used by processes will be used for file cache. The Puppet modules and Hiera data should always reside in the file buffer cache.
The vast majority of the server’s time will be spent compiling catalogs for the nodes, which is CPU-intensive. Other than ensuring enough memory is available, the most important things you need to consider for server performance are CPU cores, CPU cores, and more CPU cores.
If you exceed the capacity of a single Puppet server, you can scale Puppet servers horizontally. See “Deploying Puppet Servers at Scale” near the end of Part III.
1 Bug PUP-4376 tracks this obvious mistake.