Chapter 19. Preparing for a Puppet Server

Before we start building a Puppet server, we’ll stop to review some essential considerations:

  • Why a Puppet server changes the catalog build process
  • How to build a Puppet server that will be easy to move or upgrade
  • Whether to use Puppet master or Puppet Server with your nodes

Understanding these important details will save unnecessary and time-consuming rebuilds later.

Understanding the Catalog Builder

To properly explain the functionality provided by a server for Puppet nodes, we’ll start by reviewing what we have previously covered about how Puppet builds the resource catalog.

Node

A node is a discrete system that could be managed by a Puppet agent. You can probably name the conventional node types quickly:

  • A physical computer system, like an Oracle Sun X5-4 or Dell PowerEdge R720
  • A virtualized operating system, like Red Hat Enterprise Linux, running on a VMware ESX or OpenStack instance
  • A virtualized operating system running on a public cloud provider like Amazon Web Services (AWS) or Google Compute Engine

There are many types of nodes that can run Puppet agents which you might not be aware of. Here’s a short but incomplete list of unconventional node types:

  • Routers, switches, and VPN concentrators supplied by Juniper, Cisco, Arista, and other network device vendors come with integrated Puppet agents.
  • Virtualization platforms like OpenStack can use Puppet agent to configure and manage the hypervisor.
  • Container technologies like Docker can be deployed and managed by Puppet agents.
  • Puppet agents can be isolated under different users or paths on the same system.

In summary, any device or system that can run a Puppet agent can be a Puppet node. As more and more devices every day contain small computers and participate in the Internet of Things, more and more things will have Puppet agents.

Agent

The Puppet agent evaluates and applies Puppet resources (from the catalog) on a node. Following is a more detailed review of the process when puppet apply is used to build and apply a catalog on the node. We have presented all of this functionality in the first two parts of this book.

  1. Gathers node data
    • Reads the environment value from the configuration file or command-line input
    • Retrieves plugins including functions and facts
    • Runs Facter to generate the node’s facts—both built-in and custom facts added by Puppet modules
    • Selects the node name from the configuration file or fact data
  2. Builds a catalog of resources for the node
    • Queries a node terminus (if configured) to obtain node information
    • Evaluates the main manifest, node terminus, and Hiera data to create a list of classes to apply
    • Performs all variable assignment to resolve variables into values within the catalog
    • Evaluates if, unless, and other conditional attributes
    • Performs iterations to build resources or map values
    • Executes functions configured in classes to provide module-specific data
    • Executes functions configured in the environment to provide environment-specific data
  3. Evaluates (or converges) the catalog on the node
    • Evaluates the onlyif, creates, and other conditional attributes to determine which resources in the catalog should be applied to the node
    • Creates a dependency graph based on ordering attributes and automatic dependencies
    • Compares the state of each resource, and makes the necessary changes to bring the resource into compliance with the policy
    • Creates a report containing all events processed during the agent run
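The difference between the build-time conditionals in step 2 and the apply-time conditionals in step 3 can be sketched in a short manifest. Here the if statement is resolved while the catalog is built, whereas creates is checked by the agent when the catalog is converged (the command and paths are hypothetical examples):

```puppet
# Build-time: this 'if' is resolved during catalog compilation, using facts.
if $facts['os']['family'] == 'RedHat' {
  # Apply-time: 'creates' is evaluated by the agent when the catalog is
  # converged; the exec runs only if the file does not yet exist.
  exec { 'build-cache':
    command => '/usr/local/bin/build-cache',  # hypothetical command
    creates => '/var/cache/example/built',    # hypothetical path
  }
}
```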

Although Puppet agent is currently a Ruby application, it is planned that many agents will become native applications and not require Ruby in the near future. At convergence time the Puppet agent processes only the Puppet catalog, and thus does not need to process Ruby source code.

Server

When you use a Puppet master or Puppet Server (both collectively referred to as a Puppet server for the remainder of the book), the server takes over the process of evaluating the data and compiling a catalog for the node to evaluate. This moves the core processing and policy evaluation to the server, like so:

  1. The Puppet agent submits name, environment, and facts to the Puppet server.
  2. The Puppet server builds a Puppet catalog for the node.
  3. The Puppet agent applies the catalog’s resources on the node.

This is by far the most common and best-supported configuration for a Puppet deployment. Only the server requires access to the raw Puppet modules and their data sources. This provides many advantages:

  • Removes the need to distribute code and data down to every node
  • Reduces computational cost on the end node
  • Provides auditing and controls for some security requirements
  • Simplifies network access requirements for internal data sources

This doesn’t mean that every environment must use a Puppet server, or that it is the correct solution for every need. In Part IV we’ll discuss the pros and cons of both server and serverless Puppet environments, and why you may want to evaluate each one.

Right now, let’s show you how to build your own server.

Planning for Puppet Server

This section will discuss some important considerations when you are adding a server to a Puppet environment. There are some choices that are important to get right. Many of these are subtle, and you won’t feel the pain until your environment has grown a bit. So I’m going to share with you what I and many others have found to be best practices for creating a server-oriented Puppet environment.

If you are new to Puppet, read these as ways you can avoid some pain in the future. If you have an existing Puppet environment, you may already feel the pain around the early choices. Use these as suggestions for ways to improve your environment.

The Server Is Not the Node

You need to select a unique name for the Puppet server. This name will be around for the entire lifetime of this Puppet environment, so it should be easy to remember and use. If you are a small company, or use a discrete domain for the nodes you will be managing, you may want to name the service puppet.example.com or something equally straightforward and easy to remember.

You will choose a node or nodes on which to run Puppet server. Do not confuse the node with the Puppet Server service. Do not name the node puppet.example.com.

This probably seems counterintuitive at the moment, so let me share some crucial information here. When you start the Puppet server service, it will create a new certificate authority with the name you assign. Every Puppet agent that will use the service must acquire a certificate signed by this authority. This means that renaming the service later will be a huge effort, involving recertification of every node in the environment.

Puppet server is an application service that can be hosted anywhere. The node on which it runs today may change, or be replaced with a cluster of nodes. If the server is not named for a specific node, upgrades can be simple application migrations with zero changes to the end nodes.

Best Practice

Use a globally unique name for your Puppet server that can easily be moved or shared between multiple servers.

To set the globally unique name of the server, use the following configuration settings in /etc/puppetlabs/puppet/puppet.conf:

[agent]
    server = puppet.example.com

[master]
    certname = puppet.example.com

The Node Is Not the Server

A very common mistake is to set the certname of the node to be the same as the name of the Puppet server. This creates a different kind of problem. Don’t do this:

[main]
  certname = puppet.example.com   # bad idea. server and agent are distinct

Each node that connects to the Puppet server must have a unique name, and a unique TLS key and certificate. When the node shares the server’s key and certificate, what happens when the Puppet server is migrated to a new node? You guessed it—conflict. Both of the nodes cannot use the same key. Even if the old node is shut down, the history and reports for the new node will be invalid, and could corrupt the new node by using data from the old node.

Best Practice

On the node that hosts the Puppet server, configure Puppet agent to use the node’s fully qualified and unique hostname.

To configure this properly, ensure that only the Puppet server uses the certificate authority’s name. Use the following configuration settings in /etc/puppetlabs/puppet/puppet.conf:

[main]
    certname = $fqdn      # default value, could be left out

[agent]
    server = puppet.example.com

[master]
    certname = puppet.example.com

With this configuration, the node that hosts the Puppet server is a node like any other. If the server moves to another node, there is no conflict between the node identities.

Store Server Data Files Separately

The Puppet Server runs as a nonprivileged user account. I believe it is a mistake for the Puppet Server to use the same directories and paths as the Puppet agent, which is running as a privileged user on the node. Furthermore, it is easier to migrate or duplicate the Puppet Server when the service’s files are in their own file hierarchy.

For reasons completely unclear to the author, Puppet Server stores TLS key and certificate data for its nodes within /etc/puppetlabs/. For a typical managed node, which will create a single TLS key pair, this makes sense. The /etc/ directory should contain static configuration files.

In contrast, a Puppet server will create, alter, and remove files constantly to manage keys of every authorized node. In virtualized, autoscaled environments new Puppet clients come and go. Some of my environments build and destroy 9,000 test nodes in a single day. This makes the TLS directory volatile, and completely unsuitable for placement within /etc/.

Warning
Placing the volatile TLS certificate repository within the /etc directory on a Puppet server violates the Filesystem Hierarchy Standard (FHS)[1] and the expectations of every experienced Unix/Linux administrator.

Likewise, the default settings place the highly volatile node state and report data within the /opt/ directory, which is also not expected to contain volatile data.

Best Practice

Configure your Puppet server to place all volatile files within the /var filesystem.

Use the following configuration settings in /etc/puppetlabs/puppet/puppet.conf:

[user]
  vardir = /var/opt/puppetlabs/server
  ssldir = $vardir/ssl

[master]
  vardir = /var/opt/puppetlabs/server
  ssldir = $vardir/ssl
Tip
Even though SSL has been renamed TLS, the directory and variable names have not yet been updated.

The remainder of this book assumes that you have made this change. If you do not use these recommended settings, you’ll need to remember that the default locations are as follows:

[master]
  vardir = /opt/puppetlabs/puppet/cache
  ssldir = /etc/puppetlabs/puppet/ssl

Functions Run on the Server

Functions are executed by the process that builds the catalog. When a server builds the catalog, the functions run on the server. This raises three immediate concerns:

Log messages will be in the server logs, not on the node
Log messages created by the debug(), info(), notice(), warning(), and err() functions will output to the server’s logs.
Data used by the catalog must be local to the server, or supplied by facts
In most situations it is easier to centralize data on a server than to distribute it to every node. However, if migrating from a puppet apply environment, you may need to take steps to centralize the data sources.
The only method to provide local node data to the server is facts
In a puppet apply configuration it is possible to write functions that read data from the node. In a server-based environment, the only node-specific data available to the server comes from the node’s facts. You must change the functions into custom facts that supply the same data.
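As a sketch of the simplest conversion, Facter also supports external facts: a plain key=value file dropped into the facts.d directory becomes node data the server can see at catalog-build time (the fact name and value here are hypothetical examples):

```
# /etc/puppetlabs/facter/facts.d/app_release.txt
# Each key=value line in this external fact file becomes a fact that is
# submitted to the server with the node's other facts on every agent run.
app_release=1.4.2
```

More complex lookups can be written as custom fact plugins in Ruby within a module, which pluginsync distributes to every agent automatically.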

Choosing Puppet Master Versus Puppet Server

With Puppet 4 you have two different products that can function as a Puppet server. At this point in time they provide similar feature sets. Which one should you use?

It turns out that this question is very easy to answer. Let’s review the two products.

Upgrading Easily with Puppet Master

Puppet 4 includes the existing, well-known puppet master Rack application. If you are using any prior version of Puppet with a server (or “puppetmaster”), this is what you are using today.

Note
Puppet Enterprise 3.7 and above use the Puppet Server product described in the next section.

The Puppet master is a Rack application. It comes with a built-in application server known as WEBrick, which can accept only two concurrent connections. Any site larger than a few dozen nodes would easily hit the limitations of the built-in server.

The only scalable way to run a Puppet master was under a Rack application server such as Phusion Passenger. This allowed you to tune the platform to support a large number of nodes quickly and well. For most purposes, tuning a Puppet master was tuning the Passenger application server.

The following chapter will cover how to install, configure, and tune a Puppet master running under Phusion Passenger. Puppet 4’s all-in-one installation has made this process easier than ever before.

If you already have Puppet master deployed and working well in your environment, you’ll find it very easy to upgrade to Puppet 4’s master. You already have the knowledge and experience to maintain and run it. The fairly minor changes you’ll make to upgrade to Puppet 4 are covered in this chapter.

Warning
The Puppet 4 master supports only Puppet 4 clients, so you will be forced to upgrade all clients to Puppet 4.

Embracing the Future with Puppet Server

The new, written-from-scratch Puppet Server is a standalone product intended to extend and expand on the services previously provided by puppet master. Puppet Server is a drop-in replacement for a Puppet master, allowing you to use it in an existing environment with no changes to the Puppet agents.

Many of the changes in Puppet Server come from a change in the technology stack: Puppet Server is written in Clojure and runs on the Java Virtual Machine rather than Ruby. This ties it more closely to PuppetDB and other Puppet Labs products, and will make it easier to leverage the technologies together. Puppet Labs intends to provide powerful extension points in the future based around this platform.

Puppet Server will continue to support Ruby module plugins using the JRuby interpreter. You can continue to develop facts, features, functions, and all other module plugins in Ruby.

Puppet Server was designed with speed in mind, and boasts some impressive benchmarks when compared to the Puppet master. In addition, the new service provides significantly more visibility into the server process. The new metrics can be made available through open standards like JMX to popular analytics tools like Graphite.

Tip
Puppet Server is backward compatible with Puppet 3 clients. This can be essential if a Puppet agent cannot be upgraded for one reason or another.

Why There’s Really No Choice

You’ll notice that I didn’t provide a side-by-side feature comparison of the two choices. This is due to one factor that makes your choice self-evident:

  • Puppet master has been deprecated, and will not exist in Puppet 5.

Puppet 4 is the final version with Puppet master. If you already have a Puppet environment, the next chapter will help you upgrade a Puppet master with few changes to an existing environment. The Puppet 4 master will snap right into the Rack application environment you already know how to support.

However, all development of Puppet master has ceased. You need to plan for migration to Puppet Server, where all new features will appear.

If you are brand new to Puppet and don’t yet have a Puppet master server, ignore the Puppet master and start immediately with Puppet Server. Due to the complete change of technology stack, there is nothing about configuring or enabling the Puppet master Rack application that will help make you smarter or better with Puppet. Skip the next chapter entirely and go straight to Chapter 21.

Ensuring a High-Performance Server

Once you have finished testing and wish to deploy a Puppet server in production, there are two issues to consider for performance: available RAM and fast CPUs.

On a server with sufficient RAM for the Puppet server threads, the Puppet modules and Hiera data will be cached in file buffer memory. The only disk I/O will be to store Puppet node reports (if enabled), and to write out module or data changes. Disk usage will be write-heavy, generally around 10:1 writes versus reads.

Memory utilization tends to be very stable. You will have more active threads when you have more concurrent nodes connecting at the same time. If your server is running low on memory, increase it until the file cache becomes stable and consistent. As I’m sure you know, any free memory not used by processes will be used for file cache. The Puppet modules and Hiera data should always reside in the file buffer cache.

The vast majority of the server’s time will be spent compiling catalogs for the nodes, which is CPU-intensive. Other than ensuring enough memory is available, the most important things you need to consider for server performance are CPU cores, CPU cores, and more CPU cores.
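If you run Puppet Server, the number of concurrent catalog compilations is governed by its JRuby instance pool. A common starting point, sketched below, is one instance per CPU core; the value 4 is only an example, and this setting applies to Puppet Server, not the Rack-based master:

```
# /etc/puppetlabs/puppetserver/conf.d/puppetserver.conf (excerpt)
jruby-puppet: {
    # Each JRuby instance compiles one catalog at a time; a common rule
    # of thumb is one per CPU core, leaving headroom for the JVM itself.
    max-active-instances: 4
}
```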

If you exceed the capacity of a single Puppet server, you can scale Puppet servers horizontally. See “Deploying Puppet Servers at Scale” near the end of Part III.

[1] Bug PUP-4376 tracks this obvious mistake.
