Modules can make use of a tremendous number of additional features. This chapter covers parameter type validation, Hiera data lookups, subclasses, defined resource types, variable scope, and resource defaults.
Let’s get started.
Puppet 4 introduced a new type system that can validate input parameters. This improves code quality and readability.
In older versions of Puppet, it was common to perform input validation like this:
    class puppet (
      # input parameters and default values for the class
      $version  = 'latest',
      $status   = 'running',
      $enabled  = true,
      $datetime = '',
    ) {
      validate_string($version)
      validate_string($status)
      validate_bool($enabled)
      # can't validate or convert timestamps

      ...resources defined below...
While this looks easy with three variables, it could consume pages of code if there are a lot of input variables. In my experience with larger modules, it wasn’t uncommon for the first resource defined in a manifest to be down below line 180.
Puppet 4 allows you to define the type when declaring the parameter now, which both shortens the code and improves its readability. When declaring the parameter, simply add the type before the variable name. This declares the parameter and adds validation for the input on a single line.
I think you’ll agree the following is significantly more readable:
    class puppet (
      # input parameters and default values for the class
      String                    $version  = 'latest',
      Enum['running','stopped'] $status   = 'running',
      Boolean                   $enabled  = true,
      Timestamp                 $datetime = Timestamp.new(),
    ) {
      ...class resources...
    }
It is not necessary for you to declare a type. If you are passing in something that can contain multiple types of data, simply leave the definition without a type as shown in the previous chapter. A parameter without an explicit type defaults to a type named Any.
Use explicit data types for all input parameters. Avoid accepting ambiguous values that must be introspected before use.
You can (and should) also declare types for lambda block parameters:
    split($facts['interfaces'], ',').each |String $interface| {
      ...lambda block...
    }
The type system is hierarchical: you can allow multiple types to match by declaring a type higher in the tree. If you are familiar with Ruby object types, this will look very familiar:
- Data accepts any of:
  - Scalar accepts any of:
    - Boolean (true or false)
    - String (ASCII and UTF8 characters)
      - Enum (a specified list of String values)
      - Pattern (a subset of strings that match a given Regexp)
    - Numeric accepts either:
      - Float (numbers with periods; fractions)
      - Integer (whole numbers without periods)
    - Regexp (a regular expression)
    - SemVer (a semantic version)
    - Timespan (duration in seconds with nanosecond precision)
    - Timestamp (seconds since the epoch with nanosecond precision)
  - Undef (a special type that only accepts undefined values)
- SemVerRange (a contiguous range of SemVer versions)
- Collection accepts any of:
  - Array (list containing Data data types)
  - Hash (Scalar keys associated with Data values)
- CatalogEntry accepts any of:
  - Resource (built-in, custom, and defined resources)
  - Class (named manifests that are not executed until called)

As almost every common data value falls under Data, use it to accept any of these types. When you are testing types, $type =~ Data will also match a Collection that contains only Scalar values.
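As a minimal sketch of the Data check described above, the match operator can be applied to scalar and structured values alike. The variable names and values here are illustrative assumptions, not part of the puppet module:

```puppet
# Illustrative values only; none of these names come from the puppet module
$scalar_value = 'text'
$list_value   = ['one', 2, 3.0, true]
$hash_value   = { 'name' => 'jill', 'groups' => ['wheel', 'users'] }

# Test each value against Data before using it
[$scalar_value, $list_value, $hash_value].each |$value| {
  if $value =~ Data {
    notice("${value} is acceptable wherever Data is declared")
  } else {
    notice("${value} is not a Data value")
  }
}
```

Each of these values is built only from scalars, arrays, and hashes, so all three should take the first branch.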
There are a few special types that can match multiple values:
- Variant (matches any one of the listed types)
- Optional (matches the given type or an undefined value)
- NotUndef (matches any value that is not undefined)
- Tuple (an Array with specific data types in specified positions)
- Struct (a Hash with specified types for the key/value pairs)
- Iterable (a Type or Collection that can be iterated over; new in Puppet 4.4)
- Iterator (an Iterable type, used by chained iterator functions to act on elements individually instead of copying the entire Iterable type; new in Puppet 4.4)

There are a few types you will likely never use, but you might see them in error messages:
- Default (matches the default: key in case and selector statements)
- Callable
- Runtime

All of these types are considered type Any. There’s no point in declaring this type because you’d be saying that it accepts anything, when most of the time you want Data.
The type system allows for validation of not only the parameter type, but also of the values within structured data types. For example:
    Integer[13,19]           # teenage years
    Array[Float]             # an array containing only Float numbers
    Array[ Array[String] ]   # an array containing only Arrays of String values
    Array[ Numeric[-10,10] ] # an array of Integer or Float values between -10 and 10
For structured data types, the range parameters indicate the size of the structured type, rather than a value comparison. You can check both the element type and the size of a Collection (Array or Hash) by listing the type parameters followed by the minimum and maximum size:
    Array[String,0,10]    # an array of 0 to 10 string values
    Hash[String,Any,2,4]  # a hash with 2 to 4 pairs of string keys with any value
For even more specificity about key and value types for a hash, a Struct specifies exactly which data types are valid for the keys and values.
    # This is a Hash with a short string value and a floating-point value
    Struct[{
      day  => String[1,8],      # day is 1-8 characters in length
      temp => Float[-100,100],  # temp is floating-point Celsius
    }]
    # This is a hash that accepts only three well-known ducks as the duck value
    Struct[{
      duck  => Enum['Huey','Dewey','Louie'],
      loved => Boolean,
    }]
Like what a Struct does for a Hash, you can specify types of data for an Array using the Tuple type. The tuple can list specific types in specific positions, along with a specified minimum and optional maximum number of entries:
    # an array with three integers followed by one string (explicit)
    Tuple[Integer,Integer,Integer,String]
    # an array with two integers followed by 0-2 strings (length 2-4)
    Tuple[Integer,Integer,String,2,4]
    # an array with 1-3 integers, and 0-5 strings
    Tuple[Integer,Integer,Integer,String,1,8]
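That last one is pretty hard to understand, so let’s break it down: the three Integer entries and the String describe positions one through four, the minimum size of 1 means that only the first integer is required, and the maximum size of 8 allows the final String type to repeat through position eight.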
As you can see, the ability to be specific for many positions in an Array makes Tuple a powerful type for well-structured data.
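To make the size and position rules concrete, here is a small sketch that tests a few arrays against the last Tuple shown above. The sample values and labels are hypothetical, chosen only to exercise each part of the definition:

```puppet
# The type from the last Tuple example, stored in a variable for reuse
$shape = Tuple[Integer, Integer, Integer, String, 1, 8]

# Hypothetical sample values (not from any real module)
$samples = {
  'one integer'           => [1],
  'integers then strings' => [1, 2, 3, 'a', 'b'],
  'string in first slot'  => ['a'],
  'nine entries'          => [1, 2, 3, 'a', 'b', 'c', 'd', 'e', 'f'],
}

# Report whether each sample satisfies the Tuple
$samples.each |String $label, Array $value| {
  if $value =~ $shape {
    notice("${label}: matches")
  } else {
    notice("${label}: does not match")
  }
}
```

Under the rules described above, the first two samples should match, while the last two should fail on position type and maximum size respectively.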
The Variant and Optional types allow you to specify valid alternatives:
    Variant[Integer,Float]             # the same as Numeric type
    Variant[String,Array[String]]      # a string or an array of strings
    Variant[String,Undef]              # a string or nothing
    Optional[String]                   # same as the previous line
    Optional[Variant[String,Integer]]  # string, integer, or nada
You can also check the size of the value of a given type:
    String[12]         # a String at least 12 characters long
    String[1,40]       # a String between 1 and 40 characters long
    Array[Integer,3]   # an Array of at least 3 integers
    Array[String,1,5]  # an Array with 1 to 5 strings
You can use all of these together in combination:
    # An array of thumbs up or thumbs down values
    Array[ Enum['thumbsup','thumbsdown'] ]
    # An array of thumbs up, thumbs down, or integer from 1 to 5 values
    Array[ Variant[ Integer[1,5], Enum['thumbsup','thumbsdown'] ] ]
In addition to defining the type of parameters passed into a class, defined type, or lambda, you can perform explicit tests against values in a manifest. Use the =~ operator to compare a value against a type to determine if the value matches the type declaration. For instance, if a value could be one of several types, you could determine the exact type so as to process it correctly:
    if ($input_value =~ String) {
      notice("Received string ${input_value}")
    }
    elsif ($input_value =~ Integer) {
      notice("Received integer ${input_value}")
    }
The match operator can inform you if a variable can be iterated over:
    if ($variable =~ Iterable) {
      $variable.each() |$value| {
        notice($value)
      }
    }
    else {
      notice($variable)
    }
You can determine if a version falls within an allowed SemVerRange using the match operator:
    if ($version =~ SemVerRange('>=4.0.0 <5.0.0')) {
      notice('This version is permitted.')
    }
    else {
      notice('This version is not acceptable.')
    }
You can also determine if a type is available within a Collection with the in operator:
    if (String in $array_of_values) {
      notice('Found a string in the list of values.')
    }
    else {
      notice('No strings found in the list of values.')
    }
The with() function can be useful for type checking as well:
    with($password) |String[12] $secret| {
      notice("The secret '${secret}' is a sufficiently long password.")
    }
You can likewise test value types using case and selector expressions:
    case $input_value {
      Integer: {
        # the + operator only adds numbers, so interpolate the sum into the string
        $sum = $input_value + 10
        notice("Input plus ten equals ${sum}")
      }
      String: {
        notice('Input was a string, unable to add ten.')
      }
    }
You can test against Variant and Optional types as well:
    if ($input =~ Variant[String, Array[String]]) {
      notice('Values are strings.')
    }
    if ($input =~ Optional[Integer]) {
      notice('Value is a whole number or undefined.')
    }
A type compares successfully against its exact type, and parents of its type, so the following statements will all return true:
    'text' =~ String  # exact
    'text' =~ Scalar  # Strings are children of Scalar type
    'text' =~ Data    # Scalars are a valid Data type
    'text' =~ Any     # all types are children of Any
If you don’t like the default messages displayed when the catalog build fails due to a type mismatch, you can customize your own by using the assert_type() function with a lambda block:
    assert_type(String[12], $password) |$expected, $actual| {
      fail("Passwords less than 12 chars are easily cracked. (provided: ${actual})")
    }
You can evaluate strings against regular expressions to determine if they match using the same =~ operator. For instance, if you are evaluating filenames to determine which ones are Puppet manifests, the following example would be useful:
    $manifests = $filenames.filter |$filename| {
      $filename =~ /\.pp$/
    }
Puppet uses the Ruby core Regexp class for matching purposes. The following options are supported:
- i (case insensitive)
- m (multiline mode): the . wildcard also matches newlines
- x (free spacing)

Puppet doesn’t support options after the final slash. Instead, use the (?<options>:<pattern>) syntax to set options:
    $input =~ /(?i:food)/            # will match Food, FOOD, etc.
    $input =~ /(?m:fire.flood)/      # will match "fire flood", even across a line break
    $input =~ /(?x:fo \w $)/         # will match fog but not food or "fo g"
    $input =~ /(?imx:fire . flood)/  # will match "Fire Flood"
You can match against multiple exact strings and regular expressions with the Pattern type:
    $placeholder_names = $victims.filter |$name| {
      $name =~ Pattern['alice','bob','eve','^(?i:j.* doe)',/^(?i:j.* roe)/]
    }
A String on the righthand side of the matching operator is converted to a Regexp. This allows you to use interpolated variables in the regular expression:
    $drink_du_jour = 'coffee'
    $you_drank =~ "^${drink_du_jour}$"  # true if they drank coffee
You can also compare to determine if something is a Regexp type:
    /foo/ =~ Regexp  # true
    'foo' =~ Regexp  # false
This allows for input validation:
    if $input =~ Regexp {
      notify { 'Input was a regular expression': }
    }
You can compare to see if the regular expressions are an identical match:
    /foo/ =~ Regexp[/foo/]   # true
    /foo/ =~ Regexp[/foo$/]  # false
The Regexp type can use variable interpolation, by placing a variable within a string that is converted to the regular expression to match. Because this string requires interpolation, backslashes must be escaped:
    $nameregexp =~ Regexp["${first_name} [\\w\\-]+"]
The preceding example returns true if $nameregexp is a regular expression that looks for the first name input followed by another word.
Now that you’ve taken a tour of data type validation, let’s take a look at how the Puppet module could be revised to validate each parameter:
    class puppet (
      String                    $server          = 'puppet.example.com',
      String                    $version         = 'latest',
      Enum['running','stopped'] $status          = 'running',
      Boolean                   $enabled         = true,
      String                    $common_loglevel = 'warning',
      Optional[String]          $agent_loglevel  = undef,
      Optional[String]          $apply_loglevel  = undef,
    ) {
As written, each parameter value is now tested to ensure it contains the expected data type.
Instead of type String, it would be more specific to use the following for each log level:
Enum['debug','info','notice','warning','err','alert', 'emerg','crit','verbose']
I didn’t do this here due to page formatting reasons.
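As a sketch of what the stricter declaration could look like, the class header below repeats the parameters from the example above and swaps String for the Enum of log levels. The line breaks inside the type are only there to keep the lines short, and the class body is left empty on purpose:

```puppet
# Sketch only: same parameters as the example above, with log levels restricted
class puppet (
  String                    $server  = 'puppet.example.com',
  String                    $version = 'latest',
  Enum['running','stopped'] $status  = 'running',
  Boolean                   $enabled = true,

  # log levels limited to the values listed in the text
  Enum['debug','info','notice','warning','err',
    'alert','emerg','crit','verbose'] $common_loglevel = 'warning',
  Optional[Enum['debug','info','notice','warning','err',
    'alert','emerg','crit','verbose']] $agent_loglevel = undef,
  Optional[Enum['debug','info','notice','warning','err',
    'alert','emerg','crit','verbose']] $apply_loglevel = undef,
) {
  # resources omitted in this sketch
}
```

With this declaration, a misspelled value such as 'warn' would be rejected when the catalog is compiled rather than being written into the configuration.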
As discussed in Chapter 11, Hiera provides a configurable, hierarchical mechanism to look up input data for use in manifests.
Retrieve Hiera data values using the lookup() function call, like so:
    $version = lookup('puppet::version')
One of the special features of Puppet classes is automatic parameter lookup. Explicit hiera() or lookup() function calls are unnecessary. Instead, list the parameters in Hiera within the module’s namespace.
As we are still testing our module, let’s go ahead and define those values now in data for this one node at /etc/puppetlabs/code/hieradata/hostname/client.yaml:
    ---
    classes:
      - puppet

    puppet::version: 'latest'
    puppet::status: 'stopped'
    puppet::enabled: false
Without any function calls, these values will be provided to the puppet module as parameter input, overriding the default values provided in the class declaration.
Given that Hiera uses the :: separator to define the data hierarchy, you might think that it would be easier to define the input parameters as hash entries underneath the module. And yes, I agree that the following looks very clean:
    puppet:
      version: 'latest'
      status: 'stopped'
      enabled: false
Unfortunately, it does not work. The key for an input parameter must match the complete string of the module namespace plus the parameter name. You must write the file using the example shown immediately before this section.
By default, automatic parameter lookup will use the first strategy for lookup of Hiera data, meaning that the first value found will be returned. There are two ways to retrieve merged results of arrays and hashes from the entire hierarchy of Hiera data:
- Set the merge strategy in a lookup_options hash from the module data provider.
- Use the merge parameter of lookup() to override the default strategy.

An explicit lookup() function call will override the merge strategy specified in the module data. We will cover how to adjust the merge policy for module lookups in “Binding Data Providers in Modules”. This section will document how to perform explicit lookup calls.
In order to retrieve merged results of arrays or hashes from the entire hierarchy of Hiera data, utilize the lookup() function with the merge parameter. A complete example of the function with merge, default value, and value type checking parameters is shown as follows:
    $userlist = lookup({
      name          => 'users::users2add',
      value_type    => Array[String],
      default_value => [],
      merge         => 'unique',
    })
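The following merge strategies are currently supported:
- first: returns the first value found (called priority in older Hiera versions).
- unique: returns a flattened array of all unique values found (called array in older Hiera versions).
- hash: returns a hash merged from every level (called native in older Hiera versions). Ignores unique values of lower-priority keys.
- deep: returns a hash merged recursively, so that nested hashes and arrays are combined as well.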
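The deep strategy is the one not demonstrated elsewhere in this section, so here is a minimal sketch of it. The key users::accounts and the shape of its data (a hash of per-user attribute hashes) are assumptions made for illustration:

```puppet
# 'users::accounts' is a hypothetical key holding a hash of per-user hashes
$accounts = lookup({
  name          => 'users::accounts',
  value_type    => Hash[String, Hash],
  default_value => {},
  merge         => 'deep',
})

# Show the merged attributes for each account
$accounts.each |String $login, Hash $attributes| {
  notice("${login}: ${attributes}")
}
```

With a deep merge, a host-level file could override a single attribute of one account while the remaining attributes continue to come from lower-priority levels.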
The lookup() function will merge values from multiple levels of the Hiera hierarchy. For example, a users module for creating user accounts might expect data laid out like so:
    users::home_path: '/home'
    users::default_shell: '/bin/bash'
    users::users2add:
      - jill
      - jack
      - jane
If you wanted to add one user to a given host, you could create an override file for that host with just the user’s name:
    users::users2add:
      - jo
By default the automatic parameter lookup would find this entry and return it, making jo be the only user created on the system:
    [vagrant@client hieradata]$ puppet lookup users::users2add
    ---
    - jo
To merge all the answers together, you could instead look up all unique Hiera values, as shown here:
    [vagrant@client hieradata]$ puppet lookup --merge unique users::users2add
    ---
    - jo
    - jill
    - jack
    - jane
Apply this same merge option in the Puppet manifest to create an array of all users, as shown here:
    class users (
      # input parameters and default values for the class
      $home_path     = '/home',
      $default_shell = '/bin/bash',
    ) {
      $userlist = lookup({ name => 'users::users2add', merge => 'unique' })
    }
The $userlist variable will be assigned a unique, flattened array of values from all priority levels.
Sometimes a lookup does not return the value you expect. To see how the lookup is finding the values in the hierarchy, add the --explain option to your puppet lookup command.
    [vagrant@client hieradata]$ puppet lookup --explain users::users2add
    Searching for "users::users2add"
      Global Data Provider (hiera configuration version 5)
        Using configuration "/etc/puppetlabs/puppet/hiera.yaml"
        Hierarchy entry "yaml"
          Path "/etc/puppetlabs/code/hieradata/hostname/client.yaml"
            Original path: "hostname/%{facts.hostname}"
            Path not found
          Path "/etc/puppetlabs/code/hieradata/common.yaml"
            Original path: "common"
            Path not found
    ...iterates over every hierarchy layer...
The unfortunate aspect of the previous example is that you have to split out the parameter assignment to an explicit lookup() call, which doesn’t read well.
It is possible to specify the default merge strategy on a per-parameter basis.
Do this by creating a lookup_options hash in your data source with the full parameter name as a key. The value should be a hash of lookup options, exactly the same as used in the lookup() call shown in “Using Array and Hash Merges”.
    lookup_options:
      users::userlist:
        merge: unique
This allows you to simplify the class parameter lookup shown in the preceding section back to a single location:
    class users (
      # input parameters and default values for the class
      $home_path     = '/home',
      $default_shell = '/bin/bash',
      $userlist      = [],
    ) {
Adding a lookup_options hash to the data allows module authors to set a default merge strategy and other options for automatic parameter lookup. The user of the module can override the module author by declaring a lookup_options hash key in the global or environment data, which are evaluated at a higher priority.
You cannot use a lookup() query to retrieve the lookup_options hash. It is accessible only to the lookup() function and automatic parameter lookup.
Direct Hiera queries utilize only the original (global) lookup scope. Replace all hiera() queries with lookup() queries to make use of environment and module data providers.
If a module does direct Hiera queries, such as this:
    # Format: hiera( key, default )
    $status = hiera('specialapp::status', 'running')
replace them with one of the following two variants, depending on whether extra options like default values and type checking are necessary. The simplest form is identical to the original hiera() function with a single parameter.
    # Format: lookup( key )
    # Simple example assumes type Data (anything), no default value
    $status = lookup('specialapp::status')
The more complex form allows you to pass in a hash with optional attributes that define how the data is retrieved and validated. Here are some examples:
    # Perform type checking on the value
    # Provide a default value if the lookup doesn't succeed
    $status = lookup({
      name          => 'specialapp::status',
      value_type    => Enum['running','stopped'],
      default_value => 'running',
    })
Here is another example, which performs an array merge of all values from every level of the hierarchy:
    $userlist = lookup({
      name       => 'users2add',
      value_type => Array[String],
      merge      => 'unique',
    })
Here are some example replacements for the older Hiera functions:
    # replaces hiera_array('specialapp::id', [])
    lookup({
      name          => 'specialapp::id',
      merge         => 'unique',
      default_value => [],
      value_type    => Array[Data],
    })
    # replaces hiera_hash('specialapp::users', {})
    lookup({
      name          => 'specialapp::users',
      merge         => 'hash',
      default_value => {},
      value_type    => Hash[String, Data],
    })
The lookup() function will accept an array of attribute names. Each name will be looked up in order until a value is found. Only the result for the first name found is returned, although that result could contain merged values:
    lookup(
      ['specialapp::users', 'specialapp::usernames', 'specialapp::users2add'],
      { merge => 'unique' }
    )
The lookup() function can also pass names to a lambda and return the result it provides, as shown here:
    # Create a default value on request
    $salt = lookup('security::salt') |$key| {
      # ignore key; generate a random salt every time
      rand(2**256).to_s(24)
    }
When building a module you may find yourself with several different related components, some of which may not be utilized on every system. For example, our Puppet class should be able to configure both the Puppet agent and a Puppet server. In situations like this, it is best to break your module up with subclasses.
Each subclass is named within the scope of the parent class. For example, it would make sense to use puppet::agent as the name for the class that configures the Puppet agent.
Each subclass should be a separate manifest file, stored in the manifests directory of the module, and named for the subclass followed by the .pp extension. For example, our Puppet module could be expanded to have the following classes:
Class name | Filename |
---|---|
puppet | manifests/init.pp |
puppet::agent | manifests/agent.pp |
puppet::server | manifests/server.pp |
As our module currently only configures the Puppet agent, let’s go ahead and move all resources from the puppet class into the puppet::agent class. When we are done, the files might look like this:
    # manifests/init.pp
    class puppet (
      # common variables for all Puppet classes
      String $version  = 'latest',
      String $loglevel = 'warning',
    ) {
      # no resources in this class
    }

    # manifests/agent.pp
    class puppet::agent (
      # input parameters specific to agent subclass
      Enum['running','stopped'] $status = 'running',
      Boolean                   $enabled, # required parameter
    ) inherits puppet {
      # all of the resources previously defined
    }

    # manifests/server.pp
    class puppet::server () {
      # we'll write this in Part III of the book
    }
Any time you would need an if/then block in a module to handle different needs for different nodes, use subclasses instead for improved readability.
One last change will be to adjust Hiera to reflect the revised class name:
    # /etc/puppetlabs/code/hieradata/common.yaml
    classes:
      - puppet::agent

    puppet::common_loglevel: 'info'
    puppet::version: 'latest'
    puppet::agent::status: 'stopped'
    puppet::agent::enabled: false
    puppet::agent::server: 'puppetmaster.example.com'
With these small changes we have now made it possible for a node to have the Puppet agent configured, or the Puppet server configured, or both.
Puppet classes are what’s known as singletons. No matter how many places they are called with include or require() functions, only one copy of the class exists in memory. Only one set of parameters is used, and only one set of scoped variables exists.
There will be times that you may want to declare Puppet resources multiple times with different input parameters each time. For that purpose, create a defined resource type. Defined resource types are implemented by manifests that look almost exactly like subclasses:
Like the core Puppet resources, and unlike classes, defined resource types can be called over and over again. This makes them suitable for use within the lambda block of an iterator. We’ll use an iterator in our puppet class in the next section. To demonstrate the idea now, here’s an example from a users class.
    # modules/users/manifests/create.pp
    define users::create (
      String            $comment,
      Integer           $uid,
      Optional[Integer] $gid = undef,
    ) {
      # fall back to the uid when no gid was supplied
      $real_gid = $gid ? {
        undef   => $uid,
        default => $gid,
      }
      user { $title:
        uid     => $uid,
        gid     => $real_gid,
        comment => $comment,
      }
    }

    # modules/users/manifests/init.pp
    class users (
      Array[Hash] $userlist = [],
    ) {
      $userlist.each |$user| {
        users::create { $user['name']:
          uid     => $user['uid'],
          comment => $user['comment'],
        }
      }
    }
The create defined type will be declared once for every user in the array provided to the users module. Unlike a class, the defined type sees fresh input parameters each time it is called.
You might notice that this defined type utilizes a variable that wasn’t listed in the parameters. Just like a core resource, a defined resource type receives two parameters that aren’t named in the parameter list:
- $title: the title given to the resource in its declaration.
- $name: defaults to the value of $title but can be overridden in the declaration.

These attributes should be declared exactly the same way as any core Puppet resource.
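The difference between the two implicit parameters is easiest to see side by side. The following sketch uses a hypothetical defined type, users::announce, that exists only for this illustration and manages nothing:

```puppet
# Hypothetical defined type, e.g. modules/users/manifests/announce.pp
define users::announce () {
  # Both variables are available without appearing in the parameter list
  notice("title is '${title}', name is '${name}'")
}

# name is not given, so it defaults to the title
users::announce { 'jill': }

# name is overridden in the declaration; the title stays unchanged
users::announce { 'the second user':
  name => 'jack',
}
```

In the first declaration both variables should hold jill, while in the second the title and the name differ.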
Modules may only declare variables within the module’s namespace (also called scope). This is very important to remember when using subclasses within a module, as each subclass has its own scope.
    class puppet::agent {
      # these two definitions are the same and will produce an error
      $status = 'running'
      $::puppet::agent::status = 'running'
    }
A module may not create variables within the top scope or another module’s scope. Any of the following declarations will cause a build error:
    class puppet {
      # FAIL: can't declare top-scope variables
      $::version = '1.0.1'
      # FAIL: Can't declare variables in another class
      $::mcollective::version = '1.0.1'
      # FAIL: no, not even in the parent class
      $::puppet::version = '1.0.1'
    }
Variables can be prefaced with an underscore to indicate that they should not be accessed externally:
    # variable that shouldn't be accessed outside the current scope
    $_internalvar = 'something'
    # deprecated: don't access underscore-prefaced variables out of scope
    notice($::mymodule::_internalvar)
This is currently polite behavior rather than enforced; however, external access to internal variables will be removed in a future version of Puppet.
While you cannot change variables in other scopes, you can use them within the current scope:
    notice($variable)                # from current, parent, node, or top scope
    notice($::variable)              # from top scope
    notice($::othermodule::variable) # from a specific scope
The first invocation could return a value from an in-scope variable, or the same variable name defined in the parent scope, the node scope, or the top scope. A person would have to search the module to be certain a local scope variable wasn’t defined. Furthermore, a declaration added to the manifest above this could assign a value different from what you intended to use. The latter forms are explicit and clear about the source.
Always refer to out-of-scope variables with the explicit $:: root prefix for clarity.
Top-scope variables are defined from a crazy mishmash of places, including the parameters block of an ENC’s result, which can confuse and baffle you when you’re trying to debug problems in a module.
In my own perfect world, top-scope variables would cease to exist and be replaced entirely by hashes of data from each source. That said, top-scope variables are used in many places, and many Forge modules, and some of them have no other location from which to gather the data. If you are debugging a module, you’ll have to evaluate all of these sources to determine where a value came from.
Follow these rules to simplify debugging within your environment:
- Refer to facts only through the $facts[] hash.
- Enable trusted_server_facts and use the server-validated facts available in the $server_facts[] and $trusted[] hashes.

By following these rules you can safely assume that any top-level variable was set by Hiera or an ENC’s results.
Node scope is a special type of scope where variables could be defined that look like top-scope variables but are specific to a node assignment.
As discussed in “Assigning Modules to Nodes”, it was previously common to declare classes within node blocks. It was possible to declare variables within the node block, which would override top-scope variables if you were using the variable name without the $:: prefix.
It is generally considered best practice to avoid node blocks entirely and to assign classes to nodes using Hiera data, as documented in the section mentioned. However, it remains possible in Puppet 4 to declare node definitions, and to declare variables within the node definitions. These variables would be accessible as $variable, and hide the values defined in the top scope.
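The hiding behavior described above can be sketched in a few lines of a site manifest. The variable name, its values, and the node name are all hypothetical:

```puppet
# Hypothetical site manifest
$location = 'datacenter-default'   # top scope

node 'client.example.com' {
  # node scope: hides the top-scope value for unqualified references
  $location = 'rack-42'

  notice("unqualified: ${location}")     # resolves in node scope first
  notice("top scope:   ${::location}")   # explicit prefix reaches top scope
}
```

The unqualified reference should report the node-scope value, which is why the explicit $:: prefix is the safer way to read a top-scope variable.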
Parent scope for variables is the scope of the class which the current class inherits (for example, when a subclass inherits from the base class, as shown in “Building Subclasses”):
    class puppet::agent () inherits puppet {
      ...
    }
Previous versions of Puppet would use the class that declared the current class as the parent class. This caused significant confusion and inconsistent results when multiple modules/classes would declare a common class dependency.
As discussed in Chapter 6, it is possible to declare attribute defaults for a resource type.
It can be useful to change those defaults within a module or a class. To change them within a class scope, define the default within the class. To change them for every class in a module, define them in a defaults class and inherit it from every class in the module.
It’s not uncommon to place module defaults in the params class, as it is inherited by every class of the module to provide default values:
    class puppet::params () {
      # Default values
      $attribute = 'default value'

      # Resource defaults
      Package {
        install_options => '--enable-repo=epel',
      }
    }
As with variables, if a resource default is not declared in the class, it will use a default declared in the parent scope, the node scope, or the top scope. Unlike variables, parent scope is selected through dynamic scoping rules. This means that the parent class will be the class which declared this class if the class does not inherit from another class. Read that sentence twice, carefully.
As classes are singletons and thus instantiated only once, this means the parent scope changes depending on which class declares this class first. This can change as you add and remove modules from your environment.
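Puppet 4 provides the ability to implement named resource defaults which never bleed to other modules. It involves combining two techniques: a resource body titled default, which supplies attribute values to the other bodies in the same declaration, and the splat (*) attribute, which assigns attributes from a hash.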
By combining these techniques together, you can create resource defaults that can be applied by name. Let’s use this technique with the same default values shown in the previous section:
    $package_defaults = {
      'ensure'          => 'present',
      'install_options' => '--enable-repo=epel',
    }

    # Resource defaults
    package {
      default:
        * => $package_defaults;
      'puppet-agent':
        ensure => 'latest';
    }
This works exactly as if a Package {} default was created, but it will apply only to the resources that specifically use that hash for the defaults. This allows you to have multiple resource defaults, and select the appropriate one by name.
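Selecting among several named defaults might look like the following sketch. The hash names, package names, and the provider value are illustrative assumptions rather than part of the puppet module:

```puppet
# Two named sets of defaults (hypothetical names and values)
$epel_defaults = {
  'ensure'          => 'present',
  'install_options' => '--enable-repo=epel',
}
$base_defaults = {
  'ensure'   => 'present',
  'provider' => 'yum',
}

# Packages that use the EPEL defaults
package {
  default:
    * => $epel_defaults;
  'htop':
    ensure => 'latest';
}

# Packages that use the base defaults
package {
  default:
    * => $base_defaults;
  'rsync':
    ensure => 'latest';
}
```

Each declaration picks its defaults by naming a hash, so neither set affects package resources declared elsewhere.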
In previous versions of Puppet, you could also use $sumtotal += 10 to declare a local variable based on a computation of a variable in a parent, node, or top scope. This reads an awful lot like a redefinition of a variable, which as you might recall is not possible within Puppet. This was removed in Puppet 4 to be more consistent.
This kind of redeclaration is now handled with straightforward assignment like so:
    $sumtotal = $sumtotal + 10
This appears to be a variable redefinition. However, the = operator actually creates a variable in the local scope with a value computed from the higher scope variable as modified by the operands.
To avoid confusion, I won’t use this syntax. I always use one of the following forms instead:
    # clearly access top-scope variable
    $sumtotal = $::sumtotal + 10
    # clearly access parent-scope variable
    $sumtotal = $::parent::sumtotal + 10
    # even more clear by not using the same name
    $local_sumtotal = $::sumtotal + 10
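To show that the assignment creates a new local variable rather than changing the original, here is a minimal sketch using the last of these forms. The class name accounting and the starting value are hypothetical:

```puppet
# Top scope, for example in the site manifest
$sumtotal = 5

# Hypothetical class used only for this illustration
class accounting {
  # Creates a new variable in the class scope; top scope is untouched
  $local_sumtotal = $::sumtotal + 10

  notice("local value: ${local_sumtotal}")
  notice("top-scope value: ${::sumtotal}")
}

include accounting
```

The top-scope value should be reported unchanged, because variables in Puppet cannot be reassigned and the addition only produces a new value in the class scope.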
In “Building Subclasses”, we split up the module into separate subclasses for the Puppet agent and Puppet server. A complication of this split is that both the Puppet agent and Puppet server read the same configuration file, puppet.conf. Both classes would modify this file, and restart their services if the configuration changes.
Let’s review two different ways to deal with this complication. Both solutions have classes depend on another module to handle configuration changes. Each presents different ways to deal with the complications of module dependencies, and thus we cover both solutions to demonstrate different tactics.
One way to solve this problem would be to create a third subclass named config. This class would contain a template for populating the configuration file with settings for both the agent and server. In this scenario, each of the classes could include the config class. This would work as shown here:
# manifests/_config.pp
class puppet::_config (
  Hash $common = {},  # [main] params empty if not available in Hiera
  Hash $agent  = {},  # [agent] params empty if not available in Hiera
  Hash $user   = {},  # [user] params empty if not available in Hiera
  Hash $server = {},  # [master] params empty if not available in Hiera
) {
  file { 'puppet.conf':
    ensure  => file,
    path    => '/etc/puppetlabs/puppet/puppet.conf',
    owner   => 'root',
    group   => 'wheel',
    mode    => '0644',
    content => epp(
      'puppet/puppet.conf.epp',                    # template file
      { 'agent' => $agent, 'server' => $server }   # hash of config params
    ),
  }
}
This example shows a common practice of naming classes that are used internally by a module with a leading underscore.
Name classes, variables, and types that should not be called directly by other modules with a leading underscore.
You may notice that the file resource doesn’t require the agent or server packages, nor notify the Puppet agent or Puppet server services. This is because the Puppet agent and server are separate classes. One or the other may not be declared for a given node.1 Don’t define relationships with resources that might not exist in the catalog.
Instead, we’ll modify the agent class to include the config class, and depend on the file resource it provides:
# manifests/agent.pp
class puppet::agent (
  String  $version = 'latest',
  String  $status  = 'running',
  Boolean $enabled = true,
) {
  # Include the class that defines the config
  include puppet::_config

  # Install the Puppet agent
  package { 'puppet-agent':
    ensure => $version,
    before => File['puppet.conf'],
    notify => Service['puppet'],
  }

  # Manage the Puppet service
  service { 'puppet':
    ensure    => $status,
    enable    => $enabled,
    subscribe => [Package['puppet-agent'], File['puppet.conf']],
  }
}
Create a server class with the same dependency structure. As you might remember from the preceding section, each class is a singleton: the configuration class will be evaluated only once, even though it is included by both classes. If the puppet::server class is defined with the same dependencies as the puppet::agent class, the before and subscribe attributes shown will ensure that resource evaluation happens in the following order on a node that uses either or both classes:
1. The packages will be installed: puppet::agent, puppet::server
2. puppet::_config: The Puppet configuration file will be written out.
3. The services will be started: puppet::agent, puppet::server
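A server class with the same dependency structure might look like the following sketch. The puppetserver package and service names are assumptions, as is the $version parameter; adjust them to match your platform and the rest of your module:

```puppet
# manifests/server.pp
# Sketch only: 'puppetserver' package and service names are assumptions
class puppet::server (
  String  $version = 'latest',
  String  $status  = 'running',
  Boolean $enabled = true,
) {
  # The config class is a singleton, so including it here as well is safe
  include puppet::_config

  # Install the Puppet server before the configuration file is written
  package { 'puppetserver':
    ensure => $version,
    before => File['puppet.conf'],
    notify => Service['puppetserver'],
  }

  # Restart the service when the package or the configuration file changes
  service { 'puppetserver':
    ensure    => $status,
    enable    => $enabled,
    subscribe => [Package['puppetserver'], File['puppet.conf']],
  }
}
```

Because all relationships are declared inside the optional class, the config class never refers to a resource that might be missing from the catalog.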
The previous example showed a way to solve a problem within a single Puppet module, where you control each of the classes that manages a common dependency. Sometimes there will be a common dependency shared across Puppet modules maintained by different groups, or perhaps even sometimes entirely outside of Puppet.
The use of templates requires the ability to manage the entire file. Even when using modules that can build a file from multiple parts, such as puppetlabs/concat on the Puppet Forge, you must define the entirety of the file within the Puppet catalog.
The following alternative approach utilizes a module to make individual line or section changes to a file without any knowledge of the remainder of the file:
# manifests/agent.pp
class puppet::agent (
  String  $status  = 'running',
  Boolean $enabled = true,
  Hash    $config  = {},
) {
  # Write each agent configuration option to the puppet.conf file
  $config.each |$setting, $value| {
    ini_setting { "agent ${setting}":
      ensure  => present,
      path    => '/etc/puppetlabs/puppet/puppet.conf',
      section => 'agent',
      setting => $setting,
      value   => $value,
      require => Package['puppet-agent'],
    }
  }
}
This shorter, simpler definition uses a third-party module to update the Puppet configuration file in a nonexclusive manner. In my opinion, this is significantly more flexible than the common dependency module shown in the previous example.
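To illustrate the nonexclusive part, here is a hedged sketch of a server class writing its own settings into the same file. It assumes the server settings live in the [master] section and that a $config hash is supplied by the caller; neither class needs to know what the other writes:

```puppet
# manifests/server.pp
# Sketch only: the 'master' section name and $config parameter are assumptions
class puppet::server (
  Hash $config = {},
) {
  # Write each server configuration option to the puppet.conf file
  $config.each |$setting, $value| {
    ini_setting { "master ${setting}":
      ensure  => present,
      path    => '/etc/puppetlabs/puppet/puppet.conf',
      section => 'master',
      setting => $setting,
      value   => $value,
    }
  }
}
```

The resource titles are prefixed with the section name so that the same setting can be managed in both sections without a duplicate resource title.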
When using other modules, it will be necessary to use ordering metaparameters to ensure that the dependencies are fulfilled before resources that require those dependencies are evaluated. Some tricky problems come about when you try to utilize ordering metaparameters between classes maintained by different people.
In this section, we’ll cover strategies for safe ordering of dependencies between modules.
A primary problem with writing classes that depend on resources in other classes comes from this requirement:
If you are using a module that depends on an explicitly named resource in another module, you are at risk of breaking when the dependency module is refactored and resource titles are changed.
As discussed in “Understanding Variable Scope”, classes and defined types have their own unique scope. Each instance of a class or defined type becomes a container for the variables and resources declared within it. This means you can set dependencies on the entire container.
Whenever possible, treat other modules as black boxes and depend on the entire class, rather than “peeking in” to depend on specific resources.
If a resource defines a dependency with a class or type, it will form the same relationship with every resource inside the container. For example, say that we want the Puppet service to be started after the rsyslog daemon is already up and running. As you might imagine, the rsyslog module has a similar set of resources to our puppet::agent class:
class rsyslog {
  package { ... }
  file { ... }
  service { ... }
}
Rather than setting a dependency on one of these resources, we can set a dependency on the entire class:
# Manage the Puppet service
service { 'puppet':
  ensure    => $status,
  enable    => $enabled,
  subscribe => Package['puppet-agent'],
  require   => Class['rsyslog'],
}
With the preceding configuration, someone can refactor and change the rsyslog module without breaking this module.
There is another difficulty with ordering dependencies that you may run into:
Any resource that you notify or subscribe to must exist in the catalog.
This obviously comes into play when you are writing a module that may be used with or without certain other modules. It’s easy if the requirement is absolute: you simply include or require the dependency class. But if not having the class is a valid configuration, then it becomes tricky.
Let’s use, for example, the puppet module you are building. It is entirely valid for a user to install Puppet server with it, but not to run or even configure the Puppet agent. If you use a puppet::_config class, the File['puppet.conf'] resource cannot safely notify Service['puppet'] because that service won’t be available in the catalog if the puppet::agent class wasn’t included. The catalog build will abort, and animals will scatter in fright.
You could explicitly declare the puppet::agent class, and force everyone who doesn’t want to run the agent to define settings to disable it. However, a more flexible approach would be to have the optional service subscribe to the file resource, which is always included:
# Manage the Puppet service
service { 'puppet':
  ensure    => $status,
  enable    => $enabled,
  subscribe => File['puppet.conf'],
}
By placing the notification dependency within the optional class, you have solved the problem of ensuring that the resources exist in the catalog. If the puppet::agent class is not included on a node, the dependency doesn’t exist, and no animals were harmed when Puppet applied the resources.
The same rule discussed before has a much trickier application when ordering dependencies of dynamic resources: any resource that you notify or subscribe to must exist in the catalog.
This comes into play when you are writing a module that depends on resources that are dynamically generated. The use of the puppetlabs/inifile module to modify the configuration file defines each configuration setting as a unique resource within the class:
# Write each agent configuration option to the puppet.conf file
$config.each |$setting, $value| {
  ini_setting { "agent ${setting}":
    ensure => present,
    ...
  }
}
Because each setting is a unique resource, the package and service resources can’t use before or subscribe attributes, as the config settings list can change. In this case, it is best to reverse the logic. Use the dynamic resource’s require and notify attributes to require the package and notify the service resources.
Here’s an example that places ordering metaparameters on the dynamic INI file resources for the Puppet configuration file:
# Write each agent configuration option to the puppet.conf file
$config.each |$setting, $value| {
  ini_setting { "agent ${setting}":
    ensure  => present,
    path    => '/etc/puppetlabs/puppet/puppet.conf',
    section => 'agent',
    setting => $setting,
    value   => $value,
    require => Package['puppet-agent'],
    notify  => Service['puppet'],
  }
}
In this form, we’ve moved the ordering attributes into the dynamic resources to target the well-known resources. The service no longer needs to know in advance the list of resources that modify the configuration file. If any of the settings are changed in the file, it will notify the service.
The really tricky problems come about when you are facing all of these problems together:
Any resource that you notify or subscribe to must exist in the catalog.
This obviously comes into play when you are writing a module that depends on resources dynamically generated by a different class. If that class can be used without your dependent class, then it cannot send a notify event, as your resource may not exist in the catalog.
The solution is what I call an escrow refresh resource, which is a well-known static resource that will always be available to notify and subscribe to.
Let’s return to the Puppet module you are building for an example. It is entirely valid for a user to install Puppet with it, but not to run or configure the Puppet agent. Changes to the Puppet configuration file can happen within the agent class. As the agent service is defined in the same class, it can safely notify the service.
However, changes to the Puppet configuration file can also happen in the main puppet class, which modifies the configuration parameters in [main]. As these parameters will affect the Puppet agent service, the Puppet agent will need to reload the configuration file.
However, the base puppet class can be used without the puppet::agent subclass, so the inifile configuration resources cannot notify the agent service, as that service might not exist in the catalog. Likewise, the Puppet agent service cannot depend on a dynamically generated set of configuration parameters.
In this situation, the base puppet class creates an escrow refresh resource to which it will submit notifications that the Puppet configuration file has changed. With this combination of resource definitions, the following sequence takes place:
1. A configuration resource in the base puppet class changes the file and notifies the escrow refresh resource.
2. The escrow refresh resource, which always exists in the catalog, receives the refresh event.
3. If the puppet::agent class is declared, its service subscribes to the escrow refresh resource and is restarted.
Implement this by creating a resource which does something that succeeds. It need not do anything in particular, as it only serves as a well-known relay for refresh events:
# refresh escrow that optional resources can subscribe to
exec { 'puppet-configuration-has-changed':
  command     => '/bin/true',
  refreshonly => true,
}
Adjust the dynamic resources to notify the escrow resource if they change the configuration file:
# Write each main configuration option to the puppet.conf file
$config.each |$setting, $value| {
  ini_setting { "main ${setting}":
    ensure  => present,
    path    => '/etc/puppetlabs/puppet/puppet.conf',
    section => 'main',
    setting => $setting,
    value   => $value,
    require => Package['puppet-agent'],
    notify  => Exec['puppet-configuration-has-changed'],
  }
}
Declare the agent service to subscribe to the escrow refresh resource:
# Manage the Puppet service
service { 'puppet':
  ensure    => $status,
  enable    => $enabled,
  subscribe => Exec['puppet-configuration-has-changed'],
}
The only difficulty with this pattern is that the escrow resource must be defined by the class on which the optional classes depend. This may require you to submit a request to the maintainer of a Forge module to add in an escrow resource for you to subscribe to.
In most situations, each class declaration stands independent. While a class can include another class, the class is defined at the same level as the calling class: they are both instances of the Class type. Ordering metaparameters are used to control which classes are processed in which order.
As classes are peers, no class contains any other class. In almost every case, this is exactly how you want class declaration to work. This allows freedom for any class to set dependencies and ordering against any other class.
However, there is also a balance where one class should not be tightly tied to the internals of another class. It can be useful to allow other classes to declare ordering metaparameters that refer to the parent class, yet ensure that any necessary subclasses are processed at the same time.
For example, a module may have a base class that declares only common variables. All resources might be declared in package and service subclasses. A module that sets a dependency on the base class would not achieve the intended goal of being evaluated after the service is started:
service { 'dependency':
  ensure  => running,
  require => Class['only_has_variables'],
}
Rather than require the module to set dependencies on each subclass of the module, declare that each of the subclasses is contained within the main class:
class application (
  Hash[String, Any] $globalvars = {},
) {
  # Ensure that ordering includes subclasses
  contain application::package
  contain application::service
}
With this definition, any class that references the application class need not be aware of the subclasses it contains.
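As a short sketch of what that looks like from the outside, a resource in another module can order itself against the container alone. The 'dependency' service title here is hypothetical:

```puppet
# Evaluated only after everything contained in the application class,
# including application::package and application::service
service { 'dependency':
  ensure  => running,
  require => Class['application'],
}
```

Had the application class used include rather than contain for its subclasses, this relationship would apply only to resources declared directly in the base class.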
In this section, we’ll talk about ways to ensure your module can be used successfully by others, and even yourself in different situations. Even if you don’t plan to share your modules with anyone, the ideas in this section will help you build better modules that you won’t kick yourself for later.
Many, many examples in this book have placed hard values in resources in the name of simplicity, to make the resource language easy to read for learning purposes. Unfortunately, this is a terrible idea when building manifests for production use. You’ll find yourself changing paths, changing values, and adding if/else sequences as the code is deployed in more places.
Use variables for resource attribute values. Set the values in a params class, Hiera, or another data source.
To give you a clear example, visualize a module that installs and configures the Apache httpd server. The following would be a valid definition for a virtualhost configuration file on CentOS 7:
file { '/etc/httpd/conf.d/virtualhost.conf':
  ensure  => file,
  owner   => 'apache',
  group   => 'apache',
  mode    => '0644',
  source  => 'puppet:///modules/apache_vhost/virtualhost.conf',
  require => Package['httpd'],
  notify  => Service['httpd'],
}
Then someone wants to use that module on an Ubuntu server. The problem is, nearly everything in that definition is wrong. The package name, the service name, the file location, and the file owners are all different on Ubuntu. It’s much better to write that resource as follows:
package { 'apache-httpd-package':
  ensure => present,
  name   => $apache_httpd::package_name,
}

file { 'virtualhost.conf':
  ensure  => file,
  path    => "${apache_httpd::sites_directory}/virtualhost.conf",
  owner   => $apache_httpd::username,
  group   => $apache_httpd::groupname,
  mode    => '0644',
  source  => 'puppet:///modules/apache_vhost/virtualhost.conf',
  require => Package['apache-httpd-package'],
  notify  => Service['apache-httpd-service'],
}

service { 'apache-httpd-service':
  ensure => 'running',
  name   => $apache_httpd::service_name,
}
Then utilize your Hiera hierarchy and place the following variables in the os/redhat.yaml file:
apache_httpd::username: apache
apache_httpd::groupname: apache
apache_httpd::package_name: httpd
apache_httpd::service_name: httpd
If this service is ever deployed on Ubuntu, you can redefine those variables for that platform in the os/debian.yaml file, with zero code changes to your module:
apache_httpd::username: www-data
apache_httpd::groupname: www-data
apache_httpd::package_name: apache2
apache_httpd::service_name: apache2
Likewise, if you find yourself using the module for an Apache instance installed in an alternate location, you can simply override those values for that particular hostname in your Hiera hierarchy, such as hostname/abitoddthisone.yaml.
If you were looking carefully at the preceding section, you might have noticed that I didn’t take advantage of the resource’s ability to take the resource name from the title. Here, have another look:
package { 'apache-httpd-package':
  ensure => present,
  name   => $apache_httpd::package_name,
}
Wouldn’t it be much easier and less code to inherit the value like this?
package { $apache_httpd::package_name:
  ensure => present,
}
It would be simpler, but only the first time I used this class. The resource’s name would vary from operating system to operating system. If I ever created a wrapper class for this, or depended on it in another class, I’d have to look up which value contains the resource and where it was defined. And then I’m scattering variable names from this class throughout another class.
Here’s an example of what a class that needs to install its configuration files before the Apache service starts would have to put within its own manifest:
# poor innocent class with no knowledge of Apache setup
file { 'config-file-for-other-service':
  ...
  require => Package[$apache_httpd::package_name],
  notify  => Service[$apache_httpd::service_name],
}
Worse, if you refactor the Apache class and rename the variables, it will break every module that referred to this resource.
Use static names for resources to which wrapper classes may need to refer with ordering metaparameters.
In this situation, it’s much better to explicitly define a static title for the resource, and declare the package or service name by passing a variable to the name attribute. Then wrapper and dependent classes can safely depend on the resource name:
# poor innocent class with no knowledge of Apache setup
file { 'config-file-for-other-service':
  ...
  require => Package['apache-httpd-package'],
  notify  => Service['apache-httpd-service'],
}
As discussed previously in this book, parameters can be declared in the class definition with a default value. Continuing with our Apache module example, you might define the base class like so:
class apache_httpd (
  String $package_name = 'httpd',
  String $service_name = 'httpd',
Then you could define the default values in Hiera operating system overrides. However, this would require everyone who uses your module to install Hiera data to use your module. To avoid that, you’d have to muddy up the class with case blocks or selector expressions:
  String $package_name = $facts['os']['family'] ? {
    'RedHat' => 'httpd',
    'Debian' => 'apache2',
    default  => 'apache',
  },
  String $service_name = $facts['os']['family'] ? {
    ...
  },
  String $user_name = $facts['os']['family'] ? {
    ...
  },
A much cleaner design is to place all conditions on platform-dependent values in another manifest named params.pp.
Place all conditional statements around operating system and similar data in a params class. Inherit from the params class and refer to it for all default values.
Here’s an example:
class apache_httpd (
  String $package_name = $apache_httpd::params::package_name,
  String $service_name = $apache_httpd::params::service_name,
  String $user_name    = $apache_httpd::params::user_name,
) inherits apache_httpd::params {
This makes your class clear and easy to read, hiding away all the messy per-OS value selection. Furthermore, this design allows Hiera values to override the OS-specific values if desired.
In many cases, the params manifest can be replaced with data in modules, as described in “Binding Data Providers in Modules”.
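For completeness, here is a minimal sketch of what the params manifest itself might contain. The per-family values and the choice to fail on an unrecognized OS family are assumptions made for illustration, not something the module above requires:

```puppet
# manifests/params.pp
# Sketch only: values and the failure on unknown families are assumptions
class apache_httpd::params {
  case $facts['os']['family'] {
    'RedHat': {
      $package_name = 'httpd'
      $service_name = 'httpd'
      $user_name    = 'apache'
    }
    'Debian': {
      $package_name = 'apache2'
      $service_name = 'apache2'
      $user_name    = 'www-data'
    }
    default: {
      fail("apache_httpd does not support OS family ${facts['os']['family']}")
    }
  }
}
```

Failing early on an unknown platform is one design choice; supplying generic fallback values in the default case is another, at the cost of errors surfacing later when the package or service is not found.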
Let’s review some of the best practices for module development we covered in this chapter:
Use the contain() function to wrap subclasses for simple dependency management.
You can find more detailed guidelines in the Puppet Labs Style Guide.
Modules provide an independent namespace for reusable blocks of code that configure or maintain something. A module can create new resource types that can be independently used in other modules.
In this chapter, we have covered how to configure a module to:
This chapter has reviewed the features and functionality you can utilize within modules. The next chapter will discuss how to create plugins that extend modules with less common functionality.
1 The astute reader might point out that Puppet couldn’t possibly configure the Puppet server if the Puppet agent isn’t installed—a unique situation for only a Puppet module. This concept would be valid for any other module that handles both the client and server configurations.