Chapter 15. Extending Modules with Plugins

This chapter covers adding plugins to Puppet modules. Plugins are used to provide new facts, functions, and module-specific data that can be used in the Puppet catalog.

This chapter will explain how to extend a module to:

  • Provide new customized facts that can be used within your module.
    • Facts can be supplied by a program that outputs one fact per line.
    • Facts can be supplied by Ruby functions.
    • Facts can be read from YAML, JSON, or text file formats.
  • Provide new functions that extend the Puppet configuration language.
    • Functions can be written in either the Puppet or Ruby language.
  • Provide data lookups that are environment- or module-specific.
    • Custom data sources are written in Ruby.

Nothing in this chapter is required to build a working module, and there are thousands of modules that don’t use plugins. You can safely skip this chapter and return to it after you are comfortable building the common features of modules.

Tip
Many of the extensions in this chapter are written in Ruby. To build Ruby plugins, you’ll need knowledge of the Ruby programming language. I recommend Michael Fitzgerald’s Learning Ruby as an excellent reference.

Adding Custom Facts

One of the most useful plugins a module can provide is custom facts. Custom facts extend the built-in facts that Facter provides with node-specific values useful for the module. Module facts are synced from the module to the agent, and their values are available for use when the catalog is built.

Previous versions of Puppet would convert all fact values to strings. In Puppet 4, custom facts can return any of the Scalar data types (String, Numeric, Boolean, Timestamp, etc.) in addition to the Collection types (Array, Hash, Struct, etc.).

There are two ways to provide custom facts: using Ruby functions, or through external data. Let’s go over how to build new facts using both methods.

External Facts

Would you like to provide facts without writing Ruby? This is now possible with external facts. There are two ways to provide external facts:

  • Write fact data out in YAML, JSON, or text format.
  • Provide a program or script to output fact names and values.

The program or script can be written in Bash, Python, Java, whatever... it’s only necessary that it can be executed by the Puppet agent. Let’s go over how to use these.

Structured data

You can place structured data files in the facts.d/ directory of your module with values to assign to facts. Structured data files must be in a known format, and must be named with the appropriate file extension for their format. At the time this book was written, the formats shown in Table 15-1 were supported.

Table 15-1. Supported formats for structured data files containing facts
Type   Extension   Description
YAML   .yaml       Facts in YAML format
JSON   .json       Facts in JSON format
Text   .txt        Facts in key=value format

We introduced YAML format back in Chapter 11. Following is a simplified YAML example that sets three facts:

# three_simple_facts.yaml
---
my_fact: 'myvalue'
my_effort: 'easy'
is_dynamic: 'false'

The text format uses a single key=value pair on each line of the file. This only works for string values; arrays and hashes are not supported. Here is the same example in text format:

# three_simple_facts.txt
my_fact=myvalue
my_effort=easy
is_dynamic=false
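
As a rough illustration of how the key=value format maps onto facts, here is a small Ruby sketch that parses such text into a hash. The parse_text_facts helper is hypothetical (it is not part of Facter) and only mirrors the rules described above: one pair per line, and every value is a string.

```ruby
# Hypothetical helper, not part of Facter: turns key=value text into a Hash.
def parse_text_facts(text)
  facts = {}
  text.each_line do |line|
    line = line.strip
    next if line.empty?                  # skip blank lines
    key, value = line.split('=', 2)      # split on the first '=' only
    next if value.nil?                   # ignore lines without '='
    facts[key.strip] = value.strip       # values are always strings
  end
  facts
end

sample = "my_fact=myvalue\nmy_effort=easy\nis_dynamic=false\n"
p parse_text_facts(sample)
```

Notice that is_dynamic comes back as the string 'false' rather than a Boolean, which is the limitation of the text format mentioned earlier.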

Programs

You can place any executable program or script in the facts.d/ directory of your module to create new facts.

The script or program must have the execute bit set for root (or the user that you are running Puppet as). The script must output each fact as key=value on a line by itself. Following is a simplified example that sets three facts:

#!/bin/bash
echo "my_fact=myvalue"
echo "my_effort=easy"
echo "is_dynamic=false"

Install this in the directory and test it:

$ $EDITOR facts.d/three_simple_facts.sh
$ chmod 0755 facts.d/three_simple_facts.sh
$ facts.d/three_simple_facts.sh
my_fact=myvalue
my_effort=easy
is_dynamic=false

Facts on Windows

Executable facts on Windows work exactly the same, and require the same execute permissions and output format. However, the program or script must be named with a known extension. At the time this book was written, the following extensions were supported:

.com, .exe
Binary executables to be executed directly
.ps1
Scripts to be run by the PowerShell interpreter
.cmd, .bat
ASCII or UTF8 batch scripts to be processed by cmd.exe

Following is the same example from earlier, rewritten as a PowerShell script:

Write-Host "my_fact=myvalue"
Write-Host "my_effort=easy"
Write-Host "is_dynamic=false"

You should be able to save and execute this PowerShell script on the command line.

Custom (Ruby) Facts

To create new facts in Ruby for Facter to use, simply create the following directory in your module:

$ mkdir -p lib/facter

Ruby programs in this directory will be synced to the node at the start of the Puppet run. The node will evaluate these facts, and provide them for use in building the catalog.

Within this directory, create a Ruby script ending in .rb. The script can contain any normal Ruby code, but the code that defines a custom fact is implemented by two calls:

  • A Ruby code block starting with Facter.add('fact_name')
  • Inside the Facter code block, a setcode block that returns the fact’s value

For an easy example, let’s assume that hosts in a cluster have the same name. Unique numbers are added to each node to keep them distinct. This results in a common pattern with hostnames like:

  • webserver01
  • webserver02
  • webserver03
  • mailserver01
  • mailserver02

You’d like to define Hiera data based on the cluster name. Right away, you might think to use the hostname command to acquire the node name. Facter provides a helper function to execute shell commands: Facter::Core::Execution.exec().

Tip
Keep in mind that you are passing a Ruby String to a Ruby function. You must escape metacharacters as required by Ruby, rather than the more permissive rules of Puppet.

The code to derive a node’s cluster name from the hostname command could look like this:

# cluster_name.rb
Facter.add('cluster_name') do
  setcode do
    Facter::Core::Execution.exec("/bin/hostname -s | /usr/bin/sed -e 's/[0-9]//g'")
  end
end

This new fact will be available as $facts['cluster_name'] for use in manifests and templates.

Warning
The value passed to exec should be a command to execute. Pipe (|) and redirection (>) operators work as you might expect, but shell commands like if or for do not work.

It is best to run the smallest command possible to acquire a value, and then manipulate the value using native Ruby code. Here’s an example using Ruby native libraries to acquire the hostname, and then manipulate the value to remove the domain and numbers:

# cluster_name.rb
require 'socket'
Facter.add('cluster_name') do
  setcode do
    hostname = Socket.gethostname
    hostname.sub!(/\..*$/, '') # remove the first period and everything after it
    hostname.gsub(/[0-9]/, '') # remove every number and return revised name
  end
end

For more optimization, you could simply use the hostname fact. You can acquire the value of an existing fact using Facter.value('factname'). Here is a simpler example starting with the existing fact:

# cluster_name.rb
Facter.add('cluster_name') do
  setcode do
    hostname = Facter.value(:hostname).sub(/\..*$/, '')
    hostname.gsub(/[0-9]/, '')
  end
end
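
If you want to check the string handling before wrapping it in a fact, the following sketch applies the same two substitutions in plain Ruby. The cluster_name_from helper and the sample hostnames are made up for illustration.

```ruby
# Hypothetical helper: the same two substitutions used in the fact above.
def cluster_name_from(hostname)
  short = hostname.sub(/\..*$/, '')   # drop the first period and everything after it
  short.gsub(/[0-9]/, '')             # drop every digit
end

%w[webserver01 webserver02.example.com mailserver01].each do |name|
  puts "#{name} -> #{cluster_name_from(name)}"
end
```

Because the method is plain Ruby, you can paste it into irb and try your own hostnames without running Puppet.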

Plugins including facts are synced from the modules to the node (pluginsync) during Puppet’s initial configuration phase, prior to the catalog being built. This makes the fact available for immediate use in manifests and templates.

You can run puppet facts to see all facts, including those that have been distributed via pluginsync:

[vagrant@client facter]$ puppet facts find |grep cluster_name
    "cluster_name": "client",

Avoiding delay

To limit problems with code that may run long or hang, use the timeout property of Facter.add() to define how many seconds the code should complete within. This causes the setcode block to halt if the timeout is exceeded. The facts evaluation will move on without an error, but also without a value for the fact. This is preferable to a hung Puppet client.

Returning to our example of calling a shell command, modify it as follows:

# cluster_name.rb
Facter.add('cluster_name', :timeout => 5) do
  setcode do
    Facter::Core::Execution.exec("/bin/hostname -s | /usr/bin/sed -e 's/[0-9]//g'")
  end
end
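
To see the effect of a time limit in isolation, here is a minimal sketch using Ruby’s standard Timeout module. It is not a description of how Facter implements its own timeout; the value_within helper is hypothetical and simply shows the behavior described above, where code that runs too long yields no value instead of hanging.

```ruby
require 'timeout'

# Hypothetical helper: run a block, giving up after `seconds`.
# Returns the block's value, or nil if the limit was exceeded.
def value_within(seconds)
  Timeout.timeout(seconds) { yield }
rescue Timeout::Error
  nil
end

fast = value_within(1)   { 'webserver' }
slow = value_within(0.1) { sleep 2; 'never returned' }
p fast
p slow
```

The second call is abandoned after a tenth of a second, so the caller moves on with nil rather than waiting for the full sleep.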

Best Practice

Define a timeout for any fact that calls a program or connects to a dependency, to avoid hanging the Puppet agent during initialization.

Confining facts

Some facts are inappropriate for certain systems, would not work, or simply wouldn’t provide any useful information. To limit which nodes attempt to execute the fact code, utilize a confine statement. This statement lists another fact name and valid values. For example, to ensure that only hosts in a certain domain provide this fact, you could use the following:

# cluster_name.rb
Facter.add('cluster_name') do
  confine 'domain' => 'example.com'
  setcode do
    # Ruby code that provides the value...
  end
end

You can use multiple confine statements to enforce multiple conditions, all of which must be true. Finally, you can also test against multiple values by providing an array of values. If you are looking for a fact only available on Debian-based systems, you could use this:

# debian_fact.rb
Facter.add('debian_fact') do
  confine 'operatingsystem' => %w{ Debian Ubuntu }
  setcode do
    # Ruby code that provides the value...
  end
end

Ordering by precedence

You can define multiple methods, or resolutions, to acquire a fact’s value. Facter will utilize the highest precedence resolution that returns a value. To provide multiple resolutions, simply add another Facter code block with the same fact name.

This is a common technique used for facts where the source of data differs on each operating system.

The order in which Facter evaluates possible resolutions is as follows:

  1. Facter discards any resolutions where the confine statements do not match.
  2. Facter tries each resolution in descending order of weight.
  3. Whenever a value is returned, no further code resolutions are tried.

You can define a weight for a resolution using the has_weight statement. If no weight is defined, the weight is equal to the number of confine statements in the block. This ensures that more specific resolutions are tried first.
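
The three-step ordering above can be modeled in a few lines of plain Ruby. This is only a sketch for reasoning about precedence: the Resolution struct and resolve method are hypothetical and do not reflect Facter’s internal classes.

```ruby
# Hypothetical model of a resolution; Facter's real classes differ.
Resolution = Struct.new(:name, :confines_match, :confine_count, :weight, :code) do
  # An explicit weight wins; otherwise use the number of confine statements.
  def effective_weight
    weight || confine_count
  end
end

def resolve(resolutions)
  candidates = resolutions.select(&:confines_match)          # step 1: drop unmatched confines
  ordered = candidates.sort_by { |r| -r.effective_weight }   # step 2: highest weight first
  ordered.each do |r|
    value = r.code.call
    return value unless value.nil?                           # step 3: first value wins
  end
  nil
end

resolutions = [
  Resolution.new('generic', true,  0, nil, -> { 'from-generic' }),
  Resolution.new('redhat',  true,  1, nil, -> { nil }),            # matches, but yields nothing
  Resolution.new('windows', false, 1, 20,  -> { 'from-windows' })  # confine does not match
]
p resolve(resolutions)
```

In this sample the windows resolution is discarded despite its high weight, the redhat resolution is tried first but returns nothing, and the generic resolution supplies the value.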

Following is an example that attempts to acquire the hostname from two different system configuration files:

# configured_hostname.rb
Facter.add('configured_hostname') do
  has_weight 10
  setcode do
    if File.exist?('/etc/hostname')
      # the block's last expression becomes the fact value
      File.open('/etc/hostname') do |fh|
        fh.gets.to_s.chomp
      end
    end
  end
end

Facter.add('configured_hostname') do
  confine 'osfamily' => 'RedHat'
  has_weight 5
  setcode do
    if File.exist?('/etc/sysconfig/network')
      hostname = nil
      File.foreach('/etc/sysconfig/network') do |line|
        match = line.match(/^HOSTNAME=(.*)$/)
        next unless match
        hostname = match[1]   # capture group 1 holds the value after HOSTNAME=
        break
      end
      hostname
    end
  end
end

Aggregating results

The data provided by a fact could be created from multiple data sources. You can then aggregate the results from all data sources together into a single result.

An aggregate fact is defined differently than a normal fact:

  1. Facter.add() must be invoked with a property :type => :aggregate.
  2. Each discrete data source is defined in a chunk code block.
  3. The chunks object will contain the results of all chunks.
  4. An aggregate code block should evaluate chunks to provide the final fact value (instead of setcode).

Here’s a simple prototype of an aggregate fact that returns an array of results:

Facter.add('fact_name', :type => :aggregate) do
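  chunk('any-name-one') do
    # Ruby code that returns one partial result
  end

  chunk('any-name-two') do
    # different Ruby code that returns another partial result
  end

  aggregate do |chunks|
    results = Array.new
    chunks.each_value do |partial|
      results.push(partial)
    end
    results   # the block's last expression is the fact value
  end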
end

The aggregate block is optional if all of the chunks return arrays or hashes. Facter will automatically merge the results into a single array or hash in that case. For any other resolution, you should define the aggregate block to create the final result.
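
To make the automatic merge more concrete, here is a plain-Ruby approximation. The merge_chunks and deep_merge helpers are hypothetical; in particular, letting the later chunk win when two non-hash values collide is my assumption for the sketch, not documented Facter behavior.

```ruby
# Hypothetical helpers approximating the default merge; not Facter's own code.
def deep_merge(left, right)
  left.merge(right) do |_key, a, b|
    # Assumption: nested hashes are merged, any other collision takes the later value.
    a.is_a?(Hash) && b.is_a?(Hash) ? deep_merge(a, b) : b
  end
end

def merge_chunks(chunks)
  values = chunks.values
  if values.all? { |v| v.is_a?(Array) }
    values.flatten(1)                                        # concatenate arrays
  elsif values.all? { |v| v.is_a?(Hash) }
    values.reduce({}) { |merged, h| deep_merge(merged, h) }  # merge hashes
  else
    raise ArgumentError, 'mixed chunk types need an explicit aggregate block'
  end
end

p merge_chunks(one: [1, 2], two: [3])
p merge_chunks(ip: { 'eth0' => { 'ip' => '10.0.0.1' } },
               mac: { 'eth0' => { 'mac' => 'aa:bb' } })
```

Chunks that return plain strings or numbers fall into the last branch, which is the case where you need to write the aggregate block yourself.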

For more examples of aggregate resolutions, see “Writing Facts with Aggregate Resolutions” on the Puppet docs site.

Best Practice

Place values from discrete data sources in their own facts, rather than creating facts with complex data structures.

Debugging

Your new fact will not appear in the output of facter. Use puppet facts find to see the values of custom Puppet facts.

To test the output of your fact, run puppet facts find --debug to see verbose details of where facts are being loaded from:

$ puppet facts find --debug
...
Debug: Loading facts from modules/stdlib/lib/facter/facter_dot_d.rb
Debug: Loading facts from modules/stdlib/lib/facter/pe_version.rb
Debug: Loading facts from modules/stdlib/lib/facter/puppet_vardir.rb
Debug: Loading facts from modules/puppet/facts.d/three_simple_facts.txt

If the output from your custom fact wasn’t in the proper format, you’ll get errors like this:

Fact file facts.d/python_sample.py was parsed but returned an empty data set

In those situations, run the program by hand and examine the output. Here’s the example I used to generate the preceding complaint:

$ cd /etc/puppetlabs/code/environments/testing/modules/puppet
$ facts.d/python_sample.py
fact1=this
fact2
fact3=that
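
When a script produces a lot of output, a quick check like the following can point at the offending lines. This is only a sketch; the malformed_fact_lines helper is hypothetical and uses a simple pattern of my own for what counts as key=value.

```ruby
# Hypothetical checker: returns the lines that are not in key=value form.
def malformed_fact_lines(output)
  output.each_line.map(&:chomp).reject do |line|
    line.empty? || line =~ /\A[^=\s]+=.*\z/   # keep only lines that fail the pattern
  end
end

output = "fact1=this\nfact2\nfact3=that\n"
p malformed_fact_lines(output)
```

For the sample output above, the only line reported is fact2, which has a name but no equals sign or value.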

Tip
External facts obsolete and supersede the facter-dot-d functionality provided by the stdlib module for older Puppet versions. Instead of installing facts in the global facter/facts.d/ directory, place them in an appropriate module’s facts.d/ directory.

Understanding Implementation Issues

There are a few things to understand about how the Puppet agent utilizes facts:

  • External facts are evaluated first, and thus cannot reference or use Facter or Ruby facts.
  • Ruby facts are evaluated later and can use values from external facts.
  • External executable facts are forked instead of executed within the same process. This can have performance implications if thousands of external fact programs are used.

Outside of these considerations, the facts created by these programs are equal and indistinguishable.

Defining Functions

You can create your own functions to extend and enhance the Puppet language. These functions will be executed during the catalog build process. Functions can be written in either the Puppet configuration language or in Ruby.

Puppet Functions

New to Puppet 4 is the ability to write functions in the Puppet configuration language. This gives you the freedom to write functions for use in your manifests entirely in the Puppet language.

Each function should be declared in a separate file stored in the functions/ directory of a module. The file should be named for the function name followed by the .pp extension. Each function defined in a module must be qualified within the module’s namespace.

A function placed in an environment’s functions/ directory should be qualified with the environment:: namespace.

Note
At this time, it is possible to write a function that patches (by masking) a broken function from another module. I expect that future changes will enforce the qualified naming standards, and prevent that usage.

For an example, we’ll create a function called make_boolean() that accepts many types of input and returns a Boolean value. Much like the str2bool and num2bool functions from the puppetlabs/stdlib library, this function will accept either strings (yes/no/y/n/on/off/true/false) or numbers. Unlike the stdlib functions, it will handle either input type, and also accept Boolean input without raising an error.

The function should be placed in a file named functions/make_boolean.pp in the module directory. The function must be named with the qualified scope of the module. For our example, the function would be declared like so:

# functions/make_boolean.pp
function puppet::make_boolean( Variant[String,Numeric,Boolean,Undef] $inputvalue ) {
  case $inputvalue {
    Undef:   { false }
    Boolean: { $inputvalue }
    Numeric: {
      case $inputvalue {
        0:       { false }
        default: { true }
      }
    }
    String: {
      case $inputvalue {
        /^(?i:off|false|no|n)?$/: { false }
        default: { true }
      }
    }
  }
}

Functions written in the Puppet language can only take actions possible within the Puppet language. Ruby functions remain significantly more powerful.

Ruby Functions

Each Ruby function should be declared in a separate file, stored in the lib/puppet/functions/modulename/ directory of the module, and named for the function followed by the .rb extension.

Define the function by calling Puppet::Functions.create_function() with the name of the new function as a Ruby symbol for the only parameter. The code for the function should be defined within a method of the same name.

Our make_boolean() function from the previous section could look like this:

Puppet::Functions.create_function(:'puppet::make_boolean') do
  def make_boolean( value )
    if [true, false].include? value
      return value
    elsif value.nil?
      return false
    elsif value.is_a? Integer
      return value == 0 ? false : true
    elsif value.is_a? String
      case value
      when /^\s*(?i:false|no|n|off)\s*$/
        return false
      when ''
        return false
      when /^\s*0+\s*$/
        return false
      else
        return true
      end
    end
  end
end

Accepting varied input with dispatch

The Puppet 4 API supports Puppet type validation inside Ruby functions with the dispatch method. The API supports multiple dispatch, allowing the selection of method based on the input type(s). dispatch will select the first method that has parameters matching the Puppet type (not Ruby type!) and call the named method with the parameters:

Puppet::Functions.create_function(:'puppet::make_boolean') do
  dispatch :make_boolean_from_string do
    param 'String', :value
  end

  dispatch :make_boolean_from_integer do
    param 'Integer', :value
  end

  dispatch :return_value do
    param 'Boolean', :value
  end

  dispatch :return_false do
    param 'Undef', :value
  end

  def return_value( value )
    return value
  end

  def return_false( value )
    return false
  end

  def make_boolean_from_integer( value )
    return value == 0 ? false : true
  end

  def make_boolean_from_string( value )
    case value
    when /^\s*(?i:false|no|n|off)\s*$/
      return false
    when ''
      return false
    when /^\s*0+\s*$/
      return false
    else
      return true
    end
  end
end
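
Because the conversion methods are ordinary Ruby, you can exercise the string rules outside of Puppet before wiring them into a dispatcher. The following stand-alone sketch copies the string conversion into a top-level method; the sample inputs are my own.

```ruby
# Stand-alone copy of the string conversion, for testing outside Puppet.
def make_boolean_from_string(value)
  case value
  when /^\s*(?i:false|no|n|off)\s*$/, '', /^\s*0+\s*$/
    false   # negative words, the empty string, and strings of zeros
  else
    true    # everything else is treated as true
  end
end

['No', ' off ', '000', '', 'yes', 'enabled'].each do |input|
  puts "#{input.inspect} -> #{make_boolean_from_string(input)}"
end
```

Trying a handful of edge cases this way is a quick check that the regular expressions behave as intended before the function is used in a catalog build.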

It’s possible to accept a range or unlimited values as well. Here are dispatchers for when one or two values are supplied, and for all other (i.e., unlimited) amounts of values:

Puppet::Functions.create_function(:'puppet::find_largest') do
  dispatch :compare_two_values do
    required_param 'Integer', :first
    optional_param 'Integer', :second
  end

  dispatch :compare_unlimited_values do
    repeated_param 'Integer', :values
  end

  def compare_two_values( first, second = nil )
    # ...
  end

  def compare_unlimited_values( *values )
    # ...
  end
end

Accessing facts and values

In older versions of Puppet, Ruby functions could access facts and values. In Puppet 4, any values that are needed by the function should be passed as a parameter to the function. For example, here’s a function that calculates the subnet for the primary network interface:

require 'ipaddr'
# @param [String] address - an IPv4 or IPv6 address
# @param [String] netmask - a netmask
Puppet::Functions.create_function(:'puppet::get_subnet') do
  def get_subnet( address, netmask )
    if !address.nil?
      ip = IPAddr.new( address )
      return ip.mask( netmask ).to_s
    end
  end
end

Call this function with the necessary facts as input parameters:

puppet::get_subnet( $facts['networking']['ip'], $facts['networking']['netmask'] )
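
The calculation itself relies only on Ruby’s standard ipaddr library, so you can try it outside of Puppet. This sketch uses a hypothetical subnet_for method and made-up sample addresses.

```ruby
require 'ipaddr'

# Plain-Ruby version of the calculation, with made-up sample addresses.
def subnet_for(address, netmask)
  return nil if address.nil?
  IPAddr.new(address).mask(netmask).to_s   # zero out the host portion
end

p subnet_for('192.168.1.57', '255.255.255.0')
p subnet_for('10.20.30.40', '255.255.0.0')
```

Keeping the logic free of fact lookups, as the Puppet 4 function does, is what makes it this easy to test in isolation.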

Calling other functions

You can invoke a custom Puppet function from another custom Puppet function using the call_function() method. This function scans the scope of where the function was invoked to find and load the other function.

Within a Ruby function, pass the input values as additional arguments after the function name.

Puppet::Functions.create_function(:'mymodule::outer_function') do
  def outer_function( host_name )
    call_function('process_value', 'hostname', host_name)
  end
end

Sending back errors

To send an error response back to Puppet (which will cause the Puppet catalog build to fail), raise an error of type Puppet::ParseError. Here’s an example:

Puppet::Functions.create_function(:'mymodule::outer_function') do

  def outer_function( fact_value )
    raise Puppet::ParseError, 'Fact not available!' if fact_value.nil?
    # ...things you do if the fact is available...
  end
end

Whenever possible, it is preferred to simply return nil or some other failure value when a function doesn’t succeed. This allows the code that called the function to determine what action to take. This is generally better practice than causing the entire Puppet catalog build to fail.

Learning more about functions

The Puppet Functions API has changed dramatically in Puppet 4, and new features are being introduced in each new version in the 4.x releases. You can find the very latest details at “Puppet::Functions” on the Puppet docs site.

Using Custom Functions

Whether your function was written in Puppet or Ruby, you can use a function you’ve created the same way as a function built into Puppet. For example, we could use the make_boolean() function we’ve defined to ensure that a service receives a Boolean value for enable, no matter what type of value was passed to it:

service { 'puppet':
  ensure    => $status,
  enable    => puppet::make_boolean( $enabled ),
  subscribe => Package['puppet-agent'],
}

Within Puppet templates, all functions are methods of the scope object. Use the call_function() method. This is because, as you might guess, templates are parsed within the scope of the Ruby function template(). Use square brackets around the input variables to create a single array of parameters:

true_or_false = <%= scope.call_function('puppet::make_boolean', [inputvar]) %>

Creating Puppet Types

One way to create a type in Puppet is using the Puppet configuration language, as described in “Creating New Resource Types”. A more powerful way to create new types in Puppet is to create them in a Ruby class.

For an example, we will create a somewhat ridiculous elephant type that generates an elephant resource. A resource using our type would look something like this:

  elephant { 'horton':
    ensure   => present,
    color    => 'grey',
    motto    => 'After all, a person is a person, no matter how small',
  }

Each Puppet type should be declared in a separate file, stored in the lib/puppet/type/ directory of the module, and named for the type followed by the .rb extension. For our example, the filename will be lib/puppet/type/elephant.rb.

Define the type by calling Puppet::Type.newtype() with the name of the type as a Ruby symbol for the only parameter. The Ruby code that evaluates the type should be defined within the following block:

Puppet::Type.newtype( :elephant ) do
  @doc = %q{Manages elephants on the node
    @example 
      elephant { 'horton':
        ensure   => present,
        color    => 'grey',
        motto    => 'After all, a person is a person, no matter how small',
      }
  }
end

Place Markdown within the value of the @doc tag to provide an example of how to utilize the resource. You can safely indent underneath the value, as the common amount of leading whitespace on following lines is removed when the documentation is rendered.

Defining Ensurable

Most types are ensurable, meaning that we compare the current state of the type to determine what changes are necessary. Add the method ensurable to the type:

Puppet::Type.newtype( :elephant ) do
  ensurable
end

The provider for the type is required to define three methods to create the resource, verify if it exists, and destroy it.

Accepting Params and Properties

There are two types of values that can be passed into a type with an attribute. params are values used by the type but not stored or verifiable on the resource’s manifestation.

Every type must have a namevar parameter, which is how the resource is uniquely identified on the system. For user resources this would be the name, while file resources use path. For our example, the unique identifier is the elephant’s name:

  newparam( :name, :namevar => true ) do
    desc "The elephant's name"
  end

A property is different in that we can retrieve the property from the resource and compare the values:

  newproperty( :color ) do
    desc "The color of the elephant"
    defaultto 'grey'
  end

  newproperty( :motto ) do
    desc "The elephant's motto"
  end

The preceding example defines the properties and sets a default value for the color.

Validating Input Values

You can perform input validation on each param or property provided to the type. For example, we need to ensure that the color is a known elephant color:

  newproperty( :color ) do
    desc "The color of the elephant"
    defaultto 'grey'

    validate do |value|
      unless ['grey','brown','red','white'].include? value
        raise ArgumentError, "No elephants are colored #{value}"
      end
    end
  end

There’s a newvalues() method that provides this kind of test, albeit without the more informative message. If a clear error message is not required, it’s much shorter to write:

  newproperty( :color ) do
    desc "The color of the elephant"
    defaultto 'grey'
    newvalues('grey','brown','red','white')
  end

  newproperty( :motto ) do
    desc "The elephant's motto"
    newvalues(/^[\w\s'.,]+$/)
  end

The latter form used a Regexp to accept any string that contains only word characters (letters, digits, and underscores), whitespace, single quotes, periods, and commas.
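
If you’d like to experiment with these checks before loading the type into Puppet, the same rules can be sketched in plain Ruby. The constant and method names here are hypothetical, and allowing commas in the motto pattern is my assumption so that the sample motto used for horton passes.

```ruby
ELEPHANT_COLORS = %w[grey brown red white].freeze
# Assumption: word characters, whitespace, single quotes, periods, and commas.
MOTTO_PATTERN = /\A[\w\s'.,]+\z/

# Mirrors the validate block: return the value or raise a descriptive error.
def validate_color(value)
  return value if ELEPHANT_COLORS.include?(value)
  raise ArgumentError, "No elephants are colored #{value}"
end

# Mirrors the pattern check: true only if every character is permitted.
def valid_motto?(value)
  MOTTO_PATTERN.match?(value)
end

p validate_color('grey')
p valid_motto?('After all, a person is a person, no matter how small')
p valid_motto?('Trunks > tusks')
```

The one-or-more quantifier matters: without it, a character class anchored at both ends would only ever accept a single-character motto.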

Tip
You can also define default values as symbols, but this requires your provider to translate symbols to strings in some cases, and it gets messy. Use strings consistently and avoid type conversions entirely.

You can munge values to provide clean mapping for local inconsistencies:

  newproperty( :color ) do
    desc "The color of the elephant"
    defaultto 'grey'
    newvalues('grey','brown','red','white')
    munge do |value|
      case value
      when 'gray'
        'grey'
      else
        super(value)
      end
    end
  end

You can perform input validation for the entire type using a global validate method. The input values are available as attributes of self:

Puppet::Type.newtype( :elephant ) do
  ...

  validate do
    if self[:motto].nil? || self[:color].nil?
      raise ArgumentError, "Both a color and a motto are required input."
    end
  end
end

You can define a pre_run_check method, which will run the code for each instance added to the catalog just before attempting to apply the catalog on the node. Every instance that generates an error will be output as an error by Puppet, and the Puppet run will be aborted.

Keep in mind the difference in the placement of these methods in the run cycle:

  • The validate method is called each time a resource of this type is added to the catalog. Other resources and types may not yet be parsed and available in the catalog.
  • The pre_run_check method is called after the entire catalog is built, and every instance should exist within the catalog. This is the only valid place to check that requirements for this type exist within the catalog.

Defining Implicit Dependencies

As you might recall from Chapter 7, many Puppet types will autorequire other items in the catalog upon which they depend. For example, a file resource will autorequire the user who owns the file. If the user exists in the catalog, that file resource will depend on that user resource.

You can autorequire dependent resources for your type as shown here:

Puppet::Type.newtype( :elephant ) do
  autorequire(:file) do
    ['/tmp', '/tmp/elephants']
  end
end

As our elephant exists in the /tmp/elephants directory, we autorequire the File['/tmp'] and File['/tmp/elephants'] resources. If either resource exists in the catalog, we will depend on it. If it is not defined, then the dependency will not be set.

In addition to autorequire, you can use the same syntax for autobefore, autonotify, and autosubscribe. These create soft dependencies and refresh events in the same manner as the ordering metaparameters without auto in their name.

Learning More About Puppet Types

One of the most informative ways to debug issues with Puppet types is to run Puppet with the --debug argument. The debugging shows the loading of custom types and selection of the provider.

The best book to learn more about Puppet types is Dan Bode and Nan Liu’s Puppet Types and Providers. That book covers this topic much more exhaustively than this quick introduction.

The Puppet::Type documentation provides a detailed reference to all methods available.

The Custom Types documentation provides some prescriptive guidance.

Adding New Providers

Puppet providers are Ruby classes that do the detailed work of evaluating a resource. The provider handles all operating system–specific dependencies. For example, there are yum, apt, pkgng, and chocolatey providers for installing packages on RedHat, Debian, FreeBSD, and Windows systems, respectively. There are more than 20 different providers for the package type due to the wide variety of package managers in use.

For an example, we will create a posix provider for our elephant type that generates an elephant resource on a POSIX-compliant system (e.g., Linux, Solaris, FreeBSD, etc.).

Providers for Puppet types are always written in Ruby. Each provider should be declared in a separate file, stored in a subdirectory of lib/puppet/provider/ named for the type. The file should be named for the provider followed by the .rb extension. For our example, the filename will be lib/puppet/provider/elephant/posix.rb.

Define the provider by calling Puppet::Type.type() method with the name of the type, followed by the provide() method with the following three inputs:

  • The provider name as a Ruby symbol
  • Optional :parent that identifies a provider from which to inherit methods
  • Optional :source that identifies a different provider that manages the same resources

The Ruby code that implements the provider should be defined within the following block:

Puppet::Type.type( :elephant ).provide( :posix ) do
  desc "Manages elephants on POSIX-compliant nodes."
end

Always include a description of what the provider does. You would create an alternate provider for Windows nodes with the following definition in the lib/puppet/provider/elephant/windows.rb file:

Puppet::Type.type( :elephant ).provide( :windows ) do
  desc "Manages elephants on Windows nodes."
end

Determining Provider Suitability

Each provider needs to define ways that Puppet can determine if the provider is suitable to manage that resource on a given node.

There are a wide variety of suitability tests. Following are some examples with comments about their use:

Puppet::Type.type( :elephant ).provide( :posix ) do
  # Test the operating system family fact
  confine :osfamily => ['redhat','debian','freebsd','solaris']

  # Test true/false comparisons
  confine :true     => /^4/.match( Puppet.version )

  # A directory for /tmp must exist
  confine :exists => '/tmp'

  # Ensure that the 'posix' feature is available on this target
  confine :feature => 'posix'
end

Assigning a Default Provider

At times, multiple providers can manage the same resource on a given node. For example, the yum and rpm providers can both manage packages on a CentOS node.

You can declare the provider to be the default for a given platform by using the defaultfor method and passing it a fact name as a symbol, and an array of suitable values:

  # Test the operating system family fact
  defaultfor :osfamily => ['redhat','debian','freebsd','solaris']

Defining Commands for Use

The commands method lets you test for the existence of a file, and sets up an instance method you can use to call the program. If the command cannot be found, then the provider is not suitable for this resource:

  # The echo command must be within /bin
  commands :echo => '/bin/echo'

  # the ls command must be found in Puppet's path
  commands :ls  => 'ls'

The commands method also defines a new method named for the command that invokes the command, passing all arguments as space-separated command-line options. The method puts the command invocation in the Puppet debug output, and it automatically traps nonzero exit codes and raises Puppet::ExecutionFailure for you.

This is significantly better than using Ruby’s built-in command execution methods, and having to write the exception handling yourself.
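
To show what that saves you, here is a rough plain-Ruby sketch of the same idea: run a program, capture its output, and raise when the exit code is nonzero. The run_command helper and the CommandFailure class are names I made up for this sketch; Puppet raises Puppet::ExecutionFailure and adds debug logging on top.

```ruby
require 'open3'
require 'rbconfig'

# Hypothetical error class for this sketch (Puppet raises
# Puppet::ExecutionFailure instead).
class CommandFailure < StandardError; end

# Run a program with the given arguments, without a shell, and raise
# when it exits with a nonzero status.
def run_command(program, *args)
  output, status = Open3.capture2e(program, *args)
  raise CommandFailure, "#{program} exited with #{status.exitstatus}" unless status.success?
  output
end

# Use the running Ruby interpreter as a portable test program.
ruby = RbConfig.ruby
puts run_command(ruby, '-e', 'print ARGV.join("|")', 'a', '>', 'b')

begin
  run_command(ruby, '-e', 'exit 3')
rescue CommandFailure => e
  puts "failed: #{e.message}"
end
```

Because no shell is involved in this sketch, an argument such as '>' reaches the program as plain text rather than as a redirection.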

Ensure the Resource State

Most types are ensurable, meaning that the provider must validate the existence of the resource and determine what changes are necessary. Define three methods to create the resource, verify its existence, and destroy it:

Puppet::Type.type( :elephant ).provide( :posix ) do
  # commands we need.
  commands :echo => 'echo'
  commands :ls   => 'ls'
  commands :rm   => 'rm'

  # where elephants can be found
  # (a method, because resource is only available on a provider instance)
  def filename
    '/etc/elephants/' + resource['name']
  end

  # ensurable requires these methods
  def create
    echo("color = #{resource['color']}",'>',filename)
    echo("motto = #{resource['motto']}",'>>',filename)
  end

  def exists?
    begin
      ls(filename)
      true
    rescue Puppet::ExecutionFailure => e
      false
    end
  end

  def destroy
    rm(filename)
  end
end

These three methods are used to handle transition from present to absent and vice versa. The type calls the exists? method and then determines whether to call create or destroy in order to meet the state defined by ensure.
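
The decision the type makes can be sketched in a few lines of plain Ruby. Everything here is hypothetical and exists only for illustration: the FileElephant class stands in for a provider, and sync_ensure stands in for the logic that picks between create and destroy.

```ruby
require 'tmpdir'

# Hypothetical provider-like object that keeps one file per elephant.
class FileElephant
  def initialize(path)
    @path = path
  end

  def exists?
    File.exist?(@path)
  end

  def create
    File.write(@path, "color = grey\n")
  end

  def destroy
    File.delete(@path)
  end
end

# Assumed helper: compare the desired state with exists? and call
# create or destroy only when the two disagree.
def sync_ensure(provider, desired)
  if desired == :present && !provider.exists?
    provider.create
    :created
  elsif desired == :absent && provider.exists?
    provider.destroy
    :removed
  else
    :unchanged
  end
end

Dir.mktmpdir do |dir|
  horton = FileElephant.new(File.join(dir, 'horton'))
  p sync_ensure(horton, :present)
  p sync_ensure(horton, :present)
  p sync_ensure(horton, :absent)
end
```

The second call does nothing because the resource already matches the requested state, which is the idempotence you want from any ensurable resource.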

Note
We could implement the same features in pure Ruby. I’m using the command-line variant because more information is provided in debug output, and to demonstrate the technique.

Adjusting Properties

For each property with a value that needs to be compared to the resource, you’ll need to define two methods—a getter and a setter for each attribute:

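  # commands we need.
  commands :sed => 'sed'

  # where elephants can be found
  def filename
    '/etc/elephants/' + resource['name']
  end

  def color
    # print only the captured value from the matching line
    sed('-n','-E','-e','s/^color = (.*)$/\1/p',filename).chomp
  end

  def color=(value)
    # double quotes so that Ruby interpolates the new value
    sed('-i','-e',"s/^color = .*$/color = #{value}/",filename)
  end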

The first method retrieves the current value for the property color, and the second method changes the elephant’s color on the node. The definition for motto would be nearly identical.
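
For comparison, here is a sketch of the same getter and setter logic in pure Ruby, without calling sed. The read_property and write_property helpers are hypothetical names for this example; in a provider you would put the equivalent logic inside the color and color= methods.

```ruby
require 'tmpdir'

# Hypothetical helper: return the value stored for a property name,
# or nil when the file has no such line.
def read_property(filename, name)
  File.readlines(filename).each do |line|
    key, value = line.chomp.split(' = ', 2)
    return value if key == name
  end
  nil
end

# Hypothetical helper: rewrite only the line for the given property.
def write_property(filename, name, value)
  lines = File.readlines(filename).map do |line|
    line.start_with?("#{name} = ") ? "#{name} = #{value}\n" : line
  end
  File.write(filename, lines.join)
end

Dir.mktmpdir do |dir|
  file = File.join(dir, 'horton')
  File.write(file, "color = grey\nmotto = A person is a person\n")
  write_property(file, 'color', 'pink')
  puts read_property(file, 'color')
  puts read_property(file, 'motto')
end
```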

If there are many attributes that change values, you may want to cache up the changes and write them out at once. After calling any setter methods, a resource will call the flush method if defined:

  def color=(value)
    true
  end

  def flush
    echo("color = #{resource['color']}",'>',filename)
    echo("motto = #{resource['motto']}",'>>',filename)
    @property_hash = resource.to_hash
  end

If the resource was modified through a long command line of values (e.g., usermod), you could track which values were changed and build a customized command invocation with only those values. Because this resource is only two lines of text, it’s significantly easier to just write the file out again.

The final line caches the current values of the resource into an instance variable. Let’s talk about what can be done with caching now.
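
The batching idea is easier to see outside of Puppet. In this plain-Ruby sketch the setters only record pending changes, and flush performs a single write. The BufferedElephant class is hypothetical, and its writes array stands in for the file on disk.

```ruby
# Hypothetical class: setters only record changes, flush writes once.
class BufferedElephant
  attr_reader :writes

  def initialize(values)
    @values  = values.dup
    @pending = {}
    @writes  = []   # stands in for the file on disk
  end

  def color=(value)
    @pending[:color] = value
  end

  def motto=(value)
    @pending[:motto] = value
  end

  # Merge all pending changes and write the whole record out once.
  def flush
    return if @pending.empty?
    @values.merge!(@pending)
    @writes << @values.map { |key, value| "#{key} = #{value}" }.join("\n")
    @pending.clear
  end
end

horton = BufferedElephant.new({ color: 'grey', motto: 'none' })
horton.color = 'pink'
horton.motto = 'A person is a person'
horton.flush
puts horton.writes.length
puts horton.writes.last
```

Two property changes result in one write, which is the whole point of deferring the work to flush.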

Providing a List of Instances

If it is low impact to read in the resources, you can implement an instances class method that will load all instances into memory. This can improve performance in comparison to loading each one separately and modifying it.

Resource providers can make use of that data—for example, when someone uses the following command:

$ puppet resource elephant

Warning
Do not implement anything in this section if the work required to read the state of all instances would be a drain on resources. For example, the file resource does not implement instances because reading the information about every file on the node would be very costly.

To disable preloading of instance data, define the method to return an empty array:

  def self.instances
    []
  end

To preload instance data, we simply need to construct the method to output each file in the directory and create a new object with the values:

  commands :ls  => 'ls'
  commands :cat => 'cat'

  def self.instances
    elephants = ls('/tmp/elephants/')
    # For each elephant...
    elephants.split("\n").collect do |elephant|
      attrs = Hash.new
      output = cat("/tmp/elephants/#{elephant}")
      # for each line in the file
      output.split("\n").each do |line|
        name, value = line.split(' = ', 2)
        # store the attribute, keyed by symbol like :name and :ensure
        attrs[name.to_sym] = value
      end
      # add the name and its status and then create the resource type
      attrs[:name] = elephant
      attrs[:ensure] = :present
      new( attrs )
    end
  end

The preceding code reads each assignment in the elephant file, and assigns the value to the name in a hash. It adds the elephant’s name to the hash, and sets ensure to present. Voilà, we have built this resource in memory! This preloads every instance of the resource, and makes the data available in the @property_hash instance variable.
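
The parsing step is the part most likely to trip you up, so here it is on its own as a plain-Ruby sketch you can run without Puppet. The parse_elephant helper is a hypothetical name; it builds the same kind of attribute hash that would be handed to new().

```ruby
# Hypothetical helper: turn the text of one elephant file into the
# attribute hash that would be handed to new().
def parse_elephant(name, text)
  attrs = { name: name, ensure: :present }
  text.each_line do |line|
    key, value = line.chomp.split(' = ', 2)
    next if value.nil?          # skip blank or malformed lines
    attrs[key.to_sym] = value
  end
  attrs
end

p parse_elephant('horton', "color = grey\nmotto = A person is a person\n")
```

Limiting the split to two parts keeps any later ' = ' in the value intact.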

When the instances method is available, puppet agent and puppet apply will load all instances in this manner, and then match up resources in the database to the provider that returned their values.

Taking Advantage of Caching

If all of your instances are cached in memory, then you don’t need to read from disk every time. This means you can simplify your exists? method to check the cached value:

  def exists?
    @property_hash[:ensure] == :present
  end

By the same token, you need to ensure that any resources created or deleted are also updated in memory:

  def create
    echo("color = #{resource['color']}",'>',filename)
    echo("motto = #{resource['motto']}",'>>',filename)
    @property_hash[:ensure] = :present
  end

  def destroy
    rm(filename)
    @property_hash[:ensure] = :absent
  end

Finally, how about each of those instance setter and getter methods? These would each be identical, as they would be just setting or reading from the hash. There is a convenience method, mk_resource_methods, which would define all resource attribute methods as follows:

  def color
    @property_hash[:color] || :absent
  end
  def color=(value)
    @property_hash[:color] = value
  end

Place the mk_resource_methods invocation near the top of the provider. You can then override any one or more of the default methods it creates. On a resource type with 20 or more attributes, this convenience method will save you a lot of typing!
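
If you are curious how a single call can produce all of those methods, the following plain-Ruby sketch generates the same getter and setter pairs with define_method. The SketchProvider class and its make_accessors method are hypothetical; Puppet's own implementation differs in detail.

```ruby
# Hypothetical base class that mimics the idea with define_method;
# Puppet's own implementation differs in detail.
class SketchProvider
  def self.make_accessors(*names)
    names.each do |name|
      # getter: read from the cache, or report :absent
      define_method(name) { @property_hash[name] || :absent }
      # setter: only update the cache
      define_method("#{name}=") { |value| @property_hash[name] = value }
    end
  end

  def initialize(values = {})
    @property_hash = values
  end
end

class ElephantSketch < SketchProvider
  make_accessors :color, :motto
end

horton = ElephantSketch.new({ color: 'grey' })
puts horton.color
puts horton.motto
horton.motto = 'A person is a person'
puts horton.motto
```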

However, now that you are only saving the changes back to a hash, you must define a flush() method to write the changes out to disk as described in “Adjusting Properties”.

Learning More About Puppet Providers

One of the most informative ways to debug issues with Puppet types is to run Puppet with the --debug argument. The debugging shows the selection and execution of the provider for a given node.

The best book to learn more about Puppet providers is Dan Bode and Nan Liu’s Puppet Types and Providers. That book covers providers much more exhaustively than this quick introduction.

The Puppet::Provider documentation provides a detailed reference to all methods available.

The Provider Development documentation includes many details of creating providers.

Identifying New Features

Puppet features are Ruby classes that determine if a specific feature is available on the target node. For an example, we will create an elephant feature that is activated if the node has elephants installed.

Features for Puppet are always written in Ruby. Each feature should be declared in a separate file, stored in the module’s lib/puppet/feature/ directory. The file should be named for the feature followed by the .rb extension. For our example, the filename will be elephant.rb.

Always start by requiring the puppet/util/feature library.

Define the feature by calling the Puppet.features.add() method with the name of the feature as a Ruby symbol. The code that validates if the feature is available should be defined within the block that follows:

require 'puppet/util/feature'

Puppet.features.add( :elephant ) do
  Dir.exist?('/tmp') and
  Dir.exist?('/tmp/elephants') and
  !Dir.glob('/tmp/elephants/*').empty?
end
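
You can try the test expression on its own before wrapping it in a feature. This sketch moves the same checks into a hypothetical elephants_installed? helper that takes the directory as a parameter, so it can be exercised against a temporary directory instead of /tmp/elephants.

```ruby
require 'tmpdir'

# Hypothetical helper: true only when the directory exists and
# contains at least one entry.
def elephants_installed?(dir)
  Dir.exist?(dir) && !Dir.glob(File.join(dir, '*')).empty?
end

Dir.mktmpdir do |tmp|
  herd = File.join(tmp, 'elephants')
  puts elephants_installed?(herd)   # directory missing
  Dir.mkdir(herd)
  puts elephants_installed?(herd)   # directory empty
  File.write(File.join(herd, 'horton'), "color = grey\n")
  puts elephants_installed?(herd)   # one elephant present
end
```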

You can simplify the expression of features that validate whether a certain library is installed by adding an optional :libs argument to the feature definition:

require 'puppet/util/feature'

Puppet.features.add( :openssl, :libs => %w{openssl} )

Binding Data Providers in Modules

Until Puppet 4, data for a module was provided in only three ways:

  • Provided as an attribute to the class declaration
  • Automatically looked up in Hiera, the only data provider available
  • Statically defined within the module itself

Puppet 4 has introduced tiered hierarchies of data sources for environments and modules. The lookup() function and automatic parameter lookup in classes use the following sources for data:

  • The global data provider (Hiera v3) configured in ${confdir}/hiera.yaml
  • The environment data provider specified in ${environmentpath}/environment/hiera.yaml
  • The module data provider specified in ${moduleroot}/hiera.yaml

To enable the Hiera data source for your module, create a hiera.yaml file in the module’s root directory.

Tip
Data providers in modules can replace the use of a params class for providing default data values.

Using Data from Hiera

Create a module data configuration file that defines the Hiera hierarchy. The file should be named hiera.yaml, much like the global Hiera configuration file, but it uses a v5 configuration format.

The file must contain a version key with value 5. The hierarchy in the file must be an array of hashes that define the data sources. The datadir used by each source will be a path relative to and contained within the module root. An example file is shown here:

---
version: 5
defaults:
  datadir: data
  data_hash: yaml_data

hierarchy:
  - name: "OS family"
    data_hash: json_data
    path: "os/%{facts.os.family}.json"

  - name: "common"
    path: "common.yaml"

Create the data/ directory indicated in the file, and populate Hiera data files with data specific to this module.
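
A quick way to catch mistakes in this file is to load it with Ruby's YAML library and check the two structural rules described earlier. The valid_module_hiera? helper is a hypothetical name, and it only checks the version key and the shape of the hierarchy; it is not a substitute for Puppet's own validation.

```ruby
require 'yaml'

# Hypothetical checker covering only two rules: version must be 5,
# and hierarchy must be an array of hashes.
def valid_module_hiera?(text)
  config = YAML.safe_load(text)
  return false unless config.is_a?(Hash) && config['version'] == 5
  hierarchy = config['hierarchy']
  hierarchy.is_a?(Array) && hierarchy.all? { |level| level.is_a?(Hash) }
end

sample = <<~YAML
  ---
  version: 5
  defaults:
    datadir: data
    data_hash: yaml_data
  hierarchy:
    - name: "common"
      path: "common.yaml"
YAML

puts valid_module_hiera?(sample)
puts valid_module_hiera?("version: 3\nhierarchy: []\n")
```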

Using Data from a Function

Any function can be used to provide data for Hiera, so long as it returns the data in the expected format. Create a function in your module’s namespace as described in “Defining Functions”. If your module is named specialapp, then the function should be named specialapp::something().

Assuming you created a function simply named data, the function can be a Puppet language function defined in functions/data.pp, or it can be a Ruby function defined in lib/puppet/functions/specialapp/data.rb.

A Ruby function should look something like the following. Replace this simple example function with your own:

Puppet::Functions.create_function(:'specialapp::data') do
  # Hiera 5 calls a data_hash function with an options hash and a lookup context
  dispatch :data do
    param 'Hash', :options
    param 'Puppet::LookupContext', :context
  end

  def data(options, context)
    # the following is just example code to be replaced
    # Return a hash with parameter name to value mapping for the user class
    return {
      'specialapp::user::id'   => 'Value for parameter $id',
      'specialapp::user::name' => 'Value for parameter $name',
    }
  end
end

Whether the function is defined in Puppet or Ruby, the function must return a hash that contains keys within the class namespace, exactly the same as how keys must be defined in the Hiera data.

To enable this data source, add it to the hierarchy of the hiera.yaml in the module’s directory. Because the function returns the complete hash of data, it is bound with the data_hash key:

  - name: "module data"
    data_hash: specialapp::data

The specialapp::data() function is now a data source for the specialapp module.

Performing Lookup Queries

For the example specialapp module, the new lookup strategy of global, then environment, then module data providers, would be queried as follows if you used the function data source for the module:

class specialapp(
  Integer $id,  # will check global Hiera, then environment data provider,
  String $user, # then call specialapp::data() to get all values
) {

If you used the Hiera data source for the module, then parameter values would be independently looked up in each data source:

class specialapp(
  Integer $id,  # will check global Hiera, then environment data provider,
  String $user, # then check module Hiera data
) {
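
The order of precedence can be modeled with three hashes. This plain-Ruby sketch assumes the default behavior where the first source that has the key wins; the tiered_lookup helper and the sample data are hypothetical, and merge strategies are not modeled at all.

```ruby
# Hypothetical in-memory stand-ins for the three tiers. The first
# source that contains the key supplies the value.
def tiered_lookup(key, global, environment, mod)
  [global, environment, mod].each do |source|
    return source[key] if source.key?(key)
  end
  nil
end

global_data      = { 'specialapp::id' => 100 }
environment_data = { 'specialapp::id' => 200, 'specialapp::user' => 'envuser' }
module_data      = { 'specialapp::user' => 'default', 'specialapp::port' => 8080 }

%w[specialapp::id specialapp::user specialapp::port].each do |key|
  value = tiered_lookup(key, global_data, environment_data, module_data)
  puts "#{key} => #{value.inspect}"
end
```

Module data only supplies a value when neither the global nor the environment tier has one, which is what makes it a good home for defaults.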

Requirements for Module Plugins

There were a lot of rules in this chapter around how module plugins are named and created. Let’s review them:

  • External fact programs should be placed in the facts.d/ directory and be executable by the puppet user. Windows fact providers need to be named with a known file extension.
  • External fact data should be placed in the facts.d/ directory and have a file extension of .yaml, .json, or .txt.
  • Functions written with the Puppet language should be placed in the functions/ directory and be named with a .pp file extension.
  • Ruby functions should be placed in the lib/puppet/functions/modulename/ directory and be named the same as the function with a .rb file extension.
  • Ruby features should be placed in the lib/puppet/feature/ directory and be named the same as the feature with a .rb file extension.
  • Ruby functions or templates that call custom functions need to use the call_function() method of the scope object.
  • Ruby functions or templates that call custom functions need to pass all input parameters in a single array.

You can find more detailed guidelines in “Plugins in Modules” on the Puppet docs site.

Reviewing Module Plugins

In this chapter, we covered how to extend a module to:

  • Provide new facts that will be available for any module to reference.
    • A program can output one fact name and value per line.
    • Facts can be written in Ruby, and use Ruby language features.
    • Facts can be read from YAML, JSON, or text file formats.
  • Provide new functions that will be available for any module to reference.
    • Functions can be written in the Puppet language and use all Puppet features.
    • Functions can be written in Ruby, and use all Ruby language features.
  • Bind a custom function as a data lookup source.
    • This data source will be queried after the global data provider (Hiera), and the environment data provider (if defined).

As you can see, module plugins can provide powerful new features and functionality.
