The previous chapter covered creating job templates and workflows in the Automation controller. This chapter goes into detail about designing playbooks and job nodes that take full advantage of workflows. This includes creating nodes that surface information so that users do not need to hunt through a playbook's output, and using approval nodes to gather user input before a workflow is allowed to continue. Notifications are used to alert users and services when specific events occur.
In this chapter, we will cover the following topics:
All the code referenced in this chapter is available at https://github.com/PacktPublishing/Demystifying-Ansible-Automation-Platform/tree/main/ch11. It is assumed that you have Ansible installed to run the code provided.
Most users who have been using Ansible from the command line design their playbooks and roles to encompass everything for a given scenario in a single playbook. These monolithic playbooks account for various scenarios in a single play, such as deploying a high-availability SQL server with a load balancer on multiple servers from scratch. However, when using workflows, it makes more sense to break the work up into various parts that are run separately. The advantages of doing this are that you can anticipate and account for failures, pass information from one job node to another, and pause for user approval during workflow execution.
Workflows can take advantage of a special set of extra variables that are passed from one node to another. This can be achieved by using a specific Ansible module: set_stats. These can be created with the following playbook task:
//review_results.yaml
- set_stats:
    data:
      list_to_pass: "{{ list_to_pass }}"
      host_groups: "{{ group_list }}"
    aggregate: false
    per_host: false
The set_stats module has the following options: data, the dictionary of variables to set; aggregate, which controls whether the data is merged with any existing stats (it defaults to true); and per_host, which controls whether the stats are tracked per host rather than for the whole run (it defaults to false).
Variables set with set_stats are inherited by any job in the workflow that occurs after that job is finished. An example workflow can be seen in the following diagram:
Figure 11.1 – Workflow illustration for converging nodes
Any variables set with set_stats in Job 1 or Job 2 will be inherited in Job 3 and any job after that. However, in this case, the merger of Job 1 and Job 2 is not controlled or defined, so be sure to set variables to unique names so that there is no overlap. The order of precedence for variables in jobs and workflows is dictated by the following list. The further you must go down the list, the higher the priority for precedence:
It is important to keep this variable precedence in mind when designing workflows, job templates, and Ansible playbooks to use in the Automation controller. The official list regarding variable precedence can be found here: https://docs.ansible.com/automation-controller/latest/html/userguide/job_templates.html#ug-jobtemplates-extravars.
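As a minimal illustration of this behavior (the filename and variable name here are invented for the example), extra variables passed in by the controller override values defined in the play itself:

```yaml
# precedence_demo.yaml — hypothetical example of extra_vars precedence
- hosts: localhost
  gather_facts: false
  vars:
    greeting: "from the play"    # overridden when the job supplies greeting as an extra variable
  tasks:
    - debug:
        msg: "{{ greeting }}"    # prints the extra_vars value when one is passed
```

Running this as a job template with greeting set in the extra variables (or a survey) prints that value instead of the one defined in vars.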
In addition, if you want to set a single set_stats variable that combines variables from each host, the following task can be used:
//set_vars.yaml
- name: "set a stat that has variables per host"
  set_stats:
    data:
      var_per_host: "{{ var_per_host | default({}) | combine({inventory_hostname: show_vars}) }}"
This will take a previous set_stats variable that was created for each host, and then save it for a future job node to use. An example of the result is as follows:
var_per_host:
hostname1:
other_info: 1
hostname2:
other_info: 2
hostname3:
other_info: 3
The var_per_host variable can then be used in a later playbook/job template using the following variable reference:
"{{var_per_host[inventory_hostname]}}"
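A later job node's playbook can then pull out just the entry for the host it is working on. The following task is a sketch (the task name is illustrative) that assumes var_per_host was set by an earlier node as shown previously:

```yaml
# Hypothetical task in a later job node's playbook
- name: "Use the per-host data passed from the previous node"
  debug:
    msg: "{{ var_per_host[inventory_hostname] }}"   # resolves to this host's entry only
```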
Using set_stats in this manner takes advantage of job artifacts so that you can pass variables to future job nodes and playbooks. The variables that are used can be large dictionaries that are kept, such as data on an entire router configuration. However, with automation, you may need someone to review what is being done before pushing out the change. One part of this is creating a job node that distills information that users need to review.
With some workflows where a lot of information is being parsed and a review is needed before final approval, it is best to set up a simple playbook that displays information to the user. This can be done with either a set_stats or debug task in the playbook.
The debug task is normally used for troubleshooting, but it does display a variable in full in the job’s output:
//set_vars.yaml
- debug:
    var: var_name
This will appear as follows in the job’s output:
Figure 11.2 – Debug output
With set_stats, especially if the previous node has created many artifacts, it can be a good idea to prefix the variable name with an underscore. This sorts it to the top of the list so that a user can easily find it:
//set_vars.yaml
- name: "set a stat that has variables per host"
  set_stats:
    data:
      _var_per_host: "{{ var_per_host }}"
This variable will show up at the top of the artifacts window on the Automation controller, as follows:
Figure 11.3 – Workflow artifacts with underscores go to the top
As mentioned previously, these are useful when using approval nodes, as covered in the next section.
Approval nodes are useful for when a manager, another team, or someone else needs to review information from a playbook before moving forward with a workflow. Approval nodes were covered in Chapter 10, Creating Job Templates and Workflows. The following is the Configuration as Code (CAC) definition of a workflow approval node in a workflow:
- identifier: Approval Node
  unified_job_template:
    description: Approval node for example
    timeout: 900
    type: workflow_approval
    name: Approval to continue
These approval nodes pause a workflow until either they time out or a user approves or denies them. If workflows and job templates are not set to run concurrently, this can cause a bottleneck that stalls other jobs. This can lead to problems if users do not respond to the approval notifications, which will be discussed in the Notifications and how to integrate them section.
If a user denies an approval node, it will mark the workflow as failed. The node behaves as if the approval job failed in terms of logic. If this is the last node in the logical flow, then the workflow will be marked as failed. To avoid the workflow being marked as failed, it might be useful to set the workflow’s state as successful after a denied approval.
In some cases, some conditions result in failed jobs, which then cause a workflow to fail. This can be due to a variety of things, such as certain checks failing or the user deciding to deny the approval node. Another case would be when a log aggregation service checks the status of the workflows and reports the number that failed throughout the day. In all of these cases, it can be useful to incorporate jobs with tasks that dictate the flow of a workflow. Two playbooks are used in such a case: a success playbook and a fail playbook.
The following is a simple success playbook, though it really could be any valid playbook:
//success_playbook.yaml
- debug:
    msg: Approval Denied, Marking workflow as Successful.
An example of when to use this is a workflow that runs checks and preparatory work without making any changes, then has a user review the results before approving or denying the node to continue and finish the work. If the user chooses to deny, the workflow will be marked as failed, which doesn’t look good on reports when this happens hundreds of times a day. This is why the workflow should be marked as a success, since it did what it was intended to do.
The fail playbook, however, does rely on the fail module to make sure the playbook fails:
//fail_playbook.yaml
- name: "Mark workflow/template as Failed"
  fail:
    msg: "Set Status to Fail because an error occurred in the workflow/template."
While these are two simple playbooks, they help determine the outcomes of the workflows when there has been an acceptable failure. They can also be used to force a workflow failure, depending on logic, and trigger a notification. The next section will go into using notifications so that a message can be sent if an event occurs.
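One way to wire these playbooks in is to hang the job template containing the success playbook off the approval node's failure path. The following sketch uses the same CAC workflow node schema as the approval node example earlier in this chapter; the identifiers and job template names are invented for illustration:

```yaml
# Hypothetical workflow node layout: a denied (or timed out) approval
# runs the "Mark Successful" job template instead of failing the workflow
- identifier: Approval Node
  unified_job_template:
    name: Approval to continue
    type: workflow_approval
  related:
    success_nodes:
      - identifier: Do the work        # approval granted: continue the workflow
    failure_nodes:
      - identifier: Mark Successful    # denied: run the success playbook
```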
Notifications are ways to send messages somewhere when an event occurs in the Automation controller. These notifications can be sent through a variety of mediums, such as emails, Slack messages, or webhooks.
These are extremely useful for approvals so that users respond to them, but also for critical pieces such as an inventory failure. The important part is that these are knobs that can be turned on and off as needed.
The events that can trigger a notification are as follows:
Each of these sets a state for when to send a notification. However, these events only pertain to some things in the Automation controller. The following objects in the Automation controller can have notifications set. All of these contain Start, Success, and Failure options:
Inventory and project failures can interrupt how all the jobs are run, bringing work to a critical stoppage. It is recommended that failure events be turned on for these. However, different groups communicate in different ways. Some prefer emails or Slack notifications. The following notification services can be used to send a notification:
Now that we’ve learned what notifications are and where they are used, let’s learn how to configure notification templates.
The different options will be broken up into three sections: notification options, configuration, and messages. The GUI fields will be represented with Name. If the module and role fields are different other than being in lowercase, they will be represented as name.
For the GUI, roles, and modules, we can use the following options:
Because each of the services has different configuration options, it is best to consult the documentation at https://docs.ansible.com/automation-controller/latest/html/userguide/notifications.html for each setting.
This section will not cover every notification service, but it will cover two of the most popular ones: email and Slack.
The following are the configuration settings for email. Here, we are using Gmail as an example. The steps for setting up an app password for Google can be found at https://support.google.com/accounts/answer/185833.
Each of these settings is straightforward, such as the host, sender, and recipient. This example uses TLS and port 587 to contact the email server. The full configuration is as follows:
//notifications/set_notifications_with_roles.yaml
notification_configuration:
  username: [email protected]
  password: tnbfksktebythfwv
  host: smtp.gmail.com
  recipients:
    - [email protected]
  sender: [email protected]
  port: 587
  timeout: 60
  use_tls: true
Another popular option for notifications is Slack; the next section will cover how to configure Slack notifications.
As for Slack, there are only a few options. However, it does require a Slack application to be created and a token for said app. The links for app creation can be found in the Automation controller documentation linked at the beginning of the Notification options section. The following configuration options must be used:
//notifications/set_notifications_with_roles.yaml
notification_configuration:
  channels:
    - group_approval_notification
  token: "{{ notification_slack_oath }}"
  use_ssl: false
  use_tls: false
Now that we’ve learned how to configure both email and Slack, the next section will focus on customized notification messages. Some services will want a JSON object to be sent, while others will allow custom text to be sent to users. Customizing notification messages is important so that the right information gets sent.
Each notification type can send custom messages. For email, this includes a subject (categorized as the message) and a body – the actual body sent in the mail. Slack, on the other hand, only supports the message; there is no body. These fall under the following areas, which are nested dictionaries used in the modules and roles:
messages:
  started
  success
  error
  workflow_approval:
    approved
    denied
    running
    timed_out
Each of these contains the message or body to send when the appropriate notification is triggered. For example, the following message has been crafted for Slack:
//set_notifications_with_roles.yaml
messages:
  workflow_approval:
    running:
      body: ""
      message: 'The approval node "{ { approval_node_name }}" needs review. This node can be viewed at: { { workflow_url }}, job data: { { job_metadata }}'
Note
The double space between the curly braces, { {, prevents Ansible from interpreting the variable. The roles are built to replace this using a regex_replace filter so that the correct information is sent to the Automation controller. Without such a filter, it is difficult to send the literal variable text using the module or the URI.
The body field is not set here, as Slack does not use it. The variables that can be used in these messages are limited. They include the following:
Each of these variables can be used to craft custom messages, but the messages are limited to text and these variables. To send an even more customized message, a playbook and task would need to be used, such as to send an email with an attachment. Some additional tweaking can be done to limit the number of approval messages that are sent.
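For example, a job node could use the community.general.mail module to send a report with an attachment, something the built-in notification templates cannot do. The following is a sketch; the host, credential variables, recipient, and file path are all placeholders:

```yaml
# Hypothetical task: email a generated report with an attachment
- name: "Email the workflow report with an attachment"
  community.general.mail:
    host: smtp.example.com
    port: 587
    secure: starttls
    username: "{{ mail_user }}"
    password: "{{ mail_password }}"
    to: [email protected]
    subject: "Workflow report"
    body: "The full report is attached."
    attach:
      - /tmp/workflow_report.txt
```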
While each of the aforementioned options can be turned on/off for messages to be sent, such as success or error, for workflow approval, it is not possible to tweak the suboptions in that way. This means that if the workflow approval notifications are turned on, an approved, denied, running, or timed out message will always be sent. Here, the workaround is to use a message body that contains no data as it will not send a blank message:
//notifications/set_notifications_with_roles.yaml
workflow_approval:
  approved:
    message: "{ { job.name }}"
By using the job.name variable, which does not exist for an approval node, the message is never sent. This allows each option to be tweaked to dictate the necessary behavior.
The notification template module can be invoked from a task. The following is an excerpt from the //notifications/set_notifications_using_module.yml file. The full file can be found in this book’s GitHub repository:
- name: Add Slack notification with custom messages
  ansible.controller.notification_template:
The first section creates the necessary notification:
    name: Slack_approval_notification
    organization: Default
    notification_type: slack
The second section creates the necessary configuration for Slack:
    notification_configuration:
      channels:
        - notification_test
      token: xoxb-1234
The last section sets the custom messages to be sent; here, a custom message is defined for the started event:
    messages:
      started:
        message: "{{ '{{ job_friendly_name }}{{ job.id }} started' }}"
When deciding between the GUI, module, or role, the module is the one that often goes unused. However, it is important to know how it works. When maintaining and updating notification templates, it makes more sense to make use of roles.
Let’s learn how to create and maintain notification templates with roles. The notification templates role takes a list of notifications with options and applies them to the Automation controller:
//notifications/set_notifications_with_roles.yml
controller_notifications:
The first section creates the necessary notification:
  - name: Gmail notification
    description: Notify us on Google
    organization: Default
    notification_type: email
The second section creates the necessary configuration for email:
    notification_configuration:
      username: [email protected]
      password: tnbfksktebythfwv
      host: smtp.gmail.com
      recipients:
        - [email protected]
      sender: [email protected]
      port: 587
      timeout: 60
      use_tls: true
The last section sets the custom messages to be sent; here, a custom body is defined for the success event:
    messages:
      success:
        body: '{"fields": {"project": {"id": "11111"},"summary": "Lab { { job.status }} Ansible Controller { { job.name }}","description": "{ { job.status }} in { { job.name }} { { job.id }} { { url }}","issuetype": {"id": "1"}}}'
The notification role allows users to maintain the Automation controller’s configuration as code.
The redhat_cop.controller_configuration.notification_templates role was used in the preceding playbook to push the configuration to the Automation controller.
Which notification service to use depends on the preferences of the group that wants to receive the notifications. Notifications can be sent for any combination of events and services. For example, a valid configuration could include a Slack channel and multiple email notifications, each with a separate list of email recipients. These configuration options can be seen in the workflow notification settings tab:
Figure 11.4 – Notifications settings for a workflow
As mentioned in the Using and configuring notifications section, notifications can be added to inventory sources, projects, job templates, and workflow templates. In the GUI, it is as simple as toggling the option to turn it on, as illustrated in the preceding screenshot. In the modules and roles, each of the previously mentioned objects accepts a list of notifications to turn on. An example of this can be found in both playbooks in the //notifications folder for this chapter. For example, to replicate the notifications shown in the preceding screenshot, you could use the following dictionaries with list items:
notification_templates_error:
  - Gmail notification
notification_templates_started: []
notification_templates_success:
  - Slack_notification
notification_templates_approvals:
  - Gmail notification
  - Slack_notification
This can be applied to anything that uses a notification template. As mentioned previously, it may make sense to make a few different templates for the same service. While there is no limit to the number of notifications that can be made, there is a limit to users’ patience, so use them wisely.
This chapter covered advanced workflow options and notifications. Some of these will not be applicable in every situation, but they are useful tools. If there is a workflow that takes over 10 minutes or even an hour to complete, some users may prefer a way to get notified of the need to intervene with an approval node or a job failure. The logic that’s used in workflows can take advantage of artifacts and other methods to become as simple or as complex as needed.
The next chapter will cover the use of CI/CD and ways to interact with the Automation controller using outside tasks and services.