Automating MariaDB data loading with Ansible

MariaDB is a good choice for this chapter because it offers a middle-of-the-road view when it comes to database management with Ansible. There is some native module support in Ansible, but it does not cover every task you might want to execute. As a result, we will develop the following example, which automates the loading of a sample set of data using just the Ansible shell module. We will then develop this to show how the same job would be completed with the mysql_db module, to provide you with a direct comparison between the two automation techniques.

Note that the following examples performed using the shell module could be adapted for almost any database you can manage from the command line, and so it is hoped these will provide you with a valuable reference for automating your database management tasks.

In terms of example databases, we will work with the publicly available Employees sample database, as this is available to everyone reading this book. You can, of course, choose your own set of data to work with—however, it is, as ever, hoped that this following practical example will teach you the skills you need to load data into your newly installed database with Ansible:

  1. To start with, let's create a role called loadmariadb. Into the roles directory structure, create a directory called files/, and clone the employees sample database. This is publicly available on GitHub, and, at the time of writing, could be cloned using the following command:
$ git clone https://github.com/datacharmer/test_db.git
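Once cloned, the role directory should look something like the following sketch (the tasks/ and vars/ directories are created in the steps that follow; this layout is illustrative rather than prescriptive):

```
roles/
  loadmariadb/
    files/
      test_db/    <- the cloned repository
    tasks/
    vars/
```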
  2. From here, we create a tasks/ directory within the role and write the code for our role tasks themselves. To start with, we need to copy the database files across to our database server, by running the following code:
---
- name: Copy sample database to server
  copy:
    src: "{{ item }}"
    dest: /tmp/
  loop:
    - files/test_db/employees.sql
    - files/test_db/load_departments.dump
    - files/test_db/load_employees.dump
    - files/test_db/load_dept_emp.dump
    - files/test_db/load_dept_manager.dump
    - files/test_db/load_titles.dump
    - files/test_db/load_salaries1.dump
    - files/test_db/load_salaries2.dump
    - files/test_db/load_salaries3.dump
    - files/test_db/show_elapsed.sql
  3. Once the data files are copied to the server, it is simply a matter of loading them into the database. However, as there is no module for this task, we must revert to a shell command to handle this, as shown in the following code block:
- name: Load sample data into database
  shell: "mysql -u root --password={{ mariadb_root_password }} < /tmp/employees.sql"
  args:
    chdir: /tmp
  4. The role tasks themselves are simple enough. However, before we can run the playbook, we need to set the mariadb_root_password variable, ideally in a vault, but for simplicity in this book, we will put it in a plaintext vars file in the role. The file vars/main.yml should look like this:
---
mariadb_root_password: "securepw"

As you will have spotted, this playbook assumes that you already installed and configured MariaDB in a previous role: the password used in the preceding code block is the one set in the previous section, when we installed MariaDB and secured it using Ansible.
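To make that assumption concrete, the role might be applied by a top-level playbook along the following lines. This is a sketch only; the host group name (databases) and the name of the installation role from the previous section (installmariadb) are assumptions for illustration:

```yaml
---
# Hypothetical site playbook applying both roles in order;
# adjust the host group and role names to match your setup.
- hosts: databases
  become: true
  roles:
    - installmariadb   # installs and secures MariaDB (previous section)
    - loadmariadb      # the role developed in this section
```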

  5. Running the playbook should yield results like this:

Here, we have not only loaded a sample schema, but also sample data, into our database. In your enterprise, you could choose to perform either of these tasks in isolation, as required.

You will probably have spotted that this playbook is extremely dangerous. As we discussed previously, the issue with using the shell module in Ansible playbooks is that the shell command is always run, whether it needs running or not, so the results of the task will vary. Thus, if you ran this playbook against a server with an existing database called employees, it would overwrite all the data in it with the sample data! Contrast this with the copy module, which only copies a file when it differs from the copy already present on the receiving end.

Given the lack of native database modules at the time of writing, we need to devise a more intelligent way of running this command. Here, we can make use of some of the clever error handling built into Ansible.

The shell module assumes that the command it is running has run successfully if it returns exit code zero. This results in the task returning the changed status we saw in this playbook run. However, if the exit code is not zero, the shell module will instead return a status of failed.

We can take advantage of this knowledge, and couple it with a useful MariaDB command that will return a zero exit code if the database we query exists, and non-zero if it doesn't. See the following screenshot for an example:

We can make use of this command by running it before our task that loads the data. We can ignore any errors from the command, and instead register them in a variable. We use this to conditionally run the data load, loading it only if an error occurred (this is the instance where the database does not exist, and so it is safe to load the data).

The copy task remains the same, but the tail end of the tasks now looks like this:

- name: Check to see if the database exists
  shell: "mysqlshow -u root --password={{ mariadb_root_password }} employees"
  ignore_errors: true
  register: dbexists

- name: Load sample data into database
  shell: "mysql -u root --password={{ mariadb_root_password }} < /tmp/employees.sql"
  args:
    chdir: /tmp
  when: dbexists.rc != 0

Now, we will only load the data if the database doesn't exist. This code has been kept simple for the sake of providing an example, and it is left to you to enhance it—for example, by putting the filenames and database name into a variable so that the role becomes reusable in a variety of circumstances (which, after all, is one of the goals of writing a role). 
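As a sketch of that enhancement (the variable names here are suggestions, not part of the original code), the role could define the database name and file list as defaults, which the tasks then reference:

```yaml
---
# defaults/main.yml (suggested variable names, override per use)
sampledb_name: employees
sampledb_sql_file: /tmp/employees.sql
sampledb_files:
  - files/test_db/employees.sql
  - files/test_db/load_departments.dump
  # ...and the remaining dump files listed earlier...
```

The tasks would then use, for example, `mysqlshow -u root --password={{ mariadb_root_password }} {{ sampledb_name }}` for the existence check, and loop over `{{ sampledb_files }}` in the copy task, so that the same role can load any data set.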

If we now run this code, we can see that it operates as desired—on the first run, the data is loaded, as the following screenshot shows:

However, on the second run, it is not—the following screenshot shows the playbook being run a second time, and the data load task being skipped because the database exists:

Although these examples are specific to MariaDB, the high-level process performed here should work with just about any database. The key element is to use the shell module to load the data and/or schema, but to do so in a way that reduces the risk of a valid database getting overwritten in the event that the playbook gets run twice. You should extend this logic to any other task you perform: your ultimate goal should be that if your playbook is run unintentionally, no damage is done to the existing database.

Having completed this example, it is worth noting that Ansible does provide a module called mysql_db, which can natively handle tasks such as dumping and importing database data. Let's now develop an example that makes use of the native mysql_db module:

  1. If we were to develop a role to perform exactly the same task as shown previously, but using this native module, we would first of all check to see if the database exists as before, registering the result to a variable, like this:
---
- name: Check to see if the database exists
  shell: "mysqlshow -u root --password={{ mariadb_root_password }} employees"
  ignore_errors: true
  register: dbexists
  2. We then create a block in our tasks file, as there is no point running any of the tasks after this step if the database exists. The block uses the when clause we used before, to determine whether the tasks inside it should run or not, as follows:
- name: Import new database only if it doesn't already exist
  block:
    # The tasks from the following steps go here
  when: dbexists.rc != 0
  3. Inside the block, we copy over all the SQL files to import just as we did before, like this:
  - name: Copy sample database to server
    copy:
      src: "{{ item }}"
      dest: /tmp/
    loop:
      - files/test_db/employees.sql
      - files/test_db/load_departments.dump
      - files/test_db/load_employees.dump
      - files/test_db/load_dept_emp.dump
      - files/test_db/load_dept_manager.dump
      - files/test_db/load_titles.dump
      - files/test_db/load_salaries1.dump
      - files/test_db/load_salaries2.dump
      - files/test_db/load_salaries3.dump
      - files/test_db/show_elapsed.sql
  4. Now, an important difference emerges between using the shell module and mysql_db. When using the shell module, we used the chdir argument to change the working directory to /tmp, which is where all our SQL files were copied to. The mysql_db module has no chdir (or equivalent) argument, and so will fail when it tries to load the *.dump files that are sourced via employees.sql. To work around this, we use the Ansible replace module to add the full path to these files in the appropriate lines of employees.sql, as follows:
  - name: Add full paths to employees.sql as mysql_db won't know where to load them from otherwise
    replace:
      path: /tmp/employees.sql
      regexp: '^source (.*)$'
      # The \1 backreference re-inserts the captured filename after the new path
      replace: 'source /tmp/\1'
  5. Finally, we use the mysql_db module to load in the data (this is analogous to the shell command we performed in our earlier example), as follows:
  - name: Load sample data into database
    mysql_db:
      name: all
      state: import
      target: /tmp/employees.sql
      login_user: root
      login_password: "{{ mariadb_root_password }}"
  6. When we run this code, it achieves the same end result as our previous role that used the shell module, as the following screenshot shows:

This process works equally well for backing up databases, too. If you were to use the shell module, you could use the mysqldump command to back up a database, and then copy the backed-up data to your Ansible host (or indeed, another) for archiving. A simple piece of example code to achieve this might be constructed as follows:

  1. As we want the backup filename to be dynamic and include useful information such as the current date and hostname on which the backup is being performed, we use the set_fact module, along with some internal Ansible variables, to define a filename for the backup data, as follows:
---
- name: Define a variable for the backup file name
  set_fact:
    db_filename: "/tmp/{{ inventory_hostname }}-backup-{{ ansible_date_time.date }}.sql"
  2. We then use the shell module to run mysqldump, with the appropriate parameters for creating a backup—going into depth on these is beyond the scope of this book, but the following example creates a backup of all databases on your server, without locking the tables during the backup:
- name: Back up the database
  shell: "mysqldump -u root --password={{ mariadb_root_password }} --all-databases --single-transaction --lock-tables=false --quick > {{ db_filename }}"
  3. The fetch module is then used to retrieve the data for archiving—fetch works just like the copy module that we used earlier in this section, except that it copies data in the reverse direction (that is, from the inventory host to the Ansible server). Run the following code:
- name: Copy the backed up data for archival
  fetch:
    src: "{{ db_filename }}"
    dest: "/backup"
  4. Running this in the usual manner results in a complete backup of the database, with the resulting file being copied to our Ansible server, as the following screenshot shows:

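Note that, by default, fetch stores the retrieved file under a host-specific subdirectory of dest (something like /backup/hostname/tmp/filename.sql), which keeps backups from different hosts separate. If you would rather have the file written directly into the destination directory, the module's flat parameter can be used, as in this variant of the preceding task:

```yaml
# Variant of the fetch task: flat=true writes the file directly
# into /backup/ using its basename, instead of recreating the
# remote path under a per-host subdirectory.
- name: Copy the backed up data for archival
  fetch:
    src: "{{ db_filename }}"
    dest: "/backup/"
    flat: true
```

With flat enabled, be aware that backups from multiple hosts could collide if the filenames are not unique; the hostname embedded in db_filename by the earlier set_fact task avoids this.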
This example could also be achieved using the mysql_db module, just as we did before—the set_fact and fetch tasks remain exactly the same, while the shell task is replaced with the following code:

- name: Back up the database
  mysql_db:
    state: dump
    name: all
    target: "{{ db_filename }}"
    login_user: root
    login_password: "{{ mariadb_root_password }}"

Thus, Ansible can assist you both with loading data into your databases and backing it up. As we have discussed previously, it is generally better to use the native Ansible modules (such as mysql_db) where they are available, but, provided you apply the correct logic to it, the shell module can assist you, if native modules don't exist or provide the functionality you need.

Now that we have considered the process of creating databases and loading data into them, we will proceed in the next section to demonstrate how to build on this work, to perform routine database maintenance with Ansible.
