Querying the database

Grabbing data from the frontend is quick and simple, but this method is unsuitable for large volumes of data and is hard to automate: parsing the frontend pages can be done, but it isn't the most efficient way of obtaining data. Another way to get at the data is to query the database directly.

We'll look at the Zabbix API a bit later. It's suggested to use the API unless there are performance issues.

Let's find out how historical data is stored. Launch the MySQL command-line client (simply called mysql, and usually available in your PATH) and connect to the zabbix database as the zabbix user:

$ mysql -u zabbix -p zabbix

When prompted, enter the zabbix user's password (which you can remind yourself of by looking at the contents of zabbix_server.conf) and execute the following command in the MySQL client:

mysql> show tables;

This will list all of the tables in the zabbix database: exactly 144 in Zabbix 4.0. That's a lot of tables to figure out, but for our current need (getting some historical data out), we'll only need a few. First, the most interesting ones: the tables that contain gathered data. All historical data is stored in tables whose names start with history. As you can see, there are several of these with different suffixes; why is that? Zabbix stores retrieved data in different tables depending on the data type. The relationship between the types in the Zabbix frontend and the database tables is as follows:

  • history: Numeric (float)
  • history_log: Log
  • history_str: Character
  • history_text: Text
  • history_uint: Numeric (unsigned)
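
The table an item ends up in is determined by its data type, which is also stored in the database, in the value_type column of the items table. A minimal sketch (assuming the usual encoding of that column: 0 for Numeric (float), 1 for Character, 2 for Log, 3 for Numeric (unsigned), and 4 for Text) that lists the types of the first few items:

mysql> select itemid, name, value_type from items limit 5;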

To grab the data, we first have to find out the data type for that particular item. The easiest way to do that is to open the item properties and observe the Type of information field (or check the value_type column in the items table, as sketched above). We can try taking a look at the contents of the history table by retrieving all fields for three records:

mysql> select * from history limit 3;

The output will show us that each record in this table contains four fields (your output will have different values):

+--------+------------+--------+-----------+ 
| itemid | clock      | value  | ns        | 
+--------+------------+--------+-----------+ 
|  23668 | 1430700808 | 0.0000 | 644043321 | 
|  23669 | 1430700809 | 0.0000 | 644477514 | 
|  23668 | 1430700838 | 0.0000 | 651484815 | 
+--------+------------+--------+-----------+ 

The preceding output values are explained as follows:

  • The next-to-last field, value, is quite straightforward—it contains the gathered value
  • The clock field contains the timestamp in Unix time—the number of seconds since the so-called Unix epoch, 00:00:00 UTC on January 1, 1970
  • The ns column contains nanoseconds inside that particular second

An easy way to convert the Unix timestamp into a human-readable form that doesn't require an internet connection is using the GNU date command: date -d@<timestamp>. For example, date -d@1234567890 will return Sat Feb 14 01:31:30 EET 2009.
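
The conversion can also be done on the database side: MySQL's FROM_UNIXTIME() function turns the clock column into a readable date and time. A small sketch, reusing the query from before:

mysql> select itemid, from_unixtime(clock), value from history limit 3;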

The first field, itemid, is the most mysterious one. How can we determine which ID corresponds to which item? Again, the easiest way is to use the frontend. You should still have the item properties page open in your browser, so take a look at the address bar. Along with other variables, you'll see part of the string that reads like itemid=23668. Great, so we already have the itemid value on hand. Let's try to grab some values for this item from the database:

mysql> select * from history where itemid=23668 limit 3;    

Use the itemid value that you obtained from the page URL:

+--------+------------+--------+-----------+ 
| itemid | clock      | value  | ns        | 
+--------+------------+--------+-----------+ 
|  23668 | 1430700808 | 0.0000 | 644043321 | 
|  23668 | 1430700838 | 0.0000 | 651484815 | 
|  23668 | 1430700868 | 0.0000 | 657907318 | 
+--------+------------+--------+-----------+ 

The resulting set contains only values from that item, as evidenced by the itemid field in the output.
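
If the frontend isn't handy, the itemid can also be looked up in the database itself: the items table stores each item's name and key, and its hostid column references the hosts table. A sketch, where the key pattern is just a made-up example; substitute one of your own item keys:

mysql> select h.host, i.itemid, i.name, i.key_ from items i join hosts h on h.hostid=i.hostid where i.key_ like 'system.cpu.load%';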

We'll usually want to retrieve values from a specific period. Guessing Unix timestamps isn't entertaining, so we can again use the date command to figure out the opposite—a Unix timestamp from a date in human-readable form:

$ date -d "2016-01-13 13:13:13" "+%s"
1452683593

The -d flag tells the date command to show the specified time instead of the current time, and the %s format sequence instructs it to output in Unix timestamp format. This fancy little command also accepts more free-form input, such as last Sunday or next Monday.
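
If you'd rather stay inside the MySQL client, the UNIX_TIMESTAMP() function is roughly the counterpart of date +%s. Keep in mind that it interprets the supplied date and time in the MySQL session's time zone, so the result can differ from the shell command if the two time zones are set differently. A sketch:

mysql> select unix_timestamp('2016-01-13 13:13:13');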

As an exercise, figure out two recent timestamps half an hour apart, then retrieve values for this item from the database. Hint—the SQL query will look similar to this:

mysql> select * from history where itemid=23668 and clock >= 1250158393 and clock < 1250160193; 

You should get back some values. To verify the period, convert the returned clock values back into a human-readable format. The obtained information can now be passed to any external applications for analyzing, graphing, or comparing.
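
If all you need is a quick summary of that period rather than the raw values, the aggregation can also be pushed into SQL. A sketch, reusing the example itemid and timestamps from the hint above:

mysql> select count(*), min(value), avg(value), max(value) from history where itemid=23668 and clock >= 1250158393 and clock < 1250160193;

This is, in essence, what the trends table discussed next stores for every hour.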

The history* tables contain the raw data, so we can get a lot of information out of them. But sometimes we might only want the bigger picture, and that's when the trends table can help. Let's find out what exactly this table holds. In the MySQL client, execute this:

mysql> select * from trends limit 2;

We're now selecting two records from the trends table:

+--------+------------+-----+-----------+-----------+-----------+ 
| itemid | clock      | num | value_min | value_avg | value_max | 
+--------+------------+-----+-----------+-----------+-----------+ 
|  23668 | 1422871200 |  63 |    0.0000 |    1.0192 |    1.4300 | 
|  23668 | 1422874800 | 120 |    1.0000 |    1.0660 |    1.6300 | 
+--------+------------+-----+-----------+-----------+-----------+ 

Just like the history tables have history and history_uint, there are trends and trends_uint tables for the Numeric (float) and Numeric (unsigned) types of information. There are no corresponding _log, _str, or _text tables, as trend information can be calculated for numeric data only.

Here, we find two familiar friends, itemid and clock, whose purpose and usage we just discussed. The last three fields are quite self-explanatory: value_min, value_avg, and value_max contain the minimal, average, and maximal values of the data. But for what period? The trends table holds this information per hour. So, if we want to plot the minimal, average, or maximal values per hour for one day in some external application, we can grab this precalculated data directly from the database instead of recalculating it from the raw history.
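
For example, to pull the precalculated hourly values for one item for a single day, something like the following should do. This is a sketch that reuses the example itemid and the date from earlier; UNIX_TIMESTAMP() again handles the conversion of the day boundaries:

mysql> select from_unixtime(clock), num, value_min, value_avg, value_max from trends where itemid=23668 and clock >= unix_timestamp('2016-01-13') and clock < unix_timestamp('2016-01-14') order by clock;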

But there's one field we've missed: num. It stores the number of values that fell into the hour covered by the record. This is useful if, say, most hours of a day have hundreds of values that are all more or less in line, but one hour has hardly any data except a single extremely high or low value. Instead of giving every hour the same weight when calculating daily, weekly, monthly, or yearly figures, we can weight each hour by the number of values it contains and arrive at a more correct result, as sketched below.
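
One way to take num into account is to weight each hourly average by the number of values behind it. A sketch of such a weighted daily average, under the same example itemid and dates as before:

mysql> select sum(value_avg * num) / sum(num) from trends where itemid=23668 and clock >= unix_timestamp('2016-01-13') and clock < unix_timestamp('2016-01-14');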

If you want to access data from the database to reuse in external applications, beware of the retention periods—data is removed from the history* and trends* tables after the number of days specified in the History storage period and Trend storage period fields for the specific items.
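
A quick way to check how far back the data still reaches for a particular item is to look at its oldest remaining record; a sketch, again with the example itemid:

mysql> select from_unixtime(min(clock)) from history where itemid=23668;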
