Let's cover another important point that Zabbix administrators sometimes forget. Unlike other monitoring tools, Zabbix allows the data retention time to be configured individually for each item. There are two ways for this to occur.
Zabbix works with five history tables, two of which are used to store numerical data (history and history_uint).
Each item can have its own retention time defined in the History storage period field (in days). This period defines the maximum amount of time (in days) that the collected values from the particular item will remain in the database.
After this period, an internal process called housekeeping removes the old data from the database. This process competes with regular reads and writes for database access, because the housekeeper must first select the data to be erased and then issue the DELETE SQL statements that eliminate it.
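The select-then-delete pattern described above can be sketched in a few lines. This is only an illustration under stated assumptions, not Zabbix's actual housekeeper: the two-table schema, the history_days column, and the housekeep and build_demo_db functions are hypothetical and simplified, and an in-memory SQLite database stands in for the real backend.

```python
import sqlite3

SECONDS_PER_DAY = 86400


def housekeep(conn, now):
    """Delete history rows older than each item's retention; return rows removed."""
    removed = 0
    # Step 1: read every item's retention setting (competes with normal reads).
    items = conn.execute("SELECT itemid, history_days FROM items").fetchall()
    for itemid, history_days in items:
        cutoff = now - history_days * SECONDS_PER_DAY
        # Step 2: issue a DELETE for values older than the cutoff
        # (competes with the writes of newly collected values).
        cur = conn.execute(
            "DELETE FROM history WHERE itemid = ? AND clock < ?",
            (itemid, cutoff),
        )
        removed += cur.rowcount
    conn.commit()
    return removed


def build_demo_db():
    """Create a hypothetical, simplified schema with 100 days of daily values."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (itemid INTEGER PRIMARY KEY, history_days INTEGER)")
    conn.execute("CREATE TABLE history (itemid INTEGER, clock INTEGER, value REAL)")
    # Item 1 keeps 7 days of history, item 2 keeps 90 days.
    conn.executemany("INSERT INTO items VALUES (?, ?)", [(1, 7), (2, 90)])
    now = 100 * SECONDS_PER_DAY
    rows = [
        (itemid, now - day * SECONDS_PER_DAY, float(day))
        for itemid in (1, 2)
        for day in range(100)
    ]
    conn.executemany("INSERT INTO history VALUES (?, ?, ?)", rows)
    return conn, now


if __name__ == "__main__":
    conn, now = build_demo_db()
    print("rows removed:", housekeep(conn, now))
```

Even in this toy version, the shorter the retention period, the fewer rows remain for later housekeeping passes to scan, which is why per-item retention matters for performance.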
As with the history tables, it is also possible to set a maximum retention time for each item in the trend tables, which are actually a consolidation of data from the history tables.
Every hour (00:00, 01:00, 02:00, 03:00, and so on), an internal function of the Zabbix Server accesses the history tables that store numerical data and aggregates the data from the last hour. This process generates a new row in the trends table with the consolidated data. The Trend storage period field (in days) sets the retention time of this item in the trend tables.
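To make the hourly consolidation concrete, here is a minimal sketch that groups raw values by hour and produces one row per hour with the count, minimum, average, and maximum. It assumes samples arrive as (Unix timestamp, value) pairs; the consolidate_hourly function and the dictionary keys are illustrative names, not the Zabbix Server's internal code or exact column names.

```python
from collections import defaultdict

SECONDS_PER_HOUR = 3600


def consolidate_hourly(samples):
    """Group (clock, value) samples by hour and return one trend row per hour."""
    buckets = defaultdict(list)
    for clock, value in samples:
        # Truncate the timestamp to the start of its hour.
        hour_start = clock - clock % SECONDS_PER_HOUR
        buckets[hour_start].append(value)

    trends = []
    for hour_start in sorted(buckets):
        values = buckets[hour_start]
        trends.append({
            "clock": hour_start,
            "num": len(values),                 # how many values were collected
            "min": min(values),
            "avg": sum(values) / len(values),
            "max": max(values),
        })
    return trends


if __name__ == "__main__":
    # Three values in the first hour, two in the second.
    samples = [(0, 10.0), (1200, 20.0), (2400, 30.0), (3600, 5.0), (5400, 15.0)]
    for row in consolidate_hourly(samples):
        print(row)
```

However many values an item collects in an hour, the result is a single row, which is why trends are much cheaper to keep for long periods than raw history.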
With this in mind, we can see that each item works with two retention settings, and both have an impact on Zabbix's performance. Each row in the trends table consolidates one hour of data into four values: min, max, avg, and num (count). The next figure shows you how trends work:
Once more, default templates are the villains in our scenario. This is because they, as a standard, have a retention time of 90 days for history tables, and 365 days for trends tables.
The question here is: do we need to keep the collected data for so long? Probably not, and we can use different times for specific data.
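A quick back-of-the-envelope calculation helps answer that question. The sketch below counts how many rows a single item keeps in the database for a given retention period. The 60-second collection interval is an assumption chosen for illustration, and the function names are hypothetical; the trend figure follows from one consolidated row per hour.

```python
SECONDS_PER_DAY = 86400
HOURS_PER_DAY = 24


def history_rows(retention_days, interval_seconds):
    """Rows kept in a history table for one item collected at a fixed interval."""
    return retention_days * SECONDS_PER_DAY // interval_seconds


def trend_rows(retention_days):
    """Rows kept in a trends table for one item (one row per hour)."""
    return retention_days * HOURS_PER_DAY


if __name__ == "__main__":
    # Assumed collection interval of 60 seconds.
    print("history, 90 days:", history_rows(90, 60))   # 129600 rows per item
    print("history,  7 days:", history_rows(7, 60))    # 10080 rows per item
    print("trends, 365 days:", trend_rows(365))        # 8760 rows per item
```

Multiplied by thousands of items, the difference between 90 and 7 days of history is what the housekeeping process has to select and delete, so shortening retention for data that does not need it reduces that load directly.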