Caches and buffers

In early versions of Zabbix, up to version 1.6, the Zabbix server handled all data through SQL statements run directly against the database. To collect an item, the flow was as follows:

  • The Zabbix server queried the database for the next item to be gathered (that is, the item most overdue for collection)
  • This query essentially relied on two fields in the items table (delay and last_clock)
  • If the gap between the current time and last_clock (the time of the last valid sample collection) was greater than delay (the number of seconds between collections), the Zabbix server started a new collection for that item
  • A poller was called to perform the collection
  • When this poller received data from the Zabbix agent, it executed an INSERT on the database to store the collected values
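The flow above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two-table layout, the column names other than delay and last_clock, and the helper names are hypothetical, and an in-memory SQLite database stands in for the real one. It shows that every step of the cycle is a database round trip.

```python
import sqlite3


def due_items(conn, now):
    """Return ids of items whose last collection is older than their delay."""
    rows = conn.execute(
        "SELECT itemid FROM items WHERE ? - last_clock > delay ORDER BY itemid",
        (now,),
    ).fetchall()
    return [row[0] for row in rows]


def store_value(conn, itemid, value, now):
    """Write the collected value and record the collection time (two more statements)."""
    conn.execute(
        "INSERT INTO history (itemid, clock, value) VALUES (?, ?, ?)",
        (itemid, now, value),
    )
    conn.execute("UPDATE items SET last_clock = ? WHERE itemid = ?", (now, itemid))


def demo():
    # Hypothetical, simplified schema: not the real Zabbix table layout.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE items (itemid INTEGER PRIMARY KEY, delay INTEGER, last_clock INTEGER)"
    )
    conn.execute("CREATE TABLE history (itemid INTEGER, clock INTEGER, value REAL)")
    conn.executemany(
        "INSERT INTO items VALUES (?, ?, ?)",
        [(1, 30, 100), (2, 60, 100)],
    )
    now = 140  # 40 seconds after both items were last collected
    due = due_items(conn, now)
    for itemid in due:
        store_value(conn, itemid, 0.5, now)  # 0.5 stands in for the agent's reply
    return due, due_items(conn, now)
```

Calling demo() selects only the item with the 30-second delay, and a second query at the same instant finds nothing left to collect. Each collected value costs one SELECT, one INSERT, and one UPDATE, which is the load pattern the following paragraphs describe.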

The entire collection flow involved querying and writing (INSERT and UPDATE) to the database. In a large organization, with thousands of hosts and tens of thousands of items, it is easy to see that the database would become the bottleneck.

For completeness, we should add to that environment triggers and their functions, the housekeeper, the Zabbix frontend, users' actions, alerts and escalations, and so on. The important fact is that the database ended up being the central concern in any discussion of Zabbix's performance. Zabbix SIA concluded that Zabbix should minimize its interactions with the database, or have these interactions executed in the background by new internal processes and structures (caches and buffers).

Zabbix version 1.8 introduced some caches and buffers, which kept evolving. Today, they are responsible for ensuring good performance. Currently, the main caches and buffers are as follows:

  • CacheSize (configuration cache):
    • This stores all configuration data (hosts, items, and triggers)
    • It has a minimum value of 128 KB and a maximum value of 8 GB
    • If the volume of configuration data is larger than the size of CacheSize, the Zabbix server will not launch
    • The Zabbix server will stop if the configuration cache becomes full at runtime
    • Its usage is stable, and it must be monitored so that it stays between 80 and 90 percent full
  • HistoryCacheSize (history cache):
    • This stores the data received by the Zabbix server, pollers, or trappers
    • It has a minimum value of 128 KB and a maximum value of 2 GB
    • If it is completely filled, the Zabbix server starts rejecting new values
    • This is a safety buffer and will only fill up if the Zabbix server has difficulties delivering data to the database
    • It must be monitored so that it remains as empty as possible (it may fluctuate due to database disk write latency)
    • Keep it at the highest possible value (2 GB) so that no collection queue builds up when the database/storage is slow
  • HistoryTextCacheSize (history text cache):
    • This has the same function and behavior as HistoryCacheSize. However, it only receives values from items of three types: character, text, and log.
  • ValueCacheSize (value cache):
    • This was introduced in version 2.2. It is used to make calculations of trigger expressions, calculated/aggregate items, and some macros much faster.
    • It has a minimum value of 128 KB (0 disables ValueCache) and a maximum value of 64 GB.
    • It grows gradually as more values are added to it.
    • It must be monitored so that its size can be increased (if memory is available) when it becomes full.
  • TrendCacheSize (trends cache):
    • This is used to store trends data before it is written to the database.
    • It has a minimum value of 128 KB and a maximum value of 2 GB.
    • Its usage doesn't change much, as it depends on the number of items.
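As a quick sanity check, the limits listed above can be encoded and compared against a proposed setting. The following is a minimal sketch, not part of Zabbix: the function names are hypothetical, the K/M/G suffixes are assumed to be powers of 1024, and HistoryTextCacheSize is left out because the list does not state its limits. The value cache appears under its parameter name, ValueCacheSize.

```python
# Multipliers for size suffixes (assumed to be powers of 1024).
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}

# (minimum, maximum) for each parameter, as quoted in the list above.
LIMITS = {
    "CacheSize": ("128K", "8G"),
    "HistoryCacheSize": ("128K", "2G"),
    "ValueCacheSize": ("128K", "64G"),
    "TrendCacheSize": ("128K", "2G"),
}


def parse_size(text):
    """Convert a size such as '128K' or '2G' to bytes."""
    text = text.strip().upper()
    if text and text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


def is_valid(name, value):
    """Return True if value is within the documented range for the parameter."""
    size = parse_size(value)
    if name == "ValueCacheSize" and size == 0:
        return True  # 0 disables the value cache
    low, high = (parse_size(limit) for limit in LIMITS[name])
    return low <= size <= high
```

For example, is_valid("CacheSize", "64M") returns True, while is_valid("HistoryCacheSize", "4G") returns False because it exceeds the 2 GB maximum.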

With these settings, we can eliminate much of the direct access to the database.

The main point here is to establish adequate monitoring of these caches and buffers so that we know how they are behaving and whether their parameter values need adjusting. For example, if usage of the history cache and/or the history text cache fluctuates a lot, this indicates that the Zabbix server has difficulties delivering data to the database. The bigger this cache, the longer the Zabbix server can withstand such fluctuations.
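The relationship between cache size and how long the server can absorb a slow database can be made concrete with some rough arithmetic. The sketch below is a simplified model, not a description of Zabbix internals: the function name is hypothetical, and the average memory used per buffered value (bytes_per_value) is an assumption that has to be estimated for a given installation.

```python
def seconds_until_full(cache_bytes, incoming_vps, written_vps, bytes_per_value):
    """Estimate how long an empty history cache lasts while the database lags.

    incoming_vps: values per second arriving from pollers and trappers
    written_vps: values per second the database is currently able to absorb
    bytes_per_value: assumed average memory per buffered value
    """
    backlog_rate = (incoming_vps - written_vps) * bytes_per_value  # bytes per second
    if backlog_rate <= 0:
        return float("inf")  # the database keeps up, so the cache never fills
    return cache_bytes / backlog_rate


# Illustrative numbers only: 5000 values/s arriving, 3000 values/s written,
# and an assumed 100 bytes per buffered value.
small = seconds_until_full(256 * 1024 ** 2, 5000, 3000, 100)
large = seconds_until_full(2 * 1024 ** 3, 5000, 3000, 100)
```

Under this model the time is proportional to the cache size, so the 2 GB cache absorbs the same slowdown for eight times as long as the 256 MB one.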

To learn more about internal items related to caches and buffers inside Zabbix, take a look at the Zabbix manual and stay up to date with new keys and parameters.

In short, we must understand and make good use of Zabbix's caches and buffers so that data is handled in memory and database access is kept to a minimum.
