Until now, every item type described in this chapter could be considered a way to get raw measurements as single data points. In fact, the focus of the chapter has been more on setting up Zabbix to retrieve different kinds of data than on what is actually collected. This is because on the one hand, a correct setup is crucial for effective data gathering and monitoring, while on the other hand, the usefulness of a given metric varies wildly across environments and installations, depending on the specific needs that you may have.
When it comes to aggregated and calculated items though, things start to become really interesting. Neither type relies on probes and measurements at all; instead, both build on existing item values to provide a whole new level of insight and elaboration on the data collected in your environment.
This is one of the points where Zabbix's philosophy of decoupling measurements and triggering logic really pays off, because it would be quite cumbersome, otherwise, to come up with similar features, and it would certainly involve a significant amount of overhead.
The two types have the following features in common:
The simpler of the two types discussed here, aggregated items can perform different kinds of calculations on a specific item that is defined for every host in a group. For every host in a given group, an aggregated item will get the specified item's data (based on a specified function) and then apply the group function on all of the values collected. The result will be the value of the aggregated item measurement at the time that it was calculated.
To build an aggregated item, you first need to identify the host group that you are interested in and then identify the item, shared by all the group's hosts, which will form the basis of your calculations. For example, let's say that you are focusing on your web application servers, and you want to know something about the active sessions of your Tomcat installations. In this case, the group would be something similar to Tomcat servers, and the item key would be jmx["Catalina:type=Manager,path=/,host=localhost",activeSessions].
Next, you need to decide how you want to retrieve every host's item data. This is because you are not limited to just the last value but can perform different kinds of preliminary calculations. Except for the last function, which indeed just retrieves the last value from the item's history, all the other functions take a period of time as a further argument.
Function | Description |
---|---|
avg | This is the average value in a specified time period |
sum | This is the sum of all values in a specified time period |
min | This is the minimum value recorded in a specified time period |
max | This is the maximum value recorded in a specified time period |
last | This is the latest value recorded |
count | This is the number of values recorded in a specified time period |
What you now have is a bunch of values that still need to be brought together. The following table explains the job of the group function:
Function | Description |
---|---|
grpavg | This is the average of all the values collected |
grpsum | This is the sum of all values collected |
grpmin | This is the minimum value in a collection |
grpmax | This is the maximum value in a collection |
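The two stages described above (an item function applied per host, then a group function applied to the per-host results) can be illustrated with a small Python simulation. This is only a sketch of the idea, not how Zabbix implements it: the aggregate helper, the in-memory histories dictionary, and the sample values are all hypothetical, and skipping hosts that have no samples in the window is an assumption of this sketch.

```python
from statistics import mean

# Per-host item functions: reduce one host's samples to a single number.
ITEM_FUNCS = {
    "avg": mean,
    "sum": sum,
    "min": min,
    "max": max,
    "last": lambda values: values[-1],
    "count": len,
}

# Group functions: reduce the per-host results to the aggregated value.
GROUP_FUNCS = {"grpavg": mean, "grpsum": sum, "grpmin": min, "grpmax": max}


def aggregate(group_func, histories, item_func, period, now):
    """Two-stage aggregation over {host: [(timestamp, value), ...]}.

    Samples are assumed to be sorted by timestamp (in seconds).
    Hosts without samples in the window are skipped (sketch assumption).
    """
    per_host = []
    for samples in histories.values():
        if item_func == "last":
            # "last" ignores the time period and uses the whole history.
            window = [value for _, value in samples]
        else:
            window = [v for t, v in samples if now - period < t <= now]
        if window:
            per_host.append(ITEM_FUNCS[item_func](window))
    return GROUP_FUNCS[group_func](per_host)


# Hypothetical active-session samples for two hosts of the same group.
histories = {
    "tomcat1": [(100, 10), (200, 20), (300, 30)],
    "tomcat2": [(100, 40), (200, 50), (300, 60)],
}
hourly_avg = aggregate("grpavg", histories, "avg", 3600, now=300)
latest_sum = aggregate("grpsum", histories, "last", 0, now=300)
print(hourly_avg, latest_sum)
```

With these sample values, the per-host averages are 20 and 50, so the group average is 35, while the sum of the latest values is 90.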
Now that you know all the components of an aggregated item, you can build the key; the appropriate syntax is as follows:
groupfunc["Host group","Item key",itemfunc,timeperiod]
The Host group part can be defined locally to the aggregated item definition. If you want to bring together data from different hosts that are not part of the same group and you don't want to create a host group just for this, you can substitute the host group name with a list of the hosts: ["HostA, HostB, HostC"].
Continuing with our example, let's say that you are interested in collecting the average number of active sessions on your Tomcat application server every hour. In this case, the item key would look as follows:
grpavg["Tomcat servers","jmx["Catalina:type=Manager,path=/,host=localhost",activeSessions]", avg, 3600]
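Using the same group and a similar item, you might also want to know the combined peak number of concurrent sessions across all servers, this time every 5 minutes. Since the last function takes no time period, the 5-minute cadence would come from the aggregated item's own update interval: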
grpsum["Tomcat servers","jmx["Catalina:type=Manager,path=/,host=localhost",maxActive]",last, 0]
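Keys like the ones above are easy to mistype because one item key is nested inside another. The following Python sketch assembles an aggregated item key from its parts; the quote_param and aggregate_key helpers are hypothetical and are not part of Zabbix. Note one assumption: the sketch backslash-escapes the double quotes of the inner key, which is how Zabbix item key syntax generally treats quotes inside a quoted parameter, whereas the examples above print them unescaped. Check the result against the key format accepted by your Zabbix version.

```python
def quote_param(value):
    """Wrap a key parameter in double quotes, escaping inner quotes."""
    return '"' + value.replace('"', '\\"') + '"'


def aggregate_key(group_func, host_group, item_key, item_func, period):
    """Assemble groupfunc["Host group","Item key",itemfunc,timeperiod]."""
    return "{}[{},{},{},{}]".format(
        group_func,
        quote_param(host_group),
        quote_param(item_key),
        item_func,
        period,
    )


# Item key taken from the example above.
session_key = 'jmx["Catalina:type=Manager,path=/,host=localhost",activeSessions]'
key = aggregate_key("grpavg", "Tomcat servers", session_key, "avg", 3600)
print(key)
```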
Simple as they are, aggregated items already provide useful functionality that would be much harder to match if the measurements were not stored as plain data, easily accessible through a database.
Calculated items build on the concept of item functions expressed in the previous paragraphs and take it to a new level. Unlike aggregated items, with calculated ones, you are not restricted to a specific host group, and more importantly, you are not restricted to a single item key. With calculated items, you can apply any of the functions available for the trigger definitions to any item in your database and combine different item calculations using arithmetic operations. As with other item types that deal with specialized pieces of data, the item key of a calculated item is not used to actually define the data source but still needs to be unique so that you can refer to the item in triggers, graphs, and actions. The actual item definition is contained in the formula field, and as you can imagine, it can be as simple or as complex as you need.
In keeping with our Tomcat server example, you could have a calculated item that gives you the total application throughput for a given server as follows:
last(jmx["Catalina:type=GlobalRequestProcessor,name=http-8080",bytesReceived]) +
last(jmx["Catalina:type=GlobalRequestProcessor,name=http-8080",bytesSent]) +
last(jmx["Catalina:type=GlobalRequestProcessor,name=http-8443",bytesReceived]) +
last(jmx["Catalina:type=GlobalRequestProcessor,name=http-8443",bytesSent]) +
last(jmx["Catalina:type=GlobalRequestProcessor,name=jk-8009",bytesReceived]) +
last(jmx["Catalina:type=GlobalRequestProcessor,name=jk-8009",bytesSent])
Alternatively, you could be interested in the ratio between the active sessions and the maximum number of allowed sessions so that, later, you could define a trigger based on a percentage value instead of an absolute one, as follows:
100*last(jmx["Catalina:type=Manager,path=/,host=localhost",activeSessions]) /last(jmx["Catalina:type=Manager,path=/,host=localhost",maxActiveSessions])
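To make the arithmetic of these two formulas concrete, here is a short Python sketch that evaluates them on made-up numbers. The values and the session_usage_percent helper are hypothetical; the guard against a non-positive limit is an addition of this sketch (Tomcat uses -1 to mean no session limit, in which case a percentage has no meaning).

```python
# Hypothetical latest values for each connector, mirroring the last() calls.
last_values = {
    ("http-8080", "bytesReceived"): 1000,
    ("http-8080", "bytesSent"): 4000,
    ("http-8443", "bytesReceived"): 500,
    ("http-8443", "bytesSent"): 2500,
    ("jk-8009", "bytesReceived"): 200,
    ("jk-8009", "bytesSent"): 800,
}

# Total throughput: the sum of the six last() terms of the first formula.
total_throughput = sum(last_values.values())


def session_usage_percent(active_sessions, max_active_sessions):
    """Mirror 100*last(activeSessions)/last(maxActiveSessions)."""
    if max_active_sessions <= 0:
        # No usable limit (for example -1 for unlimited): no percentage.
        return None
    return 100 * active_sessions / max_active_sessions


usage = session_usage_percent(150, 200)
print(total_throughput, usage)
```

A trigger could then compare the percentage with a threshold such as 90, regardless of how each server's session limit is configured.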
As previously stated, you don't need to stick to a single host either in your calculations.
The following is how you could estimate the average number of queries on the database per single session, on an application server, every 3 minutes:
avg(DBServer:mysql.status[Questions], 180) /avg(Tomcatserver:jmx["Catalina:type=Manager,path=/,host=localhost",activeSessions], 180)
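The cross-host formula above can be mimicked in Python to show what each avg term contributes. This is a sketch under stated assumptions: the history dictionary, its sample values, and the window_avg helper are hypothetical, and the query figures are assumed to be already stored as per-interval values rather than as a raw, ever-growing counter.

```python
from statistics import mean


def window_avg(samples, period, now):
    """Average of the values whose timestamp lies in (now - period, now]."""
    recent = [value for t, value in samples if now - period < t <= now]
    return mean(recent)


# Hypothetical histories keyed by (host, item key); timestamps in seconds.
questions_key = ("DBServer", "mysql.status[Questions]")
sessions_key = (
    "Tomcatserver",
    'jmx["Catalina:type=Manager,path=/,host=localhost",activeSessions]',
)
history = {
    questions_key: [(0, 900), (60, 1000), (120, 1100), (180, 1200)],
    sessions_key: [(0, 99), (60, 40), (120, 50), (180, 60)],
}

now = 180
avg_queries = window_avg(history[questions_key], 180, now)
avg_sessions = window_avg(history[sessions_key], 180, now)
queries_per_session = avg_queries / avg_sessions
print(queries_per_session)
```

The samples taken at timestamp 0 fall outside the 180-second window, so the averages are 1100 queries and 50 sessions, giving 22 queries per session.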
The only limitation with calculated items is that there are no easy group functions such as those available to aggregated items. So, while calculated items are essentially a more powerful and flexible version of aggregated items, you still can't dispense with aggregated items, as you'll need them for all group-related functions.
Despite this limitation, as you can easily imagine, the sky is the limit when it comes to calculated items. Together with aggregated items, these are ideal tools to monitor the performance of host groups, such as clusters and grids, or to correlate different metrics on different hosts that contribute to the performance of a single service.
Whether you use them for performance analysis and capacity planning or as the basis of complex and intelligent triggers, or both, the judicious use of aggregated and calculated items will help you to make the most out of your Zabbix installation.