Working with Zabbix protocols

Zabbix protocols are quite simple, which is a strong point: it is easy to implement your own custom agent or any other software that sends data to Zabbix.

Zabbix supports different versions of protocols. We can divide the protocols into three families:

  • Zabbix get
  • Zabbix sender
  • Zabbix agent

The Zabbix get protocol

The Zabbix get protocol is really simple and easy to implement. Practically, you only need to connect to the Zabbix agent on port 10050 and send the key of the item you want to read.

This is a textual protocol and is used to retrieve data from the agent directly. It is so simple that you can implement it with a shell script as well:

[root@zabbixserver]# telnet 127.0.0.1 10050
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
agent.version
ZBXD2.0.6Connection closed by foreign host.

This example shows how to retrieve the agent version with a plain telnet session. Note that the data is returned with a header, ZBXD, followed by the actual response, 2.0.6.
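The same exchange can be scripted. The following is a minimal Python sketch, not the official zabbix_get implementation. It assumes the agent frames its reply as ZBXD, one version byte, and an 8-byte little-endian length (bytes that telnet cannot display). The names parse_agent_reply and zabbix_get are illustrative, not part of Zabbix.

```python
import socket
import struct

HEADER = b"ZBXD"
PREFIX_LEN = 13  # 4 bytes "ZBXD" + 1 version byte + 8 length bytes


def parse_agent_reply(raw: bytes) -> str:
    """Return the value carried by an agent reply, without its framing."""
    if not raw.startswith(HEADER):
        # No header found: treat the whole reply as the value.
        return raw.decode("utf-8", errors="replace")
    if len(raw) < PREFIX_LEN:
        raise ValueError("reply is shorter than the 13-byte prefix")
    # "<BQ" = one unsigned byte (version) + one little-endian uint64 (length).
    _version, length = struct.unpack("<BQ", raw[4:PREFIX_LEN])
    return raw[PREFIX_LEN:PREFIX_LEN + length].decode("utf-8", errors="replace")


def zabbix_get(host: str, key: str, port: int = 10050, timeout: float = 5.0) -> str:
    """Send an item key to an agent and return the value it answers with."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(key.encode("utf-8") + b"\n")
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # the agent closes the connection after replying
                break
            chunks.append(chunk)
    return parse_agent_reply(b"".join(chunks))
```

With an agent listening locally, zabbix_get("127.0.0.1", "agent.version") would perform the same request as the telnet session above.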

This simple protocol is useful for retrieving data directly from the agent installed on a server and using it in a shell script. It also lets you identify the agent version without logging on to the server and check all the instances of UserParameter defined on an agent.

The Zabbix sender protocol

The Zabbix sender protocol is a JSON-based protocol. The message composition is the following:

<HEADER><DATA_LENGTH><DATA>

The <HEADER> section is 5 bytes long and has the form ZBXD\x01. Actually, only the first 4 bytes are the header; the next byte specifies the protocol version. Currently, only version 1 is supported (0x01 in hex).

The <DATA_LENGTH> section is 8 bytes long and holds the length of <DATA> as a 64-bit number in little-endian byte order. So, for instance, 1 is encoded as 01/00/00/00/00/00/00/00 in the hex format.

It is followed by <DATA>. This section is expressed in the JSON format.
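As a rough illustration of this layout, the following Python sketch packs and unpacks a message with the standard struct module. It assumes the length field is a little-endian unsigned 64-bit integer; pack_message and unpack_message are hypothetical helper names.

```python
import struct

# "<4sBQ" = 4-byte magic, 1-byte version, little-endian uint64 length (13 bytes).
PREFIX = struct.Struct("<4sBQ")


def pack_message(data: bytes) -> bytes:
    """Prepend <HEADER><DATA_LENGTH> to an already encoded <DATA> section."""
    return PREFIX.pack(b"ZBXD", 1, len(data)) + data


def unpack_message(message: bytes) -> bytes:
    """Validate the prefix and return the <DATA> section."""
    magic, version, length = PREFIX.unpack(message[:PREFIX.size])
    if magic != b"ZBXD" or version != 1:
        raise ValueError("not a version 1 Zabbix message")
    return message[PREFIX.size:PREFIX.size + length]
```

For a one-byte payload, the length field produced by pack_message is the byte sequence 01 00 00 00 00 00 00 00.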

Note

From version 2.0.3, Zabbix accepts at most 128 MB of data per connection. This limit protects the server from running out of memory and crashing when it receives a very large amount of input.

To send the value, the JSON message needs to be in the following form:

<HEADER><DATALEN>{
  "request":"sender data",
  "data":[
  {
    "host":"Host name 1",
    "key":"item_key",
    "value":"XXX",
    "clock":unix_time_format
  },

  {
    "host":"Host name 2",
    "key":"item_key",
    "value":"YYY"
  }
  ],
  "clock":unix_time_format
}

As the previous example shows, multiple items can be queued in the same message, even if they come from different hosts or refer to different item keys.
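A message like the one above could be assembled as follows. This is only a sketch under the framing described earlier (ZBXD, version byte 1, 8-byte little-endian length); build_sender_message is an illustrative name, and the item dictionaries use the field names from the example.

```python
import json
import struct


def build_sender_message(items, clock=None):
    """Build a framed "sender data" request from a list of item dicts.

    Each item is a dict with "host", "key" and "value", plus an optional
    per-item "clock". The top-level clock is added only when given.
    """
    payload = {"request": "sender data", "data": list(items)}
    if clock is not None:
        payload["clock"] = int(clock)
    data = json.dumps(payload).encode("utf-8")
    # 4-byte magic, 1-byte version, little-endian uint64 length, then the JSON.
    return struct.pack("<4sBQ", b"ZBXD", 1, len(data)) + data


# Two items for different hosts travel in one message.
message = build_sender_message(
    [
        {"host": "Host name 1", "key": "item_key", "value": "XXX", "clock": 1360314499},
        {"host": "Host name 2", "key": "item_key", "value": "YYY"},
    ],
    clock=1360314500,
)
```

The resulting bytes can be written to a TCP connection to the server as a single request.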

Note

The "clock" field is optional in this protocol and can be omitted both from the individual JSON objects and from the end of the data section.

Once all the items are received, the server will send back the response. The response has the following structure:

<HEADER><DATALEN>{ 
  "response":"success",
  "info":"Processed 6 Failed 1 Total 7 Seconds spent 0.000283"
}

Consider the following points about this response message:

  • The response has a status that can be [success|failed] and refers to the whole transmission of your item list to the Zabbix server.
  • It is possible, as shown in this example, that some of the items failed. You only receive the counters, so you can't do much more than record this status in a log file and raise a notification.
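Since the counters are only available as text, a sender has to parse the info string itself. The following Python sketch assumes the string has exactly the layout shown in the example above; other Zabbix versions may format it differently. parse_info is a hypothetical helper name.

```python
import re

# Matches e.g. "Processed 6 Failed 1 Total 7 Seconds spent 0.000283".
INFO_RE = re.compile(
    r"Processed (?P<processed>\d+) Failed (?P<failed>\d+) "
    r"Total (?P<total>\d+) Seconds spent (?P<seconds>\d+\.\d+)"
)


def parse_info(info: str) -> dict:
    """Extract the counters and the processing time from an info string."""
    match = INFO_RE.search(info)
    if match is None:
        raise ValueError("unexpected info format: %r" % info)
    return {
        "processed": int(match.group("processed")),
        "failed": int(match.group("failed")),
        "total": int(match.group("total")),
        "seconds": float(match.group("seconds")),
    }
```

Logging the "failed" and "seconds" values after every transmission gives you a simple history to check when items go missing or the server slows down.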

Tip

It is important to keep track of the time spent sending your item list: if this value becomes high or starts to vary noticeably, the Zabbix server is struggling to receive items.

Unfortunately, this protocol does not give you feedback on which items failed or on the reason for the failure. At the time of writing this, there are two requested features that are still pending.

Now you know how the Zabbix sender protocol works on version 1.8 and higher.

Another issue is that the Zabbix sender protocol does not yet support any kind of encryption, so sensitive data is sent in clear text. We also need to consider the case of an attacker who wants to hide his activity behind a large number of alarms or fired triggers. With this protocol, the attacker can easily send false values in order to fire triggers and then proceed with his activity unnoticed.

Fortunately, this feature has now been taken into consideration, and the team is working on an SSL version or, better, a TLS version.

For more information, you can have a look at the ticket at https://support.zabbix.com/browse/ZBXNEXT-1263.

An interesting undocumented feature

The sender has an interesting feature that is not widely known and not well documented. When analyzing a protocol in depth, the first thing to do is read the official documentation, and the second is to check how Zabbix actually implements it; not every minor change necessarily makes it into the documentation.

Then, looking into the zabbix_sender code, you can find the section where the protocol is implemented:

zbx_json_addobject(&sentdval_args.json, NULL);
zbx_json_addstring(&sentdval_args.json, ZBX_PROTO_TAG_HOST, hostname, ZBX_JSON_TYPE_STRING);
zbx_json_addstring(&sentdval_args.json, ZBX_PROTO_TAG_KEY, key, ZBX_JSON_TYPE_STRING);
zbx_json_addstring(&sentdval_args.json, ZBX_PROTO_TAG_VALUE, key_value, ZBX_JSON_TYPE_STRING);

The preceding code snippet implements the Zabbix JSON protocol and, in particular, this section:

"host":"Host name 1",
"key":"item_key",
"value":"XXX",

Up to this point, the protocol is well documented. Right after these lines, there is an interesting section that adds one more property to our JSON item:

if (1 == WITH_TIMESTAMPS)
   zbx_json_adduint64(&sentdval_args.json, ZBX_PROTO_TAG_CLOCK, atoi(clock));

Here, a timestamp is provided within the item and is added as a property of the JSON object, after which the item is closed as follows:

zbx_json_close(&sentdval_args.json);

Note

The clock is defined as an unsigned int64 variable.

This is a really important property because, if you write your own zabbix_sender, you can specify the timestamp of when the item was retrieved.

Testing confirms the important point: Zabbix stores the item in the database with the clock time you specified, that is, the time at which the item was retrieved.

Using the clock properties in JSON items

Now this property can be used to optimize your sender. Zabbix supports 128 MB of data for a single connection. Of course, it is better to stay far from that limit; reaching it is a sign that the implementation is not well designed.

The clock feature can be used in two scenarios:

  • When items are buffered and then sent in bursts inside a single connection
  • When the server is not available, so that you can cache the items and send them later

The first use of this feature is clearly an optimization: it keeps the whole communication as lightweight as possible, and reducing the number of connections to the Zabbix server can prevent issues.

The second use is to implement a robust sender that can handle Zabbix server downtime and preserve your items in a cache, ready to be sent once the server is back up and running. Take care not to flood the server after it has been unreachable for a long period of time. Manage the communication by sending a reasonable number of items at a time rather than one long trail of items.
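Both scenarios can be sketched with a small in-memory buffer. The Python class below is an illustration only: BufferedSender and its send callable are hypothetical, and the batch size of 250 is an arbitrary assumption. The send callable is expected to transmit a list of item dictionaries and return True on success.

```python
import time
from collections import deque
from itertools import islice


class BufferedSender:
    """Cache timestamped items and send them in bounded batches."""

    def __init__(self, send, batch_size=250):
        self._send = send              # callable(list_of_items) -> bool
        self._batch_size = batch_size
        self._queue = deque()

    def add(self, host, key, value, clock=None):
        # Record the collection time now, so a late delivery keeps it.
        stamp = int(time.time() if clock is None else clock)
        self._queue.append(
            {"host": host, "key": key, "value": str(value), "clock": stamp}
        )

    def pending(self):
        return len(self._queue)

    def flush(self, max_batches=1):
        """Send up to max_batches batches; return the number of items sent."""
        sent = 0
        for _ in range(max_batches):
            batch = list(islice(self._queue, self._batch_size))
            if not batch:
                break
            try:
                ok = self._send(batch)
            except OSError:
                ok = False             # server unreachable: keep the cache
            if not ok:
                break
            for _item in batch:        # drop only what was delivered
                self._queue.popleft()
            sent += len(batch)
        return sent
```

Calling flush with a small max_batches on every cycle drains a backlog gradually after an outage instead of pushing everything at once. A real implementation would probably also persist the queue to disk and cap its size.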

The Zabbix agent protocol

This protocol is a bit more complex because it involves more phases and a longer dialogue. When an active agent starts, the first thing it does is connect to the server and ask for the tasks to perform, in particular, which items it has to retrieve and how often.

The messages of this protocol have the same form as those used previously:

<HEADER><DATA_LENGTH><DATA>

The <HEADER>, <DATA_LENGTH>, and <DATA> tags are as explained in the previous section.

The dialogue begins when the agent sends the following request to the server:

<HEADER><DATALEN>{
   "request":"active checks",
   "host":"<hostname>"
}

With this kind of request, the agent asks for the list of active checks defined for the specified hostname. The server response will, for instance, be as follows:

<HEADER><DATALEN>{
  "response":"success",
  "data":[{
    "key":"log[/var/log/localmessages,@errors]",
    "delay":1,
    "lastlogsize":12189,
    "mtime":0
  },
  {
    "key":"agent.version",
    "delay":"900"
  }],
  "regexp":[
  {
    "name":"errors",
    "expression":"error",
    "expression_type":0,
    "exp_delimiter":",",
    "case_sensitive":1
  }]
}

The Zabbix server must respond with success, followed by the list of items and their respective delays.
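On the agent side, the <DATA> part of this response has to be turned into a list of checks to schedule. The following Python sketch shows one way to do it; parse_active_checks is an illustrative name, and the code assumes the header and length have already been stripped. Because the example above shows the delay both as a number and as a string, the sketch normalizes it to an integer.

```python
import json


def parse_active_checks(data: bytes):
    """Return (checks, regexps) from the JSON body of an active checks reply."""
    reply = json.loads(data.decode("utf-8"))
    if reply.get("response") != "success":
        raise RuntimeError(reply.get("info", "active checks request failed"))
    checks = []
    for entry in reply.get("data", []):
        checks.append({
            "key": entry["key"],
            "delay": int(entry["delay"]),             # accepts 1 and "900"
            "lastlogsize": entry.get("lastlogsize"),  # present for log items
            "mtime": entry.get("mtime"),              # present for logrt items
        })
    # "regexp" is only present when the server sends regular expressions.
    return checks, reply.get("regexp", [])
```

Each returned check can then be collected every "delay" seconds.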

Note

In the case of log and logrt items, the server should respond with lastlogsize. The agent needs this parameter to continue its work. Also, mtime is needed for all the logrt items.

The "regexp" section shown in this example response will exist only if you have defined global regular expressions. Note that if a user macro is used, the key parameter contains the resolved key, and the original key, which still contains the user macro name, is sent as key_orig.

Once the response is received, the agent will close the TCP connection and will parse it. Now, the agent will start to collect the items at their specified period. Once collected, the items will be sent back to the server:

<HEADER><DATALEN>{
   "request":"agent data",
   "data":[
       {
           "host":"HOSTNAME",
           "key":"log[/var/log/localmessages]",
           "value":"Sep 16 18:26:44 linux-h5fr dhcpcd[3732]: eth0: adding default route via 192.168.1.1 metric 0",
           "lastlogsize":4315,
           "clock":1360314499,
           "ns":699351525
       },
       {
           "host":"<hostname>",
           "key":"agent.version",
           "value":"2.0.1",
           "clock":1252926015
       }
   ],
   "clock":1252926016
}

Note

While implementing this protocol, make sure to send back lastlogsize for all the log-type items and mtime for the logrt items.
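One way to avoid forgetting these fields is to build every item through a helper that enforces them. The Python sketch below is an assumption about how a custom agent might do this; agent_data_item is a hypothetical name, and detecting the item type from the key prefix is a simplification.

```python
def agent_data_item(host, key, value, clock, lastlogsize=None, mtime=None):
    """Build one entry of the "data" array of an "agent data" request."""
    item = {"host": host, "key": key, "value": value, "clock": int(clock)}
    is_logrt = key.startswith("logrt[")
    is_log = is_logrt or key.startswith("log[")
    if is_log:
        # log and logrt items must report how far the file has been read.
        if lastlogsize is None:
            raise ValueError("lastlogsize is required for %s" % key)
        item["lastlogsize"] = int(lastlogsize)
    if is_logrt:
        # logrt items must also report the modification time.
        if mtime is None:
            raise ValueError("mtime is required for %s" % key)
        item["mtime"] = int(mtime)
    return item
```

For example, agent_data_item("HOSTNAME", "log[/var/log/localmessages]", "some line", 1360314499, lastlogsize=4315) produces an entry like the first one in the request above, without the optional "ns" field.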

The server will respond with:

<HEADER><DATALEN>{
  "response":"success",
  "info":"Processed 2 Failed 0 Total 2 Seconds spent 0.000110"
}

Also, there is a possibility that some items have not been accepted, but, currently, there isn't a way to know which ones they are.

Some more possible responses

To complete the protocol description, you need to know that there are some particular cases to handle:

  • When a host is not monitored
  • When a host does not exist
  • When the host is actively monitored but there aren't active items

In the first case, when a host is not monitored, the agent will receive the following response from the server:

<HEADER><DATALEN>{
  "response":"failed",
  "info":"host [Host name] not monitored"
}

In the second case, when the host does not exist, the agent will receive the following response:

<HEADER><DATALEN>{
  "response":"failed",
  "info":"host [Host name] not found"
}

In the last case, when the host is monitored but does not have active items, the agent will receive an empty dataset:

<HEADER><DATALEN>{
  "response":"success",
  "data":[]
}
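A custom agent can tell these cases apart by inspecting the response. The following Python sketch relies only on the messages shown above; classify_active_checks_reply and the returned labels are illustrative, and matching on the end of the info text is an assumption that may not hold for other Zabbix versions.

```python
import json


def classify_active_checks_reply(data: bytes) -> str:
    """Map the JSON body of an active checks reply to a short label."""
    reply = json.loads(data.decode("utf-8"))
    if reply.get("response") == "success":
        # An empty "data" array means the host has no active items.
        return "active" if reply.get("data") else "no_active_items"
    info = reply.get("info", "")
    if info.endswith("not monitored"):
        return "not_monitored"
    if info.endswith("not found"):
        return "not_found"
    return "failed"
```

An agent might, for instance, retry later on "not_monitored" and "no_active_items", and log a configuration error on "not_found".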