All the check categories we explored before cover a very wide range of possible devices, but there's always that one which doesn't play well with standard monitoring protocols, can't have the agent installed, and is buggy in general. A real-life example would be a UPS that provides temperature information on the web interface, but does not provide this data over SNMP. Or maybe we would like to collect some information remotely that Zabbix does not support yet—for example, monitoring how much time an SSL certificate has until it expires.
In Zabbix, such information can be collected with external checks or external scripts. While user parameters are scripts run by the Zabbix agent, external check scripts are run directly by the Zabbix server.
First, we should figure out the command to find out the remaining certificate validity period. We have at least two options here: return the certificate expiry time itself, or return 0 or 1 to identify whether the certificate expires within some period of time. Let's try out both options.
We could find out the certificate expiry time with an OpenSSL command like this:
$ echo | openssl s_client -connect www.zabbix.com:443 2>/dev/null | openssl x509 -noout -enddate
We are closing stdin for the openssl command with echo and passing the retrieved certificate information to another openssl command, x509, to return the date and time when the certificate will expire:
notAfter=Jan 2 10:35:38 2019 GMT
The resulting string is not something we could easily parse in Zabbix, though. We could convert it to a UNIX timestamp like this:
$ date -d "$(echo | openssl s_client -connect www.zabbix.com:443 2>/dev/null | openssl x509 -noout -enddate | sed 's/^notAfter=//')" "+%s"
We're stripping the non-date part with sed and then formatting the date and time as a UNIX timestamp with the date utility:
1546425338
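The conversion step can be tried without a network connection by feeding it the sample notAfter line shown earlier. The following is a minimal sketch that assumes GNU date (the -d option is not available in the same form in BSD date); the function name enddate_to_epoch is made up for this illustration.

```shell
#!/bin/bash
# Hypothetical helper: convert an "openssl x509 -enddate" line to a UNIX timestamp.
# Assumes GNU date, which understands the "Mon D HH:MM:SS YYYY GMT" format.
enddate_to_epoch() {
    local line="$1"
    # Strip the "notAfter=" prefix, as the sed expression does.
    local when="${line#notAfter=}"
    # Refuse empty input instead of letting date guess a value.
    [ -n "$when" ] || return 1
    date -u -d "$when" "+%s"
}

enddate_to_epoch "notAfter=Jan 2 10:35:38 2019 GMT"
```

For the sample line, this prints 1546425338, the same value as the full pipeline returned.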
Looks like we have the command ready, but where would we place it? For external checks, a special directory is used. Open zabbix_server.conf and look for the option ExternalScripts. You might see either a specific path, or a placeholder:
# ExternalScripts=${datadir}/zabbix/externalscripts
If it's a specific path, that's easy. If it's a placeholder like above, it references the compile-time data directory. Note that it is not a variable. When compiling from the sources, the ${datadir} path defaults to /usr/local/share/. If you installed from packages, it is likely to be /usr/share/. In any case, there should be a zabbix/externalscripts/ subdirectory in there. This is where our external check script will have to go. Create a script zbx_certificate_expiry_time.sh there with the following contents:
#!/bin/bash
date -d "$(echo | openssl s_client -connect "$1":443 2>/dev/null | openssl x509 -noout -enddate | sed 's/^notAfter=//')" "+%s"
Notice how we replaced the actual website address with a $1 placeholder. This allows us to specify the domain to check as a parameter to this script. Make that file executable:
$ chmod 755 zbx_certificate_expiry_time.sh
And now for a quick test:
$ ./zbx_certificate_expiry_time.sh www.zabbix.com
1451727126
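The one-line script has no error handling: if the connection fails, the inner command substitution is empty, and GNU date tends to interpret an empty date string as midnight today, so a plausible but wrong value could end up in the item. Below is a sketch of a more defensive variant. The function name cert_expiry_epoch is hypothetical, and adding -servername (which sends SNI so that name-based virtual hosts return the matching certificate) is our own assumption about what a typical setup needs, not something the original script does.

```shell
#!/bin/bash
# Sketch of a more defensive variant; cert_expiry_epoch is a hypothetical name.
cert_expiry_epoch() {
    local host="$1" enddate
    # Without a host there is nothing to check.
    [ -n "$host" ] || return 1
    # -servername sends SNI so that the server can pick the right certificate.
    enddate=$(echo | openssl s_client -connect "$host:443" -servername "$host" 2>/dev/null \
        | openssl x509 -noout -enddate 2>/dev/null | sed 's/^notAfter=//')
    # Print nothing if no certificate was retrieved, instead of a misleading date.
    [ -n "$enddate" ] || return 1
    date -d "$enddate" "+%s"
}

# Act as the external check script when called with a parameter.
if [ "$#" -gt 0 ]; then
    cert_expiry_epoch "$1"
fi
```

With this variant, a failed connection produces no output at all, which would most likely make the item unsupported rather than store a wrong timestamp.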
Great, we can pass the domain name to the script and get back the time when the certificate for that domain expires. Now, on to placing this information in Zabbix. In the frontend, go to Configuration | Hosts, click on Items next to A test host, and click on Create item. Fill in the following:
Name: Certificate expiry time on $1
Type: External check
Key: zbx_certificate_expiry_time.sh[www.zabbix.com]
Units: unixtime
We specified the domain to check as a key parameter, and it will be passed to the script as the first positional parameter, which we then use in the script as $1. If more than one parameter is needed, we would comma-delimit them, the same as for any other item type. The parameters would be properly passed to the script as $1, $2, and so on. If we need no parameters, we would use empty square brackets [], or just leave them off completely. If we wanted to act upon the host information instead of hardcoding the value like we did, we could use a macro. For example, {HOST.HOST}, {HOST.IP}, and {HOST.DNS} are common values. Another useful macro here would be {HOST.CONN}, which would resolve either to the IP or DNS, depending on which one is selected in the interface properties.
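To illustrate how comma-delimited key parameters arrive in the script, here is a small sketch. The helper build_target and the optional port parameter are hypothetical and are not part of the scripts in this section. A key such as script.sh[{HOST.CONN},8443] would pass the resolved address as $1 and 8443 as $2.

```shell
#!/bin/bash
# Hypothetical helper showing how key parameters arrive as positional parameters.
build_target() {
    local host="$1"
    # Fall back to the HTTPS port when the second key parameter is omitted.
    local port="${2:-443}"
    echo "$host:$port"
}

build_target www.zabbix.com        # www.zabbix.com:443
build_target www.zabbix.com 8443   # www.zabbix.com:8443
```

Giving optional parameters a default like this keeps existing item keys working if a parameter is added to the script later.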
When done, click on the Add button at the bottom. Now check this item in the Latest data page:
The expiry time seems to be collected correctly, and the unixtime unit converted the value into a human-readable form. What about a trigger on this item? The easiest solution might be with the fuzzytime() function again. Let's say we want to detect a certificate that will expire in 7 days or less. As fuzzytime() returns 1 when the item value is within the given number of seconds of the server time and 0 otherwise, the trigger expression would be as follows:
{A test host:zbx_certificate_expiry_time.sh[www.zabbix.com].fuzzytime(604800)}=1
The huge value in the trigger function parameters, 604800, is 7 days in seconds. Can we make it more readable? Sure we can, this would be exactly the same:
{A test host:zbx_certificate_expiry_time.sh[www.zabbix.com].fuzzytime(7d)}=1
The trigger would alert with 1 week left, and from the item values we could see how much time exactly is left. We discussed triggers in more detail in Chapter 6, Detecting Problems with Triggers.
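The equivalence of 7d and 604800 is plain arithmetic: 7 × 24 × 60 × 60 seconds. As a rough sketch of how such time suffixes expand, the hypothetical helper below converts the suffixes s, m, h, d, and w to seconds. It is only an illustration of the arithmetic and not how Zabbix implements the conversion.

```shell
#!/bin/bash
# Hypothetical helper: expand a Zabbix-style time suffix (s, m, h, d, w) to seconds.
suffix_to_seconds() {
    local value="$1"
    local number="${value%[smhdw]}"   # numeric part, suffix removed
    local unit="${value#"$number"}"   # the suffix, empty if none was given
    case "$unit" in
        ""|s) echo "$number" ;;
        m) echo $((number * 60)) ;;
        h) echo $((number * 3600)) ;;
        d) echo $((number * 86400)) ;;
        w) echo $((number * 604800)) ;;
    esac
}

suffix_to_seconds 7d   # 604800
suffix_to_seconds 1w   # 604800
```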
A simpler approach might be passing the threshold to the OpenSSL utilities and letting them determine whether the certificate will still be good after that many seconds. A command to check whether the certificate is good for 7 days would be as follows:
$ echo | openssl s_client -connect www.zabbix.com:443 2>/dev/null | openssl x509 -checkend 604800
Certificate will not expire
That looks simple enough. If the certificate expires in the given time, the message would say "Certificate will expire". The great thing is that the exit code also differs based on the expiry status, thus we could return 1 when the certificate is still good and 0 when it expires:
$ echo | openssl s_client -connect www.zabbix.com:443 2>/dev/null | openssl x509 -checkend 604800 -noout && echo 1 || echo 0
In the same directory as before, create a script zbx_certificate_expires_in.sh with the following contents:
#!/bin/bash
echo | openssl s_client -connect "$1":443 2>/dev/null | openssl x509 -checkend "$2" -noout && echo 1 || echo 0
This time, in addition to the domain being replaced with $1, we also replaced the time period to check with a $2 placeholder. Make that file executable:
$ chmod 755 zbx_certificate_expires_in.sh
And now for a quick test:
$ ./zbx_certificate_expires_in.sh www.zabbix.com 604800
1
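The -checkend logic can also be exercised offline against a throwaway self-signed certificate. The sketch below assumes the openssl command-line tool is installed with a usable default configuration; the file names, the CN, and the cert_file_status helper are placeholders invented for this example. The openssl output is discarded because some OpenSSL versions seem to print the verdict text even when -noout is given.

```shell
#!/bin/bash
# Sketch: exercise "openssl x509 -checkend" against a throwaway self-signed certificate.
# File names and the CN below are placeholders chosen for this example.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
openssl req -x509 -newkey rsa:2048 -nodes -days 30 -subj "/CN=example.test" \
    -keyout "$workdir/key.pem" -out "$workdir/cert.pem" 2>/dev/null

# Same 1/0 convention as the script: 1 = still valid after N seconds, 0 = not.
cert_file_status() {
    openssl x509 -in "$1" -checkend "$2" -noout >/dev/null 2>&1 && echo 1 || echo 0
}

cert_file_status "$workdir/cert.pem" 604800      # 7 days ahead: 1
cert_file_status "$workdir/cert.pem" 5184000     # 60 days ahead: 0
cert_file_status "$workdir/missing.pem" 604800   # no certificate at all: 0
```

The last call points at a limitation of the && echo 1 || echo 0 construct: a certificate that could not be read at all also yields 0, so "expires soon" and "could not check" are indistinguishable in the item value.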
Looks good. Now, on to creating the item. In the frontend, let's go to Configuration | Hosts, click on Items next to A test host, and click on Create item. Start by clicking on Show value mappings next to the Show value dropdown. In the resulting popup, click on Create value map. Enter "Certificate expiry status" in the Name field, then click on the Add link in the Mappings section. Fill in the following:
0: Expires soon
1: Does not expire yet
We're not specifying the time period here as that could be customized per item. When done, click on the Add button at the bottom and close the popup. Refresh the item configuration form to get our new value map and fill in the following:
Name: Certificate expiry status for $1
Type: External check
Key: zbx_certificate_expires_in.sh[www.zabbix.com,604800]
Show value: Certificate expiry status
When done, click on the Add button at the bottom. And again, check this item in the Latest data page:
Seems to work properly. It does not expire yet, so we're all good. One benefit over the previous approach could be that it is more obvious which certificates are going to expire soon when looking at a list.
It is important to remember that external checks could take quite a long time. With the default timeout being 3 or 4 seconds (we will discuss the details in Chapter 22, Zabbix Maintenance), anything longer than a second or two is already too risky. Also, keep in mind that a server poller process is always busy while running the script; we cannot offload external checks to an agent like we did with the user parameters being active items. It is suggested to use external checks only as a last resort when all other options to gather the information have failed. In general, external checks should be kept lightweight and fast. If a script is too slow, it will time out and the item will become unsupported.
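One way to keep a script from hanging on an unresponsive host is to cap its run time inside the script itself. The following is a minimal sketch using the coreutils timeout utility; run_limited is a hypothetical wrapper name, and the limits shown are arbitrary. In practice, the limit would presumably be chosen below the timeout configured for the Zabbix server.

```shell
#!/bin/bash
# Sketch: cap the run time of a slow command with coreutils "timeout".
# run_limited is a hypothetical wrapper; the first argument is the limit in seconds.
run_limited() {
    local limit="$1"
    shift
    # timeout exits with status 124 when it had to stop the command.
    timeout "$limit" "$@"
}

run_limited 2 echo "finished in time"
run_limited 1 sleep 5 || echo "gave up, exit status $?"
```

The second call is stopped after one second and reports exit status 124, which a script could use to return early instead of keeping the poller process busy.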