Monitoring Zabbix IPMI

You can monitor the health and availability of your devices using IPMI, provided that the device supports the Intelligent Platform Management Interface (IPMI). IPMI is a hardware-level specification that is software neutral, meaning that it is not tied in any way to the BIOS or the operating system. One interesting feature is that the IPMI interface can be available even when the system is powered down. This is possible because each IPMI-enabled device contains a separate, low-power device that is independent of any other board or software. IPMI is supported by most server vendors and, on servers, it is usually exposed by the management cards: HP iLO, IBM RSA, Sun SSP, Dell DRAC, and so on.

If you would like to know in detail how IPMI works, it is a standard designed by Intel, and you can find the documentation at http://www.intel.com/content/www/us/en/servers/ipmi/ipmi-specifications.html.

To perform an IPMI check, Zabbix must have been compiled with IPMI support (the --with-openipmi option); please refer to Chapter 1, Deploying Zabbix.
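One way to confirm that the running server binary includes IPMI support is to look at the feature list that the Zabbix server writes to its log at startup. The following is a minimal sketch, assuming the log contains a line of the form "IPMI monitoring: YES"; the helper name ipmi_support_enabled and the log path in the usage comment are illustrative assumptions and may differ on your installation.

```shell
# Reads Zabbix server log text on stdin and succeeds (exit 0) only if
# the startup feature list reports IPMI monitoring as enabled.
# Assumption: the log line looks like "IPMI monitoring:           YES".
ipmi_support_enabled() {
    grep -Eq 'IPMI monitoring:[[:space:]]*YES'
}

# Example usage (the log path is an assumption; adjust to your setup):
#   ipmi_support_enabled < /var/log/zabbix/zabbix_server.log && echo "IPMI support found"
```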

IPMI uses a request-response protocol over a message-based interface to communicate with all the device components. More interestingly, besides retrieving component metrics or accessing the non-volatile system event log, you can also retrieve data from all the sensors installed in your hardware.

The first steps with IPMI

First of all, you need to make sure that all the required packages are installed; if they are not, you can install them by running this command as root:

$ yum install ipmitool OpenIPMI OpenIPMI-libs

Now, we can already retrieve temperature metrics using IPMI, for instance, using the following command:

$ ipmitool sdr list | grep Temp
Ambient Temp | 23 degrees C  | ok
CPU 1 Temp   | 45 degrees C  | ok
CPU 2 Temp   | disabled    | ns
CPU 3 Temp   | disabled    | ns
CPU 4 Temp   | disabled    | ns 

Note that in the previous example, three lines are reported as disabled because those CPU sockets are empty. As you can see, we can quickly retrieve all the internal parameters via the IPMI interface. Now, it is interesting to see all the possible states that can apply to our IPMI ID, which is CPU 1 Temp. Since the IPMI ID contains spaces, we need to enclose it in double quotes:

$ ipmitool event "CPU 1 Temp" list 
Finding sensor CPU 1 Temp... ok 
Sensor States: 
  lnr : Lower Non-Recoverable 
  lcr : Lower Critical 
  lnc : Lower Non-Critical 
  unc : Upper Non-Critical 
  ucr : Upper Critical 
  unr : Upper Non-Recoverable

Those are all the possible CPU 1 Temp states. IPMI is not just a read-only protocol; you can also simulate errors or configure parameters. We are now going to simulate a low-temperature threshold event, just to see how this works. Running the following command, you can simulate a -128 degrees Celsius reading:

$ ipmitool event "CPU 1 Temp" "lnc : Lower Non-Critical" 
Finding sensor CPU 1 Temp... ok 
0 | Pre-Init Time-stamp | Temperature CPU 1 Temp | Lower Non-critical going low | Reading -128 < Threshold -128 degrees C

Now, we can quickly verify that this has been logged in the system event log with:

$ ipmitool sel list | tail -1 
1c0 | 11/19/2008 | 21:38:22 | Temperature #0x98 | Lower Non-critical going low
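If you want to process the system event log in a script, the pipe-separated output shown above can be split into fields. The following is a minimal sketch, assuming the five-column layout of the example line (record ID, date, time, sensor, event); the function name sel_to_tsv is hypothetical, and other ipmitool versions or hardware may print additional columns, which this sketch ignores.

```shell
# Reads `ipmitool sel list` output on stdin and prints one tab-separated
# record per entry: record ID, date, time, sensor, event description.
# Lines with fewer than five '|'-separated fields are skipped.
sel_to_tsv() {
    awk -F'|' 'NF >= 5 {
        for (i = 1; i <= 5; i++) {
            gsub(/^[ \t]+|[ \t]+$/, "", $i)   # trim surrounding whitespace
        }
        printf "%s\t%s\t%s\t%s\t%s\n", $1, $2, $3, $4, $5
    }'
}

# Example usage:
#   ipmitool sel list | sel_to_tsv | tail -1
```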

Tip

This is one of the best nondisruptive tests that we can do to show why it is necessary to set up read-only IPMI accounts. Using the admin IPMI account, you can reset your management controller, trigger a shutdown, trigger a power reset, change the boot list, and so on.

Configuring IPMI accounts

To configure an IPMI account, you have essentially two options:

  • Use the management interface itself (DRAC, iLO, RSA, and so on)
  • Use OS tools, and therefore OpenIPMI

First of all, it is better to change the default root password; you can do it with:

$ ipmitool user set password 2 <new_password>

Here, we are resetting the default password for the root account that has the user ID 2.

Now, it is important to create a Zabbix user account that can query the sensor data and has no rights to restart a server or change any configuration.

In the next lines, we create the Zabbix user with the user ID 3; please check first whether the user ID 3 is already in use on your system. First of all, define the user login by running this command as root:

$ ipmitool user set name 3 zabbix

Then, set the corresponding password:

$ ipmitool user set password 3
Password for user 3: 
Password for user 3: 

Now, we need to grant our Zabbix user the required privileges:

$ ipmitool channel setaccess 1 3 link=on ipmi=on callin=on privilege=2

Activate the account:

$ ipmitool user enable 3

Verify that all is fine:

$ ipmitool channel getaccess 1 3
Maximum User IDs     : 15
Enabled User IDs     : 2

User ID              : 3
User Name            : zabbix
Fixed Name           : No
Access Available     : call-in / callback
Link Authentication  : enabled
IPMI Messaging       : enabled
Privilege Level      : USER
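The account creation steps above can be collected into one small helper so that they are repeatable across several machines. The following is a minimal sketch using only the ipmitool subcommands already shown; the function name create_ipmi_monitor_user and the IPMITOOL override variable are hypothetical conventions introduced here for illustration, not ipmitool features.

```shell
# Creates a read-only (privilege=2, that is, USER) IPMI account on the
# local management controller, following the same steps as above.
# Arguments: user ID, user name, channel (defaults to 1).
# IPMITOOL can be overridden, for example IPMITOOL="echo ipmitool", to
# only print the commands instead of running them (a dry run).
create_ipmi_monitor_user() {
    id=$1 name=$2 channel=${3:-1}
    tool=${IPMITOOL:-ipmitool}
    $tool user set name "$id" "$name" &&
    $tool user set password "$id" &&     # prompts for the password
    $tool channel setaccess "$channel" "$id" link=on ipmi=on callin=on privilege=2 &&
    $tool user enable "$id"
}

# Example usage as root, dry run first, then for real:
#   IPMITOOL="echo ipmitool" create_ipmi_monitor_user 3 zabbix 1
#   create_ipmi_monitor_user 3 zabbix 1
```

Chaining the commands with && stops at the first failing step, so a user is not enabled if setting its name, password, or access rights did not succeed.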

The user we've just created is named zabbix, and it has the USER privilege level. However, the account is not yet allowed to access the interface from the network; to enable this, we need to activate MD5 authentication for LAN access for this privilege level:

$ ipmitool lan set 1 auth USER MD5 

We can verify this with:

$ ipmitool lan print 1 
Set in Progress         : Set Complete
Auth Type Support       : NONE MD5 PASSWORD 
Auth Type Enable        : Callback : 
                        : User     : MD5 
                        : Operator : 
                        : Admin    : MD5 
                        : OEM      : 
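To check this setting from a script rather than by eye, the "Auth Type Enable" block can be parsed. The following is a minimal sketch, assuming the layout of the output shown above, where the privilege level and its enabled authentication types are the last two colon-separated fields; the function name lan_auth_types is hypothetical.

```shell
# Reads `ipmitool lan print <channel>` output on stdin and prints the
# authentication types enabled for the given privilege level
# (Callback, User, Operator, Admin, or OEM; matched case-insensitively).
lan_auth_types() {
    level=$1
    awk -F':' -v level="$level" 'NF >= 3 {
        # The privilege level is the second-to-last field and the
        # enabled authentication types are the last field.
        name = $(NF-1); types = $NF
        gsub(/^[ \t]+|[ \t]+$/, "", name)
        gsub(/^[ \t]+|[ \t]+$/, "", types)
        if (tolower(name) == tolower(level)) print types
    }'
}

# Example usage:
#   ipmitool lan print 1 | lan_auth_types User
```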

Now we can finally run the queries remotely from our Zabbix server directly with this command:

$ ipmitool -U zabbix -H <ip-of-IPMI-host-here> -I lanplus sdr list | grep Temp
Ambient Temp | 23 degrees C  | ok
CPU 1 Temp   | 45 degrees C  | ok
CPU 2 Temp   | disabled    | ns
CPU 3 Temp   | disabled    | ns
CPU 4 Temp   | disabled    | ns 
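When scripting this kind of remote check, it is usually convenient to keep only the sensors that actually report a value. The following is a minimal sketch, assuming the three-column layout shown above (name, reading, status); the function name sdr_ok_readings is hypothetical. The usage comment relies on the ipmitool -E option, which reads the password from the IPMI_PASSWORD environment variable instead of prompting for it.

```shell
# Reads `ipmitool sdr list` output on stdin and prints "name<TAB>reading"
# for every sensor whose status column is "ok"; sensors reported as
# "ns" (no reading, for example empty CPU sockets) are skipped.
sdr_ok_readings() {
    awk -F'|' 'NF >= 3 {
        for (i = 1; i <= 3; i++) gsub(/^[ \t]+|[ \t]+$/, "", $i)
        if ($3 == "ok") printf "%s\t%s\n", $1, $2
    }'
}

# Example usage against a remote host (the address is a placeholder):
#   export IPMI_PASSWORD='...'
#   ipmitool -I lanplus -H <ip-of-IPMI-host-here> -U zabbix -E sdr list | sdr_ok_readings
```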

Now we are ready to use our Zabbix server to retrieve IPMI items.

Configuring Zabbix IPMI items

When you're looking for IPMI metrics, the most difficult part is the setup that we've just done. In Zabbix, the setup is quite easy. First of all, we need to uncomment the following line in zabbix_server.conf:

# StartIPMIPollers=0

Change the value to something reasonable for the number of IPMI interfaces you're going to monitor. The exact value is not critical; the most important part is to enable the Zabbix IPMI pollers, which are disabled by default. In this example, we will use:

StartIPMIPollers=5
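If you manage several Zabbix servers, the same change can be scripted. The following is a minimal sketch, assuming the parameter is already present in the file, either commented out as shown above or active; the function name set_ipmi_pollers and the configuration path in the usage comment are assumptions for illustration.

```shell
# Sets StartIPMIPollers in the given zabbix_server.conf, whether the line
# is currently commented out ("# StartIPMIPollers=0") or already active.
# Arguments: configuration file path, number of pollers.
# A copy of the original file is kept next to it with a .bak suffix.
# If the parameter does not appear in the file at all, nothing is changed.
set_ipmi_pollers() {
    conf=$1 pollers=$2
    cp "$conf" "$conf.bak" &&
    sed "s/^#\{0,1\}[[:space:]]*StartIPMIPollers=.*/StartIPMIPollers=$pollers/" \
        "$conf.bak" > "$conf"
}

# Example usage as root (the path is an assumption; adjust to your setup):
#   set_ipmi_pollers /etc/zabbix/zabbix_server.conf 5
```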

Now, we need to restart Zabbix by running the following command as root:

$ service zabbix-server restart

Now, we can finally switch to the web interface and start adding IPMI items.

The first step is to configure the IPMI parameters at the host level, so go to Configuration | Hosts. There, we need to add the IPMI interface and the relative port, as shown in the following screenshot:

[Screenshot: host configuration with the IPMI interface and port]

Then, we need to switch to the IPMI tab, which is where the other configuration parameters are.

In the IPMI tab, for Authentication algorithm, select MD5 and, as per the configuration done previously, for Privilege level, select User. In the Username field, enter zabbix, and in Password, enter the password you specified during the configuration, as shown in the following screenshot:

[Screenshot: host IPMI tab with the authentication algorithm, privilege level, username, and password]

Now, we can add our item of the type IPMI agent. As per the previous example, the item we're acquiring here is CPU 1 Temp, and the Type of information is Numeric (float). The following screenshot shows this:

[Screenshot: IPMI agent item for the CPU 1 Temp sensor]

Configuring the Zabbix side of IPMI is a straightforward process. However, please be aware that there are known issues with OpenIPMI Version 2.0.7, and Zabbix does not work properly with this version; Version 2.0.14 or later is required.

In some devices, such as network temperature sensors that have only one interface card, the same card also exposes the IPMI interface. If this is your case, remember to configure the IPMI interface on the same IP address as that of your device.

Another important thing to know about IPMI is that the names of discrete sensors changed between OpenIPMI 2.0.16, 2.0.17, 2.0.18, and 2.0.19. Thus, it is better to check the correct name using the OpenIPMI version that you have deployed on the Zabbix server.
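A quick way to check the installed OpenIPMI version against the 2.0.14 minimum is a small version comparison helper. The following is a minimal sketch; the function name version_ge is hypothetical, and it assumes a sort command that supports the -V (version sort) option, as GNU coreutils does.

```shell
# Succeeds if version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort), which is available in GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example usage on an RPM-based system (package name taken from the
# install command earlier in this section):
#   version_ge "$(rpm -q --qf '%{VERSION}' OpenIPMI)" 2.0.14 || echo "OpenIPMI is too old"
```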
