To monitor the performance and characteristics of a cache tier in a Ceph cluster, you can track a number of performance counters. We will assume for the moment that you are already collecting the Ceph performance counters from the admin socket, as discussed in the next chapter.
The most important thing to remember when looking at the performance counters is that once you configure a tier in Ceph, all client requests go through the top-level tier. Therefore, only the read and write operation counters on the OSDs that make up your top-level tier will show any requests, assuming that the base tier OSDs are not used for any other pools. To understand the number of requests handled by the base tier, there are proxy operation counters. These proxy operation counters are also calculated on the top-level OSDs, so to monitor the throughput of a Ceph cluster with tiering, only the top-level OSDs need to be included in the calculations.
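As a rough illustration of that calculation, the following Python sketch sums the client and proxy counters across the top-level OSDs only. It assumes the counters appear under the `osd` section of the JSON returned by `ceph daemon osd.<id> perf dump`, and that each proxied operation is also counted in `op_r` or `op_w`, which is worth checking on your Ceph release. The function names `read_osd_counters` and `summarize_tier` are hypothetical helpers, not part of Ceph.

```python
import json
import subprocess


def read_osd_counters(osd_id):
    """Fetch the "osd" counter section from one OSD's admin socket.

    Must be run on the host where that OSD lives. Assumes the `ceph`
    CLI is installed and the counters sit under the "osd" key.
    """
    result = subprocess.run(
        ["ceph", "daemon", f"osd.{osd_id}", "perf", "dump"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)["osd"]


def summarize_tier(dumps):
    """Aggregate counters from the top-level (cache tier) OSDs only.

    `dumps` is a list of per-OSD counter dictionaries. Base tier OSDs
    are deliberately left out, because client requests and proxy
    counters are both recorded on the top-level OSDs.
    """
    names = ("op_r", "op_w", "tier_proxy_read", "tier_proxy_write")
    summary = {name: sum(d.get(name, 0) for d in dumps) for name in names}

    # Share of client operations that were passed through to the base tier.
    # Assumption: proxied operations are included in op_r / op_w.
    summary["proxy_read_share"] = (
        summary["tier_proxy_read"] / summary["op_r"] if summary["op_r"] else 0.0
    )
    summary["proxy_write_share"] = (
        summary["tier_proxy_write"] / summary["op_w"] if summary["op_w"] else 0.0
    )
    return summary
```

Passing in only the dumps from the OSDs that back the cache pool gives cluster-wide client totals, along with an estimate of how much of that load the base tier had to serve.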
The following counters can be used to monitor tiering in Ceph; all of them are to be monitored on the top-level OSDs:
| Counter | Description |
| --- | --- |
| op_r | Read operations handled by the OSD |
| op_w | Write operations handled by the OSD |
| tier_proxy_read | Read operations that were proxied to the base tier |
| tier_proxy_write | Write operations that were proxied to the base tier |
| tier_promote | The number of promotions from the base to the top-level tier |
| tier_try_flush | The number of attempted flushes from the top-level to the base tier |
| tier_evict | The number of evictions of objects from the top-level tier |
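These counters are running totals rather than rates, so a single reading says little on its own. The sketch below, which assumes the values only increase between OSD restarts or manual resets, turns two samples of the same OSD's counters into per-second rates. The name `counter_rates` and the reset handling are illustrative choices, not something Ceph provides.

```python
TIER_COUNTERS = (
    "op_r",
    "op_w",
    "tier_proxy_read",
    "tier_proxy_write",
    "tier_promote",
    "tier_try_flush",
    "tier_evict",
)


def counter_rates(previous, current, interval_seconds):
    """Convert two samples of cumulative counters into per-second rates.

    `previous` and `current` are counter dictionaries for the same OSD,
    taken `interval_seconds` apart. Missing counters are treated as 0.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")

    rates = {}
    for name in TIER_COUNTERS:
        now = current.get(name, 0)
        delta = now - previous.get(name, 0)
        if delta < 0:
            # A drop suggests the counter was reset (for example, the OSD
            # restarted), so count only what accumulated since the reset.
            delta = now
        rates[name] = delta / interval_seconds
    return rates
```

For example, sampling every 60 seconds and plotting the resulting `tier_promote` and `tier_evict` rates next to `op_r` and `op_w` shows how much data is moving between the tiers relative to the client load.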