Cluster status

Cluster status checks are among the most frequent tasks of any Ceph administrator; indeed, like driving a manual transmission car, they quickly become reflexive. These dovetail with the health checks we discussed above. The overall status of the cluster can be checked with the ceph status subcommand or its -s shorthand.

root@ceph-client0:~# ceph status
    cluster e6d4e4ab-f59f-470d-bb76-511deebc8de3
     health HEALTH_OK
     monmap e1: 1 mons at {ceph-mon0=192.168.42.10:6789/0}
            election epoch 5, quorum 0 ceph-mon0
      fsmap e15527: 1/1/1 up {0=ceph-mds0=up:active}
     osdmap e6279: 6 osds: 6 up, 6 in
            flags sortbitwise,require_jewel_osds
      pgmap v16658: 128 pgs, 9 pools, 3656 bytes data, 191 objects
            1962 MB used, 63371 MB / 65333 MB avail
                 128 active+clean

The output of ceph status is arranged in two columns: each key appears on the left and its value on the right. The keys are always single words, but a value may span multiple lines.
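When scripting against this output, it is more reliable to request a machine-readable format than to parse the human-oriented columns. The ceph CLI's --format (or -f) option supports JSON output; note that the exact JSON key layout varies between Ceph releases, so inspect your version's output before depending on specific fields:

$ ceph status --format json-pretty
$ ceph status -f json | jq '.health'    # requires the jq utility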

  • cluster: The unique identifier for a given cluster, also known as fsid.
  • health: The present state of cluster health.
  • monmap: This value summarizes the cluster MON map. It shows the epoch version of the MON map, a list of Ceph MONs with their IP addresses and ports, the election epoch version, and the quorum members with their IDs.
  • fsmap: Displays the values from the MDS map (in older versions of Ceph this was keyed as mdsmap). This includes the latest epoch version, the number of MDSs that are up, and their state.
  • osdmap: This represents the cluster OSD map, including the most recent epoch version, the total number of provisioned OSDs, and the number of OSDs that are marked up and in. When OSDs go down, the up count decreases, but the total count stays the same until they are removed from the OSD map. The flags entry shows any cluster flags applied to the OSD map.
  • pgmap: This contains the epoch version of the cluster's PG map, total counts of PGs and pools, bytes allocated for data, and the total count of objects within the cluster. The next line shows the amount of data the cluster holds and the space available out of the total cluster capacity. The last line shows the count of PGs in the active+clean state. If the cluster has degraded PGs, additional lines summarize the number of PGs in each state.
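Each of these summaries can also be retrieved individually when only one subsystem is of interest. These are long-standing ceph subcommands:

$ ceph mon stat    # MON map summary: epoch, monitor addresses, quorum
$ ceph osd stat    # OSD map summary: epoch, osd up/in counts, flags
$ ceph pg stat     # PG map summary: PG states, data stored, capacity
$ ceph mds stat    # MDS map summary: epoch and MDS states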

If the cluster has ongoing recovery or client I/O, that will be displayed as two more lines at the end.
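To follow those recovery and client I/O figures as they change, ceph -w is handy: it prints the same status block once, then streams cluster log entries, including throughput and recovery progress, until interrupted:

$ ceph -w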

We typically use the status command to observe PG state transitions when we are making changes within the cluster, such as reweighting OSDs during maintenance or adding new OSDs or hosts. In such scenarios, it is best to open a dedicated window that continually updates this crucial status information instead of repeatedly issuing the command manually. We can leverage the Linux watch command to periodically refresh the display of cluster state. By default, watch runs the given command every 2 seconds and displays a timestamp at the upper right. This timestamp is itself valuable: if we notice that it is not advancing as expected, it is likely that our cluster's Monitor nodes are experiencing problems.

$ watch ceph status
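watch also accepts a custom refresh interval and can highlight what changed between successive runs; both are standard flags:

$ watch -n 5 -d ceph status    # refresh every 5 seconds, highlight differences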