OSD tree lookup

If we need a quick view of the state and availability of all OSDs, we can use the osd tree subcommand. This command is usually the second most used (or abused) after the ceph status command.

root@ceph-client0:~# ceph osd tree
ID WEIGHT  TYPE NAME          UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 0.06235 root default
-2 0.02078     host ceph-osd1
 0 0.01039         osd.0           up  1.00000          1.00000
 3 0.01039         osd.3           up  1.00000          1.00000
-3 0.02078     host ceph-osd0
 1 0.01039         osd.1           up  1.00000          1.00000
 5 0.01039         osd.5           up  1.00000          1.00000
-4 0.02078     host ceph-osd2
 2 0.01039         osd.2           up  1.00000          1.00000
 4 0.01039         osd.4           up  1.00000          1.00000

Let's dive into the columns of the preceding output and what each of them tells us:

  • ID: The unique identifier of each bucket. Buckets that are not leaf nodes (devices) always have negative IDs, while devices (OSDs) have non-negative IDs. The IDs of OSDs are prefixed with osd. to form their names. These values are pulled directly from the CRUSH map.
  • WEIGHT: This value shows the weight of a bucket in tebibytes (that is, units of 2^40 bytes). The weight of each bucket is a critical input to the CRUSH placement algorithm, and thus to the balanced placement of data across the cluster.
  • TYPE: This field shows the bucket type. Valid default types include device, host, chassis, rack, row, pdu, pod, room, datacenter, region, and root. We can create custom types if desired, but it is recommended to employ the predefined types in most cases, which helps Ceph's CRUSH map accurately mimic your physical topology. The root bucket, as its name implies, sits at the top of the hierarchical map of buckets, and each pool's ruleset (replicated or erasure coded) is applied against a given root.
  • NAME: The name of each bucket. In the preceding example, we have one root bucket named default, three host buckets named ceph-osd0, ceph-osd1, and ceph-osd2, and six device or OSD buckets.
  • UP/DOWN: This column displays the state of the OSD devices. up means the OSD is actively communicating with the other OSDs in the cluster and accepting client I/O. down indicates that the OSD is unavailable: its process may be dead, or it may be running too slowly to communicate with the rest of the cluster. Such an OSD should be investigated and fixed.
  • REWEIGHT: If we have applied an override weight to an OSD, it shows up here. Override values are floating point numbers between zero and one; the default of 1.00000 means no override is in effect.
  • PRIMARY AFFINITY: This value controls the probability of an OSD being elected as the primary. Setting it to zero ensures that the OSD is never elected as the primary for any PGs stored on it (unless no other eligible OSD remains). This is sometimes useful when you are deploying a hybrid SSD-HDD cluster and want all your primary OSDs to land on SSD nodes to maximize performance, especially for reads. Tread carefully when changing these values, because they limit how well Ceph can handle failures when primaries go down. Both of these columns can be adjusted with dedicated subcommands, as sketched after this list.
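
Both REWEIGHT and PRIMARY AFFINITY are set with their own subcommands. The following is a minimal sketch: the OSD ID (4) and the values shown are arbitrary examples rather than recommendations, and on some older releases the monitors must have the mon osd allow primary affinity option enabled before the second command takes effect.

root@ceph-client0:~# ceph osd reweight 4 0.8
root@ceph-client0:~# ceph osd primary-affinity 4 0.0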

It might be easy in a small cluster to determine by inspection which OSD belongs to which host. In a large cluster comprising hundreds or thousands of OSDs spread across tens of servers, it is distinctly harder to quickly isolate a problem to a specific host when an OSD fails and is marked down. Ceph provides the osd find subcommand, which makes it easy to identify the host a broken OSD belongs to.

root@ceph-client0:~# ceph osd find 1
{
    "osd": 1,
    "ip": "192.168.42.100:6802/4104",
    "crush_location": {
        "host": "ceph-osd0",
        "root": "default"
    }
}

The command takes the numeric OSD ID as input and prints the following fields as JSON output:

  • osd: The numeric ID of an OSD.
  • ip: The IP address and port on which the OSD is listening, along with the PID of the OSD process running on that host.
  • crush_location: The position of the OSD in the CRUSH hierarchy; here, the host and root buckets it belongs to.
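
Since the output is JSON, it is straightforward to use in scripts. As a minimal sketch, assuming the jq utility is installed on the node you run the command from, the hostname alone can be extracted like this:

root@ceph-client0:~# ceph osd find 1 | jq -r '.crush_location.host'
ceph-osd0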