MON quorum status

The concept of quorum is fundamental to all consensus algorithms that are designed to access information in a fault-tolerant distributed system. The minimum number of votes necessary to achieve consensus among a set of nodes is called a quorum. In Ceph's case, MONs are used to persist operations that result in a change of the cluster state. They need to agree on the global order of operations and register them synchronously. Hence, an active quorum of MON nodes is required in order to make progress. To keep a cluster operational, a quorum (that is, a majority) of MON nodes needs to be available at all times. Mathematically, this means we need (n/2)+1 MON nodes available at all times, where n is the total number of MONs provisioned and the division is rounded down to a whole number. For example, if we have 5 MONs, we need at least (5/2)+1 = 2+1 = 3 MONs available at all times. Ceph considers all MONs equal when testing for a majority, and thus any three working MONs in the above example qualify to form the quorum.
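The majority arithmetic above can be sketched in a few lines of Python. This is only an illustration of the formula; the function names quorum_size and tolerated_failures are made up for this example and are not part of Ceph.

```python
def quorum_size(total_mons: int) -> int:
    """Smallest number of MONs that forms a strict majority: (n/2)+1, rounded down."""
    if total_mons < 1:
        raise ValueError("a cluster needs at least one MON")
    return total_mons // 2 + 1


def tolerated_failures(total_mons: int) -> int:
    """How many MONs can be down while a majority is still available."""
    return total_mons - quorum_size(total_mons)


# Print total MONs, required quorum, and tolerated failures for a few sizes.
for n in (1, 3, 4, 5):
    print(n, quorum_size(n), tolerated_failures(n))
```

Note that 4 MONs tolerate only one failure, the same as 3 MONs, which is why odd MON counts are the usual choice.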

We can obtain the cluster quorum status by using the quorum_status subcommand of the ceph utility, our trusty Swiss Army Knife.

root@ceph-client0:~# ceph quorum_status
{
    "election_epoch": 6,
    "quorum": [
        0
    ],
    "quorum_names": [
        "ceph-mon0"
    ],
    "quorum_leader_name": "ceph-mon0",
    "monmap": {
        "epoch": 1,
        "fsid": "e6d4e4ab-f59f-470d-bb76-511deebc8de3",
        "modified": "2017-09-10 20:20:16.458985",
        "created": "2017-09-10 20:20:16.458985",
        "mons": [
            {
                "rank": 0,
                "name": "ceph-mon0",
                "addr": "192.168.42.10:6789/0"
            }
        ]
    }
}

The output is displayed in JSON format. Let's discuss the fields below:

  • election_epoch: A counter that indicates the number of re-elections that have been proposed and completed to date.
  • quorum: A list of the ranks of MON nodes. Each active MON is associated with a unique rank value within the cluster. The value is an integer and starts at zero.
  • quorum_names: The unique identifiers of the MON processes that are currently in the quorum.
  • quorum_leader_name: The identifier of a MON that is elected to be the leader of the ensemble. When a client first talks to the cluster, it acquires all the necessary information and cluster maps from the current or acting leader node.
  • monmap: The dump of the cluster's MON map.
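Because the output is JSON, the fields described above are easy to inspect from a script. The following is a minimal sketch that parses a trimmed copy of the sample output shown earlier; the helper name summarize_quorum and the keys of the dictionary it returns are assumptions made for this example. On a live cluster, the same JSON could be captured from the ceph quorum_status command rather than from an embedded string.

```python
import json

# Trimmed copy of the sample output shown earlier in this section.
SAMPLE = """
{
  "election_epoch": 6,
  "quorum": [0],
  "quorum_names": ["ceph-mon0"],
  "quorum_leader_name": "ceph-mon0",
  "monmap": {
    "epoch": 1,
    "mons": [
      {"rank": 0, "name": "ceph-mon0", "addr": "192.168.42.10:6789/0"}
    ]
  }
}
"""


def summarize_quorum(status: dict) -> dict:
    """Hypothetical helper: condense quorum_status output into a few facts."""
    all_mons = [mon["name"] for mon in status["monmap"]["mons"]]
    in_quorum = status["quorum_names"]
    # MONs listed in the MON map but absent from the current quorum.
    missing = [name for name in all_mons if name not in in_quorum]
    needed = len(all_mons) // 2 + 1  # strict majority of provisioned MONs
    return {
        "leader": status["quorum_leader_name"],
        "in_quorum": in_quorum,
        "out_of_quorum": missing,
        "has_majority": len(in_quorum) >= needed,
    }


summary = summarize_quorum(json.loads(SAMPLE))
print(summary)
```

Comparing quorum_names against the names in monmap is a simple way to spot MONs that are provisioned but not currently participating.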

A MON leader acts as an arbiter to ensure that writes are applied to all nodes and in the proper order. In Ceph versions up to and including Jewel, the leader of the MON nodes was selected based on the advertised IP address. The MON node with the lowest IP address (lowest is calculated by transforming the IP address from dotted-quad notation to a 32-bit integer value) was picked as the leader. If it was unavailable, then the MON with the next-lowest IP address was selected, and so on. Once the unavailable MON came back up, assuming that it still had the lowest IP address of all MONs, it would be re-elected as the leader.
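The lowest-address rule described above can be illustrated with a short Python sketch. It mirrors the description in the text only and is not Ceph's actual election code. The helper names addr_to_int and pick_leader are made up for this example, and the ceph-mon1 and ceph-mon2 entries are hypothetical additions to the single MON in the sample output.

```python
import ipaddress


def addr_to_int(addr: str) -> int:
    """Convert the IPv4 part of an address like '192.168.42.10:6789/0' to a 32-bit integer."""
    host = addr.split(":", 1)[0]
    return int(ipaddress.IPv4Address(host))


def pick_leader(available: dict) -> str:
    """Return the name of the available MON with the numerically lowest IP address."""
    return min(available, key=lambda name: addr_to_int(available[name]))


mons = {
    "ceph-mon0": "192.168.42.10:6789/0",
    "ceph-mon1": "192.168.42.11:6789/0",  # hypothetical MON
    "ceph-mon2": "192.168.42.12:6789/0",  # hypothetical MON
}

print(pick_leader(mons))

# Simulate ceph-mon0 going down: the next-lowest address takes over.
survivors = {name: addr for name, addr in mons.items() if name != "ceph-mon0"}
print(pick_leader(survivors))
```

Comparing integers rather than strings matters here: as text, "192.168.42.9" would sort after "192.168.42.10", whereas numerically it is lower.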

 

Ceph's Luminous release adds a new configuration setting called mon priority that lets us adjust the priorities of MONs regardless of their IP addresses. This lets us define a custom order in which MONs take over as leader when the previous leader dies. It also allows us to change the existing leader dynamically and in a controlled manner. We might, for example, switch the leader a day before performing firmware upgrades or a disk/chassis replacement, so that the former lead server going down does not trigger an election at a time when we need to concentrate on other priorities.
