Ceph Placement Group

A Placement Group (PG) is a logical collection of objects that are replicated across OSDs to provide reliability in a storage system. Depending on the replication level of a Ceph pool, each PG is replicated and distributed to more than one OSD in the Ceph cluster. You can think of a PG as a logical container that holds multiple objects and is mapped to multiple OSDs.
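The following is a minimal, simplified sketch of this two-step mapping in Python, assuming a toy cluster of 12 OSDs and a pool with 128 PGs. It is not Ceph's actual implementation: Ceph hashes object names with rjenkins and places PGs with CRUSH, whereas this sketch uses md5 and a round-robin stand-in purely to illustrate the idea of object to PG to OSD mapping.

```python
import hashlib

# Illustrative parameters; real values come from the pool and cluster configuration.
PG_NUM = 128               # number of PGs in the pool
REPLICA_COUNT = 3          # pool replication size
OSD_IDS = list(range(12))  # pretend cluster with 12 OSDs

def object_to_pg(object_name: str, pg_num: int = PG_NUM) -> int:
    """Hash an object name to a PG id (md5 stands in for Ceph's rjenkins hash)."""
    digest = hashlib.md5(object_name.encode()).hexdigest()
    return int(digest, 16) % pg_num

def pg_to_osds(pg_id: int, replicas: int = REPLICA_COUNT) -> list[int]:
    """Map a PG to an ordered list of OSDs (a stand-in for CRUSH; not topology-aware)."""
    start = pg_id % len(OSD_IDS)
    return [OSD_IDS[(start + i) % len(OSD_IDS)] for i in range(replicas)]

obj = "rbd_data.1234.000000000001"
pg = object_to_pg(obj)
print(f"object {obj!r} -> PG {pg} -> OSDs {pg_to_osds(pg)}")
```

Note that clients only ever compute which PG an object belongs to; the PG-to-OSD placement is what CRUSH recalculates when the cluster topology changes.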

PGs are essential for the scalability and performance of a Ceph storage system. Without PGs, it would be difficult to manage and track tens of millions of objects that are replicated and spread over hundreds of OSDs, and doing so would also carry a computational penalty. Instead of managing every object individually, the system only has to manage PGs, each of which contains numerous objects. This makes Ceph a more manageable and less complex system.

Each PG requires some system resources, as it has to manage multiple objects. The number of PGs in a cluster should therefore be calculated carefully, and this is discussed later in this book. Increasing the number of PGs in your cluster usually rebalances the load across OSDs. A recommended range is 50 to 100 PGs per OSD, which avoids excessive resource utilization on the OSD node. As the amount of data in a Ceph cluster grows, you might need to tune the cluster by adjusting the PG counts. When devices are added to or removed from a cluster, CRUSH relocates PGs in the most optimized way.
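As a quick illustration of how such a target is usually turned into a pool's PG count, the sketch below applies the common rule of thumb of (number of OSDs x target PGs per OSD) / replica count, rounded up to the nearest power of two. The function name and defaults here are our own; the exact sizing guidance you should follow is covered later in the book.

```python
def recommended_pg_count(num_osds: int,
                         target_pgs_per_osd: int = 100,
                         pool_size: int = 3) -> int:
    """Rule-of-thumb PG count: (OSDs * target PGs per OSD) / replica count,
    rounded up to the nearest power of two."""
    raw = (num_osds * target_pgs_per_osd) / pool_size
    power = 1
    while power < raw:
        power *= 2
    return power

# Example: 40 OSDs, 3-way replication -> (40 * 100) / 3 = 1333.3 -> 2048 PGs
print(recommended_pg_count(40))
```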

We have now seen that a Ceph PG stores its data on multiple OSDs for reliability and high availability. These OSDs are referred to as the primary, secondary, tertiary, and so on, and together they form what is known as the acting set for that PG. In each PG's acting set, the first OSD is the primary, and the remaining OSDs are the secondary, tertiary, and so on.
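On a running cluster you can ask where an object lives and which OSDs form its acting set with the ceph osd map command. The sketch below wraps that command from Python; it assumes a working ceph CLI with an admin keyring, and the JSON field names (pgid, acting, acting_primary) reflect recent Ceph releases and may differ slightly between versions.

```python
import json
import subprocess

def acting_set(pool: str, object_name: str) -> dict:
    """Query the cluster for an object's PG and acting set via `ceph osd map`."""
    out = subprocess.check_output(
        ["ceph", "osd", "map", pool, object_name, "--format", "json"]
    )
    info = json.loads(out)
    return {
        "pgid": info.get("pgid"),
        "acting": info.get("acting"),        # ordered list: primary first, then the rest
        "primary": info.get("acting_primary"),
    }

# Example (assumes a pool named 'rbd' exists on the cluster):
# print(acting_set("rbd", "rbd_data.1234.000000000001"))
```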
