Selecting a Partitioning Method

The next step in sharding a large collection is to decide how to partition the documents based on the shard key. There are two methods you can use to distribute the documents into different shards, based on the shard key value. Which method you use depends on the type of shard key you select:

Image Range-based sharding: One method is to divide the data set into specific ranges, based on the value of the shard key. This method works well for shard keys that are numeric. For example, if you have a collection of products and each product has a specific product ID from 1 to 1,000,000, you could shard the products in ranges of 1–250,000; 250,001–500,000, etc.

Image Hash-based sharding: Another method is to use a hash function that computes a field value to create chunks. The hash function should ensure that shard keys that have a very close value should end up in different shards to ensure a good distribution.

It is vital that you select a shard key and distribution method that will distribute documents as evenly as possible across the shards. Otherwise, one server ends up overloaded while another is relatively unused.

The advantage of range-based sharding is that it is often very easy to define and implement. Also, if your queries are often range bases as well, it is more performant than hash-based sharding. However, it is very difficult to get an even distribution with range-based sharding unless you have all the data up front and the shard key values will not change in the future.

The hash-based sharding method takes more understanding of the data but typically provides the best overall approach to sharding because it ensures a much more evenly spaced distribution.

The index used when enabling sharding on the collection determines which partitioning method is used. If you have an index that is based on a value, MongoDB uses range-based sharding. For example, the following implements a range-based shard on the zip and name fields of the document:

db.myDB.myCollection.ensureIndex({"zip": 1, "name":1})

To shard using the hash-based method, you need to define the index using the hash method, as in this example:

db.myDB.myCollection.({"name":"hash"})

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset