Indexes.
Types of indexes.
Index properties.
The various indexing strategies to be considered.
Indexes improve query performance. Without an index, MongoDB must perform a collection scan, examining every document in the collection to select those that match the query statement. With a suitable index, MongoDB can limit the number of documents that it must scan.
Indexes are special data structures that store a small portion of the collection’s data set in an easy-to-traverse format. The index stores the value of a specific field or set of fields, ordered by the value of the field. This ordering helps to improve the performance of equality matches and range-based query operations.
MongoDB defines indexes at the collection level and indexes can be created on any field of the document. MongoDB creates an index for the _id field by default.
Note
MongoDB creates a default unique index on the _id field, which prevents inserting two documents with the same value of the _id field.
Recipe 4-1. Working with Indexes
In this recipe, we are going to discuss how to work with indexes in MongoDB.
Problem
You want to create an index.
Solution
Use the db.collection.createIndex() method to create an index.
How It Works
Let’s follow the steps in this section to create an index.
Step 1: Create an Index
In an index specification such as { empId: 1 }, the value 1 indicates that empId field values will be stored in ascending order; a value of -1 specifies descending order.
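As a rough illustration of why an ordered index helps, the following plain-JavaScript sketch models an ascending index as a sorted list of (key, position) entries searched by binary search. It is a toy model and not MongoDB's implementation. The collection name employee, the sample documents, and the helper names buildIndex and findByKey are assumptions made for illustration.

```javascript
// Toy model of an ascending single-field index (not MongoDB internals).
// In mongosh the corresponding call would be: db.employee.createIndex({ empId: 1 })
// ("employee" is an assumed collection name; the documents are sample data).
const employees = [
  { empId: 3, name: "Cara" },
  { empId: 1, name: "Abe" },
  { empId: 4, name: "Dev" },
  { empId: 2, name: "Bea" },
];

// Build index entries { key, pos } sorted by key.
// direction 1 = ascending, -1 = descending.
function buildIndex(docs, field, direction = 1) {
  return docs
    .map((doc, pos) => ({ key: doc[field], pos }))
    .sort((a, b) => direction * (a.key - b.key));
}

// Binary search over an ascending index.
// Returns the matching document and the number of index entries examined.
function findByKey(docs, index, key) {
  let lo = 0;
  let hi = index.length - 1;
  let examined = 0;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    examined++;
    if (index[mid].key === key) return { doc: docs[index[mid].pos], examined };
    if (index[mid].key < key) lo = mid + 1;
    else hi = mid - 1;
  }
  return { doc: null, examined };
}

const empIdIndex = buildIndex(employees, "empId", 1);
const lookup = findByKey(employees, empIdIndex, 4);
console.log(empIdIndex.map((e) => e.key), lookup.doc.name, lookup.examined);
```

With only four documents the saving is small, but the number of entries examined grows logarithmically with the collection size rather than linearly.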
Note
We can’t drop _id indexes. MongoDB creates an index for the _id field by default.
Recipe 4-2. Index Types
In this recipe, we are going to discuss various types of indexes.
Problem
You want to create different types of indexes.
Solution
MongoDB supports several index types, including multikey, text, hashed, and 2dsphere indexes.
How It Works
Step 1: Multikey Index
Multikey indexes are useful to create an index for a field that holds an array value. MongoDB creates an index key for each element in the array.
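The "one index key per array element" behavior can be sketched in plain JavaScript as follows. This is a toy model only; the collection name employee, the field name skills, and the helper buildMultikeyEntries are assumptions for illustration.

```javascript
// Toy model of multikey index entries (not MongoDB internals).
// In mongosh the corresponding call would be: db.employee.createIndex({ skills: 1 })
// ("employee" and "skills" are assumed names; the documents are sample data).
const staff = [
  { _id: 1, skills: ["java", "mongodb"] },
  { _id: 2, skills: ["python"] },
  { _id: 3, skills: ["mongodb", "python"] },
];

// Emit one index entry per array element, then sort entries by key (ties by _id).
function buildMultikeyEntries(docs, field) {
  const entries = [];
  for (const doc of docs) {
    const values = Array.isArray(doc[field]) ? doc[field] : [doc[field]];
    for (const key of values) entries.push({ key, id: doc._id });
  }
  return entries.sort((a, b) =>
    a.key < b.key ? -1 : a.key > b.key ? 1 : a.id - b.id
  );
}

const skillEntries = buildMultikeyEntries(staff, "skills");
const mongodbIds = skillEntries
  .filter((e) => e.key === "mongodb")
  .map((e) => e.id);
console.log(skillEntries.length, mongodbIds);
```

Three documents produce five index entries here, which is why a query on a single array element can be answered from the index.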
Note
You can’t create a compound multikey index in which more than one of the indexed fields holds an array in the same document.
Step 2: Text Indexes
MongoDB provides text indexes to support text search queries on string content. You can create a text index on a field that takes as its value a string or an array of string elements.
The command db.collection.createIndex( { post_text: "text" } ) creates a text index for the field post_text.
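To give a feel for what a text index stores, here is a minimal inverted-index sketch in plain JavaScript. It is deliberately simplified: unlike MongoDB's text index it does no stemming and removes no stop words. The collection name post, the sample documents, and the helpers buildTextIndex and searchText are assumptions for illustration.

```javascript
// Toy inverted index for text search (no stemming or stop words,
// unlike MongoDB's text index).
// In mongosh the corresponding calls would be:
//   db.post.createIndex({ post_text: "text" })
//   db.post.find({ $text: { $search: "indexes" } })
// ("post" is an assumed collection name; the documents are sample data).
const posts = [
  { _id: 1, post_text: "MongoDB indexes speed up queries" },
  { _id: 2, post_text: "Sharding distributes data" },
  { _id: 3, post_text: "Text indexes support search" },
];

// Map each lowercase term to the set of _id values whose text contains it.
function buildTextIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) {
    const terms = doc[field].toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
    for (const term of terms) {
      if (!index.has(term)) index.set(term, new Set());
      index.get(term).add(doc._id);
    }
  }
  return index;
}

// Look up a single term; returns matching _id values in ascending order.
function searchText(index, term) {
  return [...(index.get(term.toLowerCase()) || [])].sort((a, b) => a - b);
}

const textIndex = buildTextIndex(posts, "post_text");
console.log(searchText(textIndex, "indexes"));
```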
Step 3: Hashed Indexes
The size of the indexes can be reduced with the help of hashed indexes. Hashed indexes store the hashes of the values of the indexed field. Hashed indexes support sharding using hashed shard keys. In hash-based sharding, a hashed index of a field is used as the shard key to partition data across the sharded cluster. We discuss sharding in Chapter 5.
Hashed indexes do not support multikey indexes; that is, you cannot create a hashed index on a field that holds an array.
Step 4: 2dsphere Index
The 2dsphere index supports queries on geospatial data, such as locations stored as GeoJSON objects, that calculate geometries on an Earth-like sphere.
Recipe 4-3. Index Properties
Indexes can also have properties. The index properties define certain characteristics and behaviors of an indexed field at runtime. For example, a unique index ensures the indexed fields do not contain duplicate values. In this recipe, we are going to discuss various index properties.
Problem
You want to work with index properties.
Solution
The index properties discussed here are TTL, unique, partial, and sparse.
How It Works
Let’s follow the steps in this section to work with index properties.
Step 1: TTL Indexes
Time to Live (TTL) indexes are single-field indexes that are used to remove documents from a collection after a certain amount of time. Data expiration is useful for certain types of information such as logs, machine-generated data, and so on.
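The expiry rule can be sketched in plain JavaScript as shown next. This is a toy model: MongoDB removes expired documents with a periodic background task, not on demand, so removal is not immediate. The collection name log, the field name createdAt, the one-hour lifetime, and the helper removeExpired are assumptions for illustration.

```javascript
// Toy model of TTL expiry (not MongoDB internals).
// In mongosh the corresponding call would be:
//   db.log.createIndex({ createdAt: 1 }, { expireAfterSeconds: 3600 })
// ("log" and "createdAt" are assumed names; the documents are sample data).
const logs = [
  { _id: 1, createdAt: new Date("2024-01-01T10:00:00Z") },
  { _id: 2, createdAt: new Date("2024-01-01T11:30:00Z") },
  { _id: 3, message: "no date field" }, // no indexed date, so it never expires
];

// Keep a document unless its date plus the lifetime is at or before "now".
function removeExpired(docs, field, expireAfterSeconds, now) {
  return docs.filter((doc) => {
    const value = doc[field];
    if (!(value instanceof Date)) return true; // documents without a date are kept
    return value.getTime() + expireAfterSeconds * 1000 > now.getTime();
  });
}

const now = new Date("2024-01-01T12:00:00Z");
const remaining = removeExpired(logs, "createdAt", 3600, now);
console.log(remaining.map((d) => d._id));
```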
Step 2: Unique Indexes
A unique index ensures that the indexed fields do not contain any duplicate values. By default, MongoDB creates a unique index on the _id field.
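The effect of a unique index on inserts can be sketched in plain JavaScript as follows. It is a toy model only; the class UniqueCollection, the collection name employee, and the error message are assumptions for illustration (MongoDB itself reports a duplicate key error).

```javascript
// Toy model of a unique index (not MongoDB internals).
// In mongosh the corresponding call would be:
//   db.employee.createIndex({ empId: 1 }, { unique: true })
// ("employee" is an assumed collection name).
class UniqueCollection {
  constructor(field) {
    this.field = field;
    this.keys = new Set(); // indexed values seen so far
    this.docs = [];
  }

  // Reject the insert if the indexed value already exists.
  insert(doc) {
    const key = doc[this.field];
    if (this.keys.has(key)) {
      throw new Error(`duplicate key: ${this.field} = ${key}`);
    }
    this.keys.add(key);
    this.docs.push(doc);
  }
}

const uniqueEmployees = new UniqueCollection("empId");
uniqueEmployees.insert({ empId: 1 });
uniqueEmployees.insert({ empId: 2 });

let duplicateRejected = false;
try {
  uniqueEmployees.insert({ empId: 1 }); // same empId as the first document
} catch (err) {
  duplicateRejected = true;
}
console.log(uniqueEmployees.docs.length, duplicateRejected);
```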
Step 3: Partial Indexes
Partial indexes are useful when you want to index only the documents in a collection that meet a specific filter condition. The filter condition can be specified using a limited set of operators. For example, a partial index with the filter { age: { $gt: 15 } } indexes only the documents that db.person.find( { age: { $gt: 15 } } ) would return, that is, the documents in the person collection that have an age greater than 15. Partial indexes reduce storage requirements and the performance cost of creating and maintaining the index because they store only a subset of the documents.
Use the db.collection.createIndex() method with the partialFilterExpression option to create a partial index. The partialFilterExpression option accepts the following:
Equality expressions (i.e., field: value or using the $eq operator).
$exists: true expression.
$gt, $gte, $lt, and $lte expressions.
$type expressions.
$and operator at the top level only.
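A partial index can be modeled in plain JavaScript as an index built only from the documents that pass the filter. The sketch below is a toy model that handles only the comparison operators on a single field, written in operator form; the sample documents and the helpers matchesFilter and buildPartialIndex are assumptions for illustration.

```javascript
// Toy model of a partial index (not MongoDB internals).
// In mongosh the corresponding call would be:
//   db.person.createIndex({ age: 1 }, { partialFilterExpression: { age: { $gt: 15 } } })
// (the documents below are sample data).
const people = [
  { _id: 1, age: 12 },
  { _id: 2, age: 30 },
  { _id: 3, age: 16 },
  { _id: 4 }, // no age field
];

// Only a few comparison operators are modeled here.
const comparators = {
  $eq: (v, x) => v === x,
  $gt: (v, x) => v > x,
  $gte: (v, x) => v >= x,
  $lt: (v, x) => v < x,
  $lte: (v, x) => v <= x,
};

// The filter must use operator form, e.g. { age: { $gt: 15 } }.
function matchesFilter(doc, filter) {
  return Object.entries(filter).every(([field, condition]) =>
    Object.entries(condition).every(([op, operand]) => {
      if (!(op in comparators)) throw new Error(`unsupported operator: ${op}`);
      return doc[field] !== undefined && comparators[op](doc[field], operand);
    })
  );
}

// Index only the documents that satisfy the filter, sorted by key.
function buildPartialIndex(docs, field, filter) {
  return docs
    .filter((doc) => matchesFilter(doc, filter))
    .map((doc) => ({ key: doc[field], id: doc._id }))
    .sort((a, b) => a.key - b.key);
}

const partialIndex = buildPartialIndex(people, "age", { age: { $gt: 15 } });
console.log(partialIndex.map((e) => e.id));
```

Only two of the four sample documents receive an index entry, which is where the storage saving comes from.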
Step 4: Sparse Indexes
Sparse indexes store entries only for the documents that have the indexed field, even if it contains null values. A sparse index skips any documents that do not have the indexed field. The index is considered sparse because it does not include all the documents of a collection.
Note
Partial indexes determine the index entries based on the filter condition, whereas sparse indexes select the documents based on the existence of the indexed field.
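The sparse rule, "index the document if the field exists, even when its value is null," can be sketched in plain JavaScript as follows. This is a toy model; the field name email, the sample documents, and the helper buildSparseIndex are assumptions for illustration.

```javascript
// Toy model of a sparse index (not MongoDB internals).
// In mongosh the corresponding call would be:
//   db.person.createIndex({ email: 1 }, { sparse: true })
// ("email" is an assumed field name; the documents are sample data).
const contacts = [
  { _id: 1, email: "a@example.com" },
  { _id: 2, email: null }, // field exists with a null value: indexed
  { _id: 3 },              // field missing: skipped
];

// Create an entry only for documents that contain the field at all.
function buildSparseIndex(docs, field) {
  return docs
    .filter((doc) => Object.prototype.hasOwnProperty.call(doc, field))
    .map((doc) => ({ key: doc[field], id: doc._id }));
}

const sparseIndex = buildSparseIndex(contacts, "email");
console.log(sparseIndex.map((e) => e.id));
```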
Recipe 4-4. Indexing Strategies
We must follow different strategies to create the right index for our requirements. In this recipe, we are going to discuss various indexing strategies.
Problem
You want to learn about indexing strategies to ensure you are creating the right type of index for different purposes.
Solution
The right index for an application depends on several factors:
The type of queries being executed.
The number of read and write operations.
The available memory.
How It Works
Let’s follow the steps in this section to work with different indexing strategies.
Step 1: Create an Index to Support Your Queries
Creating the right index to support your queries greatly improves query execution performance.
Step 2: Using an Index to Sort the Query Results
Sort operations can use indexes for better performance. When the sort matches an index, MongoDB obtains the sort order by fetching the documents in the order of the index, which avoids a separate sort step.
Sorting with a single-field index.
Sorting on multiple fields.
Sorting with a Single-Field Index
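A single-field index can serve a sort in either direction, because the index can be read forward or backward. The plain-JavaScript sketch below is a toy model of that idea; the sample documents and the helper sortedViaIndex are assumptions for illustration.

```javascript
// Toy model: one ascending index serves both sort directions.
// In mongosh the corresponding queries would be:
//   db.employee.find().sort({ empId: 1 })
//   db.employee.find().sort({ empId: -1 })
// ("employee" is an assumed collection name; the documents are sample data).
const sortDocs = [{ empId: 3 }, { empId: 1 }, { empId: 2 }];

// Ascending index entries { key, pos } on empId.
const ascIndex = sortDocs
  .map((d, pos) => ({ key: d.empId, pos }))
  .sort((a, b) => a.key - b.key);

// Walk the index forward (1) or backward (-1) instead of sorting the documents.
function sortedViaIndex(docs, index, direction) {
  const ordered = direction === 1 ? index : [...index].reverse();
  return ordered.map((entry) => docs[entry.pos]);
}

const ascending = sortedViaIndex(sortDocs, ascIndex, 1).map((d) => d.empId);
const descending = sortedViaIndex(sortDocs, ascIndex, -1).map((d) => d.empId);
console.log(ascending, descending);
```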
Sorting on Multiple Fields
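With a compound index, a sort can use the index when the sort keys form a prefix of the index keys and the directions either all match the index or are all reversed. The plain-JavaScript check below sketches that rule in simplified form; it ignores the case where equality conditions on leading index keys allow a sort on later keys. The field name deptId and the helper canUseIndexForSort are assumptions for illustration.

```javascript
// Simplified check of whether a compound index can provide a sort order.
// In mongosh the index would be created with, for example:
//   db.employee.createIndex({ deptId: 1, empId: -1 })
// ("employee" and "deptId" are assumed names).
function canUseIndexForSort(indexSpec, sortSpec) {
  const indexKeys = Object.entries(indexSpec);
  const sortKeys = Object.entries(sortSpec);
  if (sortKeys.length === 0 || sortKeys.length > indexKeys.length) return false;

  let sameDirection = true;    // every sort key matches the index direction
  let inverseDirection = true; // every sort key is the reverse of the index direction
  for (let i = 0; i < sortKeys.length; i++) {
    const [sortField, sortDir] = sortKeys[i];
    const [indexField, indexDir] = indexKeys[i];
    if (sortField !== indexField) return false; // sort keys must be an index prefix
    if (sortDir !== indexDir) sameDirection = false;
    if (sortDir !== -indexDir) inverseDirection = false;
  }
  return sameDirection || inverseDirection;
}

const compound = { deptId: 1, empId: -1 };
const sortChecks = [
  canUseIndexForSort(compound, { deptId: 1, empId: -1 }), // same directions
  canUseIndexForSort(compound, { deptId: -1, empId: 1 }), // all reversed
  canUseIndexForSort(compound, { deptId: 1, empId: 1 }),  // mixed directions
  canUseIndexForSort(compound, { empId: -1 }),            // not a prefix
];
console.log(sortChecks);
```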
Index to Hold Recent Values in Memory
When using multiple collections, we must consider the size of the indexes on all collections and ensure that the indexes fit in memory, so that the system does not have to read them from disk. The db.collection.totalIndexSize() method reports the total size of a collection’s indexes.
When the indexes fit entirely in RAM, queries are processed faster.
Create Queries to Ensure Selectivity
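A query such as db.collection.find( { empId: { $gt: 1 } } ) has low selectivity: when nearly every document has an empId value greater than 1, MongoDB must scan nearly all the documents to return the result.
A query such as db.collection.find( { empId: 4 } ) is highly selective: with an index on empId, MongoDB must scan only one document to return the result empId:4.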
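The difference in selectivity can be made concrete with a small plain-JavaScript sketch that counts how many indexed values each predicate matches. The sample empId values and the helper selectivity are assumptions for illustration; on a real deployment, the documents examined by a query can be inspected with explain("executionStats").

```javascript
// Toy model: fraction of indexed values matched by a predicate.
// In mongosh the corresponding queries would be:
//   db.employee.find({ empId: { $gt: 1 } })
//   db.employee.find({ empId: 4 })
// ("employee" is an assumed collection name; the values are sample data).
const empIds = [1, 2, 3, 4, 5];

// A lower fraction means a more selective query.
function selectivity(keys, predicate) {
  const matched = keys.filter(predicate).length;
  return { matched, fraction: matched / keys.length };
}

const rangeQuery = selectivity(empIds, (k) => k > 1);      // matches most values
const equalityQuery = selectivity(empIds, (k) => k === 4); // matches a single value
console.log(rangeQuery, equalityQuery);
```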