MongoDB Query Optimization Techniques

Overview

MongoDB is an open-source database that uses a document-oriented data model and its own query language rather than SQL. It is built on an architecture of collections and documents.

When you're programming an application, you generally want the database to respond instantly to anything you do.
Performance optimization becomes necessary when your data grows large or when long-running queries start to hurt execution time.
I hope these simple tips will help you avoid the pain I went through!

1. Add Index on Collection:

If your application queries a collection on a particular field or set of fields, then an index on the queried field or a compound index on the set of fields can prevent the query from scanning the whole collection to find and return the query results.
You can set the sort order of each indexed field: 1 for ascending and -1 for descending.
MongoDB supports many types of indexes that can be used on a collection; the following subsections walk through the most common ones and how to give them an appropriate order.

Single Index:

Assume you have a collection that stores user information.
Indexes are created with the createIndex() function. The most basic form indexes a single field, for example the email field of the users collection in ascending order.
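
A minimal sketch of that command in mongosh syntax, assuming the collection is named `users` (the name is an assumption). The `typeof db` guard is only there so the snippet can also be loaded outside the shell, where no `db` object exists.

```javascript
// Index key specification: field name -> sort order (1 = ascending, -1 = descending).
const emailIndexKeys = { email: 1 };

// `db` is defined only inside mongosh; outside the shell this call is skipped.
if (typeof db !== "undefined") {
  db.users.createIndex(emailIndexKeys);
}
```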

If your collection has an embedded document field, such as an address that stores the city, state and country, you can add an index on that field as well, either on the whole embedded document or on one of its subfields using dot notation.
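
The original snippet for this case is not available, so the following is only a sketch of the two usual options. The `users` collection name and the choice of `address.city` as the indexed subfield are assumptions.

```javascript
// Option 1: index one subfield of the embedded document with dot notation.
const cityIndexKeys = { "address.city": 1 };

// Option 2: index the embedded document as a whole.
// This form only helps queries that match the entire address document exactly.
const addressIndexKeys = { address: 1 };

// `db` is defined only inside mongosh; outside the shell these calls are skipped.
if (typeof db !== "undefined") {
  db.users.createIndex(cityIndexKeys);
  db.users.createIndex(addressIndexKeys);
}
```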

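With a single-field index, the position of the field inside the document does not matter. MongoDB can also traverse such an index in either direction, so the choice between ascending and descending makes no difference when sorting on that one field.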

Compound Index:

A compound index is created on two or more fields of the collection.
For example, you can create an index in ascending order on the fullName and userName fields.
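
A short sketch of that compound index in mongosh syntax, again assuming a `users` collection. Key order in the object literal is the field order of the index.

```javascript
// Compound index: fullName first, then userName, both ascending.
const nameIndexKeys = { fullName: 1, userName: 1 };

// `db` is defined only inside mongosh; outside the shell this call is skipped.
if (typeof db !== "undefined") {
  db.users.createIndex(nameIndexKeys);
}
```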

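MongoDB limits the number of fields in a compound index: 31 in older releases and 32 in current ones.
The order of the fields listed in a compound index is important, because the index can serve queries on a leading prefix of its fields. An index on fullName and userName supports queries on fullName alone, but not efficient lookups on userName alone.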

Text Index:

If you need to search for text within string fields or arrays of strings, add a text index.
Text indexes can include any field whose value is a string or an array of string elements.
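
As an illustration, a text index is declared by using the string "text" in place of the sort order. The `bio` field and the search term below are hypothetical; they do not come from the original post.

```javascript
// Text index on a hypothetical string field named `bio`.
const textIndexKeys = { bio: "text" };

// A $text query that searches the text-indexed field for a word.
const textQuery = { $text: { $search: "developer" } };

// `db` is defined only inside mongosh; outside the shell these calls are skipped.
if (typeof db !== "undefined") {
  db.users.createIndex(textIndexKeys);
  db.users.find(textQuery).forEach((user) => printjson(user));
}
```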

Unique Single Index:

A unique index ensures that the indexed fields do not store duplicate values.
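
A minimal sketch, assuming the `email` field of a `users` collection should be unique. The `unique` option is passed as the second argument of createIndex().

```javascript
const uniqueEmailKeys = { email: 1 };
const uniqueEmailOptions = { unique: true };

// `db` is defined only inside mongosh; outside the shell this call is skipped.
if (typeof db !== "undefined") {
  // Fails if the collection already contains duplicate email values.
  db.users.createIndex(uniqueEmailKeys, uniqueEmailOptions);
}
```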

Unique Compound Index:

If you use the unique constraint on a compound index, MongoDB enforces uniqueness on the combination of the index key values.
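
A sketch of the compound variant, reusing the fullName and userName fields from the compound index section (the `users` collection name is an assumption). Two users may share a fullName or a userName, but not the same pair.

```javascript
const uniquePairKeys = { fullName: 1, userName: 1 };
const uniquePairOptions = { unique: true };

// `db` is defined only inside mongosh; outside the shell this call is skipped.
if (typeof db !== "undefined") {
  db.users.createIndex(uniquePairKeys, uniquePairOptions);
}
```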

2. Aggregation Pipeline Optimization:

The aggregation pipeline consists of stages, and each stage transforms the documents as they pass through the pipeline. Aggregation is often used to combine results from multiple collections that store references to each other.

I will share several tips for getting the best performance from an aggregate query.

Projection Optimization:

Project only the required fields from the collection to reduce the amount of data passing through the pipeline.

For Example:
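
The original snippet is not available; this is a small sketch of a `$project` stage that keeps only three fields. The `users` collection and the chosen fields are assumptions.

```javascript
// Keep only the fields the application needs; _id is excluded explicitly.
const projectionPipeline = [
  { $project: { _id: 0, fullName: 1, email: 1, mobile: 1 } },
];

// `db` is defined only inside mongosh; outside the shell this call is skipped.
if (typeof db !== "undefined") {
  db.users.aggregate(projectionPipeline).forEach((user) => printjson(user));
}
```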

Pipeline Sequence Optimization:

Keep each $match stage ahead of its $project stage: $match, $project, then the next $match, $project, and so on. This ordering reduces query execution time because documents are filtered before they reach the projection.
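
A sketch of that ordering with one filter and one projection. The filter value and the `users` collection are hypothetical.

```javascript
const filteredPipeline = [
  // 1. Filter first, so later stages see fewer documents.
  { $match: { "address.city": "Ahmedabad" } },
  // 2. Then trim each remaining document to the needed fields.
  { $project: { _id: 0, fullName: 1, email: 1 } },
];

// `db` is defined only inside mongosh; outside the shell this call is skipped.
if (typeof db !== "undefined") {
  db.users.aggregate(filteredPipeline).forEach((user) => printjson(user));
}
```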

$match and $sort:

Define indexes on the fields used by $match and $sort, because both stages can use an index.
Place $match and $sort at the start of the aggregation pipeline whenever possible.
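
One way this could look, assuming a filter on `address.city` and a sort on `fullName` (both hypothetical choices). The compound index lists the equality field first and the sort field second so that it can serve both stages.

```javascript
// Index that covers the filter field, then the sort field.
const matchSortIndexKeys = { "address.city": 1, fullName: 1 };

const matchSortPipeline = [
  { $match: { "address.city": "Ahmedabad" } }, // first stage: can use the index
  { $sort: { fullName: 1 } },                  // second stage: sorted via the same index
];

// `db` is defined only inside mongosh; outside the shell these calls are skipped.
if (typeof db !== "undefined") {
  db.users.createIndex(matchSortIndexKeys);
  db.users.aggregate(matchSortPipeline).forEach((user) => printjson(user));
}
```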

$sort and $limit:

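Place $sort before $limit in the aggregate pipeline, in the order $sort, $limit, $skip. When $sort is immediately followed by $limit, MongoDB only has to keep the top results in memory while sorting.
The $sort operator can take advantage of an index when it is placed at the beginning of the pipeline or before the $project, $unwind, and $group aggregation operators.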
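
A minimal sketch of a $sort placed directly before $limit. The sort field and the limit of 10 are assumptions chosen for illustration.

```javascript
const topTenPipeline = [
  { $sort: { fullName: 1 } }, // first stage, so an index on fullName can be used
  { $limit: 10 },             // directly after $sort: only 10 results are kept
];

// `db` is defined only inside mongosh; outside the shell this call is skipped.
if (typeof db !== "undefined") {
  db.users.aggregate(topTenPipeline).forEach((user) => printjson(user));
}
```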

The $sort stage has a limit of 100 megabytes of RAM, so set the allowDiskUse option to true to let larger sorts spill to temporary files on disk instead of exceeding that limit.

$skip and $limit:

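Place $limit before $skip in the aggregate pipeline. Because the skipped documents still have to pass through $limit, the limit value must be the number of documents to skip plus the number you want returned.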
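
The arithmetic is easy to get wrong, so here is a small helper as a sketch. The function name `pageStages` and its parameters are hypothetical and not part of any MongoDB API.

```javascript
// Build the $limit + $skip stages for one page of results.
// skipCount: documents to skip; pageSize: documents to return.
function pageStages(skipCount, pageSize) {
  return [
    { $limit: skipCount + pageSize }, // let through the skipped docs plus the page
    { $skip: skipCount },             // then drop the ones before the page
  ];
}

// Third page of 10 users: skip 20, return 10.
const thirdPage = [{ $sort: { fullName: 1 } }, ...pageStages(20, 10)];

// `db` is defined only inside mongosh; outside the shell this call is skipped.
if (typeof db !== "undefined") {
  db.users.aggregate(thirdPage).forEach((user) => printjson(user));
}
```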

$lookup and $unwind:

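Always create an index on the foreignField attribute used in a $lookup, unless the collections are of trivial size.
If you need to $unwind the joined array, place the $unwind directly after the $lookup and apply it to the field named in as. MongoDB can then merge the $unwind into the $lookup stage and avoid building large intermediate arrays.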

For example
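
The original snippet is missing, so this is only a sketch. It assumes a hypothetical `orders` collection whose `userId` field references `_id` in `users`.

```javascript
const userOrdersPipeline = [
  {
    $lookup: {
      from: "orders",          // hypothetical joined collection
      localField: "_id",
      foreignField: "userId",  // should be indexed on the orders collection
      as: "orders",
    },
  },
  // Directly after $lookup and on the `as` field, so the stages can be merged.
  { $unwind: "$orders" },
];

// `db` is defined only inside mongosh; outside the shell these calls are skipped.
if (typeof db !== "undefined") {
  db.orders.createIndex({ userId: 1 });
  db.users.aggregate(userOrdersPipeline).forEach((row) => printjson(row));
}
```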

AllowDiskUse in aggregate:

With allowDiskUse: true, aggregation operations can write data to the _tmp subdirectory in the database path (dbPath) directory. This lets large queries that would exceed the memory limit run using temporary files.
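
A short sketch of passing the option. The pipeline itself is hypothetical; allowDiskUse goes in the options document, which is the second argument of aggregate().

```javascript
// A sort over the whole collection, which may need more than 100 MB.
const largeSortPipeline = [{ $sort: { fullName: 1, email: 1 } }];
const aggregateOptions = { allowDiskUse: true };

// `db` is defined only inside mongosh; outside the shell this call is skipped.
if (typeof db !== "undefined") {
  db.users.aggregate(largeSortPipeline, aggregateOptions).forEach((user) => printjson(user));
}
```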

3. Rebuild the index on collection:

Rebuilding indexes can be useful after indexes have been added and removed many times on a collection's fields.

This operation drops all indexes for a collection, including the _id index, and then rebuilds all indexes.
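
A sketch of the call, assuming the `users` collection. As far as I know, newer MongoDB releases restrict reIndex() to standalone instances and mark it as deprecated, so check the documentation for your server version before relying on it.

```javascript
// Name of the collection whose indexes should be rebuilt (an assumption).
const collectionToReindex = "users";

// `db` is defined only inside mongosh; outside the shell this call is skipped.
if (typeof db !== "undefined") {
  // Rebuilding can take a long time on large collections.
  db.getCollection(collectionToReindex).reIndex();
}
```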

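4. Remove Unnecessary Indexes:

Add only the indexes you need on a collection, because every index has to be updated on each write operation, which costs CPU time.
If a compound index exists, remove any single-field index that it already covers, that is, an index on the compound index's leading field.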
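
To make the "already covered" rule concrete, here is a small sketch of a prefix check. The helper name `isPrefixOf` is hypothetical, and it ignores index options such as unique, which also have to be considered before an index is dropped.

```javascript
// True when every key of `candidateKeys` appears, in the same position and
// with the same order value, at the start of `compoundKeys`.
function isPrefixOf(candidateKeys, compoundKeys) {
  const candidate = Object.entries(candidateKeys);
  const compound = Object.entries(compoundKeys);
  return (
    candidate.length <= compound.length &&
    candidate.every(
      ([field, order], i) => compound[i][0] === field && compound[i][1] === order
    )
  );
}

const compoundKeys = { fullName: 1, userName: 1 };
const singleKeys = { fullName: 1 };

// `db` is defined only inside mongosh; outside the shell this call is skipped.
if (isPrefixOf(singleKeys, compoundKeys) && typeof db !== "undefined") {
  db.users.dropIndex(singleKeys); // covered by the compound index
}
```

An index on userName alone would not pass this check, because userName is not the leading field of the compound index.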

5. Limit the number of returned records:

If you know how many records you need, always use limit() to reduce the demand on network resources.

For example, if you need only 10 users from the users collection, limit the query to 10 documents.
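
A minimal sketch of that query in mongosh syntax, assuming the collection is named `users`.

```javascript
const maxUsers = 10;

// `db` is defined only inside mongosh; outside the shell this call is skipped.
if (typeof db !== "undefined") {
  // Only the first 10 matching documents are sent back to the client.
  db.users.find().limit(maxUsers).forEach((user) => printjson(user));
}
```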

6. Use Projection to return only required Data:

When response requires only a subset of fields from documents, you can achieve better performance by returning only the fields you need.

Suppose you have a users collection and only need the fullName, email and mobile fields. You would pass a projection that lists just those fields as the second argument of find().
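
A sketch of that query. The empty filter is an assumption; note that `_id` is returned by default unless it is excluded explicitly.

```javascript
const userFilter = {}; // match every user (assumption)
const userProjection = { fullName: 1, email: 1, mobile: 1 };

// `db` is defined only inside mongosh; outside the shell this call is skipped.
if (typeof db !== "undefined") {
  db.users.find(userFilter, userProjection).forEach((user) => printjson(user));
}
```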

Analyze Query Performance:

Once you have applied the techniques above, check the performance of your queries using the MongoDB explain command.
To analyze query performance, you can check the query execution time, the number of records scanned, and much more.

The explain() method returns a document with the query plan and, optionally, the execution statistics.

The explain() method accepts three verbosity modes that control how much execution information is returned.

The possible modes are "queryPlanner", "executionStats", and "allPlansExecution"; queryPlanner is the default.
You can compare the output of each mode by passing it to the explain() method.

For example:
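
The original query and its output are not available, so this sketch only shows how the call is made. The filter on `address.city` and the `users` collection are hypothetical.

```javascript
const explainQuery = { "address.city": "Ahmedabad" }; // hypothetical filter
const verbosity = "executionStats"; // or "queryPlanner" / "allPlansExecution"

// `db` is defined only inside mongosh; outside the shell this call is skipped.
if (typeof db !== "undefined") {
  // Prints the query plan together with the execution statistics.
  printjson(db.users.find(explainQuery).explain(verbosity));
}
```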

The main fields to check in the explain() output are:

  • queryPlanner.winningPlan.stage: displays COLLSCAN to indicate a collection scan. This is generally an expensive operation and can result in slow queries.
  • executionStats.nReturned: displays 3 to indicate that the query matches and returns three documents.
  • executionStats.totalKeysExamined: displays 0 to indicate that this query is not using an index.
  • executionStats.totalDocsExamined: displays 10 to indicate that MongoDB had to scan ten documents (i.e. all documents in the collection) to find the three matching documents.
  • queryPlanner.winningPlan.inputStage.stage: displays IXSCAN when the query uses an index.
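
To read those fields without scrolling through the full output, a small helper like the following could be used. The name `summarizeExplain` is hypothetical, and the field layout it reads is the one described in the list above, which may differ between server versions.

```javascript
// Pull the fields discussed above out of an explain("executionStats") result.
function summarizeExplain(explainOutput) {
  const plan = explainOutput.queryPlanner.winningPlan;
  const stats = explainOutput.executionStats || {};
  return {
    stage: plan.stage,                                     // e.g. COLLSCAN
    inputStage: plan.inputStage ? plan.inputStage.stage : null, // e.g. IXSCAN
    nReturned: stats.nReturned,
    totalKeysExamined: stats.totalKeysExamined,
    totalDocsExamined: stats.totalDocsExamined,
  };
}

// `db` is defined only inside mongosh; outside the shell this call is skipped.
if (typeof db !== "undefined") {
  printjson(summarizeExplain(db.users.find({}).explain("executionStats")));
}
```

A result with stage COLLSCAN, totalKeysExamined 0, and totalDocsExamined much larger than nReturned is the usual sign that an index is missing.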

The explain() method can be used in several ways: on a cursor, on the collection before the operation, or with an aggregation.
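
The original examples are missing; the sketch below lists the forms I am aware of. The filter and the `users` collection are hypothetical, and the need to call .next() on the second form outside the interactive prompt is an assumption based on the mongosh documentation.

```javascript
const sampleFilter = { "address.city": "Ahmedabad" }; // hypothetical filter
const samplePipeline = [{ $match: sampleFilter }];

// `db` is defined only inside mongosh; outside the shell these calls are skipped.
if (typeof db !== "undefined") {
  // 1. On a cursor; the default verbosity is "queryPlanner".
  printjson(db.users.find(sampleFilter).explain());

  // 2. On the collection, before the operation. This returns a cursor,
  //    so .next() is called to fetch the explain document.
  printjson(db.users.explain("executionStats").find(sampleFilter).next());

  // 3. For an aggregation pipeline.
  printjson(db.users.explain("executionStats").aggregate(samplePipeline));

  // 4. As an option of aggregate().
  printjson(db.users.aggregate(samplePipeline, { explain: true }));
}
```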

Conclusion:

Now that I have covered these useful techniques for query optimization, take the information provided and see how much faster and more efficient you can make your queries.

Please let me know if you have further performance tips.


Bharat Kumar Sr. Web Developer

I am a Sr. Web Developer at Yudiz Solutions Pvt. Ltd. - a leading Web, Mobile Apps and Game Development Company. I’m interested in learning about latest technology. Passionate about traveling, listening music and watching movies.
