Using regular expressions

MongoDB offers a rich interface for querying using regular expressions. In its simplest form, we can use regular expressions in queries by modifying the query string:

> db.books.find({"name": /mongo/})

This searches our books collection for books whose name contains mongo. It is the equivalent of a SQL LIKE '%mongo%' query.

MongoDB uses Perl Compatible Regular Expression (PCRE) version 8.39 with UTF-8 support.

We can also use some options when querying:

Option    Description
i         Makes the match case-insensitive.
m         For patterns that include anchors (that is, ^ for the start and $ for the end), this option matches at the beginning or end of each line for strings with multiline values. Without this option, these anchors match at the beginning or end of the string. If the pattern contains no anchors, or if the string value has no newline characters (for example, \n), the m option has no effect.
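The effect of the i and m options can be checked outside the database. The following is a minimal sketch that uses JavaScript's built-in RegExp, which is an assumption made for illustration: MongoDB evaluates patterns with PCRE rather than the JavaScript engine, although the two agree on simple cases like these. The sample name value is hypothetical.

```javascript
// Hypothetical multiline value, as it might be stored in a "name" field.
const name = "Introduction\nMongo basics";

// No options: ^ anchors only at the start of the whole string,
// and the match is case-sensitive.
const plain = /^mongo/.test(name);

// i only: case-insensitive, but ^ still anchors at the start of the string.
const caseInsensitive = /^mongo/i.test(name);

// i and m together: ^ also matches immediately after each newline.
const multiline = /^mongo/im.test(name);

console.log(plain, caseInsensitive, multiline); // false false true
```

Only the last pattern matches, because the word Mongo starts the second line rather than the string itself.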

 

In our previous example, if we wanted to search for mongo, Mongo, MONGO, and any other case-insensitive variation, we would need to use the i option, as follows:

> db.books.find({"name": /mongo/i})

Alternatively, we can use the $regex operator, which provides more flexibility.

The same queries using $regex will be written as follows:

> db.books.find({'name': { '$regex': /mongo/ } })
> db.books.find({'name': { '$regex': /mongo/i } })

By using the $regex operator, we can also use the following two options:

Option    Description
x         Extended mode: ignores all whitespace characters in the $regex pattern unless they are escaped or included in a character class. It also ignores everything between (and including) an unescaped hash (#) character and the next newline, so that you can include comments in complicated patterns. This applies only to data characters; whitespace characters may never appear within special character sequences in a pattern. The x option does not affect the handling of the VT (vertical tab) character.
s         Allows the dot character (that is, .) to match all characters, including newline characters.
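To pass x or s, the pattern is given as a string alongside $options. The sketch below builds such query documents as plain JavaScript objects. The patterns and the sample value are hypothetical, and the final check uses JavaScript's own RegExp (which supports s but not x) only as an approximation of what s changes.

```javascript
// Query document using the s option: "." may also match a newline.
const dotAllQuery = { name: { $regex: "mongo.*guide", $options: "s" } };

// Query document using the x option: unescaped whitespace in the
// pattern is ignored, so "mongo db" is treated as "mongodb".
const extendedQuery = { name: { $regex: "mongo db", $options: "x" } };

// In the mongo shell these would be passed to find(), for example:
//   db.books.find(dotAllQuery)
//   db.books.find(extendedQuery)

// Illustration of s with JavaScript's RegExp (not MongoDB's PCRE engine).
const value = "mongo\nguide";
const withoutS = new RegExp(dotAllQuery.name.$regex).test(value);
const withS = new RegExp(
  dotAllQuery.name.$regex,
  dotAllQuery.name.$options
).test(value);

console.log(withoutS, withS); // false true
```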

 

Matching documents with regular expressions makes our queries slower to execute.

A query using a regular expression can only make efficient use of an index if the regular expression is anchored to the beginning of the indexed string; that is, if it starts with ^ or \A. If we only need a starts-with match, we should avoid writing lengthier regular expressions, even if they match the same strings.

Take the following code block as an example:

> db.books.find({'name': { '$regex': /^mongo/ } })
> db.books.find({'name': { '$regex': /^mongo.*/ } })

Both queries will match name values starting with mongo (case-sensitive), but the first one will be faster, as it can stop matching as soon as the first five characters of each name value have been checked.
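That a prefix pattern and its lengthier form select the same values can be checked with a small sketch. It uses JavaScript's RegExp rather than MongoDB's PCRE engine, and the sample names are hypothetical. The shell commands in the comments assume an index on name and show one possible way to compare the two query plans.

```javascript
// Hypothetical name values.
const names = ["mongo", "mongodb in action", "Mongo basics", "learning mongo"];

// Plain prefix pattern and its lengthier equivalent.
const shortPattern = /^mongo/;
const longPattern = /^mongo.*/;

const shortMatches = names.filter((n) => shortPattern.test(n));
const longMatches = names.filter((n) => longPattern.test(n));

console.log(shortMatches); // [ 'mongo', 'mongodb in action' ]
console.log(longMatches);  // [ 'mongo', 'mongodb in action' ]

// In the mongo shell, assuming an index created with
//   db.books.createIndex({ name: 1 })
// the two plans can be compared with:
//   db.books.find({ name: /^mongo/ }).explain("executionStats")
//   db.books.find({ name: /^mongo.*/ }).explain("executionStats")
```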
