As this ebook edition doesn't have fixed pagination, the page numbers below are hyperlinked for reference only, based on the printed edition of this book.
Symbols
$regex operator
options 133
2D geospatial index 233
2dsphere geospatial index 233
A
accumulators 190
accumulator stages 191
ACID properties
atomicity 149
consistency 149
durability 152
exploring 148
isolation 150
Active Directory (AD) 78
Active Record (AR) 111
administration tasks, with mongo shell
currentOp() 64
killOp() 64
Advanced Message Queuing Protocol (AMQP) 322
aggregation
aggregation expressions 144, 145
aggregation framework
need for 176
SQL commands, mapping to 72
versus MapReduce 73
aggregation operators
about 178
aggregation pipeline stage 178
expression string operator 185, 186
literal expression operator 184
miscellaneous operator 185
object expression operator 185
text expression operator 186
timestamp expression operator 187
trigonometry expression operator 187, 188
type expression operator 188, 189
aggregation options
about 176
single-purpose aggregation methods 177
aggregation pipeline
limitations 198
optimizing 199
aggregation pipeline expression operators
about 180
array operators 181
comparison operators 181
set operators 181
aggregation pipeline stage
$addFields 178
$bucket 178
$bucketAuto 178
$collStats 178
$count 178
$densify 179
$documents 179
$facet 179
$fill 179
$geoNear 179
$graphLookup 179
$group 179
$indexStats 179
$limit 179
$listSessions 179
$lookup 179
$match 179
$merge 179
$out 179
$project 179
$redact 179
$replaceRoot 179
$replaceWith 180
$sample 180
$search 180
$searchMeta 180
$set 180
$setWindowFields 180
$skip 180
$sort 180
$sortByCount 180
$unionWith 180
$unset 180
$unwind 180
about 178
aggregation stage operators 178
aggregation stages 144
ahead of time (AOT) 303
Amazon DocumentDB SLA
reference link 366
Amazon Web Services (AWS) 76, 296, 414
American National Standards Institute (ANSI) 4
Apache ActiveMQ 322
Apache Hadoop 324
Apache Kafka 323
Apache Lucene 225
Apache Lucene technology
URL 302
Apache Software Foundation (ASF)
Apache Hadoop 324
Apache Spark
versus Hadoop MapReduce 326
application design
about 401
read performance, optimizing 408
schema design-less 402
application design, anti-patterns
about 407
bloated documents anti-pattern 408
case-insensitive queries, without matching indexes anti-pattern 408
massive arrays anti-pattern 407
unnecessary indexes anti-pattern 407, 408
application design, patterns
about 402
approximation pattern 407
attribute pattern 402
bucket pattern 406
computed pattern 405
extended reference pattern 406
outlier pattern 406
polymorphism 404
schema versioning pattern 405
subset pattern 405
tree traversal pattern 403, 404
application programming interface (API) 53, 304
approximation pattern 407
arbiter node 343
Association for Computing Machinery (ACM) 4
Atomicity, Consistency, Isolation, and Durability (ACID)
need for 152
atomic operations
attribute pattern 402
audit case study
about 94
audit filters 87
audit guarantee 88
auditing
about 84
setup, in MongoDB Enterprise Edition 85
versus logging 84
audit log
authcheck
about 91
actions 91
authentication
about 264
client-server key-based authentication 264
localhost exception 265
server-server key-based authentication 264
username/password client-based authentication 264
authentication, with MongoDB
about 78
Enterprise Edition 78
B
backup and restore roles
backup 270
restore 270
backup options, cluster
backup, making of sharded cluster 260
cloud-based solutions 258
filesystem snapshots, using 259
mongodump, using 261
on-premises solutions 259
queuing system 262
raw files, copying 261
batch operations
performing, with mongo shell 60-62
big data
characteristics 320
data warehousing 324
landscape 320
message queuing systems 321
use case, with servers on-premises 327
Big O complexity chart
reference link 215
Bigtable 4
Binary JSON (BSON) 25, 78, 311, 332
bloated documents anti-pattern 408
Boyce-Codd normal form (BCNF) 24, 402
broker 328
BSON schema
data types 312
BSON types
reference link 25
B-tree data structure
about 215
reference link 216
B-tree indexes
versus LSM indexes 281
B-tree misses
tracking 254
bucket pattern 406
built-in analyzers
keyword 303
language 303
simple 303
standard 303
whitespace 303
BulkWrite API
C
Cacti tool 257
capped collections 13
case-insensitive queries
without matching indexes anti-pattern 408
case insensitivity 227
Cassandra 10
central processing unit (CPU) 412
certificate authority (CA) 76
chained replication 364
change streams
advantages 137
approach 136
notes 141
production recommendations 141
replica sets 141
setup 137
sharded clusters 142
chunk administration, sharding
about 384
chunks, moving 385
default chunk size, modifying 385
read and write concern, setting 384
shards, adding 389
client-server key-based authentication 264
client-side field-level encryption 284
cloud-first database 9
Cloud Native Computing Foundation (CNCF) 304
cloud options, replica set
about 365
Microsoft Azure Cosmos DB 366, 367
MongoDB Atlas 367
cluster
monitoring 249
securing 264
setting up 296
cluster administration roles
clusterAdmin 270
clusterManager 269
clusterMonitor 270
hostManager 269
cluster backups
EC2 backup and restore 262
incremental backups 263
options 258
clustered index 236
CockroachDB 286
command-line interface (CLI) 94, 309
Common Gateway Interface (CGI) 4
comparison operators 181
Compose 297
compound index
about 220
reusing 221
sorting with 220
computed pattern 405
conditional expressions 191
configuration settings
base URL 305
project ID 305
public API key 305
user 305
connections 253
consumer 328
Continuous Cloud Backups 258
count() method 177
create, read, update, and delete (CRUD) 53
cursors 253
custom aggregation expression operator 192
custom write concern 353
D
data
modeling, for atomic operations 30, 31
modeling, for Internet of Things (IoT) 36, 37
modeling, for keyword searches 35, 36
database administration roles
dbAdmin 269
dbOwner 269
userAdmin 269
database-as-a-service (DBaaS) 296, 365
data definition language (DDL) 161
data mapper pattern 47
data migration
to WiredTiger 281
data modeling 25
data size operator 192
data source name (DSN) 315
data types
data warehousing 324
integrations, monitoring 412
delete operation
detect failures 412
diacritical marks 232
diacritic insensitivity 227
digital bank, building with MongoDB
about 153
accounts, transferring 155-163
Directed Acyclic Graph (DAG) 177
directoryperdb option 252
disaster recovery (DR) 258, 398, 415
distinct() method 177
Docker Swarm 304
Doctrine
about 127
best practices 131
documents, creating 127
documents, deleting 128
Doctrine, annotations
reference link 48
Document Metrics option 253
dropping index
about 218
compound index 220
embedded documents, indexing 219
embedded fields, indexing 218
DynamoDB 4
E
EC2 backup and restore 262
e-commerce application
demonstrating, with MongoDB 163-172
Elastic Block Store (EBS) 76
Elastic Compute Cloud (EC2) 300, 329
elections
encrypted storage engine
enterprise data warehouse (EDW) 321
Enterprise Edition
Kerberos authentication 78
LDAP authentication 79
Enterprise Kubernetes Operator 304-307
enumerators (enums) 404
estimatedDocumentCount() method 177
expression arithmetic operator 189, 190
expression operators 178
expression string operator 185, 186
extended reference pattern 406
extract, transform, load (ETL) 11, 72
F
fan-out queries 394
first-in, first-out (FIFO) 288, 328
first normal form (1NF) 24
flexible sync 309
foreign key (FK) 409
free cloud-based monitoring 256
free space
tracking 251
Fully Homomorphic Encryption (FHE) 142
G
General Data Protection Regulation (GDPR) 298
Generic Security Service Application Program Interface (GSSAPI) 265
geospatial index 232
Google Cloud Platform (GCP) 296
Google File System (GFS) 324
Graphical User Interface (GUI) 91, 296
H
Hadoop
reference link 332
Hadoop MapReduce
versus Apache Spark 326
Hadoop-to-MongoDB pipeline
using 334
hash-based sharding 378
hashed index 227
HBase 10
HdfsCLI 334
hidden index 235
high availability
types 342
high availability (HA) 323, 401
horizontal scaling
about 370
advantages 370
disadvantages 370
hot server 342
HttpFS Hadoop API 334
Hypertext Preprocessor (PHP) 103
Hypertext Transfer Protocol (HTTP) 415
Hypertext Transfer Protocol Secure (HTTPS) endpoints 312
I
identifier (ID) 305
incremental backups 263
Incremental MapReduce
data, setting up 69
indexes
building 237
managing 237
indexes management
about 241
index names 241
limitations 242
index types
about 234
2D geospatial index 233
2dsphere geospatial index 233
clustered index 236
dropping index 218
geospatial index 232
hashed index 227
hidden index 235
single-field index 217
time-to-live (TTL) index 228
unique index 230
wildcard indexes 233
index usage
about 242
performance, improving 244
performance, measuring 242, 243
initial coin offering (ICO) 209
input/output operations per second (IOPS) 300
integrations
monitoring 412
International Components for Unicode (ICU) 231
Internet Control Message Protocol (ICMP) 81
Internet of Things (IoT) 9, 195, 321, 406
about 36
Internet Protocol (IP) 77
inter-process communication (IPC) 370
I/O operations per second (IOPS) 21
I/O wait 255
isolation
about 150
example 150
issues 150
levels 150
issues, in isolation
about 150
dirty reads 151
non-repeatable reads 151
phantom reads 151
J
Java ARchive (JAR) 332
Java Enterprise Edition (EE) 322
Java Message Service (JMS) 322
Java Native Interface (JNI) 334
JavaScript Object Notation (JSON) 55, 302, 328, 402
JIRA tracker
reference link 22
JSON audit filter 91
JSON schema
URL 311
using 166
validation types and attributes 166
JSON Web Token (JWT) 311
jumbo chunks 386
K
Kafka
Kerberos 265
Key Management Interoperability Protocol (KMIP) 85, 282
key performance indicators (KPIs) 315
keyword searches
Kubernetes
about 304
URL 304
L
Laravel 46
Last Observation Carried Forward 179
LDAP Proxy Authentication (LDAP SASL) 265
libhdfs 334
Lightweight Directory Access Protocol (LDAP) 78, 296
literal expression operator 184
localhost exception 265
lock percentage 255
logarithmic time 215
logging
about 84
versus auditing 84
logical replication 342
Logical Volume Manager (LVM) 259
log-structured merge (LSM) 281
low operational overhead 14
LSM indexes
versus B-tree indexes 281
M
man-in-the-middle (MitM) attacks 76
many-to-many relationships 34, 35
mapped memory 251
MapReduce 4
MapReduce, mongo shell
concurrency 68
massive arrays anti-pattern 407
mean time between failures (MTBF) 258
megabyte (MB) 68
Memory Mapped Storage Engine version 1 (MMAPv1) 411
memory usage
about 250
monitoring, in WiredTiger 253
message queuing systems
about 321
Apache ActiveMQ 322
Apache Kafka 323
RabbitMQ 322
Microsoft Azure Cosmos DB 366, 367
Microsoft Azure SLA
reference link 367
Mike Hillyer’s blog
reference link 403
Minimum Viable Product (MVP) 9
miscellaneous operator 185
MMAPv1 7
model-view-controller (MVC) 24
MongoDB
about 148
authentication 78
commands and locks, usage 291
commands, for database lock 291, 292
configuration 15
connecting 37
connecting, with PHP driver 44-46
connecting, with Python 42
connecting, with Ruby 38
lock reporting 290
lock yield 290
schema design 25
security tips 75
Spark clusters, querying 335
transactions background 148
URL 145
using, as data warehouse 326, 327
versions 1.0 features 5
versions 1.2 features 5
versions 2 features 6
versions 3 features 7
versions 4 features 7
versions 5 features 8
MongoDB Atlas
about 367
log files 93
monitoring metrics 250
MongoDB Atlas Data Lake
reference link 335
MongoDB Atlas platform 296
MongoDB Atlas Search 225
about 302
MongoDB Atlas Serverless 307
MongoDB Atlas tips 302
MongoDB Atlas tool installation
reference link 94
MongoDB best practices
about 15
for AWS 21
for replication 19
for sharding 20
operational best practices 15-17
schema design best practices 17, 18
MongoDB Charts 313
MongoDB Cloud Manager
about 315
key features 315
MongoDB Compass 314
MongoDB Connector for Business Intelligence (MongoDB Connector for BI) 315
MongoDB CRUD operations
about 103
in Mongoid 110
PHP driver 120
smart querying 132
update operator 132
with Doctrine 127
with PyMODM 118
with Python driver 112
with Ruby driver 104
MongoDB driver methods
about 143
operators 143
MongoDB Enterprise Edition
audit, setting up 85
MongoDB, Express, Angular, and Node (MEAN) 402
MongoDB for NoSQL developers
about 10
flexibility 11
flexible query model 11
native aggregation 11
schema-less model 11
MongoDB for SQL developers 9, 10
MongoDB Kubernetes Operator 304
MongoDB limitations
data integrity checks 411
data storage checks 411
document size limit 411
schema checks 411
MongoDB Management Service (MMS) 295, 315
MongoDB Monitoring Service (MMS) 315
MongoDB Ops Manager
key features 316
MongoDB Query Language (MQL) 297, 335, 366
MongoDB Realm
about 307
Realm Application Services 310
Realm Sync 308
Realm Sync data model 309
Realm Sync mode 309
MongoDB, schema design
read-write ratio 25
MongoDB, security tips
auditing 77
communication, encrypting with TLS/SSL 75, 76
data, encrypting 76
firewalls and VPNs 77
network exposure, limiting 77
secure configuration options, using 77, 78
MongoDB Spark Connector
using 335
MongoDB tools
about 296
MongoDB Atlas platform 296
MongoDB University
reference link 22
MongoDB user group
reference link 22
MongoDB view
collation 197
pipeline 197
viewName 197
viewOn 197
MONGODB-x.509 264
mongodump tool 261
Mongoid
about 110
data, reading 110
documents, creating 112
documents, deleting 112
documents, updating 112
queries, scoping 111
Mongoid models
mongo shell
about 96
authentication and authorization 74
scripting 57
scripting for 56
scripting for, versus direct use 57, 58
securing 73
using, for administration tasks 62, 63
using, for batch inserts 58, 59
using, for batch operations 60-62
mongosh shell
about 79
advantages 80
mongos process
about 393
find 394
find operator 393
limit operator 394
remove operator 395
skip operator 394
sort operator 394
update operator 395
mongostat command 257
mongotop command 257
monitoring tools
about 256
free cloud-based monitoring 256
on-premises tools 257
open source tools 257
SaaS-hosted tools 256
Multi-Version Concurrency Control (MVCC) 276
Munin tool 257
N
Nagios tool 257
nested operations 107
network 253
network-level security 271
New Relic
URL 315
non-monotonic reads 351
Not only SQL (NoSQL)
about 4
evolution 4
O
object document mapping (ODM) 39, 104, 334, 402
object expression operator 185
ObjectId
structure 29
object-oriented programming (OOP) 404
object-relational mapping (ORM) 9, 154, 402
ObjectRocket 297
one-to-many relationship 34, 35
Online Analytical Processing (OLAP) 342
on-premises tools 257
Open Authorization 2.0 (OAuth 2.0) 311
Open Database Connectivity (ODBC) 315
open source tools 257
operations
operations log (oplog)
oplog size 252
Oracle Corporation 4
order of magnitude (OOM) 327
outlier pattern 406
P
page eviction event 251
page fault event 250
page faults
about 250
tracking 254
Pareto principle of 80/20 406
partialFilterExpression filter
operators 229
partition-based sync 309
Payment Card Industry Data Security Standard (PCI DSS) 85
Percona Server for MongoDB 286, 287
Perl Compatible Regular Expressions (PCRE) 132
personally identifiable information (PII) 284
petabytes (PB) 324
PHP driver
about 120
documents, reading 126
documents, updating 126
used, for connecting to MongoDB 44-46
physical replication 342
Plain Old PHP Objects (POPO) 47
Platform as a Service (PaaS) 5
pluggable storage engines
about 275
client-side field-level encryption 284
encrypted storage engine 282, 283
Percona Server for MongoDB 286, 287
RocksDB 286
TokuMX 286
WiredTiger 275
point-in-time recovery (PITR) 316
polymorphism 404
prefix indexing 221
prevent failures 412
primary key (PK) 379
primary read preference 351
producer 328
Proof of Concept (POC) 9
publish/subscribe (pub/sub) 322
PyMODM
about 118
documents, creating 118
documents, deleting 119
documents, querying 119
documents, updating 118
PyMODM models
inheritance 44
PyMongo’s watch command
Python
used, for connecting to MongoDB 42
Python driver
about 112
documents, updating 117
Q
queryable encryption 142
query router 393
queues
reading 255
writing 255
R
RabbitMQ 322
random-access memory (RAM) 415
range-based sharding 377
Read-Eval-Print-Loop (REPL) 80
read performance
optimizing 408
read preference
levels 351
read querying
Realm Application Services
about 310
HTTPS endpoints 312
Realm functions 310
Realm GraphQL 310
Realm Studio 313
Realm triggers 310
Realm values 313
static file hosting 313
user authentication 311
Realm data access rules 309
Realm functions 310
Realm GraphQL 310
Realm Studio 313
Realm Sync 308
Realm Sync data model 309
Realm Sync mode
about 309
Realm data access rules 309
Realm template apps
reference link 310
Realm triggers 310
Realm values 313
Red Hat Enterprise Linux (RHEL) 316
regular expression (regex)
about 404
relational database management system (RDBMS) 215, 402
relational databases
schema design 24
relationships
many-to-many relationship 34, 35
modeling 32
one-to-many relationship 34, 35
replica set
advantages 347
backing up 259
limitations 368
production considerations 357
setting up 348
standalone server, converting into 348
replica set administration
about 360
chained replication 364
flow control 365
initial sync 361
maintenance, performing 360, 361
oplog’s size, modifying 362, 363
replica set member, resyncing 362
replica set, reconfiguring 364
streaming replication 365
replica set members, priority settings
about 354
delayed replica set members 356, 357
hidden replica set members 355, 356
replica sets 252
replication
architectural overview 343
high availability 342
logical replication 342
monitoring 252
physical replication 342
replication lag 252
REpresentational State Transfer (REST) 415
REpresentational State Transfer application programming interface (REST API) 305
resident memory size 251
resilient distributed datasets (RDDs) 325
RocksDB 286
role-based access control-based authorization
backup and restore roles 270
cluster administration roles 269, 270
database administration roles 269
roles, across databases 270
user roles 269
role-based access control (RBAC) 336
roles, across databases
dbAdminAnyDatabase 270
readAnyDatabase 270
readWriteAnyDatabase 270
userAdminAnyDatabase 270
rolling index
building, on replica sets 240
building, on sharded environment 240, 241
Ruby
used, for connecting to MongoDB 38
Ruby driver
about 104
data, deleting 109
data, updating 108
documents, creating 104
nested operations 107
operations, chaining in find() 106, 107
Ruby hashes 107
RVM installation
reference link 39
S
SaaS-hosted tools 256
Salted Challenge Response Authentication Mechanism Secure Hash Algorithm 1 (SCRAM-SHA-1) 78
scaling 13
scatter-and-gather operations 394
schema flexibility 13
schemaless 233
schema versioning pattern 405
secondary cold server 343
secondary hot server 343
secondary warm server 343
second normal form (2NF) 24
Secure Sockets Layer (SSL) 74, 414
security
auditing 271
best practice recommendations 272, 273
booting 413
special cases 271
server-server key-based authentication 264
set operators 181
sharded cluster
about 372
architecture 372
continuous deployment environment 373
development environment 373
staging environment 374
sharded data querying
about 393
hedged reads 395
performance comparison, with replica sets 396
query router 393
with Ruby 396
sharding
about 13
administration 379
chunk administration 384
guidelines 374
monitoring 379
need for 370
operations, using with 395
planning with 374
setting up 374
shard key, selecting 375
sharding recovery
about 397
config server 397
entire cluster 398
mongod process 397
mongos process 397
shard 398
sharding strategies
about 377
custom key sharding 379
hash-based sharding 378
location-based data sharding 379
range-based sharding 377
shard key
changing, for MongoDB v4.2 376
changing, for MongoDB v5.0 376
changing, prior to MongoDB v4.2 375
characteristics 377
considerations, for selection 377
high cardinality 377
low frequency 377
nonmonotonically changing values 377
selecting 375
Short Message Service (SMS) 315
Simple Storage Service (S3) 297
single-field index 217
single point of failure (SPOF) 296
single-purpose aggregation methods 177, 178
single-server system
characteristics 371
limitations 371
slot-based query execution engine (SBE) 243
smart querying
about 132
regular expressions, using 132, 133
storage, considering for delete operation 135, 136
software-as-a-service (SaaS) 296
solid-state drive (SSD) 254, 415
SQL mapping, to MongoDB
reference link 10
standalone server
converting, into replica set 348
standard output (stdout) 57
static file hosting 313
Structured Query Language (SQL)
evolution 4
subset pattern 405
superuser roles
root 270
__system 271
T
terabytes (TB) 319
text expression operator 186
text index
case insensitivity 227
diacritic insensitivity 227
tokenization delimiters 227
third normal form (3NF) 24
time series collection
expireAfterSeconds 196
granularity 196
metaField 196
timeField 196
timestamp expression operator 187
time-to-live (TTL)
index 228
tokenization delimiters 227
TokuMX 286
topic 328
transactions
background 148
multi-document ACID transactions 172
Transport Layer Security (TLS) 302
tree traversal pattern 403, 404
trigonometry expression operator 187, 188
type conversion operator 194, 195
type expression operator
accumulators 190
accumulator stages 191
conditional expressions 191
custom aggregation expression operator 192
data size operator 192
expression arithmetic operator 189, 190
type conversion operator 194, 195
variable expression operator 192
U
Unified Modeling Language (UML) 24
Uniform Resource Identifier (URI) 302
Uniform Resource Locator (URL) 305
unique identifier (UID) 305, 323, 409
unique index 230
universally unique identifier (UUID) 312
unnecessary indexes anti-pattern 407, 408
update operator 132
user authentication 311
username/password client-based authentication 264
user roles
read 269
readWrite 269
V
variable expression operator 192
vertical scaling
about 370
advantages 370
disadvantages 370
virtual address 251
virtual machine (VM) 304
Virtual Private Cloud (VPC) 296, 414
Virtual private networks (VPNs) 77
W
WebHDFS 334
wildcard index
limitations 235
WiredTiger
about 275
benefits 275
B-tree versus LSM indexes 281
checkpoints 276
collection-level options 279, 280
data compression 277
data migration to 281
document-level locking 276
journaling 276
memory usage, monitoring 253
performance strategies 280, 281
snapshots 276
working sets
about 251
calculations 255
write-ahead log (WAL) 152
write concern
custom write concern 353
Y
YAML Ain’t Markup Language (YAML) 76
Yet Another Resource Negotiator (YARN) 324
Z
zettabytes (ZB) 319