## Entry points
- MongoDB is a document database [[Database#Document database]]
- [What is a document database?](https://www.mongodb.com/resources/basics/databases/document-databases) in the MongoDB documentation
- [MongoDB website](https://www.mongodb.com/)
- [MongoDB University](https://learn.mongodb.com)
- [MongoDB Atlas](https://www.mongodb.com/atlas): Cloud-based service
- Entry points in the excellent MongoDB documentation
    - [Data modeling](https://www.mongodb.com/docs/manual/data-modeling/)
    - [Documents](https://www.mongodb.com/docs/manual/core/document/)
    - [Welcome to MongoDB Shell (`mongosh`)](https://www.mongodb.com/docs/mongodb-shell/)
    - [Query and Projection Operators](https://www.mongodb.com/docs/manual/reference/operator/query/)
- Aggregation explained
    - [Aggregation](https://www.mongodb.com/resources/products/capabilities/aggregation)
    - [The MongoDB aggregation pipeline](https://www.mongodb.com/resources/products/capabilities/aggregation-pipeline)
## Structure of a MongoDB instance
Each MongoDB instance can hold multiple databases. Each database can hold multiple collections. Each collection contains documents that hold similar information (for example personal information about customers). A document inside a collection holds the respective data in a structured format.
```
- MongoDB instance
    - Database 1
        - Collection 1
            - Document 1
            - Document 2
            - ...
            - Document n
        - Collection 2
        - ...
        - Collection n
    - Database 2
        - Collection
        - ...
    - Database n
```
## Document content
- The data inside a document is structured as "key-value pairs" (sometimes also called "field-value" or "name-value")
- MongoDB lets you read and write data in a JSON-like format, but it actually stores the data in [BSON](https://bsonspec.org) format (a specification created by MongoDB)
- BSON has some advantages over JSON
    - It is designed to be efficient to encode, decode and traverse (though it is not always smaller than the equivalent JSON)
    - It supports additional data types that JSON lacks (for example dates, binary data and distinct integer types)
- BSON is not human-readable
- Each document must contain an `_id` field. Its value is immutable. If no value is provided, MongoDB generates an `ObjectId` for it (which has a timestamp embedded)
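
The embedded timestamp can be read back from an `ObjectId`. The following is a minimal sketch in plain JavaScript (no MongoDB connection needed). It assumes the standard 12-byte `ObjectId` layout, whose first 4 bytes hold the creation time in seconds since the Unix epoch. The helper name `objectIdToDate` and the sample id are for illustration only.

```javascript
// Hypothetical helper: extracts the creation time from the 24-character
// hex representation of an ObjectId.
function objectIdToDate(hexId) {
  if (!/^[0-9a-fA-F]{24}$/.test(hexId)) {
    throw new TypeError("Expected a 24-character hex string");
  }
  // The first 8 hex characters (4 bytes) are seconds since the Unix epoch.
  const seconds = parseInt(hexId.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

const created = objectIdToDate("507f1f77bcf86cd799439011");
console.log(created.toISOString());
```

In `mongosh`, calling `getTimestamp()` on an `ObjectId` returns the same information.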
## Data modeling
Option 1: Putting several types of data into the documents of one collection ([[#Embedded documents]]).
- Good: Quick and simple retrieval of data
- Bad: Risk of duplicating data points
Option 2: Splitting up data types into dedicated collections and referencing them ([[#References]]):
- Good: Less storage consumption (no duplication of data)
- Bad: Slower data retrieval (having to look up data in each collection, then connect it)
### Embedded documents
This means storing data of each data point plus its related data in one document ("[[Database#Denormalization|denormalized]] design"), which results in one document per data point in a single collection. Consider this when working with **one-to-one** and **one-to-many** relationships.
The related inlined data is also called a sub-document.
> For many use cases in MongoDB, the denormalized data model where related data is stored within a single document is optimal.
>
> -- From [Database References](https://www.mongodb.com/docs/manual/reference/database-references/#document-references) in the MongoDB documentation
### References
This means storing data points and related data in different collections and referencing the related data in the document where necessary ("[[Database#Normalization|normalized]] design"). Consider this when working with many-to-many relationships. The MongoDB documentation differentiates between "manual references" and DBRefs.
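To make the difference between the two designs concrete, here is a hedged sketch in plain JavaScript. The arrays stand in for collections, and all field names and values (`address`, `orders`, `customers_id`, ...) are made up for illustration. It shows an embedded document next to a manual reference that needs a second lookup.

```javascript
// Embedded (denormalized): the address lives inside the customer document.
const embeddedCustomer = {
  _id: 1,
  name: "Ada",
  address: { city: "London", zip: "N1" } // sub-document
};

// Referenced (normalized): two "collections", linked by a manual reference.
const customers = [{ _id: 1, name: "Ada" }];
const orders = [
  { _id: 100, item: "Keyboard", customers_id: 1 },
  { _id: 101, item: "Mouse", customers_id: 1 }
];

// Resolving a manual reference takes a second lookup per order.
function resolveCustomer(order) {
  return customers.find((c) => c._id === order.customers_id);
}

console.log(embeddedCustomer.address.city);   // one read, no lookup
console.log(resolveCustomer(orders[0]).name); // extra lookup in "customers"
```

Against a real database, the second lookup would be another `findOne()` call or a `$lookup` aggregation stage.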
### Naming of databases, collections and fields
MongoDB has a few [restrictions when it comes to naming of databases](https://www.mongodb.com/docs/manual/reference/limits/#naming-restrictions), collections and fields. Everything else seems subject to personal preference or convention; it just should be consistent.
Good practices seem to be:
- Database name: Only letters, [[Case styles|flatcase]]
- Collection name: Only letters, plural, [[Case styles|flatcase]]
- Field names: [[Case styles|camelCase]], except when the field refers to a document id of a different collection (when using [[#References]]), where the field name would be collection name + `_id` (for example `customers_id`)
## Indexing
### Index types
#### Single field index
This means that the [[Database#Index|index]] references one field of a document. If the indexed field is an array, MongoDB handles that automatically. This is then called a "single field multikey index".
#### Compound index
This means that the [[Database#Index|index]] references multiple fields of a document. In any single document, at most one of the indexed fields may hold an array.
A compound index improves performance not only for queries that filter on all indexed fields, but also for queries that filter on a prefix of them, such as the first indexed field alone. In other words: There is no need to have a dedicated index only for the first indexed field.
For example, imagine an index that includes an `age` field as the first field and `grade` as the second field. Then imagine two types of queries:
- A `find()` query that only filters for `age` is also optimized by the compound index
- A `find()` query that only filters for `grade` is **not** optimized by the compound index
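
The rule above can be sketched as a small check in plain JavaScript. This is a simplified model of index prefixes for equality filters only, not a description of how MongoDB's query planner is implemented. `usablePrefixLength` is a hypothetical helper.

```javascript
// Counts how many leading index fields the filter covers.
// 0 means the filter does not start at the index's first field, so in
// this simplified model the compound index does not help.
function usablePrefixLength(indexFields, filter) {
  let length = 0;
  for (const field of indexFields) {
    if (!(field in filter)) break;
    length += 1;
  }
  return length;
}

// Field order as in createIndex({ age: 1, grade: 1 })
const indexFields = ["age", "grade"];

console.log(usablePrefixLength(indexFields, { age: 21 }));             // 1
console.log(usablePrefixLength(indexFields, { grade: "A" }));          // 0
console.log(usablePrefixLength(indexFields, { grade: "A", age: 21 })); // 2
```

To confirm which index a real query uses, check the output of [[#`explain()`]].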
### Index properties
- [Partial index](https://www.mongodb.com/docs/manual/core/index-partial)
- [Sparse index](https://www.mongodb.com/docs/manual/core/index-sparse)
- [TTL index](https://www.mongodb.com/docs/manual/core/index-ttl)
- [Unique index](https://www.mongodb.com/docs/manual/indexes/#unique-indexes)
## MongoDB Shell
Run `mongosh` to connect to the MongoDB instance and start the REPL. The prompt shows the currently selected database.
```
show dbs # Show databases on the instance
use the-db-name # Switch to database
db # Show name of currently selected database
show collections # Show collections in the current database
db.collectionName.find() # Returns all documents in the specified collection
it # Iterates the cursor to the next batch of results
```
All following code examples assume a collection name of `customers`. This is to reflect the good practice of naming a collection with the plural of the content.
### Methods
#### Mind the arguments
The different methods take different parameters, which sometimes confused me in the beginning when it came to getting all the braces and brackets right.
#### `find()` and `findOne()`
Use [`find()`](https://www.mongodb.com/docs/manual/reference/method/db.collection.find/) to retrieve multiple documents, or `findOne()` to retrieve a single document (the first that matches the query). Both methods take three optional arguments: query, projection and options.
```mongodb
db.customers.find(
{},
{},
{}
)
```
Return all documents of the collection:
```mongodb
db.customers.find()
```
Pass a query to search for documents that match the values of the specified fields:
```mongodb
db.customers.find({
fieldName: "value",
otherFieldName: "value"
})
```
Pass a query that looks for a value in an [[#Embedded documents|embedded document]]. Note the quotes and dot notation.
```mongodb
db.customers.find({
"parentField.childField": "value"
})
```
Match only the documents whose array field holds exactly the specified values, in the same order (not more, not less). For more options on how to query data in arrays, see the [[#`$all`]] and [[#`$elemMatch`]] operators.
```mongodb
db.customers.find({
fieldName: [ "value1", "value2" ]
})
```
Call `find()` with the second parameter (projection) to only return specified fields. Pass `1` to include the specified field, pass `0` to exclude the specified field. Note: Combining inclusion and exclusion is only possible with the `_id` field. Mixing `1` and `0` values for other fields results in a `MongoServerError`.
```mongodb
db.customers.find(
  { fieldName: "value" },
  {
    desiredFieldName: 1,
    _id: 0
  }
)
```
Chain `sort()` to sort the result in ascending order (use `-1` for descending order). Note: If the result contains multiple documents with the same value for the specified field, the order of documents can change with every query.
```mongodb
db.customers.find().sort({
  fieldName: 1,
  otherFieldName: 1
})
```
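One way to get a stable order is to add a field with unique values, such as `_id`, as the last sort key. The sketch below mimics the idea in plain JavaScript on an in-memory array. The sample data and the `compareBy` helper are hypothetical.

```javascript
// Builds a comparator from a MongoDB-style sort spec such as
// { city: 1, _id: 1 } (1 = ascending, -1 = descending).
function compareBy(sortSpec) {
  return (a, b) => {
    for (const [field, direction] of Object.entries(sortSpec)) {
      if (a[field] < b[field]) return -direction;
      if (a[field] > b[field]) return direction;
    }
    return 0; // tie on all sort keys
  };
}

const docs = [
  { _id: 3, city: "Bern" },
  { _id: 1, city: "Bern" },
  { _id: 2, city: "Aarau" }
];

// _id breaks the tie between the two "Bern" documents.
const sorted = [...docs].sort(compareBy({ city: 1, _id: 1 }));
console.log(sorted.map((d) => d._id)); // [ 2, 1, 3 ]
```

In `mongosh`, the equivalent would be `db.customers.find().sort({ fieldName: 1, _id: 1 })`.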
Other useful chainable methods:
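- `.count()`: Returns number of matching documents (the MongoDB documentation marks `count()` as deprecated in favor of `db.customers.countDocuments()`)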
- `.limit()`: Set maximum number of results
- `.pretty()`: Prettify output
- [[#`explain()`]]: Get performance metrics for a query
#### `insertOne()` and `insertMany()`
Insert one document into a collection (it creates the specified collection if it does not exist).
```mongodb
db.customers.insertOne({
fieldName: "value",
otherField: "value"
})
```
Insert multiple documents into a collection (and create the specified collection if non-existent). Note that the first argument is an array and not an object as with other commands.
```mongodb
db.customers.insertMany(
[document1, document2, ...]
)
```
#### `updateOne()` and `updateMany()`
These update one or many documents. Mind the arguments. Both methods take three: Filter (see [[#`find()` and `findOne()`]]), update (document) and options, each of which is an object. The first two arguments are mandatory.
```mongodb
db.customers.updateOne(
{},
{},
{}
)
```
Update an existing document (see also [`$set` operator docs](https://www.mongodb.com/docs/manual/reference/operator/update/set/#mongodb-update-up.-set/)). The first argument specifies a filter for finding the desired document. Note that if multiple documents match with the filter criteria, the command updates the **first** matching document.
```mongodb
db.customers.updateOne(
{ fieldName: "value" },
{ $set: { fieldToChange: "new-value" } }
)
```
Update an element **inside an embedded document**. Note the quotes and the dot notation.
```mongodb
db.customers.updateOne(
{ fieldName: "value" },
{ $set: { "someField.childField": "new-value" } }
)
```
Update an existing element **inside an array**. Note the quotes and the dot notation to specify the element's index inside the array.
```mongodb
db.customers.updateOne(
{ fieldName: "value" },
{ $set: { "arrayToChange.0": "new-value" } }
)
```
Append a new element to an array. If the array field does not exist, the command will create it and add the value. A single `$push` can target several array fields at once; to append multiple values to one array, combine it with the `$each` modifier.
```mongodb
db.customers.updateOne(
{ fieldName: "value" },
{ $push: { arrayToAddSomethingTo: "new-value" } }
)
```
**[[Database#Upsert|Upsert]]** data. Note the third argument that contains an `upsert` field. This query uses the `updateOne()` method, but `updateMany()` would work as well.
```mongodb
db.customers.updateOne(
{ fieldName: "value" },
{ $set: { fieldToChange: "new-value" } },
{ upsert: true }
)
```
#### `findAndModify()`
Find and return a document and optionally change it. **Note that this method takes named parameters within an object, so watch the curly braces.** Take this example call that only uses a few of the numerous available parameters. It...
- ... searches for documents specified in `query`
- ... updates the fields specified in `update`
- ... returns the modified document, because of `new: true`
- ... adds a new document if `query` did not yield a result
```mongodb
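db.customers.findAndModify({
  query: { fieldToFind: "value" },
  // $set changes only the listed fields; without an update operator,
  // the whole document would be replaced
  update: { $set: { fieldToUpdate: "value" } },
  new: true,
  upsert: true
})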
```
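What is the difference compared to [[#`updateOne()` and `updateMany()`|`updateOne()`]]? `findAndModify()` modifies a single document and returns it, either as it was before the modification (the default) or afterwards (with `new: true`), so one can apply further conditional modifications. `updateOne()` and `updateMany()` only return a summary of the operation (for example the number of matched and modified documents), which makes them the lighter choice when the document itself is not needed.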
#### `deleteOne()` and `deleteMany()`
Both methods take two arguments, the first being a filter (mandatory), the second being options (optional).
```mongodb
db.customers.deleteOne(
{},
{}
)
```
Delete the first document in the collection.
```mongodb
db.customers.deleteOne({})
```
Delete all documents in the collection.
```mongodb
db.customers.deleteMany({})
```
#### `replaceOne()`
Takes three arguments: A filter (mandatory), the document to insert (mandatory), options (optional).
Note the difference to [[#`updateOne()` and `updateMany()`|`updateOne()`]]: `replaceOne()` replaces the found document completely, so if the original document contains more fields, the command will remove them.
```mongodb
db.customers.replaceOne(
{},
{},
{}
)
```
Replace the first matching document with the specified document.
```mongodb
db.customers.replaceOne(
{ content: "value" },
{ content: "New value", addedField: "Another value" }
)
```
#### `createIndex()`
Create an index for the specified field. Setting `1` sorts the index by value in ascending order, `-1` sorts the index in descending order. Pass more fields to get a [[#Compound index]].
```mongodb
db.customers.createIndex({
fieldToIndex: 1
})
```
#### `getIndexes()`
List indexes of a collection.
```mongodb
db.customers.getIndexes()
```
#### `dropIndex()`
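Remove the specified index by its name. Unless a custom name was given, the name consists of the indexed field and the sort direction, joined by an underscore (see the output of [[#`getIndexes()`]]).
```mongodb
db.customers.dropIndex("fieldToIndex_1")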
```
#### `explain()`
Get execution statistics for a query. This is especially interesting to get insight on performance before/after having [[#`createIndex()`|created an index]].
```mongodb
db.customers
.find({ fieldName: "value" })
.explain("executionStats")
```
### Operators
#### `$gt` and `$lt`
Match documents with a value greater than the one specified in the query (`$lt` works the same way for "less than").
```mongodb
db.customers.find({
fieldName: { $gt: value }
})
```
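Return documents where the array `activeYears` contains a value >= 2010 **and** a value <= 2015. Note that without [[#`$elemMatch`]], different array elements may satisfy the two conditions, so an array like `[2005, 2020]` also matches.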
```mongodb
db.customers.find({
activeYears: { $gte: 2010, $lte: 2015 }
})
```
Operators like `$gt` also work on strings. They compare lexicographically. This example finds all students that have a grade after the letter "C" in an array `grades`, where each array element is an object/document with a `grade` field.
```mongodb
db.students.find({
grades: { $elemMatch: { grade: { $gt: "C" } } }
})
```
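Lexicographic comparison can produce surprising results. The following plain JavaScript sketch uses JavaScript's own string comparison, which for ASCII text orders strings the same way as MongoDB's default binary comparison (assuming no collation is configured).

```javascript
// Strings are compared character by character, by code point.
console.log("D" > "C");  // true
console.log("C+" > "C"); // true: a longer string sorts after its prefix
console.log("a" > "C");  // true: lowercase letters sort after uppercase
console.log("10" > "9"); // false: "1" sorts before "9"
```

So a filter like `{ $gt: "C" }` also matches a grade such as `"C+"`, and numbers stored as strings do not compare numerically.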
#### `$all`
Return documents that contain all of the specified values **inside a plain array field, in any order**.
```mongodb
db.customers.find({
fieldName: { $all: ["value 1", "value 2"] }
})
```
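The difference between an exact array match and `$all` can be mimicked in plain JavaScript. This is only an in-memory sketch of the matching semantics for arrays of plain values. `matchesExactly` and `matchesAll` are hypothetical helpers, not MongoDB functions.

```javascript
// Exact match: same elements, same order, nothing extra.
function matchesExactly(array, wanted) {
  return array.length === wanted.length &&
    array.every((value, i) => value === wanted[i]);
}

// $all-like match: every wanted value is present; order and extras are ignored.
function matchesAll(array, wanted) {
  return wanted.every((value) => array.includes(value));
}

const stored = ["value 2", "value 1", "value 3"];
const wanted = ["value 1", "value 2"];

console.log(matchesExactly(stored, wanted)); // false
console.log(matchesAll(stored, wanted));     // true
```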
#### `$elemMatch`
Return documents that contain the specified values in a field that holds an **array of objects/documents**. In this example, `fieldName` holds such an array of objects.
```mongodb
db.customers.find({
fieldName: { $elemMatch: {
subField: "value",
otherSub: "other-value"
}
}
})
```
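Why not use dot notation with two conditions instead, as in `{ "fieldName.subField": "value", "fieldName.otherSub": "other-value" }`? With dot notation, each condition may be satisfied by a different array element, whereas `$elemMatch` requires a single element to satisfy all of them. Here is a hedged in-memory sketch of that difference in plain JavaScript, with hypothetical helper names and sample data.

```javascript
const doc = {
  fieldName: [
    { subField: "value", otherSub: "something-else" },
    { subField: "unrelated", otherSub: "other-value" }
  ]
};

const conditions = { subField: "value", otherSub: "other-value" };

// $elemMatch-like: ONE element must satisfy every condition.
function elemMatch(array, conds) {
  return array.some((el) =>
    Object.entries(conds).every(([key, value]) => el[key] === value)
  );
}

// Dot-notation-like: each condition may match a DIFFERENT element.
function dotNotationMatch(array, conds) {
  return Object.entries(conds).every(([key, value]) =>
    array.some((el) => el[key] === value)
  );
}

console.log(elemMatch(doc.fieldName, conditions));        // false
console.log(dotNotationMatch(doc.fieldName, conditions)); // true
```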
#### `$exists`
Only return documents where the respective field exists.
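As a small sketch of how `$exists` behaves: the filter documents below use the real `$exists` operator, while the field name `phone`, the sample data and the in-memory `fieldExists` helper are assumptions for illustration. Note that a field set to `null` still counts as existing.

```javascript
// Filter documents as they would be passed to find() in mongosh,
// for example: db.customers.find(hasPhone)
const hasPhone = { phone: { $exists: true } };
const noPhone = { phone: { $exists: false } };

// In-memory imitation of the $exists check for a top-level field.
function fieldExists(doc, field) {
  return field in doc;
}

const docs = [
  { _id: 1, phone: "555-0100" },
  { _id: 2, phone: null }, // field exists, value is null
  { _id: 3 }
];

const withPhone = docs.filter((d) => fieldExists(d, "phone"));
const withoutPhone = docs.filter((d) => !fieldExists(d, "phone"));

console.log(withPhone.map((d) => d._id));    // [ 1, 2 ]
console.log(withoutPhone.map((d) => d._id)); // [ 3 ]
```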
## Aggregation
Analytics operations in MongoDB are called **aggregation**.
Aggregation pipeline: A collection's data passes from the start to the end of the pipeline. The pipeline consists of several stages. Each stage can perform a specific operation on the data, like filtering or sorting, adding new elements and creating a new collection as output.
Use the `aggregate()` method to specify an aggregation pipeline. It takes an array of stages as first argument. The second argument is optional and thus not present in the following example.
```mongodb
db.customers.aggregate([
{ $match: { fieldName: "value" } }
])
```
See also the [`$match` operator](https://www.mongodb.com/docs/manual/reference/operator/aggregation/match/) in the MongoDB documentation.
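A pipeline usually has more than one stage. The sketch below builds a three-stage pipeline (`$match`, `$group` and `$sort`, all real stages) as a plain JavaScript array and then imitates its result on in-memory sample data, so that the flow from stage to stage is visible. The field names `status`, `city` and `total`, as well as the sample documents, are made up.

```javascript
// Pipeline as it would be passed in mongosh: db.customers.aggregate(pipeline)
const pipeline = [
  { $match: { status: "active" } },                 // 1. filter
  { $group: { _id: "$city", total: { $sum: 1 } } }, // 2. count per city
  { $sort: { total: -1 } }                          // 3. largest count first
];

const customers = [
  { name: "Ada", city: "London", status: "active" },
  { name: "Bob", city: "Paris", status: "inactive" },
  { name: "Cy", city: "London", status: "active" },
  { name: "Di", city: "Paris", status: "active" }
];

// In-memory imitation of the three stages above.
const matched = customers.filter((c) => c.status === "active");
const counts = new Map();
for (const c of matched) {
  counts.set(c.city, (counts.get(c.city) || 0) + 1);
}
const result = [...counts]
  .map(([city, total]) => ({ _id: city, total }))
  .sort((a, b) => b.total - a.total);

console.log(result);
```

Each stage only sees the output of the previous one, which is why filtering early with `$match` keeps the later stages small.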