MongoDB
Learn document databases from scratch - why they exist, how to design schemas, and how to query data efficiently.
The Filing Cabinet Problem
Imagine you're organizing information about your friends. In a traditional filing cabinet (like SQL databases), you'd need separate drawers:
- People drawer: Names, birthdates
- Addresses drawer: Street, city, country
- Hobbies drawer: Activity name, skill level
To learn everything about one friend, you'd open three drawers, find the matching cards, and piece it together. Every. Single. Time.
Now imagine a smarter system where each friend has one folder containing everything: their info, addresses, hobbies, photos - all in one place. Need to know about Sarah? Open Sarah's folder. Done.
That's MongoDB. It stores everything related together in documents, not spread across tables.
When MongoDB Makes Sense
MongoDB shines when:
- Your data is naturally nested (user profiles with settings, preferences, history)
- Your schema needs to evolve quickly
- You're building content management systems, catalogs, or real-time apps
- You don't need complex joins across many relationships
MongoDB struggles when:
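- You rely heavily on transactions that span multiple documents (supported since MongoDB 4.0, but costlier than single-document writes)
- Your data is highly relational (orders → products → suppliers → warehouses)
- You need strict schema enforcement at the database level (MongoDB's schema validation is opt-in)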
The Honest Truth
Many apps work fine with either MongoDB or SQL. Choose based on how your data naturally looks, not hype.
Getting Started
First, install MongoDB locally or use MongoDB Atlas (free cloud tier). Then install Mongoose, which is built on top of the official MongoDB Node.js driver:
npm install mongoose
Mongoose is an ODM (Object Document Mapper) that makes working with MongoDB much nicer. It adds schemas, validation, and helpful methods.
Connecting to MongoDB
import mongoose from 'mongoose';
async function connectDB() {
try {
await mongoose.connect(process.env.MONGODB_URI);
console.log('Connected to MongoDB');
} catch (error) {
console.error('MongoDB connection failed:', error.message);
process.exit(1);
}
}
// Log when the connection drops
mongoose.connection.on('disconnected', () => {
console.log('MongoDB disconnected');
});
// Close connection when app stops
process.on('SIGINT', async () => {
await mongoose.connection.close();
process.exit(0);
});
export { connectDB };
Use it in your main file:
import express from 'express';
import { connectDB } from './config/database.js';
const app = express();
// Connect to database before starting server
await connectDB();
app.listen(3000, () => {
console.log('Server running on port 3000');
});
Your connection string looks like:
mongodb://localhost:27017/myapp # Local
mongodb+srv://user:pass@cluster.mongodb.net/myapp # Atlas
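Because the Atlas form of the connection string embeds a username and password, it is worth masking it before it reaches a log line. The helper below is a minimal sketch, not part of Mongoose; the name redactMongoUri is made up for illustration, and it assumes the password contains no unescaped "@" or "/" characters.

```javascript
// Hypothetical helper: hide the password part of a MongoDB URI before logging it.
function redactMongoUri(uri) {
  // Matches "//user:password@" and keeps only the user name.
  return uri.replace(/\/\/([^:@\/]+):([^@\/]+)@/, '//$1:***@');
}

console.log(redactMongoUri('mongodb+srv://user:pass@cluster.mongodb.net/myapp'));
console.log(redactMongoUri('mongodb://localhost:27017/myapp'));
```

A local URI without credentials passes through unchanged, so the same call can wrap either form.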
Documents and Collections
In MongoDB:
- A document is like a JavaScript object (actually stored as BSON, binary JSON)
- A collection is a group of documents (like a folder full of similar documents)
- A database contains multiple collections
Here's what a user document might look like:
{
_id: ObjectId("507f1f77bcf86cd799439011"),
name: "Sarah Chen",
email: "sarah@example.com",
profile: {
bio: "Full-stack developer",
avatar: "https://example.com/sarah.jpg",
social: {
twitter: "@sarahcodes",
github: "sarahchen"
}
},
tags: ["javascript", "react", "mongodb"],
createdAt: ISODate("2024-01-15T10:30:00Z")
}
Notice how everything is together - profile, social links, tags. No joins needed.
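MongoDB addresses nested fields with dot notation, for example 'profile.social.github'. The plain-JavaScript sketch below shows what such a path resolves to on a trimmed-down copy of the document above. The getByPath helper is hypothetical and only mimics the lookup; it is not a MongoDB or Mongoose function.

```javascript
// A reduced copy of the example user document (plain object, no database needed).
const sarah = {
  name: 'Sarah Chen',
  profile: { bio: 'Full-stack developer', social: { github: 'sarahchen' } },
  tags: ['javascript', 'react', 'mongodb']
};

// Hypothetical helper: walk a dotted path one key at a time.
function getByPath(doc, path) {
  return path.split('.').reduce(
    (value, key) => (value == null ? undefined : value[key]),
    doc
  );
}

console.log(getByPath(sarah, 'profile.social.github'));
console.log(getByPath(sarah, 'tags.0'));
```

The same dotted strings appear later in query filters such as 'profile.avatar'.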
Schemas: Bringing Order to Chaos
MongoDB is "schemaless" - you can store anything. But that's usually a bad idea. Mongoose schemas define structure:
import mongoose from 'mongoose';
const userSchema = new mongoose.Schema({
// Required string with length limits
name: {
type: String,
required: [true, 'Name is required'],
minlength: [2, 'Name must be at least 2 characters'],
maxlength: [100, 'Name too long'],
trim: true
},
// Email with uniqueness and format validation
email: {
type: String,
required: true,
unique: true,
lowercase: true,
match: [/^\S+@\S+\.\S+$/, 'Invalid email format']
},
// Enum - only these values allowed
role: {
type: String,
enum: ['user', 'admin', 'moderator'],
default: 'user'
},
// Nested object
profile: {
bio: String,
avatar: String,
social: {
twitter: String,
github: String
}
},
// Array of strings
tags: [String],
// Number with min/max
loginCount: {
type: Number,
default: 0,
min: 0
}
}, {
timestamps: true // Automatically adds createdAt and updatedAt
});
// Create the model
const User = mongoose.model('User', userSchema);
export { User };
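To make the name, email, and role rules concrete, here is a rough plain-JavaScript approximation of what they check. This is not how Mongoose implements validation. The function name checkUser and the 'Email is required' and 'Invalid role' messages are placeholders chosen for this sketch.

```javascript
// Plain-JS approximation of the name, email, and role rules in userSchema.
// It only mirrors the checks; Mongoose's own implementation differs.
function checkUser(input) {
  const errors = {};
  // trim: true and lowercase: true transform the value before validation.
  const name = typeof input.name === 'string' ? input.name.trim() : input.name;
  const email = typeof input.email === 'string' ? input.email.toLowerCase() : input.email;
  const role = input.role ?? 'user'; // default: 'user'

  if (!name) errors.name = 'Name is required';
  else if (name.length < 2) errors.name = 'Name must be at least 2 characters';
  else if (name.length > 100) errors.name = 'Name too long';

  if (!email) errors.email = 'Email is required';
  else if (!/^\S+@\S+\.\S+$/.test(email)) errors.email = 'Invalid email format';

  if (!['user', 'admin', 'moderator'].includes(role)) errors.role = 'Invalid role';

  return { value: { name, email, role }, errors };
}

console.log(checkUser({ name: '  A ', email: 'Sarah@Example.com' }));
```

Note that uniqueness is absent from the sketch: unique: true asks MongoDB to build a unique index rather than adding a validator, so duplicates surface as a database error at write time.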
Schema Types at a Glance
| Type | Example | Notes |
|---|---|---|
| String | name: String | Text data |
| Number | age: Number | Integers or floats |
| Boolean | active: Boolean | true/false |
| Date | birthday: Date | JavaScript dates |
| Array | tags: [String] | Arrays of any type |
| Object | profile: { bio: String } | Nested documents |
| ObjectId | author: mongoose.Schema.Types.ObjectId | References to other documents |
CRUD Operations: The Basics
CRUD stands for Create, Read, Update, Delete - the four fundamental operations.
Creating Documents
// Method 1: Create directly
const user = await User.create({
name: 'Sarah Chen',
email: 'sarah@example.com',
role: 'user'
});
console.log(user._id); // MongoDB generated this ID
// Method 2: New + Save (useful when you need to modify before saving)
const john = new User({
name: 'John Doe',
email: 'john@example.com'
});
john.profile = { bio: 'Software engineer' };
await john.save();
// Create multiple at once
const users = await User.insertMany([
{ name: 'Alice', email: 'alice@example.com' },
{ name: 'Bob', email: 'bob@example.com' }
]);
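Since email is declared unique, creating a second user with the same address fails at the database level, and MongoDB reports that with error code 11000. The sketch below shows one way to tell that case apart from other failures. The names isDuplicateKeyError and createOrReportConflict are hypothetical, and createFn stands in for a call such as () => User.create(data).

```javascript
// MongoDB signals a unique-index violation with error code 11000.
function isDuplicateKeyError(error) {
  return Boolean(error) && error.code === 11000;
}

// Hypothetical wrapper: createFn is any async function that inserts a document.
async function createOrReportConflict(createFn) {
  try {
    return { ok: true, doc: await createFn() };
  } catch (error) {
    if (isDuplicateKeyError(error)) {
      return { ok: false, reason: 'duplicate' };
    }
    throw error; // Anything else is unexpected, so let it propagate.
  }
}
```

In a route handler, the 'duplicate' result would typically map to an HTTP 409 response, though that choice is up to the application.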
Reading Documents
// Find all users
const allUsers = await User.find();
// Find with conditions
const admins = await User.find({ role: 'admin' });
// Find one (returns first match or null)
const sarah = await User.findOne({ email: 'sarah@example.com' });
// Find by ID (the most common operation)
const user = await User.findById('507f1f77bcf86cd799439011');
// Select only specific fields (projection)
const names = await User.find().select('name email');
// Returns: [{ _id: ..., name: 'Sarah', email: 'sarah@...' }, ...]
// Exclude fields
const noPasswords = await User.find().select('-password -__v');
// Sorting
const newest = await User.find().sort({ createdAt: -1 }); // -1 = descending
const alphabetical = await User.find().sort('name'); // ascending
// Limit results
const firstFive = await User.find().limit(5);
// Count documents
const totalUsers = await User.countDocuments();
const activeAdmins = await User.countDocuments({ role: 'admin', active: true });
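One practical detail with findById: if the string is not a well-formed ObjectId, Mongoose rejects the query with a CastError instead of returning null. A quick format check up front lets an API answer with a clean "not found" or "bad request". The helper name looksLikeObjectId is an assumption for this sketch; Mongoose also ships its own mongoose.isValidObjectId, whose rules may be looser.

```javascript
// An ObjectId is 12 bytes, usually written as 24 hexadecimal characters.
function looksLikeObjectId(value) {
  return typeof value === 'string' && /^[0-9a-fA-F]{24}$/.test(value);
}

console.log(looksLikeObjectId('507f1f77bcf86cd799439011'));
console.log(looksLikeObjectId('not-an-id'));
```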
Updating Documents
// Find, update, and return the NEW document
const user = await User.findByIdAndUpdate(
'507f1f77bcf86cd799439011',
{ name: 'Sarah Chen-Smith' },
{
new: true, // Return updated document, not original
runValidators: true // Run schema validators on update
}
);
// Update without returning the document
await User.updateOne(
{ email: 'sarah@example.com' },
{ $set: { role: 'admin' } }
);
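// Update many documents at once
// (status must be declared in the schema; in strict mode Mongoose drops unknown fields from updates)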
await User.updateMany(
{ loginCount: { $lt: 5 } },
{ $set: { status: 'inactive' } }
);
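Update Operators
MongoDB has special operators for updates. They are listed in one call here for reference only: in a real update, a field such as tags may appear under just one operator, otherwise MongoDB rejects the update with a conflict error.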
await User.findByIdAndUpdate(id, {
// Set a field
$set: { name: 'New Name' },
// Remove a field entirely
$unset: { temporaryField: '' },
// Increment a number
$inc: { loginCount: 1 },
// Add to an array
$push: { tags: 'mongodb' },
// Remove from an array
$pull: { tags: 'old-tag' },
// Add to array only if not already there
$addToSet: { tags: 'unique-tag' }
});
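To see what each operator does to a document, here is a simplified in-memory model. It handles top-level fields and primitive array items only, and applyUpdate is a made-up name for this sketch; real MongoDB semantics are richer (nested paths, query conditions in $pull, and so on).

```javascript
// Simplified in-memory model of a few update operators (top-level fields only).
// Returns a new object and leaves the input document untouched.
function applyUpdate(doc, update) {
  const result = { ...doc };
  for (const [field, value] of Object.entries(update.$set ?? {})) {
    result[field] = value;
  }
  for (const field of Object.keys(update.$unset ?? {})) {
    delete result[field];
  }
  for (const [field, amount] of Object.entries(update.$inc ?? {})) {
    result[field] = (result[field] ?? 0) + amount;
  }
  for (const [field, item] of Object.entries(update.$push ?? {})) {
    result[field] = [...(result[field] ?? []), item]; // always appends
  }
  for (const [field, item] of Object.entries(update.$addToSet ?? {})) {
    const current = result[field] ?? [];
    result[field] = current.includes(item) ? current : [...current, item];
  }
  for (const [field, item] of Object.entries(update.$pull ?? {})) {
    result[field] = (result[field] ?? []).filter((entry) => entry !== item);
  }
  return result;
}

const before = { name: 'Sarah', loginCount: 1, tags: ['mongodb', 'old-tag'] };
console.log(applyUpdate(before, { $inc: { loginCount: 1 }, $pull: { tags: 'old-tag' } }));
```

The demo call touches each field with a single operator, which mirrors what the database accepts.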
Deleting Documents
// Find and delete, returns the deleted document
const deleted = await User.findByIdAndDelete('507f1f77bcf86cd799439011');
// Delete without returning
await User.deleteOne({ email: 'old@example.com' });
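// Delete many (everything created more than a year ago)
const oneYearAgo = new Date();
oneYearAgo.setFullYear(oneYearAgo.getFullYear() - 1);
await User.deleteMany({ createdAt: { $lt: oneYearAgo } });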
Query Operators: Finding What You Need
MongoDB queries can be simple or sophisticated:
// Comparison operators
await User.find({ age: { $gt: 18 } }); // Greater than
await User.find({ age: { $gte: 18 } }); // Greater than or equal
await User.find({ age: { $lt: 65 } }); // Less than
await User.find({ age: { $lte: 65 } }); // Less than or equal
await User.find({ role: { $ne: 'admin' } }); // Not equal
await User.find({ status: { $in: ['active', 'pending'] } }); // In array
// Logical operators
await User.find({
$and: [
{ age: { $gte: 18 } },
{ role: 'user' }
]
});
await User.find({
$or: [
{ role: 'admin' },
{ role: 'moderator' }
]
});
// Array queries
await User.find({ tags: 'javascript' }); // Array contains this value
await User.find({ tags: { $all: ['js', 'react'] } }); // Contains ALL these
await User.find({ tags: { $size: 3 } }); // Array has exactly 3 items
// Field existence
await User.find({ 'profile.avatar': { $exists: true } });
// Regex (pattern matching)
await User.find({ name: /^sarah/i }); // Name starts with 'sarah', case-insensitive
await User.find({ email: /@gmail\.com$/i }); // Email ends with @gmail.com
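The regex examples above use fixed patterns. When the pattern comes from user input, such as a search box, special characters should be escaped first so that input like ".*" is matched literally. The sketch below is one way to do that; escapeRegex and startsWithFilter are hypothetical helper names.

```javascript
// Escape characters that have a special meaning inside a regular expression.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: build a case-insensitive "starts with" filter for one field.
function startsWithFilter(field, userInput) {
  return { [field]: new RegExp('^' + escapeRegex(userInput), 'i') };
}

console.log(startsWithFilter('name', 'sar'));
```

The returned object can be passed to a query as is, for example User.find(startsWithFilter('name', searchTerm)), where searchTerm is whatever the user typed.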
Pagination: Don't Return Everything
Never return thousands of documents at once. Paginate:
async function getUsers(page = 1, limit = 20) {
const skip = (page - 1) * limit;
const [users, total] = await Promise.all([
User.find()
.sort({ createdAt: -1 })
.skip(skip)
.limit(limit),
User.countDocuments()
]);
return {
data: users,
pagination: {
page,
limit,
total,
pages: Math.ceil(total / limit),
hasNext: page * limit < total,
hasPrev: page > 1
}
};
}
Use it in a route:
app.get('/api/users', async (req, res) => {
const page = Math.max(parseInt(req.query.page, 10) || 1, 1); // At least 1
const limit = Math.min(Math.max(parseInt(req.query.limit, 10) || 20, 1), 100); // Between 1 and 100
const result = await getUsers(page, limit);
res.json(result);
});
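The arithmetic behind the pagination object is easy to check by hand. The sketch below isolates it in a pure function (paginationMeta is a name chosen for this example) and runs it for page 3 of 45 documents at 20 per page.

```javascript
// Same formulas as in getUsers(), without the database calls.
function paginationMeta(page, limit, total) {
  return {
    skip: (page - 1) * limit,        // documents to skip before this page
    pages: Math.ceil(total / limit), // total number of pages
    hasNext: page * limit < total,
    hasPrev: page > 1
  };
}

console.log(paginationMeta(3, 20, 45));
// { skip: 40, pages: 3, hasNext: false, hasPrev: true }
```

Page 3 skips the first 40 documents and returns the remaining 5, so there is no next page.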
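Indexes: Making Queries Fast
Without an index, MongoDB scans every document in the collection to find matches. With a suitable index, it reads only the matching index entries and the documents they point to, so lookups stay fast as the collection grows.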
// Add indexes in your schema
// No separate index on email: unique: true in the schema already creates one
userSchema.index({ role: 1, createdAt: -1 }); // Compound index (1 = ascending, -1 = descending)
userSchema.index({ name: 'text', 'profile.bio': 'text' }); // Text search
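The difference between scanning and using an index can be illustrated with a toy model in plain JavaScript. This is only an analogy: MongoDB's indexes are B-tree structures, not hash maps, and the names scanByEmail and lookupByEmail are invented for the sketch.

```javascript
// 1000 fake user documents held in memory.
const users = Array.from({ length: 1000 }, (_, i) => ({
  id: i,
  email: `user${i}@example.com`
}));

// Collection scan: check documents one by one until a match is found.
function scanByEmail(docs, email) {
  let examined = 0;
  for (const doc of docs) {
    examined += 1;
    if (doc.email === email) return { doc, examined };
  }
  return { doc: null, examined };
}

// Toy "index": a Map from email to document, built once and reused.
const emailIndex = new Map(users.map((user) => [user.email, user]));

function lookupByEmail(email) {
  const doc = emailIndex.get(email) ?? null;
  return { doc, examined: doc ? 1 : 0 };
}

console.log(scanByEmail(users, 'user999@example.com').examined);
console.log(lookupByEmail('user999@example.com').examined);
```

The Map also has to be kept up to date on every insert, update, and delete, which is the write cost that real indexes carry too.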
When to Index
Index fields you:
- Search by frequently (email, username)
- Sort by often (createdAt)
- Use in compound queries together
Don't index:
- Fields that rarely appear in queries
- Fields with only a few unique values (like boolean)
- Everything (each index costs memory and slows writes)
Index Wisely
Each index speeds up reads but slows down writes. A collection with 10 indexes is usually over-indexed. Start with 2-3 and add more based on actual slow queries.
Checking Query Performance
Use explain() to see if your query uses an index:
const explanation = await User.find({ email: 'test@example.com' })
.explain('executionStats');
console.log(explanation.executionStats.totalDocsExamined);
// If this is close to the collection size while only a few documents are returned, the query is scanning and probably needs an index
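The executionStats section also reports nReturned and totalKeysExamined, and reading the three numbers together is more telling than any one alone. The helper below is a small sketch that works on an explain result shaped like the one above. The name summarizeExplain and the "likely collection scan" rule of thumb are assumptions for illustration, not an official diagnostic.

```javascript
// Hypothetical helper: summarize the executionStats part of an explain() result.
function summarizeExplain(explanation) {
  const { nReturned, totalKeysExamined, totalDocsExamined } = explanation.executionStats;
  return {
    nReturned,
    totalKeysExamined,
    totalDocsExamined,
    // Documents were examined but no index keys were read: probably a full scan.
    likelyCollectionScan: totalKeysExamined === 0 && totalDocsExamined > 0
  };
}

// Mock values for illustration only, not real measurements.
const mockExplanation = {
  executionStats: { nReturned: 1, totalKeysExamined: 0, totalDocsExamined: 5000 }
};
console.log(summarizeExplain(mockExplanation));
```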
Relationships: References vs. Embedding
You have two choices for related data:
Embedding (Store Together)
const postSchema = new mongoose.Schema({
title: String,
content: String,
// Comments are embedded inside the post
comments: [{
author: String,
text: String,
createdAt: { type: Date, default: Date.now }
}]
});
Pros: One query gets everything. Fast.
Cons: Document size limit (16MB). Updating embedded docs is awkward.
Referencing (Store Separately)
const postSchema = new mongoose.Schema({
title: String,
content: String,
// Reference to the User who wrote this
author: {
type: mongoose.Schema.Types.ObjectId,
ref: 'User'
}
});
Pros: Flexible. Related data is not bound by a single document's 16MB limit. Easy to update referenced docs.
Cons: Requires extra queries (populate).
Using Populate
// Without populate
const rawPost = await Post.findById(id);
// rawPost.author = ObjectId("507f1f77bcf86cd799439011") - just an ID
// With populate
const post = await Post.findById(id).populate('author');
// post.author = { _id: ..., name: 'Sarah', email: '...' } - full user object
// Select specific fields from populated document (avatar lives under profile in userSchema)
const postPreview = await Post.findById(id).populate('author', 'name profile.avatar');
// Multiple populates
const postWithCategory = await Post.findById(id)
.populate('author', 'name')
.populate('category', 'name slug');
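Conceptually, populate looks up each stored ID in the referenced collection and swaps in the matching document. The in-memory sketch below mimics that for a single author field. Plain strings stand in for ObjectIds, and populateAuthor is a hypothetical helper, not how Mongoose implements it.

```javascript
// Stand-in for the users collection, keyed by id.
const usersById = new Map([
  ['u1', { _id: 'u1', name: 'Sarah', email: 'sarah@example.com' }]
]);

const storedPost = { _id: 'p1', title: 'Hello', author: 'u1' };

// Replace the stored id with the referenced document, keeping only chosen fields.
function populateAuthor(postDoc, users, fields) {
  const user = users.get(postDoc.author);
  if (!user) return { ...postDoc, author: null }; // no matching user found
  const picked = { _id: user._id };
  for (const field of fields) {
    if (field in user) picked[field] = user[field];
  }
  return { ...postDoc, author: picked };
}

console.log(populateAuthor(storedPost, usersById, ['name']));
```

The lookup is a second read against another collection, which is the extra query cost listed under the cons of referencing.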
Which to Choose?
| Embed When | Reference When |
|---|---|
| Data is always accessed together | Data is accessed independently |
| Child data is small | Child data is large or unbounded |
| Data doesn't change often | Data changes frequently |
| One-to-few relationship | One-to-many or many-to-many |
Example: Embed a user's shipping addresses (few, always needed with user). Reference a user's orders (many, often queried separately).
Aggregation: Complex Queries
For analytics, grouping, and data transformation, use the aggregation pipeline:
// Example: Get post counts by author, sorted by most prolific
const authorStats = await Post.aggregate([
// Stage 1: Only published posts
{ $match: { status: 'published' } },
// Stage 2: Group by author, count posts
{ $group: {
_id: '$author',
postCount: { $sum: 1 },
totalViews: { $sum: '$views' },
avgViews: { $avg: '$views' }
}},
// Stage 3: Sort by post count
{ $sort: { postCount: -1 } },
// Stage 4: Limit to top 10
{ $limit: 10 },
// Stage 5: Clean up the output
{ $project: {
authorId: '$_id',
postCount: 1,
totalViews: 1,
avgViews: { $round: ['$avgViews', 0] },
_id: 0
}}
]);
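If the pipeline feels abstract, it can help to read each stage as an array operation. The sketch below reproduces the same five stages on a small in-memory array. The sample data and the function name authorStatsInMemory are invented, and Math.round stands in for $round, which may treat exact .5 values differently.

```javascript
// Invented sample data for illustration.
const samplePosts = [
  { author: 'a1', status: 'published', views: 10 },
  { author: 'a1', status: 'published', views: 16 },
  { author: 'a2', status: 'published', views: 7 },
  { author: 'a2', status: 'draft', views: 100 }
];

function authorStatsInMemory(docs, top = 10) {
  const groups = new Map();
  // $match: keep published posts only
  for (const doc of docs.filter((d) => d.status === 'published')) {
    // $group: one accumulator per author
    const group = groups.get(doc.author) ?? { postCount: 0, totalViews: 0 };
    group.postCount += 1;
    group.totalViews += doc.views;
    groups.set(doc.author, group);
  }
  return [...groups.entries()]
    .sort(([, a], [, b]) => b.postCount - a.postCount) // $sort
    .slice(0, top)                                     // $limit
    .map(([authorId, group]) => ({                     // $project
      authorId,
      postCount: group.postCount,
      totalViews: group.totalViews,
      avgViews: Math.round(group.totalViews / group.postCount)
    }));
}

console.log(authorStatsInMemory(samplePosts));
```

The draft post never reaches the grouping step, which is why filtering early with $match keeps the later stages cheap.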
Common Aggregation Stages
| Stage | Purpose |
|---|---|
| $match | Filter documents (like find()) |
| $group | Group and aggregate (sum, avg, count) |
| $sort | Sort results |
| $limit | Limit number of results |
| $skip | Skip results (for pagination) |
| $project | Shape output fields |
| $lookup | Join with another collection |
| $unwind | Flatten arrays into separate documents |
Lookup (Join Collections)
const postsWithAuthors = await Post.aggregate([
{ $lookup: {
from: 'users', // Collection to join
localField: 'author', // Field in Post
foreignField: '_id', // Field in User
as: 'authorDetails' // Output array name
}},
{ $unwind: '$authorDetails' } // Convert array to object
]);
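One behavior worth knowing: $lookup always produces an array, and a plain $unwind drops any document whose array is empty. The in-memory sketch below shows both steps on invented data. The function names are hypothetical and string IDs stand in for ObjectIds.

```javascript
// Invented sample data: the second post points at a user that does not exist.
const sampleUsers = [{ _id: 'u1', name: 'Sarah' }];
const samplePosts = [
  { _id: 'p1', title: 'First', author: 'u1' },
  { _id: 'p2', title: 'Orphan', author: 'u404' }
];

// $lookup: attach every matching user as an array under authorDetails.
function lookupAuthors(postDocs, userDocs) {
  return postDocs.map((post) => ({
    ...post,
    authorDetails: userDocs.filter((user) => user._id === post.author)
  }));
}

// $unwind: emit one document per array element; empty arrays emit nothing.
function unwindAuthorDetails(docs) {
  return docs.flatMap((doc) =>
    doc.authorDetails.map((author) => ({ ...doc, authorDetails: author }))
  );
}

const joined = lookupAuthors(samplePosts, sampleUsers);
console.log(joined.length);
console.log(unwindAuthorDetails(joined).length);
```

In a real pipeline, writing the stage as { $unwind: { path: '$authorDetails', preserveNullAndEmptyArrays: true } } keeps posts whose author could not be found.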
Practical Patterns
Soft Deletes
Instead of actually deleting, mark as deleted:
const userSchema = new mongoose.Schema({
// ... other fields
deletedAt: Date
});
// "Delete" a user
await User.findByIdAndUpdate(id, { deletedAt: new Date() });
// Find only non-deleted users
const activeUsers = await User.find({ deletedAt: null });
// Add a helper method (define statics on the schema before calling mongoose.model())
userSchema.statics.findActive = function() {
return this.find({ deletedAt: null });
};
Timestamps and Audit Fields
const schema = new mongoose.Schema({
// ... your fields
}, {
timestamps: true // Adds createdAt and updatedAt automatically
});
// For more control
const auditSchema = new mongoose.Schema({
createdAt: { type: Date, default: Date.now, immutable: true },
updatedAt: Date,
createdBy: { type: mongoose.Schema.Types.ObjectId, ref: 'User' },
updatedBy: { type: mongoose.Schema.Types.ObjectId, ref: 'User' }
});
// Update the timestamp before saving
// (runs on save() and create(), not on updateOne() or findByIdAndUpdate())
auditSchema.pre('save', function(next) {
this.updatedAt = new Date();
next();
});
Lean Queries for Speed
By default, Mongoose returns full documents with methods. For read-only data, use lean():
// Returns Mongoose documents (with methods like .save())
const users = await User.find();
// Returns plain JavaScript objects (faster, less memory)
const plainUsers = await User.find().lean();
Use lean() when:
- You're only reading data (not modifying)
- You're sending data directly to an API response
- Performance matters
Key Takeaways
MongoDB is powerful when you understand its strengths:
- Design around queries - Structure data based on how you'll access it, not how it's "normalized"
- Embed related data - When data is always accessed together and doesn't grow unboundedly
- Use references - When data is accessed independently or grows over time
- Index strategically - Index what you query, but don't over-index
- Use aggregation - For complex queries, let the database do the heavy lifting
- Lean for reads - Use .lean() when you don't need Mongoose document methods
The Schema Design Rule
Ask yourself: "How will I query this data?" Design your documents to answer that question in one query whenever possible.