Characteristics of MongoDB and Its Use Cases
MongoDB is a NoSQL, document-oriented database that stores data in a flexible, JSON-like format called BSON. It is designed for high performance, scalability, and ease of development.
🧩 1. Document-Oriented Storage
Stores data in BSON (Binary JSON) format.
Each document is a flexible, schema-less structure — similar to JSON.
Example document:

```js
{
  "_id": ObjectId("..."),
  "name": "Alice",
  "age": 30,
  "skills": ["Python", "MongoDB"]
}
```
⚡ 2. High Performance
Optimized for fast reads and writes.
Efficient for real-time analytics and large-scale applications.
Indexes on fields improve query speed.
🔄 3. Schema Flexibility (Schema-less)
No rigid schema — each document in a collection can have different fields.
Great for evolving data models or polymorphic data.
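As a sketch of what schema flexibility means in practice, the two documents below could live in the same collection even though their fields differ (plain Python dicts stand in for BSON documents here; the collection name is made up):

```python
# Two documents in the same hypothetical "users" collection.
# MongoDB does not require them to share a schema.
user_v1 = {"_id": 1, "name": "Alice", "age": 30}
user_v2 = {"_id": 2, "name": "Bob", "skills": ["Go", "MongoDB"],
           "address": {"city": "Pune"}}

collection = [user_v1, user_v2]

# The set of fields differs per document -- adding "skills" later
# needs no schema migration.
fields_per_doc = [set(doc) for doc in collection]
```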
🌐 4. Horizontal Scalability
Supports sharding: data is distributed across multiple machines.
Easily scales out to handle large volumes of data.
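The idea behind sharding can be sketched as routing each document to a machine based on its shard key. The toy router below hashes the key and takes it modulo the shard count; real MongoDB maps ranges of hashed key values via a config server, so this is only an illustration:

```python
import hashlib

NUM_SHARDS = 3

def route_to_shard(shard_key_value: str) -> int:
    """Toy router: hash the shard key, map it to one of NUM_SHARDS."""
    digest = hashlib.md5(shard_key_value.encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

# Documents with different shard-key values spread across the shards.
shards = {i: [] for i in range(NUM_SHARDS)}
for user_id in ["u1", "u2", "u3", "u4", "u5", "u6"]:
    shards[route_to_shard(user_id)].append(user_id)
```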
📊 5. Rich Query Language
Powerful query capabilities: filter, sort, project, aggregate, etc.
Supports nested documents and arrays in queries.
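To make "supports arrays in queries" concrete: a filter like `{"skills": "MongoDB"}` matches a document whether the field is a scalar equal to the value or an array containing it. The sketch below mimics that one rule in plain Python; it is not the server's matcher:

```python
def matches(doc: dict, query: dict) -> bool:
    """Tiny subset of MongoDB matching: equality, plus array containment."""
    for field, expected in query.items():
        actual = doc.get(field)
        if isinstance(actual, list):
            if expected not in actual:   # {"skills": "MongoDB"} matches arrays
                return False
        elif actual != expected:
            return False
    return True

docs = [
    {"name": "Alice", "skills": ["Python", "MongoDB"]},
    {"name": "Bob", "skills": ["Java"]},
]
hits = [d["name"] for d in docs if matches(d, {"skills": "MongoDB"})]
```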
🧠 6. Aggregation Framework
Powerful for data transformation and analytics.
Similar to SQL's `GROUP BY`, `HAVING`, and `JOIN` (via `$lookup`).
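A `$group` stage such as `{$group: {_id: "$dept", total: {$sum: "$salary"}}}` behaves like SQL's `GROUP BY` with `SUM`. The sketch below reproduces just that one stage in plain Python (the collection and field names are made up):

```python
from collections import defaultdict

employees = [
    {"dept": "eng", "salary": 100},
    {"dept": "eng", "salary": 120},
    {"dept": "hr",  "salary": 80},
]

# Rough equivalent of:
#   db.employees.aggregate([{$group: {_id: "$dept", total: {$sum: "$salary"}}}])
totals = defaultdict(int)
for e in employees:
    totals[e["dept"]] += e["salary"]
```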
🧵 7. Built-in Replication & High Availability
Uses replica sets for fault tolerance.
Automatically fails over to a secondary node if the primary fails.
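Failover can be sketched as: when the primary stops responding, the remaining healthy members elect the most up-to-date secondary. Real replica sets run a Raft-like election protocol; this toy version just picks the highest oplog timestamp:

```python
members = [
    {"host": "node1", "state": "PRIMARY",   "optime": 105, "healthy": False},  # just failed
    {"host": "node2", "state": "SECONDARY", "optime": 104, "healthy": True},
    {"host": "node3", "state": "SECONDARY", "optime": 101, "healthy": True},
]

def elect_new_primary(members):
    """Toy election: the most up-to-date healthy secondary wins."""
    candidates = [m for m in members if m["healthy"] and m["state"] == "SECONDARY"]
    return max(candidates, key=lambda m: m["optime"])["host"]
```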
🛠️ 8. Indexing
Supports single field, compound, multikey, geospatial, text, and hashed indexes.
Speeds up query performance.
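Why an index helps can be sketched with a dict keyed on the indexed field: a point query becomes a direct lookup instead of a scan over every document. MongoDB's indexes are B-trees rather than hash maps, but the effect on equality queries is similar:

```python
docs = [{"_id": i, "email": f"user{i}@example.com"} for i in range(10_000)]

# "Collection scan": check every document until one matches.
def find_by_scan(email):
    return next((d for d in docs if d["email"] == email), None)

# "Index": one-time build, then direct lookup per query.
email_index = {d["email"]: d for d in docs}

def find_by_index(email):
    return email_index.get(email)
```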
🔐 9. ACID Transactions
Single-document operations are atomic by default.
Supports multi-document transactions (since MongoDB 4.0 on replica sets, extended to sharded clusters in 4.2).
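The all-or-nothing guarantee of a multi-document transaction can be sketched like this: apply every write to a working copy and make them visible together only if all succeed. (Real transactions go through a driver session's transaction API against a live server; this is just the commit/abort idea, with an invented overdraw rule as the failure case.)

```python
import copy

def run_transaction(store: dict, writes):
    """All-or-nothing: apply writes to a copy, commit only if every one succeeds."""
    working = copy.deepcopy(store)
    try:
        for key, value in writes:
            if value < 0:
                raise ValueError("insufficient funds")  # any failure aborts everything
            working[key] = value
    except ValueError:
        return store           # abort: original state untouched
    store.clear()
    store.update(working)      # commit: all writes become visible together
    return store

accounts = {"alice": 100, "bob": 50}
run_transaction(accounts, [("alice", 70), ("bob", 80)])    # transfer 30: commits
run_transaction(accounts, [("alice", -10), ("bob", 160)])  # would overdraw: aborts
```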
☁️ 10. Cloud & Tooling Support
MongoDB Atlas for managed cloud deployment.
Has rich ecosystem: Compass (GUI), connectors for BI tools, drivers for all major languages.
🐯 What is WiredTiger?
WiredTiger is the default storage engine used in MongoDB since version 3.2. It is optimized for performance, concurrency, and compression.
One of the key components of WiredTiger is its in-memory cache, called the WiredTiger Cache.
🧠 What is the WiredTiger Cache?
The WiredTiger Cache is an in-memory area used to:
Hold frequently accessed documents and index entries
Buffer write operations before they're flushed to disk
Improve performance by reducing the need to access the disk frequently
Think of it as MongoDB’s internal “working memory.”
📏 Default Cache Size
By default, MongoDB assigns the larger of 50% of (RAM - 1 GB) or 256 MB to the WiredTiger Cache.
Example:
If your system has 16 GB RAM:
OS reserve = 1 GB
Cache size ≈ 50% of (16 - 1) = 7.5 GB
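The sizing rule above can be written out directly (per the MongoDB docs, the default is the larger of 50% of (RAM - 1 GB) and 256 MB):

```python
def default_wiredtiger_cache_gb(ram_gb: float) -> float:
    """Default WiredTiger cache: max(50% of (RAM - 1 GB), 256 MB)."""
    return max(0.5 * (ram_gb - 1), 0.25)   # 256 MB expressed as 0.25 GB

cache_16gb = default_wiredtiger_cache_gb(16)  # 7.5 GB, matching the example above
cache_tiny = default_wiredtiger_cache_gb(1)   # the 256 MB floor kicks in
```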
You can override this using the setting:

```yaml
storage:
  wiredTiger:
    engineConfig:
      cacheSizeGB: <value>
```
🔄 How it Works
Reads:
When you read a document, MongoDB tries to fetch it from the WiredTiger Cache.
If it's not in cache, it reads from disk and places it in the cache for future use.
Writes:
Writes are first applied in the cache and logged to the journal for durability.
They are later flushed to disk during checkpoints or evictions.
Eviction Policy:
When the cache is full, the Least Recently Used (LRU) pages are evicted.
Modified (dirty) pages are flushed to disk before eviction.
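The eviction behavior can be sketched with an OrderedDict-based LRU cache in which dirty pages are "flushed" (here, just recorded) before being dropped. This simplifies away WiredTiger's eviction threads and thresholds, but the ordering rule is the same:

```python
from collections import OrderedDict

class ToyCache:
    """LRU sketch: accesses refresh recency; when full, the least recently
    used page is evicted, flushed first if it was modified (dirty)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()   # page_id -> (data, dirty)
        self.flushed = []            # stand-in for "written to disk"

    def access(self, page_id, data=None):
        if page_id in self.pages:
            self.pages.move_to_end(page_id)          # mark most recently used
            if data is not None:
                self.pages[page_id] = (data, True)   # write: page becomes dirty
            return self.pages[page_id][0]
        if len(self.pages) >= self.capacity:
            old_id, (_, dirty) = self.pages.popitem(last=False)  # evict LRU page
            if dirty:
                self.flushed.append(old_id)          # flush dirty page first
        self.pages[page_id] = (data, data is not None)
        return data

cache = ToyCache(capacity=2)
cache.access("p1", "v1")   # write p1 (dirty)
cache.access("p2", "v2")   # write p2 (dirty)
cache.access("p1")         # read p1 -> p2 becomes least recently used
cache.access("p3", "v3")   # cache full: evicts dirty p2, flushing it first
```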
You can inspect cache statistics at runtime with:

```js
db.serverStatus().wiredTiger.cache
```
Key fields:
bytes currently in the cache
maximum bytes configured
tracked dirty bytes in the cache
pages read into cache
pages written from cache
In MongoDB, the optimal document size depends on your workload pattern (read-heavy, write-heavy, etc.), hardware constraints, and access patterns. MongoDB does, however, set some hard limits and best practices around size.
📏 1. Maximum Document Size
Hard limit: 16 MB per document. MongoDB enforces this; you cannot exceed it.
✅ 2. Best Practice — Optimal Document Size
For best performance, especially on frequently accessed documents:
🔸 Aim for roughly 1 KB – 100 KB per document
This keeps reads/writes fast and avoids cache misses or large I/O operations.
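As a rough way to check where a document falls in that range, you can measure its serialized size. The sketch below uses JSON byte length as a proxy; exact BSON size differs, and in mongosh you would use `Object.bsonsize()` instead:

```python
import json

doc = {
    "_id": 1,
    "name": "Alice",
    "skills": ["Python", "MongoDB"],
    "bio": "x" * 500,   # made-up payload to give the document some bulk
}

# JSON byte length is only a proxy for BSON size, but close enough to tell
# whether a document sits near the ~1 KB or the ~100 KB end of the range.
approx_size_bytes = len(json.dumps(doc).encode("utf-8"))
within_budget = approx_size_bytes <= 100 * 1024
```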
🎯 3. Why Keep Documents Small?
| Reason | Explanation |
| ----------------------------- | ----------------------------------------------------------------------------------- |
| **Better cache fit** | Smaller documents allow more to fit in WiredTiger cache |
| **Less memory/disk pressure** | Reduces memory consumption and disk I/O |
| **Faster reads/writes** | Smaller payloads = faster network and disk operations |
| **Less update overhead** | WiredTiger rewrites a document on update, so large documents make each update costlier |
🧠 4. When Larger Documents Are Okay
In document modeling, embedding is encouraged (instead of joins), which might increase size.
Larger documents (100 KB – 1 MB) are fine if:
You need all embedded data together often
Access is mostly read-heavy and infrequent
Cache and memory are well-provisioned
🔄 5. Use GridFS for Large Files
If you need to store files larger than 16 MB (e.g., videos, images), use GridFS, MongoDB’s built-in mechanism to store and retrieve large files in chunks.
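GridFS works by splitting a file into fixed-size chunks (255 KB by default) stored as separate documents, plus one metadata document describing the file. The split itself is simple to sketch:

```python
CHUNK_SIZE = 255 * 1024  # GridFS default chunk size: 255 KB

def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Split file bytes the way GridFS does: fixed-size chunks, numbered in order."""
    return [
        {"n": i, "data": data[offset:offset + chunk_size]}
        for i, offset in enumerate(range(0, len(data), chunk_size))
    ]

file_bytes = b"x" * (600 * 1024)        # a 600 KB "file"
chunks = split_into_chunks(file_bytes)  # two full 255 KB chunks plus a 90 KB tail
```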
🧪 6. Measuring and Monitoring
Use the `Object.bsonsize(doc)` method in mongosh to get the exact BSON size of a document:

```js
Object.bsonsize(db.collection.findOne())
```