Operations Checklist
The following checklist, along with the Development Checklist, provides recommendations to help you avoid issues in your production MongoDB deployment.
Filesystem
- Align your disk partitions with your RAID configuration.
- Avoid using NFS drives for your dbPath. Using NFS drives can result in degraded and unstable performance. See: Remote Filesystems for more information.
- VMware users should use VMware virtual drives over NFS.
- Linux/Unix: format your drives into XFS or EXT4. If possible, use XFS as it generally performs better with MongoDB.
- With the WiredTiger storage engine, use of XFS is strongly recommended to avoid performance issues found when using EXT4 with WiredTiger.
- If using RAID, you may need to configure XFS with your RAID geometry.
- Windows: use the NTFS file system. Do not use any FAT file system (i.e. FAT 16/32/exFAT).
Replication
Verify that all non-hidden replica set members are identically provisioned in terms of their RAM, CPU, disk, network setup, etc.
Configure the oplog size to suit your use case:
- The replication oplog window should cover normal maintenance and downtime windows to avoid the need for a full resync.
- The replication oplog window should cover the time needed to restore a replica set member from the last backup.
Changed in version 3.4: The replication oplog window no longer needs to cover the time needed to restore a replica set member via initial sync, as the oplog records are pulled during the data copy. However, the member being restored must have enough disk space in the local database to temporarily store these oplog records for the duration of this data copy stage.
With earlier versions of MongoDB, the replication oplog window should cover the time needed to restore a replica set member by initial sync.
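The oplog sizing rules above reduce to a comparison of time spans. The following is a minimal sketch of that comparison, not an official tool. It assumes you already know the timestamps of the oldest and newest oplog entries (for example, from the output of rs.printReplicationInfo()). The function names and the sample durations are hypothetical.

```python
from datetime import datetime, timedelta


def oplog_window(first_entry: datetime, last_entry: datetime) -> timedelta:
    """Return the time span currently covered by the oplog."""
    return last_entry - first_entry


def window_covers(window: timedelta, *required: timedelta) -> bool:
    """True when the window is at least as long as every required span."""
    return all(window >= span for span in required)


# Hypothetical figures: oldest and newest oplog entries, plus the operator's
# own estimates for maintenance downtime and restore-from-backup time.
window = oplog_window(datetime(2024, 1, 1, 0, 0), datetime(2024, 1, 3, 0, 0))
maintenance = timedelta(hours=6)
restore_from_backup = timedelta(hours=30)

print(window_covers(window, maintenance, restore_from_backup))
```

If the check fails for your own estimates, the oplog is too small for the downtime you expect to absorb without a full resync.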
Ensure that your replica set includes at least three data-bearing nodes that run with journaling and that you issue writes with w:"majority" write concern for availability and durability.
Use hostnames when configuring replica set members, rather than IP addresses.
Ensure full bidirectional network connectivity between all mongod instances.
Ensure that each host can resolve itself.
Ensure that your replica set contains an odd number of voting members.
Ensure that mongod instances have 0 or 1 votes.
For high availability, deploy your replica set into a minimum of three data centers.
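Several of the replication items above can be checked mechanically against a replica set configuration document. The sketch below assumes a dictionary shaped like the output of rs.conf(), with a members list whose entries carry host, votes, and arbiterOnly fields, and with hosts written as host:port. The function names and sample hostnames are hypothetical.

```python
import ipaddress


def is_ip_address(host: str) -> bool:
    """True when the host part of 'host:port' is a literal IP address."""
    name = host.rsplit(":", 1)[0].strip("[]")
    try:
        ipaddress.ip_address(name)
        return True
    except ValueError:
        return False


def check_replica_set(config: dict) -> list[str]:
    """Return a list of checklist violations found in the configuration."""
    problems = []
    members = config["members"]
    # A member votes unless its votes field is explicitly set to 0.
    voting = [m for m in members if m.get("votes", 1) > 0]
    data_bearing = [m for m in members if not m.get("arbiterOnly", False)]

    if len(voting) % 2 == 0:
        problems.append("even number of voting members")
    if len(data_bearing) < 3:
        problems.append("fewer than three data-bearing members")
    for m in members:
        if m.get("votes", 1) not in (0, 1):
            problems.append(f"{m['host']}: votes must be 0 or 1")
        if is_ip_address(m["host"]):
            problems.append(f"{m['host']}: IP address used instead of a hostname")
    return problems


# Hypothetical three-member configuration that passes every check.
sample = {"members": [{"host": f"db{i}.example.net:27017"} for i in (1, 2, 3)]}
print(check_replica_set(sample))
```

The sketch does not cover journaling, write concern, or data center placement, which cannot be read from the member list alone.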
Sharding
- Place your config servers on dedicated hardware for optimal performance in large clusters. Ensure that the hardware has enough RAM to hold the data files entirely in memory and that it has dedicated storage.
- Deploy mongos routers in accordance with the Production Configuration guidelines.
- Use NTP to synchronize the clocks on all components of your sharded cluster.
- Ensure full bidirectional network connectivity between mongod, mongos, and config servers.
- Use CNAMEs to identify your config servers to the cluster so that you can rename and renumber your config servers without downtime.
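One rough way to spot-check the connectivity item above is to attempt a TCP connection from each host to every other component. The sketch below only tests the outbound direction from the machine it runs on, so it would need to be run on every host to approximate a bidirectional check. The function names are hypothetical, and a successful TCP connection does not prove that the MongoDB process behind the port is healthy.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """True when a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def unreachable(targets, timeout: float = 3.0):
    """Return the (host, port) pairs that could not be reached from this machine."""
    return [(host, port) for host, port in targets
            if not can_connect(host, port, timeout)]
```

For example, calling unreachable([("cfg1.example.net", 27019), ("shard1.example.net", 27018)]) on a mongos host would list the components that host cannot reach. The hostnames and ports here are placeholders for your own deployment.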
Journaling: WiredTiger Storage Engine
- Ensure that all instances use journaling.
- Place the journal on its own low-latency disk for write-intensive workloads. Note that this will affect snapshot-style backups as the files constituting the state of the database will reside on separate volumes.
Hardware
- Use RAID10 and SSD drives for optimal performance.
- SAN and Virtualization:
Deployments to Cloud Hardware
- Windows Azure: Adjust the TCP keepalive (tcp_keepalive_time) to 100-120. The TCP idle timeout on the Azure load balancer is too slow for MongoDB’s connection pooling behavior. See: Azure Production Notes for more information.
- Use MongoDB version 2.6.4 or later on systems with high-latency storage, such as Windows Azure, as these versions include performance improvements for those systems.
Operating System Configuration
Linux
Turn off transparent hugepages. See Transparent Huge Pages Settings for more information.
Adjust the readahead settings on the devices storing your database files.
- For the WiredTiger storage engine, set readahead between 8 and 32 regardless of storage media type (spinning disk, SSD, etc.), unless testing shows a measurable, repeatable, and reliable benefit in a higher readahead value.
MongoDB commercial support can provide advice and guidance on alternate readahead configurations.
- Disable the tuned tool if you are running RHEL 7 / CentOS 7 in a virtual environment.
When RHEL 7 / CentOS 7 run in a virtual environment, the tuned tool automatically invokes a performance profile derived from performance throughput, which automatically sets the readahead settings to 4MB. This can negatively impact performance.
Use the noop or deadline disk schedulers for SSD drives.
Use the noop disk scheduler for virtualized drives in guest VMs.
Disable NUMA or set vm.zone_reclaim_mode to 0 and run mongod instances with node interleaving. See: MongoDB and NUMA Hardware for more information.
Adjust the ulimit values on your hardware to suit your use case. If multiple mongod or mongos instances are running under the same user, scale the ulimit values accordingly. See: UNIX ulimit Settings for more information.
Use noatime for the dbPath mount point.
Configure sufficient file handles (fs.file-max), kernel pid limit (kernel.pid_max), maximum threads per process (kernel.threads-max), and maximum number of memory map areas per process (vm.max_map_count) for your deployment. For large systems, the following values provide a good starting point:
- fs.file-max value of 98000,
- kernel.pid_max value of 64000,
- kernel.threads-max value of 64000, and
- vm.max_map_count value of 128000
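The kernel limits listed above can be compared against a running Linux system by reading the corresponding files under /proc/sys. The sketch below is one way to do that. It treats the starting-point values as minimums, which is an assumption rather than something the checklist states, and the function names are hypothetical.

```python
from pathlib import Path

# Starting-point values from the checklist, treated here as minimums (assumption).
RECOMMENDED_MINIMUMS = {
    "fs.file-max": 98000,
    "kernel.pid_max": 64000,
    "kernel.threads-max": 64000,
    "vm.max_map_count": 128000,
}


def read_sysctl(name: str) -> int:
    """Read one kernel parameter from /proc/sys (Linux only)."""
    path = Path("/proc/sys") / name.replace(".", "/")
    return int(path.read_text().split()[0])


def below_recommended(current: dict) -> dict:
    """Map each parameter that is too low to its (current, recommended) pair."""
    return {
        name: (current[name], minimum)
        for name, minimum in RECOMMENDED_MINIMUMS.items()
        if current[name] < minimum
    }


# Hypothetical readings; on a real host, build this dict with read_sysctl().
example = {
    "fs.file-max": 65536,
    "kernel.pid_max": 64000,
    "kernel.threads-max": 127000,
    "vm.max_map_count": 65530,
}
print(below_recommended(example))
```

On a live system, the input could be built with {name: read_sysctl(name) for name in RECOMMENDED_MINIMUMS}.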
Ensure that your system has swap space configured. Refer to your operating system’s documentation for details on appropriate sizing.
Ensure that the system default TCP keepalive is set correctly. A value of 300 often provides better performance for replica sets and sharded clusters. See: Does TCP keepalive time affect MongoDB Deployments? in the Frequently Asked Questions for more information.
Windows
- Consider disabling NTFS “last access time” updates. This is analogous to disabling atime on Unix-like systems.
- Format NTFS disks using the default Allocation unit size of 4096 bytes.
Backups
- Schedule periodic tests of your backup and restore process to have time estimates on hand, and to verify its functionality.
Monitoring
Use MongoDB Cloud Manager or Ops Manager, an on-premise solution available in MongoDB Enterprise Advanced, or another monitoring system to monitor key database metrics and set up alerts for them. Include alerts for the following metrics:
- replication lag
- replication oplog window
- assertions
- queues
- page faults
- Monitor hardware statistics for your servers. In particular, pay attention to the disk use, CPU, and available disk space.
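As an illustration of the first metric in the list above, replication lag can be derived from a replica set status document. The sketch below assumes a dictionary shaped like the output of rs.status(), using the name, stateStr, and optimeDate fields of each member. The function names, sample hostnames, and the ten-second threshold are hypothetical.

```python
from datetime import datetime, timedelta


def replication_lag(status: dict) -> dict:
    """Return each secondary's lag behind the primary, keyed by member name."""
    members = status["members"]
    primary = next(m for m in members if m["stateStr"] == "PRIMARY")
    return {
        m["name"]: primary["optimeDate"] - m["optimeDate"]
        for m in members
        if m["stateStr"] == "SECONDARY"
    }


def lagging(status: dict, threshold: timedelta) -> list:
    """Return the names of secondaries whose lag exceeds the threshold."""
    lags = replication_lag(status)
    return sorted(name for name, lag in lags.items() if lag > threshold)


# Hypothetical status document with one healthy and one lagging secondary.
sample_status = {
    "members": [
        {"name": "db1.example.net:27017", "stateStr": "PRIMARY",
         "optimeDate": datetime(2024, 1, 1, 12, 0, 30)},
        {"name": "db2.example.net:27017", "stateStr": "SECONDARY",
         "optimeDate": datetime(2024, 1, 1, 12, 0, 29)},
        {"name": "db3.example.net:27017", "stateStr": "SECONDARY",
         "optimeDate": datetime(2024, 1, 1, 11, 59, 0)},
    ]
}
print(lagging(sample_status, timedelta(seconds=10)))
```

A monitoring system would normally compute and alert on this for you. The sketch only shows what the metric measures.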
In the absence of disk space monitoring, or as a precaution:
- Create a dummy 4 GB file on the storage.dbPath drive to ensure available space if the disk becomes full.
- A combination of cron+df can alert when disk space hits a high-water mark, if no other monitoring tool is available.
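The cron+df approach above can also be approximated with a small script run on a schedule. The sketch below uses shutil.disk_usage from the Python standard library. The 85 percent limit and the function names are assumptions for illustration, and the percentage it computes may differ slightly from the Use% column that df reports.

```python
import shutil


def percent_used(total: int, used: int) -> float:
    """Return used space as a percentage of total space."""
    return 100.0 * used / total


def over_high_water_mark(path: str, limit_percent: float = 85.0) -> bool:
    """True when the filesystem holding path is at or above the limit."""
    usage = shutil.disk_usage(path)
    return percent_used(usage.total, usage.used) >= limit_percent
```

A cron job could call over_high_water_mark with the dbPath directory, for example "/data/db" (a placeholder), and send a notification when it returns True.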
Load Balancing
- Configure load balancers to enable “sticky sessions” or “client affinity”, with a sufficient timeout for existing connections.
- Avoid placing load balancers between MongoDB cluster or replica set components.