Rollbacks During Replica Set Failover
A rollback reverts write operations on a former primary when the member rejoins its replica set after a failover. A rollback is necessary only if the primary had accepted write operations that the secondaries had not successfully replicated before the primary stepped down. When the primary rejoins the set as a secondary, it reverts, or “rolls back,” its write operations to maintain database consistency with the other members.
MongoDB attempts to avoid rollbacks, which should be rare. When a rollback does occur, it is often the result of a network partition. Secondaries that cannot keep up with the throughput of operations on the former primary increase the size and impact of the rollback.
A rollback does not occur if the write operations replicate to another member of the replica set before the primary steps down and if that member remains available and accessible to a majority of the replica set.
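The idea of reverting only the unreplicated writes can be illustrated with a toy model. The sketch below is not MongoDB's implementation: it assumes each oplog entry can be reduced to a hypothetical (term, index) pair and that the common point is the last entry both members share.

```python
from typing import List, Tuple

# Hypothetical, simplified oplog entry: (election term, position).
# Real oplog entries carry far more information.
OplogEntry = Tuple[int, int]


def split_at_common_point(
    former_primary: List[OplogEntry],
    sync_source: List[OplogEntry],
) -> Tuple[List[OplogEntry], List[OplogEntry]]:
    """Return (entries kept, entries to roll back) for the former primary."""
    common = 0
    for mine, theirs in zip(former_primary, sync_source):
        if mine != theirs:
            break
        common += 1
    return former_primary[:common], former_primary[common:]


# The former primary accepted two writes in term 1 that never replicated.
former_primary = [(1, 1), (1, 2), (1, 3), (1, 4)]
# The new primary shares the first two entries, then wrote in term 2.
new_primary = [(1, 1), (1, 2), (2, 3)]

kept, rolled_back = split_at_common_point(former_primary, new_primary)
print("kept:", kept)
print("rolled back:", rolled_back)
```

If every write on the former primary had already replicated, the second list would be empty and no rollback would be needed.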
Collect Rollback Data
Configure Rollback Data
Starting in version 4.0, MongoDB adds the parameter createRollbackDataFiles to control whether or not rollback files are created during rollbacks.
Rollback Data
By default, when a rollback occurs, MongoDB writes the rollback data to BSON files. For each collection whose data is rolled back, the rollback files are located in a <dbpath>/rollback/<db>.<collection> directory and have filenames of the form: [1]
- removed.<timestamp>.bson
For example, if data for the collection comments in the reporting database rolled back:
- <dbpath>/rollback/reporting.comments/removed.2019-01-31T02-57-40.0.bson
where <dbpath> is the mongod’s dbPath.
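The naming scheme above can be expressed as a small path helper. This is a sketch only: the function names are made up for illustration, /data/db is an assumed dbPath, and the timestamp is treated as an opaque string copied from the example.

```python
from pathlib import PurePosixPath


def rollback_dir(dbpath: str, db: str, collection: str) -> PurePosixPath:
    """Directory that holds rollback files for one collection."""
    return PurePosixPath(dbpath) / "rollback" / f"{db}.{collection}"


def rollback_file(dbpath: str, db: str, collection: str, timestamp: str) -> PurePosixPath:
    """Full path of one rollback file: removed.<timestamp>.bson."""
    return rollback_dir(dbpath, db, collection) / f"removed.{timestamp}.bson"


# "/data/db" is an assumed dbPath; substitute the value from your configuration.
path = rollback_file("/data/db", "reporting", "comments", "2019-01-31T02-57-40.0")
print(path)
```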
If the operation to roll back is a collection drop or a document deletion, the rollback of the collection drop or document deletion is not written to the rollback data directory.
Read Rollback Data
To read the contents of the rollback files, use bsondump. Based on the content and the knowledge of their applications, administrators can decide the next course of action to take.
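As a rough sketch, the rollback files could be located and passed to bsondump from a script. The helper names below are hypothetical, the glob pattern assumes the 4.0+ directory layout described above, and dump_rollback_file assumes bsondump is installed and on the PATH.

```python
import subprocess
from pathlib import Path
from typing import List


def find_rollback_files(dbpath: str) -> List[Path]:
    """List files matching <dbpath>/rollback/<db>.<collection>/removed.<timestamp>.bson."""
    return sorted(Path(dbpath).glob("rollback/*/removed.*.bson"))


def bsondump_command(bson_file: Path) -> List[str]:
    """Build the command line for dumping a single BSON file."""
    return ["bsondump", str(bson_file)]


def dump_rollback_file(bson_file: Path) -> str:
    """Run bsondump on one file and return whatever it writes to stdout."""
    result = subprocess.run(
        bsondump_command(bson_file),
        capture_output=True,
        text=True,
        check=True,  # raise if bsondump exits with an error
    )
    return result.stdout
```

A caller would loop over find_rollback_files(dbpath) and inspect the output of dump_rollback_file for each file before deciding whether to reapply any of the documents.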
[1] In previous versions, rollback files are located directly under the <dbpath>/rollback directory with filenames of the form <db>.<collection>.<timestamp>.bson.
Avoid Replica Set Rollbacks
For replica sets, the default write concern {w: 1} only provides acknowledgement of write operations on the primary. With the default write concern, data may be rolled back if the primary steps down before the write operations have replicated to any of the secondaries. This includes data written in multi-document transactions that commit using "w: 1" write concern.
Journaling and Write Concern majority
To prevent rollbacks of data that have been acknowledged to the client, run all voting members with journaling enabled and use w: "majority" write concern to guarantee that the write operations propagate to a majority of the replica set nodes before returning with acknowledgement to the issuing client.
With writeConcernMajorityJournalDefault set to false, MongoDB does not wait for w: "majority" writes to be written to the on-disk journal before acknowledging the writes. As such, majority write operations could possibly roll back in the event of a transient loss (e.g. crash and restart) of a majority of nodes in a given replica set.
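The difference between {w: 1} and w: "majority" acknowledgement can be shown with simple arithmetic. The sketch below is a simplified model, not server logic: it assumes every voting member is data-bearing (no arbiters) and ignores journaling, and the function names are illustrative.

```python
def majority(voting_members: int) -> int:
    """Smallest number of members that forms a strict majority."""
    return voting_members // 2 + 1


def is_majority_acknowledged(voting_members: int, members_with_write: int) -> bool:
    """True if the write has reached a majority of the voting members."""
    return members_with_write >= majority(voting_members)


# Three-member replica set: a {w: 1} write is acknowledged once it is on the
# primary only, so it has not yet reached a majority.
print(is_majority_acknowledged(3, 1))
# The same write after one secondary has replicated it.
print(is_majority_acknowledged(3, 2))
```

In this model, a write that exists only on the primary is the kind of write that may be rolled back if the primary steps down before it replicates.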
Visibility of Data That Can Be Rolled Back
- Regardless of a write’s write concern, other clients using "local" or "available" read concern can see the result of a write operation before the write operation is acknowledged to the issuing client.
- Clients using "local" or "available" read concern can read data which may be subsequently rolled back during replica set failovers.
For operations in a multi-document transaction, when a transaction commits, all data changes made in the transaction are saved and visible outside the transaction. That is, a transaction will not commit some of its changes while rolling back others.
Until a transaction commits, the data changes made in the transaction are not visible outside the transaction.
However, when a transaction writes to multiple shards, not all outside read operations need to wait for the result of the committed transaction to be visible across the shards. For example, if a transaction is committed and write 1 is visible on shard A but write 2 is not yet visible on shard B, an outside read at read concern "local" can read the results of write 1 without seeing write 2.
Rollback Considerations
User Operations
Starting in version 4.2, MongoDB kills all in-progress user operations when a member enters the ROLLBACK state.
Index Builds
- For feature compatibility version (fcv) "4.2", MongoDB waits for any in-progress index builds to finish before starting a rollback.
- For feature compatibility version (fcv) "4.0", MongoDB waits for any in-progress background index builds to finish before starting a rollback.
For more information on the index build process, see Index Builds on Populated Collections.
Size Limitations
Changed in version 4.0.
Starting in version 4.0, MongoDB has no limit on the amount of data that can be rolled back.
In previous versions, a mongod instance will not roll back more than 300 megabytes of data and requires manual intervention if more than 300 megabytes of data need to be rolled back.
Rollback Elapsed Time Limitations
Starting in version 4.0, the rollback time limit defaults to 24 hours and is configurable using the parameter rollbackTimeLimitSecs:
- In MongoDB 4.2+ and 4.0.13+, the rollback time limit is calculated between the first operation after the common point and the last point in the oplog for the member to roll back.
- In MongoDB 4.0.0-4.0.12, the rollback time limit is calculated between the common point and the last point in the oplog for the member to roll back.
In MongoDB 3.6 and earlier, the rollback time limit is not configurable and is set to 30 minutes.
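The two ways of measuring the rollback window in the 4.0 series can give very different results when the member was idle after the common point. The sketch below is a simplified illustration with made-up timestamps; rollback_span and its arguments are hypothetical names, and the oplog is reduced to a list of entry times, oldest first.

```python
from datetime import datetime, timedelta
from typing import List

# Default limit stated above: 24 hours, expressed in seconds.
ROLLBACK_TIME_LIMIT = timedelta(seconds=86400)


def rollback_span(
    oplog_times: List[datetime],
    common_index: int,
    skip_common_point: bool,
) -> timedelta:
    """Time span the limit is compared against.

    skip_common_point=True  measures from the first operation after the
                            common point (the 4.2+ / 4.0.13+ calculation).
    skip_common_point=False measures from the common point itself
                            (the 4.0.0-4.0.12 calculation).
    """
    if common_index >= len(oplog_times) - 1:
        return timedelta(0)  # nothing after the common point to roll back
    start = common_index + 1 if skip_common_point else common_index
    return oplog_times[-1] - oplog_times[start]


oplog_times = [
    datetime(2019, 1, 1, 0, 0),  # common point
    datetime(2019, 1, 2, 6, 0),  # first operation after the common point
    datetime(2019, 1, 2, 7, 0),  # last entry in the member's oplog
]

newer = rollback_span(oplog_times, 0, skip_common_point=True)
older = rollback_span(oplog_times, 0, skip_common_point=False)
print(newer, newer <= ROLLBACK_TIME_LIMIT)
print(older, older <= ROLLBACK_TIME_LIMIT)
```

With these sample times, the newer calculation measures one hour and stays within the default limit, while the older calculation measures 31 hours and exceeds it, even though the operations to roll back are the same.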