Reconfigure a Replica Set with Unavailable Members
To reconfigure a replica set when a majority of members are available, use the rs.reconfig() operation on the current primary, following the example in the Replica Set Reconfiguration Procedure.
This document provides steps for reconfiguring a replica set when only a minority of members are accessible.
You may need to use the procedure, for example, in a geographically distributed replica set, where no local group of members can reach a majority. See Replica Set Elections for more information on this situation.
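To make the majority condition concrete, the following is a minimal sketch that checks whether a set of reachable hosts forms a strict majority of voting members. The function name hasVotingMajority and the example host names are hypothetical and not part of the MongoDB shell. The sketch assumes the member documents have the shape returned by rs.conf(), and that a member votes unless its votes field is 0.

```javascript
// Hypothetical helper, not a MongoDB shell built-in.
// Returns true when the reachable voting members form a strict majority
// of all voting members in the configuration.
function hasVotingMajority(members, reachableHosts) {
  const reachable = new Set(reachableHosts);
  // Assumption: a member votes unless its configuration sets votes to 0.
  const voting = members.filter((m) => m.votes !== 0);
  const reachableVoting = voting.filter((m) => reachable.has(m.host));
  return reachableVoting.length > voting.length / 2;
}

// Example data shaped like the members array of rs.conf(); hosts are made up.
const exampleMembers = [
  { _id: 0, host: "dc1-a.example.net:27017" },
  { _id: 1, host: "dc1-b.example.net:27017" },
  { _id: 2, host: "dc2-a.example.net:27017" },
  { _id: 3, host: "dc2-b.example.net:27017" },
  { _id: 4, host: "dc3-a.example.net:27017" },
];

// Only the two dc1 members are reachable: 2 of 5 is not a majority.
console.log(
  hasVotingMajority(exampleMembers, [
    "dc1-a.example.net:27017",
    "dc1-b.example.net:27017",
  ])
);
```

When a check like this returns false for every group of members that can still communicate, no group can elect a primary, which is the situation this document addresses.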
Reconfigure by Forcing the Reconfiguration
This procedure lets you recover while a majority of replica set members are down or unreachable. You connect to any surviving member and use the force option to the rs.reconfig() method.
The force option forces a new configuration onto the member. Use this procedure only to recover from catastrophic interruptions. Do not use force every time you reconfigure. Also, do not use the force option in any automatic scripts and do not use force when there is still a primary.
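One way to check the last condition before proceeding is to look for a primary in the status output of the member you are connected to. The following is a minimal sketch, assuming a status document with the shape returned by rs.status(), where each entry in members carries a stateStr field. The function name hasVisiblePrimary and the sample data are hypothetical.

```javascript
// Hypothetical guard, not a MongoDB shell built-in.
// `status` is assumed to have the shape returned by rs.status().
// Returns true if any member is reported in the PRIMARY state.
function hasVisiblePrimary(status) {
  return status.members.some((m) => m.stateStr === "PRIMARY");
}

// Sample status data for illustration only; host names are made up.
const sampleStatus = {
  set: "rs0",
  members: [
    { _id: 0, name: "a.example.net:27017", stateStr: "SECONDARY" },
    { _id: 1, name: "b.example.net:27017", stateStr: "(not reachable/healthy)" },
    { _id: 2, name: "c.example.net:27017", stateStr: "(not reachable/healthy)" },
  ],
};

if (hasVisiblePrimary(sampleStatus)) {
  console.log("A primary is visible: use rs.reconfig() on the primary instead.");
} else {
  console.log("No primary is visible from this member.");
}
```

In the shell, the same function could be called as hasVisiblePrimary(rs.status()). A result of false only reflects the view of the member you are connected to, so treat it as one input to the decision rather than proof that no primary exists elsewhere.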
To force reconfiguration:
Back up a surviving member.
Connect to a surviving member and save the current configuration. Consider the following example commands for saving the configuration:
- cfg = rs.conf()
- printjson(cfg)
- On the same member, remove the down and unreachable members of the replica set from the members array by setting the array equal to the surviving members alone. Consider the following example, which uses the cfg variable created in the previous step:
- cfg.members = [cfg.members[0] , cfg.members[4] , cfg.members[7]]
- On the same member, reconfigure the set by using the rs.reconfig() command with the force option set to true:
- rs.reconfig(cfg, {force : true})
This operation forces the secondary to use the new configuration. The configuration is then propagated to all the surviving members listed in the members array. The replica set then elects a new primary.
Note
When you use force : true, the version number in the replica set configuration increases significantly, by tens or hundreds of thousands. This is normal and designed to prevent set version collisions if you accidentally force reconfigurations on both sides of a network partition and then the network partitioning ends.
- If the failure or partition was only temporary, shut down or decommission the removed members as soon as possible.
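The member-removal step above selects survivors by array index, which is easy to get wrong under pressure. As an alternative, the following is a minimal sketch that selects survivors by host name instead. The function name keepSurvivingMembers and the host names are hypothetical. The sketch assumes a configuration document with the shape returned by rs.conf() and returns a copy rather than modifying the saved configuration.

```javascript
// Hypothetical helper, not a MongoDB shell built-in.
// Returns a copy of cfg whose members array contains only the members
// whose host appears in survivingHosts. The original cfg is left unchanged.
function keepSurvivingMembers(cfg, survivingHosts) {
  const keep = new Set(survivingHosts);
  const members = cfg.members.filter((m) => keep.has(m.host));
  if (members.length === 0) {
    // Refuse to build a configuration with no members at all.
    throw new Error("no member of the configuration matches the surviving hosts");
  }
  return { ...cfg, members };
}

// Example configuration for illustration only; hosts are made up.
const savedCfg = {
  _id: "rs0",
  version: 3,
  members: [
    { _id: 0, host: "a.example.net:27017" },
    { _id: 1, host: "b.example.net:27017" },
    { _id: 2, host: "c.example.net:27017" },
  ],
};

const forcedCfg = keepSurvivingMembers(savedCfg, ["a.example.net:27017"]);
console.log(JSON.stringify(forcedCfg.members));

// In the shell, the result would then be passed to the forced reconfiguration:
//   rs.reconfig(keepSurvivingMembers(rs.conf(), [ ...surviving hosts... ]), { force: true })
```

The surviving members keep their original _id values, as they do in the index-based example in the procedure.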
See also