Split Chunks in a Sharded Cluster
Normally, MongoDB splits a chunk after an insert if the chunkexceeds the maximum chunk size. However,you may want to split chunks manually if:
- you have a large amount of data in your cluster and very fewchunks, as is the case after deploying a cluster usingexisting data.
- you expect to add a large amount of data that would initially residein a single chunk or shard. For example, you plan to insert a largeamount of data with shard key values between
300
and400
, but all values of your shard keys are between250
and500
are in a single chunk.
Note
New in version 2.6: MongoDB provides the mergeChunks
commandto combine contiguous chunk ranges into a single chunk. SeeMerge Chunks in a Sharded Cluster for moreinformation.
The balancer may migrate recently split chunks to a new shardimmediately if the move benefits future insertions. The balancer doesnot distinguish between chunks split manually and those splitautomatically by the system.
Warning
Be careful when splitting data in a sharded collection to createnew chunks. When you shard a collection that has existing data,MongoDB automatically creates chunks to evenly distribute thecollection. To split data effectively in a sharded cluster you mustconsider the number of documents in a chunk and the averagedocument size to create a uniform chunk size. When chunks haveirregular sizes, shards may have an equal number of chunks but havevery different data sizes. Avoid creating splits that lead to acollection with differently sized chunks.
Use sh.status()
to determine the current chunk ranges acrossthe cluster.
To split chunks manually, use the split
command with eitherfields middle
or find
. The mongo
shell provides thehelper methods sh.splitFind()
and sh.splitAt()
.
splitFind()
splits the chunk that contains the _first_document returned that matches this query into two equally sized chunks.You must specify the full namespace (i.e. “<database>.<collection>
”)of the sharded collection to splitFind()
. The query insplitFind()
does not need to use the shard key, though itnearly always makes sense to do so.
Example
The following command splits the chunk that contains the value of63109
for the zipcode
field in the people
collection ofthe records
database:
- sh.splitFind( "records.people", { "zipcode": "63109" } )
Use splitAt()
to split a chunk in two, using the querieddocument as the lower bound in the new chunk:
Example
The following command splits the chunk that contains the value of63109
for the zipcode
field in the people
collection ofthe records
database.
- sh.splitAt( "records.people", { "zipcode": "63109" } )
Note
splitAt()
does not necessarily split the chunkinto two equally sized chunks. The split occurs at the location ofthe document matching the query, regardless of where that document isin the chunk.
See also