Split Chunks in a Sharded Cluster

Split Chunks in a Sharded Cluster

Normally, MongoDB splits a chunk after an insert if the chunkexceeds the maximum chunk size. However,you may want to split chunks manually if:

you have a large amount of data in your cluster and very fewchunks, as is the case after deploying a cluster usingexisting data.
you expect to add a large amount of data that would initially residein a single chunk or shard. For example, you plan to insert a largeamount of data with shard key values between 300 and400, but all values of your shard keys are between 250 and500 are in a single chunk.

Note

New in version 2.6: MongoDB provides the mergeChunks commandto combine contiguous chunk ranges into a single chunk. SeeMerge Chunks in a Sharded Cluster for moreinformation.

The balancer may migrate recently split chunks to a new shardimmediately if the move benefits future insertions. The balancer doesnot distinguish between chunks split manually and those splitautomatically by the system.

Warning

Be careful when splitting data in a sharded collection to createnew chunks. When you shard a collection that has existing data,MongoDB automatically creates chunks to evenly distribute thecollection. To split data effectively in a sharded cluster you mustconsider the number of documents in a chunk and the averagedocument size to create a uniform chunk size. When chunks haveirregular sizes, shards may have an equal number of chunks but havevery different data sizes. Avoid creating splits that lead to acollection with differently sized chunks.

Use sh.status() to determine the current chunk ranges acrossthe cluster.

To split chunks manually, use the split command with eitherfields middle or find. The mongo shell provides thehelper methods sh.splitFind() and sh.splitAt().

splitFind() splits the chunk that contains the _first_document returned that matches this query into two equally sized chunks.You must specify the full namespace (i.e. “<database>.<collection>”)of the sharded collection to splitFind(). The query insplitFind() does not need to use the shard key, though itnearly always makes sense to do so.

Example

The following command splits the chunk that contains the value of63109 for the zipcode field in the people collection ofthe records database:

sh.splitFind( "records.people", { "zipcode": "63109" } )

Use splitAt() to split a chunk in two, using the querieddocument as the lower bound in the new chunk:

Example

The following command splits the chunk that contains the value of63109 for the zipcode field in the people collection ofthe records database.

sh.splitAt( "records.people", { "zipcode": "63109" } )

Note

splitAt() does not necessarily split the chunkinto two equally sized chunks. The split occurs at the location ofthe document matching the query, regardless of where that document isin the chunk.