Create Chunks in a Sharded Cluster
In most situations a sharded cluster will create/split and distribute chunks automatically without user intervention. However, in a limited number of cases, MongoDB cannot create enough chunks or distribute data fast enough to support the required throughput.
For example, you may want to ingest a large volume of data into a cluster that is unbalanced, or where the ingestion of data will lead to data imbalance, such as with monotonically increasing or decreasing shard keys. Pre-splitting the chunks of an empty sharded collection can help with the throughput in these cases.
Alternatively, starting in MongoDB 4.0.3, you can define the zones and zone ranges before sharding an empty or a non-existing collection. The shard collection operation then creates chunks for the defined zone ranges, as well as any additional chunks needed to cover the entire range of the shard key values, and performs an initial chunk distribution based on the zone ranges. For more information, see Empty Collection.
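As a rough sketch of the zone-first approach, the mongo shell helpers sh.addShardToZone(), sh.updateZoneKeyRange(), and sh.shardCollection() could be combined as shown here. The function name defineZonesThenShard, the shard names (shardA, shardB), the zone names, and the range boundaries are all hypothetical, and the sketch assumes sharding is already enabled for the myapp database. The sh object is passed in as a parameter only so that the function can be exercised outside the shell.

```javascript
// Sketch only: shard names, zone names, and ranges below are hypothetical.
// `sh` is the mongo shell sharding helper object, passed in explicitly.
function defineZonesThenShard(sh, ns, key, zones) {
  zones.forEach(function (z) {
    // Associate the shard with the zone.
    sh.addShardToZone(z.shard, z.zone);
    // Associate a shard key range with the zone (min inclusive, max exclusive).
    sh.updateZoneKeyRange(ns, z.min, z.max, z.zone);
  });
  // Shard the empty or non-existing collection only after the zones are defined.
  return sh.shardCollection(ns, key);
}

// Hypothetical layout: emails in ["a", "n") on shardA, ["n", "z") on shardB.
// Key values outside these ranges fall into the additional chunks.
var exampleZones = [
  { shard: "shardA", zone: "A_TO_M", min: { email: "a" }, max: { email: "n" } },
  { shard: "shardB", zone: "N_TO_Y", min: { email: "n" }, max: { email: "z" } }
];

// In the mongo shell:
//   defineZonesThenShard(sh, "myapp.users", { email: 1 }, exampleZones);
```

The ordering is the point of the sketch: the zone ranges must exist before the collection is sharded for the initial chunk creation to take them into account.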
Warning
Only pre-split chunks for an empty collection. Manually splitting chunks for a populated collection can lead to unpredictable chunk ranges and sizes as well as inefficient or ineffective balancing behavior.
To split empty chunks manually, you can run the split command.
Example
To create chunks for documents in the myapp.users collection using the email field as the shard key, use the following operation in the mongo shell:
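    // Outer loop: first letter, 'a' (97) through 'z' (122).
    for ( var x=97; x<97+26; x++ ){
      // Inner loop: every sixth second letter ('a', 'g', 'm', 's', 'y').
      for ( var y=97; y<97+26; y+=6 ) {
        var prefix = String.fromCharCode(x) + String.fromCharCode(y);
        // Split the chunk that contains this prefix at { email: prefix }.
        db.adminCommand( { split: "myapp.users", middle: { email : prefix } } );
      }
    }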
This assumes a collection size of 100 million documents.
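The arithmetic behind this example can be checked with a small sketch. The loops generate 26 × 5 = 130 split points, which produce 131 chunks. At 100 million documents that is roughly 763,000 documents per chunk, assuming (unrealistically) that email values are spread evenly across the prefixes. The function names buildSplitPoints and applySplits are hypothetical, and runCommand stands in for db.adminCommand so that the logic can be run outside the shell.

```javascript
// Sketch only: function names are illustrative, not part of MongoDB.
// Build the same two-letter prefixes as the loops above: aa, ag, am, as, ay, ba, ...
function buildSplitPoints() {
  var points = [];
  for (var x = 97; x < 97 + 26; x++) {
    for (var y = 97; y < 97 + 26; y += 6) {
      points.push(String.fromCharCode(x) + String.fromCharCode(y));
    }
  }
  return points;
}

// Issue one split command per point and stop at the first failure.
// Assumption: runCommand behaves like db.adminCommand and returns
// a document with ok: 1 on success.
function applySplits(runCommand, ns, field, points) {
  points.forEach(function (p) {
    var middle = {};
    middle[field] = p;
    var res = runCommand({ split: ns, middle: middle });
    if (!res || res.ok !== 1) {
      throw new Error("split failed at " + p + ": " + JSON.stringify(res));
    }
  });
  return points.length;
}

var points = buildSplitPoints();
var chunkCount = points.length + 1;                 // n split points produce n + 1 chunks
var docsPerChunk = Math.round(100e6 / chunkCount);  // assumes an even key distribution

// In the mongo shell:
//   applySplits(function (c) { return db.adminCommand(c); },
//               "myapp.users", "email", points);
```

Checking the result of each split is a design choice of this sketch. The original loops ignore the returned document, so a failed split would go unnoticed.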
- For information on the initial chunks created and distributed by the sharding command, see Empty Collection.
- For information on the balancer and automatic distribution of chunks across shards, see Cluster Balancer and Chunk Migration.
- For information on manually migrating chunks, see Migrate Chunks in a Sharded Cluster.