In the previous post, we configured two shards, each backed by its own replica set (rs0 and rs1).
In this post we will insert some data and observe how it is distributed between the shards.
For the data to be distributed:
- first, sharding must be enabled on the database
- second, the collection whose data is to be distributed must have a shard key (sharding can be ranged, hashed, or tag-based)
The high-level steps are:
- connect to mongos
- enable sharding on the foo database
- shard the foo.testData collection with a hashed shard key
- insert data into the foo.testData collection
- check data distribution and replication
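The steps above can be sketched as a single function. This is only an illustration, assuming it is pasted into a `mongo` shell that is already connected to the mongos. `shardAndLoad` and its parameters are hypothetical names and not part of MongoDB; `sh.enableSharding`, `sh.shardCollection`, `getSiblingDB`, `getCollection`, `insert` and `getShardDistribution` are the standard shell helpers.

```javascript
// Hypothetical helper: runs the steps listed above against a mongos.
// `sh` and `db` are the globals provided by the mongo shell; they are
// passed in as arguments so the function does not rely on them implicitly.
function shardAndLoad(sh, db, dbName, collName, docCount) {
  var ns = dbName + "." + collName;

  // 1. Enable sharding on the database.
  sh.enableSharding(dbName);

  // 2. Shard the collection on a hashed key over field "x".
  sh.shardCollection(ns, { x: "hashed" });

  // 3. Insert documents through mongos into the sharded collection.
  var coll = db.getSiblingDB(dbName).getCollection(collName);
  for (var i = 1; i <= docCount; i++) {
    coll.insert({ x: i });
  }

  // 4. Print how the documents ended up spread across the shards.
  coll.getShardDistribution();
  return ns;
}
```

In the shell it could be called as `shardAndLoad(sh, db, "foo", "testData", 1000)`. Using `getSiblingDB` means the result does not depend on which database the shell happens to be connected to.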
### Connect to Mongos
root@wash-i-03aaefdf-restore $ mongo localhost:27017
MongoDB shell version: 3.2.1
connecting to: localhost:27017/test
Server has startup warnings:
2016-11-23T11:43:50.348+1100 I CONTROL [main] ** WARNING: You are running this process as the root user, which is not recommended.
2016-11-23T11:43:50.348+1100 I CONTROL [main]
### Enable Sharding on foo database
MongoDB Enterprise mongos> sh.enableSharding("foo")
{ "ok" : 1 }
### Insert some sample data
MongoDB Enterprise mongos> for (var i = 1; i <= 100; i++) db.testData.insert( { x : i } )
### Enable sharding on the testData collection with a hashed shard key on field "x"
MongoDB Enterprise mongos> sh.shardCollection("foo.testData", { "x": "hashed" })
{ "collectionsharded" : "foo.testData", "ok" : 1 }
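To see why a hashed key suits a steadily increasing field like `x`, here is a toy comparison that runs under Node.js rather than the mongo shell. It is a sketch under loud assumptions: `toyHash` is a stand-in built on MD5 and is not the exact function MongoDB uses for hashed indexes, and the four evenly sized chunks are made up for illustration.

```javascript
// Toy comparison of ranged vs hashed placement for increasing keys.
// ASSUMPTION: toyHash is a stand-in built on MD5 from Node's crypto module;
// it is NOT the exact function MongoDB uses for hashed indexes.
const nodeCrypto = require("crypto");

const CHUNKS = 4;
const HASH_SPACE = 2 ** 32; // toy hash space: unsigned 32-bit integers

function toyHash(x) {
  // First 4 bytes of the MD5 digest, read as an unsigned 32-bit integer.
  return nodeCrypto.createHash("md5").update(String(x)).digest().readUInt32BE(0);
}

// Ranged placement: chunk boundaries split the key space [0, maxKey) evenly.
function rangedChunk(x, maxKey) {
  return Math.min(CHUNKS - 1, Math.floor((x / maxKey) * CHUNKS));
}

// Hashed placement: chunk boundaries split the hash space evenly.
function hashedChunk(x) {
  return Math.floor((toyHash(x) / HASH_SPACE) * CHUNKS);
}

function countPerChunk(keys, place) {
  const counts = new Array(CHUNKS).fill(0);
  for (const k of keys) counts[place(k)] += 1;
  return counts;
}

// The newest 1000 keys out of a key space of one million.
const recentKeys = [];
for (let i = 999001; i <= 1000000; i++) recentKeys.push(i);

const ranged = countPerChunk(recentKeys, (k) => rangedChunk(k, 1000001));
const hashed = countPerChunk(recentKeys, hashedChunk);
console.log("ranged:", ranged); // every recent key falls into the last chunk
console.log("hashed:", hashed); // recent keys are scattered across the chunks
```

With ranged placement the newest keys all target one chunk, and therefore one shard. Hashing scatters neighbouring values, so new inserts are spread over the chunks.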
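### Insert more data after enabling sharding on the collection
Note: the shell connected to the `test` database above, so switch to `foo` first with `use foo`. Otherwise `db.testData` refers to `test.testData` and not to the sharded `foo.testData`.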
MongoDB Enterprise mongos> for (var i = 101; i <= 1000000; i++) db.testData.insert( { x : i } )
WriteResult({ "nInserted" : 1 })
### Check the status of sharding
MongoDB Enterprise mongos> sh.status()
--- Sharding Status ---
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("5834e646a2c25f3b06d65226")
}
shards:
{ "_id" : "rs0", "host" : "rs0/localhost:47018,localhost:47019,localhost:47020" }
{ "_id" : "rs1", "host" : "rs1/localhost:57018,localhost:57019,localhost:57020" }
active mongoses:
"3.2.1" : 1
balancer:
Currently enabled: yes
Currently running: no
Failed balancer rounds in last 5 attempts: 0
Migration Results for the last 24 hours:
1 : Success
databases:
{ "_id" : "foo", "primary" : "rs0", "partitioned" : true }
foo.testData
shard key: { "x" : "hashed" } --> collection shard key
unique: false
balancing: true
chunks:
rs0 2 --> two chunks on the rs0 replica set
rs1 2 --> two chunks on the rs1 replica set
{ "x" : { "$minKey" : 1 } } -->> { "x" : NumberLong("-4611686018427387902") } on : rs0 Timestamp(2, 2)
{ "x" : NumberLong("-4611686018427387902") } -->> { "x" : NumberLong(0) } on : rs0 Timestamp(2, 3)
{ "x" : NumberLong(0) } -->> { "x" : NumberLong("4611686018427387902") } on : rs1 Timestamp(2, 4)
{ "x" : NumberLong("4611686018427387902") } -->> { "x" : { "$maxKey" : 1 } } on : rs1 Timestamp(2, 5)
{ "_id" : "test", "primary" : "rs0", "partitioned" : false }
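The chunk ranges in this output can be read as a lookup table from a hashed key value to a shard. The sketch below (Node.js, not the mongo shell) copies the four ranges from the status output and finds the owner of a given hashed value. `shardForHash` is a hypothetical helper, and it takes a value that is assumed to be already hashed; the hashing itself is done by MongoDB and is not reproduced here.

```javascript
// Chunk ranges copied from the sh.status() output above.
// Bounds are 64-bit hashed values, so BigInt is used; null stands in for
// $minKey / $maxKey. Each range includes its min and excludes its max.
const chunks = [
  { min: null, max: -4611686018427387902n, shard: "rs0" },
  { min: -4611686018427387902n, max: 0n, shard: "rs0" },
  { min: 0n, max: 4611686018427387902n, shard: "rs1" },
  { min: 4611686018427387902n, max: null, shard: "rs1" },
];

// Hypothetical helper: find the shard that owns an already-hashed key value.
function shardForHash(hash) {
  const h = BigInt(hash);
  const owner = chunks.find(
    (c) => (c.min === null || h >= c.min) && (c.max === null || h < c.max)
  );
  return owner.shard;
}

console.log(shardForHash(-1n)); // rs0
console.log(shardForHash(0n));  // rs1
```

In this cluster, negative hashed values land on rs0, while zero and positive hashed values land on rs1.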
### Check the data distribution for collection
MongoDB Enterprise mongos> db.testData.getShardDistribution()
Shard rs0 at rs0/localhost:47018,localhost:47019,localhost:47020
data : 22.32MiB docs : 487794 chunks : 2
estimated data per chunk : 7.44MiB
estimated docs per chunk : 162598
Shard rs1 at rs1/localhost:57018,localhost:57019,localhost:57020
data : 2.86MiB docs : 62648 chunks : 2
estimated data per chunk : 978KiB
estimated docs per chunk : 20882
Totals
data : 25.19MiB docs : 550442 chunks : 4
Shard rs0 contains 88.61% data, 88.61% docs in cluster, avg obj size on shard : 48B
Shard rs1 contains 11.38% data, 11.38% docs in cluster, avg obj size on shard : 48B
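The percentages in the totals can be checked from the per-shard document counts. This is a minimal Node.js sketch; `docShare` is a hypothetical helper, and truncating (not rounding) to two decimals is an assumption chosen because it matches the figures shown above.

```javascript
// Document counts per shard, copied from the getShardDistribution() output above.
const docsPerShard = { rs0: 487794, rs1: 62648 };

// Share of the cluster's documents on each shard, as a percentage
// truncated (not rounded) to two decimal places.
function docShare(counts) {
  const total = Object.values(counts).reduce((sum, n) => sum + n, 0);
  const share = {};
  for (const [shard, n] of Object.entries(counts)) {
    share[shard] = Math.floor((n / total) * 10000) / 100;
  }
  return share;
}

console.log(docShare(docsPerShard)); // { rs0: 88.61, rs1: 11.38 }
```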
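Note: an uneven distribution like this is possible early on. The balancer evens out the number of chunks per shard, not the number of documents, so document counts only level out as chunks grow to the configured chunk size (64 MB by default in this version) and are split and migrated.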