Subscriber Count

    457

Subscribe2

Pages

MongoDB for Oracle DBA’s Part 8 : Understanding Data Distribution between Shards

With previous post, we have configured the two shards with each having replicaset rs0 and rs1.

In this post we will insert some data and observe how the data distribution is happening between shards.

To do the data distributed,

  • firstly the database must have the sharding enabled
  • secondly, the collection which data need to be distributed should have the shard key (shard key can be range, hash, tag based)

High Level Steps are,

  • connect to mongos
  • enable sharding on foo database
  • enable sharding on foo.testdata collection with hashed shard key
  • insert data into foo.testdata collection
  • check data distribution and replication

### Connect to Mongos

root@wash-i-03aaefdf-restore $ mongo localhost:27017

MongoDB shell version: 3.2.1

connecting to: localhost:27017/test

Server has startup warnings:

2016-11-23T11:43:50.348+1100 I CONTROL  [main] ** WARNING: You are running this process as the root user, which is not recommended.

2016-11-23T11:43:50.348+1100 I CONTROL  [main]

### Enable Sharding on foo database

MongoDB Enterprise mongos> sh.enableSharding("foo")

{ "ok" : 1 }

### Insert some sample data

MongoDB Enterprise mongos> for (var i = 1; i <= 100; i++) db.testData.insert( { x : i } )

### Enable sharding on collection (table) testData for column "x" with hashed mechanism

MongoDB Enterprise mongos> sh.shardCollection("foo.testData", { "x": "hashed" })

{ "collectionsharded" : "foo.testData", "ok" : 1 }

### Insert more data after enable sharding on collection

MongoDB Enterprise mongos> for (var i = 101; i <= 1000000; i++) db.testData.insert( { x : i } )

WriteResult({ "nInserted" : 1 })

### Check the status of sharding 

MongoDB Enterprise mongos> sh.status()

--- Sharding Status ---

  sharding version: {

        "_id" : 1,

        "minCompatibleVersion" : 5,

        "currentVersion" : 6,

        "clusterId" : ObjectId("5834e646a2c25f3b06d65226")

}

  shards:

        {  "_id" : "rs0",  "host" : "rs0/localhost:47018,localhost:47019,localhost:47020" }

        {  "_id" : "rs1",  "host" : "rs1/localhost:57018,localhost:57019,localhost:57020" }

  active mongoses:

        "3.2.1" : 1

  balancer:

        Currently enabled:  yes

        Currently running:  no

        Failed balancer rounds in last 5 attempts:  0

        Migration Results for the last 24 hours:

                1 : Success

  databases:

        {  "_id" : "foo",  "primary" : "rs0",  "partitioned" : true }

                foo.testData

                        shard key: { "x" : "hashed" }    --> collection shard key

                        unique: false

                        balancing: true

                        chunks:

                                rs0     2           --> Two chunk distributed to rs0 replicaset

                                rs1     2           --> two chunk distributed to rs1 replicaset

                        { "x" : { "$minKey" : 1 } } -->> { "x" : NumberLong("-4611686018427387902") } on : rs0 Timestamp(2, 2)

                        { "x" : NumberLong("-4611686018427387902") } -->> { "x" : NumberLong(0) } on : rs0 Timestamp(2, 3)

                        { "x" : NumberLong(0) } -->> { "x" : NumberLong("4611686018427387902") } on : rs1 Timestamp(2, 4)

                        { "x" : NumberLong("4611686018427387902") } -->> { "x" : { "$maxKey" : 1 } } on : rs1 Timestamp(2, 5)

        {  "_id" : "test",  "primary" : "rs0",  "partitioned" : false }

### Check the data distribution for collection 

MongoDB Enterprise mongos> db.testData.getShardDistribution()

Shard rs0 at rs0/localhost:47018,localhost:47019,localhost:47020

 data : 22.32MiB docs : 487794 chunks : 2

 estimated data per chunk : 7.44MiB

 estimated docs per chunk : 162598

Shard rs1 at rs1/localhost:57018,localhost:57019,localhost:57020

 data : 2.86MiB docs : 62648 chunks : 2

 estimated data per chunk : 978KiB

 estimated docs per chunk : 20882

Totals

 data : 25.19MiB docs : 550442 chunks : 4

 Shard rs0 contains 88.61% data, 88.61% docs in cluster, avg obj size on shard : 48B

 

 Shard rs1 contains 11.38% data, 11.38% docs in cluster, avg obj size on shard : 48B

Note: the uneven distribution of data is possible until the chunks reach according to specification.

Leave a Reply

You can use these HTML tags

<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>