Cassandra for Oracle DBAs Part 7 – Adding & Deleting Nodes

Adding a node is straightforward:

1. Generate a token list with a script, so that you know which token to assign to the new node.

2. Download the Cassandra software, unpack it, and set the following important parameters in cassandra.yaml:

cluster_name: 'geek_cluster'

seeds: "127.0.0.1, 127.0.0.2,127.0.0.3"

listen_address: 127.0.0.4

rpc_address: 127.0.0.4

initial_token: (the token chosen for the new node in step 1)

3. Start Cassandra:

$CASSANDRA_HOME/bin/cassandra -f
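The token script from step 1 is not shown in this post. As a minimal sketch, assuming the cluster uses the Murmur3Partitioner (token range -2^63 to 2^63 - 1, which is what the token values in this post suggest), evenly spaced tokens and a token that splits an existing range in half could be computed as follows. The function names even_tokens and midpoint_token are hypothetical helpers, not part of Cassandra.

```python
# Sketch only: evenly spaced initial tokens for the Murmur3Partitioner,
# whose tokens run from -2**63 to 2**63 - 1 (assumption: Murmur3 is in use).


def even_tokens(node_count):
    """Return node_count evenly spaced tokens around the ring."""
    step = 2**64 // node_count
    return [step * i - 2**63 for i in range(node_count)]


def midpoint_token(lower, upper):
    """Return the token halfway between two adjacent tokens (lower < upper)."""
    return (lower + upper) // 2


if __name__ == "__main__":
    tokens = even_tokens(3)
    for number, token in enumerate(tokens, start=1):
        print(f"node{number}: {token}")
    # A new node placed halfway between the first two existing tokens.
    print("new node:", midpoint_token(tokens[0], tokens[1]))
```

For a three-node ring this gives the same three tokens that nodetool ring reports for this cluster.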

When the new node bootstraps into the cluster, a data rebalance starts working behind the scenes.

If you recall ASM disk operations, when you add or drop a disk at the diskgroup level, the existing data is rebalanced across the remaining or new disks within the diskgroup. Cassandra does the same, but at the node level, based on the token range each node owns.

With three nodes, the ring shows each node owning 33.33% of the data:

root@wash-i-16ca26c8-prod ~/.ccm $ ccm node1 nodetool ring

Note: Ownership information does not include topology; for complete information, specify a keyspace

Datacenter: datacenter1

Address         Rack        Status State   Load            Owns                Token

                                                                                                      3074457345618258602

127.0.0.1       rack1       Up     Normal  24.84 KB        33.33%             -9223372036854775808

127.0.0.2       rack1       Up     Normal  24.8 KB         33.33%              -3074457345618258603

127.0.0.3       rack1       Up     Normal  24.87 KB        33.33%              3074457345618258602

I added the node with CCM rather than manually:

ccm add --itf 127.0.0.4 --jmx-port 7400 -b node4

Check the status again after a nodetool repair:

root@wash-i-16ca26c8-prod ~/.ccm $ ccm node1 nodetool ring

Note: Ownership information does not include topology; for complete information, specify a keyspace

Datacenter: datacenter1

==========

Address         Rack        Status State   Load            Owns                Token

                                                                                                      3074457345618258602

127.0.0.1       rack1       Up     Normal  43.59 KB        33.33%              -9223372036854775808

127.0.0.4       rack1       Up     Normal  22.89 KB        16.67%             -6148914691236517206

127.0.0.2       rack1       Up     Normal  48.36 KB        16.67%             -3074457345618258603

127.0.0.3       rack1       Up     Normal  57.37 KB        33.33%              3074457345618258602

As you can see, with three nodes each node owns 33.33%, whereas with four nodes, 127.0.0.2 and the new 127.0.0.4 own 16.67% each, because the token assigned to the new node splits the range that 127.0.0.2 previously owned.
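The Owns column can be reproduced from the tokens alone. The sketch below assumes one token per node and the Murmur3 ring of 2^64 tokens; ownership is a hypothetical helper name, and the result ignores replication, just as the nodetool note about topology says.

```python
RING_SIZE = 2**64  # number of distinct Murmur3 tokens (assumed partitioner)


def ownership(tokens_by_node):
    """Map each node to the fraction of the ring it owns.

    A node owns the range (previous_token, own_token]; the node with the
    lowest token also owns the wrap-around range from the highest token.
    """
    ring = sorted(tokens_by_node.items(), key=lambda item: item[1])
    owned = {}
    for index, (node, token) in enumerate(ring):
        previous_token = ring[index - 1][1]  # index -1 wraps to the last node
        # A zero span only happens on a single-node ring, which owns everything.
        span = (token - previous_token) % RING_SIZE or RING_SIZE
        owned[node] = span / RING_SIZE
    return owned


if __name__ == "__main__":
    four_nodes = {
        "127.0.0.1": -9223372036854775808,
        "127.0.0.4": -6148914691236517206,
        "127.0.0.2": -3074457345618258603,
        "127.0.0.3": 3074457345618258602,
    }
    for node, fraction in ownership(four_nodes).items():
        print(f"{node}  {fraction:.2%}")
```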

This way, node additions and deletions do not cause data loss, since the rebalance operation runs online and behind the scenes, much like ASM.

While a rebalance is running, you can check nodetool netstats to see how much has completed and how much is still pending, similar to v$asm_operation.

root@wash-i-16ca26c8-prod ~/.ccm/repository/4.5.2/demos/portfolio_manager/bin $ ccm node1 nodetool netstats

Mode: NORMAL

Not sending any streams.

Read Repair Statistics:

Attempted: 0

Mismatch (Blocking): 0

Mismatch (Background): 0

Pool Name                    Active   Pending      Completed

Commands                        n/a         0         140361

Responses                       n/a         0         266253

If a node is leaving the cluster, this is also visible in the nodetool netstats output:

root@wash-i-16ca26c8-prod ~/.ccm/repository/4.5.2/demos/portfolio_manager/bin $ ccm node4 nodetool netstats

Mode: LEAVING

Not sending any streams.

Read Repair Statistics:

Attempted: 0

Mismatch (Blocking): 0

Mismatch (Background): 0

Pool Name                    Active   Pending      Completed

Commands                        n/a         0            159

Responses                       n/a         0         238788
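When watching several nodes, it may help to pull just the Mode line out of the netstats text. This is a small sketch that works on output already captured as a string; netstats_mode is a hypothetical helper, and it assumes the "Mode: ..." line format shown above.

```python
def netstats_mode(netstats_output):
    """Return the value of the 'Mode:' line, or None if no such line exists."""
    for line in netstats_output.splitlines():
        key, separator, value = line.partition(":")
        if separator and key.strip() == "Mode":
            return value.strip()
    return None


if __name__ == "__main__":
    sample = "Mode: LEAVING\nNot sending any streams.\nRead Repair Statistics:\n"
    print(netstats_mode(sample))
```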

To delete a node, use nodetool decommission rather than remove, since remove drops the node and deletes its data directly, without a rebalance. Here I removed node4 directly:

root@wash-i-16ca26c8-prod ~ $ ccm node4 remove

The status shows only three nodes up:

root@wash-i-16ca26c8-prod ~ $ ccm status

Cluster: 'geek_cluster'

node1: UP

node3: UP

node2: UP

root@wash-i-16ca26c8-prod ~ $ ccm node1 nodetool status

Note: Ownership information does not include topology; for complete information, specify a keyspace

Datacenter: Cassandra

=====================

Status=Up/Down

|/ State=Normal/Leaving/Joining/Moving

--  Address    Load       Tokens  Owns   Host ID                               Rack

UN  127.0.0.1  2.09 MB    1       33.3%  1dc82d65-f88d-4b79-9c1b-dc5aa2a55534  rack1

UN  127.0.0.2  3.02 MB    1       23.6%  ab247945-5989-48f3-82b3-8f44a3aaa375  rack1

UN  127.0.0.3  3.22 MB    1       33.3%  023a4514-3a74-42eb-be49-feaa69bf098c  rack1

DN  127.0.0.4  3.39 MB    1       9.8%   9d5b4aee-6707-4639-a2d8-0af000c25b45  rack1
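A status listing like the one above can be scanned for down nodes with a few lines of code. The sketch assumes the column layout shown here, where each node row starts with a two-letter status/state code such as UN or DN; down_nodes is a hypothetical helper name.

```python
def down_nodes(status_output):
    """Return the addresses of nodes whose status code marks them as down."""
    addresses = []
    for line in status_output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        code = fields[0]
        # First letter: U(p) or D(own); second: N(ormal), L(eaving), J(oining), M(oving).
        if len(code) == 2 and code[0] == "D" and code[1] in "NLJM":
            addresses.append(fields[1])
    return addresses


if __name__ == "__main__":
    sample = (
        "--  Address    Load       Tokens  Owns   Host ID  Rack\n"
        "UN  127.0.0.1  2.09 MB    1       33.3%  1dc82d65-f88d-4b79-9c1b-dc5aa2a55534  rack1\n"
        "DN  127.0.0.4  3.39 MB    1       9.8%   9d5b4aee-6707-4639-a2d8-0af000c25b45  rack1\n"
    )
    print(down_nodes(sample))
```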

Node 4 now shows DN, and the 9.8% of data it held appears to be lost because of the direct remove command. I then stopped Cassandra and started it again; here is the result:

root@wash-i-16ca26c8-prod ~/.ccm/repository/4.5.2/resources/cassandra/conf $ ccm start

[node1 ERROR] org.apache.cassandra.io.sstable.CorruptSSTableException: org.apache.cassandra.io.compress.CorruptBlockException: (/var/lib/cassandra/data/system/local/system-local-jb-93-Data.db): corruption detected, chunk at 0 of length 261.

I tried nodetool repair to recover the data, but the repair could not proceed for any range that has node4 as a neighbor:

root@wash-i-16ca26c8-prod ~/.ccm/repository/4.5.2/bin $ ccm node1 nodetool repair

Traceback (most recent call last):

  File "/usr/local/bin/ccm", line 86, in <module>

    cmd.run()

  File "/usr/local/lib/python2.7/site-packages/ccmlib/cmds/node_cmds.py", line 267, in run

    stdout, stderr = self.node.nodetool(" ".join(self.args[1:]))

  File "/usr/local/lib/python2.7/site-packages/ccmlib/dse_node.py", line 264, in nodetool

    raise NodetoolError(" ".join(args), exit_status, stdout, stderr)

ccmlib.node.NodetoolError: Nodetool command '/root/.ccm/repository/4.5.2/bin/nodetool -h localhost -p 7100 repair' failed; exit status: 1; stdout: [2016-07-13 01:19:08,567] Nothing to repair for keyspace 'system'

[2016-07-13 01:19:08,573] Starting repair command #1, repairing 2 ranges for keyspace PortfolioDemo

[2016-07-13 01:19:10,719] Repair session cc495b80-4897-11e6-9deb-e7c99fc0dbe2 for range (-3074457345618258603,3074457345618258602] finished

[2016-07-13 01:19:10,720] Repair session cd8beda0-4897-11e6-9deb-e7c99fc0dbe2 for range (3074457345618258602,-9223372036854775808] failed with error java.io.IOException: Cannot proceed on repair because a neighbor (/127.0.0.4) is dead: session failed

[2016-07-13 01:19:10,720] Repair command #1 finished

[2016-07-13 01:19:10,728] Starting repair command #2, repairing 4 ranges for keyspace dse_system

[2016-07-13 01:19:10,735] Repair session cd8e1080-4897-11e6-9deb-e7c99fc0dbe2 for range (-3074457345618258603,3074457345618258602] failed with error java.io.IOException: Cannot proceed on repair because a neighbor (/127.0.0.4) is dead: session failed

[2016-07-13 01:19:10,736] Repair session cd8e3790-4897-11e6-9deb-e7c99fc0dbe2 for range (3074457345618258602,-9223372036854775808] failed with error java.io.IOException: Cannot proceed on repair because a neighbor (/127.0.0.4) is dead: session failed

[2016-07-13 01:19:10,737] Repair session cd8eacc0-4897-11e6-9deb-e7c99fc0dbe2 for range (-9223372036854775808,-7422755166451980864] failed with error java.io.IOException: Cannot proceed on repair because a neighbor (/127.0.0.4) is dead: session failed

[2016-07-13 01:19:10,738] Repair session cd8ed3d0-4897-11e6-9deb-e7c99fc0dbe2 for range (-7422755166451980864,-3074457345618258603] failed with error java.io.IOException: Cannot proceed on repair because a neighbor (/127.0.0.4) is dead: session failed

[2016-07-13 01:19:10,738] Repair command #2 finished
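To summarize a long repair log like this one, the failed ranges and the dead neighbor can be extracted with two regular expressions. The sketch assumes the exact message wording shown above; failed_repairs is a hypothetical helper name.

```python
import re

# Matches the line that announces which keyspace a repair command covers.
KEYSPACE = re.compile(r"ranges for keyspace (\S+)")
# Matches a failed session and captures the range bounds and the dead neighbor.
FAILED = re.compile(
    r"for range \((-?\d+),(-?\d+)\] failed with error "
    r".*neighbor \(/([\d.]+)\) is dead"
)


def failed_repairs(repair_output):
    """Return (keyspace, range_start, range_end, dead_neighbor) per failed session."""
    failures = []
    keyspace = None
    for line in repair_output.splitlines():
        started = KEYSPACE.search(line)
        if started:
            keyspace = started.group(1)
            continue
        failed = FAILED.search(line)
        if failed:
            start, end, neighbor = failed.groups()
            failures.append((keyspace, int(start), int(end), neighbor))
    return failures
```

Applied to the output above, this would list one failed range for PortfolioDemo and four for dse_system, all pointing at 127.0.0.4.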

So the best way to delete a node is to decommission it first; once the node shows DECOMMISSIONED, you can remove it.
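The decommission command itself is not shown in this post. As a hedged sketch of the sequence just described, the helper below runs nodetool decommission through CCM, polls nodetool netstats until the mode reads DECOMMISSIONED, and only then calls ccm remove. The function decommission_then_remove, its run parameter, and the polling settings are all assumptions made for illustration; run is any caller-supplied function that executes a command and returns its output as text.

```python
import time


def decommission_then_remove(node, run, poll_seconds=5.0, max_polls=120):
    """Decommission `node`, wait for DECOMMISSIONED, then remove it from CCM.

    `run` takes a list of command-line arguments and returns the command's
    standard output as text. Returns True if the node was removed, or False
    if it never reported DECOMMISSIONED within max_polls checks.
    """
    run(["ccm", node, "nodetool", "decommission"])
    for _ in range(max_polls):
        output = run(["ccm", node, "nodetool", "netstats"])
        if "Mode: DECOMMISSIONED" in output:
            run(["ccm", node, "remove"])
            return True
        time.sleep(poll_seconds)
    return False
```

In practice, run could wrap subprocess, for example lambda args: subprocess.run(args, capture_output=True, text=True, check=True).stdout, while a fake run makes the logic easy to try without a cluster.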

root@wash-i-16ca26c8-prod ~/.ccm/repository/4.5.2/resources/cassandra/conf $ ccm node4 nodetool ring

Note: Ownership information does not include topology; for complete information, specify a keyspace

Datacenter: Cassandra

=====================

Address    Rack        Status State   Load            Owns                Token

                                                                          3074457345618258602

127.0.0.1  rack1       Up     Normal  3.05 MB         33.33%              -9223372036854775808

127.0.0.2  rack1       Up     Normal  2.99 MB         33.33%              -3074457345618258603

127.0.0.3  rack1       Up     Normal  3.5 MB          33.33%              3074457345618258602

 

 

root@wash-i-16ca26c8-prod ~/.ccm/repository/4.5.2/resources/cassandra/conf $ ccm node1 nodetool netstats

Mode: NORMAL

Not sending any streams.

Read Repair Statistics:

Attempted: 2

Mismatch (Blocking): 0

Mismatch (Background): 1

Pool Name                    Active   Pending      Completed

Commands                        n/a         0           5116

Responses                       n/a         0         243591

 

root@wash-i-16ca26c8-prod ~/.ccm/repository/4.5.2/resources/cassandra/conf $ ccm node1 nodetool netstats

Mode: NORMAL

Not sending any streams.

Read Repair Statistics:

Attempted: 2

Mismatch (Blocking): 0

Mismatch (Background): 1

Pool Name                    Active   Pending      Completed

Commands                        n/a         0           5116

Responses                       n/a         0         243607

 

root@wash-i-16ca26c8-prod ~/.ccm/repository/4.5.2/resources/cassandra/conf $ ccm node2 nodetool netstats

Mode: NORMAL

Not sending any streams.

Read Repair Statistics:

Attempted: 0

Mismatch (Blocking): 0

Mismatch (Background): 0

Pool Name                    Active   Pending      Completed

Commands                        n/a         0           5955

Responses                       n/a         0         245289

 

root@wash-i-16ca26c8-prod ~/.ccm/repository/4.5.2/resources/cassandra/conf $ ccm node3 nodetool netstats

Mode: NORMAL

Not sending any streams.

Read Repair Statistics:

Attempted: 0

Mismatch (Blocking): 0

Mismatch (Background): 0

Pool Name                    Active   Pending      Completed

Commands                        n/a         0           4652

Responses                       n/a         0         243249

 

root@wash-i-16ca26c8-prod ~/.ccm/repository/4.5.2/resources/cassandra/conf $ ccm node4 nodetool netstats

Mode: DECOMMISSIONED

Not sending any streams.

Read Repair Statistics:

Attempted: 0

Mismatch (Blocking): 0

Mismatch (Background): 0

Pool Name                    Active   Pending      Completed

Commands                        n/a         0           4431

Responses                       n/a         0         280491

 

root@wash-i-16ca26c8-prod ~/.ccm/repository/4.5.2/resources/cassandra/conf $ ccm node4 nodetool removenode

 

 

If you recall, deleting a node in Oracle works in a similar order: we first deconfigure CRS and then delete the node.

-Thanks

GEEK DBA
