Adding a node is straightforward:
1. Generate a token list using a token generator script, so you get the token for the new node (a minimal sketch of such a script follows this list).
2. Download the Cassandra software, unpack it, and change the following important parameters in cassandra.yaml:
cluster_name: 'geek_cluster'
seeds: "127.0.0.1,127.0.0.2,127.0.0.3"
listen_address: 127.0.0.4
rpc_address: 127.0.0.4
initial_token:
3. Start Cassandra:
$CASSANDRA_HOME/bin/cassandra -f
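I am not reproducing the exact token script here, but evenly spaced tokens for the default Murmur3Partitioner (token range -2^63 to 2^63-1) can be computed with a small shell loop like this minimal sketch (bc is assumed to be available, to handle the 64-bit arithmetic):

# Minimal sketch: evenly spaced Murmur3Partitioner tokens for an N-node ring
N=3
for i in $(seq 0 $((N - 1))); do
  echo "-(2^63) + $i * (2^64) / $N" | bc
done

For N=3 this prints exactly the three tokens you will see in the ring output below.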
Now, when the new node bootstraps into the cluster, something starts working behind the scenes: data rebalancing.
If you recollect ASM disk operations: when you add or drop a disk at the diskgroup level, the existing data is rebalanced across the remaining or new disks within the diskgroup. Cassandra does the same, but at the node level, based on the token range each node owns.
So with three nodes, the ring shows each node owning 33.33% of the data:
root@wash-i-16ca26c8-prod ~/.ccm $ ccm node1 nodetool ring
Note: Ownership information does not include topology; for complete information, specify a keyspace
Datacenter: datacenter1
Address Rack Status State Load Owns Token
3074457345618258602
127.0.0.1 rack1 Up Normal 24.84 KB 33.33% -9223372036854775808
127.0.0.2 rack1 Up Normal 24.8 KB 33.33% -3074457345618258603
127.0.0.3 rack1 Up Normal 24.87 KB 33.33% 3074457345618258602
I added the node with CCM rather than manually:
ccm add --itf 127.0.0.4 --jmx-port 7400 -b node4
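The -b flag enables auto-bootstrap for the new node; it then has to be started before it joins the ring:

ccm node4 start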
Check the status again, after a nodetool repair:
root@wash-i-16ca26c8-prod ~/.ccm $ ccm node1 nodetool ring
Note: Ownership information does not include topology; for complete information, specify a keyspace
Datacenter: datacenter1
==========
Address Rack Status State Load Owns Token
3074457345618258602
127.0.0.1 rack1 Up Normal 43.59 KB 33.33% -9223372036854775808
127.0.0.4 rack1 Up Normal 22.89 KB 16.67% -6148914691236517206
127.0.0.2 rack1 Up Normal 48.36 KB 16.67% -3074457345618258603
127.0.0.3 rack1 Up Normal 57.37 KB 33.33% 3074457345618258602
As you can see, with three nodes each owned 33.33%; with four nodes, 127.0.0.2 and the new 127.0.0.4 now own 16.67% each, because node4's token (-6148914691236517206) bisects the range previously owned by 127.0.0.2.
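A node owns the range from the previous token (exclusive) up to its own token (inclusive), and the ownership percentage is simply the range size divided by 2^64. A quick sanity check of node2's new share with bc:

# node2 now owns (-6148914691236517206, -3074457345618258603]
echo "scale=4; ((-3074457345618258603) - (-6148914691236517206)) / 2^64 * 100" | bc
# prints 16.6600, i.e. the 16.67% shown above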
This way node additions and deletions do not cause data loss, since the rebalance happens online and behind the scenes, just like ASM. While the rebalance is running, you can check the following to see how much has completed and how much is pending, much like querying v$asm_operation:
root@wash-i-16ca26c8-prod ~/.ccm/repository/4.5.2/demos/portfolio_manager/bin $ ccm node1 nodetool netstats
Mode: NORMAL
Not sending any streams.
Read Repair Statistics:
Attempted: 0
Mismatch (Blocking): 0
Mismatch (Background): 0
Pool Name Active Pending Completed
Commands n/a 0 140361
Responses n/a 0 266253
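To keep an eye on the streaming progress during a rebalance, you can simply poll netstats in a loop (a throwaway sketch, nothing more):

# Poll every 5 seconds and show the mode and any active streams
while true; do
  ccm node1 nodetool netstats | grep -E 'Mode|stream'
  sleep 5
done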
If a node is leaving the cluster, this is also visible in the nodetool netstats output:
root@wash-i-16ca26c8-prod ~/.ccm/repository/4.5.2/demos/portfolio_manager/bin $ ccm node4 nodetool netstats
Mode: LEAVING
Not sending any streams.
Read Repair Statistics:
Attempted: 0
Mismatch (Blocking): 0
Mismatch (Background): 0
Pool Name Active Pending Completed
Commands n/a 0 159
Responses n/a 0 238788
Further, to delete a node, nodetool decommission should be used rather than a plain remove, since remove simply drops the node and deletes its data without rebalancing it to the surviving nodes. To demonstrate the danger, I directly removed node4:
root@wash-i-16ca26c8-prod ~ $ ccm node4 remove
The status now shows only three nodes up:
root@wash-i-16ca26c8-prod ~ $ ccm status
Cluster: 'geek_cluster'
node1: UP
node3: UP
node2: UP
root@wash-i-16ca26c8-prod ~ $ ccm node1 nodetool status
Note: Ownership information does not include topology; for complete information, specify a keyspace
Datacenter: Cassandra
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UN 127.0.0.1 2.09 MB 1 33.3% 1dc82d65-f88d-4b79-9c1b-dc5aa2a55534 rack1
UN 127.0.0.2 3.02 MB 1 23.6% ab247945-5989-48f3-82b3-8f44a3aaa375 rack1
UN 127.0.0.3 3.22 MB 1 33.3% 023a4514-3a74-42eb-be49-feaa69bf098c rack1
DN 127.0.0.4 3.39 MB 1 9.8% 9d5b4aee-6707-4639-a2d8-0af000c25b45 rack1
Note that node4 shows DN (Down/Normal) and still owns 9.8% of the data, which appears to be lost because of the direct remove. I then stopped Cassandra and started it again, with this result:
root@wash-i-16ca26c8-prod ~/.ccm/repository/4.5.2/resources/cassandra/conf $ ccm start
[node1 ERROR] org.apache.cassandra.io.sstable.CorruptSSTableException: org.apache.cassandra.io.compress.CorruptBlockException: (/var/lib/cassandra/data/system/local/system-local-jb-93-Data.db): corruption detected, chunk at 0 of length 261.
I tried a nodetool repair to fix the data, but the repair is not allowed to proceed for any range involving node4:
root@wash-i-16ca26c8-prod ~/.ccm/repository/4.5.2/bin $ ccm node1 nodetool repair
Traceback (most recent call last):
File "/usr/local/bin/ccm", line 86, in <module>
cmd.run()
File "/usr/local/lib/python2.7/site-packages/ccmlib/cmds/node_cmds.py", line 267, in run
stdout, stderr = self.node.nodetool(" ".join(self.args[1:]))
File "/usr/local/lib/python2.7/site-packages/ccmlib/dse_node.py", line 264, in nodetool
raise NodetoolError(" ".join(args), exit_status, stdout, stderr)
ccmlib.node.NodetoolError: Nodetool command '/root/.ccm/repository/4.5.2/bin/nodetool -h localhost -p 7100 repair' failed; exit status: 1; stdout: [2016-07-13 01:19:08,567] Nothing to repair for keyspace 'system'
[2016-07-13 01:19:08,573] Starting repair command #1, repairing 2 ranges for keyspace PortfolioDemo
[2016-07-13 01:19:10,719] Repair session cc495b80-4897-11e6-9deb-e7c99fc0dbe2 for range (-3074457345618258603,3074457345618258602] finished
[2016-07-13 01:19:10,720] Repair session cd8beda0-4897-11e6-9deb-e7c99fc0dbe2 for range (3074457345618258602,-9223372036854775808] failed with error java.io.IOException: Cannot proceed on repair because a neighbor (/127.0.0.4) is dead: session failed
[2016-07-13 01:19:10,720] Repair command #1 finished
[2016-07-13 01:19:10,728] Starting repair command #2, repairing 4 ranges for keyspace dse_system
[2016-07-13 01:19:10,735] Repair session cd8e1080-4897-11e6-9deb-e7c99fc0dbe2 for range (-3074457345618258603,3074457345618258602] failed with error java.io.IOException: Cannot proceed on repair because a neighbor (/127.0.0.4) is dead: session failed
[2016-07-13 01:19:10,736] Repair session cd8e3790-4897-11e6-9deb-e7c99fc0dbe2 for range (3074457345618258602,-9223372036854775808] failed with error java.io.IOException: Cannot proceed on repair because a neighbor (/127.0.0.4) is dead: session failed
[2016-07-13 01:19:10,737] Repair session cd8eacc0-4897-11e6-9deb-e7c99fc0dbe2 for range (-9223372036854775808,-7422755166451980864] failed with error java.io.IOException: Cannot proceed on repair because a neighbor (/127.0.0.4) is dead: session failed
[2016-07-13 01:19:10,738] Repair session cd8ed3d0-4897-11e6-9deb-e7c99fc0dbe2 for range (-7422755166451980864,-3074457345618258603] failed with error java.io.IOException: Cannot proceed on repair because a neighbor (/127.0.0.4) is dead: session failed
[2016-07-13 01:19:10,738] Repair command #2 finished
So the best way to delete a node is with decommission; once the node shows DECOMMISSIONED, you can remove it.
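The safe sequence, using the same commands as above, looks like this (decommission streams node4's ranges back to the surviving nodes before it leaves the ring):

ccm node4 nodetool decommission   # stream data out, then leave the ring
ccm node4 nodetool netstats       # wait until Mode: DECOMMISSIONED
ccm node4 remove                  # only now remove the node from CCM

Verifying the result: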
root@wash-i-16ca26c8-prod ~/.ccm/repository/4.5.2/resources/cassandra/conf $ ccm node4 nodetool ring
Note: Ownership information does not include topology; for complete information, specify a keyspace
Datacenter: Cassandra
=====================
Address Rack Status State Load Owns Token
3074457345618258602
127.0.0.1 rack1 Up Normal 3.05 MB 33.33% -9223372036854775808
127.0.0.2 rack1 Up Normal 2.99 MB 33.33% -3074457345618258603
127.0.0.3 rack1 Up Normal 3.5 MB 33.33% 3074457345618258602
root@wash-i-16ca26c8-prod ~/.ccm/repository/4.5.2/resources/cassandra/conf $ ccm node1 nodetool netstats
Mode: NORMAL
Not sending any streams.
Read Repair Statistics:
Attempted: 2
Mismatch (Blocking): 0
Mismatch (Background): 1
Pool Name Active Pending Completed
Commands n/a 0 5116
Responses n/a 0 243591
root@wash-i-16ca26c8-prod ~/.ccm/repository/4.5.2/resources/cassandra/conf $ ccm node1 nodetool netstats
Mode: NORMAL
Not sending any streams.
Read Repair Statistics:
Attempted: 2
Mismatch (Blocking): 0
Mismatch (Background): 1
Pool Name Active Pending Completed
Commands n/a 0 5116
Responses n/a 0 243607
root@wash-i-16ca26c8-prod ~/.ccm/repository/4.5.2/resources/cassandra/conf $ ccm node2 nodetool netstats
Mode: NORMAL
Not sending any streams.
Read Repair Statistics:
Attempted: 0
Mismatch (Blocking): 0
Mismatch (Background): 0
Pool Name Active Pending Completed
Commands n/a 0 5955
Responses n/a 0 245289
root@wash-i-16ca26c8-prod ~/.ccm/repository/4.5.2/resources/cassandra/conf $ ccm node3 nodetool netstats
Mode: NORMAL
Not sending any streams.
Read Repair Statistics:
Attempted: 0
Mismatch (Blocking): 0
Mismatch (Background): 0
Pool Name Active Pending Completed
Commands n/a 0 4652
Responses n/a 0 243249
root@wash-i-16ca26c8-prod ~/.ccm/repository/4.5.2/resources/cassandra/conf $ ccm node4 nodetool netstats
Mode: DECOMMISSIONED
Not sending any streams.
Read Repair Statistics:
Attempted: 0
Mismatch (Blocking): 0
Mismatch (Background): 0
Pool Name Active Pending Completed
Commands n/a 0 4431
Responses n/a 0 280491
root@wash-i-16ca26c8-prod ~/.ccm/repository/4.5.2/resources/cassandra/conf $ ccm node4 nodetool removenode
(Note: nodetool removenode takes the Host ID of a dead node and is meant for nodes that cannot be decommissioned; after a clean decommission, ccm node4 remove is all that remains to be done.)
If you recollect the Oracle RAC node deletion, we first deconfig CRS on the node and only then delete it; decommission plays the same role here.
-Thanks
GEEK DBA