
Cassandra for Oracle DBA’s Part 11 : Alert log & Tracing

Alert log and tracing parameters. The system log (Cassandra's equivalent of the Oracle alert log) location is configured in:

log4j-server.properties

log4j.appender.R.File=/root/.ccm/geek_cluster/node1/logs/system.log

 

root@wash-i-16ca26c8-prod ~ $ ccm node1 nodetool getlogginglevels

Logger Name                                        Log Level

root                                                    INFO

org.apache.solr.core.SolrCore                           WARN

org.apache.thrift.server.TNonblockingServer            FATAL

org.apache.solr.handler.component                       WARN

org.apache.solr.search.SolrIndexSearcher                WARN

org.apache.solr.update                                  WARN

org.apache.solr.core.CassandraSolrConfig                WARN

SolrValidationLog                                       INFO

org.apache.solr.core.RequestHandlers                    WARN

org.apache.solr.util.plugin.AbstractPluginLoader        WARN

 

root@wash-i-16ca26c8-prod ~ $

root@wash-i-16ca26c8-prod ~/.ccm/geek_cluster/node1/logs $ ls -ltr

total 528

-rw-r--r-- 1 root root      0 Jul 10 13:42 solrvalidation.log

-rw-r--r-- 1 root root 538390 Jul 11 19:58 system.log

 

Cassandra for Oracle DBA’s Part 10: Commit Logs (Managing Archive Logs & Redo Logs)

Commit logs are Cassandra's equivalent of redo logs, and archiving them gives archive-log-like behaviour. The parameters that take effect:

commitlog_directory: /root/.ccm/geek_cluster/node1/commitlogs

commitlog_segment_size_in_mb: 32

commitlog_sync: periodic

commitlog_sync_period_in_ms: 10000

The archive_command is kept in the /.ccm/geek_cluster/node1/resources/cassandra/conf/commitlog_archiving.properties file.

Argument 1 (%path) gives you the path of the file to copy, while argument 2 (%name) gives you its name. You can then create a command:

archive_command=/bin/bash /home/cassandra/scripts/cassandra-archive.sh %path %name

The above is an example from my setup. As archive_command by default executes only a single command, I point it to a custom script that does what I need.

My script then looks something like this:

#!/bin/bash

# $1 = full path of the commit log segment, $2 = its file name

# use bzip2 to compress the file (-k keeps the original so Cassandra can delete it)

bzip2 --best -k "$1"

# move the compressed copy to the commit log archive

mv "$1.bz2" "$HOME/commitlog_restore/$2.bz2"

I compress my commit log and then move it somewhere else. Cassandra calls this operation first and then deletes the commit log. You can apply similar behaviour to all of those commands.

Cassandra for Oracle DBA’s Part 9 : Creating Users & Granting Privileges

 

Creating a user in Cassandra involves changing the authentication and authorization parameters in cassandra.yaml:

# Authentication backend, implementing IAuthenticator; used to identify users

# Out of the box, Cassandra provides org.apache.cassandra.auth.{AllowAllAuthenticator,

# PasswordAuthenticator}.

# DSE also provides a Kerberos authenticator for external authentication.

#

# - AllowAllAuthenticator performs no checks - set it to disable authentication.

# - PasswordAuthenticator relies on username/password pairs to authenticate

#   users. It keeps usernames and hashed passwords in system_auth.credentials table.

#   Please increase system_auth keyspace replication factor if you use this authenticator.

# - com.datastax.bdp.cassandra.auth.KerberosAuthenticator For external authentication

#   through Kerberos. Additional configuration is required in dse.yaml

authenticator: PasswordAuthenticator

# Out of the box, Cassandra provides org.apache.cassandra.auth.{AllowAllAuthorizer, CassandraAuthorizer}.

# - AllowAllAuthorizer allows any action to any user - set it to disable authorization.

authorizer: CassandraAuthorizer

# Will be disabled automatically for AllowAllAuthorizer.

 

Once the parameters are set, create a user as follows using cqlsh:

cqlsh> CREATE USER boone WITH PASSWORD 'Niner75' NOSUPERUSER;

cqlsh> select * from system_auth.users;

 name      | super

-----------+-------

  spillman | False

     boone | False

 cassandra |  True

     akers |  True

(4 rows)

cqlsh>

 

################

Granting privileges

#################


cqlsh> grant all on keyspace "PortfolioDemo" to akers;

cqlsh> grant all on keyspace "PortfolioDemo" to akers,boone,spillman;

Bad Request: line 1:46 missing EOF at ','

cqlsh> grant all on keyspace "PortfolioDemo" to boone,spillman;

Bad Request: line 1:46 missing EOF at ','

cqlsh> grant all on keyspace "PortfolioDemo" to boone;

cqlsh> grant all on keyspace "PortfolioDemo" to spillman;

cqlsh>
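As the errors above show, GRANT accepts only a single grantee per statement, so granting to several users means one statement per user. A small helper (hypothetical, not part of cqlsh) can generate the statements:

```python
def grant_statements(keyspace, users):
    # CQL's GRANT takes exactly one grantee, so emit one statement per user
    return ['GRANT ALL ON KEYSPACE "%s" TO %s;' % (keyspace, u) for u in users]

for stmt in grant_statements("PortfolioDemo", ["akers", "boone", "spillman"]):
    print(stmt)
```

Each printed statement can then be run in cqlsh one at a time.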

 

Cassandra for Oracle DBA Part 8: Data Dictionary Overview

cqlsh> select * from system.sstable_activity;

 keyspace_name | columnfamily_name | generation | rate_120m | rate_15m

---------------+-------------------+------------+-----------+----------

 PortfolioDemo |         StockHist |          2 |         0 |        0

 PortfolioDemo |        Portfolios |          5 |         0 |        0

 PortfolioDemo |            Stocks |          5 |         0 |        0

 PortfolioDemo |         StockHist |          3 |         0 |        0

 PortfolioDemo |         StockHist |          1 |         0 |        0

(5 rows)

cqlsh>

 

cqlsh> select * from system.compaction_history;

 id                                   | bytes_in | bytes_out | columnfamily_name       | compacted_at             | keyspace_name | rows_merged

--------------------------------------+----------+-----------+-------------------------+--------------------------+---------------+--------------

 9490d1b0-470b-11e6-a73e-e7c99fc0dbe2 |    57105 |     56678 |                  Stocks | 2016-07-11 12:02:54+1000 | PortfolioDemo |    {1: 2842}

 6cf4a840-4650-11e6-b360-e7c99fc0dbe2 |      746 |       263 |        schema_keyspaces | 2016-07-10 13:43:12+1000 |        system | {1: 2, 4: 1}

 6ce0f930-4650-11e6-b360-e7c99fc0dbe2 |     8819 |      6806 |   schema_columnfamilies | 2016-07-10 13:43:11+1000 |        system | {1: 1, 2: 2}

 6ade7270-4650-11e6-b360-e7c99fc0dbe2 |      645 |       263 |        schema_keyspaces | 2016-07-10 13:43:08+1000 |        system | {1: 2, 3: 1}

 c219e750-4659-11e6-a73e-e7c99fc0dbe2 |    15917 |      6797 |   schema_columnfamilies | 2016-07-10 14:50:00+1000 |        system | {2: 2, 4: 1}

 6a9868c0-4650-11e6-b360-e7c99fc0dbe2 |      718 |       549 |                   local | 2016-07-10 13:43:08+1000 |        system |       {4: 1}

 6ce20aa0-4650-11e6-b360-e7c99fc0dbe2 |    12924 |     11444 |          schema_columns | 2016-07-10 13:43:11+1000 |        system | {1: 1, 2: 2}

 c2224bc0-4659-11e6-a73e-e7c99fc0dbe2 |    25270 |     11533 |          schema_columns | 2016-07-10 14:50:00+1000 |        system | {2: 2, 4: 1}

 c7ba1e40-4709-11e6-a73e-e7c99fc0dbe2 |     1314 |       536 |                   peers | 2016-07-11 11:50:01+1000 |        system | {2: 2, 3: 2}

 56681bc0-4709-11e6-a73e-e7c99fc0dbe2 |      730 |       312 |        schema_keyspaces | 2016-07-11 11:46:51+1000 |        system | {1: 2, 2: 2}

 94be7160-470b-11e6-a73e-e7c99fc0dbe2 |      436 |        57 | compactions_in_progress | 2016-07-11 12:02:54+1000 |        system |       {2: 2}

 94bb3d10-470b-11e6-a73e-e7c99fc0dbe2 |   577360 |    576097 |              Portfolios | 2016-07-11 12:02:54+1000 | PortfolioDemo |   {1: 10000}

 c83086c0-4709-11e6-a73e-e7c99fc0dbe2 |      780 |       543 |                   local | 2016-07-11 11:50:02+1000 |        system |       {4: 1}

 6157ad10-4651-11e6-a73e-e7c99fc0dbe2 |     1095 |       534 |                   local | 2016-07-10 13:50:02+1000 |        system |       {4: 1}

(14 rows)

 

cqlsh>

Getting Metadata of a table & Schema

cqlsh> describe table PortfolioDemo.Portfolios;

CREATE TABLE "Portfolios" (

  key bigint,

  column1 text,

  value double,

  PRIMARY KEY ((key), column1)

) WITH COMPACT STORAGE AND

  bloom_filter_fp_chance=0.010000 AND

  caching='KEYS_ONLY' AND

  comment='' AND

  dclocal_read_repair_chance=0.100000 AND

  gc_grace_seconds=60 AND

  index_interval=128 AND

  read_repair_chance=1.000000 AND

  replicate_on_write='true' AND

  populate_io_cache_on_flush='false' AND

  default_time_to_live=0 AND

  speculative_retry='NONE' AND

  memtable_flush_period_in_ms=0 AND

  compaction={'class': 'SizeTieredCompactionStrategy'} AND

  compression={'sstable_compression': 'LZ4Compressor'};

 

 

cqlsh> describe keyspace PortfolioDemo;

CREATE KEYSPACE "PortfolioDemo" WITH replication = {

  'class': 'SimpleStrategy',

  'replication_factor': '2'

};

 

Creating Table

USE "PortfolioDemo";

CREATE TABLE "HistLoss" (

  key text,

  column1 text,

  value text,

  PRIMARY KEY ((key), column1)

) WITH COMPACT STORAGE AND

  bloom_filter_fp_chance=0.010000 AND

  caching='KEYS_ONLY' AND

  comment='' AND

  dclocal_read_repair_chance=0.100000 AND

  gc_grace_seconds=60 AND

  index_interval=128 AND

  read_repair_chance=1.000000 AND

  replicate_on_write='true' AND

  populate_io_cache_on_flush='false' AND

  default_time_to_live=0 AND

  speculative_retry='NONE' AND

  memtable_flush_period_in_ms=0 AND

  compaction={'class': 'SizeTieredCompactionStrategy'} AND

  compression={'sstable_compression': 'LZ4Compressor'};

Creating Table

CREATE TABLE "Portfolios" (

  key bigint,

  column1 text,

  value double,

  PRIMARY KEY ((key), column1)

) WITH COMPACT STORAGE AND

  bloom_filter_fp_chance=0.010000 AND

  caching='KEYS_ONLY' AND

 

==================================================

Write Time, like ORA_ROWSCN

==================================================

Note: WRITETIME cannot be applied to primary/partition key columns, only to regular columns.

cqlsh:PortfolioDemo> SELECT WRITETIME (value) from "Portfolios" where key=1535;

 writetime(value)

------------------

    1468200969975

    1468200969975

    1468200969975

    1468200969975

    1468200969975

    1468200969975

(6 rows)
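The writetime values above can be converted back to wall-clock time. In this output they look like milliseconds since the Unix epoch (newer Cassandra versions report microseconds, so treat the unit as an assumption and check your version):

```python
from datetime import datetime, timezone

def writetime_to_utc(wt_ms):
    # Assumes the writetime value is milliseconds since the Unix epoch
    return datetime.fromtimestamp(wt_ms / 1000.0, tz=timezone.utc)

# 1468200969975 ms is 2016-07-11 01:36:09 UTC, i.e. 11:36 in the
# +10:00 zone that appears in the session output above
print(writetime_to_utc(1468200969975))
```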

 

===============================================

Finding Users & Roles & Privs

===============================================

cqlsh> select * from system_auth.users;

 name      | super

-----------+-------

  spillman | False

     boone | False

 cassandra |  True

     akers |  True

 

(4 rows)

 

==============================================

Knowing about Peers from each node

==============================================

root@wash-i-16ca26c8-prod ~/.ccm/repository/4.5.2/bin $ ccm node1 cqlsh

cqlsh> select * from system.peers;

 peer      | data_center | host_id                              | preferred_ip | rack  | release_version | rpc_address | schema_version                       | tokens                   | workload

-----------+-------------+--------------------------------------+--------------+-------+-----------------+-------------+--------------------------------------+--------------------------+-----------

 127.0.0.3 |   Cassandra | 023a4514-3a74-42eb-be49-feaa69bf098c |         null | rack1 |       2.0.10.71 |   127.0.0.3 | 97df5fdc-277b-339e-8aab-cb36317c1280 |  {'3074457345618258602'} | Cassandra

 127.0.0.2 |   Cassandra | ab247945-5989-48f3-82b3-8f44a3aaa375 |         null | rack1 |       2.0.10.71 |   127.0.0.2 | 97df5fdc-277b-339e-8aab-cb36317c1280 | {'-3074457345618258603'} | Cassandra

(2 rows)

cqlsh> exit

root@wash-i-16ca26c8-prod ~/.ccm/repository/4.5.2/bin $ ccm node2 cqlsh

Connected to geek_cluster at 127.0.0.2:9160.

[cqlsh 4.1.1 | Cassandra 2.0.10.71 | DSE 4.5.2 | CQL spec 3.1.1 | Thrift protocol 19.39.0]

Use HELP for help.

cqlsh> select * from system.peers;

 peer      | data_center | host_id                              | preferred_ip | rack  | release_version | rpc_address | schema_version                       | tokens                   | workload

-----------+-------------+--------------------------------------+--------------+-------+-----------------+-------------+--------------------------------------+--------------------------+-----------

 127.0.0.3 |   Cassandra | 023a4514-3a74-42eb-be49-feaa69bf098c |         null | rack1 |       2.0.10.71 |   127.0.0.3 | 97df5fdc-277b-339e-8aab-cb36317c1280 |  {'3074457345618258602'} | Cassandra

 127.0.0.1 |   Cassandra | 1dc82d65-f88d-4b79-9c1b-dc5aa2a55534 |         null | rack1 |       2.0.10.71 |   127.0.0.1 | 97df5fdc-277b-339e-8aab-cb36317c1280 | {'-9223372036854775808'} | Cassandra

(2 rows)

 

 

Performance Troubleshooting : DB File Sequential Read When performing Full Table Scan

Hello All,

Which wait event do you observe when Oracle performs a full table scan?

  • DB File Scattered Read
  • Direct Path Reads (from 11g onwards under certain conditions)

Which wait event do you observe when Oracle accesses a table via an index scan?

  • DB File Sequential Read

While working on an issue, we found that a full table scan of a 17 GB table was performing db file sequential reads.

This post is not about how to resolve the issue (that needs a separate post), but about why Oracle performs db file sequential reads when doing a full table scan. The following are the most likely cases.

Case 1: Chained rows. Rows chained across multiple blocks require extra single-block reads to fetch the remaining row pieces.

Case 2: Table size just above MBRC. For example, if MBRC is 8 and the table has 10 blocks, Oracle does one db file scattered read and two db file sequential reads.

Case 3: More than 255 columns. Oracle stores the columns beyond 255 as separate row pieces in different blocks, i.e. row chaining, so the scan has to fetch those pieces from different blocks: single-block reads inside a multiblock scan. To confirm this, check v$sesstat for the statistic "table fetch continued row".

Case 4: IOT table. Scanning an index-organized table can show db file sequential read.

Case 5: MView table. A materialized view with an index may show db file sequential read.

Case 6: Extent & block mapping. This occurs when Oracle is unable to read DBFMBRC worth of blocks from a given extent. It usually happens towards the edge of the high-water mark of the table, but in theory can happen in any extent of the table.

Case 7: Cached blocks. When certain blocks of a table are already in the cache, Oracle will not read them from disk again; in those cases it does partial scans around them, for example:

blocks 9813, 9814 and 9815 - scattered read (MBR) with MBRC=3
block 9816 - cache read
blocks 9817 and 9818 - scattered read with MBRC=2
block 9819 - cache read
block 9820 - sequential read
block 9821 - cache read
block 9822 - sequential read

In this case you get 2 MBR + 2 SBR + 3 cache reads instead of 1 MBR with MBRC=10 (or 16).
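The batching in Case 7 can be modelled: consecutive non-cached blocks are grouped into one multiblock read (capped at MBRC), a lone non-cached block becomes a single-block read, and cached blocks are served from memory. A toy sketch (not Oracle's actual algorithm):

```python
def classify_reads(blocks, cached, mbrc):
    """Group consecutive non-cached blocks into multiblock reads (MBR);
    a lone leftover block becomes a single-block read (SBR)."""
    reads, run = [], []

    def flush():
        if len(run) > 1:
            reads.append(("MBR", len(run)))
        elif run:
            reads.append(("SBR", 1))
        run.clear()

    for b in blocks:
        if b in cached:
            flush()
            reads.append(("cache", 1))
        elif run and (len(run) == mbrc or b != run[-1] + 1):
            flush()
            run.append(b)
        else:
            run.append(b)
    flush()
    return reads

# Blocks 9813..9822 with 9816, 9819 and 9821 already in the buffer cache
print(classify_reads(range(9813, 9823), {9816, 9819, 9821}, mbrc=10))
```

Running it reproduces the pattern above: two scattered reads, two sequential reads, and three cache reads.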

It seems Cases 3 and 6 are the ones we encountered, as the table has 299 columns and the extent allocation shows varying sizes (not rounded to MBRC-sized multiples).

(The last two cases were found on the internet, from Gaja Vaidyanatha and others.)

Hope this helps

-Thanks

Geek

Quick Question : Finding Transactions on a Table

Folks! This subject line may sound silly, but most of us really have to stop and think for a moment.

  • Business requirement: someone wants to add a column to a highly volatile table, so they want to understand:
  • Does it cause an issue, given the table is highly volatile and accessed by the website?
  • What is the best time to add the column, if it can cause an issue?
  • What is an alternate method to minimise the impact, if any?

Ans 1: You can add a column; that is not an issue in itself.

Ans 2: Adding a column to a highly volatile table will not have any impact by itself, but if there are running, uncommitted transactions, your DDL may wait and complete only after they finish. There is a caveat here: adding a nullable column completes much more quickly than adding a column with a default value. The reason is that the default value must be written into all existing rows; if the table is huge this takes time, and meanwhile transactions may pile up because the DDL lock prevents new transactions from starting.

   

Ans 3: The best time to add a column? First find how many transactions are happening on that table, and if auditing is enabled, find when the fewest transactions occur.

SQL> select * from dba_tab_modifications where table_name in ('XXXXXXXXXXXXXXX');

SCHEMA             TABLE                          INSERTS UPDATES DELETES

PORTAL             XXXXXXXXXXXXXXXXXXXX           16415   16415    9964 

Then find from the audit trail at what times most of the transactions are fired on those tables. (I am using XML auditing, hence v$xml_audit_trail; you can use DBA_AUDIT_TRAIL instead.) See that the 04:00 hour has very few transactions on the table. Establish a pattern for each day, then find the least transactional time for that table.

SQL> select object_name,to_char(EXTENDED_TIMESTAMP,'DD-MON-YY HH24'),count(*) from v$xml_audit_trail  where object_name in ('XXXXXXXXXXXXXXX') group by object_name,to_char(EXTENDED_TIMESTAMP,'DD-MON-YY HH24');

TABLENAME                        DATE       HR   Count

-------------------------------- ---------  --   -----

XXXXXXXXXXXXXXXXXXXXXXXX         10-AUG-16  22     483

XXXXXXXXXXXXXXXXXXXXXXXX         10-AUG-16  23     626

XXXXXXXXXXXXXXXXXXXXXXXX         11-AUG-16  00     736

XXXXXXXXXXXXXXXXXXXXXXXX         11-AUG-16  01     962

XXXXXXXXXXXXXXXXXXXXXXXX         11-AUG-16  02     183

XXXXXXXXXXXXXXXXXXXXXXXX         11-AUG-16  03     332

XXXXXXXXXXXXXXXXXXXXXXXX         11-AUG-16  04      66

XXXXXXXXXXXXXXXXXXXXXXXX         11-AUG-16  05     333

XXXXXXXXXXXXXXXXXXXXXXXX         11-AUG-16  07     162

XXXXXXXXXXXXXXXXXXXXXXXX         11-AUG-16  08      62

XXXXXXXXXXXXXXXXXXXXXXXX         11-AUG-16  09     894

XXXXXXXXXXXXXXXXXXXXXXXX         11-AUG-16  10     717

Ans 4: An alternative method for adding a column without impact? Scratching my head, really 🙂. Do you know any, apart from keeping a replicated table and renaming it (which still locks the table)?

Geek DBA

Cassandra for Oracle DBA’s Part 7 – Adding & Deleting Nodes

Adding a node is straightforward:

1. Generate a token list using a script, so you can get the token range for the new node.

2. Download the Cassandra software, unpack it, and change the following important parameters in cassandra.yaml:

cluster_name: 'geek_cluster'

seeds: "127.0.0.1, 127.0.0.2,127.0.0.3"

listen_address: 127.0.0.4

rpc_address: 127.0.0.4

token: 

3. Start Cassandra:

$CASSANDRA_HOME/bin/cassandra -f
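The token generation in step 1 can be sketched for the Murmur3 partitioner, which spaces the initial tokens evenly over [-2**63, 2**63); this small helper reproduces the tokens seen in the nodetool ring output later in this post:

```python
def initial_tokens(node_count):
    # Evenly space initial tokens across the Murmur3 range [-2**63, 2**63)
    return [(2**64 * i) // node_count - 2**63 for i in range(node_count)]

print(initial_tokens(3))
# → [-9223372036854775808, -3074457345618258603, 3074457345618258602]
```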

Now, when the new node bootstraps into the cluster, work starts behind the scenes: data rebalancing.

If you recollect ASM disk operations, when you add or delete a disk at the diskgroup level, the existing data is rebalanced across the other or new disks within the diskgroup. Cassandra does the same, but at the node level, using the token range each node owns.

So with three nodes, the ownership data shows each node owning 33.3% of the data:

root@wash-i-16ca26c8-prod ~/.ccm $ ccm node1 nodetool ring

Note: Ownership information does not include topology; for complete information, specify a keyspace

Datacenter: datacenter1

Address         Rack        Status State   Load            Owns                Token

                                                                                                      3074457345618258602

127.0.0.1       rack1       Up     Normal  24.84 KB        33.33%             -9223372036854775808

127.0.0.2       rack1       Up     Normal  24.8 KB         33.33%              -3074457345618258603

127.0.0.3       rack1       Up     Normal  24.87 KB        33.33%              3074457345618258602

I added the node with CCM rather than manually:

ccm add --itf 127.0.0.4 --jmx-port 7400 -b node4

Check the status again after a nodetool repair:

root@wash-i-16ca26c8-prod ~/.ccm $ ccm node1 nodetool ring

Note: Ownership information does not include topology; for complete information, specify a keyspace

Datacenter: datacenter1

==========

Address         Rack        Status State   Load            Owns                Token

                                                                                                      3074457345618258602

127.0.0.1       rack1       Up     Normal  43.59 KB        33.33%              -9223372036854775808

127.0.0.4       rack1       Up     Normal  22.89 KB        16.67%             -6148914691236517206

127.0.0.2       rack1       Up     Normal  48.36 KB        16.67%             -3074457345618258603

127.0.0.3       rack1       Up     Normal  57.37 KB        33.33%              3074457345618258602

As you can see, with three nodes each owned 33%, whereas with four nodes two of them have rebalanced down to 16.67% each, due to the new token range assigned to the new node.
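The Owns percentages can be derived from the token ring itself: each node owns the span from the previous token (wrapping around) up to its own token. A quick sanity check against the four-node ring above:

```python
RING_SIZE = 2**64  # size of the Murmur3 token space

def ownership(tokens):
    ts = sorted(tokens)
    pct = {}
    for i, t in enumerate(ts):
        span = (t - ts[i - 1]) % RING_SIZE  # wraps around for the first token
        pct[t] = 100.0 * span / RING_SIZE
    return pct

tokens = [-9223372036854775808, -6148914691236517206,
          -3074457345618258603, 3074457345618258602]
for t, p in ownership(tokens).items():
    print(t, round(p, 2))
```

This prints 33.33% for the two original tokens and 16.67% for the two that split a range, matching the nodetool ring output.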

This way, node additions and deletions do not cause data loss, since the rebalance operation runs online and behind the scenes, much like ASM.

While rebalancing, one can check the following to see how much has completed and how much is pending, similar to v$asm_operation:

root@wash-i-16ca26c8-prod ~/.ccm/repository/4.5.2/demos/portfolio_manager/bin $ ccm node1 nodetool netstats

Mode: NORMAL

Not sending any streams.

Read Repair Statistics:

Attempted: 0

Mismatch (Blocking): 0

Mismatch (Background): 0

Pool Name                    Active   Pending      Completed

Commands                        n/a         0         140361

Responses                       n/a         0         266253

If a node is leaving the cluster, this is also visible with the nodetool netstats command:

root@wash-i-16ca26c8-prod ~/.ccm/repository/4.5.2/demos/portfolio_manager/bin $ ccm node4 nodetool netstats

Mode: LEAVING

Not sending any streams.

Read Repair Statistics:

Attempted: 0

Mismatch (Blocking): 0

Mismatch (Background): 0

Pool Name                    Active   Pending      Completed

Commands                        n/a         0            159

Responses                       n/a         0         238788

Further, to delete a node, nodetool decommission should be used rather than remove, since remove directly drops the node and deletes its data without rebalancing. Here I directly removed node4:

root@wash-i-16ca26c8-prod ~ $ ccm node4 remove

The status now shows only three nodes up:

root@wash-i-16ca26c8-prod ~ $ ccm status

Cluster: 'geek_cluster'

node1: UP

node3: UP

node2: UP

root@wash-i-16ca26c8-prod ~ $ ccm node1 nodetool status

Note: Ownership information does not include topology; for complete information, specify a keyspace

Datacenter: Cassandra

=====================

Status=Up/Down

|/ State=Normal/Leaving/Joining/Moving

--  Address    Load       Tokens  Owns   Host ID                               Rack

UN  127.0.0.1  2.09 MB    1       33.3%  1dc82d65-f88d-4b79-9c1b-dc5aa2a55534  rack1

UN  127.0.0.2  3.02 MB    1       23.6%  ab247945-5989-48f3-82b3-8f44a3aaa375  rack1

UN  127.0.0.3  3.22 MB    1       33.3%  023a4514-3a74-42eb-be49-feaa69bf098c  rack1

DN  127.0.0.4  3.39 MB    1       9.8%   9d5b4aee-6707-4639-a2d8-0af000c25b45  rack1

See node4's status showing DN; it holds 9.8% of the data, which appears lost due to the direct remove command. I then stopped Cassandra and started it again, and here is the result:

root@wash-i-16ca26c8-prod ~/.ccm/repository/4.5.2/resources/cassandra/conf $ ccm start

[node1 ERROR] org.apache.cassandra.io.sstable.CorruptSSTableException: org.apache.cassandra.io.compress.CorruptBlockException: (/var/lib/cassandra/data/system/local/system-local-jb-93-Data.db): corruption detected, chunk at 0 of length 261.

I tried nodetool repair to repair the data, but it would not allow the repair because node4 is dead:

root@wash-i-16ca26c8-prod ~/.ccm/repository/4.5.2/bin $ ccm node1 nodetool repair

Traceback (most recent call last):

  File "/usr/local/bin/ccm", line 86, in <module>

    cmd.run()

  File "/usr/local/lib/python2.7/site-packages/ccmlib/cmds/node_cmds.py", line 267, in run

    stdout, stderr = self.node.nodetool(" ".join(self.args[1:]))

  File "/usr/local/lib/python2.7/site-packages/ccmlib/dse_node.py", line 264, in nodetool

    raise NodetoolError(" ".join(args), exit_status, stdout, stderr)

ccmlib.node.NodetoolError: Nodetool command '/root/.ccm/repository/4.5.2/bin/nodetool -h localhost -p 7100 repair' failed; exit status: 1; stdout: [2016-07-13 01:19:08,567] Nothing to repair for keyspace 'system'

[2016-07-13 01:19:08,573] Starting repair command #1, repairing 2 ranges for keyspace PortfolioDemo

[2016-07-13 01:19:10,719] Repair session cc495b80-4897-11e6-9deb-e7c99fc0dbe2 for range (-3074457345618258603,3074457345618258602] finished

[2016-07-13 01:19:10,720] Repair session cd8beda0-4897-11e6-9deb-e7c99fc0dbe2 for range (3074457345618258602,-9223372036854775808] failed with error java.io.IOException: Cannot proceed on repair because a neighbor (/127.0.0.4) is dead: session failed

[2016-07-13 01:19:10,720] Repair command #1 finished

[2016-07-13 01:19:10,728] Starting repair command #2, repairing 4 ranges for keyspace dse_system

[2016-07-13 01:19:10,735] Repair session cd8e1080-4897-11e6-9deb-e7c99fc0dbe2 for range (-3074457345618258603,3074457345618258602] failed with error java.io.IOException: Cannot proceed on repair because a neighbor (/127.0.0.4) is dead: session failed

[2016-07-13 01:19:10,736] Repair session cd8e3790-4897-11e6-9deb-e7c99fc0dbe2 for range (3074457345618258602,-9223372036854775808] failed with error java.io.IOException: Cannot proceed on repair because a neighbor (/127.0.0.4) is dead: session failed

[2016-07-13 01:19:10,737] Repair session cd8eacc0-4897-11e6-9deb-e7c99fc0dbe2 for range (-9223372036854775808,-7422755166451980864] failed with error java.io.IOException: Cannot proceed on repair because a neighbor (/127.0.0.4) is dead: session failed

[2016-07-13 01:19:10,738] Repair session cd8ed3d0-4897-11e6-9deb-e7c99fc0dbe2 for range (-7422755166451980864,-3074457345618258603] failed with error java.io.IOException: Cannot proceed on repair because a neighbor (/127.0.0.4) is dead: session failed

[2016-07-13 01:19:10,738] Repair command #2 finished

So the best way to delete a node is with decommission; once the node shows DECOMMISSIONED, you can remove it.

root@wash-i-16ca26c8-prod ~/.ccm/repository/4.5.2/resources/cassandra/conf $ ccm node4 nodetool ring

Note: Ownership information does not include topology; for complete information, specify a keyspace

Datacenter: Cassandra

=====================

Address    Rack        Status State   Load            Owns                Token

                                                                          3074457345618258602

127.0.0.1  rack1       Up     Normal  3.05 MB         33.33%              -9223372036854775808

127.0.0.2  rack1       Up     Normal  2.99 MB         33.33%              -3074457345618258603

127.0.0.3  rack1       Up     Normal  3.5 MB          33.33%              3074457345618258602

 

 

root@wash-i-16ca26c8-prod ~/.ccm/repository/4.5.2/resources/cassandra/conf $ ccm node1 nodetool netstats

Mode: NORMAL

Not sending any streams.

Read Repair Statistics:

Attempted: 2

Mismatch (Blocking): 0

Mismatch (Background): 1

Pool Name                    Active   Pending      Completed

Commands                        n/a         0           5116

Responses                       n/a         0         243591

 

root@wash-i-16ca26c8-prod ~/.ccm/repository/4.5.2/resources/cassandra/conf $ ccm node1 nodetool netstats

Mode: NORMAL

Not sending any streams.

Read Repair Statistics:

Attempted: 2

Mismatch (Blocking): 0

Mismatch (Background): 1

Pool Name                    Active   Pending      Completed

Commands                        n/a         0           5116

Responses                       n/a         0         243607

 

root@wash-i-16ca26c8-prod ~/.ccm/repository/4.5.2/resources/cassandra/conf $ ccm node2 nodetool netstats

Mode: NORMAL

Not sending any streams.

Read Repair Statistics:

Attempted: 0

Mismatch (Blocking): 0

Mismatch (Background): 0

Pool Name                    Active   Pending      Completed

Commands                        n/a         0           5955

Responses                       n/a         0         245289

 

root@wash-i-16ca26c8-prod ~/.ccm/repository/4.5.2/resources/cassandra/conf $ ccm node3 nodetool netstats

Mode: NORMAL

Not sending any streams.

Read Repair Statistics:

Attempted: 0

Mismatch (Blocking): 0

Mismatch (Background): 0

Pool Name                    Active   Pending      Completed

Commands                        n/a         0           4652

Responses                       n/a         0         243249

 

root@wash-i-16ca26c8-prod ~/.ccm/repository/4.5.2/resources/cassandra/conf $ ccm node4 nodetool netstats

Mode: DECOMMISSIONED

Not sending any streams.

Read Repair Statistics:

Attempted: 0

Mismatch (Blocking): 0

Mismatch (Background): 0

Pool Name                    Active   Pending      Completed

Commands                        n/a         0           4431

Responses                       n/a         0         280491

 

root@wash-i-16ca26c8-prod ~/.ccm/repository/4.5.2/resources/cassandra/conf $ ccm node4 nodetool removenode

 

 

If you recollect Oracle node deletion, we first deconfigure CRS and then delete the node; decommission-then-remove follows the same pattern.

-Thanks

GEEK DBA

Cassandra for Oracle DBA’s Part 5 – High Availability

High availability is key for any distributed system, and it includes load balancing. In high availability one can think of active-active or active-passive nodes; Cassandra and Oracle RAC both fall into the first category, active-active. But Cassandra has important architectural differences from Oracle RAC:

  • Cassandra does not use shared storage; data is distributed across nodes, a classic example of distributed storage processing.
  • Cassandra does not use a master/slave model, which avoids communicating with a master node for every operation at both the database and cluster levels.
  • Cassandra does not use a private interconnect; node communication happens over the common network using the gossip protocol.
  • Cassandra does not heartbeat every node against every other node in the cluster, which avoids the kind of network congestion seen in Oracle RAC.

Consider the cluster network in Oracle RAC, where each node pings every other node in the cluster; that is why RAC requires a low-latency network.


But with Cassandra, every node is configured to know about a seed node rather than all nodes in the cluster. Consider this example: it reduces the footprint of cluster network usage while still allowing effective communication.

In the cluster, A and B are configured as seed nodes and C, D, and E as non-seed nodes (in each node's cassandra.yaml). So C and D are configured to reach A, and E is configured to reach B, for gossip information, i.e. node availability information.

When the cluster starts, A and B come up and see each other, while C, D, and E reach the seed nodes; once they warm up, every node knows about every other node.

Cassandra uses the gossip protocol to contact nodes at regular intervals and exchange node availability information.

Gossip is a peer-to-peer communication protocol in which nodes periodically exchange state information about themselves and about other nodes they know about. The gossip process runs every second and exchanges state messages with up to three other nodes in the cluster. The nodes exchange information about themselves and about the other nodes that they have gossiped about, so all nodes quickly learn about all other nodes in the cluster. A gossip message has a version associated with it, so that during a gossip exchange, older information is overwritten with the most current state for a particular node.

To prevent problems in gossip communications, use the same list of seed nodes for all nodes in a cluster. This is most critical the first time a node starts up. By default, a node remembers other nodes it has gossiped with between subsequent restarts. The seed node designation has no purpose other than bootstrapping the gossip process for new nodes joining the cluster. Seed nodes are not a single point of failure, nor do they have any other special purpose in cluster operations beyond the bootstrapping of nodes.
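The version-based overwrite described above can be sketched as a toy Python exchange. This is an illustration of the idea only, not Cassandra's actual implementation; the node names and state values are made up for the example:

```python
# Toy sketch of gossip-style state exchange: each node keeps a versioned
# view of every node's state; when views are merged, the higher version wins.

def merge_views(local, remote):
    """Merge a remote node's view into the local one; newer versions win."""
    for node, (version, state) in remote.items():
        if node not in local or local[node][0] < version:
            local[node] = (version, state)
    return local

# Node A holds stale information about C; node B gossips a newer version.
view_a = {"A": (5, "UP"), "C": (1, "UP")}
view_b = {"B": (3, "UP"), "C": (4, "DOWN")}

merge_views(view_a, view_b)
print(view_a["C"])  # (4, 'DOWN') -- A's older information about C is overwritten
```

After the merge, A also learns about B, which it had never contacted directly; this is how every node quickly learns about every other node.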

In my CCM (developer mode) Node configurations the nodes 1,2,3 configured as seed nodes

root@wash-i-16ca26c8-prod ~/.ccm/geek_cluster/node1/resources/cassandra/conf $ grep seed cassandra.yaml

seed_provider:

  - seeds: 127.0.0.1,127.0.0.2,127.0.0.3

And for node1, nodetool gossipinfo shows the heartbeat counters it currently holds for each node; these increment with every gossip round.

root@wash-i-16ca26c8-prod /scripts/db_reconfiguration $ ccm node1 nodetool gossipinfo | grep heartbeat

  heartbeat:122773

  heartbeat:122775

  heartbeat:122764

  heartbeat:122776

Each gossip round updates the heartbeat information, and the failure detector (Cassandra's simplified phi accrual failure detection, which uses the constant 0.434, roughly 1/ln(10)) compares it against the default setting phi_convict_threshold = 8.

So on every gossip round the failure detector recalculates phi from the time since the last recorded heartbeat; when phi exceeds the convict threshold, the node is marked down. For example, with a mean heartbeat interval of about one second, 8 / 0.434 ≈ 18.4, meaning a silent node is marked down after roughly 18.4 seconds.
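The back-of-envelope arithmetic above can be sketched in Python. The formula here is the simplified one discussed above (phi proportional to time since the last heartbeat divided by the mean interval); the real implementation tracks a sliding window of arrival intervals:

```python
# Sketch of Cassandra's simplified phi accrual failure detection,
# assuming phi = 0.434 * seconds_since_last_heartbeat / mean_interval.
PHI_FACTOR = 0.434            # approximately 1 / ln(10)
PHI_CONVICT_THRESHOLD = 8     # cassandra.yaml default

def phi(seconds_since_last_heartbeat, mean_interval=1.0):
    return PHI_FACTOR * seconds_since_last_heartbeat / mean_interval

# Seconds of silence before a node is convicted, for a ~1 second gossip interval:
convict_after = PHI_CONVICT_THRESHOLD / PHI_FACTOR
print(round(convict_after, 1))           # 18.4 seconds

print(phi(10) > PHI_CONVICT_THRESHOLD)   # False -- still considered up
print(phi(20) > PHI_CONVICT_THRESHOLD)   # True  -- marked down
```

Raising phi_convict_threshold makes the detector more tolerant of flaky networks at the cost of slower failure detection, which is why a higher value is often suggested for cloud environments.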

In this way, Cassandra ensures the nodes are reachable and available to the cluster and process.

Cassandra for Oracle DBA’s Part 5 : Read and Write Consistency

With the shared-nothing data model, how does a client query get consistent data, and how does a write ensure that it is writing the right data?

The obvious answer is that maintaining consistency in a database involves adhering to the ACID properties.

Cassandra follows atomicity, isolation and durability, but consistency is eventual consistency, also called tunable consistency.

Having said that, data will eventually become consistent among nodes.

What is eventual consistency? 

ANS: The data becomes consistent on demand, or over a period of time.

But does the client need to wait until it becomes consistent?

ANS: No. When a client sends a request for data, the underlying query carries a consistency level, set at the session level or the query level, such as ALL, ANY, ONE, QUORUM etc.

For example (note that the older inline USING CONSISTENCY clause was removed in CQL 3; in cqlsh the consistency level is now set for the session):

cqlsh> CONSISTENCY QUORUM;   -- or ALL | ONE | TWO | THREE | LOCAL_QUORUM

cqlsh> select * from Portfolios.PortfolioDemo;
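QUORUM in the levels above means a majority of the replicas. A quick sketch of the arithmetic, using the standard formula QUORUM = floor(RF / 2) + 1, shows why quorum reads plus quorum writes give strongly consistent results:

```python
# Quorum arithmetic for tunable consistency.
# A read is guaranteed to see the latest write when
# nodes_read + nodes_written > replication_factor.

def quorum(rf):
    """Majority of replicas for a given replication factor."""
    return rf // 2 + 1

for rf in (3, 5):
    q = quorum(rf)
    print(rf, q, q + q > rf)   # RF=3 -> quorum 2; RF=5 -> quorum 3; both overlap
```

Because any two quorums must overlap in at least one replica, a QUORUM read always touches at least one replica that holds the latest QUORUM write.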

When a read happens with a replication factor set, the coordinator that receives the client request sends requests to the nodes containing the partition range of the data, and those nodes send their data (or a digest of it) back to the coordinator. The coordinator then merges the responses, checks the digest it received from each node, and verifies that the data is the latest. If any node's data does not match the digest, that node's data is repaired in the background, and so it eventually becomes consistent.
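The digest comparison above can be sketched in Python. This is a simplified illustration of the idea, not Cassandra's actual wire format; the row contents and hash choice are assumptions for the example:

```python
import hashlib

# Toy sketch of the coordinator's digest check: one replica returns the
# full row, the others return only digests; a mismatching digest means
# that replica is stale and would be read-repaired.

def digest(row):
    """Deterministic digest of a row's contents."""
    return hashlib.md5(repr(sorted(row.items())).encode()).hexdigest()

replica_full = {"id": 1, "balance": 500, "ts": 1700000002}   # full data response
replica_digests = [
    digest({"id": 1, "balance": 500, "ts": 1700000002}),     # up-to-date replica
    digest({"id": 1, "balance": 300, "ts": 1700000001}),     # stale replica
]

expected = digest(replica_full)
stale = [i for i, d in enumerate(replica_digests) if d != expected]
print(stale)   # [1] -- replica 1 disagrees and would be read-repaired
```

Sending digests instead of full rows from all but one replica keeps the extra read traffic cheap while still detecting stale copies.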

I do not want to duplicate effort rewriting this material, since the DataStax documentation has very good pages on the read and write paths in Cassandra, covering all scenarios.

https://docs.datastax.com/en/cassandra_win/2.2/cassandra/dml/dmlClientRequestsReadExp.html

https://docs.datastax.com/en/cassandra_win/2.2/cassandra/dml/dmlClientRequestsWrite.html

So with consistency levels in place, digest matching and online read repair, Cassandra ensures that consistency of the retrieved data is achieved eventually, tuned to your requirement.

In the next post: high availability compared with Oracle RAC, cluster interconnect, node evictions etc.

12c Upgrade Gotchas

The following are various gotchas during or after an upgrade to 12c, collected from various sources, with a few from my own environment.

1. Change of the resource_limit parameter to TRUE by default

   Impact: Resource Manager is invoked automatically and enforces resource limits on sessions; users may receive CPU/IO restriction messages.

2. Global temporary table statistics are session level

   Impact: Earlier versions of Oracle use shared statistics for global temporary tables, whereas in 12c statistics on global temporary tables are session level by default, which means other sessions cannot see them.

3. HISTGRM$ added column

   Impact: In 12.1, the optimizer group added a column to HISTGRM$, which is a clustered table. On upgrade from 11.2 to 12.1, this column, ep_repeat_count, is added as a default-value column with a NOT NULL constraint. Since HISTGRM$ can have millions of rows, adding a default-value column requires updating every existing row, which can slow down the upgrade significantly. Apply patch 19333670 on 12.1.0.2 to address this.

4. Dropping a column with a NULL default value can cause corruption (bug 17325413)

5. Dynamic statistics performance problems

   Impact: Dynamic statistics has been enhanced in Oracle 12c, but this can lead to longer parse times. Automatic dynamic statistics is used more often in 12c, which can increase parse time in more cases than before.

6. A large WRI$_OPTSTAT_HISTHEAD_HISTORY table can make the upgrade take much longer; purge or reorganize the table before upgrading.

7. Wrong results in GROUP BY with an outer join for a bind variable; as a workaround:

   alter system set "_optimizer_aggr_groupby_elim"=false scope=both;

   alter system set "_optimizer_reduce_groupby_key"=false scope=both;

8. In-Memory Database: do not enable it unless you have a license.

9. DBA_REGISTRY may also report CATALOG and CATPROC in "UPGRADED" status.

    Solution: run utlrp.sql

10. SQL Plan Directives and dynamic sampling can cause higher CPU utilisation and also fill up the result cache.

11. Oracle 12c has deprecated the sec_case_sensitive_logon parameter. Switching to case-sensitive passwords could break your application.

12. If you have client/JDBC versions below 10g, ensure you update SQLNET.ALLOWED_LOGON_VERSION_SERVER in the sqlnet.ora on the database host to 10, 9 or 8.

     This is because in earlier releases the default was 8, whereas in 12c it is 11, which means only 11g and later clients can connect to the database. By setting it to 10, 9 or 8 you allow older client versions to connect.
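As a sketch, the entry in the server-side sqlnet.ora would look like the following; the value 10 is just an example, and the right minimum depends on the oldest client you must support:

# sqlnet.ora on the database server (example value, adjust to your oldest client)
SQLNET.ALLOWED_LOGON_VERSION_SERVER = 10

Remember that lowering this value weakens the minimum authentication protocol the server accepts, so raise it back once all old clients are upgraded.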

13. Be aware that by default in 12c, auditing runs in mixed mode, which means you can combine traditional auditing with unified auditing. There is one gotcha

    to be aware of:

  – if the database has been upgraded, no unified policy is enabled;

  – if the database has been newly created, two unified policies are enabled: ORA_SECURECONFIG and ORA_LOGON_FAILURES.

Thanks

GEEK DBA