MongoDB for Oracle DBA’s Part 1 – Features or Keywords Comparison

Welcome to the first post on MongoDB.

Given my experience with RDBMS over the past decade, it is worth exploring the current NoSQL database technologies and keeping ourselves updated. This not only broadens our skills but also helps us understand where to use them, since most NoSQL databases are not a fit for everything; they are customised and fit for purpose. MongoDB, for example, has these limitations:

  • Cannot be used for transactional, commerce, or ERP applications, since it cannot fully meet ACID requirements (only partially available).
  • MongoDB itself states that complex queries, joins, and aggregations should be avoided as much as possible.
  • It suits denormalized data well, but denormalization inherently introduces duplicate data.
  • The familiar SQL language is of no use; you have to learn to write queries in a syntax full of braces and parentheses.
  • Schemaless means there is no defined structure: anything with any type of data can be loaded, so if the application requires specific characteristics, such as datatype checks, it must enforce them itself (see the sketch after this list).
  • The core engine changes as versions evolve; there are subtle but significant differences between 2.0, 2.6, and 3.2, and one has to keep up with all of them.
  • Relies heavily on the OS page cache and flush mechanism; you have no direct control over it, although you can ask the OS to flush more often.
  • Cannot be called a fully highly available database, hence the BASE definition (Basic Availability + Eventual Consistency).
  • Basic Availability: at any point in time, if you lose a shard (with all its replica set members), the data on that shard is unavailable to the business until the shard comes back.
  • Eventual Consistency: there is no guarantee that the data shown is consistent. A typical example (thanks to Tim Hall): in web commerce, the same inventory stock can be shown in two different sessions even though an order for that stock has already been placed; after a while your order is cancelled or refunded.
  • Locking: MongoDB's locking has changed drastically across versions, from a database-level lock initially, to a collection-level lock, to a document-level lock as of 3.2.
  • Locking: in older versions, readers block writers (shared lock).
  • MVCC: no undo until 3.2, unless you use the WiredTiger engine.
  • Storage: double sizing due to write-ahead (pre)allocation of data files.
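To illustrate the schemaless point, here is a minimal mongo shell sketch (the collection and field names are made up for illustration): the same collection happily accepts documents of completely different shapes, so any datatype checks have to live in your application code.

// Three inserts with three different "schemas"; none of them errors out
db.people.insert({ name: "alice", age: 30 })               // age is a number
db.people.insert({ name: "bob", age: "thirty" })           // age is now a string
db.people.insert({ nickname: "carol", hobbies: [ 1, 2 ] }) // entirely different fields
db.people.find().forEach(printjson)                        // all three documents coexist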

The above is a small set of what I have understood from small implementations and from reading the documentation. The following is a list of common keywords that we know as Oracle DBAs and what they are called in MongoDB.

Oracle | MongoDB | Description
SGA | None | MongoDB relies completely on the OS page cache and flushing mechanisms; there is no SGA concept.
Buffer pool | OS-level page cache | Controlled by the OS.
Shared pool | Query cache | MongoDB parses statements through three query frameworks (query, aggregation, sort); queries pass through these engines, and the parse cache is controlled by the internalQueryCacheSize parameter.
Dictionary | .ns file loaded into memory | Metadata is stored in a memory map in physical RAM using OS-level mmap; when mongod starts, this metadata is loaded into memory and the physical structures are mapped through the namespace file.
spfile/pfile | /etc/mongod.conf | Contains the data directory location, port, sharding, replication, and security settings.
Storage engine | mmapv1, wiredtiger, inmemory | This looks adopted from MySQL's storage-engine terminology (MyISAM, InnoDB, the Memory storage engine). MongoDB used mmapv1 as the default up to 3.0; from 3.2, wiredtiger is the default engine (bringing document-level concurrency and MVCC-style reads/writes), and the third option is the in-memory engine.

Database server | mongod | mongod is the database instance process.
Database client | mongo | The mongo shell, like SQL*Plus.
Database listener | Router (mongos) | In a sharded cluster, the mongos instance receives all connections and routes requests.
Dictionary | Config servers (mongod) | A mongod instance can play the role of a database node, a config server, or a shard.
redo log | journal file | A file named j._0 is created in the journal folder under the storage directory, and every write operation is written to it before dirty buffers are flushed, for durability. The journal is used for recovery: if mongod crashes without flushing dirty buffers to disk, the journal is replayed to reapply the operations, much like our instance recovery.
database | database | A database is associated with a namespace file at the physical level (default size 16 MB), named like dbname.ns; it holds collection and index details, is mapped into memory, and can accommodate roughly 12,000 collections (each collection plus its default _id index consumes two of the available namespaces).
schema | None |
datafile | datafile | Data files carry the database name; for example, if the database is mydb, the data files are named mydb.0, mydb.1, mydb.2, and so on. The first file starts at 64 MB and each new file doubles in size, so the files grow as your database grows.
alert log | /var/log/mongodb/mongod.log | The MongoDB log.
Datatypes | BSON types | See the MongoDB documentation for the list of BSON types.
table | collection | A table is a collection, associated with the .ns file that is loaded into memory.
row | document |
Joins | Embedded documents or linking |
shutdown | use admin; db.shutdownServer() |
startup | service mongod start |
expdp | mongodump --out <directory> |
impdp | mongorestore <directory> |
v$sysstat | db.stats() |
v$lock | db.currentOp() | Shows lock information as well as the current operations on that node.
v$session | db.currentOp() |
kill session | db.killOp(opid) |
v$osstat | db.serverStatus() |
dba_tables | db.collection.stats() |
Create table | Implicit creation | When you insert the first row, the collection is created implicitly; no definition is required.
Drop table | db.tablename.drop() |
Add column | db.collection.update() with $set | To add a field to every document:
db.users.update(
  { },
  { $set: { join_date: new Date() } },
  { multi: true }
)

Drop column | db.collection.update() with $unset |
db.users.update(
  { },
  { $unset: { join_date: "" } },
  { multi: true }
)

select | db.tablename.find() |
Select only a few columns | Projection in find() |
db.users.find(
  { status: "A" },
  { user_id: 1, status: 1, _id: 0 }
)
Select with a where clause | Query filter in find() |
db.users.find(
  { status: "A" }
)

insert | db.collection.insert() | An insert eventually creates the table:
db.users.insert(
  { user_id: "bcd001", age: 45, status: "A" }
)
update | db.collection.update() | Update status to "C" where age > 25:
db.users.update(
  { age: { $gt: 25 } },
  { $set: { status: "C" } },
  { multi: true }
)
delete | db.tablename.remove() |
Select with join | $lookup | As of version 3.2, you can use the $lookup aggregation stage.
Group by | db.collection.group() or aggregation | Grouping can be done with db.collection.group() or with the aggregation framework, db.collection.aggregate() using $group.

count | db.tablename.count() | Counts the documents, i.e. rows.
OEM | MMS | GUI monitoring tool (MongoDB Management Service).
RAC | Sharding | MongoDB uses a sharded cluster; each node owns its own partition of the data according to the shard key (range, hash, or tag).
ASM | None | Uses the OS filesystem and page cache.
Disk groups | Disks | Uses plain disks.
ASM mirroring | Replica sets |
Clustering | Sharding with replica sets | Manages basic availability: in the event of a node failure, even with replica sets the data is only partially available, not completely, hence "basic availability" rather than high availability.
Private network | Not required |
Add node | sh.addShard("localhost:portnum") |
Listener | Router (mongos) | In a cluster environment, the mongos instance redirects your query to the specific shard.
Client | mongos | Clients connect to the mongos instance, much like a TNS entry.
(no equivalent) | Config servers | Config servers hold the data-distribution (shard key) information and route requests to the right shard; they synchronise this metadata frequently.
Master-slave | Replica set (primary/secondary) | MongoDB replication is primary-secondary within a replica set; in a sharded cluster, clients connect through the mongos router.
Redo log threads | oplog | In RAC we use redo threads to track instance-specific changes; in MongoDB, high availability is handled at node level, where the primary and its replica set members poll each other, and the oplog is used to replicate all changes within that node's replica set only.
Datafile resize | db.repairDatabase() | Rebuilds and reduces the data files for that database.
Create user | db.addUser() |
db.addUser({
  user: "geek",
  pwd: "password",
  roles: [ "readWrite", "dbAdmin" ]
})
dba_users | db.system.users.find() |
Create database | use dbname | Eventually creates a new database when data is first written.
Drop database | db.dropDatabase() |
v$version | db.version() |
Voting / heartbeat | Arbiter | Voting within a replica set uses an arbiter process to elect between the primary and secondary in a two-node replica set; remember this happens within the replica set, not across shards.
Optimizer | Plan cache | The plan cache resides in memory and holds the plans used to process queries.
Statistics | Metadata in the extents | Just as a datafile header in Oracle contains the extent information, collection statistics are maintained in the extents recorded in the .ns file.
Listener | Router (mongos) | The mongos instance works like a listener: it receives your request, reads it, gets the distribution keys from the config servers, and sends the request to the right nodes.
Port 1521 | Port 27017 | The default port is 27017.
sqlplus | mongo | The mongo shell.
SQL tracing | db.setProfilingLevel(level, slowms) | Profiler levels:
  • 0 - profiler off
  • 1 - log slow queries
  • 2 - log all queries
Cluster status | sh.status() |
Replication status | rs.status() |
Disk striping | sh.enableSharding("students") | In ASM, striping is done by default across the disk group; in MongoDB, striping (sharding) is enabled first at the database level and then at the collection level, where you specify the basis of striping, i.e. which column to shard on:

sh.shardCollection("students.testData", { "x": "hashed" })

startup | mongod -f <config filename> | Starts the mongod instance that acts as the database node.
Limits | See the list below | Heaps of limitations, and they vary with version; see the reference: https://docs.mongodb.com/manual/reference/limits/

  • 16 MB is the maximum row (document) size.
  • A collection (table) can have at most 64 indexes.
  • A compound index can contain at most 31 fields (columns).
  • Database names are limited to 64 characters and are case sensitive.
  • Embedded documents can be nested to at most 100 levels.
  • The database size on a single instance is limited to 64 TB with journaling (the archive-logging equivalent) and 128 TB without it (on Linux).
  • Views cannot be renamed.
  • Normal write operations are limited to 1,000 per batch unless you use the mongo shell or bulk() operations.
  • An index key must not exceed 1,024 bytes.
  • A replica set can have up to 12 nodes, of which at most 7 can vote.
  • Striping, aka sharding, of an existing table to distribute its data to other nodes is only possible if the collection is no larger than 256 GB.
  • Group/sort queries fail if they exceed more than 10% of memory.
  • Maximum connections are 20,000 (hard-coded).
  • The namespace file (the metadata file for a database's collections) cannot hold more than 24,000 namespaces; in simple terms, each database cannot contain more than 24,000 objects, i.e. collections plus indexes.
Flush SQL ID | planCacheClear |
db.runCommand(
  { planCacheClear: "orders" }
)
Explain plan | db.tablename.find(query).explain() | You can also explain a specific query with its where condition; see the documentation.
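For example, a quick sketch reusing the users collection from the rows above:

// Show the winning plan for a filtered query (on 3.0+ you can pass "executionStats" for timings)
db.users.find({ status: "A" }).explain()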

The next post is on Installing & Creating a Standalone MongoDB Database.

-Thanks

Geek DBA

AWS RDS – Storing CloudWatch metrics for historical analysis

 

Update, 17 Nov: AWS announced that CloudWatch metrics are now retained for up to 15 months; read the announcement below. Oh, the AWS world, it is ever changing...

https://aws.amazon.com/about-aws/whats-new/2016/11/cloudwatch-extends-metrics-retention-and-new-user-interface/

As we are all aware, AWS CloudWatch metrics are held for only two weeks in the console; beyond that you cannot analyse what happened, unlike with our other monitoring tools.

For those who are new to RDS or the AWS world, CloudWatch is the AWS monitoring tool, and its metrics are like host monitoring metrics, e.g. CPU, memory, storage, I/O latencies, etc.

If you depend largely on CloudWatch metrics and have no other monitoring system, I would suggest preserving the metrics by downloading them into a flat file using the AWS CLI.

The following script does this:

  • Sets your region
  • Sets the date to one day ago
  • Takes your database instance identifier as input
  • Pulls datapoints at a 10-minute period (--period 600) from 00:00 to 23:59 of the previous day, using aws cloudwatch get-metric-statistics --namespace AWS/RDS, into a flat file
  • Trims the file as required using sed
  • Loads it into my Oracle database using sqlldr

Click here to download the script and related files

The script covers all the important metrics, such as CPUUtilization, FreeableMemory, FreeStorageSpace, WriteLatency, DiskQueueDepth, etc.

A sample snippet of the script:

#!/bin/ksh

        export AWS_DEFAULT_REGION=ap-southeast-2

        echo `date`

        # Metric date is the previous day, in ISO format so the CloudWatch timestamps are valid
        METRICDATE=`date +"%Y-%m-%d" --date="1 day ago"`

        echo $METRICDATE

        NOW=$(date +"%Y-%m-%d")

        export WASHLOGFILE=logs/WASHPROD_$NOW.log

#####################################################################

## Get New Endpoint Name

#####################################################################

        EPNAME=`aws rds describe-db-instances --db-instance-identifier $1 | grep -i "Address" | cut -d '"' -f4` >> $WASHLOGFILE 2>&1

        echo $EPNAME

####################################################################

## Get metrics for given identifier

####################################################################

        DBNAME=`aws rds describe-db-instances --db-instance-identifier $1 | grep -i "DBName" | cut -d '"' -f4` >> $WASHLOGFILE 2>&1

        echo $DBNAME

aws cloudwatch get-metric-statistics --namespace AWS/RDS --metric-name CPUUtilization --start-time ${METRICDATE}T00:00:00Z --end-time ${METRICDATE}T23:59:00Z --statistics Maximum --period 600 --dimensions Name=DBInstanceIdentifier,Value=$1 --output text >> CPUUtilization.txt

# Reshape the tab-separated CLI output into CSV for SQL*Loader

sed -i 's/DATAPOINTS/CAMPUS-PROD CPU/g' CPUUtilization.txt

sed -i '1d' CPUUtilization.txt

sed -i 's/Z//g' CPUUtilization.txt

sed -i 's/\t/,/g' CPUUtilization.txt

sed -i 's/T/ /g' CPUUtilization.txt

####################################################################

## Load them into a table

####################################################################

sqlldr dbmanager/22J8RGPI@$EPNAME/$DBNAME control=CPUUtilization.ctl

########################################################################

Sample: sqlldr control file

########################################################################

LOAD DATA
 INFILE CPUUtilization.txt
 APPEND INTO TABLE CloudWatch_Metrics_History
 FIELDS TERMINATED BY ','
 (instance_name,
  metric_type,
  value,
  collection_timestamp date 'YYYY-MM-DD HH24:MI:SS',
  metric_unit)

Usage: process_cloudwatchmetrics.sh <instancename>

############################################################################

Installation Instructions

############################################################################

Download the zip file here.

1. Create the table in your Oracle database, or the database of your choice. (If it is not Oracle, you will need a method other than sqlldr to load the data.)

      CREATE TABLE "DBMANAGER"."CLOUDWATCH_METRICS_HISTORY" 

        ("INSTANCE_NAME" VARCHAR2(50 BYTE), 

"METRIC_TYPE" VARCHAR2(30 BYTE), 

"VALUE" NUMBER(15,2), 

"COLLECTION_TIMESTAMP" DATE, 

"METRIC_UNIT" VARCHAR2(20 BYTE)  ;

2. Run the script CloudWatch_Metric_History.sh <instance_name>
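Once a few days of data are loaded, you can analyse history with plain SQL. A minimal sketch (the metric_type value depends on the label your sed step writes, so adjust the filter to match):

select instance_name,
       trunc(collection_timestamp) collection_day,
       max(value) max_value
  from dbmanager.cloudwatch_metrics_history
 where metric_type like '%CPU%'
 group by instance_name, trunc(collection_timestamp)
 order by 2;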

Hope this helps.

Thanks

AWS RDS – Start & Stop Feature Available

Folks,

Finally, the long-awaited stop/start database feature is available with Amazon RDS. Until now you could only modify or terminate an instance, but not stop and start it.

Check out your console and you will see options to stop and start.

It seems not to be officially announced yet, but it is available in the console. 🙂
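If you prefer scripting it, the matching AWS CLI calls should look like the following once your CLI version includes the feature (a sketch; the instance identifier is made up):

aws rds stop-db-instance --db-instance-identifier mydbinstance
aws rds start-db-instance --db-instance-identifier mydbinstance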

Enjoy

-Thanks

 

Independent Australia Oracle Users Group OTN DAYS 2016 – Conferences

Hello All,

IAOUG is back in Australia for its second consecutive year of OTN Days, with Arup, Tim Hall, and Connor around. Two conferences are planned, so please check them out if you are around Sydney, the Gold Coast, or anywhere in Australia.

http://www.iaoug.com/home.html

Sydney, October 31st 2016

At Courtyard Marriott North Ryde

From 8:30 am until 5:30 pm

Buy Tickets Here

Check Agenda Here

 

Gold Coast, November 2nd 2016

At Marriott Surfers Paradise Resort

From 8:30 am until 5:30 pm

Buy Tickets Here

Check Agenda Here

 

Discount codes, hurry up:

For all, 50% discount: AIOUGFRIEND

For students, 90% discount: IAOUGEDU

Speakers

  • Arup Nanda (Oracle ACE Director - Database - Author - Oracle Magazine)
  • Bjoern Rost (Oracle ACE Director - Database)
  • Gustavo Gonzalez (Oracle ACE Director - Applications - Author )
  • Biju Thomas (Oracle ACE Director - Database/Development - Author)
  • Dr. Tim Hall (Oracle ACE Director - Database/Development -The man behind www.oracle-base.com )
  • Dr. Frank Munz (Oracle ACE Director - Middleware/Cloud - Author)
  • Satyendra Kumar (Oracle ACE Director - Applications - Author)
  • Chris Muir (Oracle Product Manager- Development/ADF)
  • Connor McDonald  (Oracle - Database/Development - The man behind asktom.com)
  • Debra Lilley (Oracle ACE Director - Applications)
  • Francisco Munoz Alvarez (Oracle ACE Director - Database - Author)
  • Jim Czuprynski (Oracle ACE Director - Database - Author)
  • Ronald Bradford (Oracle ACE Director - MySQL - Author)

Also, register as a member for free before 31st December 2016 and avail discounts on future conferences. Don't miss it.

http://www.iaoug.com/home.html

 

See you there in Sydney!!!

-Thanks

 

Quick Quiz : Find the best way to find how much time a query has taken to execute

Find the best way to determine how long a query (sql_id) ran, whether in the past or currently.

ASH samples every active session once per second into v$active_session_history, which covers roughly the past two hours; in rotation, one sample out of every ten is persisted and becomes visible in dba_hist_active_sess_history.

From v$active_session_history

-- One ASH sample per active second, so count(*) seconds / 60 / 60 = hours
select to_char(sample_time,'DD-MON-YY HH24') time, sql_id, sql_plan_hash_value, sql_exec_id,
       round(count(*)/60/60, 2) as hours
  from v$active_session_history
 where sql_id = 'fvv19kqad3vj3'
 group by to_char(sample_time,'DD-MON-YY HH24'), sql_id, sql_plan_hash_value, sql_exec_id
 order by 4;

 

From dba_hist_active_sess_history (if the query executed more than two hours ago):

-- Each persisted sample represents 10 seconds, hence count(*) * 10
select to_char(sample_time,'DD-MON-YY HH24') time, sql_id, sql_plan_hash_value, sql_exec_id,
       round((count(*) * 10)/60/60, 2) as hours
  from dba_hist_active_sess_history
 where sql_id = 'fvv19kqad3vj3'
 group by to_char(sample_time,'DD-MON-YY HH24'), sql_id, sql_plan_hash_value, sql_exec_id
 order by 4;

Sample output: (screenshot)

Cassandra for Oracle DBA’s Part 18 : Change of Replication Factor & Repair Nodes

Just as in ASM, where a rebalance operation takes place across the disks whenever a disk is added, in Cassandra a rebalance must take place when the replication factor of a keyspace is changed. Below, I have changed the replication factor for a keyspace; after that, you should run a repair on each node so that the data is redistributed and the key ranges are adjusted.

 

root@wash-i-16ca26c8-prod ~/.ccm/repository/4.5.2/demos/portfolio_manager/bin $ ccm node1 cqlsh

Connected to geek_cluster at 127.0.0.1:9160.

[cqlsh 4.1.1 | Cassandra 2.0.10.71 | DSE 4.5.2 | CQL spec 3.1.1 | Thrift protocol 19.39.0]

Use HELP for help.

cqlsh> ALTER KEYSPACE "Excalibur" WITH REPLICATION ={ 'class' : 'SimpleStrategy', 'replication_factor' : 3 };

root@wash-i-16ca26c8-prod ~/.ccm/repository/4.5.2/demos/portfolio_manager/bin $ ccm node1 nodetool repair

[2016-07-11 01:47:05,302] Nothing to repair for keyspace 'system'

[2016-07-11 01:47:05,310] Starting repair command #1, repairing 3 ranges for keyspace PortfolioDemo

[2016-07-11 01:47:12,591] Repair session 5edc4330-4709-11e6-a73e-e7c99fc0dbe2 for range (-3074457345618258603,1415355439304195653] finished

[2016-07-11 01:47:12,591] Repair session 60ae9230-4709-11e6-a73e-e7c99fc0dbe2 for range (3074457345618258602,-9223372036854775808] finished

[2016-07-11 01:47:12,717] Repair session 621ec400-4709-11e6-a73e-e7c99fc0dbe2 for range (1415355439304195653,3074457345618258602] finished

[2016-07-11 01:47:12,717] Repair command #1 finished

[2016-07-11 01:47:12,725] Starting repair command #2, repairing 4 ranges for keyspace dse_system

[2016-07-11 01:47:14,173] Repair session 6346f050-4709-11e6-a73e-e7c99fc0dbe2 for range (-3074457345618258603,1415355439304195653] finished

[2016-07-11 01:47:14,174] Repair session 638076e0-4709-11e6-a73e-e7c99fc0dbe2 for range (-9223372036854775808,-3074457345618258603] finished

[2016-07-11 01:47:14,175] Repair session 63b828b0-4709-11e6-a73e-e7c99fc0dbe2 for range (3074457345618258602,-9223372036854775808] finished

[2016-07-11 01:47:14,175] Repair session 63ed4270-4709-11e6-a73e-e7c99fc0dbe2 for range (1415355439304195653,3074457345618258602] finished

[2016-07-11 01:47:14,175] Repair command #2 finished

[2016-07-11 01:47:14,182] Starting repair command #3, repairing 2 ranges for keyspace system_traces

[2016-07-11 01:47:14,529] Repair session 64254260-4709-11e6-a73e-e7c99fc0dbe2 for range (3074457345618258602,-9223372036854775808] finished

[2016-07-11 01:47:14,534] Repair session 64401d60-4709-11e6-a73e-e7c99fc0dbe2 for range (1415355439304195653,3074457345618258602] finished

[2016-07-11 01:47:14,534] Repair command #3 finished

 

root@wash-i-16ca26c8-prod ~/.ccm/repository/4.5.2/demos/portfolio_manager/bin $ ccm node2 nodetool repair

 

[2016-07-11 01:47:58,676] Nothing to repair for keyspace 'system'

[2016-07-11 01:47:58,682] Starting repair command #1, repairing 3 ranges for keyspace PortfolioDemo

[2016-07-11 01:48:03,446] Repair session 7eac2ef0-4709-11e6-b499-e7c99fc0dbe2 for range (-9223372036854775808,-3074457345618258603] finished

[2016-07-11 01:48:03,446] Repair session 800be600-4709-11e6-b499-e7c99fc0dbe2 for range (3074457345618258602,-9223372036854775808] finished

[2016-07-11 01:48:03,453] Repair session 81086e70-4709-11e6-b499-e7c99fc0dbe2 for range (1415355439304195653,3074457345618258602] finished

[2016-07-11 01:48:03,453] Repair command #1 finished

[2016-07-11 01:48:03,463] Starting repair command #2, repairing 4 ranges for keyspace dse_system

[2016-07-11 01:48:04,869] Repair session 8184f170-4709-11e6-b499-e7c99fc0dbe2 for range (-3074457345618258603,1415355439304195653] finished

[2016-07-11 01:48:04,869] Repair session 81bc5520-4709-11e6-b499-e7c99fc0dbe2 for range (-9223372036854775808,-3074457345618258603] finished

[2016-07-11 01:48:04,870] Repair session 81f1e410-4709-11e6-b499-e7c99fc0dbe2 for range (3074457345618258602,-9223372036854775808] finished

[2016-07-11 01:48:04,871] Repair session 82230630-4709-11e6-b499-e7c99fc0dbe2 for range (1415355439304195653,3074457345618258602] finished

[2016-07-11 01:48:04,871] Repair command #2 finished

[2016-07-11 01:48:04,878] Starting repair command #3, repairing 2 ranges for keyspace system_traces

[2016-07-11 01:48:05,287] Repair session 825cdae0-4709-11e6-b499-e7c99fc0dbe2 for range (-9223372036854775808,-3074457345618258603] finished

[2016-07-11 01:48:05,292] Repair session 82812bc0-4709-11e6-b499-e7c99fc0dbe2 for range (3074457345618258602,-9223372036854775808] finished

[2016-07-11 01:48:05,292] Repair command #3 finished

 

root@wash-i-16ca26c8-prod ~/.ccm/repository/4.5.2/demos/portfolio_manager/bin $ ccm node3 nodetool repair

 

[2016-07-11 01:48:12,500] Nothing to repair for keyspace 'system'

[2016-07-11 01:48:12,506] Starting repair command #1, repairing 3 ranges for keyspace PortfolioDemo

[2016-07-11 01:48:16,205] Repair session 86e967e0-4709-11e6-bd37-e7c99fc0dbe2 for range (-3074457345618258603,1415355439304195653] finished

[2016-07-11 01:48:16,205] Repair session 87c98eb0-4709-11e6-bd37-e7c99fc0dbe2 for range (-9223372036854775808,-3074457345618258603] finished

[2016-07-11 01:48:16,211] Repair session 88bc5320-4709-11e6-bd37-e7c99fc0dbe2 for range (1415355439304195653,3074457345618258602] finished

[2016-07-11 01:48:16,211] Repair command #1 finished

[2016-07-11 01:48:16,219] Starting repair command #2, repairing 4 ranges for keyspace dse_system

[2016-07-11 01:48:17,591] Repair session 891f5ab0-4709-11e6-bd37-e7c99fc0dbe2 for range (-3074457345618258603,1415355439304195653] finished

[2016-07-11 01:48:17,591] Repair session 89511910-4709-11e6-bd37-e7c99fc0dbe2 for range (-9223372036854775808,-3074457345618258603] finished

[2016-07-11 01:48:17,592] Repair session 89891900-4709-11e6-bd37-e7c99fc0dbe2 for range (3074457345618258602,-9223372036854775808] finished

[2016-07-11 01:48:17,593] Repair session 89bd9680-4709-11e6-bd37-e7c99fc0dbe2 for range (1415355439304195653,3074457345618258602] finished

[2016-07-11 01:48:17,593] Repair command #2 finished

[2016-07-11 01:48:17,600] Starting repair command #3, repairing 2 ranges for keyspace system_traces

[2016-07-11 01:48:17,885] Repair session 89f21400-4709-11e6-bd37-e7c99fc0dbe2 for range (-3074457345618258603,1415355439304195653] finished

[2016-07-11 01:48:17,889] Repair session 8a05c310-4709-11e6-bd37-e7c99fc0dbe2 for range (1415355439304195653,3074457345618258602] finished

[2016-07-11 01:48:17,889] Repair command #3 finished

 

root@wash-i-16ca26c8-prod ~/.ccm/repository/4.5.2/demos/portfolio_manager/bin $ ccm node4 nodetool repair

[2016-07-11 01:48:25,045] Nothing to repair for keyspace 'system'

[2016-07-11 01:48:25,051] Starting repair command #1, repairing 3 ranges for keyspace PortfolioDemo

[2016-07-11 01:48:27,804] Repair session 8e63c600-4709-11e6-8071-e7c99fc0dbe2 for range (-3074457345618258603,1415355439304195653] finished

[2016-07-11 01:48:27,804] Repair session 8ee81130-4709-11e6-8071-e7c99fc0dbe2 for range (-9223372036854775808,-3074457345618258603] finished

[2016-07-11 01:48:27,809] Repair session 8f92d020-4709-11e6-8071-e7c99fc0dbe2 for range (3074457345618258602,-9223372036854775808] finished

[2016-07-11 01:48:27,809] Repair command #1 finished

[2016-07-11 01:48:27,817] Starting repair command #2, repairing 4 ranges for keyspace dse_system

[2016-07-11 01:48:29,062] Repair session 90091190-4709-11e6-8071-e7c99fc0dbe2 for range (-3074457345618258603,1415355439304195653] finished

[2016-07-11 01:48:29,062] Repair session 90397060-4709-11e6-8071-e7c99fc0dbe2 for range (-9223372036854775808,-3074457345618258603] finished

[2016-07-11 01:48:29,063] Repair session 906896b0-4709-11e6-8071-e7c99fc0dbe2 for range (3074457345618258602,-9223372036854775808] finished

[2016-07-11 01:48:29,068] Repair session 90957310-4709-11e6-8071-e7c99fc0dbe2 for range (1415355439304195653,3074457345618258602] finished

[2016-07-11 01:48:29,069] Repair command #2 finished

[2016-07-11 01:48:29,080] Starting repair command #3, repairing 2 ranges for keyspace system_traces

[2016-07-11 01:48:29,367] Repair session 90c9c980-4709-11e6-8071-e7c99fc0dbe2 for range (-3074457345618258603,1415355439304195653] finished

[2016-07-11 01:48:29,372] Repair session 90e05ec0-4709-11e6-8071-e7c99fc0dbe2 for range (-9223372036854775808,-3074457345618258603] finished

[2016-07-11 01:48:29,372] Repair command #3 finished

-Thanks

Geek DBA

Cassandra for Oracle DBA’s Part 14: Tombstones – Undo type structures

Undo-like pointers: tombstones.

Cassandra uses a log-structured storage engine. Because of this, deletes do not remove rows and columns immediately or in place. Instead, Cassandra writes a special marker, called a tombstone, indicating that a row, column, or range of columns was deleted. These tombstones are kept for at least the period of time defined by the per-table gc_grace_seconds setting; only then can a tombstone be permanently discarded by compaction.

This scheme allows for very fast deletes (and writes in general), but it’s not free: aside from the obvious RAM/disk overhead of tombstones, you might have to pay a certain price when reading data back if you haven’t modelled your data well.

gc_grace_seconds is, in effect, a kind of undo retention; a per-table example follows.
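A minimal CQL sketch (the keyspace and table names are made up for illustration):

-- Treat gc_grace_seconds like undo retention: a tombstone must survive at least this long
-- (the default is 864000 seconds, i.e. 10 days) so repairs can propagate the delete first.
ALTER TABLE portfolio.users WITH gc_grace_seconds = 864000;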

 

tombstone_warn_threshold
    (Default: 1000) The maximum number of tombstones a query can scan before a warning is logged.

tombstone_failure_threshold
    (Default: 100000) The maximum number of tombstones a query can scan before the query is aborted.

Cassandra for Oracle DBA’s Part 17: Backup & Recovery

Backups in Cassandra are maintained at the node level, so a backup taken on one node does not contain the other nodes' data; you must back up each node. Cassandra maintains backups as snapshots, meaning the entire set of SSTables at disk level is hard-linked into a separate folder named with a snapshot ID, as below.

root@wash-i-16ca26c8-prod ~/.ccm/repository/4.5.2/resources/cassandra/bin $ ccm node1 nodetool snapshot PortfolioDemo

Requested creating snapshot for: PortfolioDemo

Snapshot directory: 1468236726789

root@wash-i-16ca26c8-prod ~/.ccm/geek_cluster/node2/data0/PortfolioDemo/Portfolios/snapshots $ ls -ltr

total 0


root@wash-i-16ca26c8-prod ~/.ccm/geek_cluster/node1/data0/PortfolioDemo/Portfolios/snapshots/1468236726789 $ ls -ltr

total 1072

-rw-r--r-- 2 root root  13456 Jul 11 12:02 PortfolioDemo-Portfolios-jb-5-Filter.db

-rw-r--r-- 2 root root 220000 Jul 11 12:02 PortfolioDemo-Portfolios-jb-5-Index.db

-rw-r--r-- 2 root root 576097 Jul 11 12:02 PortfolioDemo-Portfolios-jb-5-Data.db

-rw-r--r-- 2 root root    243 Jul 11 12:02 PortfolioDemo-Portfolios-jb-5-CompressionInfo.db

-rw-r--r-- 2 root root   4406 Jul 11 12:02 PortfolioDemo-Portfolios-jb-5-Statistics.db

-rw-r--r-- 2 root root     79 Jul 11 12:02 PortfolioDemo-Portfolios-jb-5-TOC.txt

-rw-r--r-- 2 root root   1652 Jul 11 12:02 PortfolioDemo-Portfolios-jb-5-Summary.db

-rw-r--r-- 2 root root   3696 Jul 11 19:58 PortfolioDemo-Portfolios-jb-6-Filter.db

-rw-r--r-- 2 root root  64548 Jul 11 19:58 PortfolioDemo-Portfolios-jb-6-Index.db

-rw-r--r-- 2 root root 169778 Jul 11 19:58 PortfolioDemo-Portfolios-jb-6-Data.db

-rw-r--r-- 2 root root     99 Jul 11 19:58 PortfolioDemo-Portfolios-jb-6-CompressionInfo.db

-rw-r--r-- 2 root root   4390 Jul 11 19:58 PortfolioDemo-Portfolios-jb-6-Statistics.db

-rw-r--r-- 2 root root     79 Jul 11 19:58 PortfolioDemo-Portfolios-jb-6-TOC.txt

-rw-r--r-- 2 root root    532 Jul 11 19:58 PortfolioDemo-Portfolios-jb-6-Summary.db

root@wash-i-16ca26c8-prod ~/.ccm/geek_cluster/node1/data0/PortfolioDemo/Portfolios/snapshots/1468236726789 $

 

Size of the snapshot for one SSTable:

root@wash-i-16ca26c8-prod ~/.ccm/geek_cluster/node1/data0/PortfolioDemo/Portfolios/snapshots $ du -sh *

1.1M    1468236726789

Size of the data for one SSTable:

root@wash-i-16ca26c8-prod ~/.ccm/geek_cluster/node1/data0/PortfolioDemo/Portfolios $ du -h

1.1M    .

root@wash-i-16ca26c8-prod ~/.ccm/geek_cluster/node1/data0/PortfolioDemo/Portfolios/snapshots $ ls -ltr

total 4

drwxr-xr-x 2 root root 4096 Jul 11 21:32 1468236726789

 

Deleting Snapshots

root@wash-i-16ca26c8-prod ~/.ccm/geek_cluster/node1/data0/PortfolioDemo/Portfolios $ ccm node1 nodetool clearsnapshot PortfolioDemo

Requested clearing snapshot for: PortfolioDemo

root@wash-i-16ca26c8-prod ~/.ccm/geek_cluster/node1/data0/PortfolioDemo/Portfolios $

root@wash-i-16ca26c8-prod ~/.ccm/geek_cluster/node1/data0/PortfolioDemo/Portfolios $ ls -ltr

total 1072

-rw-r--r-- 1 root root  13456 Jul 11 12:02 PortfolioDemo-Portfolios-jb-5-Filter.db

-rw-r--r-- 1 root root 220000 Jul 11 12:02 PortfolioDemo-Portfolios-jb-5-Index.db

-rw-r--r-- 1 root root 576097 Jul 11 12:02 PortfolioDemo-Portfolios-jb-5-Data.db

-rw-r--r-- 1 root root    243 Jul 11 12:02 PortfolioDemo-Portfolios-jb-5-CompressionInfo.db

-rw-r--r-- 1 root root   4406 Jul 11 12:02 PortfolioDemo-Portfolios-jb-5-Statistics.db

-rw-r--r-- 1 root root     79 Jul 11 12:02 PortfolioDemo-Portfolios-jb-5-TOC.txt

-rw-r--r-- 1 root root   1652 Jul 11 12:02 PortfolioDemo-Portfolios-jb-5-Summary.db

-rw-r--r-- 1 root root   3696 Jul 11 19:58 PortfolioDemo-Portfolios-jb-6-Filter.db

-rw-r--r-- 1 root root  64548 Jul 11 19:58 PortfolioDemo-Portfolios-jb-6-Index.db

-rw-r--r-- 1 root root 169778 Jul 11 19:58 PortfolioDemo-Portfolios-jb-6-Data.db

-rw-r--r-- 1 root root     99 Jul 11 19:58 PortfolioDemo-Portfolios-jb-6-CompressionInfo.db

-rw-r--r-- 1 root root   4390 Jul 11 19:58 PortfolioDemo-Portfolios-jb-6-Statistics.db

-rw-r--r-- 1 root root     79 Jul 11 19:58 PortfolioDemo-Portfolios-jb-6-TOC.txt

-rw-r--r-- 1 root root    532 Jul 11 19:58 PortfolioDemo-Portfolios-jb-6-Summary.db

root@wash-i-16ca26c8-prod ~/.ccm/geek_cluster/node1/data0/PortfolioDemo/Portfolios $

root@wash-i-16ca26c8-prod ~/.ccm/geek_cluster/node1/data0/PortfolioDemo/Portfolios $ ccm node1 nodetool clearsnapshot PortfolioDemo -t 1468236726789

 

Enabling Incremental Backups

root@wash-i-16ca26c8-prod ~/.ccm/repository/4.5.2/resources/cassandra/conf $ grep increment cassandra.yaml

incremental_backups: false

Change it to true.
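On the recovery side, restoring a snapshot is essentially copying the snapshot SSTables back into place and letting Cassandra load them. A minimal sketch following the ccm layout used above (paths and snapshot ID are from this demo; adjust for your cluster, and stop writes or truncate the table first as appropriate):

# 1. Copy the snapshot SSTables back into the table's data directory
cp ~/.ccm/geek_cluster/node1/data0/PortfolioDemo/Portfolios/snapshots/1468236726789/* \
   ~/.ccm/geek_cluster/node1/data0/PortfolioDemo/Portfolios/

# 2. Load the newly placed SSTables without restarting the node
ccm node1 nodetool refresh PortfolioDemo Portfolios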

-Thanks

Geek DBA

 

Cassandra for Oracle DBA Part 17: Tracing Sessions

Like an Oracle 10046 trace, you can also trace a Cassandra session. It provides the information shown below; the important parts are those showing that data was fetched from different nodes, how many tombstones had to be read, how much came from the cache, and how many reads went to the tables. This is very useful for understanding the data distribution and the time taken to fetch data from distributed nodes.

 

cqlsh>tracing on

Tracing session: ad1bd5e0-47c9-11e6-a439-e7c99fc0dbe2

cqlsh> select * from "PortfolioDemo"."Portfolios" LIMIT 10000;

 activity                                                                                       | timestamp    | source    | source_elapsed

------------------------------------------------------------------------------------------------+--------------+-----------+----------------

                                                                             execute_cql3_query | 10:43:39,968 | 127.0.0.1 |              0

                                Parsing select * from "PortfolioDemo"."Portfolios" LIMIT 10000; | 10:43:39,968 | 127.0.0.1 |            639

                                                                            Preparing statement | 10:43:39,969 | 127.0.0.1 |           1207

                                                                  Sending message to /127.0.0.2 | 10:43:39,971 | 127.0.0.1 |           3151

                                                      Executing single-partition query on users | 10:43:39,971 | 127.0.0.1 |           3455

                                                                   Acquiring sstable references | 10:43:39,971 | 127.0.0.1 |           3488

                                                                    Merging memtable tombstones | 10:43:39,971 | 127.0.0.1 |           3539

                                                                    Key cache hit for sstable 2 | 10:43:39,971 | 127.0.0.1 |           3649

                                                    Seeking to partition beginning in data file | 10:43:39,971 | 127.0.0.1 |           3660

                      Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones | 10:43:39,972 | 127.0.0.1 |           4180

                                                               Message received from /127.0.0.1 | 10:43:39,972 | 127.0.0.2 |             88

                                                     Merging data from memtables and 1 sstables | 10:43:39,972 | 127.0.0.1 |           4195

                                                             Read 1 live and 0 tombstoned cells | 10:43:39,972 | 127.0.0.1 |           4251

                                                                  Sending message to /127.0.0.3 | 10:43:39,974 | 127.0.0.1 |           6523

                                                      Executing single-partition query on users | 10:43:39,974 | 127.0.0.2 |           1567

                                                                   Acquiring sstable references | 10:43:39,974 | 127.0.0.2 |           1653

                                                                    Merging memtable tombstones | 10:43:39,974 | 127.0.0.2 |           1767

                                                                    Key cache hit for sstable 2 | 10:43:39,974 | 127.0.0.2 |           1945

                                                    Seeking to partition beginning in data file | 10:43:39,974 | 127.0.0.2 |           1994

                      Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones | 10:43:39,975 | 127.0.0.2 |           2654

                                                     Merging data from memtables and 1 sstables | 10:43:39,975 | 127.0.0.2 |           2677

 

-Thanks

GeekDBA

Cassandra for Oracle DBA’s Part 16 : Shutting Down Best Practices

While Cassandra is crash-safe, you can make a cleaner shutdown and save some time during startup as follows:

- Make other nodes think this one is down:

  nodetool -h $(hostname) -p 8080 disablegossip

- Wait a few seconds, then cut off anyone from writing to this node:

  nodetool -h $(hostname) -p 8080 disablethrift

- Flush all memtables to disk:

  nodetool -h $(hostname) -p 8080 drain

- Shut it down:

  /etc/init.d/cassandra stop