Oracle RAC: Node evictions & 11gR2 node eviction means restart of cluster stack not reboot of node

Cluster integrity and cluster membership are governed by ocssd (the Oracle Cluster Synchronization Services daemon), which monitors the nodes using two communication channels:

- Private Interconnect aka Network Heartbeat
- Voting Disk based communication aka Disk Heartbeat

Network heartbeat:-

Each node in the cluster is “pinged” every second

Nodes must respond within the css_misscount time (defaults to 30 seconds)
– Reducing the css_misscount time is generally not supported
Network heartbeat failures will lead to node evictions
CSSD-log:
[date / time] [CSSD][1111902528]
clssnmPollingThread: node mynodename (5) at 75% heartbeat fatal, removal in 6.7 sec
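
The countdown in a log line like this can be approximated with a small model. This is only a sketch and not Oracle's implementation: the function name `heartbeat_status` and its return shape are hypothetical, and the only value taken from the text is the 30 second default for css_misscount.

```python
def heartbeat_status(seconds_since_last_heartbeat, misscount=30.0):
    """Return (percent_of_misscount_used, seconds_until_removal, evict) for one peer.

    Hypothetical helper for illustration; CSSD's real bookkeeping is more involved.
    """
    if misscount <= 0:
        raise ValueError("misscount must be positive")
    elapsed = max(0.0, seconds_since_last_heartbeat)
    percent = min(100.0, 100.0 * elapsed / misscount)
    remaining = max(0.0, misscount - elapsed)
    # The peer becomes an eviction candidate once misscount has fully elapsed.
    return percent, remaining, elapsed >= misscount


# A peer that has been silent for 23.3 s with the default misscount of 30 s.
percent, remaining, evict = heartbeat_status(23.3)
print(f"{percent:.0f}% of misscount used, removal in {remaining:.1f} sec, evict={evict}")
```

With the default misscount, a peer stays in the cluster until 30 seconds pass without a network heartbeat; the warnings in the CSSD log simply report how much of that budget is already used.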

Disk Heartbeat:-

Each node in the cluster “pings” (r/w) the Voting Disk(s) every second

Nodes must receive a response within the (long / short) diskTimeout time
– IF I/O errors indicate clear accessibility problems → timeout is irrelevant
Disk heartbeat failures will lead to node evictions
CSSD-log: …
[CSSD] [1115699552] >TRACE: clssnmReadDskHeartbeat:node(2) is down. rcfg(1) wrtcnt(1) LATS(63436584) Disk lastSeqNo(1)
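
The disk heartbeat rule can be sketched the same way. The function below is a simplified illustration, not Oracle's code: the name `disk_heartbeat_ok`, the default values of 200 and 27 seconds, and the idea that the short timeout applies while the cluster is reconfiguring are all assumptions that are not stated in the text above.

```python
def disk_heartbeat_ok(seconds_since_last_io, io_errors, reconfiguring,
                      long_disk_timeout=200.0, short_disk_timeout=27.0):
    """Decide whether one voting disk still counts as accessible for a node.

    All defaults are assumed values for illustration only.
    """
    if io_errors:
        # Clear accessibility problem: the timeout is irrelevant.
        return False
    # Assumption: the short timeout is used during reconfiguration,
    # the long timeout during normal operation.
    timeout = short_disk_timeout if reconfiguring else long_disk_timeout
    return seconds_since_last_io < timeout


print(disk_heartbeat_ok(5.0, io_errors=False, reconfiguring=False))
print(disk_heartbeat_ok(5.0, io_errors=True, reconfiguring=False))
```

The point of the sketch is the ordering of the checks: an explicit I/O error fails the disk immediately, while a slow disk is only failed once the applicable timeout has passed.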

So far we have seen that network and disk heartbeat failures can lead to node eviction. In extreme cases, a hung server, a stuck ocssd, or a kill request from a cluster resource can also get a node evicted.

Why should nodes be evicted?

Evicting (fencing) nodes is a preventive measure (it’s a good thing)!

Nodes are evicted to prevent consequences of a split brain:
– Shared data must not be written by independently operating nodes
– The easiest way to prevent this is to forcibly remove a node from the cluster

How are nodes evicted? – STONITH
Once it is determined that a node needs to be evicted:

A “kill request” is sent to the respective node(s)
– Using all (remaining) communication channels
A node (CSSD) is requested to “kill itself”, which is “STONITH-like”
– Classic “STONITH” (Shoot The Other Node In The Head) foresees that a remote node kills the node to be evicted

EXAMPLE: Voting Disk Failure

In a 2-node cluster, the node with the lowest node number should survive
In an n-node cluster, the biggest sub-cluster should survive (votes based)
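
The survival rule above can be written as a short selection function. This is a sketch under stated assumptions: sub-clusters are modelled as sets of node numbers, the name `choose_survivors` is hypothetical, and breaking a size tie in an n-node cluster by the lowest node number is an assumption that extends the 2-node rule.

```python
def choose_survivors(sub_clusters):
    """Pick the sub-cluster that stays up.

    The biggest sub-cluster wins; on a tie, the one containing the lowest
    node number wins (assumed tie-break, generalised from the 2-node case).
    """
    if not sub_clusters or any(len(nodes) == 0 for nodes in sub_clusters):
        raise ValueError("need at least one non-empty sub-cluster")
    return max(sub_clusters, key=lambda nodes: (len(nodes), -min(nodes)))


print(choose_survivors([{1}, {2}]))        # 2-node cluster split in half
print(choose_survivors([{1}, {2, 3}]))     # 3-node cluster, node 1 isolated
```

In the first call node 1 survives because it has the lowest node number; in the second call nodes 2 and 3 survive because they form the bigger sub-cluster.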

EXAMPLE: Network heartbeat failure

The network heartbeat between nodes has failed
– It is determined which nodes can still talk to each other
– A “kill request” is sent to the node(s) to be evicted
– Using all (remaining) communication channels → Voting Disk(s)
A node is requested to “kill itself”; executor: typically CSSD
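
The step "it is determined which nodes can still talk to each other" can be pictured as grouping nodes by their working interconnect links. The sketch below treats this as a plain connected-components search, which is a simplification and not how CSSD is documented to work; `find_sub_clusters` and its arguments are hypothetical names.

```python
def find_sub_clusters(nodes, working_links):
    """Group node numbers into sub-clusters joined by working interconnect links.

    working_links is an iterable of (node_a, node_b) pairs that can still
    exchange network heartbeats; every node in a pair must appear in nodes.
    """
    neighbours = {n: set() for n in nodes}
    for a, b in working_links:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen, groups = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        # Depth-first walk over everything reachable from this node.
        group, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbours[node] - group)
        seen |= group
        groups.append(group)
    return groups


# Node 1 has lost its interconnect; nodes 2 and 3 still see each other.
print(find_sub_clusters([1, 2, 3], [(2, 3)]))
```

Once the groups are known, the kill request goes to the nodes outside the surviving group, and because the interconnect is down the voting disks are the channel that remains.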

EXAMPLE: What if CSSD is stuck or server itself is not responding?

A node is requested to “kill itself”

BUT CSSD is “stuck” or “sick” (does not execute) – e.g.:

– CSSD failed for some reason
– CSSD is not scheduled within a certain margin

OCSSDMONITOR (was: oprocd) will take over and execute the kill

EXAMPLE: A cluster member (RAC instance) can request to kill another member (RAC instance)

A cluster member (RAC instance) can request that another member be killed in order to protect data integrity, for example when the failing instance has not written its control file progress record properly. In that case ocssd tries to kill that member and, if that is not possible, tries to evict the node.

11gR2 Changes –> Important: in 11gR2, fencing (eviction) no longer has to mean a reboot.

Until Oracle Clusterware 11.2.0.2, fencing (eviction) meant “re-boot”
With Oracle Clusterware 11.2.0.2, re-boots will be seen less, because:
– Re-boots affect applications that might run on a node, but are not protected
– Customer requirement: prevent a reboot, just stop the cluster – implemented...

How does this work?

With Oracle Clusterware 11.2.0.2, instead of fast re-booting the node, a graceful shutdown of the cluster stack is attempted

It starts with a failure – e.g. network heartbeat or interconnect failure

Then IO issuing processes are killed; it is made sure that no IO process remains
     – For a RAC DB mainly the log writer and the database writer are of concern

Once all IO issuing processes are killed, remaining processes are stopped
     – IF the check for a successful kill of the IO processes fails → reboot

Once all remaining processes are stopped, the stack stops itself with a “restart flag”

OHASD will finally attempt to restart the stack after the graceful shutdown

Exceptions to the above:-

IF the check for a successful kill of the IO processes fails → reboot

IF CSSD gets killed during the operation → reboot

IF cssdmonitor (oprocd replacement) is not scheduled → reboot

IF the stack cannot be shut down in “short_disk_timeout”-seconds → reboot
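
The graceful shutdown and its exceptions amount to a short decision procedure. The sketch below only restates the rules listed above; the class `EvictionState`, the function `eviction_outcome`, and the 27 second value used for short_disk_timeout are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class EvictionState:
    io_processes_killed: bool         # check for a successful kill of IO processes passed
    cssd_alive: bool                  # CSSD survived the operation
    cssdmonitor_scheduled: bool       # cssdmonitor was scheduled in time
    shutdown_seconds: float           # time the stack needed to shut down
    short_disk_timeout: float = 27.0  # assumed value, in seconds


def eviction_outcome(state):
    """Return 'reboot' if any listed exception applies, else a stack restart."""
    if not state.io_processes_killed:
        return "reboot"
    if not state.cssd_alive:
        return "reboot"
    if not state.cssdmonitor_scheduled:
        return "reboot"
    if state.shutdown_seconds > state.short_disk_timeout:
        return "reboot"
    # All checks passed: the stack stops with a restart flag and OHASD restarts it.
    return "restart cluster stack"


print(eviction_outcome(EvictionState(True, True, True, shutdown_seconds=10.0)))
print(eviction_outcome(EvictionState(False, True, True, shutdown_seconds=10.0)))
```

Only when every check passes does the node avoid a reboot, which is why reboots are seen less often with 11.2.0.2 but do not disappear.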

Umesh

June 6, 2013 at 12:18 pm

Very useful information. I have also listed Top 4 reasons for Node reboot or node Eviction at http://www.dbas-oracle.com/2013/06/Top-4-Reasons-Node-Reboot-Node-Eviction-in-Real-Application-Cluster-RAC-Environment.html.

Raj

June 24, 2013 at 1:56 am

Bro!! –

Simple & wonderfull!!.. Good work.. added your blog to my list..

Thanks Geek DBA.

Raj.

vijay

July 10, 2013 at 12:58 pm

Hi whatever you publish are really valuable and makes sense.

Thanks for the needful information and it means a lot.

I hope it continues and rocks

Santosh

August 1, 2016 at 11:40 am

Hi
I have a cluster with 3 nodes node1,node2,node3. Suppose if a node eviction happens then how can i get the information that which node got evicted?
Eg. Suppose if i am logged into node1 and node2 got evicted then can i get the info that which node got evicted from the node1 itself or i have to log in to the all the machines and i have to check which node or nodes got evicted.

please help
Thank you
