We got a call from the application team about high CPU usage on one of our database servers, and here is how we hunted it down. A quick look at top on the server showed the following:
load averages: 97.5, 98.8, 100.0;  up 11+22:24:01  01:14:52
410 processes: 276 sleeping, 118 running, 3 zombie, 13 on cpu
CPU states: 0.0% idle, 89.6% user, 10.4% kernel, 0.0% iowait, 0.0% swap
Memory: 32G phys mem, 17G free mem, 20G total swap, 20G free swap

  PID USERNAME LWP PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
24735 oracle     1  44    0 4266M 1834M run      8:13  1.17% oracle
23492 oracle     1  47    0 4266M 1758M sleep    8:59  1.01% oracle
25364 oracle     1  54    0 4266M 1927M sleep    9:55  1.00% oracle
25442 oracle     1  47    0 4267M 1943M run     18:11  1.00% oracle
25474 oracle     1  44    0 4267M 1943M run      9:45  0.99% oracle
22904 oracle     1  35    0 4266M 1723M run     15:15  0.96% oracle
24749 oracle     1  44    0 4266M 1834M run     10:21  0.95% oracle
21442 oracle     1  37    0 4266M 1586M run     17:08  0.93% oracle
24852 oracle     1  46    0 4266M 1854M run     11:18  0.92% oracle
23088 oracle     1  42    0 4266M 1735M run     10:47  0.91% oracle
23871 oracle     1  47    0 4266M 1766M run     16:52  0.89% oracle
21093 oracle     1  44    0 4266M 1383M run      7:53  0.88% oracle
20981 oracle     1  37    0 4266M 1366M run     16:03  0.88% oracle
22158 oracle     1  47    0 4266M 1634M sleep    9:09  0.88% oracle
25718 oracle     1  35    0 4266M 1967M run     18:05  0.86% oracle
From the above we can see that 89.6% of the CPU is being consumed by user processes and 10.4% by the kernel, with 0% idle.
Looking at the COMMAND column, these are all Oracle processes. But this server hosts four databases, so how do we identify which database the CPU usage is coming from?
Just use the capital C switch while running top (so the full command line of each process is displayed) and then P. The output will then look something like this:
CPU states: 0.0% idle, 89.7% user, 10.3% kernel, 0.0% iowait, 0.0% swap
Memory: 32G phys mem, 17G free mem, 20G total swap, 20G free swap

  PID USERNAME LWP PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
22384 oracle     1  49    0 4266M 1686M run     17:26  1.15% oraclePROD (LOCAL=NO)
25762 oracle     1  35    0 4266M 1970M run     12:41  0.94% oraclePROD (LOCAL=NO)
25157 oracle     1  44    0 4266M 1898M sleep   12:32  0.92% oraclePROD (LOCAL=NO)
22709 oracle     1  57    0 4266M 1694M cpu/1    9:44  0.90% oraclePROD (LOCAL=NO)
21527 oracle     1  44    0 4266M 1595M run     11:46  0.87% oraclePROD (LOCAL=NO)
21690 oracle     1  55    0 4266M 1606M run     17:43  0.87% oraclePROD (LOCAL=NO)
21706 oracle     1  55    0 4267M 1611M run     17:13  0.86% oraclePROD (LOCAL=NO)
25732 oracle     1  49    0 4266M 1966M run     16:18  0.86% oraclePROD (LOCAL=NO)
20981 oracle     1  49    0 4266M 1366M run     16:14  0.85% oraclePROD (LOCAL=NO)
21093 oracle     1  49    0 4266M 1383M run      8:03  0.85% oraclePROD (LOCAL=NO)
24427 oracle     1  46    0 4266M 1815M run     10:00  0.84% oraclePROD (LOCAL=NO)
24132 oracle     1  44    0 4266M 1782M sleep   11:44  0.83% oraclePROD (LOCAL=NO)
25430 oracle     1  45    0 4266M 1938M sleep   12:32  0.83% oraclePROD (LOCAL=NO)
22667 oracle     1  44    0 4266M 1691M run     11:05  0.82% oraclePROD (LOCAL=NO)
25317 oracle     1  53    0 4266M 1926M run      9:56  0.80% oraclePROD (LOCAL=NO)
So the processes consuming the most CPU all belong to the PROD database.
But wait: each individual process is well below 2% CPU, so how is so much CPU being burned overall?
Let's look at how many sessions or processes there are; a large number of processes each using 1-2% can easily add up to a fully loaded box.
$ ps -eaf | grep PROD | wc -l
300
300 looks fair; my processes/sessions limit was set at 1000.
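If you want to cross-check the same thing from inside the database, v$resource_limit shows current and high-water utilization against the configured limits. A minimal sketch (nothing here is specific to this incident):

SQL> select resource_name, current_utilization, max_utilization, limit_value
     from v$resource_limit
     where resource_name in ('processes','sessions');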
But what is actually causing it? This is about as far as we can dig at the OS level, so let's go into the database and see what these processes/sessions are doing.
For this, the most straightforward view to use is v$session.
SQL> select sid, serial#, logon_time, status, last_call_et/60, sql_id, event, blocking_session
     from v$session
     where username != 'SYS'
       and last_call_et/60 > 0
     order by logon_time;

Output edited for brevity:

  SID SERIAL# LOGON_TIME STATUS  LAST_CALL_ET/60 SQL_ID          EVENT            BLOCKING_SESSION
----- ------- ---------- ------- --------------- --------------- ---------------- ----------------
 3261     428 31-MAR-13  ACTIVE       166.033333 a049q1sr8wndm   resmgr:cpu quant
 3234       1 31-MAR-13  ACTIVE       166.033333 96qsds85jm6w4   library cache lo
 3240     216 31-MAR-13  ACTIVE       166.033333 a049q1sr8wndm   library cache lo
 3222      10 31-MAR-13  ACTIVE       166.033333 96qsds85jm6w4   library cache: m
 3232     527 31-MAR-13  ACTIVE       165.933333 96qsds85jm6w4   library cache lo
 3213     211 31-MAR-13  ACTIVE       166.033333 96qsds85jm6w4   library cache lo
 3209     147 31-MAR-13  ACTIVE       166.033333 96qsds85jm6w4   resmgr:cpu quant
 3202      71 31-MAR-13  ACTIVE       166.033333 96qsds85jm6w4   resmgr:cpu quant
 3200      10 31-MAR-13  ACTIVE       154.583333 a049q1sr8wndm   library cache: m
 3194       7 31-MAR-13  ACTIVE       166.033333 96qsds85jm6w4   library cache lo
 3184       7 31-MAR-13  ACTIVE       166.033333 96qsds85jm6w4   resmgr:cpu quant
....
 3015     597 31-MAR-13  ACTIVE       166.033333 a049q1sr8wndm   library cache lo
 3008     488 31-MAR-13  ACTIVE       166.033333 a049q1sr8wndm   library cache lo
 3002       9 31-MAR-13  ACTIVE       166.016667 96qsds85jm6w4   library cache: m
 3000      80 31-MAR-13  ACTIVE       166.033333 a049q1sr8wndm   library cache lo
...
 2969      85 31-MAR-13  ACTIVE       79.6333333 a049q1sr8wndm   resmgr:cpu quant
 2960       2 31-MAR-13  ACTIVE       166.033333 a049q1sr8wndm   library cache lo
 2957      38 31-MAR-13  ACTIVE       166.033333 a049q1sr8wndm   library cache lo
 2951      20 31-MAR-13  ACTIVE       166.033333 a049q1sr8wndm   library cache lo
....
 2707     354 31-MAR-13  ACTIVE       163.833333 a049q1sr8wndm   library cache lo
 2684      14 31-MAR-13  ACTIVE       166.016667 96qsds85jm6w4   cursor: pin S
 2683     130 31-MAR-13  ACTIVE       166.016667 96qsds85jm6w4   library cache: m
 2677       7 31-MAR-13  ACTIVE       166.016667 96qsds85jm6w4   library cache lo
 2673     183 31-MAR-13  ACTIVE       166.016667 96qsds85jm6w4   library cache: m
...
 2611      27 31-MAR-13  ACTIVE       166.016667 a049q1sr8wndm   resmgr:cpu quant
 2605      55 31-MAR-13  ACTIVE       166.016667 a049q1sr8wndm   library cache lo
 2604      93 31-MAR-13  ACTIVE       166.016667 96qsds85jm6w4   resmgr:cpu quant
 2597     382 31-MAR-13  ACTIVE       166.016667 a049q1sr8wndm   library cache: m
 2587      25 31-MAR-13  ACTIVE       166.016667 a049q1sr8wndm   resmgr:cpu quant
 2583     139 31-MAR-13  ACTIVE       166.016667 96qsds85jm6w4   library cache lo
 2571      95 31-MAR-13  ACTIVE       166.016667 a049q1sr8wndm   library cache: m
 2578     338 31-MAR-13  ACTIVE       166.016667 96
You can also include the blocking_session and row_wait_obj# columns to see which sessions are doing the blocking and which object they are waiting on.
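With a few hundred rows like this, it also helps to roll the active sessions up by wait event and sql_id so the dominant waits stand out at a glance; a simple sketch against the same v$session view:

SQL> select event, sql_id, count(*)
     from v$session
     where status = 'ACTIVE'
       and username is not null
       and username != 'SYS'
     group by event, sql_id
     order by count(*) desc;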
Voila, we have library cache locks that have been sitting there since 31-Mar. Each of these sessions only hits the CPU in small amounts, but they do so constantly, and that is why we are seeing a constantly high CPU load.
Library cache lock and library cache mutex waits are wait events that indicate sessions/users are waiting for a library cache object to become available.
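If you are curious what the two sql_ids captured above (a049q1sr8wndm and 96qsds85jm6w4) are actually running, v$sqlarea (or v$sql) gives you the statement text; a quick sketch:

SQL> select sql_id, executions, sql_text
     from v$sqlarea
     where sql_id in ('a049q1sr8wndm','96qsds85jm6w4');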
Further, some sessions are showing the wait event resmgr:cpu quantum, which means Resource Manager is throttling them and the session is waiting for a slice of CPU to become available to process its request.
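Since resmgr:cpu quantum only shows up when a resource manager plan is active and limiting CPU, it is worth checking which plan (if any) is currently in force; a quick sketch:

SQL> show parameter resource_manager_plan
SQL> select name, is_top_plan from v$rsrc_plan;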
Identifying which objects are in contention is covered in another post, already written here:
http://db.geeksinsight.com/2013/02/14/resolving-library-cache-mutex-x/
For now, we learned that the application was recycled on 31-Mar, and the corresponding database sessions had been hung since that time and were never cleared.
We cleared/killed those sessions and the CPU usage came down.
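For reference, the cleanup itself was nothing more than generating alter system kill session statements for the stuck sessions from v$session and running them. A rough sketch; the predicates here (logon before today, library cache waits) are assumptions you would adapt to your own case:

SQL> select 'alter system kill session '''||sid||','||serial#||''' immediate;'
     from v$session
     where username is not null
       and username != 'SYS'
       and logon_time < trunc(sysdate)
       and event like 'library cache%';

-- for example, for the first session in the earlier output:
SQL> alter system kill session '3261,428' immediate;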
In many cases, tracking down CPU usage by Oracle is not straightforward and it varies from case to case. Sometimes, for example, you may simply have backups running and driving up the CPU.
Hope this helps,
-Thanks
Geek DBA