A typical day starts like this.
1) The application team complains about slowness in their job.
2) The DBA starts looking into it.
There were no clues or details from the app team, such as which particular SQL they were running; the only report was that the database's response time was a bit slow, so the app team reported slowness.
This time I chose an event-based troubleshooting approach using AWR, to see whether I could drill down to the issue.
First, let's take a closer look at the top wait classes in the database at the moment.
set lines 1000
set pages 1000
select wait_class_id, wait_class, count(*) cnt
from dba_hist_active_sess_history
where snap_id between 5205 and 5208
group by wait_class_id, wait_class
order by 3;

WAIT_CLASS_ID WAIT_CLASS CNT
------------- ------------------ ----------
3875070507 Concurrency 3
3290255840 Configuration 6
3386400367 Commit 14
2000153315 Network 15
4217450380 Application 19
4108307767 System I/O 39
1893977003 Other 136
1740759767 User I/O 302
3871361733 Cluster 786
Note: replace the snap IDs with your own.
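If you don't know which snap IDs cover the problem window, you can look them up in DBA_HIST_SNAPSHOT. A sketch (the date range below is just an example; substitute your own):

```sql
-- Find the AWR snap IDs that cover the problem window.
-- The time range here is a made-up example.
select snap_id, begin_interval_time, end_interval_time
  from dba_hist_snapshot
 where begin_interval_time >= to_date('2015-06-01 09:00', 'YYYY-MM-DD HH24:MI')
   and end_interval_time   <= to_date('2015-06-01 11:00', 'YYYY-MM-DD HH24:MI')
 order by snap_id;
```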
BTW, this is a RAC system, and the top wait class is Cluster (786 samples).
Let me see which events were recorded for the Cluster wait class.
SQL> select event_id, event, count(*) cnt from dba_hist_active_sess_history
where snap_id between 5205 and 5208 and wait_class_id=3871361733
group by event_id, event
order by 3;

EVENT_ID EVENT CNT
---------- ------------------------------ ----------
2277737081 gc current grant busy 1
2685450749 gc current grant 2-way 1
3201690383 gc cr grant 2-way 1
2701629120 gc current block busy 3
111015833 gc current block 2-way 5
3151901526 gc cr block lost 9
512320954 gc cr request 9
3905407295 gc current request 21
1912606394 gc buffer busy acquire 332
661121159 gc cr multi block request 404
Well, the top event is gc cr multi block request, which is roughly the RAC equivalent of db file scattered read on a single-instance database. The other one, gc buffer busy acquire, means sessions are waiting for a buffer that another session on the same instance has already requested from a remote instance, so it relates to contention on buffers in the cache.
(I am not going to explain those events in detail here, since that would fairly require another post.)
Let's drill down further to find what is causing these waits, i.e., the SQL involved.
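As a side note, ASH also records which object the session was waiting on (CURRENT_OBJ#), so a query like the following can point at the hot segment behind these gc waits. This is a sketch using my snap IDs and event_ids; adjust them to yours:

```sql
-- Which objects are the gc waits concentrated on?
-- current_obj# maps to dba_objects.object_id; samples where
-- no object was recorded (e.g. -1) simply won't join.
select o.owner, o.object_name, o.object_type, count(*) cnt
  from dba_hist_active_sess_history h
  join dba_objects o
    on o.object_id = h.current_obj#
 where h.snap_id between 5205 and 5208
   and h.event_id in (1912606394, 661121159)
 group by o.owner, o.object_name, o.object_type
 order by 4;
```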
SQL> select sql_id, count(*) cnt from dba_hist_active_sess_history
where snap_id between 5205 and 5208
and event_id in ( 1912606394,661121159 )
group by sql_id
having count(*)> 100
order by 2;

SQL_ID CNT
------------- ----------
1221u5awzqrb3 288
Note: I filtered on the event_ids of my top two events and on count(*) > 100; change those if you need to. An event may appear only once in ASH yet still account for a large share of the Cluster class wait time, so it depends.
It's time to look at the statement for the SQL_ID obtained above.
SQL> select sql_text from v$sql where sql_id='1221u5awzqrb3';
SQL_TEXT
--------------------------------------------------------------------------------
Select max(sale_dt) from trade
Finally, I have the statement causing the highest waits in my database. The rest of the story is the usual one: go back to the application team and ask whether this statement is related to the job that was reported slow. Resolving it would be another blog story; the above is simply meant to identify the statement behind the top events for a given range of snap IDs, rather than running a full AWR or ASH report when you are after something as simple as this.
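One caveat: v$sql only shows cursors still in the shared pool, so if the statement has aged out by the time you look, the AWR copy in DBA_HIST_SQLTEXT is the fallback (using the same SQL_ID found above):

```sql
-- Fallback when the cursor is no longer in v$sql:
select sql_text
  from dba_hist_sqltext
 where sql_id = '1221u5awzqrb3';
```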
Hope this helps!!!!
Note: there are many ways to get this information from AWR, some of them easier than the above. For example, Tanel Poder's Session Snapper also provides run-time information from ASH.
Follow Me!!!