In the Previous Posts , to troubleshoot the problematic queries/long running session issues, we have sorted out two approaches, Here we discuss the Method 1
Method 1. Session Wait Event Approach:- In this approach we will try to identify from the problematic session what its waiting for , why its waiting, what can be done to resolve the problem.
The most common causes are explained here, however there may be many cases. The events classified here are can impact the instance level hence database performance can be degraded.
SQL> select sid,username,event,blocking_session,sql_id,prev_sql_id from v$session where username='username';
or
SQL> select sid,username,event,blocking_session,sql_id,prev_sql_id from v$session where sid='sid';
or
SQL> select session_id,username,event,blocking_session,sql_id,prev_sql_id from v$active_session_history where session_id='';
Concentrate on the Event column and the following events can represent you the problem. Events are mainly classified as Administration, I/O, Concurrency etc. And these notifies what the database is experiencing in terms of waits. As such the waits more the problem in that particular area is more.
For Example, you may have see the following output with above queries
<Screenshot>
As you see above the event column shows most of the times the following events and sessions are waiting for that event. Each wait event represents the issue it has and followed by solution.
1. Latch Cache Buffer chains:- Problem:- As such more buffers (headers) are been read and trying to modify and have a long list of chain for each buffer and repeatable reads (more nested loops), the session get waited on doing latch cache buffer chains. Solution:- Check the sql_id and generate execution plan, compare the execution plan with previous plan from history, and you may see more nested loops. This is due to in correct statistics which lead to more nested loops instead of hash joins. Collect the statistics and you rerun the statement, flush the buffer pool will help. Read more about Join methods Read here. |
2. Library Cache Lock/Pin (Cursor mutex s/x):- Problem 1:- More locks in shared pool for the objects and reread by multiple sessions and lot of invalidations. Every time any execution happen shared pool structure should be locked and pinned until the execution completed again you do execution you have repin and lock, rather between many executions you can do only one time lock/pin that will help to reduce this issue. Problem 2:- Invalidations (statistics collection can invalidate objects in shared pool and hence to repin and relock again, so it can one time issue but if its do more often then its a problem) Solution:- 1. flush the shared pool, 2. kill the session that hold Lock/Pin X in event column. 3. Where the long term you can reduce this issue by setting session_cached_cursors to a reasonable number and cursor_space_for_time to true. |
3. Shared Pool Latch/Latch free:- Problem: Inadequate shared pool size or no dynamic resize operations happening (v$sga_resize_ops), although the automatic memory management is set, sometime you will need to do a manual resizes since MMAN is busy or hanged sometimes. (from my experience I did a lot of times this stuff) Solution:- Flush shared pool and adjust pools 1. alter system flush shared pool; Long Terms if you see this is frequently happening: |
4. Res: mgr quantum:- Problem 1:- Sessions burning CPU or consuming more cpu or load increased, which the other sessions are waiting for CPU. Solution:- Read here and and identify the CPU consuming processes and take appropriate actions. |
5. DB File Sequential Read:- An index reads usually faster but due to index stale statistics, wrong index sessions can be doing more index scans. Problems:- Use of an unselective index , Fragmented Indexes , High I/O on a particular disk or mount point, Bad application design, Index reads performance can be affected by slow I/O subsystem and/or poor database files layout, which result in a higher waits on this wait event. Solutions:- Picked Wrong Index:- It may be the case that table column is involved in two indexes and optimizer has picked the wrong index in this case. 1. SQL> select sid,username,event,sql_id,row_wait_obj# from v$session where sid=''; Note:- the row_wait_obj# tell which object_id that session is wait on , with this we know which index by checking in dba_objects with object_id and get name of index the session is waiting. 2. Identify the Plan for the query, SQL> select * from table(dbms_xplan.display_cursor('sql_id')); 3. Identify (if any previous run history of the query) SQL> select * from table(dbms_xplan.display_awr('sql_id')); 4. Compare the Both Plans and observer change in the index name in highlighted sections. 5. The reason would be , the statistics of index/table would have picked up wrong index (index clustering factor) SQL> exec dbms_stats.gather_table_stats('OWNER','TABLE_NAME',CASCADE=>TRUE,no_invalidate=>false); 6. Rebuild the index that you want (probably the index that has picked up earlier) or collect statistics of the table with cascade true option would correct the issue. SQL> alter index indexname rebuild online; Picked Right Index only, but still slow:- It may be the case that is has picked up right index but still running slow 1. Identify the current plan and identify which index is it SQL> select sid,username,event,sql_id,row_wait_obj# from v$session where sid=''; Note:- the row_wait_obj# tell which object_id that session is wait on , with this we know which index by checking in dba_objects with object_id and get name of index the session is waiting. 2. Identify the Plan for the query, and check the index in the plan SQL> select * from table(dbms_xplan.display_cursor('sql_id')); 3. Identify (if any previous run history of the query) SQL> select * from table(dbms_xplan.display_awr('sql_id')); 4. Compare the Both Plans and observe that it is same index that picked all times but this time slow. 5. The reason would be , the statistics of index/table would have picked up wrong index (index clustering factor) SQL> exec dbms_stats.gather_table_stats('OWNER','TABLE_NAME',CASCADE=>TRUE,no_invalidate=>false); 6. Rebuild the index that you want (probably the index that has picked up earlier) or collect statistics of the table with cascade true option would correct the issue. SQL> alter index indexname rebuild online; |
6. DB File Scattered Read:- Usually full tablescans Problem: May be that no index there for the table, a Index is present but optimizer thinks that scanning via index is costly and hence do a full tablescan. But why then optimizer decide and how it decides, it decides by means of statistics available for the table and indexes . If statistics were stale and volume of data that is retrieving by query is close to number of rows in table lead to full table scans.
|
7. Direct Path Reads |
8. Log File Sync/Log File Parallel Write |
9. Log Buffer Space |
10. Buffer Free Waits |
11. Buffer Busy Waits |
12. Enq: Tx Row Lock Contention |
13. Enq: TM Contention |
Follow Me!!!