Have got a question from one of fellow DBA that how much PGA the session will use or needed.
Well to answer this, first we need to know what PGA looks like and what it used for?
The follow shows the components of a PGA, the little space used for Stack, Env Variables and Private SQL area.
Image Courtesy: Oracle Architecture Diagram
Where the most of the utilisation will be SORT AREA, HASH AREA
SORT AREA : is used when the session performing a sorting of data and use approximately
HASH AREA: is used when execution plan contains hash join and stores the hashed data and once data is matched or processed the final data put into buffer cache (db file scattered read) but stores with in PGA for Direct Path Reads.
Bit Map Merge area:- When using bitmap keys and merging of data uses.
To further dig into , the PGA setting is not hard limit, it can grow and utilize all free memory at OS if needed by oracle sessions. Until 12c you cannot limit the pga memory size.
For example if you have given PGA_AGGREGATE_TARGET =5g , pga can still grow more than 5gb accumulating of all sessions not by single session.
In dedicated architecture where each session will have its own PGA space and put the private data like variables , sorting, hashed data etc. However, although the PGA can grow much more than your setting each individual session can have only 200MB of PGA due to hidden parameter _pga_max_size which default to 200MB or 5% of pga_aggregate_target parameter. So that means if a session needs more than 200MB it has to spill to disk to do sorting.
How to find how much each of this area used by your query ,
select inst_id,sid,sql_id,operation_type,sum(actual_mem_used)/1024/1024,sum(work_area_size)/1024/1024, sum(tempseg_size)/1024/1024 from gv$sql_workarea_active group by inst_id,sid,sql_id,operation_type;
So far we seen the PGA dynamics and allocations of memory, but how to know and estimating how much sorting/hash need of PGA Memory. As of my understanding, this is two fold,
Sort : Once the sort area is full with in pga it will spill to temp space , so ideally with gather plan statistics hint we can find out the temporary space need
Hash: Basically a full table scans, ideally used for hash joins of two tables, find the two tables that join with hash you are looking for find the avg_row_len * num of rows should give you approximate hash utilization.
For example consider a simple query below, which does a full table scan of each table O,T,I.
The plan shows three tables having full tables doing hash join to finalize the data after equal predicates.
Lets look at the calculation tables (here I done some estimations to prove avg_row_len can be used but cases may vary, and its not hard fact rule it will be matched)
Estimation Calculation, to do estimation I have taken the exact row length and rows from dba_tables and the number of rows returned from query. I have added few columns manually to do calculation, the query may not return below.
Lets look at v$sql_workarea_active to see how much expected and actual pga / workarea used for this query.
As you see above, the Expected_MB column shows the expected pga utilization for this query which matches to our Estimation table above.
Happy Reading
Geek DBA
Follow Me!!!