Monday, March 15, 2010

GC Buffer Busy Waits in RAC: Finding Hot Blocks:

 

I was reading following article from a this blog by Jeremy Schneider"
link to the original blog (http://www.ardentperf.com/2007/09/12/gc-buffer-busy-waits-in-rac-finding-hot-blocks/)

Using the ASH

I’ll show two different ways here to arrive at the same conclusion. First, we’ll look a the ASH to see what the sampled sessions today were waiting on. Second, we’ll look at the segment statistics captured by the AWR.

First of all we will get the snap_id’s from dba_hist_snapshot table.

col begin_interval_time format a30
col end_interval_time format a30

select snap_id,begin_interval_time, end_interval_time
from dba_hist_snapshot
where to_char(begin_interval_time,'DD-MON-YYYY')='15-MAR-2010'
order by snap_id;

   SNAP_ID BEGIN_INTERVAL_TIME            END_INTERVAL_TIME
---------- ------------------------------ ------------------------------
      4486 15-MAR-10 12.00.57.293 AM      15-MAR-10 01.00.06.447 AM
      4487 15-MAR-10 01.00.06.447 AM      15-MAR-10 02.00.17.979 AM
      4488 15-MAR-10 02.00.17.979 AM      15-MAR-10 03.00.20.419 AM
      4489 15-MAR-10 03.00.20.419 AM      15-MAR-10 04.00.22.574 AM
      4490 15-MAR-10 04.00.22.574 AM      15-MAR-10 05.00.24.538 AM
      4491 15-MAR-10 05.00.24.538 AM      15-MAR-10 06.00.29.219 AM
      4492 15-MAR-10 06.00.29.219 AM      15-MAR-10 07.00.40.423 AM
      4493 15-MAR-10 07.00.40.423 AM      15-MAR-10 08.00.58.573 AM
      4494 15-MAR-10 08.00.58.573 AM      15-MAR-10 09.00.29.703 AM
      4495 15-MAR-10 09.00.29.703 AM      15-MAR-10 10.00.38.566 AM
      4496 15-MAR-10 10.00.38.566 AM      15-MAR-10 11.00.45.651 AM
      4497 15-MAR-10 11.00.45.651 AM      15-MAR-10 12.00.52.478 PM
      4498 15-MAR-10 12.00.52.478 PM      15-MAR-10 01.00.56.997 PM

This is the window I’m going to use; 12:00am to 12:00pm today.

select wait_class_id, wait_class, count(*) cnt
from dba_hist_active_sess_history
where snap_id between 4486 and 4498
group by wait_class_id, wait_class
order by 3;

WAIT_CLASS_ID WAIT_CLASS CNT
------------- ------------------------------ ----------
3290255840 Configuration 169
2000153315 Network 934
4108307767 System I/O 7199
3386400367 Commit 7809
4217450380 Application 12248
3875070507 Concurrency 14754
1893977003 Other 35499
97762
3871361733 Cluster 104810
1740759767 User I/O 121999


You can see that there were a very large number of cluster events recorded in the ASH. Let’s look a little closer.



select event_id, event, count(*) cnt from dba_hist_active_sess_history
where snap_id between 4486 and 4498 and wait_class_id=3871361733
group by event_id, event
order by 3;

EVENT_ID EVENT CNT
---------- ---------------------------------------- ----------
3905407295 gc current request 4
3785617759 gc current block congested 10
2705335821 gc cr block congested 15
512320954 gc cr request 16
3794703642 gc cr grant congested 17
3897775868 gc current multi block request 17
1742950045 gc current retry 18
1445598276 gc cr disk read 148
1457266432 gc current split 229
2685450749 gc current grant 2-way 290
957917679 gc current block lost 579
737661873 gc cr block 2-way 699
2277737081 gc current grant busy 991
3570184881 gc current block 3-way 1190
3151901526 gc cr block lost 1951
111015833 gc current block 2-way 2078
3046984244 gc cr block 3-way 2107
661121159 gc cr multi block request 4092
3201690383 gc cr grant 2-way 4129
1520064534 gc cr block busy 4576
2701629120 gc current block busy 14379
1478861578 gc buffer busy 67275


Notice the huge gap between the number of buffer busy waits and everything else. Other statistics I checked also confirmed that this wait event was the most significant on the cluster. So now we’ve got an event and we know that 67,275 sessions were waiting on it during ASH snapshots between 12:00am and 12:00pm today. Let’s see what SQL these sessions were executing when they got snapped. In fact lets even include the “gc current block busy” events since there was a bit of a gap for them too.



select sql_id, count(*) cnt from dba_hist_active_sess_history
where snap_id between 4486 and 4498
and event_id in (2701629120, 1478861578)
group by sql_id
having count(*)>1000
order by 2;

SQL_ID CNT
------------- ----------
6kk6ydpp3u8xw 1011
2hvs3mpab5j0w 1022
292jxfuggtsqh 1168
3mcxaqffnzgfw 1226
a36pf34c87x7s 1328
4vs8wgvpfm87w 1390
22ggtj4z9ak3a 1574
gsqhbt5a6d4uv 1744
cyt90uk11a22c 2240
39dtqqpr7ygcw 4251
8v3b2m405atgy 42292


Wow – another big leap – 4,000 to 42,000! Clearly there’s one SQL statement which is the primary culprit. What’s the statement?



select sql_text from dba_hist_sqltext where sql_id='8v3b2m405atgy';

SQL_TEXT
---------------------------------------------------------------------------
insert into bigtable(id, version, client, cl_business_id, cl_order_id, desc


Its an insert statement. Any guesses yet about what the problem might be? Well an insert statement could access a whole host of objects (partitions and indexes)… and even more in this case since there are a good number of triggers on this table. Conveniently, the ASH in 10g records what object is being waited on so we can drill down even to that level.



select count(distinct(current_obj#)) from dba_hist_active_sess_history
where snap_id between 4486 and 4498
and event_id=1478861578 and sql_id='8v3b2m405atgy';

COUNT(DISTINCT(CURRENT_OBJ#))
-----------------------------
14

select current_obj#, count(*) cnt from dba_hist_active_sess_history
where snap_id between 4486 and 4498
and event_id=1478861578 and sql_id='8v3b2m405atgy'
group by current_obj#
order by 2;

CURRENT_OBJ# CNT
------------ ----------
3122841 1
3122868 3
3173166 4
3324924 5
3325122 8
3064307 8
-1 10
3064369 331
0 511
3122795 617
3064433 880
3208619 3913
3208620 5411
3208618 22215


Well a trend is emerging. Another very clear outlier – less than a thousand sessions waiting on most objects but the last one is over twenty-two thousand. Let’s have a look at all three of the biggest ones.



select object_id, owner, object_name, subobject_name, object_type from dba_objects
where object_id in (3208618, 3208619, 3208620);

OBJECT_ID OWNER OBJECT_NAME SUBOBJECT_NAME OBJECT_TYPE
---------- ---------- ------------------------------ ------------------------------ -------------------
3208618 JSCHDER BIGTABLE_LOG P_2007_09 TABLE PARTITION
3208619 JSCHDER BIGTABL_LG_X_ID P_2007_09 INDEX PARTITION
3208620 JSCHDER BIGTABL_LG_X_CHANGE_DATE P_2007_09 INDEX PARTITION


Now wait just a moment… this isn’t even the object we’re updating!! Well I’ll spare you the details but one of the triggers logs every change to BIGTABLE with about 7 inserts into this one. It’s all PL/SQL so we get bind variables and everything – it’s just the sheer number of accesses that is causing all the contention.



One further thing we can do is actually see which blocks are getting most contended for – the ASH records this too. (Isn’t the ASH great?)



select current_file#, current_block#, count(*) cnt
from dba_hist_active_sess_history
where snap_id between 4486 and 4498
and event_id=1478861578 and sql_id='8v3b2m405atgy'
and current_obj# in (3208618, 3208619, 3208620)
group by current_file#, current_block#
having count(*)>50
order by 3;

CURRENT_FILE# CURRENT_BLOCK# CNT
------------- -------------- ----------
1330 238073 51
1542 22645 55
1487 237914 56
1330 238724 61
1330 244129 76
1487 233206 120


One thing that I immediately noticed is that there does not seem to be a single hot block!!! (What?) Out of 40,000 sessions accessing these three objects no more than 120 ever tried to hit the same block. Let’s quickly check if any of these are header blocks on the segments.



select segment_name, header_file, header_block
from dba_segments where owner='JHEIDER' and partition_name='P_2007_09'
and segment_name in ('PLACEMENTS_LOG','PLCMNTS_LG_X_ID',
'PLCMNTS_LG_X_CHANGE_DATE');

SEGMENT_NAME HEADER_FILE HEADER_BLOCK
------------------------------ ----------- ------------
BIGTABL_LG_X_CHANGE_DATE 1207 204809
BIGTABL_LG_X_ID 1207 196617
BIGTABLE_LOG 1209 16393


No – all seem to be data blocks. Why so much contention? Maybe the RAC and OPS experts out there already have some guesses… but first let’s explore one alternative method to check the same thing and see of the numbers line up.



AWR Segment Statistics



Here’s a handy little query I made up the other day to quickly digest any of the segment statistics from the AWR and grab the top objects for the cluster, reporting on each instance.

As an aside, there is a line in the middle that says “GC_BUFFER_BUSY_DELTA”. You can replace that line with any of these values to see the top objects for the corresponding waits during the reporting period:



LOGICAL_READS_DELTA
BUFFER_BUSY_WAITS_DELTA
DB_BLOCK_CHANGES_DELTA
PHYSICAL_READS_DELTA
PHYSICAL_WRITES_DELTA
PHYSICAL_READS_DIRECT_DELTA
PHYSICAL_WRITES_DIRECT_DELTA
ITL_WAITS_DELTA
ROW_LOCK_WAITS_DELTA
GC_CR_BLOCKS_SERVED_DELTA
GC_CU_BLOCKS_SERVED_DELTA
GC_BUFFER_BUSY_DELTA
GC_CR_BLOCKS_RECEIVED_DELTA
GC_CU_BLOCKS_RECEIVED_DELTA
SPACE_USED_DELTA
SPACE_ALLOCATED_DELTA
TABLE_SCANS_DELTA


col object format a60
col i format 99
select * from (
select o.owner||'.'||o.object_name||decode(o.subobject_name,NULL,'','.')||
o.subobject_name||' ['||o.object_type||']' object,
instance_number i, stat
from (
select obj#||'.'||dataobj# obj#, instance_number, sum(
GC_BUFFER_BUSY_DELTA
) stat
from dba_hist_seg_stat
where (snap_id between 12831 and 12838)
and (instance_number between 1 and 6)
group by rollup(obj#||'.'||dataobj#, instance_number)
having obj#||'.'||dataobj# is not null
) s, dba_hist_seg_stat_obj o
where o.dataobj#||'.'||o.obj#=s.obj#
order by max(stat) over (partition by s.obj#) desc,
o.owner||o.object_name||o.subobject_name, nvl(instance_number,0)
) where rownum<=40;

OBJECT I STAT
------------------------------------------------------------ --- ----------
JSCHDER.BIGTABLE_LOG.P_2007_09 [TABLE PARTITION] 2529540
JSCHDER.BIGTABLE_LOG.P_2007_09 [TABLE PARTITION] 1 228292
JSCHDER.BIGTABLE_LOG.P_2007_09 [TABLE PARTITION] 2 309684
JSCHDER.BIGTABLE_LOG.P_2007_09 [TABLE PARTITION] 3 289147
JSCHDER.BIGTABLE_LOG.P_2007_09 [TABLE PARTITION] 4 224155
JSCHDER.BIGTABLE_LOG.P_2007_09 [TABLE PARTITION] 5 1136822
JSCHDER.BIGTABLE_LOG.P_2007_09 [TABLE PARTITION] 6 341440
JSCHDER.BIGTABL_LG_X_CHANGE_DATE.P_2007_09 [INDEX PARTITION] 2270221
JSCHDER.BIGTABL_LG_X_CHANGE_DATE.P_2007_09 [INDEX PARTITION] 1 220094
JSCHDER.BIGTABL_LG_X_CHANGE_DATE.P_2007_09 [INDEX PARTITION] 2 313038
JSCHDER.BIGTABL_LG_X_CHANGE_DATE.P_2007_09 [INDEX PARTITION] 3 299509
JSCHDER.BIGTABL_LG_X_CHANGE_DATE.P_2007_09 [INDEX PARTITION] 4 217489
JSCHDER.BIGTABL_LG_X_CHANGE_DATE.P_2007_09 [INDEX PARTITION] 5 940827
JSCHDER.BIGTABL_LG_X_CHANGE_DATE.P_2007_09 [INDEX PARTITION] 6 279264
JSCHDER.BIGTABLE.P_WAREHOUSE [TABLE PARTITION] 1793931
JSCHDER.BIGTABLE.P_WAREHOUSE [TABLE PARTITION] 1 427482
JSCHDER.BIGTABLE.P_WAREHOUSE [TABLE PARTITION] 2 352305
JSCHDER.BIGTABLE.P_WAREHOUSE [TABLE PARTITION] 3 398699
JSCHDER.BIGTABLE.P_WAREHOUSE [TABLE PARTITION] 4 268045
JSCHDER.BIGTABLE.P_WAREHOUSE [TABLE PARTITION] 5 269230
JSCHDER.BIGTABLE.P_WAREHOUSE [TABLE PARTITION] 6 78170
JSCHDER.BIGTABL_LG_X_ID.P_2007_09 [INDEX PARTITION] 771060
JSCHDER.BIGTABL_LG_X_ID.P_2007_09 [INDEX PARTITION] 1 162296
JSCHDER.BIGTABL_LG_X_ID.P_2007_09 [INDEX PARTITION] 2 231141
JSCHDER.BIGTABL_LG_X_ID.P_2007_09 [INDEX PARTITION] 3 220573
JSCHDER.BIGTABL_LG_X_ID.P_2007_09 [INDEX PARTITION] 4 157050
JSCHDER.BIGTABLE.P_DEACTIVE [TABLE PARTITION] 393663
JSCHDER.BIGTABLE.P_DEACTIVE [TABLE PARTITION] 1 66277
JSCHDER.BIGTABLE.P_DEACTIVE [TABLE PARTITION] 2 10364
JSCHDER.BIGTABLE.P_DEACTIVE [TABLE PARTITION] 3 6930
JSCHDER.BIGTABLE.P_DEACTIVE [TABLE PARTITION] 4 3484
JSCHDER.BIGTABLE.P_DEACTIVE [TABLE PARTITION] 5 266722
JSCHDER.BIGTABLE.P_DEACTIVE [TABLE PARTITION] 6 39886
JSCHDER.BIGTABLE.P_ACTIVE_APPROVED [TABLE PARTITION] 276637
JSCHDER.BIGTABLE.P_ACTIVE_APPROVED [TABLE PARTITION] 1 13750
JSCHDER.BIGTABLE.P_ACTIVE_APPROVED [TABLE PARTITION] 2 12207
JSCHDER.BIGTABLE.P_ACTIVE_APPROVED [TABLE PARTITION] 3 23522
JSCHDER.BIGTABLE.P_ACTIVE_APPROVED [TABLE PARTITION] 4 28336
JSCHDER.BIGTABLE.P_ACTIVE_APPROVED [TABLE PARTITION] 5 99704
JSCHDER.BIGTABLE.P_ACTIVE_APPROVED [TABLE PARTITION] 6 99118

40 rows selected.





Now as you can see, these statistics confirm what we observed from the ASH: the top waits in the system are for the BIGTABLE_LOG table. However this also reveals something the ASH didn’t – that the date-based index on the same table is a close second.



The Real Culprit



Any time you see heavy concurrency problems during inserts on table data blocks there should always be one first place to look: space management. Since ancient versions of OPS it has been a well-known fact that freelists are the enemy of concurrency. In this case, that was exactly the culprit.



select distinct tablespace_name from dba_tab_partitions
where table_name='BIGTABLE_LOG';

TABLESPACE_NAME
------------------------------
BIGTABLE_LOG_DATA

select extent_management, allocation_type, segment_space_management
from dba_tablespaces where tablespace_name='BIGTABLE_LOG_DATA';

EXTENT_MAN ALLOCATIO SEGMEN
---------- --------- ------
LOCAL USER MANUAL

SQL> select distinct freelists, freelist_groups from dba_tab_partitions
2 where table_name='BIGTABLE_LOG';

FREELISTS FREELIST_GROUPS
---------- ---------------
1 1


And there you have it. The busiest table on their OLTP RAC system is using MSSM (Manual segment space management) with a single freelist group. I’m pretty sure this could cause contention problems! But in this case it wasn’t quite what I expected. It looks to me like the single freelist itself wasn’t the point of contention – but it was pointing all of the nodes to the same small number of blocks for inserts and these data blocks were getting fought over. But they were probably filling up quickly and so no single block had a large number of waits reported in the ASH. If anyone has another idea to identify the hot blocks then leave a comment and let me know!

3 comments:

Unknown said...

Excellent work, thank you for sharing.

Anonymous said...
This comment has been removed by a blog administrator.
Anonymous said...
This comment has been removed by a blog administrator.