Tom -
A while back I was looking into a question regarding what seemed to be unusually high logical reads as reported by AWR. I noticed that metrics in V$SQLSTATS showed buffer_gets being higher on average for SQL that is offloaded than it is for statements that are not. For example, in the production database:
SQL> select decode( io_cell_offload_eligible_bytes, 0,'NO', 'YES' ) offloadable,
2 ( sum( buffer_gets ) / sum( rows_processed ) ) buffer_gets_per_row
3 from v$sqlstats
4 group by decode( io_cell_offload_eligible_bytes, 0,'NO', 'YES' )
5 order by 1 desc;
OFF BUFFER_GETS_PER_ROW
--- -------------------
YES 191.92836
NO 1.16073833
SQL>
To investigate further I wanted to compare V$SQLSTATS.BUFFER_GETS for offloaded SQL to some Exadata specific metrics. My idea was to compare buffer_gets to the number of blocks actually being returned from the storage and also to see how many blocks might be skipped due to storage indexes. The test query that I came up with contains no joins or sorts so each buffer should be visited only once. It does a table scan and has a where clause on a column that is not indexed. (The examples from here on have been executed in a test system, not the production system that the above was taken from.) The table being queried, an archived copy of SYS.AUD$, occupies ~340,000 8k blocks:
SQL> col segment_name form a20
SQL> with cache_summary as ( select bh.objd, count( bh.block# ) buffers
2 from v$bh bh group by bh.objd )
3 select seg.segment_name, seg.blocks, seg.bytes/1024/1024 mbytes,
4 nvl( cache.buffers, 0 ) buffers
5 from dba_objects obj, dba_segments seg, cache_summary cache
6 where seg.owner = 'SYSTEM' and seg.segment_type = 'TABLE' and
7 seg.segment_name = 'ARCHIVE_AUDIT' and
8 obj.owner = seg.owner and
9 obj.object_type = seg.segment_type and
10 obj.object_name = seg.segment_name and
11 cache.objd (+) = obj.object_id;
SEGMENT_NAME BLOCKS MBYTES BUFFERS
-------------------- ---------- ---------- ----------
ARCHIVE_AUDIT 343680 2685 0
SQL>
As you can see the buffer cache currently contains no blocks for this table.
Prior to running the test I captured the current value of 'cell physical IO bytes saved by storage index':
SQL>
SQL> select ms.value bytes
2 from v$mystat ms, v$statname sn
3 where sn.statistic# = ms.statistic# and
4 sn.name = 'cell physical IO bytes saved by storage index';
BYTES
----------
0
SQL>
Then the test query was executed:
SQL>
SQL> set autotrace traceonly explain statistics
SQL>
SQL> select /* testquery */ spare1 os_user
2 from system.archive_audit
3 where userid = 'SYSTEM';
2943 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 120666263
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 977 | 16609 | 93025 (1)| 00:18:37 |
|* 1 | TABLE ACCESS STORAGE FULL| ARCHIVE_AUDIT | 977 | 16609 | 93025 (1)| 00:18:37 |
-------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - storage("USERID"='SYSTEM')
filter("USERID"='SYSTEM')
Statistics
----------------------------------------------------------
191 recursive calls
0 db block gets
341958 consistent gets
341852 physical reads
0 redo size
51757 bytes sent via SQL*Net to client
2680 bytes received via SQL*Net from client
198 SQL*Net roundtrips to/from client
12 sorts (memory)
0 sorts (disk)
2943 rows processed
SQL>
SQL> set autotrace off
SQL>
The V$SQLSTATS I/O values for the test query show that relatively few blocks were returned from the Exadata storage cells:
SQL>
SQL> select sql_id, plan_hash_value plan_hash, executions execs, buffer_gets,
2 ceil( io_cell_offload_returned_bytes / 8192 ) offload_rtn_blks,
3 ceil( io_interconnect_bytes / 8192 ) io_interconnect_blks,
4 ( ( io_interconnect_bytes - io_cell_offload_returned_bytes ) / 8192 ) block_served_blocks
5 from v$sqlstats
6 where sql_text like 'select /* testquery */%';
SQL_ID PLAN_HASH EXECS BUFFER_GETS OFFLOAD_RTN_BLKS IO_INTERCONNECT_BLKS BLOCK_SERVED_BLOCKS
------------- --------- ----- ----------- ---------------- -------------------- -------------------
7ujnzfrdt1hdp 120666263 1 341958 35 51 16
SQL>
Also, many of the table's blocks were skipped due to storage indexes:
SQL>
SQL> select ms.value bytes, ( ms.value/8192 ) blocks
2 from v$mystat ms, v$statname sn
3 where sn.statistic# = ms.statistic# and
4 sn.name = 'cell physical IO bytes saved by storage index';
BYTES BLOCKS
---------- ----------
1633697792 199426
SQL>
Given that only 42% of the table’s blocks were actually read by the storage cells and of those only 51 were returned to the database server, why do buffer_gets, consistent gets and physical reads all seem to indicate that the entire table was read?
I believe that this anomaly is the underlying cause of the AWR problem that I originally began investigating. I would like to be able to explain it to my colleagues and to the team of developers that I support. Please let me know what insights you have or if you spot any mistakes in my testing methods.
Randy