Question and Answer

Chris Saxon

Thanks for the question, Tomas.

Asked: March 26, 2025 - 11:22 am UTC

Last updated: March 27, 2025 - 3:58 pm UTC

Version: 18c

Viewed 100+ times

You Asked

Dear Ask Tom team members,

Is it possible to measure SQL effectiveness via a computed throw-away row count?

It's clear that the optimizer tries to build the plan to minimize throw-away rows for the next steps in the plan. So it tries to send only relevant rows to the next operation.

I get that a huge throw-away at the beginning is very beneficial for the next steps.

My intent is to find big SQLs (in terms of queried data size and running time) that do a lot of ineffective work on irrelevant data, in order to find terrible design of tables, partitioning, indexes, and other data access paths.

So the findings should be something like a step in the plan and a ratio of rowsource_result (rows produced by the plan step) / rowsource_out (rows sent to the next step, i.e. the parent operation).

I know it's possible to see this in the plan statistics, via the actual rows (A-Rows) metric for each plan step.

But is there a method to get those metrics for all of a schema's SQLs, similar to finding the top ten SQLs by CPU, I/O, or memory consumption?

I know such a metric could be misleading; it's meant for finding ineffective database designs and SQLs that waste a lot of work and resources.

I tried some SQLs based on SQL plan statistics, but I'm not sure if they're precise.

Thank You.

Tomas.



and Chris said...

I don't know of an (easy) way to get this information. Even if there were, I'm unconvinced this is a good approach.
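If you want to experiment anyway, the closest starting point I can think of is V$SQL_PLAN_STATISTICS_ALL, which exposes actual row counts per plan step for cursors still in the shared pool. The query below is a rough sketch only: it assumes the cursors ran with statistics_level = all (or the gather_plan_statistics hint), otherwise the row counts are null. And for joins the parent receives rows from two children, so the per-child "thrown away" figure is crude at best.

-- Rough sketch: for each plan step, compare the rows it produced with the
-- rows its parent operation passed on. LAST_OUTPUT_ROWS is only populated
-- for executions with statistics_level = all or the gather_plan_statistics
-- hint, so most production cursors will show nothing here.
select s.sql_id,
       s.child_number,
       s.id,
       s.operation,
       s.options,
       s.last_output_rows                        step_rows,
       p.last_output_rows                        parent_rows,
       s.last_output_rows - p.last_output_rows   thrown_away
from   v$sql_plan_statistics_all s
join   v$sql_plan_statistics_all p
on     p.sql_id       = s.sql_id
and    p.child_number = s.child_number
and    p.id           = s.parent_id
where  s.last_output_rows is not null
and    p.last_output_rows is not null
order  by s.last_output_rows - p.last_output_rows desc
fetch  first 10 rows only;

With that out of the way, here's why I'm skeptical of the premise: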

"It's clear that the optimizer tries to build the plan to minimize throw-away rows for the next steps in the plan."

I disagree. The optimizer tries to minimize the resources the query consumes. Usually this means keeping the number of rows processed as small as possible for as long as possible. But that is different from minimizing "throw away" rows.

For example, this demo filters T1 to return about half its rows.

The database full scans T1, so it "throws away" half the rows in the table:

create table t1 ( c1, stuff ) as 
  select mod ( level, 10 ), rpad ( 'stuff', 1000, 'f' ) 
  from   dual
  connect by level <= 100;

create index i on t1 ( c1 );

-- statistics_level = all makes the database capture the actual row counts
-- (A-Rows) that display_cursor reports below
alter session set statistics_level = all;

set feed only
select * from t1
where  c1 < 5;
set feed on

select * from dbms_xplan.display_cursor ( format => 'IOSTATS LAST' );

------------------------------------------------------------------------------------
| Id  | Operation         | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |      1 |        |     50 |00:00:00.01 |      17 |
|*  1 |  TABLE ACCESS FULL| T1   |      1 |     51 |     50 |00:00:00.01 |      17 |
------------------------------------------------------------------------------------


But there's an index on T1.C1. If we force the optimizer to use this (via a hint), we can minimize "throw away" rows - the database will only access rows WHERE C1 < 5. This means it reads only the rows it returns and throws nothing away:

set feed only
select /*+ index ( t1 ( c1 ) ) */ * from t1
where  c1 < 5;
set feed on

select * from dbms_xplan.display_cursor ( format => 'IOSTATS LAST' );

------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |      |      1 |        |     50 |00:00:00.01 |      51 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1   |      1 |     51 |     50 |00:00:00.01 |      51 |
|*  2 |   INDEX RANGE SCAN                  | I    |      1 |     51 |     50 |00:00:00.01 |       1 |
------------------------------------------------------------------------------------------------------


But this does 3x the amount of work (look at the buffers 17 vs 51)!

Do you really want to triple the work done to reduce the rows thrown away from 50 to zero?

Note that these figures are specific to this demo - on different data the relative benefits will change.
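If you want to see how the balance shifts, here's a quick experiment using the same demo table. The hints just force each access path so you can compare them; make the predicate more selective and check the Buffers column for both plans. Exact figures depend on your data and block size, so I won't promise which wins:

set feed only
select /*+ full ( t1 ) */ * from t1
where  c1 = 1;
set feed on

select * from dbms_xplan.display_cursor ( format => 'IOSTATS LAST' );

set feed only
select /*+ index ( t1 ( c1 ) ) */ * from t1
where  c1 = 1;
set feed on

select * from dbms_xplan.display_cursor ( format => 'IOSTATS LAST' );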

"So it tries to send only relevant rows to the next operation."

Again, I feel this is not quite what the database does. Each operation sends exactly the rows that satisfy its access/filter conditions to the next operation. In general, whether these rows are "relevant" to the next operation is unknowable. For example, if you join two tables with no values in common, the join operation will throw away everything and return no rows. But the only way to know this is to do the join, as the demo below shows!
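To make that concrete, here's a minimal sketch (tables A and B and their data are invented for illustration). The database has to read all 200 rows before it can know that every one of them is "irrelevant" to the join:

-- A holds 1..100, B holds 101..200: no values in common
create table a ( c1 ) as 
  select level from dual connect by level <= 100;
create table b ( c1 ) as 
  select level + 100 from dual connect by level <= 100;

select /*+ gather_plan_statistics */ count (*) 
from   a join b on a.c1 = b.c1;

select * from dbms_xplan.display_cursor ( format => 'IOSTATS LAST' );

The full scans of A and B each report A-Rows = 100, yet the join's A-Rows is zero - 100% throw-away from a plan that is nonetheless the best one available.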

To reiterate:

The optimizer tries to minimize the work each query does. While this often means accessing only the rows of interest, there are cases where that is suboptimal.
