Question and Answer

Chris Saxon

Thanks for the question, Tomas.

Asked: March 26, 2025 - 11:22 am UTC

Last updated: March 27, 2025 - 3:58 pm UTC

Version: 18c

Viewed 100+ times

You Asked

Dear Ask Tom team members,

Is it possible to measure SQL effectiveness via a computed throw-away row count?

It's clear that the optimizer tries to build the plan to minimize throw-away rows for the next steps in the plan. So it tries to send only relevant rows to the next operation.

I get that a huge throw-away at the beginning is very beneficial for the next steps.

My intent is to find big SQLs (in terms of queried data size and running time) that do a lot of ineffective work on irrelevant data, in order to find terrible design of tables, partitioning, indexes, and other data access paths.

So the findings should be something like a step in the plan and a ratio of rowsource_result (rows produced by the plan step) / rowsource_out (rows sent to the next step, i.e. the parent operation).

I know it's possible to see this in the plan statistics, via the actual rows (A-Rows) metric for each plan step.

But is there a method to get those metrics for all of a schema's SQLs, similar to finding the top ten SQLs by CPU, I/O, or memory consumption?

I know such a metric could be misleading; it's meant for finding ineffective database designs and SQLs that waste a lot of work and resources.

I tried some SQLs based on SQL plan statistics, but I'm not sure if they're precise.

Thank You.

Tomas.



and Chris said...

I don't know of an (easy) way to get this information. Even if there were, I'm unconvinced this is a good approach.
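If you want to experiment anyway, the closest starting point I can think of is V$SQL_PLAN_STATISTICS_ALL, which exposes actual row counts per plan step for cursors still in the shared pool. The query below is a rough sketch only: it assumes the cursors ran with statistics_level = all (or the gather_plan_statistics hint), otherwise the row counts are null. And for joins the parent receives rows from two children, so the per-child "thrown away" figure is crude at best.

-- Rough sketch: for each plan step, compare the rows it produced with the
-- rows its parent operation passed on. LAST_OUTPUT_ROWS is only populated
-- for executions with statistics_level = all or the gather_plan_statistics
-- hint, so most production cursors will show nothing here.
select s.sql_id,
       s.child_number,
       s.id,
       s.operation,
       s.options,
       s.last_output_rows                        step_rows,
       p.last_output_rows                        parent_rows,
       s.last_output_rows - p.last_output_rows   thrown_away
from   v$sql_plan_statistics_all s
join   v$sql_plan_statistics_all p
on     p.sql_id       = s.sql_id
and    p.child_number = s.child_number
and    p.id           = s.parent_id
where  s.last_output_rows is not null
and    p.last_output_rows is not null
order  by s.last_output_rows - p.last_output_rows desc
fetch  first 10 rows only;

With that out of the way, here's why I'm skeptical of the premise: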

"It's clear that the optimizer tries to build the plan to minimize throw-away rows for the next steps in the plan."

I disagree. The optimizer tries to minimize the resources the query consumes. Usually this means keeping the number of rows processed as small as possible for as long as possible. But that is different from minimizing "throw away" rows.

For example, this demo filters T1 to return about half its rows.

The database full scans T1, so it "throws away" half the rows in the table:

create table t1 ( c1, stuff ) as 
  select mod ( level, 10 ), rpad ( 'stuff', 1000, 'f' ) 
  from   dual
  connect by level <= 100;

create index i on t1 ( c1 );

-- statistics_level = all makes the database capture the actual row counts
-- (A-Rows) that display_cursor reports below
alter session set statistics_level = all;

set feed only
select * from t1
where  c1 < 5;
set feed on

select * from dbms_xplan.display_cursor ( format => 'IOSTATS LAST' );

------------------------------------------------------------------------------------
| Id  | Operation         | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |      1 |        |     50 |00:00:00.01 |      17 |
|*  1 |  TABLE ACCESS FULL| T1   |      1 |     51 |     50 |00:00:00.01 |      17 |
------------------------------------------------------------------------------------


But there's an index on T1.C1. If we force the optimizer to use this (via a hint), we can minimize "throw away" rows - the database will only access rows WHERE C1 < 5. This means it reads only the rows it returns and throws nothing away:

set feed only
select /*+ index ( t1 ( c1 ) ) */ * from t1
where  c1 < 5;
set feed on

select * from dbms_xplan.display_cursor ( format => 'IOSTATS LAST' );

------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |      |      1 |        |     50 |00:00:00.01 |      51 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1   |      1 |     51 |     50 |00:00:00.01 |      51 |
|*  2 |   INDEX RANGE SCAN                  | I    |      1 |     51 |     50 |00:00:00.01 |       1 |
------------------------------------------------------------------------------------------------------


But this does 3x the amount of work (look at the buffers 17 vs 51)!

Do you really want to triple the work done to reduce the rows thrown away from 50 to zero?

Note that these figures are specific to this demo - on different data the relative benefits will change.
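If you want to see how the balance shifts, here's a quick experiment using the same demo table. The hints just force each access path so you can compare them; make the predicate more selective and check the Buffers column for both plans. Exact figures depend on your data and block size, so I won't promise which wins:

set feed only
select /*+ full ( t1 ) */ * from t1
where  c1 = 1;
set feed on

select * from dbms_xplan.display_cursor ( format => 'IOSTATS LAST' );

set feed only
select /*+ index ( t1 ( c1 ) ) */ * from t1
where  c1 = 1;
set feed on

select * from dbms_xplan.display_cursor ( format => 'IOSTATS LAST' );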

"So it tries to send only relevant rows to the next operation."

Again, I feel this is not quite what the database does. Each operation sends exactly the rows that satisfy its access/filter conditions to the next operation. In general, whether these rows are "relevant" to the next operation is unknowable. For example, if you join two tables with no values in common, the join operation will throw away everything and return no rows. But the only way to know this is to do the join, as the demo below shows!
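To make that concrete, here's a minimal sketch (tables A and B and their data are invented for illustration). The database has to read all 200 rows before it can know that every one of them is "irrelevant" to the join:

-- A holds 1..100, B holds 101..200: no values in common
create table a ( c1 ) as 
  select level from dual connect by level <= 100;
create table b ( c1 ) as 
  select level + 100 from dual connect by level <= 100;

select /*+ gather_plan_statistics */ count (*) 
from   a join b on a.c1 = b.c1;

select * from dbms_xplan.display_cursor ( format => 'IOSTATS LAST' );

The full scans of A and B each report A-Rows = 100, yet the join's A-Rows is zero - 100% throw-away from a plan that is nonetheless the best one available.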

To reiterate:

The optimizer tries to minimize the work each query does. While this often means accessing only the rows of interest, there are cases where that is suboptimal.
