Skip to Main Content
  • Questions
  • delete statement doesn't using index

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, Nadi.

Asked: July 06, 2010 - 8:23 am UTC

Last updated: July 19, 2010 - 8:49 am UTC

Version: 9.2.0.8

Viewed 10K+ times! This question is

You Asked

Hello,
i have the following delete statement:
delete from radacct a
where acctstarttime < '01-jan-2009'
an index is created on acctstarttime, but full table scan is perfromed to delete the data from the table

this behavior is only on 9i. the query uses the approperiate index on 10g.

can you please explain this strange behavior

regards,


and Tom said...

one would need to know

a) size of table (not in rows, in mega/gigabytes)
b) how many rows would be found by that where clause

and please explain to me why you are comparing what appears to be a date column to a string???? You don't really do that do you???????


It will have to do with the estimated cardinality. Here I can show that even in 10g, it will SOMETIMES use the index and SOMETIMES NOT use the index - as appropriate. Meaning - it is highly likely that a full scan in 9i is correct and appropriate - due the number of rows being removed.

ops$tkyte%ORA10GR2> create table t
  2  as
  3  select * from all_objects;

Table created.

ops$tkyte%ORA10GR2> create index t_idx on t(created);

Index created.

ops$tkyte%ORA10GR2> exec dbms_stats.gather_table_stats( user, 'T', cascade=>true );

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2> delete from plan_table;

3 rows deleted.

ops$tkyte%ORA10GR2> explain plan for delete from t where created < to_date('01-jan-2009');

Explained.

ops$tkyte%ORA10GR2> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------
Plan hash value: 3335594643

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | DELETE STATEMENT   |      | 35548 |   277K|    52   (4)| 00:00:01 |
|   1 |  DELETE            | T    |       |       |            |          |
|*  2 |   TABLE ACCESS FULL| T    | 35548 |   277K|    52   (4)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("CREATED"<TO_DATE(' 2009-01-01 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))

15 rows selected.

ops$tkyte%ORA10GR2> delete from plan_table;

3 rows deleted.

ops$tkyte%ORA10GR2> explain plan for delete from t where created < to_date('01-jan-2002');

Explained.

ops$tkyte%ORA10GR2> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------
Plan hash value: 2757126757

---------------------------------------------------------------------------
| Id  | Operation         | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | DELETE STATEMENT  |       |    20 |   160 |     2   (0)| 00:00:01 |
|   1 |  DELETE           | T     |       |       |            |          |
|*  2 |   INDEX RANGE SCAN| T_IDX |    20 |   160 |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("CREATED"<TO_DATE(' 2002-01-01 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))

15 rows selected.

Rating

  (9 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Nadi Hajeh, July 07, 2010 - 10:53 am UTC

a) size of table (not in rows, in mega/gigabytes): 7 milion records, size: 1.3 GB
b) how many rows would be found by that where clause arround 50,000 records

using ('01-jan-2009') or using '01-jan-2009' the same result

in 10g: when i use the below select, it performs full table scan:
select a.* from radacct a
where a.acctstarttime < '01-jan-2009'
or
select a.* from radacct a
where a.acctstarttime < to_date('01-jan-2009')

but when i use delete as below, it uses the index range scan:
delete from radacct a
where a.acctstarttime < '01-jan-2009'
delete from radacct a
where a.acctstarttime < to_date('01-jan-2009')


but is 9i, in both cases (delete and select) the optimizer uses full table scan

in fact i'm not doing compare between 9i ad 10g, but we have two production databases into two different sites doing the same purpose, one is 9i, the other is 10g

thanks for your cooperation
Tom Kyte
July 08, 2010 - 11:58 am UTC

tell me the clustering_factor of the index and the cost of the full scan (from explain plan) and the cost of the select with an index hint from an explain plan.

Nadi Hajeh, July 08, 2010 - 1:26 pm UTC

The clustering factor is 2802527
the cost of full scan is 38283
the cost when using index hint is 947146

Tom Kyte
July 08, 2010 - 1:47 pm UTC

can I see the plans, it sounds like it is getting the estimated card= values way off. You say "about 50,000", it looks like it is thinking more along the lines of "more than 900,000" records.

But, I'll guess that since the clustering factor is nearer to the number of rows than to the number of blocks in the table - it is estimating about 1 IO for every row - which might mean that ever for 50,000 records - it would not use the index - the full scan cost of 38k would be lower still.

Nadi Hajeh, July 08, 2010 - 2:12 pm UTC

full scan:

--------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
--------------------------------------------------------------------
| 0 | DELETE STATEMENT | | 3150K| 342M| 38283 |
| 1 | DELETE | RADACCT | | | |
| 2 | TABLE ACCESS FULL | RADACCT | 3150K| 342M| 38283 |
--------------------------------------------------------------------


with using the index hint:


-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
-----------------------------------------------------------------------------
| 0 | DELETE STATEMENT | | 3150K| 342M| 1346K|
| 1 | DELETE | RADACCT | | | |
| 2 | INDEX RANGE SCAN | RADACCT_STRTIME_IND | 3150K| 342M| 10570 |
-----------------------------------------------------------------------------


Tom Kyte
July 08, 2010 - 2:26 pm UTC

see the estimated card= value, 3,150,000. No where near 50,000.

Are you sure this is only hitting 50,000?
are statistics up to date - does the optimizer have a chance of getting the right estimate?

Nadi Hajeh, July 08, 2010 - 2:40 pm UTC

Hi Tom,
the number of rwos to be deleted in the explain plan i perfromed is arround 3 milions, i increased the interval.

regarding the statisitcs, i'm gathering the statistics every one week, but we have arround 500,000+ record inserted to the tables daily, and other 500,000+ million records are deleted

this is our case in this table
Tom Kyte
July 08, 2010 - 4:19 pm UTC

if the number of rows is 3mill, then full scan is the only sensible way - I'm confused. What does 50,000 have to do with anything????

start over, state the problem from start to finish and supply supporting data such as I asked for above all in one place.

Nadi Hajeh, July 09, 2010 - 2:13 am UTC

the estimated rows to be deleted is 67,000

explain plan for
delete from radius.radacct a
where a.acctstarttime < to_date('27-nov-2009','dd-mon-yyyy')


--------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
--------------------------------------------------------------------
| 0 | DELETE STATEMENT | | 193K| 21M| 38283 |
| 1 | DELETE | RADACCT | | | |
| 2 | TABLE ACCESS FULL | RADACCT | 193K| 21M| 38283 |
--------------------------------------------------------------------

explain plan for
delete /*+ INDEX (a RADACCT_STRTIME_IND) */ from radius.radacct a
where a.acctstarttime < to_date('27-nov-2009','dd-mon-yyyy')

-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
-----------------------------------------------------------------------------
| 0 | DELETE STATEMENT | | 193K| 21M| 82445 |
| 1 | DELETE | RADACCT | | | |
| 2 | INDEX RANGE SCAN | RADACCT_STRTIME_IND | 193K| 21M| 890 |
-----------------------------------------------------------------------------


Tom Kyte
July 09, 2010 - 8:26 am UTC

where do you see 67,000 please.

the estimate I see is 193K - 193,000. I don't see 67,000 *anywhere*, where are you seeing it?


and why is it all of a sudden 193k, what happened to 3150K????? A little *consistency* is mandatory - you are really confusing everything here - stick with ONE problem, start to finish, be consistent with the numbers and use numbers that actually exist.

Nadi Hajeh, July 09, 2010 - 11:08 am UTC

the output of the below query is 67k:
select count(1) from radius.radacct a
where a.acctstarttime < to_date('27-nov-2009','dd-mon-yyyy')

the table was last analyzed is today morning
Tom Kyte
July 09, 2010 - 4:51 pm UTC

Ok, the estimated card= values are way off - now we need to figure out why


how do you gather statistics, exact command.

My car won't start

Dana, July 09, 2010 - 2:43 pm UTC

Really confusing from the questioner's side, but Tom is asking for the right clarification.
This thread is a moving target that will not be satisfactorily explained or resolved without tighter focus and detail.
This is one of the most valuable lessons I take away from the site. Be absolutely sure everyone is on the same page (of the same book). Good work Tom.

Nadi Hajeh, July 10, 2010 - 2:45 am UTC

exec DBMS_STATS.GATHER_TABLE_STATS (ownname => 'RADIUS', tabname => 'RADACCT', estimate_percent => 70, cascade => true);


Tom Kyte
July 19, 2010 - 8:49 am UTC

see below for next persons comments, I have the same question.

Is that column "skewed"
Can you try method_opt => 'for column thatcolumn size 254'



And...

Dana, July 10, 2010 - 12:16 pm UTC

What is the data skew for acctstarttime? And did dbms_stats default to cascade? And did dbms_stats gather a histogram on acctstarttime? And was no_invalidate=false?
These are all factors that may change the optimizer process.

Personal opinion here. If I can I do full compute, no cascade, separate full index compute, and columns size 1 for all columns unless the predicate column is a candidate for histograms. That way I know the optimizer has the maximum information to work with and is most likely going to get the best answer. I compare that with the plan generated "normally". Then I can say with some validity that oracle is working, I'm just not giving it enough information. I can also use the "ideal" optimizer plan as a goal. Otherwise tuning continues forever.

More to Explore

PL/SQL demos

Check out more PL/SQL tutorials on our LiveSQL tool.

PL/SQL docs

PL/SQL reference manual from the Oracle documentation library