Nadi Hajeh, July 07, 2010 - 10:53 am UTC
a) size of table (not in rows, in mega/gigabytes): 7 milion records, size: 1.3 GB
b) how many rows would be found by that where clause arround 50,000 records
using ('01-jan-2009') or using '01-jan-2009' the same result
in 10g: when i use the below select, it performs full table scan:
select a.* from radacct a
where a.acctstarttime < '01-jan-2009'
or
select a.* from radacct a
where a.acctstarttime < to_date('01-jan-2009')
but when i use delete as below, it uses the index range scan:
delete from radacct a
where a.acctstarttime < '01-jan-2009'
delete from radacct a
where a.acctstarttime < to_date('01-jan-2009')
but is 9i, in both cases (delete and select) the optimizer uses full table scan
in fact i'm not doing compare between 9i ad 10g, but we have two production databases into two different sites doing the same purpose, one is 9i, the other is 10g
thanks for your cooperation
July 08, 2010 - 11:58 am UTC
tell me the clustering_factor of the index and the cost of the full scan (from explain plan) and the cost of the select with an index hint from an explain plan.
Nadi Hajeh, July 08, 2010 - 1:26 pm UTC
The clustering factor is 2802527
the cost of full scan is 38283
the cost when using index hint is 947146
July 08, 2010 - 1:47 pm UTC
can I see the plans, it sounds like it is getting the estimated card= values way off. You say "about 50,000", it looks like it is thinking more along the lines of "more than 900,000" records.
But, I'll guess that since the clustering factor is nearer to the number of rows than to the number of blocks in the table - it is estimating about 1 IO for every row - which might mean that ever for 50,000 records - it would not use the index - the full scan cost of 38k would be lower still.
Nadi Hajeh, July 08, 2010 - 2:12 pm UTC
full scan:
--------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
--------------------------------------------------------------------
| 0 | DELETE STATEMENT | | 3150K| 342M| 38283 |
| 1 | DELETE | RADACCT | | | |
| 2 | TABLE ACCESS FULL | RADACCT | 3150K| 342M| 38283 |
--------------------------------------------------------------------
with using the index hint:
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
-----------------------------------------------------------------------------
| 0 | DELETE STATEMENT | | 3150K| 342M| 1346K|
| 1 | DELETE | RADACCT | | | |
| 2 | INDEX RANGE SCAN | RADACCT_STRTIME_IND | 3150K| 342M| 10570 |
-----------------------------------------------------------------------------
July 08, 2010 - 2:26 pm UTC
see the estimated card= value, 3,150,000. No where near 50,000.
Are you sure this is only hitting 50,000?
are statistics up to date - does the optimizer have a chance of getting the right estimate?
Nadi Hajeh, July 08, 2010 - 2:40 pm UTC
Hi Tom,
the number of rwos to be deleted in the explain plan i perfromed is arround 3 milions, i increased the interval.
regarding the statisitcs, i'm gathering the statistics every one week, but we have arround 500,000+ record inserted to the tables daily, and other 500,000+ million records are deleted
this is our case in this table
July 08, 2010 - 4:19 pm UTC
if the number of rows is 3mill, then full scan is the only sensible way - I'm confused. What does 50,000 have to do with anything????
start over, state the problem from start to finish and supply supporting data such as I asked for above all in one place.
Nadi Hajeh, July 09, 2010 - 2:13 am UTC
the estimated rows to be deleted is 67,000
explain plan for
delete from radius.radacct a
where a.acctstarttime < to_date('27-nov-2009','dd-mon-yyyy')
--------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
--------------------------------------------------------------------
| 0 | DELETE STATEMENT | | 193K| 21M| 38283 |
| 1 | DELETE | RADACCT | | | |
| 2 | TABLE ACCESS FULL | RADACCT | 193K| 21M| 38283 |
--------------------------------------------------------------------
explain plan for
delete /*+ INDEX (a RADACCT_STRTIME_IND) */ from radius.radacct a
where a.acctstarttime < to_date('27-nov-2009','dd-mon-yyyy')
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
-----------------------------------------------------------------------------
| 0 | DELETE STATEMENT | | 193K| 21M| 82445 |
| 1 | DELETE | RADACCT | | | |
| 2 | INDEX RANGE SCAN | RADACCT_STRTIME_IND | 193K| 21M| 890 |
-----------------------------------------------------------------------------
July 09, 2010 - 8:26 am UTC
where do you see 67,000 please.
the estimate I see is 193K - 193,000. I don't see 67,000 *anywhere*, where are you seeing it?
and why is it all of a sudden 193k, what happened to 3150K????? A little *consistency* is mandatory - you are really confusing everything here - stick with ONE problem, start to finish, be consistent with the numbers and use numbers that actually exist.
Nadi Hajeh, July 09, 2010 - 11:08 am UTC
the output of the below query is 67k:
select count(1) from radius.radacct a
where a.acctstarttime < to_date('27-nov-2009','dd-mon-yyyy')
the table was last analyzed is today morning
July 09, 2010 - 4:51 pm UTC
Ok, the estimated card= values are way off - now we need to figure out why
how do you gather statistics, exact command.
My car won't start
Dana, July 09, 2010 - 2:43 pm UTC
Really confusing from the questioner's side, but Tom is asking for the right clarification.
This thread is a moving target that will not be satisfactorily explained or resolved without tighter focus and detail.
This is one of the most valuable lessons I take away from the site. Be absolutely sure everyone is on the same page (of the same book). Good work Tom.
Nadi Hajeh, July 10, 2010 - 2:45 am UTC
exec DBMS_STATS.GATHER_TABLE_STATS (ownname => 'RADIUS', tabname => 'RADACCT', estimate_percent => 70, cascade => true);
July 19, 2010 - 8:49 am UTC
see below for next persons comments, I have the same question.
Is that column "skewed"
Can you try method_opt => 'for column thatcolumn size 254'
And...
Dana, July 10, 2010 - 12:16 pm UTC
What is the data skew for acctstarttime? And did dbms_stats default to cascade? And did dbms_stats gather a histogram on acctstarttime? And was no_invalidate=false?
These are all factors that may change the optimizer process.
Personal opinion here. If I can I do full compute, no cascade, separate full index compute, and columns size 1 for all columns unless the predicate column is a candidate for histograms. That way I know the optimizer has the maximum information to work with and is most likely going to get the best answer. I compare that with the plan generated "normally". Then I can say with some validity that oracle is working, I'm just not giving it enough information. I can also use the "ideal" optimizer plan as a goal. Otherwise tuning continues forever.