Unfortunately, I cannot provide a working example because this involves a table of 120 million rows in 10 partitions. When specifying an 11-month date range over this data, a partitioned index is used, which causes the query to time out. A hint for a full table scan causes it to finish in 10 minutes. Shorter date ranges get the full table scan on their own. All the forums I've posted on indicate this is usually a sign that the optimizer is making a bad decision based on the statistics. The statistics are fresh after the last data load. Do you have any kind of methodology to track down why the optimizer would be making a bad choice? We rebuilt the index, recomputed stats, and the optimizer used the index scan (the bad plan) even MORE often, under more circumstances.
The optimizer makes the wrong choice not because statistics are "bad", but because estimating cardinalities can be *very* difficult.
Consider a situation like this - you have a table that has a person's month of birth. This table also has their zodiac sign.
If you gather statistics - the optimizer will know very well that 1/12th of the people in that table were born in December (assuming a uniform distribution of course, which we will).
If you gather statistics - the optimizer will know very well that 1/12th of the people in that table are Pisces (assuming that same uniform distribution).
Now, assume there are 10,000,000 rows in this table.
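Just to make this concrete - here is a minimal sketch of that thought experiment. The table name PEOPLE and the column definitions are made up for illustration:

create table people
( person_id      number,
  month_of_birth varchar2(3),   -- 'jan' .. 'dec'
  zodiac_sign    varchar2(11)   -- 'Aries' .. 'Pisces'
);

-- imagine this loaded with 10,000,000 rows, where zodiac_sign is
-- (as in real life) completely determined by month_of_birth

begin
   dbms_stats.gather_table_stats
   ( ownname    => user,
     tabname    => 'PEOPLE',
     method_opt => 'for all columns size 1'  -- basic column stats, no histograms
   );
end;
/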
where month_of_birth = 'dec' will be estimated to return about 833,333 records.
where zodiac_sign = 'Pisces' will be estimated to return about 833,333 records.
What will "where month_of_birth = 'dec' and zodiac_sign = 'Pisces'" be estimated at?
Well, the optimizer knows A LOT about month_of_birth, and it knows A LOT about zodiac_sign. It knows (in 10g and before, and in 11g by default) NOTHING about the two together.
So, it will use basic statistics, assume that the two predicates are independent of each other, and estimate that 1/12th * 1/12th of the data - 10,000,000 * 1/144, about 69,444, call it 70,000 records - will be returned.
Now, if the optimizer estimates 70,000 records and you've partitioned the data by month of birth (for example), that is a large fraction of the records in that partition - roughly 70,000 out of 833,333. The optimizer is probably going to full scan - not use an index.
But you know and I know that there are really 0 records because all of us Pisces were born in Feb or Mar - not Dec. So, we get the wrong plan.
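If you wanted to see that for yourself, something along these lines against the hypothetical PEOPLE table above would show it (explain plan is fine for a controlled test with literals; for your real query you'd want the plan from v$sql - see below):

explain plan for
select *
  from people
 where month_of_birth = 'dec'
   and zodiac_sign    = 'Pisces';

select * from table( dbms_xplan.display );

-- the Rows (cardinality) column will show about 69,444
-- (10,000,000 * 1/12 * 1/12) - but the query actually returns 0 rows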
At a high level - that is what is going on here. There is an incorrect cardinality estimate happening.
In current releases, we have tons of ways to diagnose this and correct for it. The SQL Monitor in 11g displays the estimated versus actual cardinalities for a query - allowing you to quickly see "where did it go wrong". And then you have many tools to attempt to correct it - from SQL Profiles to extended statistics (in 11g you can gather statistics on month_of_birth AND zodiac_sign *together*; we'd then know that there are no such records and estimate the cardinality to be 1 - not 70k).
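In 11g, creating that column group is just a couple of calls - something like this against the hypothetical PEOPLE table (dbms_stats.create_extended_stats is the real API; the variable name is my example):

declare
   l_extension_name varchar2(30);
begin
   -- tell the optimizer to track the two columns *together*
   l_extension_name := dbms_stats.create_extended_stats
                       ( ownname   => user,
                         tabname   => 'PEOPLE',
                         extension => '(month_of_birth, zodiac_sign)' );

   -- re-gather so the column group statistics get populated
   dbms_stats.gather_table_stats( user, 'PEOPLE' );
end;
/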
In 9i, you have to do a lot of the work yourself and your toolset for fixing it is much smaller.
What you'd need to do is get the plan with the estimated cardinalities. Get it from v$sql, not from explain plan - dbms_xplan can be used to extract it from v$sql (if you have access to my book Effective Oracle by Design, I show exactly how to do this; it is pretty easy - a sketch follows this list). Then, you would either need to
a) use your knowledge of what the REAL cardinalities are
b) get a tkprof that shows you the real cardinalities from executing the query
to figure out where the estimate is going bad. And then - apply some corrective action to 'fix' it.
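In 9i, the v$sql extraction goes roughly like this - this is my paraphrase of the technique, not the verbatim recipe from the book (check the book for the exact column list your release wants):

-- a view that makes v$sql_plan look like a PLAN_TABLE
create or replace view dynamic_plan_table
as
select rawtohex(address) || '_' || child_number statement_id,
       sysdate timestamp, operation, options, object_node,
       object_owner, object_name, 0 object_instance,
       optimizer, search_columns, id, parent_id, position,
       cost, cardinality, bytes, other_tag, partition_start,
       partition_stop, partition_id, other, distribution,
       cpu_cost, io_cost, temp_space, access_predicates,
       filter_predicates
  from v$sql_plan;

-- find the statement id of your query ...
select rawtohex(address) || '_' || child_number stmt_id
  from v$sql
 where sql_text like 'select ... your query text ...%';

-- ... and format its plan, estimated cardinalities included
select plan_table_output
  from table( dbms_xplan.display( 'dynamic_plan_table', '&stmt_id', 'serial' ) );

And for (b), the real cardinalities via tkprof:

alter session set sql_trace = true;
-- run the query to completion here, then:
alter session set sql_trace = false;
-- at the OS prompt (the trace file name will vary):
-- $ tkprof ora_12345.trc report.txt
-- the Rows column in the report shows the ACTUAL row counts per step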
Perhaps that means getting histograms on that column.
Perhaps that means getting local partition statistics instead of global.
Perhaps that means collecting global statistics instead of letting us roll them up (prior to 11g, the number of distinct values in the default rolled-up global statistics could be way off - which would throw off cardinality estimates in a big way).
Perhaps that means using dynamic sampling (sketches of these corrective actions follow below):
http://www.oracle.com/technetwork/issue-archive/2009/09-jan/o19asktom-086775.html
(dynamic sampling works best if you don't use binds - and since this is a long-running query anyway, it would be OK not to bind in this case)
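For illustration, here is roughly what each of those corrective actions looks like against the hypothetical PEOPLE table (pretend it is partitioned for the granularity examples; the parameter values are just examples - tune them to your data):

-- histograms on a skewed/correlated column
begin
   dbms_stats.gather_table_stats
   ( ownname    => user,
     tabname    => 'PEOPLE',
     method_opt => 'for columns zodiac_sign size 254' );
end;
/

-- local, partition-level statistics
begin
   dbms_stats.gather_table_stats
   ( ownname     => user,
     tabname     => 'PEOPLE',
     granularity => 'PARTITION' );
end;
/

-- true global statistics (instead of rolled-up ones)
begin
   dbms_stats.gather_table_stats
   ( ownname     => user,
     tabname     => 'PEOPLE',
     granularity => 'GLOBAL' );
end;
/

-- dynamic sampling, per query, via a hint
select /*+ dynamic_sampling(p 4) */ count(*)
  from people p
 where month_of_birth = 'dec'
   and zodiac_sign    = 'Pisces';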
But it will always go back to "estimated versus actual" cardinalities when tracking these things down.