Home>Question Details



-- Thanks for the question regarding "histogram", version 8.1.5

Submitted on 27-May-2001 11:42 Central time zone
Last updated 7-Mar-2009 12:15

You Asked

Tom

Can you give a simple example of how to create a histogram, to see the performance of a 
query?

 

and we said...

Sure, run a script like this:

create table t
as
select 1 X, rpad('*',4000,'*') data from all_objects a
/
insert into t
select 2, rpad('*',4000,'*') from all_objects a where rownum = 1
/
create index t_idx on t(x)
/

alter session set sql_trace=true;

select count(data) from t t1 where x = 1;
select count(data) from t t1 where x = 2;

analyze table t compute statistics
for table
for all indexes
for all indexed columns;

select count(data) from t t2 where x = 1;
select count(data) from t t2 where x = 2;



(the for all indexed columns got the histograms for us on the indexed column X)...

Now the tkprof shows (your mileage WILL vary on this)

select count(data)
from
 t t1 where x = 1


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      4.46      29.99      30847      30864          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      4.46      29.99      30847      30864          0           1

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 43984

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE
  30802   TABLE ACCESS BY INDEX ROWID T
  30803    INDEX RANGE SCAN (object id 123779)


select count(data)
from
 t t1 where x = 2


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.00       0.00          1          3          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.00       0.00          1          3          0           1

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 43984

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE
      1   TABLE ACCESS BY INDEX ROWID T
      2    INDEX RANGE SCAN (object id 123779)


that shows the optimizer (without any stats at this point) used the index.  For "x=1" 
this was a very bad idea  - we were reading almost every row already (all but 1) much 
faster to full scan.  the optimizer did not realize that the indexed column was heavily 
skewed.  so we ran stats and then:

select count(data)
from
 t t2 where x = 1


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.01          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      2.03      11.06      22744      30807         52           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      2.03      11.07      22744      30807         52           1

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 43984

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE
  30802   TABLE ACCESS FULL T

********************************************************************************

Ahh, thats much better.  1/2 the cpu time and much better on the elapsed time.  as 
well....  Note that an index will still be used when appropriate:

select count(data)
from
 t t2 where x = 2


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.00       0.00          0          3          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.01       0.00          0          3          0           1

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 43984

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE
      1   TABLE ACCESS BY INDEX ROWID T
      2    INDEX RANGE SCAN (object id 123779)


follow up to comment one

the optimizer would use heuristics (rules) to see if it wanted to use the index in 
general or not and would come up with ONE plan that both queries would use.  You would 
obviate the usefulness of histograms in this case.  

this is not a reason to NOT USE bind variabes -- it simply points out one time where you 
might consider NOT using them
 

Reviews    
3 stars   May 27, 2001 - 11pm Central time zone
Reviewer: A reader 
What the CBO will do if the bind variable used for the x? 


3 stars Data Skewness   December 3, 2003 - 5pm Central time zone
Reviewer: A reader from NJ, USA
Tom,

What is the best way to get the data skewness. For example, assume that we have an index on a table 
on column COL1. If it uses indexes for certain values and does a full-table scan for other values, 
then the approach would be :
1. Check the data skewness. Is that correct?
2. If the column in question is a integer column, how do we do we do that
3. If the column in question is a non-integer column, how do we do that

The reason I am asking this question is that when I generate a histogram for an integer column, I 
am able to use the ENDPOINT_NUMBER and ENDPOINT_VALUE to determine the data distribution by finding 
the difference in ENDPOINT_NUMBER between the previous and current row (after we sort on 
ENDPOINT_NUMBER). 
However, for non-numeric columns, only ENDPOINT_NUMBER column shows the numbers. The ENDPOINT_VALUE 
shows some cryptic number and the ENDPOINT_ACTUAL_VALUE is NULL. 

I request you to provide me a methodology to determine data skewness in general (both for numeric 
and non-numeric columns). The text books merely state that we need to check the data skewness, but 
does not provide an example or way to determine that. An example from you in this regard will be of 
great help
Also, please provide me an example and a clear explanation of the SIZE clause in ANALYZE command 
because I am not able to find clear explanation in Oracle documentation or in any performance 
tuning books I read so far. 


Followup   December 3, 2003 - 7pm Central time zone:

select column, count(*) from t group by column;

look at the results, or maybe

select min(cnt), max(cnt), avg(cnt), stddev(cnt)
  from ( select column, count(*) cnt from t group by column )


to see if there is skewness -- min far from max indicates skewed. 

5 stars what is the best way to get skewness   February 20, 2004 - 12pm Central time zone
Reviewer: reader 
Tom, in 9i is WIDTH_BUCKET function useful to generate info about skewed data? If so, any example 
you have? Thanks. 


Followup   February 20, 2004 - 1pm Central time zone:

it is a tool you can use to measure "skewedness".   for example:


ops$tkyte@ORA920PC> select min(object_id), max(object_id), count(object_id), wb
  2    from (
  3  select object_id,
  4         width_bucket( object_id,
  5                       (select min(object_id)-1 from all_objects),
  6                       (select max(object_id)+1 from all_objects), 5 ) wb
  7    from all_objects
  8         )
  9   group by wb
 10  /
 
MIN(OBJECT_ID) MAX(OBJECT_ID) COUNT(OBJECT_ID)         WB
-------------- -------------- ---------------- ----------
             3           8639             7972          1
          8640          17277             8638          2
         17278          25915             8619          3
         25916          34528             4751          4
         35159          43191              238          5
 

shows that if I bucketize into 5 buckets, most of my data is "low" 1,2,3.  object_ids over 26k are 
"few and far between"

it is an analysis tool you can use to see what you have. 

4 stars Can you please explain above ,little detail for my tiny brain   February 21, 2004 - 3am Central time zone
Reviewer: A reader 
"
shows that if I bucketize into 5 buckets, most of my data is "low" 1,2,3.  
object_ids over 26k are "few and far between"
"
I did not understand what is low 1,2,3. where how to see it is low.
Similarly where/how to see
object_ids over 26k are "few and far between". few must be the count * 


Followup   February 21, 2004 - 11am Central time zone:

ops$tkyte@ORA920PC> select min(object_id), max(object_id), count(object_id), wb
  2    from (
  3  select object_id,
  4         width_bucket( object_id,
  5                       (select min(object_id)-1 from all_objects),
  6                       (select max(object_id)+1 from all_objects), 5 ) wb
  7    from all_objects
  8         )
  9   group by wb
 10  /

I took the range:

low object_id-1 .. hi object_id + 1

and broken it into 5 equi-sized buckets (well, I didn't, the database did).  In this example, that 
was the range

2..43191.

Now those ranges are roughly the min and max object ids in the following report:

MIN(OBJECT_ID) MAX(OBJECT_ID) COUNT(OBJECT_ID)         WB
-------------- -------------- ---------------- ----------
             3           8639             7972          1
          8640          17277             8638          2
         17278          25915             8619          3
         25916          34528             4751          4
         35159          43191              238          5
 

I say roughly cause I do not have a continous set here -- 34529 for example "has gone missing" -- 
not in the set.  

So, now that we have those five ranges, we assign a bucket to each row in the result set based on 
which range they fall into.  

As you can see, range 1, 2, 3 have about 8k members each.  Most of my data is "low".  Most of my 
data has an object_id less than 26k.

When you look at buckets 4 and 5 here -- there is only 5k rows between them. 

Hence, the data is skewed,  most of it resides in the first 3 buckets.

do with that information "what you will" (eg: if those facts are un-interesting to you -- that is 
OK.  this function is useful to someone analyzing the data -- to them, this would be fascinating 
perhaps) 

4 stars Histogram   February 21, 2004 - 2pm Central time zone
Reviewer: Peter Rhoades from Berkeley California
Good example.  However, our DBA runs a script that creates Histograms on EVERY column that's in 
EVERY index in EVERY table--whether it makes sense to have the Histogram stats or not.  I think 
that's crazy but he says it "doesn't hurt."  If this is not a new thread, what do you think?  Can 
too many, needless histograms mess things up? 


Followup   February 21, 2004 - 2pm Central time zone:

I typically start with the equivalent of "table", "index" and "all indexed columns" myself.  If you 
have the cycles to compute the histograms, they can be helpful (with bind variable peeking and 
such).  

Could you skip most/many of them? probably.
Would it change your systems?  stats would take a little less time/resources.

I'm not opposed to what they are doing, it does more work then you need in theory.  they might 
consider "auto" or "skewonly" in 9i. 

5 stars what about bind variable peeking ?   February 21, 2004 - 3pm Central time zone
Reviewer: Alberto Dell'Era from Milan, Italy
Imagine that a column has skewed values. 

As an extreme case, all values are unique except one, say 0, which occurs n times, where n is big 
enough to make a fts cheaper than an index scan (we have an index on this column). 

What if the app is using bind variables (and so i assume it's an OLTP application) and the app 
developers thought "well, we don't compute histograms, so the plan will always be the same, and 
will be an index scan. That's ok, since we bind 0 only in very very rare occasions, and we can 
afford to pay for a (very) suboptimal index scan in this very very rare occasion."

Now, the aforementioned DBA builds histograms since "they don't hurt", and immediately after 0 is 
bound ... voila', full table scans forever (ok, until the death of the shared sql stmt in the 
shared pool).

Isn't this an example of "hurting histograms" ?
Wouldn't it be better to avoid collecting them, if the app designers didn't ask for them ?

Alberto



 

 

  


Followup   February 21, 2004 - 4pm Central time zone:

the aforementioned DBA would have been building histograms since the beginning of time and you 
would have caught this in test/development since you always test/develop on real live sized result 
sets (or you spend lots of money on coffee and soda right before/during a rollout -- because you 
won't be getting any sleep)

I'll have to assume that what is "standard operating procedure" in production is also in test/dev 
-- else, well, you have bigger things to worry about than this.


My point is -- "if this is what they always do, from the get go, I don't see anything inheritly 
evil in it"

Sure, anytime you walk into a production system and say "Hey, we are going to make this change" and 
just do it -- you stand a risk of "something going wrong".

The app designers in my experience don't even know what a histogram is.  They would not know to ask 
for that which they do not know about.

 

5 stars understood   February 21, 2004 - 5pm Central time zone
Reviewer: Alberto Dell'Era from Milan, Italy
Thanks for your prompt answer. Well, I assumed the worst scenario since "it doesn't hurt" is 
exactly the same phrase that a DBA said during a nightly rollout of mine, before making a change to 
the prod db, and the Day After [pun intended] we experienced an incredible loss of performance ...

Got a Pavlovian Reflex to that phrase. 

thanks
Alberto

PS What about automatically removing trailing empty lines from reviews/questions posts, like mine 
before ? I normally pay attention and get rid of them myself but sometimes I overlook them ...   


3 stars why it's not good to generate histograms always   February 21, 2004 - 9pm Central time zone
Reviewer: A reader 
Hi

Recently we generated histograms for our application and most queries was deadly slow, when we got 
rid of histograms performance went back again except some queries.

My question is why queries run slow with histograms? Arent we giving more information to the 
optimizer in order to get a better plan? It seems the other way round, the more information you 
give the worse CBO performs! 


Followup   February 22, 2004 - 9am Central time zone:

it depends -- and it really depends on your system characteristics.

Many times people have OLTP systems that have "open ended queries" that they fetch the first 10, 
100 or so rows from and never get to the bottom.  These (unless hinted as "first rows" or run on a 
system where thought was given to the optimizer_index_* init.ora parameters) can perform poorly in 
BOTH cases "apparently" (apparently perform poorly, but not really since the optimizer will 
optimize for throughput in such a system -- not initial response time.  give it more info and it'll 
come up with plans that better optimize throughput response times at the expense of initial 
responses times.  tell it first rows and it'll change that.)

But the bottom line here was -- you introduced "a rather large fundemental change into the system" 
and would have to expect some ramifications.

Do this -- develop a system from day one with histograms generated.

Then, take them away

And then report back "what happened".  You will find pretty much any major league "change in the 
way we do business" is going to have such an effect.  For example -- in the above thread with 
regards to the DBA that always gets table/index/column stats -- if you want to kill that system, 
stop gathering histograms and see what happens. 

5 stars you said   February 22, 2004 - 4am Central time zone
Reviewer: A reader 
hi

You said that DBA should always generate histograms deelopment to catch bind variable peeking 
effects. 

I always thought we should not generate histograms always and identify those columns with skewed 
data with analysist?!?!

As someone else already said, generating histograms always is deadly, I got burnt as well 


Followup   February 22, 2004 - 9am Central time zone:

no, no I did not.

where did I say "dba should always generate histograms"

generating histograms "as a new operating procedure after the code is in place" is what is known as 
a "bad idea (tm)"

NOT generating histograms "as a new operating proceudre after the code is in place" is what is 
knowns as a "bad idea(tm)"


what I said was "do what you do, but do what you do consistently.  don't use one method in test/dev 
and another DIFFERENT method.   You most likely got burnt either

o testing in a database with little or no data 
o testing in a database with stats gathered one way and a different way in prod

else, you could not "get burned" as you would have discovered this stuff right from the get go in 
test/dev.


What I propose is "consistency". 

5 stars histogram's effect on performance   March 2, 2004 - 10pm Central time zone
Reviewer: Ryan Gaffuri from Tysons Corner, VA
there is a debate on oracle-l about excessive use of histogram's negatively effecting performance. 
Carrie Milsap posted the following:

"I don't have access to the data to back this up, but I have seen PARSE call response time move 
from over a minute to less than a second by eliminating histograms from the equation. I think when 
I last saw this, it was an Oracle8 system."

Do you have any comments on this? How often do you recommend gathering performance? I have been 
using 'FOR ALL COLUMNS' by default recently with no adverse affects on performance.
 


Followup   March 3, 2004 - 9am Central time zone:

and it could go the other way as well.  histograms are just *data*.  in the above, i would have to 
guess that histograms caused the optimizer to consider "more stuff", more join paths perhaps (but 
it is hard to say, it is really rare to have parse times of a minute).

the big thing with histograms many times is the amount of resources they take to compute in the 
first place.  I start with "none" or "for all indexed columns" -- the key thing is to be consistent 
(in dev, test, and prod).  In order to avoid surprises. 

3 stars Why HISTOGRAM?   March 23, 2004 - 8pm Central time zone
Reviewer: Jayesh from India
Can you please explain why should we go for histogram? What is real purpose of using histograms? 
What will be high and low value when using a varchar column?
 


4 stars stored procedure and histograms   July 11, 2004 - 10am Central time zone
Reviewer: Nikhil G Mishra from Hyderabad, India
Tom,
  
    All the variables in queries within stored procedures, are bind variables.
    Does that mean , histograms cant help in cases we have a poorly performing query in the stored 
procedure involving highly skewed values ? Because with bind variables, the CBO will generate a 
"fit all" explain plan based on its best guess estimates ?

nikhil 


Followup   July 11, 2004 - 10am Central time zone:

it'll use bind variable peeking in 9i -- to determine the best fit plan.

describe to me the nature of a stored procedure that should sometimes full scan and sometimes not?  
frequency of run? who runs it? how does that fit in to the system (if interactive, it would give 
wildly different response times -- end users would deem it "unreliable"). 

4 stars height and width based   July 12, 2004 - 8am Central time zone
Reviewer: pooja from India
hi tom,
  this is very useful and now im able to do histograms 
 but i have a little bit confusion about height and width based histograms. please explain me with 
a simple example.

In CBO, the statistics are calculated based on the predicate in the SQL statement and on the IO, 
MEMORY and CPU reads..but how it will calculate statistics...
please expalin this also 


Followup   July 12, 2004 - 11am Central time zone:

I'll recommend to you the performance guide, a freely available document part of the Oracle 
documentation set..... 

3 stars Variable response times   July 12, 2004 - 7pm Central time zone
Reviewer: Matt from Australia
Above at: 

http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:707586567563#20833867221653
You say:

"describe to me the nature of a stored procedure that should sometimes full scan 
and sometimes not?  frequency of run? who runs it? how does that fit in to the 
system (if interactive, it would give wildly different response times -- end 
users would deem it "unreliable")."

We have a PL/SQL batch process which generates a report on data owned by particular users of the 
system. This data is highly skewed. 99.5% of the data is owned by one user the remaining .5% is 
split between 5 other users.

For some reason the whole report always worked fine when accessing generating data for the .5%, 
however, the 99.5% was extremely slow. 

This report is run rarely (it was a report generated at the end of a data migration process to 
allow user to verify their data). It was manually run. However, I can see an upcoming requirement 
to run this report more often - and it's performance is "unreliable".

What (general - I'll post the detail another time when you are accepting questions) options for 
tuning this report were available to me?

In the end I 10046 traced the report and noted lots of index range scans and nested loop joins. I 
verified the stats (and histograms) - (dbms_stats "for all columns size auto" "cascade = true"). 
However, the optimizer failed to pick a full scan. Due to time pressure I've slotted lots of 
"full(t)" hints in the query and improved the reports execution time from 4+ hours to 15 minutes - 
for all cases of data.

Obviously you would need more information to comment, but what other approaches are available to 
get the optimizer to do what it is supposed to?....yes I will post this when I see the accepting 
questions button.

Best Regards, 






  


Followup   July 12, 2004 - 11pm Central time zone:

a report that is run rarely against known skewed data can certainly use "no binds".

dynamic sql via ref cursors might be the answer. 

3 stars Thanks   July 13, 2004 - 3am Central time zone
Reviewer: Matt from Australia
The report uses dynamic sql and ref cursors. Currently the skewed data columns are not literals, 
though. 

I will give this approach a  whirl and see whether I get a good a run time. 


3 stars How intellinget is CBO now   July 28, 2004 - 9pm Central time zone
Reviewer: A reader 
Hi Tom, one hard question,
I had been trying to test it,
but I didn't get it, so I ask you.

Now, the CBO, ONLY use histogram info, when he have not binded variables like ='ADM'
or there is some exception that allows you for example i don't know in the soft parse to take 
advantage of histogram info.

Thanks Tom 


Followup   July 29, 2004 - 7am Central time zone:

that is not true, there is bind variable peeking going on at hard parse time too (search for 
peeking on this site to read more)


during the soft parse, the plan is in place -- literal or not - the histograms would not be 
consulted. 

5 stars   July 29, 2004 - 10am Central time zone
Reviewer: A reader 
Thanks Tom
From
OracleWorld Copenhagen, Database Track
BIND VARIABLES AND CURSOR SHARING – NEW DIRECTIONS IN
ORACLE9I
Bjørn Engsig, Miracle A/S

"BIND VARIABLE PEEKING IN ORACLE9I In Oracle9i, the optimizer will peek at actual values of bind 
variables if these are available. By doing this, the optimizer can use generated table/index- or 
column-level statistics when bind variables are used. This will benefit both applications that 
themselves are written to use bind variables, and applications that have had literals replaced by 
bind variables using the cursor_sharing parameter. The tests that we have conducted show that the 
bind variable peeking works as expected: the optimizer will make its decision based on actual 
values of bind variables. However, once an execution plan is generated by the optimizer, no further 
peeking into bind variables will take place; neither from the same session or from a different 
session parsing the same SQL statement. In both cases, the same SQL statement and associated 
execution plan is found in the library cache." 


Followup   July 29, 2004 - 1pm Central time zone:

that is correct -- that is in line with what I said.  during hard parse, they would be used, during 
soft parse -- not. 

5 stars How to get literal from endpoint_value   July 30, 2004 - 2pm Central time zone
Reviewer: A reader 
Hi Tom, please
1) Could you please tell me how to get the values from dba_histograms  view,
 for example  instead of 4.12433214 to get 'NY'
something more friendly to analyze (why ENDPOINT_ACTUAL_VALUE column is not always filled)
2) Do you agree with oracle documentation when it says to use size auto
I think is something like this
DBMS_STATS.GATHER_SCHEMA_STATS( OWNNAME=>'ADM', ESTIMATE_PERCENT=>100,METHOD_OPT=>'FOR ALL COLUMNS 
SIZE AUTO ');
3) Which is the real different between
DBMS_STATS.GATHER_SCHEMA_STATS( OWNNAME=>'ADM', ESTIMATE_PERCENT=>100,METHOD_OPT=>'FOR ALL COLUMNS 
SIZE AUTO ');
and 
DBMS_STATS.GATHER_SCHEMA_STATS( OWNNAME=>'ADM', ESTIMATE_PERCENT=>100,METHOD_OPT=>'FOR ALL COLUMNS 
SIZE SKEWONLY ');
4) How I know the column is heigh or value based histogram

Thanks Tom 


Followup   July 30, 2004 - 6pm Central time zone:

1) you'll want to search for:

endpoint_value endpoint_actual_value

on this site, but:


ops$tkyte@ORA9IR2> create table t ( x varchar2(26) );
 
Table created.
 
ops$tkyte@ORA9IR2> insert into t values ( 'ABCDEFGHIJKLMNOPQRSTUVWXYZ' );
 
1 row created.
 
ops$tkyte@ORA9IR2> insert into t values ( 'XXXX' );
 
1 row created.
 
ops$tkyte@ORA9IR2> insert into t values ( 'YYYY' );
 
1 row created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create or replace function hexstr( p_number in number ) return varchar2
  2  as
  3          l_str long := to_char(p_number,'fm'||rpad('x',50,'x'));
  4          l_return varchar2(4000);
  5  begin
  6          while ( l_str is not null )
  7          loop
  8                  l_return := l_return || chr(to_number(substr(l_str,1,2),'xx'));
  9                  l_str := substr( l_str, 3 );
 10          end loop;
 11          return l_return;
 12  end;
 13  /
 
Function created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> analyze table t compute statistics for all columns;
 
Table analyzed.
 
ops$tkyte@ORA9IR2> column column_name format a10
ops$tkyte@ORA9IR2> column ENDPOINT_ACTUAL_VALUE format a20
ops$tkyte@ORA9IR2> select endpoint_number, endpoint_value, endpoint_actual_value,
  2         to_char(endpoint_value,rpad('x',32,'x') ),
  3         substr( hexstr( endpoint_value ), 1, 30 ) ev
  4  from dba_tab_histograms where table_name = 'T' and owner = user;
 
ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_ACTUAL_VALU
--------------- -------------- --------------------
TO_CHAR(ENDPOINT_VALUE,RPAD('X',3 EV
--------------------------------- ------------------------------
              0     3.3884E+35
   41424344454657fd6de9c0ccc00000 ABCDEFW�m���
 
              1     4.6393E+35
   59595958ffffefab8e7525d2400000 YYYX��龜u%�@



it is not as simple as "a low bound"....


2) I'm thinking you don't need histograms too often and given how expensive they are to compute -- 
for all columns is hugely expensive, so I might be looking at doing it just for the columns deemed 
necessary.


3) search for skewonly, we've written about this before.

4) depends on the cardinality -- searching for the terms in #1 will lead you to some discussions on 
that. 

5 stars   July 30, 2004 - 6pm Central time zone
Reviewer: A reader 
Thanks Tom 


4 stars endpoint_actual_value   August 17, 2004 - 10pm Central time zone
Reviewer: A reader 
I gathered histograms on a table with

method_opt=>'for all columns size 254'.

When I query *_tab_histograms for a specific column in that table, I get my 254 rows, but the 
ENDPOINT_ACTUAL_VALUE for all of them is null. 

Why is this? This is throwing off the cardinality estimates for queries involving where column like 
'blah%'

Thanks 


4 stars endpoint_actual_value   August 17, 2004 - 11pm Central time zone
Reviewer: A reader 
OK the endpoint_value (the number in scientific notation) is probably the hex representation of my 
string or something.

anyway, i am trying to understand the following

select * from table
where col like 'A%' or col like 'B%'

col has a UNIQUE index, it has histograms using 254 buckets, the no. of distinct values of col is 
20000

The above query works fine, the cardinality estimate for each of the CONCATENATION steps is close 
to reality.

But when I add another OR predicate like

select * from table
where col like 'A%' or col like 'B%' or col like 'C%'

the query tanks, the plan shows a CONCATENATION of 3 INDEX RANGE SCANS, each with a cardinality of 
ONE! The total card for the query is 12 but the query returns 2000 rows.

I tried to create a simple, concise test case for you

create table t (i varchar2(6) unique);

insert into t select dbms_random.string('x',6) from all_objects where rownum<=20000;

exec dbms_stats.gather_table_stats(null,'t',method_opt=>'for all columns size 254')

select * from t
where i like 'A%' or i like 'B%' or i like 'C%'

I guess since the table is so skinny, it chooses a FTS for this instead of the 3 individual range 
scans and a concatenation, so I am unable to reproduce what I am seeing in my actual table as I 
mention above.

Help? Thanks
 


Followup   August 18, 2004 - 7am Central time zone:

when you have more values than buckets (254 in your case), you have to expect "guestimation" to 
kick in -- you cannot represent the actual cardinality in the data dictionary.  you get ranges and 
within each range -- you get estimates.

throw in a "like" or two and you really guess. (your C values could be crossing many "buckets" 
there)

but why would 2000 rows out of 20,000 via an index be necessarily "a bad thing (tm)"?


and are you really sure you want histograms on every column?  especially columns with lots and lots 
of distinct values like this one?
 

4 stars Histograms   August 18, 2004 - 10am Central time zone
Reviewer: A reader 
<quote>
but why would 2000 rows out of 20,000 via an index be necessarily "a bad thing (tm)"?
<quote>

Its not bad, the query performs fine

select acctno,blah,blah from <view> 
where acctno like 'A%' or acctno like 'B%' or acctno like 'C%'

performs great. Returns 2000 rows from possible 20000. Fine.

As soon as I change this to

select acctno,blah,blah from <view> 
where acctno like 'A%' or acctno like 'B%' or acctno like 'C%'
order by acctno

it tanks (I posted this in another review recently also, sorry, maybe we can continue this 
discussion here)


"and are you really sure you want histograms on every column?  especially columns with lots and 
lots of distinct values like this one?"

OK I just deleted stats on the 3 tables involved in the view and re-gathered them using dbms_stats 
'for all indexed columns' and now the query is great even with the order by!

So, seems like the histograms on the UK (acctno) were actually hurting the performance!

Questions

1. I didnt know that having histograms can actually be detrimental to queries? Does this make 
sense? I thought that , if anything, they can only help?

2. I noticed that the overall cardinality of the query with the good performance is 10, but the 
actual rows returned is 2000. 

The performance is good, so I dont really care, but if the CBO estimated card is so much off, is 
there a problem lurking out there that might adversely affect other queries using these tables?

Thanks 


Followup   August 18, 2004 - 10am Central time zone:

need to *see the plans*

for histograms it depends on the number of values and buckets.  when you have 254 buckets but 2000 
values -- well, you don't have a bucket/value so it can be mislead at that point.

 

4 stars Histograms   August 18, 2004 - 11am Central time zone
Reviewer: A reader 
Darn it, now no matter what I do, I cannot reproduce the "bad plan" that was resulting in the poor 
performance with the order by.

I deleted all stats on the 3 tables using

dbms_stats.delete_table_stats(null,'table_name')

re-gathered them  with the 254-bucket histograms and everything is still good!

This is what frustrates me sometimes about the CBO, everything is too magical and automatic, it is 
hard to reproduce stuff.

Oh well, I will keep an eye on it and see if the problem reoccurs.

Thanks 


3 stars Automagical parameters   October 29, 2004 - 9am Central time zone
Reviewer: Ronald from Fortress Hill, Hong Kong
Problem with CBO and gather_stats:

sql> @desc tshi
Elapsed: 00:00:00.00
Datatypes for Table tshi

                           Data           Data
Column Name                Type           Length     Nullable Dafault  COMMENTS
-------------------------- -------------- ---------- -------- -------- ---------
SHI_NUM                    NUMBER         (4,0)      not null
POL_NUM                    VARCHAR2       10         not null
REASN_CODE                 VARCHAR2       3          not null
ACCT_MNE_CD                VARCHAR2       8          not null
SUSP_DT                    DATE           7          not null
TRXN_CD                    VARCHAR2       8          not null
SUSP_AMT                   NUMBER         (13,2)     not null
CRCY_CODE                  VARCHAR2       2          not null
TRXN_ID                    VARCHAR2       15         null
UNDO_TRXN_ID               VARCHAR2       15         null
SUSP_REASN                 VARCHAR2       3          null
POL_STAT_CD                VARCHAR2       1          null
FREZ_CODE                  VARCHAR2       1          null
PD_TO_DT                   DATE           7          null
MODE_PREM                  NUMBER         (11,2)     null
PMT_MODE                   VARCHAR2       5          null
Elapsed: 00:00:00.01

Indexes on tshi
Index                                                        Is
Name                                                         Unique columns
------------------------------------------------------------ ------ 
--------------------------------
SHI_PK                                                       Yes    POL_NUM, SHI_NUM, SUSP_DT
SHI_IDX1                                                     No     TRXN_ID
I_SHI_POL_FK                                                 No     POL_NUM
Elapsed: 00:00:00.04

sql> select min(TRXN_ID), max(TRXN_ID), count(TRXN_ID), wb
  2  from ( select TRXN_ID,
  3  width_bucket( to_number(TRXN_ID), (select to_number(min(TRXN_ID)) from TSHI),
  4  (select to_number(max(TRXN_ID)) from TSHI), 10) wb
  5  from TSHI) group by wb ;

MIN(TRXN_ID)    MAX(TRXN_ID)    COUNT(TRXN_ID)         WB
--------------- --------------- -------------- ----------
001304030000001 020405190002259          26300          1
041304030000001 060405190002259          26300          3
061304030000001 080405190002259          26300          4
081304030000001 098405190002259          23670          5
200406011220485 200406011220486              2         11
021304030000001 040405190002259          26300          2
200206040000008 200406011220484        4623660         10
                                             0

8 rows selected.

Elapsed: 00:00:30.04

cas@chkp.world> select count(*) from tshi ;

  COUNT(*)
----------
   5290940

1 row selected.

SQL and autotrace plans:

sql> select *
from tshi
where undo_trxn_id is null
and trxn_id >= '200304150000003'
and pol_num = '2870000013'
order by trxn_id, shi_num ;

<<run forever>>

Plan 1:
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=7 Card=1 Bytes=100)
   1    0  SORT (ORDER BY) (Cost=7 Card=1 Bytes=100)
   2    1    TABLE ACCESS (BY INDEX ROWID) OF 'TSHI' (Cost=5 Card=1 Bytes=100)
   3    2      INDEX (RANGE SCAN) OF 'SHI_IDX1' (NON-UNIQUE) (Cost=3 Card=1)


Plan 2:
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=8 Card=1 Bytes=100)
   1    0   SORT (ORDER BY) (Cost=8 Card=1 Bytes=100)
   2    1     TABLE ACCESS (BY INDEX ROWID) OF 'TSHI' (Cost=6 Card=1 Bytes=100)
   3    2       INDEX (RANGE SCAN) OF 'SHI_PK' (UNIQUE) (Cost=3 Card=22)

Plan 3:
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=7 Card=1 Bytes=75)
   1    0   SORT (ORDER BY) (Cost=7 Card=1 Bytes=75)
   2    1     TABLE ACCESS (BY INDEX ROWID) OF 'TSHI' (Cost=5 Card=1 Bytes=75)
   3    2       INDEX (RANGE SCAN) OF 'I_SHI_POL_FK' (NON-UNIQUE) (Cost=3 Card=16)

   SHI_NUM POL_NUM    REA ACCT_MNE SUSP_DT            TRXN_CD    SUSP_AMT CR TRXN_ID
---------- ---------- --- -------- ------------------ -------- ---------- -- ---------------
UNDO_TRXN_ID    SUS P F PD_TO_DT            MODE_PREM PMT_M
--------------- --- - - ------------------ ---------- -----
       880 2870000013 301 SMMKKY   15-Apr-03 12:00    XOOREZ         1500 04 200
304150000003
                    6 0 04-Apr-03 12:00           750 01

       881 2870000013 108 SMMKKY   15-Apr-03 12:00    XOOREZ        -1500 04 200
304150000004
                    6 0 04-Apr-03 12:00           750 01


2 rows selected.

Elapsed: 00:00:00.46


Test results:
   (tshi /w)                     (autotrace plan)
1) no stats             ---->    Plan 3
2) analyze + estimate   ---->    Plan 3
3) analyze + compute    ---->    Plan 1
4) dbms_stats + estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE + method_opt => 'FOR ALL INDEXED 
COLUMNS SIZE AUTO'  ----> Plan 1
5) dbms_stats + method_opt => 'FOR ALL COLUMNS SIZE 5'   ----> Plan 2

Question:
1) Why the CBO tends to choose the bad index? (since from the predicate itself should a equal 
condition more selective than that of a range condition??? (very confusing to me.))
2) What is the correct (->less risk) bucket size?  seemed the automagical values is not doing good, 
should it be just 254 all the way for the sake of simplicity of it?
3) do you actually have a post here on how to search for answers, yes=there are almost all the 
answers here already, but the words are simply too common to locate what I need, want to know where 
magical keywords like “show_space”, “skewonly” from? 


Followup   October 29, 2004 - 10am Central time zone:

can we see a tkprof . with so few rows "runs forever" doesn't compute to me

how many buckets did "auto" actually come up with.

do you use binds in the "real world" 

2 stars Re: Automagical parameters   October 29, 2004 - 2pm Central time zone
Reviewer: Ronald from Fortress Hill, Hong Kong
the "run forever" is just it take so longer I never bother to wait till finish.

the "auto" produced 2 buckets

yes we did, but it makes the story longer,
with bind variable

the CBO seemed always always pick SHI_PK
why is it so???

and if c/200304150000003/100304150000003/
it magically picks I_SHI_POL_FK for different stats’ analyzed

I am very much wanted to know why???

does this has anything to do with “peeking”? (or halloween) 


1 stars Re: Re   November 1, 2004 - 12pm Central time zone
Reviewer: Ronald Chan from Fortress Hill, Hong Kong
The "auto" always produced 2 buckets

how to parameter gather_stats in order to safely help the optimizer make decisions. 


5 stars Histograms   November 2, 2004 - 4am Central time zone
Reviewer: Ronald from Fortress Hill, Hong Kong
Trace output for when the query use wrong index and run for 1:15 hour

ora_3376.trc:

Oracle9i Enterprise Edition Release 9.2.0.4.0 - 64bit Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.4.0 - Production
ORACLE_HOME = /test/oracle/product/9.2.0.4
System name:    HP-UX
Node name:      hkgxd09
Release:        B.11.00
Version:        U
Machine:        9000/800
Instance name: tun8
Redo thread mounted by this instance: 1
Oracle process number: 11
Unix process pid: 3376, image: oracle@hkgxd09 (TNS V1-V3)
*** 2004-11-02 11:14:58.848
*** SESSION ID:(17.1593) 2004-11-02 11:14:58.848
QUERY
SELECT * FROM TSHI
WHERE POL_NUM = '2870000013'
AND TRXN_ID >= '200304150000003'
AND UNDO_TRXN_ID IS NULL
ORDER BY TRXN_ID, SHI_NUM
***************************************
PARAMETERS USED BY THE OPTIMIZER
********************************
OPTIMIZER_FEATURES_ENABLE = 9.2.0
OPTIMIZER_MODE/GOAL = Choose
_OPTIMIZER_PERCENT_PARALLEL = 101
HASH_AREA_SIZE = 131072
HASH_JOIN_ENABLED = TRUE
HASH_MULTIBLOCK_IO_COUNT = 0
SORT_AREA_SIZE = 65536
OPTIMIZER_SEARCH_LIMIT = 5
PARTITION_VIEW_ENABLED = FALSE
_ALWAYS_STAR_TRANSFORMATION = FALSE
_B_TREE_BITMAP_PLANS = TRUE
STAR_TRANSFORMATION_ENABLED = TRUE
_COMPLEX_VIEW_MERGING = TRUE
_PUSH_JOIN_PREDICATE = TRUE
PARALLEL_BROADCAST_ENABLED = TRUE
OPTIMIZER_MAX_PERMUTATIONS = 2000
OPTIMIZER_INDEX_CACHING = 0
_SYSTEM_INDEX_CACHING = 0
OPTIMIZER_INDEX_COST_ADJ = 100
OPTIMIZER_DYNAMIC_SAMPLING = 1
_OPTIMIZER_DYN_SMP_BLKS = 32
QUERY_REWRITE_ENABLED = FALSE
QUERY_REWRITE_INTEGRITY = ENFORCED
_INDEX_JOIN_ENABLED = TRUE
_SORT_ELIMINATION_COST_RATIO = 0
_OR_EXPAND_NVL_PREDICATE = TRUE
_NEW_INITIAL_JOIN_ORDERS = TRUE
ALWAYS_ANTI_JOIN = CHOOSE
ALWAYS_SEMI_JOIN = CHOOSE
_OPTIMIZER_MODE_FORCE = TRUE
_OPTIMIZER_UNDO_CHANGES = FALSE
_UNNEST_SUBQUERY = TRUE
_PUSH_JOIN_UNION_VIEW = TRUE
_FAST_FULL_SCAN_ENABLED = TRUE
_OPTIM_ENHANCE_NNULL_DETECTION = TRUE
_ORDERED_NESTED_LOOP = TRUE
_NESTED_LOOP_FUDGE = 100
_NO_OR_EXPANSION = FALSE
_QUERY_COST_REWRITE = TRUE
QUERY_REWRITE_EXPRESSION = TRUE
_IMPROVED_ROW_LENGTH_ENABLED = TRUE
_USE_NOSEGMENT_INDEXES = FALSE
_ENABLE_TYPE_DEP_SELECTIVITY = TRUE
_IMPROVED_OUTERJOIN_CARD = TRUE
_OPTIMIZER_ADJUST_FOR_NULLS = TRUE
_OPTIMIZER_CHOOSE_PERMUTATION = 0
_USE_COLUMN_STATS_FOR_FUNCTION = TRUE
_SUBQUERY_PRUNING_ENABLED = TRUE
_SUBQUERY_PRUNING_REDUCTION_FACTOR = 50
_SUBQUERY_PRUNING_COST_FACTOR = 20
_LIKE_WITH_BIND_AS_EQUALITY = FALSE
_TABLE_SCAN_COST_PLUS_ONE = TRUE
_SORTMERGE_INEQUALITY_JOIN_OFF = FALSE
_DEFAULT_NON_EQUALITY_SEL_CHECK = TRUE
_ONESIDE_COLSTAT_FOR_EQUIJOINS = TRUE
_OPTIMIZER_COST_MODEL = CHOOSE
_GSETS_ALWAYS_USE_TEMPTABLES = FALSE
DB_FILE_MULTIBLOCK_READ_COUNT = 32
_NEW_SORT_COST_ESTIMATE = TRUE
_GS_ANTI_SEMI_JOIN_ALLOWED = TRUE
_CPU_TO_IO = 0
_PRED_MOVE_AROUND = TRUE
***************************************
BASE STATISTICAL INFORMATION
***********************
Table stats    Table: TSHI   Alias: TSHI
  TOTAL ::  CDN: 5301581  NBLKS:  77544  AVG_ROW_LEN:  88
-- Index stats
  INDEX NAME: I_SHI_POL_FK  COL#: 2
    TOTAL ::  LVLS: 2   #LB: 18481  #DK: 324526  LB/K: 1  DB/K: 1  CLUF: 357471
  INDEX NAME: SHI_IDX1  COL#: 9
    TOTAL ::  LVLS: 2   #LB: 17940  #DK: 4595967  LB/K: 1  DB/K: 1  CLUF: 4400837
  INDEX NAME: SHI_PK  COL#: 2 1 5
    TOTAL ::  LVLS: 2   #LB: 27726  #DK: 5301581  LB/K: 1  DB/K: 1  CLUF: 405855
_OPTIMIZER_PERCENT_PARALLEL = 100
***************************************
SINGLE TABLE ACCESS PATH
Column:    POL_NUM  Col#: 2      Table: TSHI   Alias: TSHI
    NDV: 324526    NULLS: 0         DENS: 3.0814e-06
    NO HISTOGRAM: KT: 1 #VAL: 2
Column: UNDO_TRXN_  Col#: 10     Table: TSHI   Alias: TSHI
    NDV: 2916      NULLS: 5287899   DENS: 3.4294e-04
    NO HISTOGRAM: KT: 1 #VAL: 2
Column:    TRXN_ID  Col#: 9      Table: TSHI   Alias: TSHI
    NDV: 4595967   NULLS: 549056    DENS: 2.1758e-07
    NO HISTOGRAM: KT: 1 #VAL: 2
  TABLE: TSUSPENSE_HISTORIES     ORIG CDN: 5301581  ROUNDED CDN: 1  CMPTD CDN: 0
  Access path: tsc  Resc:  7459  Resp:  1865
  Access path: index (equal)
      Index: I_SHI_POL_FK
  TABLE: TSHI
      RSC_CPU: 0   RSC_IO: 5
  IX_SEL:  0.0000e+00  TB_SEL:  3.0814e-06
  Access path: index (scan)
      Index: SHI_IDX1
  TABLE: TSHI
      RSC_CPU: 0   RSC_IO: 4
  IX_SEL:  2.2177e-07  TB_SEL:  2.2177e-07
  Skip scan: ss-sel 0  andv 9989
    ss cost 9989
    index io scan cost 0
  Access path: index (scan)
      Index: SHI_PK
  TABLE: TSHI
      RSC_CPU: 0   RSC_IO: 5
  IX_SEL:  3.0814e-06  TB_SEL:  3.0814e-06
  Access path: index (equal)
      Index: I_SHI_POL_FK
  TABLE: TSHI
      RSC_CPU: 0   RSC_IO: 3
  IX_SEL:  3.0814e-06  TB_SEL:  3.0814e-06
  Access path: index (index-only)
      Index: SHI_IDX1
  TABLE: TSHI
      RSC_CPU: 0   RSC_IO: 3
  IX_SEL:  2.2177e-07  TB_SEL:  2.2177e-07
    SORT resource      Sort statistics
      Sort width:            3 Area size:       43008 Max Area size:       43008   Degree: 1
      Blocks to Sort:        1 Row size:           21 Rows:          2
      Initial runs:          1 Merge passes:        1 IO Cost / pass:          2
      Total IO sort cost: 2
      Total CPU sort cost: 0
      Total Temp space used: 0
  BEST_CST: 4.00  PATH: 4  Degree:  1
***************************************
OPTIMIZER STATISTICS AND COMPUTATIONS
***************************************
GENERAL PLANS
***********************
Join order[1]: TSHI [TSHI]
ORDER BY sort
    SORT resource      Sort statistics
      Sort width:            3 Area size:       43008 Max Area size:       43008   Degree: 1
      Blocks to Sort:        1 Row size:           93 Rows:          1
      Initial runs:          1 Merge passes:        1 IO Cost / pass:          2
      Total IO sort cost: 2
      Total CPU sort cost: 0
      Total Temp space used: 0
Best so far: TABLE#: 0  CST:          6  CDN:          1  BYTES:         75
****** Recost for parallel table scan  *******
***************************************
SINGLE TABLE ACCESS PATH
  TABLE: TSHI     ORIG CDN: 5301581  ROUNDED CDN: 1  CMPTD CDN: 0
  Access path: tsc  Resc:  7459  Resp:  1865
  BEST_CST: 1865.00  PATH: 2  Degree:  4
***********************
Join order[1]: TSHI [TSHI]
ORDER BY sort
    SORT resource      Sort statistics
      Sort width:            3 Area size:       43008 Max Area size:       43008   Degree: 1
      Blocks to Sort:        1 Row size:           93 Rows:          1
      Initial runs:          1 Merge passes:        1 IO Cost / pass:          2
      Total IO sort cost: 2
      Total CPU sort cost: 0
      Total Temp space used: 0
    SORT response      Sort statistics
      Sort width:            3 Area size:       43008 Max Area size:       43008   Degree: 4
      Blocks to Sort:        1 Row size:           93 Rows:          1
      Initial runs:          1 Merge passes:        1 IO Cost / pass:          2
      Total IO sort cost: 2
      Total CPU sort cost: 0
    SORT resource      Sort statistics
      Sort width:            3 Area size:       43008 Max Area size:       43008   Degree: 1
      Blocks to Sort:        1 Row size:           93 Rows:          1
      Initial runs:          1 Merge passes:        1 IO Cost / pass:          2
      Total IO sort cost: 2
      Total CPU sort cost: 0
      Total Temp space used: 0
Final:
  CST: 6  CDN: 1  RSC: 6  RSP: 6  BYTES: 75
  IO-RSC: 6  IO-RSP: 6  CPU-RSC: 0  CPU-RSP: 0

2) "auto" created 2 buckets

3) yes we use bind in "real world"
but just wanted to know how to create "reliable" statistics for this case. 


5 stars Which Tables to have histogram?   February 10, 2005 - 6am Central time zone
Reviewer: A reader from MD, USA
Tom,
I have created histograms on some big tables(10 to 40 million rows) and benefited. I do not want to 
create histograms on all tables. Is there a way I can find which tables will be best candidates for 
creating histograms? Is there a way to quantify how much I will get benefited, performacnewise - 
apart from doing it in stress/test environment?
Thanks, 


Followup   February 11, 2005 - 2am Central time zone:

 
 

do you have some knowledge of your data?

if so, you might know the candidate columns (which ones are skewed) and whether you place 
predicates on them at all.

if you do not, you can query:

http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:30966075177614
and see if a column you predicate on frequently is "skewed" and may benefit from this. 

5 stars Helpful   February 13, 2005 - 1pm Central time zone
Reviewer: A reader from MD, USA
Tom,

Your reply & the link was very helpful. I can decide about the bucket width easily with this. 
However, AUTO or SKEWONLY - I have found to address this effectively. Any comments?

Also, I am concerned with deciding what tables to collect histograms on, rather than the other 
details once we decide what table. Histograms take long time to collect, even on big servers - 
atleast for big tables. And since all my tables are MONITORED, I do  not know when actually the 
stats willbe gathered - it could be right in the core business hour of my DSS. So, I prefer to be 
very selective in collecting histograms. Any input, guidelines?
Thanks,  


Followup   February 13, 2005 - 4pm Central time zone:

this is simple a query for you to visualize the data, nothing more, nothing less.

monitoring does NOT mean you do not "know" when...  you still call dbms_stats.gather_.... it'll 
only happen during core business hours if YOU make it so (so don't do that)


auto and skewonly do what they do -- dbms_stats peeks at the data sort of like I did and figures 
out if it wants to.   

5 stars How to delete histogram stats?   April 29, 2005 - 6pm Central time zone
Reviewer: Jim 
How do I delete histogram stats related to a particular column from data dictionary? Thanks. 


Followup   April 29, 2005 - 6pm Central time zone:

ops$tkyte@ORA10GR1> create table t as select * from all_users;
 
Table created.
 
ops$tkyte@ORA10GR1>
ops$tkyte@ORA10GR1> exec dbms_stats.gather_table_stats( user, 'T', method_opt => 'for all columns 
size 254' );
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA10GR1>
ops$tkyte@ORA10GR1> select column_name, count(*) from user_tab_histograms where table_name = 'T' 
group by column_name;
 
COLUMN_NAME                      COUNT(*)
------------------------------ ----------
CREATED                                22
USERNAME                               32
USER_ID                                32
 
ops$tkyte@ORA10GR1>
ops$tkyte@ORA10GR1> exec dbms_stats.delete_column_stats( user, 'T', 'USERNAME' );
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA10GR1> select column_name, count(*) from user_tab_histograms where table_name = 'T' 
group by column_name;
 
COLUMN_NAME                      COUNT(*)
------------------------------ ----------
CREATED                                22
USER_ID                                32
 

4 stars Not getting the same output   November 5, 2005 - 10am Central time zone
Reviewer: Jagjeet Singh from Delhi, India
Hi,

 I am not getting the same result.
 Can you pls. look into it.

SQL> drop table t;

Table dropped.

SQL> create table t as select * from dba_objects;

Table created.

SQL> update t set object_type = 'TABLE';

5795 rows updated.

SQL> update t set object_type = 'INDEX' where rownum = 1;

1 row updated.

SQL> create index ind_t on t (object_Type) ;

Index created.

SQL> analyze table t compute statistics;

Table analyzed.


select * 
from
 t where object_Type = 'TABLE'


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch      388      0.12       0.11          0        684          0        5794
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      390      0.12       0.12          0        684          0        5794

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 23  

Rows     Row Source Operation
-------  ---------------------------------------------------
   5794  TABLE ACCESS FULL T (cr=684 r=0 w=0 time=61709 us)

*********************************************

select * 
from
 t where object_type = 'INDEX' 


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.01       0.01          0        312          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.02       0.01          0        312          0           1

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 23  

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  TABLE ACCESS FULL T (cr=312 r=0 w=0 time=10384 us)

**********************************************


 


Followup   November 6, 2005 - 8am Central time zone:

well, you didn't do what I did - I gathered histograms, you did not. (the question was all about 
*histograms*)

here is the example updated to use dbms_stats properly:


ops$tkyte@ORA10GR2> create table t
  2  as
  3  select 99 id, a.*
  4    from all_objects a;

Table created.

ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> update t set id = 1 where rownum = 1;

1 row updated.

ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> create index t_idx on t(id);

Index created.

ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> begin
  2      dbms_stats.gather_table_stats
  3      ( user,
  4       'T',
  5        method_opt=>'for all indexed columns size 254',
  6        cascade => true );
  7  end;
  8  /

PL/SQL procedure successfully completed.

ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> set autotrace traceonly explain
ops$tkyte@ORA10GR2> select * from t where id = 1;

Execution Plan
----------------------------------------------------------
Plan hash value: 470836197

------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     1 |    96 |     2   (0)| 00:
|   1 |  TABLE ACCESS BY INDEX ROWID| T     |     1 |    96 |     2   (0)| 00:
|*  2 |   INDEX RANGE SCAN          | T_IDX |     1 |       |     1   (0)| 00:
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ID"=1)

ops$tkyte@ORA10GR2> select * from t where id = 99;

Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 49996 |  4687K|   165   (3)| 00:00:02 |
|*  1 |  TABLE ACCESS FULL| T    | 49996 |  4687K|   165   (3)| 00:00:02 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("ID"=99)

ops$tkyte@ORA10GR2> set autotrace off
 

4 stars Same path but diff. LIOs   November 6, 2005 - 3am Central time zone
Reviewer: Jagjeet Singh from Delhi, India
 Both queries using the same path but diff.
 block gets.
  


Followup   November 6, 2005 - 8am Central time zone:

That is due to the way blocks are gotten from the cache and the number of rows returned. 

....
Fetch      388      0.12       0.11          0        684          0        5794
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      390      0.12       0.12          0        684          0        5794

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 23  

Rows     Row Source Operation
-------  ---------------------------------------------------
   5794  TABLE ACCESS FULL T (cr=684 r=0 w=0 time=61709 us


SQLPlus array fetches 15 rows at a time.   There are about 310 blocks in this table (based on the 
LIO's from the second query which got all of the data in one call). So, you have 18-19 rows per 
block (5794/310).


When you fetched rows 1..15, we went to the buffer cache got block 1 and copied the 15 rows off of 
it and sent them back.

When you fetched 16..30, we went to the buffer cache and got block 1 again to get the last 3 or 4 
rows, then got block 2 from the buffer cache and got say 12 rows and sent them back.

When you fetched 31..45, we went to the buffer cache to get block 2 again, got 6 or 7 remaining 
rows and then got block 3 to get 8 or 9 more and sent them back.


We read every block two times pretty much from the buffer cache (increase the arraysize in sqlplus 
and you'll see a decrease in IO).

Then, you full scanned the table to get a single row - we read block 1, didn't find it, read block 
2, didn't find it, read block 3 - say we found it, then we continued on reading every block (not 
finding anymore data) and returned the single row - we read each block once. 

5 stars Re:   November 6, 2005 - 10am Central time zone
Reviewer: Jagjeet Singh from Delhi, India
 Thanks for this explaination.


 o  I used "analyze table t compute statistics" which is equal of  

   "analyze table t compute statistics for table for all indexes for 
   all indexd columns".
    
    and it generates histograms. Do you mean no. of buckets is insuff.

 o  As you said diff. in LIO because of the way of getting the block from
    cache for two queries.

   In first query oracle scans 1-15 [ as per arraysize ] and send them 
   back and again picks the block 1 for rest records. that's why two 
   gets for per block.
 
   For second query, Oracle picks the 1 block and scans it and then
   picks the next block. In this case, oracle is scanning the whole 
   block, why there is no limitation with 15 rows. 

   Is this because oracle does not need to send data back using sqlnet
   is it correct ?
 
   If this is the case,     I tried this with procedure using two diff. cursors. like ..

          create procedure .. as
          begin
          for i in ( First query ) loop
          null;
          end loop;
          for ii in ( Second Query ) loop
          ... same ..

Just accessing the data. -- Nothing to send back ---- But still there is diff. in LIO's.


 o   One more case --  without histograms  ---

SQL> create table t as select 1 id,
  2  a.* from dba_objects a ;

Table created.

SQL> create index ind_t on t ( id );

Index created.

SQL> analyze table t compute statistics
  2  for table for all indexes;

Table analyzed.

SQL> set autot traceonly 
SQL> select * from t where id = 1 ;

8195 rows selected.


----------------------------------------------------------
0    SELECT STATEMENT Optimizer=CHOOSE (Cost=74 Card=82 Bytes=7298)
1  0   TABLE ACCESS (BY INDEX ROWID) OF 'T' (Cost=74 Card=82 BY..
2  1     INDEX (RANGE SCAN) OF 'IND_T' (NON-UNIQUE) (Cost=71 Card=33)



Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       1582  consistent gets   <-----


SQL> select /*+FULL(t)*/ * from t where id = 1 ;

8195 rows selected.


Execution Plan
----------------------------------------------------------
0   SELECT STATEMENT Optimizer=CHOOSE (Cost=75 Card=82 Bytes=7298)
1  0   TABLE ACCESS (FULL) OF 'T' (Cost=75 Card=82 Bytes=7298)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        967  consistent gets <-------
 

 I have not created histograms but still CBO has this
 information that this table has 8195 records and there is
 only one distinct value. But still CBO is choosing index scan?

 
SQL> select num_rows,distinct_keys from user_indexes where index_name
 = 'IND_T' ;


  NUM_ROWS DISTINCT_KEYS
---------- -------------
      8195             1


Thanks,
Js 


Followup   November 6, 2005 - 10am Central time zone:

1) it is not equivalent to that.  


ops$tkyte@ORA10GR2> create table t as select object_id, object_name, object_type from all_objects;

Table created.

ops$tkyte@ORA10GR2> create index t_idx on t(object_id);

Index created.

ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> analyze table t compute statistics;

Table analyzed.

ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> select column_name, count(*) from user_tab_histograms where table_name = 'T' 
group by column_name;

COLUMN_NAME            COUNT(*)
-------------------- ----------
OBJECT_ID                     2
OBJECT_TYPE                   2
OBJECT_NAME                   2

ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> analyze table t delete statistics;

Table analyzed.

ops$tkyte@ORA10GR2> analyze table t compute statistics
  2  for table
  3  for all indexes
  4  for all indexed columns;

Table analyzed.

ops$tkyte@ORA10GR2> select column_name, count(*) from user_tab_histograms where table_name = 'T' 
group by column_name;

COLUMN_NAME            COUNT(*)
-------------------- ----------
OBJECT_ID                    76

very different.

2) I do not understand why you expect two different queries to have the same IO's in the first 
place?  

3) it did not have that information, you specifically say "don't look at the columns".  You skipped 
all column level information.

 

4 stars Why this is comming in BOLD   November 6, 2005 - 10am Central time zone
Reviewer: Jagjeet Singh from Delhi, India
I do not know why my followup is in BOLD ? 


Followup   November 6, 2005 - 10am Central time zone:

there is an unbalanced <b> tag on the page somewhere, don't worry about it.

I use hash-B and hash-b for <b> and </b> tags - someone posted something with a hash-B in it  

5 stars Now I know why the other post is in bold :) ... On to the histograms.   November 12, 2005 - 2pm Central time zone
Reviewer: Vladimir Sadilovskiy from MA,US
Tom,

I have a table that usually contains a sinle value in a column.

create table t1 as select * from all_objects;
insert /*+ append */ into t1 select * ftom t1;
...
update t1 set object_name='30_CHARACTERS_STRING_PPPPPPPPP';
dbms_stats.gather_table_stats(null,tabname=>'t1',method_opt=> 'for all columns size auto');

I have this simple query that is extracted from a bigger one:

select * from t1
where object_name != :1;

Now lets explain series of them with different imput:

explain plan for
select * from t1
where object_name != '1';

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------
| Id  | Operation            |  Name       | Rows  | Bytes | Cost (%CPU)|
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |             |   495K|    47M|  3821   (8)|
|*  1 |  TABLE ACCESS FULL   | T1          |   495K|    47M|  3821   (8)|
-------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("T1"."OBJECT_NAME"<>'1')

explain plan for
select * from t1
where object_name != 'A';

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------
| Id  | Operation            |  Name       | Rows  | Bytes | Cost (%CPU)|
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |             |   495K|    47M|  3821   (8)|
|*  1 |  TABLE ACCESS FULL   | T1          |   495K|    47M|  3821   (8)|
-------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("T1"."OBJECT_NAME"<>'A')

explain plan for
select * from t1
where object_name != 'ABC';

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------
| Id  | Operation            |  Name       | Rows  | Bytes | Cost (%CPU)|
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |             |   495K|    47M|  3821   (8)|
|*  1 |  TABLE ACCESS FULL   | T1          |   495K|    47M|  3821   (8)|
-------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("T1"."OBJECT_NAME"<>'ABC')

explain plan for
select * from t1
where object_name != 'ABCDEF';

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------
| Id  | Operation            |  Name       | Rows  | Bytes | Cost (%CPU)|
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |             |   495K|    47M|  3821   (8)|
|*  1 |  TABLE ACCESS FULL   | T1          |   495K|    47M|  3821   (8)|
-------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("T1"."OBJECT_NAME"<>'ABCDEF')

explain plan for
select * from t1
where object_name != 'ABCDEFG';

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------
| Id  | Operation            |  Name       | Rows  | Bytes | Cost (%CPU)|
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |             |     1 |   100 |  3703   (5)|
|*  1 |  TABLE ACCESS FULL   | T1          |     1 |   100 |  3703   (5)|
-------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("T1"."OBJECT_NAME"<>'ABCDEFG')
SQL> 

Please, take a look at estimated cardinality. Our problem that joining to such table with this 
predicate usually leads to merge join Cartesian. So, it's not an academic question.

Q1. Do you know why the value for inequality predicate such drasticly changes cardinality 
expectations? Especially '1' and 'A'. They seem to be similar.
Q2. I've made series of additional tests and found that the boundary length of the value and its 
content after which the cardinality drops to 1 depends on the value in the table.
Q3. What would be your workaround in case you don't know the cause?

Thanks as always for your thoughts.

- Vladimir 


Followup   November 12, 2005 - 4pm Central time zone:

you don't have histograms, you did size auto - and without any pre-existing queries - the 
dbms_stats won't get any.

the magic is when you crossed 6 characters (suggest reading for you based on your last two 
questions is Jonathan Lewis's new book - see home page for a direct link to it.

I'm not sure what you mean by 1 and A as they seem to be the same as the rest? 

5 stars Bah! Probably something in the text again. Sorry for that.   November 12, 2005 - 2pm Central time zone
Reviewer: A reader 


Followup   November 12, 2005 - 4pm Central time zone:

hash-mark B turns on bolding
hash-mark b turns it off

the 10053 traces have hash-mark B's in them. 

5 stars Not that easy...   November 12, 2005 - 6pm Central time zone
Reviewer: Vladimir Sadilovskiy from MA,US
Tom,

It's just a coincidence I provided the tests with auto size. I can accept that migh brough in a 
different issue.

But, here is an example of the same behavior with histograms.

Could you please elaborate on "size auto - and without any pre-existing queries - the dbms_stats 
won't get any"?

Thanks,

- Vladimir

SQL> update t1 set object_name = 'aa.bb.cccccc';
commit;

495568 rows updated.

SQL> 
Commit complete.

SQL> exec dbms_stats.gather_table_stats(null,'T1',method_opt=>'for all columns size 254');

PL/SQL procedure successfully completed.

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------

-------------------------------------------------------------------------
| Id  | Operation            |  Name       | Rows  | Bytes | Cost (%CPU)|
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |             |   495K|    38M|  3883   (8)|
|*  1 |  TABLE ACCESS FULL   | T1          |   495K|    38M|  3883   (8)|
-------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------


PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
   1 - filter("T1"."OBJECT_NAME"<>'aa.bb.')

12 rows selected.

SQL> SQL> 
2 rows deleted.

SQL>   2    3  
Explained.

SQL> SQL> 
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------

-------------------------------------------------------------------------
| Id  | Operation            |  Name       | Rows  | Bytes | Cost (%CPU)|
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |             |     1 |    82 |  3766   (5)|
|*  1 |  TABLE ACCESS FULL   | T1          |     1 |    82 |  3766   (5)|
-------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------


PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
   1 - filter("T1"."OBJECT_NAME"<>'aa.bb.c')

12 rows selected.

SQL> SQL> 
2 rows deleted.

SQL>   2    3  
Explained.

SQL> SQL> 
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------

-------------------------------------------------------------------------
| Id  | Operation            |  Name       | Rows  | Bytes | Cost (%CPU)|
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |             |     1 |    82 |  3766   (5)|
|*  1 |  TABLE ACCESS FULL   | T1          |     1 |    82 |  3766   (5)|
-------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------


PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
   1 - filter("T1"."OBJECT_NAME"<>'aa.bb.cccccc.dddddddddddddd')

12 rows selected.

SQL> SQL> 


Followup   November 12, 2005 - 7pm Central time zone:

...
Could you please elaborate on "size auto - and without any pre-existing queries 
- the dbms_stats won't get any"?...

size auto looks at the historical set of predicates used to figure out what columns might benefit 
from histgrams.


ops$tkyte@ORA9IR2> create table t
  2  as
  3  select 99 id, a.* from all_objects a;

Table created.

ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> update t set id = 1 where rownum = 1;

1 row updated.

ops$tkyte@ORA9IR2> create index t_idx on t(id);

Index created.

ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> exec dbms_stats.gather_table_stats( user, 'T', method_opt => 'for all columns 
size auto', cascade=>true );

PL/SQL procedure successfully completed.

ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> set autotrace traceonly explain
ops$tkyte@ORA9IR2> select * from t where id = 99;

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=45 Card=15342 Bytes=1472832)
   1    0   TABLE ACCESS (FULL) OF 'T' (Cost=45 Card=15342 Bytes=1472832)



ops$tkyte@ORA9IR2> select * from t where id = 1;

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=45 Card=15342 Bytes=1472832)
   1    0   TABLE ACCESS (FULL) OF 'T' (Cost=45 Card=15342 Bytes=1472832)



ops$tkyte@ORA9IR2> set autotrace off
ops$tkyte@ORA9IR2> exec dbms_stats.gather_table_stats( user, 'T', method_opt => 'for all columns 
size auto' );

PL/SQL procedure successfully completed.

ops$tkyte@ORA9IR2> set autotrace traceonly explain
ops$tkyte@ORA9IR2> select * from t where id = 99;

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=45 Card=30683 Bytes=2945568)
   1    0   TABLE ACCESS (FULL) OF 'T' (Cost=45 Card=30683 Bytes=2945568)



ops$tkyte@ORA9IR2> select * from t where id = 1;

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=96)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'T' (Cost=2 Card=1 Bytes=96)
   2    1     INDEX (RANGE SCAN) OF 'T_IDX' (NON-UNIQUE) (Cost=1 Card=1)



ops$tkyte@ORA9IR2> set autotrace off




you are seeing the 6 to 7 transition again.  The column information is 6 bytes long only.  strange 
things will happen if the first 6 bytes are the "same"


You will very much enjoy Jonathan's book - it will explain a lot.
 

5 stars Thanks Tom. That was very informative.   November 12, 2005 - 9pm Central time zone
Reviewer: Vladimir Sadilovskiy from MA,US


4 stars   December 21, 2005 - 3am Central time zone
Reviewer: Reader from Ind
Oracle generating 2 buckets for each column if I am executing this
command.

SQL> analyze table t compute statistics;

Table analyzed.

SQL> select count(*),column_name from user_Tab_histograms where
table_name = 'T' group by column_name;

  COUNT(*) COLUMN_NAME
---------- --------------------------------------------------
         2 CREATED
         2 DATA_OBJECT_ID
         2 GENERATED
         2 LAST_DDL_TIME
         2 OBJECT_ID
         2 OBJECT_NAME
         2 OBJECT_TYPE
         2 OWNER
         2 SECONDARY
         2 STATUS
         2 SUBOBJECT_NAME
         2 TEMPORARY
         2 TIMESTAMP

13 rows selected.

And If use this command ...

SQL> analyze table t compute statistics for table for all columns;

Table analyzed.

SQL> select count(*),column_name from user_Tab_histograms where
table_name = 'T' group by column_name;

  COUNT(*) COLUMN_NAME
---------- --------------------------------------------------
        76 CREATED
        75 DATA_OBJECT_ID
         2 GENERATED
        76 LAST_DDL_TIME
        76 OBJECT_ID
        76 OBJECT_NAME
        24 OBJECT_TYPE
        11 OWNER
         1 SECONDARY
         2 STATUS
         2 SUBOBJECT_NAME
         2 TEMPORARY
        76 TIMESTAMP

I am confused ..  No of buckets is dependent upon what ?

Regards.  


Followup   December 21, 2005 - 7am Central time zone:

first and foremost - stop using analyze to gather stats, start using dbms_stats.

Second, analyze table t compute statististics is

analyze table t
compute statistics
for table
for all columns SIZE 1
for all indexes;




ops$tkyte@ORA9IR2> create table t as select * from scott.emp;

Table created.

ops$tkyte@ORA9IR2> create index t_idx on t(ename);

Index created.

ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> analyze table t compute statistics;

Table analyzed.

ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select table_name, num_rows from user_tables where table_name = 'T';

TABLE_NAME                       NUM_ROWS
------------------------------ ----------
T                                      14

ops$tkyte@ORA9IR2> select column_name, count(*) from user_tab_histograms where table_name = 'T' 
group by column_name;

COLUMN_NAME            COUNT(*)
-------------------- ----------
COMM                          2
DEPTNO                        2
EMPNO                         2
ENAME                         2
HIREDATE                      2
JOB                           2
MGR                           2
SAL                           2

8 rows selected.

ops$tkyte@ORA9IR2> select index_name, num_rows from user_indexes where table_name= 'T';

INDEX_NAME                       NUM_ROWS
------------------------------ ----------
T_IDX                                  14

ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> analyze table t delete statistics;

Table analyzed.

ops$tkyte@ORA9IR2> select table_name, num_rows from user_tables where table_name = 'T';

TABLE_NAME                       NUM_ROWS
------------------------------ ----------
T

ops$tkyte@ORA9IR2> select column_name, count(*) from user_tab_histograms where table_name = 'T' 
group by column_name;

no rows selected

ops$tkyte@ORA9IR2> select index_name, num_rows from user_indexes where table_name= 'T';

INDEX_NAME                       NUM_ROWS
------------------------------ ----------
T_IDX

ops$tkyte@ORA9IR2> analyze table t
  2  compute statistics
  3  for table
  4  for all columns SIZE 1
  5  for all indexes;

Table analyzed.

ops$tkyte@ORA9IR2> select table_name, num_rows from user_tables where table_name = 'T';

TABLE_NAME                       NUM_ROWS
------------------------------ ----------
T                                      14

ops$tkyte@ORA9IR2> select column_name, count(*) from user_tab_histograms where table_name = 'T' 
group by column_name;

COLUMN_NAME            COUNT(*)
-------------------- ----------
COMM                          2
DEPTNO                        2
EMPNO                         2
ENAME                         2
HIREDATE                      2
JOB                           2
MGR                           2
SAL                           2

8 rows selected.

ops$tkyte@ORA9IR2> select index_name, num_rows from user_indexes where table_name= 'T';

INDEX_NAME                       NUM_ROWS
------------------------------ ----------
T_IDX                                  14

ops$tkyte@ORA9IR2>
 

4 stars Re:   December 21, 2005 - 7am Central time zone
Reviewer: Reader from Ind
Thanks Tom for quick response.

And 

analyze table t compute statistics for table for all columns;

This command using what size for columns .. 


Followup   December 21, 2005 - 7am Central time zone:

you can research it further

http://download-west.oracle.com/docs/cd/B10501_01/server.920/a96540/statements_46a.htm#2076689
default size 75 - but suggestion: use dbms_stats, that is the way to go, pretend analyze table t 
compute... doesn't exist for now. 

4 stars Got it default is 75 -- thanks   December 21, 2005 - 7am Central time zone
Reviewer: Reader from ind


5 stars Analyze table   December 21, 2005 - 8am Central time zone
Reviewer: A reader 
Tom,
This is not a follow up question but more of supporting incident to what you said about not using 
analyze table.

This happened several months ago so I do not remember all the table names. I was working with 
Oracle support to resolve an issue of EM running some very expensive queries against the 10gr1 
database. In an OWC session, the Oracle support analyst asked me to analyze some tables using 
analyze table...command. After analyzing the tables, the problem did not go away. When I checked in 
the last_analyzed date in dba_tables, the dates were totally weird for those tables. We tried many 
times but no effect. Then we used the dbms_stats to analyze these tables and the problem was 
resolved and the weird dates were replaced by the correct ones. Though the analyze table command 
was populating the statistics, it was missing something which dbms_stats fixed.

After that I have stopped using analyze table completely even for small adhoc test cases on 10g 
databases. 


Followup   December 21, 2005 - 7pm Central time zone:

analyze and dbms_stats can and will result in different numbers in the dictionary.  The optimizer 
is built expecting the dbms_stats numbers.  food for thought. 

3 stars Fixing plan for table having skewed data   March 24, 2006 - 6am Central time zone
Reviewer: Narendra from India
Hi Tom,

In following query in EXPLAIN PLAN, I am getting CARD=9 where as row returned is 1.

Following are necessary details:
SQL> select * from v$version ;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.1.0.4.0 - 64bi
PL/SQL Release 10.1.0.4.0 - Production
CORE    10.1.0.4.0      Production
TNS for Solaris: Version 10.1.0.4.0 - Production
NLSRTL Version 10.1.0.4.0 - Production

SQL> desc product
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 PRODUCT_KEY                               NOT NULL NUMBER(10)
 DESCRIPTION                                        VARCHAR2(30)
 FULL_DESCRIPTION                                   VARCHAR2(50)
 PRODUCT_TYPE                                       VARCHAR2(20)
 BRAND                                              VARCHAR2(20)
 PRODUCT_CATEGORY                                   VARCHAR2(20)
 AGE_CATEGORY                                       VARCHAR2(20)
 DEPARTMENT                                         VARCHAR2(15)

PRODUCT_KEY is PRIMARY KEY and has a UNIQUE index on it.

Following is distribution of data for BRAND column.
SQL> select brand, count(*) from product group by brand ;

BRAND                  COUNT(*)
-------------------- ----------
Astro                         4
Big Studios                  24
Coda                          1
Dordor                        2
Duff                          1
Galore                        1
Leavenworth                   1
Little Guys                   2
MKF Studios                  24
Nagazoo                       6
Parabuster Inc.              29

BRAND                  COUNT(*)
-------------------- ----------
Sani                          3
Solo                          6
Wild Age                      9
Wolf                         28

15 rows selected.

Now, when I execute following query,
SQL> select * from product where brand = 'Duff';

PRODUCT_KEY DESCRIPTION
----------- ------------------------------
FULL_DESCRIPTION                                   PRODUCT_TYPE
-------------------------------------------------- --------------------
BRAND                PRODUCT_CATEGORY     AGE_CATEGORY         DEPARTMENT
-------------------- -------------------- -------------------- ---------------
        141 Beer
                                                   FOOD
Duff                 Alcohol              over 21              Beverage



Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=2 Card=9 Bytes=684)

   1    0   TABLE ACCESS (FULL) OF 'PRODUCT' (TABLE) (Cost=2 Card=9 Bytes=684)

The plan shows FULL table access step has CARD=9, whereas query returns only single row.

I tried collecting statistics using 
SQL> exec dbms_stats.gather_table_stats(ownname => user, tabname => 'PRODUCT', method_opt => 'FOR 
ALL INDEXED COLUMNS SIZE 254', cascade => true);

PL/SQL procedure successfully completed.

However, still no change in plan.
SQL> select * from product where brand = 'Duff';

PRODUCT_KEY DESCRIPTION
----------- ------------------------------
FULL_DESCRIPTION                                   PRODUCT_TYPE
-------------------------------------------------- --------------------
BRAND                PRODUCT_CATEGORY     AGE_CATEGORY         DEPARTMENT
-------------------- -------------------- -------------------- ---------------
        141 Beer
                                                   FOOD
Duff                 Alcohol              over 21              Beverage



Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=2 Card=9 Bytes=684)

   1    0   TABLE ACCESS (FULL) OF 'PRODUCT' (TABLE) (Cost=2 Card=9 Bytes=684)

But when I do following:
SQL> create table product_test as select * from product ;

Table created.

And fire same query on new table:
SQL> select * from product_test where brand = 'Duff';

PRODUCT_KEY DESCRIPTION
----------- ------------------------------
FULL_DESCRIPTION                                   PRODUCT_TYPE
-------------------------------------------------- --------------------
BRAND                PRODUCT_CATEGORY     AGE_CATEGORY         DEPARTMENT
-------------------- -------------------- -------------------- ---------------
        141 Beer
                                                   FOOD
Duff                 Alcohol              over 21              Beverage



Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=2 Card=1 Bytes=114)

   1    0   TABLE ACCESS (FULL) OF 'PRODUCT_TEST' (TABLE) (Cost=2 Card=1 Bytes=114)

The CARDINALITY is correct.

How can I fix the problem of incorrect cardinality on PRODUCT table? This table is part of VIDEO5 
schema (which is part of ORACLE DISCOVERER) and PRODUCT table contains 141 rows. 


Followup   March 24, 2006 - 9am Central time zone:

Histograms would "correct" the estimate, the estimate right now is:


ops$tkyte@ORA10GR2> select 141/15 from dual;
 
    141/15
----------
       9.4


(141 rows, 15 distinct values).  However, do you really NEED to fix it - 1 or 9 - is it making a 
material difference somewhere (if you use binds, unlikely for this table that you want to have 
histograms) 

2 stars For study   March 26, 2006 - 3am Central time zone
Reviewer: Narendra from India
Tom,

I was (obviously) not doing any tuning in this case. The idea of probing was just to know
1. Why estimates were incorrect ?
2. On creating separate table (without PK) why did it get "corrected"?
 


Followup   March 26, 2006 - 8am Central time zone:

why obviously?  If you have to state "this was obviously" - that means "it was not" (else it would 
be obvious and you would have no need to state "obviously" :)



I told you why the estimates were "incorrect" and even showed you the math. (just divide)

On the other table, you gathered statistics differently one would presume.  You have different 
statistics. 

As a bit of magic, I can show this (how different statistics affect the opimizer)

I put your aggregated data into a table T1 and created T2 with your data as follows:

ops$tkyte@ORA10GR2> create table t2
  2  as
  3  with data
  4  as
  5  (select level l from dual connect by level <= 29)
  6  select rownum id, t1.nm
  7    from t1, data
  8   where data.l <= t1.cnt
  9  /

Table created.

ops$tkyte@ORA10GR2> select nm, count(*) from t2 group by nm;

NM                               COUNT(*)
------------------------------ ----------
Astro                                   4
Duff                                    1
Big Studios                            24
Leavenworth                             1
Sani                                    3
Wild Age                                9
Solo                                    6
Galore                                  1
Nagazoo                                 6
Parabuster Inc.                        29
Coda                                    1
Dordor                                  2
Little Guys                             2
MKF Studios                            24
Wolf                                   28

15 rows selected.

ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> exec dbms_stats.gather_table_stats( user, 'T2' );

PL/SQL procedure successfully completed.

ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> set autotrace traceonly explain
ops$tkyte@ORA10GR2> select * from t2 where nm = 'Duff';

Execution Plan
----------------------------------------------------------
Plan hash value: 1513984157

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     9 |   117 |     3   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T2   |     9 |   117 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("NM"='Duff')

exactly what you saw - and for the reasons I stated - DIVIDE 141/15 ~ 9.  About nine rows.  Why?  
Because of the lack of information, the optimizer just knows the HIGH and LOW values for NM and the 
number of distinct values:

ops$tkyte@ORA10GR2> set autotrace off
ops$tkyte@ORA10GR2> select column_name, count(*) from user_tab_histograms where table_name = 'T2' 
group by column_name;

COLUMN_NAME            COUNT(*)
-------------------- ----------
NM                            2
ID                            2

but, and here is the "magic", we just regather stats and since we've run a query that puts a 
predicate on a column - the stats Oracle gathers using AUTO change:

ops$tkyte@ORA10GR2> exec dbms_stats.gather_table_stats( user, 'T2' );

PL/SQL procedure successfully completed.

ops$tkyte@ORA10GR2> select column_name, count(*) from user_tab_histograms where table_name = 'T2' 
group by column_name;

COLUMN_NAME            COUNT(*)
-------------------- ----------
NM                           15
ID                            2

Oracle now has a rather complete histogram of the values in NM and will therefore make a better 
"guess"

ops$tkyte@ORA10GR2> set autotrace traceonly explain
ops$tkyte@ORA10GR2> select * from t2 where nm = 'Duff';

Execution Plan
----------------------------------------------------------
Plan hash value: 1513984157

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |    13 |     3   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T2   |     1 |    13 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("NM"='Duff')

ops$tkyte@ORA10GR2> set autotrace off


Now, this DOES NOT MEAN run out and gather histograms on everything - in fact, quite the 
opposite.  In many/most systems (transactional for example), you probably don't want them at all.  
In a warehouse (where the use of bind variables for the big queries would not be prevalent), you 
may well want them

But this explains "why" - you had different statistics for the two different tables. 

3 stars Thanks ("Obviously...")   March 27, 2006 - 4am Central time zone
Reviewer: Narendra from India
Tom,

When I gathered statistics using METHOD_OPT as 'FOR ALL COLUMNS SIZE SKEWONLY', the cardinality in 
query plan got corrected. I feel reason for not being able to achieve that earlier was due to 
gathering statistics on INDEXED COLUMNS only. BTW, I NEVER gathered statistics on NEW table after 
creating it with CTAS statement. So can I assume that CTAS ALWAYS keeps table stats up-to-date?

p.s. That "Obvious" was too much...:)

Thanks. 


Followup   March 27, 2006 - 9am Central time zone:

create table as select gathers no statistics, nothing would be "up to date"



 

5 stars Histograms on system tables.   August 14, 2006 - 3am Central time zone
Reviewer: Raju from India
Hi Tom,

We have noticed several system tables where
    dba_tab_col_statistics reports HEIGHT BALANCED histograms where bucket_cnt >
    distcnt (count of distinct values). It was my understanding that a similar
    problem was fixed in 10.2.0.1 (dbms_stats method_opt size n generates a
    FREQUENCY histogram where n >= NDV). However, even more perplexing is why do we
    have many height balanced histogram buckets on a column with only 1 distinct
    value?
    SQL > select ENDPOINT_NUMBER,ENDPOINT_VALUE from dba_histograms where
    table_name = 'INDPART$' and column_name = 'TS#';
    
    ENDPOINT_NUMBER ENDPOINT_VALUE
    --------------- --------------
                217             79

So, some questions?
    1. in sys.hist_head$, how is bucket_cnt calculated?
    2. what determines how many buckets a column gets?

Thanks,
Raju
 


Followup   August 14, 2006 - 11am Central time zone:

bucket cnt is not calculated, it is "set" - you either decide the number of buckets (size on the 
method opt) or you let us use any algorithm we feel like to do it "auto".

and you don't need/want to be querying hist_head$, just use the normal views like the rest of us :)

frequency based histograms - number of distinct values < 255
"height balanced" - number of distinct values >= 255

Now, example please?  I can definitely reproduce your findings, by modifying data after gathering 
of course.

So, got example?

 

5 stars Histogram   September 22, 2006 - 7pm Central time zone
Reviewer: A Reader from USA
hi tom

Oracle 9i
DW Environment
Monthly Data Load
120 Million rows in Fact 

we create hsitograms for all indexed columns with 10% sample size

one of the indexed columns is Month-Year to which data belongs - this column is not that skewed as 
each month roughly same # of rows get inserted.


The warehouse has 120 million rows for approx 72 months, when we do 

select * from fact_table

it does a full scan, even though the selectivity is approx 1/72

i was wondering if the 10% samplesize is generating histograms which are more hurting then helping 
as with a selectivity of 1/72 indexes should be useful

2 questions

a) do you think removing the histograms on this (such)
   columns will be helpful ?
b) do we also need to remove the stats of the index based
   on this column ?


Thanks very much 


Followup   September 24, 2006 - 1pm Central time zone:

... we create hsitograms for all indexed columns with 10% sample size
....

why?  why not the other columns? why all indexed columns?  why not "columns we know to be skewed 
and that we know we use in some predicate and that we know the optimizer needs to be told about"

since, well, histograms cost alot to compute... 

4 stars Histrogram   September 24, 2006 - 2pm Central time zone
Reviewer: Rinku from India
Excellent description about Histogram.

Thanks Tom & All...
 


2 stars   September 24, 2006 - 8pm Central time zone
Reviewer: A Reader from US
hi tom,

if i am not mistaken - you have mentioned a couple of diff times that starting with indexed columns 
for creating histograms is not a bad starting point - and to me it stands to reason - the fact that 
one decided to create indexes on these columns is a good indication that these often get used in 
predicates,

i know this is an approximation - but nevertheless a good starting point. The other issue is of 
manageabilty. A decent sized app can easilt have 2000 columns (100 tables with 20 columns each) - 
to tell precisely which colsumns to create histograms can be quite a task - with high maintenance 
as well, as the schema evolves. (As a suggestion it would be a rather useful addition to oracle 
create table DDL syntax to be able to earmark columns for which to create histograms - on the 
column clause).

My point is - if we do fall back on this rather common paradigm of creating histograms on indexed 
columns - a good best practice could be :

"IF YOU CREATE HISTOGRAMS FOR ALL INDEXED COLUMNS (COZ YOU ARE LAZY NOT TO ANALYZE WHICH ONES 
REALLY NEEN EM :-) AND
IF YOU ARE ANALYZING A SMALL % OF ROWS - EG 10 % - THEN MAKE SURE YOU DELETE THE COLUMN STATS FOR 
COLUMN WHICH YOU KNOW FOR A FACT ARE NOT SKEWED"

the basic reason, i think we need to do is 

if a table has 72 M rows - say 1 M for each month - in such as case a predicate suct as month_year 
= '01-2006' - 
has a selectivity of of 1/72. And with that kind of selectivity i feel it should use the index - 
and its the histograms build with just 10% of the rows that are confusing the system. Do you agree 
?

Does all this even remotely make sense - will be graeful to hear your take.

Regards, 


Followup   September 25, 2006 - 2am Central time zone:

can you point to a place where I said that - so I can update it.


In a data warehouse, you don't index in many cases - so it doesn't really hold true (indexes are 
good for getting a few rows from a big table, warehouses are many times characterized for NOT doing 
that)

to gather histograms on all indexed columns to me would be the wrong (inefficient) approach.

maybe you should not gather any.  I don't fall back on "common paradigms", those are also known as 
ROT (rule of thumb)

you would not need to remove the ones you know are not skewed (this is funny, you say you don't 
have the time/energy/motivation to discover what ones you SHOULD gather on, but you are 
knowledgable about the ones you don't need them on?? that would imply you do know the skewed 
columns by definition - it is the set of columns left over after you take these away).  Since they 
are not skewed, the histograms will reflect that.

No, it doesn't make sense to me to say "it is too much work to understand our data"
 

5 stars test   September 25, 2006 - 7am Central time zone
Reviewer: A reader 
test

5 stars Dumb question... How do I NOT create histograms when using DBMS_STATS to gather stats?   September 26, 2006 - 2pm Central time zone
Reviewer: Philip Moore from Cincinnati, OH USA
Hi Tom,

This may be an EXTREMELY dumb question - but I can't find the answer in the 9i documentation or on 
askTom...

How do I explicitly NOT create histograms when using DBMS_STATS with 9.2.0.6?  What is the 
"METHOD_OPT" parameter I should use?

I really need to analyze my tables with DBMS_STATS - I just don't want histograms :)

Thanks for your help in advance, sir.
 


Followup   September 26, 2006 - 5pm Central time zone:

size 1

if you have one bucket, only thing we know is "high and low and number distinct" values.

 

5 stars statistics & HIST_HEAD$   September 27, 2006 - 8am Central time zone
Reviewer: Antonio Dell'Elce from Italy
Dear Tom,

I am investigating possible performance improvements on 
on a 10.1.0.2.0 instance, looking at the ADDM report:

FINDING 1: 88% impact (36798 seconds)
-------------------------------------
SQL statements consuming significant database time were found.

   RECOMMENDATION 1: SQL Tuning, 60% benefit (25021 seconds)
      ACTION: Investigate the SQL statement with SQL_ID "b6usrg82hwsa3" for
         possible performance improvements.
         RELEVANT OBJECT: SQL statement with SQL_ID b6usrg82hwsa3
         call dbms_stats.gather_database_stats_job_proc (  )

(only first lines reported)

and at v$segment_statistics: 

OWNER      OBJ                            SUBNAME         SNAME                           VALUE
---------- ------------------------------ --------------- -------------------- ----------------
SYS        HIST_HEAD$                                     db block changes          107,039,200
STAT_OWN   INTERVALLI                     SYS_SUBP1119    physical reads             80,407,080
STAT_OWN   INTERVALLI                     SYS_SUBP1121    physical reads             80,335,637
STAT_OWN   INTERVALLI                     SYS_SUBP1117    physical reads             68,966,192
STAT_OWN   INTERVALLI                     SYS_SUBP1118    physical reads             55,163,908
STAT_OWN   INTERVALLI                     SYS_SUBP1120    physical reads             43,452,888
SYS        I_WRI$_OPTSTAT_H_OBJ#_ICOL#_ST                 db block changes           11,185,920
SYS        HISTGRM$                                       db block changes           10,908,512

I came to believe that statistics collections are not working at best, ... how do you suggest I can 
investigate this?

Thanks,
Antonio


 


Followup   September 27, 2006 - 3pm Central time zone:

is this job running during your time of "peak usage", this is the auto scheduled job that should be 
running "off peak", is it running when you are "doing useful work" or "when you are asleep and no 
one is really doing stuff" 

5 stars   September 29, 2006 - 12pm Central time zone
Reviewer: Antonio 
I verified the schedules and it is on the default schedule
(night and weekends) but while the system is "operational" 
on the whole day (it collects data from other systems which 
are 24hrs active) 


Followup   September 30, 2006 - 7am Central time zone:

ok, big question: should you be gathering stats so frequently...

you cannot really change what this routine does. 

5 stars Find queries using histograms   October 26, 2007 - 4pm Central time zone
Reviewer: Ravi from Chicago, IL
Tom,

How do we find a query is using histograms or not?  Only way is through trace?  

Thanks
Ravi


Followup   October 29, 2007 - 12pm Central time zone:

query user_tab_histograms - if they have them, the query was optimized with them in mind.

that is, every query that accesses a table with histograms "used" them - whether they make any difference to the plan or not
3 stars Histogram is not working ?   February 17, 2009 - 3pm Central time zone
Reviewer: Roberto Veiga from Brazil
Tom, I think I have a problem.
The user is complaining about performance.

I have a query that uses bind variables.

We are investigating the query and we've noticed the plan changes according with the values that 
are passed to the query.

We have already collected statistcs and histograms. But sometimes the performance is good and 
sometimes is bad.

I dont know what to say to the user because this process have to have a good performance and one 
minute makes a lot of difference.

Here is my tests:

** *******************************************************************
** ORIGINAL QUERY AND ORIGINAL ACCESS PLAN
** *******************************************************************

SELECT RAES.RAES_USR_ANT_ATM, RAES_DAT_ANT_ATM, RAES.RAES_STA_LBR_RMS, RAES.RAES_CTRA_AFL_CDS_COD, 
RAES.RAES_DAT_CLC, RAES.RAES_DAT_PGT, VPR.PGT_VAL_BRT VAL_BRT, RAES.RAES_VAL_LQD, 
RAES.RAES_VAL_BLQ_ALR, RAES.RAES_ADMS_ADME_ADM_COD, RAES.RAES_RMS_ANO_EMO, RAES.RAES_RMS_LOCP_COD, 
RAES.RAES_RMS_TIP_NUM_RMS, RAES.RAES_RMS_NUM_PRG_RMS, RAES.RAES_DAT_COR, RAES.RAES_PRM_COB 
RAES_PRM_COB_AUX, RAES.RAES_PGT_COD, RAES.RAES_ADMS_FRMS_COD, RAES.RAES_MOE_COD, RAES.RAES_NUM_PRL, 
RAES.RAES_ADMS_SRV_COD, VPR.GUI_ADME_ADM_COD GUI_ADME_ADM_COD, VPR.GUI_EQP_NUM GUI_EQP_NUM, 
VPR.GUI_COD GUI_COD, VPR.GUI_DTA_ETG GUI_DTA_ETG, VPR.GUI_QTD_TOT_IFD GUI_QTD_TOT_IFD, 
VPR.GUI_VAL_TOT_IFD GUI_VAL_TOT_IFD, C620040P(RAES.RAES_STA_LBR_RMS, RAES.RAES_PRM_COB, 
RAES.RAES_DAT_PGT, RAES.RAES_PGT_COD) RAES_STATUS, RAES_CTRA_ADMS_ADME_ADM_COD, 
RAES_CTRA_ADMS_FRMS_COD, RAES_CTRA_ADMS_SRV_COD, RAES_CTRA_MOE_COD FROM RMS_ADM_EMP_SRV RAES, 
V_RAES_PGT VPR
WHERE RAES.RAES_STA_LBR_RMS < 4
AND NVL(RAES.RAES_VAL_PGO_PRL, 0) > 0
AND NOT VPR.RAES_RMS_TIP_NUM_RMS = 7
AND NOT VPR.RAES_FRMP_COD = 7
AND RAES.RAES_STA_RAES + 0 >= 2
AND RAES.RAES_PGT_COD + 0 IS NULL
AND RAES.RAES_RMS_ANO_EMO = VPR.RAES_RMS_ANO_EMO
AND RAES.RAES_RMS_LOCP_COD = VPR.RAES_RMS_LOCP_COD
AND RAES.RAES_RMS_TIP_NUM_RMS = VPR.RAES_RMS_TIP_NUM_RMS
AND RAES.RAES_RMS_NUM_PRG_RMS = VPR.RAES_RMS_NUM_PRG_RMS
AND RAES.RAES_ADMS_ADME_ADM_COD = VPR.RAES_ADMS_ADME_ADM_COD
AND RAES.RAES_ADMS_SRV_COD = VPR.RAES_ADMS_SRV_COD
AND RAES.RAES_ADMS_FRMS_COD = VPR.RAES_ADMS_FRMS_COD
AND RAES.RAES_MOE_COD = VPR.RAES_MOE_COD
AND RAES.RAES_NUM_PRL = VPR.RAES_NUM_PRL
AND RAES.RAES_ADMS_ADME_ADM_COD = 5
AND RAES.RAES_ADMS_FRMS_COD = 2
AND RAES.RAES_ADMS_SRV_COD = 1
AND RAES.RAES_MOE_COD = 'R$'
AND RAES.RAES_DAT_PGT = '12/16/2008'
AND RAES.RAES_PGT_GRD = 'N'
AND ( ('S' = 'S' AND VPR.PGT_CDS_COD_REL = 104938 )
OR ('S' != 'S' AND VPR.PGT_CDS_COD = 104938 ) );


PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
--------------------------
Plan hash value: 3546104366

----------------------------------------------------------------------------------------------------
--------------------------
| Id  | Operation                          | Name                    | Rows  | Bytes | Cost (%CPU)| 
Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------
--------------------------
|   0 | SELECT STATEMENT                   |                         |     1 |   179 |    49   (0)| 
00:00:01 |       |       |
|*  1 |  TABLE ACCESS BY GLOBAL INDEX ROWID| RMS_ADM_EMP_SRV         |     1 |   102 |     1   (0)| 
00:00:01 | ROWID | ROWID |
|   2 |   NESTED LOOPS                     |                         |     1 |   179 |    49   (0)| 
00:00:01 |       |       |
|*  3 |    TABLE ACCESS BY INDEX ROWID     | SNAP$_RAES_PGT          |    13 |  1001 |    46   (3)| 
00:00:01 |       |       |
|*  4 |     INDEX RANGE SCAN               | SNAP$RAES_AFL_DAT_PGT_I |   218 |       |     1   (0)| 
00:00:01 |       |       |
|*  5 |    INDEX RANGE SCAN                | RAES_RMS_FK_I           |     1 |       |     1   (0)| 
00:00:01 |       |       |
----------------------------------------------------------------------------------------------------
--------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("RAES"."RAES_PGT_GRD"='N' AND "RAES"."RAES_ADMS_FRMS_COD"=2 AND 
"RAES"."RAES_ADMS_SRV_COD"=1 AND
              "RAES"."RAES_PGT_COD"+0 IS NULL AND "RAES"."RAES_DAT_PGT"='12/16/2008' AND 
NVL("RAES"."RAES_VAL_PGO_PRL",0)>0 AND
              "RAES"."RAES_ADMS_ADME_ADM_COD"=5 AND "RAES"."RAES_STA_LBR_RMS"<4 AND 
"RAES"."RAES_STA_RAES"+0>=2 AND
              "RAES"."RAES_MOE_COD"='R$' AND "RAES"."RAES_NUM_PRL"="RAES_NUM_PRL")
   3 - filter("RAES_ADMS_SRV_COD"=1 AND "RAES_ADMS_FRMS_COD"=2 AND "RAES_RMS_TIP_NUM_RMS"<>7 AND 
"RAES_FRMP_COD"<>7
              AND "RAES_ADMS_ADME_ADM_COD"=5 AND "RAES_MOE_COD"='R$' AND ("RAES_RMS_TIP_NUM_RMS"=1 
OR "RAES_RMS_TIP_NUM_RMS"=2 OR
              "RAES_RMS_TIP_NUM_RMS"=3 OR "RAES_RMS_TIP_NUM_RMS"=4 OR "RAES_RMS_TIP_NUM_RMS"=5 OR 
"RAES_RMS_TIP_NUM_RMS"=6 OR
              "RAES_RMS_TIP_NUM_RMS"=7 OR "RAES_RMS_TIP_NUM_RMS"=8 OR "RAES_RMS_TIP_NUM_RMS"=9))
   4 - access("PGT_CDS_COD_REL"=104938)
   5 - access("RAES"."RAES_RMS_NUM_PRG_RMS"="RAES_RMS_NUM_PRG_RMS" AND
              "RAES"."RAES_RMS_TIP_NUM_RMS"="RAES_RMS_TIP_NUM_RMS" AND 
"RAES"."RAES_RMS_ANO_EMO"="RAES_RMS_ANO_EMO" AND
              "RAES"."RAES_RMS_LOCP_COD"="RAES_RMS_LOCP_COD")
       filter("RAES"."RAES_RMS_TIP_NUM_RMS"<>7)

29 rows selected.



** *******************************************************************
** TESTING VALUES ON COLUMN RAES_ADMS_SRV_COD
** *******************************************************************

-- -------------------------------------------------------------------
-- DISTINCT VALUES FOR COLUMN RAES_ADMS_SRV_COD
-- -------------------------------------------------------------------

SELECT DISTINCT RAES_ADMS_SRV_COD FROM RMS_ADM_EMP_SRV ORDER BY 1;

RAES_ADMS_SRV_COD
-----------------
                1
                2
                3
                4
                6
                7
               16
               22
               23
               24
               25
               26
               27
               30
               31
               32

16 rows selected.


-- -------------------------------------------------------------------
-- BUCKETS/VALUES FOR COLUMN RAES_ADMS_SRV_COD
-- -------------------------------------------------------------------

select min(RAES_ADMS_SRV_COD), max(RAES_ADMS_SRV_COD), count(RAES_ADMS_SRV_COD), wb
from (
select RAES_ADMS_SRV_COD,
         width_bucket( RAES_ADMS_SRV_COD,
                       (select min(RAES_ADMS_SRV_COD)-1 from prod_dba.RMS_ADM_EMP_SRV),
                       (select max(RAES_ADMS_SRV_COD)+1 from prod_dba.RMS_ADM_EMP_SRV), 50 ) wb
    from prod_dba.RMS_ADM_EMP_SRV
         )
  group by wb
 order by wb
 /
 
MIN(RAES_ADMS_SRV_COD)|MAX(RAES_ADMS_SRV_COD)|COUNT(RAES_ADMS_SRV_COD)|        WB
----------------------|----------------------|------------------------|----------
                     1|                     1|                11692690|         2
                     2|                     2|                36778973|         4
                     3|                     3|                    3309|         5
                     4|                     4|                    5125|         7
                     6|                     6|                    3431|        10
                     7|                     7|                    2111|        11
                    16|                    16|                    4731|        25
                    22|                    22|                      43|        34
                    23|                    23|                       5|        35
                    24|                    24|                    8851|        37
                    25|                    25|                  537676|        38
                    26|                    26|                    1387|        40
                    27|                    27|                     797|        41
                    30|                    30|                  227446|        46
                    31|                    31|                 3285855|        47
                    32|                    32|                 1473147|        49

16 rows selected.

-- -------------------------------------------------------------------
-- TESTING RAES_ADMS_SRV_COD = 1
-- -------------------------------------------------------------------

SELECT RAES.RAES_USR_ANT_ATM, RAES_DAT_ANT_ATM, RAES.RAES_STA_LBR_RMS, RAES.RAES_CTRA_AFL_CDS_COD, 
RAES.RAES_DAT_CLC, RAES.RAES_DAT_PGT, VPR.PGT_VAL_BRT VAL_BRT, RAES.RAES_VAL_LQD, 
RAES.RAES_VAL_BLQ_ALR, RAES.RAES_ADMS_ADME_ADM_COD, RAES.RAES_RMS_ANO_EMO, RAES.RAES_RMS_LOCP_COD, 
RAES.RAES_RMS_TIP_NUM_RMS, RAES.RAES_RMS_NUM_PRG_RMS, RAES.RAES_DAT_COR, RAES.RAES_PRM_COB 
RAES_PRM_COB_AUX, RAES.RAES_PGT_COD, RAES.RAES_ADMS_FRMS_COD, RAES.RAES_MOE_COD, RAES.RAES_NUM_PRL, 
RAES.RAES_ADMS_SRV_COD, VPR.GUI_ADME_ADM_COD GUI_ADME_ADM_COD, VPR.GUI_EQP_NUM GUI_EQP_NUM, 
VPR.GUI_COD GUI_COD, VPR.GUI_DTA_ETG GUI_DTA_ETG, VPR.GUI_QTD_TOT_IFD GUI_QTD_TOT_IFD, 
VPR.GUI_VAL_TOT_IFD GUI_VAL_TOT_IFD, C620040P(RAES.RAES_STA_LBR_RMS, RAES.RAES_PRM_COB, 
RAES.RAES_DAT_PGT, RAES.RAES_PGT_COD) RAES_STATUS, RAES_CTRA_ADMS_ADME_ADM_COD, 
RAES_CTRA_ADMS_FRMS_COD, RAES_CTRA_ADMS_SRV_COD, RAES_CTRA_MOE_COD FROM RMS_ADM_EMP_SRV RAES, 
V_RAES_PGT VPR
WHERE RAES.RAES_STA_LBR_RMS < 4
AND NVL(RAES.RAES_VAL_PGO_PRL, 0) > 0
AND NOT VPR.RAES_RMS_TIP_NUM_RMS = 7
AND NOT VPR.RAES_FRMP_COD = 7
AND RAES.RAES_STA_RAES + 0 >= 2
AND RAES.RAES_PGT_COD + 0 IS NULL
AND RAES.RAES_RMS_ANO_EMO = VPR.RAES_RMS_ANO_EMO
AND RAES.RAES_RMS_LOCP_COD = VPR.RAES_RMS_LOCP_COD
AND RAES.RAES_RMS_TIP_NUM_RMS = VPR.RAES_RMS_TIP_NUM_RMS
AND RAES.RAES_RMS_NUM_PRG_RMS = VPR.RAES_RMS_NUM_PRG_RMS
AND RAES.RAES_ADMS_ADME_ADM_COD = VPR.RAES_ADMS_ADME_ADM_COD
AND RAES.RAES_ADMS_SRV_COD = VPR.RAES_ADMS_SRV_COD
AND RAES.RAES_ADMS_FRMS_COD = VPR.RAES_ADMS_FRMS_COD
AND RAES.RAES_MOE_COD = VPR.RAES_MOE_COD
AND RAES.RAES_NUM_PRL = VPR.RAES_NUM_PRL
AND RAES.RAES_ADMS_ADME_ADM_COD = 5
AND RAES.RAES_ADMS_FRMS_COD = 2
AND RAES.RAES_ADMS_SRV_COD = 1 -- <<
AND RAES.RAES_MOE_COD = 'R$'
AND RAES.RAES_DAT_PGT = '12/16/2008'
AND RAES.RAES_PGT_GRD = 'N'
AND ( ('S' = 'S' AND VPR.PGT_CDS_COD_REL = 104938 )
OR ('S' != 'S' AND VPR.PGT_CDS_COD = 104938 ) );
 
 
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
-----------------------------
Plan hash value: 3546104366

----------------------------------------------------------------------------------------------------
--------------------------
| Id  | Operation                          | Name                    | Rows  | Bytes | Cost (%CPU)| 
Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------
--------------------------
|   0 | SELECT STATEMENT                   |                         |     1 |   179 |    49   (0)| 
00:00:01 |       |       |
|*  1 |  TABLE ACCESS BY GLOBAL INDEX ROWID| RMS_ADM_EMP_SRV         |     1 |   102 |     1   (0)| 
00:00:01 | ROWID | ROWID |
|   2 |   NESTED LOOPS                     |                         |     1 |   179 |    49   (0)| 
00:00:01 |       |       |
|*  3 |    TABLE ACCESS BY INDEX ROWID     | SNAP$_RAES_PGT          |    13 |  1001 |    46   (3)| 
00:00:01 |       |       |
|*  4 |     INDEX RANGE SCAN               | SNAP$RAES_AFL_DAT_PGT_I |   218 |       |     1   (0)| 
00:00:01 |       |       |
|*  5 |    INDEX RANGE SCAN                | RAES_RMS_FK_I           |     1 |       |     1   (0)| 
00:00:01 |       |       |
----------------------------------------------------------------------------------------------------
--------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("RAES"."RAES_PGT_GRD"='N' AND "RAES"."RAES_ADMS_FRMS_COD"=2 AND 
"RAES"."RAES_ADMS_SRV_COD"=1 AND
              "RAES"."RAES_PGT_COD"+0 IS NULL AND "RAES"."RAES_DAT_PGT"='12/16/2008' AND 
NVL("RAES"."RAES_VAL_PGO_PRL",0)>0 AND
              "RAES"."RAES_ADMS_ADME_ADM_COD"=5 AND "RAES"."RAES_STA_LBR_RMS"<4 AND 
"RAES"."RAES_STA_RAES"+0>=2 AND
              "RAES"."RAES_MOE_COD"='R$' AND "RAES"."RAES_NUM_PRL"="RAES_NUM_PRL")
   3 - filter("RAES_ADMS_SRV_COD"=1 AND "RAES_ADMS_FRMS_COD"=2 AND "RAES_RMS_TIP_NUM_RMS"<>7 AND 
"RAES_FRMP_COD"<>7
              AND "RAES_ADMS_ADME_ADM_COD"=5 AND "RAES_MOE_COD"='R$' AND ("RAES_RMS_TIP_NUM_RMS"=1 
OR "RAES_RMS_TIP_NUM_RMS"=2 OR
              "RAES_RMS_TIP_NUM_RMS"=3 OR "RAES_RMS_TIP_NUM_RMS"=4 OR "RAES_RMS_TIP_NUM_RMS"=5 OR 
"RAES_RMS_TIP_NUM_RMS"=6 OR
              "RAES_RMS_TIP_NUM_RMS"=7 OR "RAES_RMS_TIP_NUM_RMS"=8 OR "RAES_RMS_TIP_NUM_RMS"=9))
   4 - access("PGT_CDS_COD_REL"=104938)
   5 - access("RAES"."RAES_RMS_NUM_PRG_RMS"="RAES_RMS_NUM_PRG_RMS" AND
              "RAES"."RAES_RMS_TIP_NUM_RMS"="RAES_RMS_TIP_NUM_RMS" AND 
"RAES"."RAES_RMS_ANO_EMO"="RAES_RMS_ANO_EMO" AND
              "RAES"."RAES_RMS_LOCP_COD"="RAES_RMS_LOCP_COD")
       filter("RAES"."RAES_RMS_TIP_NUM_RMS"<>7)

29 rows selected.


-- -------------------------------------------------------------------
-- TESTING RAES_ADMS_SRV_COD = 4
-- -------------------------------------------------------------------

** >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>  <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< **
** >>>>>>>>>>>>>>>>>>>>>>>>>>> CHANGE ON ACCESS PLAN  <<<<<<<<<<<<<<<<<<<<<<<<<< **
** >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>  <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< **

SELECT RAES.RAES_USR_ANT_ATM, RAES_DAT_ANT_ATM, RAES.RAES_STA_LBR_RMS, RAES.RAES_CTRA_AFL_CDS_COD, 
RAES.RAES_DAT_CLC, RAES.RAES_DAT_PGT, VPR.PGT_VAL_BRT VAL_BRT, RAES.RAES_VAL_LQD, 
RAES.RAES_VAL_BLQ_ALR, RAES.RAES_ADMS_ADME_ADM_COD, RAES.RAES_RMS_ANO_EMO, RAES.RAES_RMS_LOCP_COD, 
RAES.RAES_RMS_TIP_NUM_RMS, RAES.RAES_RMS_NUM_PRG_RMS, RAES.RAES_DAT_COR, RAES.RAES_PRM_COB 
RAES_PRM_COB_AUX, RAES.RAES_PGT_COD, RAES.RAES_ADMS_FRMS_COD, RAES.RAES_MOE_COD, RAES.RAES_NUM_PRL, 
RAES.RAES_ADMS_SRV_COD, VPR.GUI_ADME_ADM_COD GUI_ADME_ADM_COD, VPR.GUI_EQP_NUM GUI_EQP_NUM, 
VPR.GUI_COD GUI_COD, VPR.GUI_DTA_ETG GUI_DTA_ETG, VPR.GUI_QTD_TOT_IFD GUI_QTD_TOT_IFD, 
VPR.GUI_VAL_TOT_IFD GUI_VAL_TOT_IFD, C620040P(RAES.RAES_STA_LBR_RMS, RAES.RAES_PRM_COB, 
RAES.RAES_DAT_PGT, RAES.RAES_PGT_COD) RAES_STATUS, RAES_CTRA_ADMS_ADME_ADM_COD, 
RAES_CTRA_ADMS_FRMS_COD, RAES_CTRA_ADMS_SRV_COD, RAES_CTRA_MOE_COD FROM RMS_ADM_EMP_SRV RAES, 
V_RAES_PGT VPR
WHERE RAES.RAES_STA_LBR_RMS < 4
AND NVL(RAES.RAES_VAL_PGO_PRL, 0) > 0
AND NOT VPR.RAES_RMS_TIP_NUM_RMS = 7
AND NOT VPR.RAES_FRMP_COD = 7
AND RAES.RAES_STA_RAES + 0 >= 2
AND RAES.RAES_PGT_COD + 0 IS NULL
AND RAES.RAES_RMS_ANO_EMO = VPR.RAES_RMS_ANO_EMO
AND RAES.RAES_RMS_LOCP_COD = VPR.RAES_RMS_LOCP_COD
AND RAES.RAES_RMS_TIP_NUM_RMS = VPR.RAES_RMS_TIP_NUM_RMS
AND RAES.RAES_RMS_NUM_PRG_RMS = VPR.RAES_RMS_NUM_PRG_RMS
AND RAES.RAES_ADMS_ADME_ADM_COD = VPR.RAES_ADMS_ADME_ADM_COD
AND RAES.RAES_ADMS_SRV_COD = VPR.RAES_ADMS_SRV_COD
AND RAES.RAES_ADMS_FRMS_COD = VPR.RAES_ADMS_FRMS_COD
AND RAES.RAES_MOE_COD = VPR.RAES_MOE_COD
AND RAES.RAES_NUM_PRL = VPR.RAES_NUM_PRL
AND RAES.RAES_ADMS_ADME_ADM_COD = 5
AND RAES.RAES_ADMS_FRMS_COD = 2
AND RAES.RAES_ADMS_SRV_COD = 4 -- <<
AND RAES.RAES_MOE_COD = 'R$'
AND RAES.RAES_DAT_PGT = '12/16/2008'
AND RAES.RAES_PGT_GRD = 'N'
AND ( ('S' = 'S' AND VPR.PGT_CDS_COD_REL = 104938 )
OR ('S' != 'S' AND VPR.PGT_CDS_COD = 104938 ) );

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
-------------------------
Plan hash value: 3292825880

----------------------------------------------------------------------------------------------------
--------------------
| Id  | Operation                          | Name              | Rows  | Bytes | Cost (%CPU)| Time  
   | Pstart| Pstop |
----------------------------------------------------------------------------------------------------
--------------------
|   0 | SELECT STATEMENT                   |                   |     1 |   179 |     9   (0)| 
00:00:01 |       |       |
|*  1 |  TABLE ACCESS BY GLOBAL INDEX ROWID| RMS_ADM_EMP_SRV   |     1 |   102 |     1   (0)| 
00:00:01 | ROWID | ROWID |
|   2 |   NESTED LOOPS                     |                   |     1 |   179 |     9   (0)| 
00:00:01 |       |       |
|   3 |    INLIST ITERATOR                 |                   |       |       |            |       
   |       |       |
|*  4 |     TABLE ACCESS BY INDEX ROWID    | SNAP$_RAES_PGT    |     1 |    77 |     8   (0)| 
00:00:01 |       |       |
|*  5 |      INDEX RANGE SCAN              | SNAP$_RAES_PGT_PK |   185 |       |     2   (0)| 
00:00:01 |       |       |
|*  6 |    INDEX RANGE SCAN                | RAES_RMS_FK_I     |     1 |       |     1   (0)| 
00:00:01 |       |       |
----------------------------------------------------------------------------------------------------
--------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("RAES"."RAES_ADMS_SRV_COD"=4 AND "RAES"."RAES_PGT_GRD"='N' AND 
"RAES"."RAES_ADMS_FRMS_COD"=2 AND
              "RAES"."RAES_PGT_COD"+0 IS NULL AND "RAES"."RAES_DAT_PGT"='12/16/2008' AND 
NVL("RAES"."RAES_VAL_PGO_PRL",0)>0
              AND "RAES"."RAES_ADMS_ADME_ADM_COD"=5 AND "RAES"."RAES_STA_LBR_RMS"<4 AND 
"RAES"."RAES_STA_RAES"+0>=2 AND
              "RAES"."RAES_MOE_COD"='R$' AND "RAES"."RAES_NUM_PRL"="RAES_NUM_PRL")
   4 - filter("PGT_CDS_COD_REL"=104938 AND "RAES_FRMP_COD"<>7)
   5 - access(("RAES_RMS_TIP_NUM_RMS"=1 OR "RAES_RMS_TIP_NUM_RMS"=2 OR "RAES_RMS_TIP_NUM_RMS"=3 OR
              "RAES_RMS_TIP_NUM_RMS"=4 OR "RAES_RMS_TIP_NUM_RMS"=5 OR "RAES_RMS_TIP_NUM_RMS"=6 OR 
"RAES_RMS_TIP_NUM_RMS"=7 OR
              "RAES_RMS_TIP_NUM_RMS"=8 OR "RAES_RMS_TIP_NUM_RMS"=9) AND "RAES_ADMS_ADME_ADM_COD"=5 
AND "RAES_ADMS_SRV_COD"=4
              AND "RAES_ADMS_FRMS_COD"=2 AND "RAES_MOE_COD"='R$')
       filter("RAES_RMS_TIP_NUM_RMS"<>7 AND "RAES_MOE_COD"='R$')
   6 - access("RAES"."RAES_RMS_NUM_PRG_RMS"="RAES_RMS_NUM_PRG_RMS" AND
              "RAES"."RAES_RMS_TIP_NUM_RMS"="RAES_RMS_TIP_NUM_RMS" AND 
"RAES"."RAES_RMS_ANO_EMO"="RAES_RMS_ANO_EMO" AND
              "RAES"."RAES_RMS_LOCP_COD"="RAES_RMS_LOCP_COD")
       filter("RAES"."RAES_RMS_TIP_NUM_RMS"<>7)

31 rows selected.




Followup   February 17, 2009 - 4pm Central time zone:

it is *because* the histograms are there and are working that you are seeing this

http://tkyte.blogspot.com/2007/09/tuning-with-sqltracetrue.html

http://tkyte.blogspot.com/2007/09/sqltracetrue-part-two.html


3 stars The First example in this page against 10204   March 6, 2009 - 6pm Central time zone
Reviewer: Yoav 
Hi Tom,
I run the first example in this page againt oracle 10204 , and got totally different results.
Actually the performance are better before gathering statistics : Its took to oracle 0.56 sec to 
return an answer befor gathering stats. and 3.33 sec after gathering stats .
I did not find any benift using histograms against 10204 - in this case. (It may be related to 
dynamic sampling ?)
could you please verify it ?
If indeed this is the case could you please give an example where i could benit from histograms in 
10gr2 ?

Thanks

SQL*Plus: Release 10.2.0.4.0 - Production on Sat Mar 7 00:24:08 2009
Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> alter session set optimizer_mode='ALL_ROWS';

Session altered.

SQL> show parameter optimizer_mode

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
optimizer_mode                       string      ALL_ROWS

SQL> create table t
as
select 1 X, rpad('*',4000,'*') data from all_objects a  2    3  ;

Table created.

SQL> insert into t
  2  select 2, rpad('*',4000,'*') from all_objects a where rownum = 1;

1 row created.

SQL> commit;

Commit complete.

SQL> create index t_idx on t(x);

Index created.

SQL> alter session set sql_trace=true;

Session altered.

SQL> select count(data) from t t1 where x = 1;

COUNT(DATA)
-----------
      53130

1 row selected.

SQL> select count(data) from t t1 where x = 2;

COUNT(DATA)
-----------
          1

1 row selected.

SQL> alter system flush shared_pool;

System altered.


SQL> alter system flush buffer_cache;

System altered.

SQL> exec dbms_stats.gather_table_stats(ownname=>user,tabname=>'T',method_opt=>'FOR ALL COLUMNS 
SIZE AUTO', cascade=>true);


PL/SQL procedure successfully completed.


SQL> select count(data) from t t2 where x = 1;

COUNT(DATA)
-----------
      53130

1 row selected.


SQL> select count(data) from t t2 where x = 2;

COUNT(DATA)
-----------
          1

1 row selected.

SQL> select sysdate from dual;

SYSDATE
---------
07-MAR-09

1 row selected.

===========================================================

TKPROF shows , befor collecting statsistics:

select count(data)
from  t t1 where x = 1


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.00          0          2          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.58       0.56          0     118851          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.59       0.56          0     118853          0           

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 13

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=118851 pr=0 pw=0 time=565898 us)
  53130   TABLE ACCESS FULL T (cr=118851 pr=0 pw=0 time=584474 us)
  

select count(data)
from
 t t1 where x = 2


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.00          0          2          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.00       0.00          0          4          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.01       0.00          0          6          0           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 13

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=4 pr=0 pw=0 time=39 us)
      1   TABLE ACCESS BY INDEX ROWID T (cr=4 pr=0 pw=0 time=31 us)
      1    INDEX RANGE SCAN T_IDX (cr=2 pr=0 pw=0 time=13 us)(object id 10977844)
      
      
AFTER COLLECTING STATISTICS:

select count(data)
from
 t t2 where x = 1


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      1.16       3.33      22770     118851          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      1.16       3.33      22770     118851          0           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 13

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=118851 pr=22770 pw=0 time=3337004 us)
  53130   TABLE ACCESS FULL T (cr=118851 pr=22770 pw=0 time=3932313 us)

***********************************************************

select count(data)
from
 t t2 where x = 2


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.00       0.00          0          4          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.00       0.00          0          4          0           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 13

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=4 pr=0 pw=0 time=112 us)
      1   TABLE ACCESS BY INDEX ROWID T (cr=4 pr=0 pw=0 time=98 us)
      1    INDEX RANGE SCAN T_IDX (cr=2 pr=0 pw=0 time=71 us)(object id 10977844)

***********************************************************






Followup   March 7, 2009 - 12pm Central time zone:

ummm, please apply a little bit of critical reading to what you see above and see if you cannot figure this out yourself?


do you see that they use the same plans? (yes, they do)
do you see that before and after statistics - they are identical in nature (yes, they are)

do you see that in between, you flushed the buffer cache and caused the second one to do A TON OF physical IO?

??????


Look - anytime you have the same exact plan, well, you have the SAME EXACT PLAN. If the plans are the same - they do - the same thing.

They did exactly the same thing, you flushed the cache for whatever reason (why would you do that???? that never happens in real life - I hate that command)
4 stars Guidelines for better sql statement   May 31, 2009 - 9am Central time zone
Reviewer: Roberto Veiga from BRAZIL, Sao Paulo
Tom I am facing a lot of problems with bind variables because the access plan is changing and a 
critical performance problem occurs.
My customer is very disapointed about this "feature" of Oracle and he is asking me what to do to 
avoid this behavior.

We have already tried to remove histograms and the situation become worse.

I have proposed to him to focus on the most critical process and put some hints to freeze the plan.

But the processes work with a lot of procedures with a lot of statements. So I need to define a 
method to focus only in the queries with most change to have this problem.

My points are:

-Statements with Bind Variables
-Columns with Skewed distribution
-Statements with more than 5 tables join
-Tables with many rows (500k or more)

This could be a good start to identify queries that can have problem?



Write a Review
 


All information and materials provided here are provided "as-is"; Oracle disclaims all express and implied warranties, including, the implied warranties of merchantability or fitness for a particular use. Oracle shall not be liable for any damages, including, direct, indirect, incidental, special or consequential damages for loss of profits, revenue, data or data use, incurred by you or any third party in connection with the use of this information or these materials.

About Oracle | Legal Notices and Terms of Use | Privacy Statement