Skip to Main Content
  • Questions
  • How to interpret strange values in ENDPOINT_NUMBER column in histograms

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question.

Asked: August 06, 2008 - 12:29 am UTC

Answered by: Tom Kyte - Last updated: August 08, 2008 - 1:10 pm UTC

Category: Database - Version: 9.2.0

Viewed 1000+ times

You Asked

Tom, consider this:
SQL> column table_name format a11
SQL> column column_name format a11
SQL> column column_name format a11
SQL> column endpoint_number format 999,999
SQL> column endpoint_value format 9.999999EEEE
SQL> column endpoint_actual_value format a22
SQL> column owner format A8
SQL> 
SQL> create table test as select owner from dba_objects where owner < 'C'
  2  /

Table created.

SQL> select owner, count(*) from test group by owner
  2  /

OWNER      COUNT(*)
-------- ----------
ABM             464
AHL             518
AHM             167
AK              188
ALR             128
AMF             106
AMS            1557
AMV             189
AMW             244
AP              795
APPLSYS        2766
APPLSYSPUB       17
APPS         140786
AR             2211
ASF              10
ASG             154
ASL              53
ASN               4
ASO             285
ASP               4
AST              38
AX              141
AZ               74
BEN            2783
BIC              70
BIL             226
BIM             838
BIS            1497
BIV              39
BIX             193
BNE             150
BOM             787
BSC             300

33 rows selected.

SQL> exec dbms_stats.gather_table_stats(ownname => user, tabname=> 'TEST', method_opt=>'for columns owner size 33');

PL/SQL procedure successfully completed.

SQL> select * from user_histograms where table_name = 'TEST' and column_name = 'OWNER'
  2  /

TABLE_NAME  COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_ACTUAL_VALUE
----------- ----------- --------------- -------------- ----------------------
TEST        OWNER                     0   3.388440E+35
TEST        OWNER                     1   3.391283E+35
TEST        OWNER                    30   3.391283E+35
TEST        OWNER                    31   3.391625E+35
TEST        OWNER                    32   3.440973E+35
TEST        OWNER                    33   3.443803E+35

6 rows selected.

SQL> exec dbms_stats.gather_table_stats(ownname => user, tabname=> 'TEST', method_opt=>'for columns owner size 40');

PL/SQL procedure successfully completed.

SQL> select * from user_histograms where table_name = 'TEST' and column_name = 'OWNER'
  2  /

TABLE_NAME  COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_ACTUAL_VALUE
----------- ----------- --------------- -------------- ----------------------
TEST        OWNER                     0   3.388440E+35
TEST        OWNER                     1   3.391219E+35
TEST        OWNER                    37   3.391283E+35
TEST        OWNER                    38   3.391625E+35
TEST        OWNER                    39   3.441782E+35
TEST        OWNER                    40   3.443803E+35

6 rows selected.

SQL> exec dbms_stats.gather_table_stats(ownname => user, tabname=> 'TEST', method_opt=>'for columns owner size 50');

PL/SQL procedure successfully completed.

SQL> select * from user_histograms where table_name = 'TEST' and column_name = 'OWNER'
  2  /

TABLE_NAME  COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_ACTUAL_VALUE
----------- ----------- --------------- -------------- ----------------------
TEST        OWNER                   464   3.388440E+35 ABM
TEST        OWNER                   982   3.389657E+35 AHL
TEST        OWNER                 1,149   3.389657E+35 AHM
TEST        OWNER                 1,337   3.390205E+35 AK
TEST        OWNER                 1,465   3.390473E+35 ALR
TEST        OWNER                 1,571   3.390666E+35 AMF
TEST        OWNER                 3,128   3.390676E+35 AMS
TEST        OWNER                 3,317   3.390679E+35 AMV
TEST        OWNER                 3,561   3.390679E+35 AMW
TEST        OWNER                 4,356   3.391219E+35 AP
TEST        OWNER                 7,122   3.391283E+35 APPLSYS
TEST        OWNER                 7,139   3.391283E+35 APPLSYSPUB
TEST        OWNER               147,925   3.391283E+35 APPS
TEST        OWNER               150,136   3.391625E+35 AR
TEST        OWNER               150,146   3.391883E+35 ASF
TEST        OWNER               150,300   3.391884E+35 ASG
TEST        OWNER               150,353   3.391888E+35 ASL
TEST        OWNER               150,357   3.391889E+35 ASN
TEST        OWNER               150,642   3.391890E+35 ASO
TEST        OWNER               150,646   3.391891E+35 ASP
TEST        OWNER               150,684   3.391894E+35 AST
TEST        OWNER               150,825   3.392841E+35 AX
TEST        OWNER               150,899   3.393247E+35 AZ
TEST        OWNER               153,682   3.440973E+35 BEN
TEST        OWNER               153,752   3.441775E+35 BIC
TEST        OWNER               153,978   3.441782E+35 BIL
TEST        OWNER               154,816   3.441783E+35 BIM
TEST        OWNER               156,313   3.441788E+35 BIS
TEST        OWNER               156,352   3.441790E+35 BIV
TEST        OWNER               156,545   3.441792E+35 BIX
TEST        OWNER               156,695   3.442791E+35 BNE
TEST        OWNER               157,482   3.443000E+35 BOM
TEST        OWNER               157,782   3.443803E+35 BSC

33 rows selected.

SQL> drop table test
  2  /

Table dropped.

SQL> select * from v$version
  2  /

BANNER
--------------------------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.7.0 - 64bit Production
PL/SQL Release 9.2.0.7.0 - Production
CORE 9.2.0.7.0 Production
TNS for IBM/AIX RISC System/6000: Version 9.2.0.7.0 - Production
NLSRTL Version 9.2.0.7.0 - Production

5 rows selected.


There are 33 distinct values in OWNER column. After gathering statistics for this column with number of buckets equal and even more than number of distinct values (N between 33 and 44), I see in ENDPOINT_NUMBER column ordinal number of bucket and nothing in ENDPOINT_ACTUAL_VALUE. But when N in 'for columns owner size N' is sufficient greater than the number of distinct values (it seems the rule is "N >= 1.33333*number_of_distinct_values+1") then in ENDPOINT_NUMBER column appears the cumulative numbers of rows (according metalink Note:72539.1) and ENDPOINT_ACTUAL_VALUE is not empty. Here are my questions:

1. Why is it so? In above example I am expecting to see same results after
exec dbms_stats.gather_table_stats(ownname => user, tabname=> 'TEST', method_opt=>'for columns owner size 33');

and after
exec dbms_stats.gather_table_stats(ownname => user, tabname=> 'TEST', method_opt=>'for columns owner size 50');

2. Why there are only 6 rows (buckets) in USER_HISTOGRAMS when I calling dbms_stats with 'for columns owner size 33' (for example)? I am expecting to see 33 rows (buckets) after this call...
3. Is it better to have in ENDPOINT_NUMBER column cumulative numbers of rows than just bucket numbers? Can optimizer make its decisions more accurately in this case?

and we said...

This happens when you have some values that utterly dominate the other values - as you do - that one really high value can be used to infer the other buckets.

consider the differences between t1 and t2 below:

ops$tkyte%ORA9IR2> create table t1
  2  as
  3  select 'user' || mod(rownum,33) owner, object_name
  4    from all_objects;

Table created.

ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> create table t2
  2  as
  3  select owner, object_name
  4    from all_objects;

Table created.

ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> select 't1', owner, count(*) from t1 group by owner union all
  2  select 't2', owner, count(*) from t2 group by owner
  3  order by 1, 2;

'T OWNER                                          COUNT(*)
-- -------------------------------------------- ----------
t1 user0                                               929
t1 user1                                               930
t1 user10                                              930
t1 user11                                              930
t1 user12                                              930
t1 user13                                              930
t1 user14                                              930
t1 user15                                              930
t1 user16                                              930
t1 user17                                              930
t1 user18                                              929
t1 user19                                              929
t1 user2                                               930
t1 user20                                              929
t1 user21                                              929
t1 user22                                              929
t1 user23                                              929
t1 user24                                              929
t1 user25                                              929
t1 user26                                              929
t1 user27                                              929
t1 user28                                              929
t1 user29                                              929
t1 user3                                               930
t1 user30                                              929
t1 user31                                              929
t1 user32                                              929
t1 user4                                               930
t1 user5                                               930
t1 user6                                               930
t1 user7                                               930
t1 user8                                               930
t1 user9                                               930
t2 A                                                     1
t2 BIG_TABLE                                             2
t2 CTXSYS                                              261
t2 DEMO                                                  1
t2 HR                                                   34
t2 MDSYS                                               234
t2 ODM                                                 439
t2 ODM_MTR                                              12
t2 OE                                                   86
t2 OLAPSYS                                             662
t2 OPS$TKYTE                                            16
t2 ORDPLUGINS                                           29
t2 ORDSYS                                              969
t2 OUTLN                                                 7
t2 PERFSTAT                                             86
t2 PM                                                    9
t2 PUBLIC                                            12285
t2 QS                                                   41
t2 QS_ADM                                                7
t2 QS_CBADM                                             24
t2 QS_CS                                                23
t2 QS_ES                                                39
t2 QS_OS                                                39
t2 QS_WS                                                39
t2 SCOTT                                                 6
t2 SH                                                  173
t2 SYS                                               14073
t2 SYSTEM                                              382
t2 USER1                                                 9
t2 USER2                                                 9
t2 WKSYS                                               279
t2 WMSYS                                               129
t2 XDB                                                 270

66 rows selected.

ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> exec dbms_stats.gather_table_stats( user, 'T1', method_opt => 'for columns owner size 33' );

PL/SQL procedure successfully completed.

ops$tkyte%ORA9IR2> select * from user_tab_col_statistics where table_name = 'T1';

TABLE_NAME      COLUMN_NAME     NUM_DISTINCT LOW_VALUE  HIGH_VALUE    DENSITY  NUM_NULLS NUM_BUCKETS LAST_ANAL SAMPLE_SIZE GLO USE AVG_COL_LEN
--------------- --------------- ------------ ---------- ---------- ---------- ---------- ----------- --------- ----------- --- --- -----------
T1              OWNER                     33 7573657230 7573657239 .030302545          0          32 06-AUG-08       30674 YES NO            7

ops$tkyte%ORA9IR2> select * from user_histograms where table_name = 'T1' and column_name = 'OWNER';

TABLE_NAME      COLUMN_NAME     ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_ACTUAL
--------------- --------------- --------------- -------------- ---------------
T1              OWNER                         0     6.0984E+35
T1              OWNER                         1     6.0984E+35
T1              OWNER                         2     6.0984E+35
T1              OWNER                         3     6.0984E+35
T1              OWNER                         4     6.0984E+35
T1              OWNER                         5     6.0984E+35
T1              OWNER                         6     6.0984E+35
T1              OWNER                         7     6.0984E+35
T1              OWNER                         8     6.0984E+35
T1              OWNER                         9     6.0984E+35
T1              OWNER                        10     6.0984E+35
T1              OWNER                        11     6.0984E+35
T1              OWNER                        12     6.0984E+35
T1              OWNER                        13     6.0984E+35
T1              OWNER                        14     6.0984E+35
T1              OWNER                        15     6.0984E+35
T1              OWNER                        16     6.0984E+35
T1              OWNER                        17     6.0984E+35
T1              OWNER                        18     6.0984E+35
T1              OWNER                        19     6.0984E+35
T1              OWNER                        20     6.0984E+35
T1              OWNER                        21     6.0984E+35
T1              OWNER                        22     6.0984E+35
T1              OWNER                        23     6.0984E+35
T1              OWNER                        24     6.0984E+35
T1              OWNER                        25     6.0984E+35
T1              OWNER                        26     6.0984E+35
T1              OWNER                        27     6.0984E+35
T1              OWNER                        28     6.0984E+35
T1              OWNER                        29     6.0984E+35
T1              OWNER                        30     6.0984E+35
T1              OWNER                        31     6.0984E+35
T1              OWNER                        33     6.0984E+35

33 rows selected.

ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> exec dbms_stats.gather_table_stats( user, 'T2', method_opt => 'for columns owner size 33' );

PL/SQL procedure successfully completed.

ops$tkyte%ORA9IR2> select * from user_tab_col_statistics where table_name = 'T2';

TABLE_NAME      COLUMN_NAME     NUM_DISTINCT LOW_VALUE  HIGH_VALUE    DENSITY  NUM_NULLS NUM_BUCKETS LAST_ANAL SAMPLE_SIZE GLO USE AVG_COL_LEN
--------------- --------------- ------------ ---------- ---------- ---------- ---------- ----------- --------- ----------- --- --- -----------
T2              OWNER                     33 41         584442     .015563337          0           7 06-AUG-08       30675 YES NO            6

ops$tkyte%ORA9IR2> select * from user_histograms where table_name = 'T2' and column_name = 'OWNER';

TABLE_NAME      COLUMN_NAME     ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_ACTUAL
--------------- --------------- --------------- -------------- ---------------
T2              OWNER                         0     3.3750E+35
T2              OWNER                         1     4.1158E+35
T2              OWNER                         2     4.1186E+35
T2              OWNER                         3     4.1679E+35
T2              OWNER                        16     4.1711E+35
T2              OWNER                        31     4.3277E+35
T2              OWNER                        32     4.3277E+35
T2              OWNER                        33     4.5831E+35

8 rows selected.


sys and public so dominated the others in the second case that they allowed us to collapse the histogram.

If you use dbms_xplan to view the results, you can see that we just used a low value for the "low values" and the dominating values appear:


ops$tkyte%ORA9IR2> explain plan for
  2  select null from t2 where owner = 'A' union all
  3  select null from t2 where owner = 'PUBLIC' union all
  4  select null from t2 where owner = 'SYS' union all
  5  select null from t2 where owner = 'SYSTEM' union all
  6  select null from t2 where owner = 'XDB';

Explained.

ops$tkyte%ORA9IR2> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------
| Id  | Operation            |  Name       | Rows  | Bytes | Cost  |
--------------------------------------------------------------------
|   0 | SELECT STATEMENT     |             | 27458 |   160K|    85 |
|   1 |  UNION-ALL           |             |       |       |       |
|*  2 |   TABLE ACCESS FULL  | T2          |   477 |  2862 |    17 |
|*  3 |   TABLE ACCESS FULL  | T2          | 12084 | 72504 |    17 |
|*  4 |   TABLE ACCESS FULL  | T2          | 13943 | 83658 |    17 |
|*  5 |   TABLE ACCESS FULL  | T2          |   477 |  2862 |    17 |
|*  6 |   TABLE ACCESS FULL  | T2          |   477 |  2862 |    17 |
--------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("T2"."OWNER"='A')
   3 - filter("T2"."OWNER"='PUBLIC')
   4 - filter("T2"."OWNER"='SYS')
   5 - filter("T2"."OWNER"='SYSTEM')
   6 - filter("T2"."OWNER"='XDB')

Note: cpu costing is off



it is better than good enough - if you up the number of buckets really high - then it will go to the trouble to save them all for you - but not at much gain in that case.

Basically, your table has two types of values

a) low cardinality
b) really high cardinality


and you rated our response

  (3 ratings)

Is this answer out of date? If it is, please let us know via a Review

Reviews

Where is dominant value?

August 07, 2008 - 2:13 pm UTC

Reviewer: jian huang zheng from China

Tom
That is very interesting and enlightening,but
If you use dbms_xplan to view the results, you can see that we just used a low value for the "low values" and the dominating values appear:

how does oracle caculate 477 and *KNOW* the dominant value since dominant values are not in the histograms.

Thanks!
Tom Kyte

Followup  

August 07, 2008 - 2:18 pm UTC

it does now the value, you just cannot see it, but it can. That big ugly number there - it has everything we need to know.

could it be the bug in 9i and 10.1?

August 07, 2008 - 3:21 pm UTC

Reviewer: Bo Jin

With 33 distinct values and 33 buckets, you should get a frequency histogram, where endpoint_number is the cumulative number of rows. But instead you got a height balanced histogram, where endpoint_number is the bucket number. Could it be the bug that Jonathan mentioned in page 164 of "Cost base optimizer Oracle fundamentals"? He also said that the bug was fixed in 10.2.
Tom Kyte

Followup  

August 07, 2008 - 3:45 pm UTC

same results in 10gr2 and 11gr1

and everything I see in the problem database says "we meant to do this", meaning - not a bug.

it is the predominant value, dbms_stats is a "program", it gathered them and said "ok - this is what I want to store"

Continuing the first example

August 08, 2008 - 1:08 am UTC

Reviewer: A reader

Not a bug? Are you sure, Tom?  How about this:

<code>

SQL> create table test as select owner from dba_objects where owner < 'C' and owner <> 'APPS'
  2  /

Table created.

SQL> select owner, count(*) from test group by owner
  2  /

OWNER          COUNT(*)
------------ ----------
ABM                 464
AHL                 518
AHM                 167
AK                  188
ALR                 128
AMF                 106
AMS                1557
AMV                 189
AMW                 244
AP                  795
APPLSYS            2791
APPLSYSPUB           17
AR                 2211
ASF                  10
ASG                 154
ASL                  53
ASN                   4
ASO                 285
ASP                   4
AST                  38
AX                  141
AZ                   74
BEN                2783
BIC                  70
BIL                 226
BIM                 838
BIS                1497
BIV                  39
BIX                 193
BNE                 150
BOM                 787
BSC                 300

32 rows selected.

SQL> exec dbms_stats.gather_table_stats(ownname => user, tabname=> 'TEST', method_opt=>'for columns owner size 33');

PL/SQL procedure successfully completed.

SQL> 
SQL> select * from user_histograms where table_name = 'TEST' and column_name = 'OWNER'
  2  /

TABLE_NAME  COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_ACTUAL_VALUE
----------- ----------- --------------- -------------- ----------------------
TEST        OWNER                     0   3.388440E+35
TEST        OWNER                     1   3.389657E+35
TEST        OWNER                     2   3.389657E+35
TEST        OWNER                     3   3.390666E+35
TEST        OWNER                     6   3.390676E+35
TEST        OWNER                     8   3.391219E+35
TEST        OWNER                    13   3.391283E+35
TEST        OWNER                    18   3.391625E+35
TEST        OWNER                    19   3.391890E+35
TEST        OWNER                    25   3.440973E+35
TEST        OWNER                    27   3.441783E+35
TEST        OWNER                    30   3.441788E+35
TEST        OWNER                    32   3.443000E+35
TEST        OWNER                    33   3.443803E+35

14 rows selected.

SQL> analyze table test compute statistics for columns owner size 33
  2  /

Table analyzed.

SQL> select * from user_histograms where table_name = 'TEST' and column_name = 'OWNER'
  2  /

TABLE_NAME  COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_ACTUAL_VALUE
----------- ----------- --------------- -------------- ----------------------
TEST        OWNER                   464   3.388440E+35 ABM
TEST        OWNER                   982   3.389657E+35 AHL
TEST        OWNER                 1,149   3.389657E+35 AHM
TEST        OWNER                 1,337   3.390205E+35 AK
TEST        OWNER                 1,465   3.390473E+35 ALR
TEST        OWNER                 1,571   3.390666E+35 AMF
TEST        OWNER                 3,128   3.390676E+35 AMS
TEST        OWNER                 3,317   3.390679E+35 AMV
TEST        OWNER                 3,561   3.390679E+35 AMW
TEST        OWNER                 4,356   3.391219E+35 AP
TEST        OWNER                 7,147   3.391283E+35 APPLSYS
TEST        OWNER                 7,164   3.391283E+35 APPLSYSPUB
TEST        OWNER                 9,375   3.391625E+35 AR
TEST        OWNER                 9,385   3.391883E+35 ASF
TEST        OWNER                 9,539   3.391884E+35 ASG
TEST        OWNER                 9,592   3.391888E+35 ASL
TEST        OWNER                 9,596   3.391889E+35 ASN
TEST        OWNER                 9,881   3.391890E+35 ASO
TEST        OWNER                 9,885   3.391891E+35 ASP
TEST        OWNER                 9,923   3.391894E+35 AST
TEST        OWNER                10,064   3.392841E+35 AX
TEST        OWNER                10,138   3.393247E+35 AZ
TEST        OWNER                12,921   3.440973E+35 BEN
TEST        OWNER                12,991   3.441775E+35 BIC
TEST        OWNER                13,217   3.441782E+35 BIL
TEST        OWNER                14,055   3.441783E+35 BIM
TEST        OWNER                15,552   3.441788E+35 BIS
TEST        OWNER                15,591   3.441790E+35 BIV
TEST        OWNER                15,784   3.441792E+35 BIX
TEST        OWNER                15,934   3.442791E+35 BNE
TEST        OWNER                16,721   3.443000E+35 BOM
TEST        OWNER                17,021   3.443803E+35 BSC

32 rows selected.

SQL> drop table test
  2  /

Table dropped.



1. The good old ANALYZE command (obsolete as stated in Oracle docs) produces more expected, correct result, but new dbms_stat does not. So, where is bug? Or both work correct, but every in its own fashion?

2. You write "we meant to do this". Please, explain that exactly do you mean. Does Oracle try to compress histogram? But why? There is no need in above case to do this! There are 32 distinct values and I try to create the same number of buckets - "please, just do it!". You write about one dominating value - I excluded that one - no changes. Furthermore, histograms are "born" to deal with uneven distributed data! Aren't they?

3. Where are values in ENDPOINT_ACTUAL_VALUE column? Why they appears and disappears so mystically?</code>
Tom Kyte

Followup  

August 08, 2008 - 1:10 pm UTC

dbms_stats is the thing that the optimizer expects to have generated the statistics.

dbms_stats is what is tested with the optimizer, analyze is not.

dbms_stats and analyze have always returned different results.

If you believe this to be a bug, you'll want to file it in metalink, but I do believe they'll close it as "not a bug, we programmed it that way".

More to Explore

DBMS_STATS

More on PL/SQL routine DBMS_STATS here