Skip to Main Content

Breadcrumb

Question and Answer

Connor McDonald

Thanks for the question, mangesh.

Asked: October 03, 2016 - 12:09 pm UTC

Last updated: October 04, 2016 - 11:39 am UTC

Version: 11gR2

Viewed 1000+ times

You Asked

Hi,

I have table with around 30 million rows in it. There is number(10) column in it.
Column is having around 9000 distinct values and height balanced histogram exists on it.

When I fire simple query shown below, the estimates are way off:

SQL_ID 76pfz1vt2scjh, child number 0
-------------------------------------
SELECT /*+ gather_plan_statistics */ COUNT(*) FROM payment
WHERE DOMAIN_ID=131902

Plan hash value: 1268272264

---------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads |
---------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:03.08 | 1986 | 1979 |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:03.08 | 1986 | 1979 |
|* 2 | INDEX RANGE SCAN| payment_IX04 | 1 | 2778 | 126K|00:00:06.90 | 1986 | 1979 |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("DOMAIN_ID"=131902)

If I replace DOMAIN_ID value with anything else, estimated value remains same though.
What could be the reason for this mismatch even though histogram exists? Does it mean height-balanced not always accurate?

Thanks.

and Connor said...

The moment you have more than 255 distinct values (in 11g), then by definition, you are not guaranteed to have accurate estimate for a given distinct value.

You will possibly have a *better* estimate than if you didnt have histograms at all, but you obviously cant record accurate measurements on 9000 things with 255 buckets.

For example, here's a table with values 1 .. 5000, such that each value occurs that number of times. So there is 1 row with value 1, 2 rows with value 2, ..., 5000 rows with value 5000.

So there is skew in there - I grab a histogram, and look at what results

SQL> exec dbms_stats.gather_table_stats('','T',method_opt=>'for all columns size 254');

PL/SQL procedure successfully completed.

SQL> select endpoint_number, endpoint_value
  2  from   user_histograms
  3  where  table_name = 'T'
  4  and    column_name = 'R'
  5  order by 1;

ENDPOINT_NUMBER ENDPOINT_VALUE
--------------- --------------
              0             86
              1            354
              2            461
              3            556
              4            621
              5            701
              6            766
              7            824
              8            885
              9            931
             10            973
             11           1013
             12           1050
             13           1084
             14           1148
             15           1197
             16           1230
             17           1288
             18           1332
             19           1368
             20           1398
             21           1432
             22           1459
             23           1495
             24           1528
             25           1550
             26           1580
             27           1611
             28           1647
             29           1667
             30           1694
             31           1724
             32           1748
             33           1777
             34           1809
             35           1835
             36           1863
             37           1888
             38           1905
             39           1931
             40           1968
             41           1992
             42           2012
             43           2038
             44           2068
             45           2093
             46           2114
             47           2131
             48           2161
             49           2180
             50           2198
             51           2223
             52           2237
             53           2259
             54           2274
             55           2304
             56           2323
             57           2344
             58           2363
             59           2388
             60           2408
             61           2425
             62           2447
             63           2464
             64           2480
             65           2510
             66           2524
             67           2549
             68           2569
             69           2593
             70           2615
             71           2635
             72           2653
             73           2676
             74           2697
             75           2713
             76           2732
             77           2756
             78           2772
             79           2787
             80           2808
             81           2835
             82           2852
             83           2871
             84           2892
             85           2912
             86           2926
             87           2945
             88           2959
             89           2977
             90           2992
             91           3013
             92           3038
             93           3057
             94           3071
             95           3096
             96           3111
             97           3127
             98           3143
             99           3159
            100           3175
            101           3191
            102           3207
            103           3219
            104           3236
            105           3251
            106           3263
            107           3279
            108           3291
            109           3303
            110           3314
            111           3329
            112           3341
            113           3362
            114           3375
            115           3395
            116           3412
            117           3429
            118           3450
            119           3464
            120           3479
            121           3494
            122           3513
            123           3525
            124           3538
            125           3553
            126           3565
            127           3577
            128           3588
            129           3603
            130           3617
            131           3631
            132           3643
            133           3655
            134           3665
            135           3674
            136           3685
            137           3698
            138           3712
            139           3725
            140           3736
            141           3749
            142           3769
            143           3779
            144           3793
            145           3805
            146           3814
            147           3827
            148           3836
            149           3853
            150           3863
            151           3883
            152           3894
            153           3904
            154           3918
            155           3930
            156           3940
            157           3955
            158           3965
            159           3978
            160           3990
            161           4008
            162           4023
            163           4038
            164           4051
            165           4063
            166           4075
            167           4088
            168           4098
            169           4112
            170           4123
            171           4133
            172           4142
            173           4155
            174           4172
            175           4188
            176           4197
            177           4211
            178           4222
            179           4237
            180           4248
            181           4258
            182           4269
            183           4280
            184           4287
            185           4297
            186           4309
            187           4324
            188           4331
            189           4347
            190           4360
            191           4372
            192           4382
            193           4391
            194           4403
            195           4410
            196           4423
            197           4429
            198           4436
            199           4448
            200           4461
            201           4471
            202           4483
            203           4492
            204           4504
            205           4513
            206           4524
            207           4533
            208           4544
            209           4550
            210           4561
            211           4570
            212           4580
            213           4591
            214           4601
            215           4611
            216           4620
            217           4631
            218           4641
            219           4651
            220           4662
            221           4672
            222           4683
            223           4695
            224           4703
            225           4712
            226           4721
            227           4732
            228           4745
            229           4756
            230           4768
            231           4779
            232           4787
            233           4797
            234           4802
            235           4811
            236           4819
            237           4829
            238           4843
            239           4853
            240           4861
            241           4872
            242           4885
            243           4893
            244           4906
            245           4919
            246           4927
            247           4938
            248           4948
            249           4958
            250           4967
            251           4975
            252           4980
            253           4991
            254           5000

255 rows selected.




Notice that there is no repeated value in the histogram. So when I do

select * from t where r = ...

the best we can do is know that it falls into one of the buckets, which isnt much different to having no buckets.

That doesn't mean the histogram is useless... for example, for a range scan, it improves our estimates a lot:

SQL> select count(*) from t where r > 4000;

Explained.

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     4 |  5280   (2)| 00:01:04 |
|   1 |  SORT AGGREGATE    |      |     1 |     4 |            |          |
|*  2 |   TABLE ACCESS FULL| T    |  4599K|    17M|  5280   (2)| 00:01:04 |
---------------------------------------------------------------------------

SQL> select count(*) from t where r > 4000;

  COUNT(*)
----------
   4500500



Rating

  (1 rating)

Is this answer out of date? If it is, please let us know via a Comment

Comments

questions on mangesh data

MatteoP, October 04, 2016 - 10:45 am UTC

Hello Connor,
your explaination is very clear; I've done the math with your example data and everything makes sense now!
What I do not understand is the data provided in original question:
1. E-Rows 2778, which is close to cardinality/Number Distinct Value (30M/9K), as if optimizer ignored histogram while estimating
2. A-Rows 126K, which is close to cardinality/N of Bucket (30M/254), as if bucket width for the given domain_id is (accidentally?) accurate
3. Range Scan A-Time is 6.9s while the query overall took 3.08s

Can you explain if those are plausible or I am missing something?

Connor McDonald
October 04, 2016 - 11:39 am UTC

1. E-Rows 2778, which is close to cardinality/Number Distinct Value (30M/9K), as if optimizer ignored histogram while estimating

The way to tell this would be with an optimizer trace (10053), but as per my example, if an equality predicate falls into a single bucket in the height balanced histogram, then we're not really gleaning any additional information from it, so card/num_dist is about as good as we can do.

2. A-Rows 126K, which is close to cardinality/N of Bucket (30M/254), as if bucket width for the given domain_id is (accidentally?) accurate

Presumably though, this is the case for this particular domain_id. It can be the case for all of them, otherwise you'd have more than 30m rows.

3. Range Scan A-Time is 6.9s while the query overall took 3.08s

Often an issue with what precision the OS can offer. For example, if the time elapsed system call (or more accurately, the delta between two "what is the current time" calls) is to the nearest (say) centisecond, then we can easily over count.

More to Explore

PL/SQL demos

Check out more PL/SQL tutorials on our LiveSQL tool.

PL/SQL docs

PL/SQL reference manual from the Oracle documentation library