Thanks for the question, Peter.

Asked: April 11, 2004 - 12:52 pm UTC

Last updated: July 31, 2017 - 3:01 am UTC

Version: 9.2.0.5

Viewed 1000+ times

You Asked

Hi Tom,

I know I read this somewhere, but I forgot where. I was hoping you can help me or point me to the correct link.

Anyway, I have an attribute column storing data ranging from 0..100.

I would like an output to tell my how many records are within each range using a histogram output. E.g.

0 - 10 *****
11 - 20 **
21 - 30 ****
31 - 40 *********
41 - 50 *
51 - 60 **************
61 - 70 ***
71 - 80 ******
81 - 90 ********
91 - 100 *

What would be the most efficient way to do this? I could do this in JAVA pretty fast, if I had the data in memory. In Java I would only need to make one pass through the dataset, but I think there's got to be a way to do this just as efficiently on the database.

I tried using an inline subquery, for each range above, but I don't think that's the best approach. It seems that each range would make a pass through the data which means if I have 10 range buckets, then I'm making 10 passes through the table.

Thanks,
-Peter

and Tom said...

ops$tkyte@ORA9IR2> create table t
2 as
3 select rownum-1 data
4 from all_objects
5 where rownum <= 101
6 /

Table created.

ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select data, width_bucket( data, 0, 101, 10 )
2 from t
3 /

DATA WIDTH_BUCKET(DATA,0,101,10)
---------- ---------------------------
0 1
1 1
2 1
3 1
.....
96 10
97 10
98 10
99 10
100 10

101 rows selected.

ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select width_bucket(data,0,101,10), count(*)
2 from t
3 group by width_bucket(data,0,101,10)
4 /

WIDTH_BUCKET(DATA,0,101,10) COUNT(*)
--------------------------- ----------
1 11
2 10
3 10
4 10
5 10
6 10
7 10
8 10
9 10
10 10

10 rows selected.

(heck for this one simple DIVISION would do it for you!

ops$tkyte@ORA9IR2> select trunc((data-0.000001)/10)+1, count(*)
2 from t
3 group by trunc((data-0.000001)/10)+1
4 /

TRUNC((DATA-0.000001)/10)+1 COUNT(*)
--------------------------- ----------
1 11
2 10
3 10
4 10
5 10
6 10
7 10
8 10
9 10
10 10

10 rows selected.

..... I'll take your java code on speedwise -- both in time to code/debug and runtime performance anyday for that one :)

Rating

(5 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Excellent

Peter Tran, April 11, 2004 - 5:45 pm UTC

Touché!

Thank you!

-Peter

A reader, April 12, 2004 - 6:38 am UTC

width_bucket understanding wrt performance

a, June 04, 2004 - 7:56 am UTC

TOM

I AM GETTING scare as still need lot of understanding :(

Please help me to understand concept of width_bucket

June 04, 2004 - 10:43 am UTC

it is just a histogram function?

</code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:707586567563#15793066195196 <code>

is another example -- it is documented. Just breaks a range of values you pass in, into N "buckets" and assigns a bucket number to each row.

Finishing touches

Jack, June 07, 2004 - 12:19 pm UTC

Here's a way to translate your counts into bars:

  1  select rownum, lpad('*',rownum,'*') hist
  2  from all_objects
  3* where rownum <= 10
SQL> /

    ROWNUM HIST
---------- --------------------
         1 *
         2 **
         3 ***
         4 ****
         5 *****
         6 ******
         7 *******
         8 ********
         9 *********
        10 **********

10 rows selected.

Empty buckets, how to display them?

José, July 28, 2017 - 8:04 am UTC

Hi Tom,
thanks for your explanation. I like this feature!

I was just wondering if there is a (simple) way to make empty "buckets" visible. That would be of great help to display a histogram without distorting the x axis.

One example: I have an application where case numbers are created. However, as data is imported from different locations, not every number range is used: we have some "holes".

CREATE TABLE SAMPLE_TABLE (CASENUMBER number);

The table is populated from various sources.

SELECT     WIDTH_BUCKET (CASENUMBER, 1, 5000000000, 500) * 10000000 - 10000000 "Bereich (von)",
           WIDTH_BUCKET (CASENUMBER, 1, 5000000000, 500) * 10000000 - 1        "Bereich (bis)",
           COUNT(*)   Anzahl
    FROM   SAMPLE_TABLE
GROUP BY   WIDTH_BUCKET (CASENUMBER, 1, 5000000000, 500)
ORDER BY 1 ASC;

My example shows:

Bereich (von)   Bereich (bis)   Anzahl
3050000000      3059999999      16903
3060000000      3069999999      38977
3070000000      3079999999      39091
3090000000      3099999999      1336

As we can see, there is a "hole" for the range between 3080000000 and 3089999999. I would like to display that as follows:

Bereich (von)   Bereich (bis)   Anzahl
3050000000      3059999999      16903
3060000000      3069999999      38977
3070000000      3079999999      39091
3080000000      3089999999      0
3090000000      3099999999      1336

Is there any way to do that?

Thanks!

July 31, 2017 - 3:01 am UTC

You just need to enumerate the possible buckets, and then outer join

SQL> with all_buckets as
  2    ( select rownum-1 bk from dual
  3      connect by level <= 12 ),
  4  original_data as
  5    (
  6    select width_bucket(object_id,1,1000,10) w, count(*) hist
  7    from t
  8    group by width_bucket(object_id,1,1000,10)
  9    )
 10  select bk, nvl(hist,0) hist
 11  from original_data, all_buckets
 12  where bk = w(+)
 13  order by 1;

        BK       HIST
---------- ----------
         0          0
         1         99
         2        100
         3         99
         4         99
         5          0
         6          0
         7        100
         8        100
         9        100
        10         99
        11          0

12 rows selected.

Live SQL

Share and learn SQL and PL/SQL; free access to the latest version of Oracle Database!

Dev Gym

Classes, workouts and quizzes on Oracle Database technologies. Expertise through exercise!

Breadcrumb

Question and Answer

You Asked

and Tom said...

Rating

Comments

Excellent

width_bucket understanding wrt performance

Finishing touches

Empty buckets, how to display them?

More to Explore

Live SQL

Dev Gym

Generating histogram output against a column

Breadcrumb

Question and Answer

You Asked

and Tom said...

Rating

Comments

Excellent

width_bucket understanding wrt performance

Finishing touches

Empty buckets, how to display them?

More to Explore

Live SQL

Dev Gym