Thanks for the question, Marc.

Asked: July 05, 2005 - 10:10 am UTC

Last updated: October 19, 2005 - 4:37 pm UTC

Version: 9.2.0

Viewed 1000+ times

You Asked

Hi Tom

My question is really simple and came from the doc (SQL Reference Manual) :

"The GROUP BY clause groups rows but does not guarantee the order of the result set. To order the groupings, use the ORDER BY clause."

Ok but I can not find an example (even on your site and believe me I use all the possible combinations with words like 'GROUP BY' , 'NOSORT', 'not sorted', 'oder by' etc)

Moreover I do not really understand how oracle can do a GROUP BY without sorting the data.

So could you please explain it to me and of course, give me an example where the GROUP BY clause needs a ORDER BY to sort the data in the sequence of the columns specified in the GROUP BY.

Regards

Marc

and Tom said...

ops$tkyte@ORA9IR2> create table t ( x int not null, y int );

Table created.

ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> insert into t values ( 2, 1 );

1 row created.

ops$tkyte@ORA9IR2> insert into t values ( 1, 2 );

1 row created.

ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create index t_idx on t(y,x);

Index created.

ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select x, y, count(*) from t group by x, y;

X Y COUNT(*)
---------- ---------- ----------
1 2 1
2 1 1

ops$tkyte@ORA9IR2> exec dbms_stats.gather_table_stats( user, 'T', cascade=>true );

PL/SQL procedure successfully completed.

ops$tkyte@ORA9IR2> select x, y, count(*) from t group by x, y;

X Y COUNT(*)
---------- ---------- ----------
2 1 1
1 2 1

Rating

(5 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Marc Musette, July 05, 2005 - 11:28 am UTC

Thanks a lot but maybe I can have the explanation confirmed : the GROUP BY will use the index to group the columns , which is not necessary the order of the GROUP BY columns ???

is it already true in 8i ?

July 05, 2005 - 11:38 am UTC

this is true in 7.0

partitioning, parallel query, other things can affect this as well.

bottom line: group by does not sort data

and even when it does sort as a by product of its processing, the sort is a BINARY SORT, not your sort (with your language and character set)

Marc Musette, July 05, 2005 - 11:48 am UTC

OK

but in this case preciselly : due to the index or not

as always : thanks a lot

July 05, 2005 - 12:13 pm UTC

this one is due to the index, yes.

Thanks Tom

Marc Musette, July 06, 2005 - 8:16 am UTC

Found another neat case

Thomas Kyte, July 08, 2005 - 2:58 pm UTC

here is a group by in 10gr2, was "binning" some object ids, wanted to have 5% ranges....

ops$tkyte-ORA10GR2> set autotrace on explain
ops$tkyte-ORA10GR2> select min(object_id), max(object_id), nt from ( select object_id, ntile(20) over (order by object_id) nt from big_table ) group by nt;

MIN(OBJECT_ID) MAX(OBJECT_ID) NT
-------------- -------------- ----------
2 2471 1
12853 15352 6
25353 27852 11
30353 32852 13
2471 4925 2
32853 35352 14
49111 52553 20
7574 10352 4
10353 12852 5
17853 20352 8
40361 43723 17
4925 7573 3
15353 17852 7
43724 46398 18
20353 22852 9
22853 25352 10
27853 30352 12
35353 37856 15
37857 40360 16
46399 49110 19

20 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 3192204185

---------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
---------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000K| ...
| 1 | HASH GROUP BY | | 1000K| ...
| 2 | VIEW | | 1000K| ...
| 3 | WINDOW SORT | | 1000K| ...
| 4 | TABLE ACCESS FULL| BIG_TABLE | 1000K| ...
---------------------------------------------------------

hash group by....

Ah-ha...an explanation for the mysterious staircase

Mark, October 19, 2005 - 3:43 pm UTC

Hey Tom,

I guess this binary sorting explains this "staircase" type of pattern we've been experiencing. We have a query that's executed frequently. It displays stats about each employee - each employee is associated with an office. We group by office, then employee name. There is no order by clause.

The results always come out as if we order by length(empname). I can't list the actual query results, but they look something like this:

office employee name stats columns...
------ ------------- ----------------
A ***
A *****
A *****
A *******
A ********
A **********
A ***********
B **
B ****
B ******
B ********
B *********
C ****
C ******
C ***********
C *************
C *************

We actually have many more employees than that, so the pattern is even more pronounced. I just had to find an explanation for this phenomenon... I'm satifised with binary sorting when group by is used. :)

Mark

October 19, 2005 - 4:37 pm UTC

but beware - it'll become a group by hash in 10gr2 -- really random ;)

meaning, unless you have an ORDER BY, don't expect the data to be sorted by anything, you have no right to expect it.

Breadcrumb

Question and Answer

You Asked

and Tom said...

Rating

Comments

Thanks Tom

Found another neat case

Ah-ha...an explanation for the mysterious staircase

More to Explore

PL/SQL demos

PL/SQL docs

GROUP BY and sorting data

Breadcrumb

Question and Answer

You Asked

and Tom said...

Rating

Comments

Thanks Tom

Found another neat case

Ah-ha...an explanation for the mysterious staircase

More to Explore

PL/SQL demos

PL/SQL docs