Skip to Main Content

Breadcrumb

Dev Live Dev Intro

This month we are celebrating Developers at AskTOM. We welcome Developers of all levels of experience to join us at our FREE Developer Live events coming in August. Just click on the left to register today! If you are brand new to Database Technology, then we also have got you covered. Just click on the right for your comprehensive FREE training program to kick start your Oracle Database Development journey!

Question and Answer

Tom Kyte

Thanks for the question, Marc.

Asked: July 05, 2005 - 10:10 am UTC

Answered by: Tom Kyte - Last updated: October 19, 2005 - 4:37 pm UTC

Category: Database - Version: 9.2.0

Viewed 1000+ times

You Asked

Hi Tom

My question is really simple and came from the doc (SQL Reference Manual) :

"The GROUP BY clause groups rows but does not guarantee the order of the result set. To order the groupings, use the ORDER BY clause."

Ok but I can not find an example (even on your site and believe me I use all the possible combinations with words like 'GROUP BY' , 'NOSORT', 'not sorted', 'oder by' etc)

Moreover I do not really understand how oracle can do a GROUP BY without sorting the data.

So could you please explain it to me and of course, give me an example where the GROUP BY clause needs a ORDER BY to sort the data in the sequence of the columns specified in the GROUP BY.

Regards


Marc



and we said...

ops$tkyte@ORA9IR2> create table t ( x int not null, y int );

Table created.

ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> insert into t values ( 2, 1 );

1 row created.

ops$tkyte@ORA9IR2> insert into t values ( 1, 2 );

1 row created.

ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create index t_idx on t(y,x);

Index created.

ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select x, y, count(*) from t group by x, y;

X Y COUNT(*)
---------- ---------- ----------
1 2 1
2 1 1

ops$tkyte@ORA9IR2> exec dbms_stats.gather_table_stats( user, 'T', cascade=>true );

PL/SQL procedure successfully completed.

ops$tkyte@ORA9IR2> select x, y, count(*) from t group by x, y;

X Y COUNT(*)
---------- ---------- ----------
2 1 1
1 2 1



and you rated our response

  (5 ratings)

Is this answer out of date? If it is, please let us know via a Review

Reviews

July 05, 2005 - 11:28 am UTC

Reviewer: Marc Musette

Thanks a lot but maybe I can have the explanation confirmed : the GROUP BY will use the index to group the columns , which is not necessary the order of the GROUP BY columns ???

is it already true in 8i ?

Tom Kyte

Followup  

July 05, 2005 - 11:38 am UTC

this is true in 7.0

partitioning, parallel query, other things can affect this as well.

bottom line: group by does not sort data


and even when it does sort as a by product of its processing, the sort is a BINARY SORT, not your sort (with your language and character set)

July 05, 2005 - 11:48 am UTC

Reviewer: Marc Musette

OK

but in this case preciselly : due to the index or not

as always : thanks a lot

Tom Kyte

Followup  

July 05, 2005 - 12:13 pm UTC

this one is due to the index, yes.

Thanks Tom

July 06, 2005 - 8:16 am UTC

Reviewer: Marc Musette


Found another neat case

July 08, 2005 - 2:58 pm UTC

Reviewer: Thomas Kyte from Leesburg VA

here is a group by in 10gr2, was "binning" some object ids, wanted to have 5% ranges....

ops$tkyte-ORA10GR2> set autotrace on explain
ops$tkyte-ORA10GR2> select min(object_id), max(object_id), nt from ( select object_id, ntile(20) over (order by object_id) nt from big_table ) group by nt;

MIN(OBJECT_ID) MAX(OBJECT_ID) NT
-------------- -------------- ----------
2 2471 1
12853 15352 6
25353 27852 11
30353 32852 13
2471 4925 2
32853 35352 14
49111 52553 20
7574 10352 4
10353 12852 5
17853 20352 8
40361 43723 17
4925 7573 3
15353 17852 7
43724 46398 18
20353 22852 9
22853 25352 10
27853 30352 12
35353 37856 15
37857 40360 16
46399 49110 19

20 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 3192204185

---------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
---------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000K| ...
| 1 | HASH GROUP BY | | 1000K| ...
| 2 | VIEW | | 1000K| ...
| 3 | WINDOW SORT | | 1000K| ...
| 4 | TABLE ACCESS FULL| BIG_TABLE | 1000K| ...
---------------------------------------------------------


hash group by....

Ah-ha...an explanation for the mysterious staircase

October 19, 2005 - 3:43 pm UTC

Reviewer: Mark from NY

Hey Tom,

I guess this binary sorting explains this "staircase" type of pattern we've been experiencing. We have a query that's executed frequently. It displays stats about each employee - each employee is associated with an office. We group by office, then employee name. There is no order by clause.

The results always come out as if we order by length(empname). I can't list the actual query results, but they look something like this:

office employee name stats columns...
------ ------------- ----------------
A ***
A *****
A *****
A *******
A ********
A **********
A ***********
B **
B ****
B ******
B ********
B *********
C ****
C ******
C ***********
C *************
C *************

We actually have many more employees than that, so the pattern is even more pronounced. I just had to find an explanation for this phenomenon... I'm satifised with binary sorting when group by is used. :)

Mark

Tom Kyte

Followup  

October 19, 2005 - 4:37 pm UTC

but beware - it'll become a group by hash in 10gr2 -- really random ;)

meaning, unless you have an ORDER BY, don't expect the data to be sorted by anything, you have no right to expect it.

More to Explore

DBMS_STATS

More on PL/SQL routine DBMS_STATS here