Skip to Main Content
  • Questions
  • Does a Group By gaurantee that output data is sorted?

Breadcrumb

May 4th

Question and Answer

Chris Saxon

Thanks for the question, Ravi.

Asked: October 14, 2006 - 4:36 pm UTC

Last updated: October 07, 2009 - 8:39 am UTC

Version: 9.0.4

Viewed 10K+ times! This question is

You Asked

Does a Group By clause in an Query gaurantee that the output data will be sorted on the Group By columns in order, even if there is NO Order By clause?

and Tom said...

ABSOLUTELY NOT,

It never has, it never did, it never will.


Here is a really old discussion on this:

</code> http://groups.google.com/group/comp.databases.oracle.misc/msg/678c556eb264b60c?dmode=source&hl=en

<b>without an order by, there is absolutely no way you can make any assumptions regarding the sorted order of data returned by a query</b>

Read this too:
http://asktom.oracle.com/Misc/order-in-court.html <code>


Rating

  (66 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

really ?

Sokrates, October 15, 2006 - 6:17 am UTC

"without an order by, there is absolutely no way you can make any assumptions
regarding the sorted order of data returned by a query"

really ?
no exceptions to this rule ?

what about querying a pipelined table function, as first counterexample ?

for example

select * from table(dbms_xplan.display)
always gave me a pretty well-sorted-looking resultset
*EVERY TIME* I used it
(some 1000s of times so far)

without any order by !


Tom Kyte
October 15, 2006 - 6:43 am UTC

I flipped a coin......

Anyway, what happens when parallel kicks in? for example.

Yeah, a simple query against a pipelined function will probably return the data in the order the pipelined function "piped" it. so?

Looking at that query, what order will the rows be in? what happens when I rewrite dbms_xplan.display?


IF YOU WANT DATA TO BE SORTED IN SOME ORDER, well, you know the rest.


What about ROLLUP?

Robert Massey, October 15, 2006 - 12:03 pm UTC

I've noticed that ROLLUP seems to always return sorted results, but is that just an accident of the GROUP BY?

If ROLLUP isn't reliable for sorting, a combination of GROUPING functions in the ORDER BY clause would probably be necessary to get the desired sort.

Tom Kyte
October 15, 2006 - 1:11 pm UTC

ops$tkyte%ORA10GR2> create table emp as select * from scott.emp;

Table created.

ops$tkyte%ORA10GR2> exec dbms_stats.set_table_stats( user, 'EMP', numrows=>100000, numblks => 10000 );

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2> @at
ops$tkyte%ORA10GR2> column PLAN_TABLE_OUTPUT format a72 truncate
ops$tkyte%ORA10GR2> set autotrace on explain
ops$tkyte%ORA10GR2> select /*+ parallel( emp 2 ) */ ename, sum(sal), grouping(ename) from emp group by rollup(ename)
  2  /

ENAME        SUM(SAL) GROUPING(ENAME)
---------- ---------- ---------------
                29025               1
FORD             3000               0
ADAMS            1100               0
BLAKE            2850               0
CLARK            2450               0
JAMES             950               0
SCOTT            3000               0
MILLER           1300               0
KING             5000               0
WARD             1250               0
ALLEN            1600               0
JONES            2975               0
SMITH             800               0
MARTIN           1250               0
TURNER           1500               0

15 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 1391315833

------------------------------------------------------------------------
| Id  | Operation                 | Name     | Rows  | Bytes | Cost (%CP
------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |          |   100K|  1953K|  1717   (
|   1 |  PX COORDINATOR           |          |       |       |
|   2 |   PX SEND QC (RANDOM)     | :TQ10001 |   100K|  1953K|  1717   (
|   3 |    SORT GROUP BY          |          |   100K|  1953K|  1717   (
|   4 |     PX RECEIVE            |          |   100K|  1953K|  1717   (
|   5 |      PX SEND HASH         | :TQ10000 |   100K|  1953K|  1717   (
|   6 |       SORT GROUP BY ROLLUP|          |   100K|  1953K|  1717   (
|   7 |        PX BLOCK ITERATOR  |          |   100K|  1953K|  1703   (
|   8 |         TABLE ACCESS FULL | EMP      |   100K|  1953K|  1703   (
------------------------------------------------------------------------

ops$tkyte%ORA10GR2> select ename, sum(sal), grouping(ename) from emp group by rollup(ename)
  2  /

ENAME        SUM(SAL) GROUPING(ENAME)
---------- ---------- ---------------
ADAMS            1100               0
ALLEN            1600               0
BLAKE            2850               0
CLARK            2450               0
FORD             3000               0
JAMES             950               0
JONES            2975               0
KING             5000               0
MARTIN           1250               0
MILLER           1300               0
SCOTT            3000               0
SMITH             800               0
TURNER           1500               0
WARD             1250               0
                29025               1

15 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 52302870

------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time
------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |   100K|  1953K|  3114   (2)| 00:00
|   1 |  SORT GROUP BY ROLLUP|      |   100K|  1953K|  3114   (2)| 00:00
|   2 |   TABLE ACCESS FULL  | EMP  |   100K|  1953K|  3089   (2)| 00:00
------------------------------------------------------------------------

ops$tkyte%ORA10GR2> set autotrace off
 

Robert Massey, October 15, 2006 - 4:36 pm UTC

I didn't think I was going to get away with that one. :-)


index_desc hint and the order by

Mohamed, October 16, 2006 - 7:59 am UTC

Dear Tom,

you said
without an order by, there is absolutely no way you can make any assumptions
regarding the sorted order of data returned by a query

--- ---------
I have tested a select like this

select /*+ index_desc(aa pk_index_name) */
aa.pk, aa.xxx
from table T aa


and my output data have always been ordered by pk
without a need to have an order by pk

Kind Regars




Tom Kyte
October 16, 2006 - 8:05 am UTC

so, i flipped a coin and every time is comes up heads.

therefore, coins when flipped will come up heads. I win right?


ARG, UGH, unbelievable - I just don't get the persistence people keep showing with such an obvious thing here.

Tell me - DOES ORACLE HAVE TO OBEY YOUR HINT. does it?

Ok, let us say it even DOES follow your hint, did you, umm, well, read the order in the court link above? You might find it illustrative.

I sort of have an example there that shows an index hint, the index hint being used, and the data - well - the data is sort of not sorted.


IF YOU WANT YOUR DATA SORTED IN SOME ORDER THAT YOU CAN RELY ON YOU WILL USE ORDER BY.

No if and or buts about it, period, it is a fact, anyone that says otherwise is wrong, not only wrong but DANGEROUSLY wrong




If your goal was to retrieve data from your query as fast as possible - then your approach should have been first rows optimization, if an index would could be used to skip the sort - it would skip the sort.

I won't even get into the "binary sort vs linguistic sort" that would take place with 8bit (or multi-byte) data as well - that would return data that is not sorted correctly when using an index.

IOT?

Duke Ganote, October 16, 2006 - 8:39 am UTC

Would an exception be an IOT? There's no heap to scan through, and I can't imagine Oracle would "leap around" the index.

Tom Kyte
October 16, 2006 - 10:24 am UTC

index full scan on the IOT would retrieve the data accidently by primary key.

index FAST full scan on the IOT would retrieve data as it hits it, in any order.


There is a HEAP to scan through of sorts - there are many ways to access data, index full scan, index range scan, parallel index range scan, partitioning, index fast full scan

<b>IF YOU WANT DATA SORTED, THOU SHALT USE ORDER BY

and if the optimizer can use an index retrieve the data sorted AND it makes sense to use said index - the optimizer WILL skip the ORDER BY (knowing it has been satisfied)

So, the optimizer can skip the ORDER BY - but YOU CANNOT EVER skip the order by.

If you do, it is a BUG</b>



ops$tkyte%ORA9IR2> create table t
  2  ( user_id number primary key,
  3    username varchar2(30),
  4    created  date,
  5    data char(2000)
  6  )
  7  organization index;

Table created.

ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> begin
  2          for i in 1..20
  3          loop
  4                  insert into t values ( -i, i, sysdate, i );
  5                  commit;
  6          end loop;
  7  end;
  8  /

PL/SQL procedure successfully completed.

ops$tkyte%ORA9IR2> exec dbms_stats.set_table_stats( user, 'T', numrows => 100000 );

PL/SQL procedure successfully completed.

ops$tkyte%ORA9IR2> set autotrace on explain
ops$tkyte%ORA9IR2> select user_id, username from t;

   USER_ID USERNAME
---------- ------------------------------
       -20 20
       -19 19
        -3 3
        -2 2
        -1 1
        -6 6
        -5 5
        -4 4
        -9 9
        -8 8
        -7 7
       -12 12
       -11 11
       -10 10
       -15 15
       -14 14
       -13 13
       -18 18
       -17 17
       -16 16

20 rows selected.


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=4 Card=100000 Bytes=3000000)
   1    0   INDEX (FAST FULL SCAN) OF 'SYS_IOT_TOP_31792' (UNIQUE) (Cost=4 Card=100000 Bytes=3000000)



ops$tkyte%ORA9IR2> select /*+ first_rows */ user_id, username from t;

   USER_ID USERNAME
---------- ------------------------------
       -20 20
       -19 19
       -18 18
       -17 17
       -16 16
       -15 15
       -14 14
       -13 13
       -12 12
       -11 11
       -10 10
        -9 9
        -8 8
        -7 7
        -6 6
        -5 5
        -4 4
        -3 3
        -2 2
        -1 1

20 rows selected.


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=HINT: FIRST_ROWS (Cost=26 Card=100000 Bytes=3000000)
   1    0   INDEX (FULL SCAN) OF 'SYS_IOT_TOP_31792' (UNIQUE) (Cost=26 Card=100000 Bytes=3000000)



ops$tkyte%ORA9IR2> set autotrace off
 

to Duke Ganote

Michel Cadot, October 16, 2006 - 8:58 am UTC

Same answer with IOT.

SQL> create table t (id primary key) organization index
  2  as select object_id from dba_objects where object_id is not null order by dbms_random.value;

Table created.

SQL> exec dbms_stats.gather_table_stats (user, 'T', cascade=>true);

PL/SQL procedure successfully completed.

SQL> explain plan for select * from t;

Explained.

SQL> col PLAN_TABLE_OUTPUT format a78 trunc
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------
Plan hash value: 1035703402

------------------------------------------------------------------------------
| Id  | Operation            | Name              | Rows  | Bytes | Cost (%CPU)
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |                   | 49167 |   192K|    15   (0)
|   1 |  INDEX FAST FULL SCAN| SYS_IOT_TOP_59843 | 49167 |   192K|    15   (0)
------------------------------------------------------------------------------

8 rows selected.

Operation is "INDEX FAST FULL SCAN" that is (approximately) index is scanned in the order of the blocks not in the order of the keys. 

Tom Kyte
October 16, 2006 - 10:27 am UTC

thanks, I added an example to that effect right above before I read this :)

to: Michel Cadot

Duke Ganote, October 16, 2006 - 9:07 am UTC

Thank you! So even if Oracle guaranteed it would follow an index hint, the order is still not guaranteed because the "index is scanned in the order of the blocks not in the order of the keys."

to: Tom

Duke Ganote, October 16, 2006 - 10:42 am UTC

Nice short example on the IOT ordering... thanks!

How did it split?

Arian, October 16, 2006 - 2:21 pm UTC

Hi Tom,

I guess the order on your IOT is because of block splits? But I can't figure out why it comes out the way it does. I reproduced your results in 9.2.0.7 and 10.2.
And I just wonder why the first block would get the 2 lowest numbers (last block in). Followed by the block with the highest (first block in).
I would expect either a reversed order, or an order with -1 in the middle.

Thanks.

Arian

Tom Kyte
October 16, 2006 - 5:46 pm UTC

ASSM perhaps :) automatic segment space management likes to "spray data about"

but yes, splits and all are going to move the data around, I made it so that only 2 keys per leaf so every 2 inserts - "let us do a reorg of the structure" would happen.

sorted hash clusters

Tom, October 18, 2006 - 3:11 pm UTC

I hate to beat a dead horse, especially when I agree that order by does indeed order the data, but don't you think that it is about time for the documentation folks to agree with us? In Admin Workshop II they told us that using sorted hash clusters will sort your data (within a hash key?) without using order by, but what I think they meant to say was that using sorted hash clusters will skip the sorting step on your data when the order by in your query matches the sort in the hash cluster.

</code> http://download-east.oracle.com/docs/cd/B14117_01/server.101/b10739/hash.htm#ADMIN01901 <code>



Tom Kyte
October 18, 2006 - 3:55 pm UTC

You are correct, that is a documentation bug.

The ORDER BY must be included for correctness, the optimizer can decide that the physical sort is not necessary.

consider documentation bug filed.

I detect an Oracle magazine ASK TOM article forthcoming, no?

A reader, October 18, 2006 - 5:18 pm UTC

I know this is sometimes just an interview question, but it appears to be a widespread misconception...

Tom Kyte
October 18, 2006 - 5:46 pm UTC

you guess it :) I was a little short for my last submission - so this got submitted to fill it out..

Oramag-- AskTom

A reader, October 19, 2006 - 8:58 am UTC

another counterexample

Matthias Rogel, October 19, 2006 - 11:10 am UTC

a well documented case where data is sorted without any
order by is a hierarchical query without siblings.
for example
select level n from dual connect by level <= :b

compare discussion on
</code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:210612357425#43219762039024 <code>

Tom Kyte
October 19, 2006 - 2:14 pm UTC

but that isn't sorted, else we would not require ORDER SIBILINGS now would we.

It is returned unsorted, in the hierarchy.


scott%ORA10GR2> select level, rpad('*', 2*level, '*') || ename nm
2 from emp
3 start with mgr is null
4 connect by prior empno = mgr
5 /

LEVEL NM
---------- ---------------
1 **KING
2 ****JONES
3 ******SCOTT
4 ********ADAMS
3 ******FORD
4 ********SMITH
2 ****BLAKE
3 ******ALLEN
3 ******WARD
3 ******MARTIN
3 ******TURNER
3 ******JAMES
2 ****CLARK
3 ******MILLER

14 rows selected.

scott%ORA10GR2>
scott%ORA10GR2> select level, rpad('*', 2*level, '*') || ename nm
2 from emp
3 start with mgr is null
4 connect by prior empno = mgr
5 order SIBLINGS by ename
6 /

LEVEL NM
---------- ---------------
1 **KING
2 ****BLAKE
3 ******ALLEN
3 ******JAMES
3 ******MARTIN
3 ******TURNER
3 ******WARD
2 ****CLARK
3 ******MILLER
2 ****JONES
3 ******FORD
4 ********SMITH
3 ******SCOTT
4 ********ADAMS

14 rows selected.


That data wasn't "sorted" until and unless I sorted it with ORDER BY!


well well, I should have read the link before I worked up the example. Seems I already made that point

siblings and no siblings

Matthias Rogel, October 20, 2006 - 8:03 am UTC

your example had siblings
in which case we need an
ORDER SIBLINGS BY
to determine the order

I said, in a query WITHOUT SIBLINGS
(which means that there are no two distinct rows
with the same level) the order of the result set is
completely determined without any
ORDER BY clause,
namely the data will always be ordered by level

Tom Kyte
October 20, 2006 - 8:22 am UTC

where is it "well documented"...


anyway, I'll say it one more time for those that didn't see it before:

If you want data sorted in some specific order, then you had better well use ORDER BY or you cannot have ANY REASONABLE expectation of the order of the rows, period

while this connect by trick is neat, I would hardly hold it up as a shining example, if you want data sorted - you better SORT IT.

Mirjana, October 20, 2006 - 9:49 am UTC

Just one thing to comment ...
Recently we did a 8.0 to 10g upgrade to one of our clients databases, and next day all of the application's reports output data became unsorted and thus useless to users.
And all because back then (at the 8.0 version) someone thought that "group by" also meant "order by".
The solution was pretty expensive... re-enter and modify few hundreds of reports to add the "order by"...





Tom Kyte
October 20, 2006 - 9:55 am UTC

group by did NOT sort in 8.x in general either!!! they got lucky, plain and simple.

add an index - no sort
gather stats, plan changes - no sort
partition - no sort
hash index - no sort
parallel - no sort

unless and until there is an ORDER BY, well, you get my point by now I'm sure...

Group BY

Dan Clamage, October 20, 2006 - 2:53 pm UTC

I think the reason people think GROUP BY returns rows in sorted order is because for small data sets, it tends to do so, 90% (or better) of the time. They figure, hey, it's doing a sort anyway.

I bet there's a LOT of code that should be remediated to include ORDER BY.

I remember one kid who kept saying, "But we inserted the rows into the database in order! It should come out the way we put it in".

Tom Kyte
October 20, 2006 - 4:57 pm UTC

I remember the dba that asked me how to update a rowid.....

the developer wanted to put the rows back in the right order, seems they got out of order somehow.

documented

Matthias Rogel, October 20, 2006 - 2:53 pm UTC

'where is it "well documented"...'

</code> http://download-uk.oracle.com/docs/cd/B19306_01/server.102/b14200/queries003.htm#i2053972 <code>

not a shining example, I agree
I only wanted o say:
never say never

there is mostly a counterexample

temporary tables

Anthony, November 01, 2006 - 2:41 pm UTC

Tom, Excellent site.

It appears with a temporary table that the select always returns rows in the same order that they were inserted in:

- can you give a simple example in 10g that this is not the case, we have tried with varying row sizes but failed.

I'd like to demonstrate to some external developers that you always need an order by to guarantee the order - even with temporary tables

Thanks


Tom Kyte
November 01, 2006 - 6:25 pm UTC

i give up, i just totally give up. honestly. I entirely and totally *give up*

rolls eyes, walks away.


I hope your developers don't work on anything important. honestly, I don't know what else to say.


I
give
up


I just flipped a coin.... it always came up heads. ta-dah.

Refusing to Order By

Mike, November 01, 2006 - 7:20 pm UTC

Stop arguing with such people. Put it in your coding standards: If the rows must be in some order to meet the requirements or to function properly, you must code ORDER BY and there is no substitute and no exception.

One of the purposes of coding standards is to provide protection against those who do not (or will not) know better.

Chuck Jolley, November 02, 2006 - 11:37 am UTC

Is there some misunderstanding about the difference between the words "possibly" and "guaranteed"?

And why is this interesting?
What possible objection could anyone have to simply putting "order by" in their code if they want to guarantee the rows are sorted before they reach the client?
That's the part of this that I don't get.

guarateed

A reader, November 05, 2006 - 10:32 am UTC

rows are always ordered by rowum, that's for sure
you don't need an
"order by rownum"

Tom Kyte
November 05, 2006 - 11:00 am UTC

cott%ORA10GR2> select job, count(*), rownum from emp group by job, rownum;

JOB COUNT(*) ROWNUM
--------- ---------- ----------
CLERK 1 11
ANALYST 1 13
CLERK 1 1
PRESIDENT 1 9
SALESMAN 1 2
SALESMAN 1 3
SALESMAN 1 10
CLERK 1 12
CLERK 1 14
MANAGER 1 7
MANAGER 1 4
ANALYST 1 8
SALESMAN 1 5
MANAGER 1 6

14 rows selected.


really.

Temporary tables

Andrew Max, November 05, 2006 - 12:54 pm UTC

>> It appears with a temporary table that the select always returns rows 
>> in the same order that they were inserted in
>> ...
>> I'd like to demonstrate to some external developers that you always need 
>> an order by to guarantee the order - even with temporary tables

Anthony,

why do you need to prove obvious things to someone?
Ok, if they really wanna see an example with a temporary table - let them try this one:

SQL> select * from v$version where rownum = 1;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.1.0.3.0 - Prod

SQL> create global temporary table test
  2    (x number, padding varchar2(2000))
  3    on commit preserve rows;

Table created.

SQL> insert into test
  2  select rownum, rpad('*', 1000, '*')
  3    from all_objects
  4   where rownum < = 10;

10 rows created.

SQL> select x from test;

         X
----------
         1
         2
         3
         4
         5
         6
         7
         8
         9
        10

10 rows selected.

SQL> -- So far so good, but let's modify data a little...
SQL> update test
  2     set padding = padding || padding
  3   where x <= 5;

5 rows updated.

SQL> select x from test;

         X
----------
         1
         3
         5
         6
         7
         8
         9
        10
         2
         4

10 rows selected.

Kind regards. 

really.

A reader, November 06, 2006 - 5:12 am UTC

SQL> select job, count(*), rownum from emp group by job, rownum;

JOB         COUNT(*)     ROWNUM
--------- ---------- ----------
CLERK              1          1
CLERK              1         11
CLERK              1         12
CLERK              1         14
ANALYST            1          8
ANALYST            1         13
MANAGER            1          4
MANAGER            1          6
MANAGER            1          7
SALESMAN           1          2
SALESMAN           1          3
SALESMAN           1          5
SALESMAN           1         10
PRESIDENT          1          9

14 rows selected.

BUT

SQL> select rownum, "inner".* from (select job, count(*), rownum from emp group by job, rownum) "inner";

    ROWNUM JOB         COUNT(*)     ROWNUM
---------- --------- ---------- ----------
         1 CLERK              1          1
         2 CLERK              1         11
         3 CLERK              1         12
         4 CLERK              1         14
         5 ANALYST            1          8
         6 ANALYST            1         13
         7 MANAGER            1          4
         8 MANAGER            1          6
         9 MANAGER            1          7
        10 SALESMAN           1          2
        11 SALESMAN           1          3
        12 SALESMAN           1          5
        13 SALESMAN           1         10
        14 PRESIDENT          1          9

14 rows selected. 

Tom Kyte
November 06, 2006 - 10:05 am UTC

what is your point?

I just showed a case where rownum on the outer level of a query will not be ordered when you select from it.

That sort of shows that the statement:


<quote>
rows are always ordered by rowum, that's for sure
you don't need an
"order by rownum"
</quote>

Is not true.


I don't care how many times you flip your coin and see heads, you'll never convince me coins always end up heads - the reason:

Once I saw it land tails.

that is all we need, one counter case.

Not realy...

Jurij, November 06, 2006 - 6:22 am UTC

Didn't you say:
"rows are *always* ordered by rowum, that's for sure" ?

If you say ALLWAYS (and add THAT'S FOR SURE on top of that), then there should be no exceptions to this, right? Well, tom has demonstrated you a simple example that proves you are 100% wrong. Accept it, get ower with it.

Why this thread exists

David Weigel, November 06, 2006 - 11:11 am UTC

It seems that Oracle 10.2 introduced "HASH UNIQUE" for some grouping activities where it used to "SORT UNIQUE". A select with a group-by that happened to sort in older Oracle versions stopped sorting because Oracle changed its manner of grouping.

I discovered this en route to running into Oracle bugs 4926618 and 4604970, which caused a nasty performance problem for the first and caused wrong sum()s for the second. Once I altered the system to cure the bugs with _GBY_HASH_AGGREGATION_ENABLED=FALSE, which replaces "HASH UNIQUE" with "SORT UNIQUE", I noticed the side-effect: things which happened to sort in Oracle 9 and earlier, and stopped sorting in Oracle 10.2, started sorting again.

(Naturally, you wouldn't want to set _GBY_HASH_AGGREGATION_ENABLED=FALSE just to get default sorting back -- who knows if it'll have the same lucky effect in future releases.)

Tom Kyte
November 06, 2006 - 11:47 am UTC

no, it always had SORT GROUP BY NOSORT

It always had parallel execution.

It always had partitioning.

It always had LOTS OF WAYS.

If you read the original answer, it points to a google thread that is many many many years old.

dbms_xplan revisited

Sokrates, November 07, 2006 - 4:58 am UTC

Hallo Tom

SQL> explain plan for select * from dual;

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
|   1 |  TABLE ACCESS FULL| DUAL |     1 |     2 |     2   (0)| 00:00:01 |
Plan hash value: 272002086
|   0 | SELECT STATEMENT  |      |     1 |     2 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------

8 rows selected.


Hmm, not so much what I wanted.

I wanted

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 272002086

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |     2 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS FULL| DUAL |     1 |     2 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

How to achieve this ?
Which "order by" to add to my query ?

(To tell you the truth, I cheated and changed the order the
rows came out by hand, but according to you, I was just lucky and it was pure sheer coincidence that they came out the way I wanted them.
I sometimes use this utility dbms_xplan - I know it sometimes doesn't tell me the right plan, but anyway - 
and I want to use it sometimes for demonstration purposes
to other developpers.
But I don't want to be dependent on coincidence, especially
when demonstrating, so what to do ?) 

Tom Kyte
November 07, 2006 - 4:35 pm UTC

I sort of addressed this above, no?
wait - yes.


play with parallel, for example....


for something as trivial and NON-PRODUCTION-ISH and "safe if it totally entirely screws up" as dbms_xplan, I have no problem here.

So Sokrates, would you count on it in a mission critical sort of way?

I'll say it one last time:

If you want to have an expectation, a clear expectation, of the order of rows in a result set, thou shall use order by.



I have an exception...

Michael Friedman, November 26, 2006 - 5:07 am UTC

V$Logminer

The undo and redo SQL are stored in VARCHAR2 columns. If they are more than 4K bytes they get split across multiple rows in v$Logminer.

There is no Row Number column... so we asked Oracle Support how the heck we could get back all the rows in each query in order.

Oracle Support checked with development and then responded that Oracle DB GUARANTEES that these rows will be returned in order in any simple query without Order By, Group By, etc. and that Oracle does not support complex queries on any V$ tables (and they pointed us to chapter and verse in the manual for that last bit).

I still feel very uncomfortable with this approach... but there doesn't seem to be an alternative.

Tom Kyte
November 26, 2006 - 9:34 am UTC

with v$table
as
(select rownum r, v.* from v$table v),
....

select



you'll likely find if you use that approach, many of your v$ queries are more efficient as well - v$ tables are tricky beasts - no read consistency, few if any indexes...

Does a DISTINCT gaurantee data is ordered though?

A reader, December 08, 2006 - 4:01 pm UTC

What if I do a distinct instead of a Group By, would that mean data will be ordered?

Tom Kyte
December 09, 2006 - 2:40 pm UTC

NO

order by means data will be ordered.

anything else - NO, the data is NOT sorted for sure. It might accidentally sometimes appear to be sorted.



without an order by, there is absolutely no way you can make any assumptions
regarding the sorted order of data returned by a query


I quite simply do not know how else to say this..... So, I've copied it from above.




Lock the follow up's to this one

A reader, December 09, 2006 - 6:19 pm UTC

I can't believe how much discussion and follow up there has been on this question. It's quite amusing :)
I'm laughing, you (Tom) must be crying judging by the no. of bold comments. Maybe you should turn off the follow ups to this question.

Tom Kyte
December 09, 2006 - 7:00 pm UTC

I just sit here AMAZED, just totally AMAZED that people continue to ask along the same lines...

yes.



If you want data sorted in some specific order, then you had better well use
ORDER BY or you cannot have ANY REASONABLE expectation of the order of the rows,
period


There is a real issue here though

Mike Friedman, December 09, 2006 - 11:01 pm UTC

Fact is, there is undocumented (and unsupported) database behavior that many systems rely on that changes between versions.

A good example is the Ordering behavior of Group By.

A system could pass intensive detailed unit and system testing on 8i and then die on 10g because of this issue.

It would be nice to have some kind of "Be nasty" flag on init.ora that automatically tried to not provide any unsupported behavior - for example, that deliberately messed up ordering unless there was an explicit "ordery by" for testing purposes. But I don't think it's really practical.

In general, almost any multi-row select(except for batch jobs that don't care about ordering) should have an Order By. It's one of the key things I look for in code reviews.

Tom Kyte
December 10, 2006 - 7:37 pm UTC

there is no undocumented database behavior here WHATSOEVER.

group by NEVER had to sort and in many cases WOULD NOT sort and even if it did, it was a BINARY sort, not a character set sort.

That system worked by flipping total accident and could have STOPPED working by accident at any second, just a statistics gathering causing a plan to change. You were living on the edge.


this is not "unsupported behavior", this is not "undocumented behavior". I have a word for what this is - but won't use it right now.

ALL selects that care about the ordering of rows should have an order by.

no "in general", no "multi-row", no "almost".

Greg Norris, December 11, 2006 - 8:57 am UTC

It's hard to believe this topic is still alive... I blame it on toxic levels of stuponitron emissions in the atmosphere. :p


What about subqueries?

A reader, December 30, 2006 - 3:26 pm UTC

Is an outer query guaranteed to get an ordered result set if a subquery sorts the results but the outer query does not explicitly do that?

Will queries of the following form return deterministic results? Or do i need to include another ORDER BY in the outer query?

SELECT ename FROM (
  SELECT * FROM emp
  ORDER BY ename
);

Tom Kyte
December 30, 2006 - 5:26 pm UTC

in this case, due to the lack of a "set operator" like ROWNUM in the outer level - it is able to technically ignore the inner order by.

But - the really good news - IF WE DON'T NEED TO SORT, WE WON'T....

ops$tkyte%ORA10GR2> set autotrace traceonly explain
ops$tkyte%ORA10GR2> select * from (select ename, empno from scott.emp order by ename) order by ename;

Execution Plan
----------------------------------------------------------
Plan hash value: 150391907

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |    14 |   140 |     4  (25)| 00:00:01 |
|   1 |  SORT ORDER BY     |      |    14 |   140 |     4  (25)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| EMP  |    14 |   140 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------

ops$tkyte%ORA10GR2> select * from (select ename, empno from scott.emp order by ename);

Execution Plan
----------------------------------------------------------
Plan hash value: 150391907

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |    14 |   140 |     4  (25)| 00:00:01 |
|   1 |  SORT ORDER BY     |      |    14 |   140 |     4  (25)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| EMP  |    14 |   140 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------


but you don't need, do you

A reader, December 31, 2006 - 5:55 am UTC

in order to
select * from (select ename, empno from scott.emp order by ename);
,
you don't need to sort, why should you ?
Tom Kyte
December 31, 2006 - 4:01 pm UTC

I don't understand what point you are trying to make.

Order by dbms_random.value

Arindam Mukherjee, January 01, 2007 - 7:28 am UTC

Respected Mr. Tom,

I always use "Order by <column name> or sequence number of column name in Select list". But in so many examples even in your book also, I see "order by dbms_random.value or dbms_random.random". I am sorry I could not understand what objective it achieves. In this thread, there is also one "order by dbms_random.value".

Question 1: Could you kindly tell me - WHEN and WHY I should use "order by dbms_random.value or dbms_random.random"?

Question 2: What is the difference between "order by dbms_random.value or dbms_random.random" AND "no order by clause in Select as opposed to order by dbms_random.value or dbms_random.random"?

Tom Kyte
January 01, 2007 - 9:14 am UTC

I wanted the data sorted randomly for the example apparently.



seems straightforward. the examples you looked at where like regarding clustering factors (index metric). If the data is sorted by the key being indexed - clustering factor is low, if the data is not (eg: it is sorted randomly), the clustering factor would be high.

no sorting required

Sokrates, January 02, 2007 - 1:58 am UTC

"select * from (select ename, empno from scott.emp order by ename); "

doesn't need a sort because

"in this case, due to the lack of a "set operator" like ROWNUM in the outer level - it is able to technically ignore the inner order by. "

however, a sort seems to occur.

Could you please clarify ?
Have you an example when Oracle ignores the inner order by ?
Tom Kyte
January 02, 2007 - 8:04 am UTC

it technically does not need a sort and some day into the future, it might not do one.

thx

Sokrates, January 02, 2007 - 8:11 am UTC


and the bar keeps moving...

Connor, January 02, 2007 - 8:32 am UTC

Its amazing where Oracle finds ways to avoid sorting, eg, when I was timing sorting (and didn't want to fetch the results) I typically used:

SQL> select count(*)
2 from
3 ( select /*+ NO_MERGE */ c
4 from MY_BIG_TABLE
5 order by c ) ;

Problem is - in 10.2, the optimizer looked and said "hey, thats a count(*) - the sort won't be needed even though its a NO_MERGE", so I had to resort to:

SQL> select count(*)
2 from
3 ( select /*+ NO_MERGE */ c, rownum
4 from MY_BIG_TABLE
5 order by c ) ;

to test my sorting...

Bottom Line - relying on anything to sort when its not explicitly asked for is a a recipe for disaster

Surprising how often this comes up...

Mark J. Bobak, January 02, 2007 - 9:14 am UTC

I'm really amazed how often this one comes up, and how frequently it has to be quashed, over and over again...

I think part of the problem is that lots of folks were doing group by, without order by, and getting lucky for many, many years. Then, they upgraded to 10g, and 10g said "Hey, I don't have to do a sort operation for group by any more, I can do a hash instead!" And suddenly people start getting different output, and the sky is falling, and the rest is history....;-)

Eliminating ORDER BY

Jonathan Lewis, January 02, 2007 - 7:23 pm UTC

Connor,
If you want to avoid wasting resources on the redundant rownum, 10g a hint to block the elimination of the "order by". Change your current hint set to read:
/*+
no_merge
qb_name(ordering)
no_eliminate_oby(@ordering)
*/
This names the query block (good practice for 10g) and then blocks "order by elimination".

There are cases where you actually NEED to do this to stop oracle from sub-optimising your queries. There's an example of this on my blog at:
http://jonathanlewis.wordpress.com/2006/11/06/filter-subqueries/


Come on... no system is fully specified

Michael Friedman, January 02, 2007 - 10:31 pm UTC

Tom,

I'm not sure why you denied above that Oracle has undocumented behavior.

No non-trivial system can be fully specified - it's not commerically or practically possible.

That means every system will have undocumented behaviors.

And that also means that when other systems are integrated with such systems they will sometimes deliberately or inadvertantly rely on such behaviors.

Fact of life.

So, for example, the order of rows returned from a query without an Order By on columns with unique values is not specified. But depending on your version and the characteristics of your data it will come back in some order.

And sometimes systems will get built that rely on that ordering, that pass rigorous testing, and then break totally when you do a database upgrade that is theoretically transparent to an application.

This is an issue for anyone developing systems. Facto of life.

The key is to understand this and to know what kinds of things to look for when a DB upgrade breaks your system instead of throwing your hands in the air and complaining that the new version of the DB is a buggy piece of crap since the upgrade should have been transparent.
Tom Kyte
January 05, 2007 - 7:49 am UTC

where did I deny that? I've said hundreds if not thousands of times - "sorry, that is not documented, not going to talk about it"


It is documented that IF YOU WANT DATA SORTED IN SOME FASHION THEN YOU WILL USE ORDER BY

I'm not getting your point, if your query doesn't have an order by on it at the outer most level, YOU CANNOT ASSUME, PRESUME, WHATEVER - that the data is sorted by anything.


Documented vs. undocumented vs. side effects

A reader, January 03, 2007 - 10:58 am UTC

Documented: the official "public" documentation states how the system behaves.
Undocumented: the official "public" documentation does not state anything, but some internal, non public documentation may exist
Side effect: the system behaves that way because of another documented or undocumented behaviour. It just happens.

I guess Oracle internal documentation describes how a GROUP BY is implemented :) but if the *current* implementation as a side effect leads to some sort of ordering may be not documented at all, it may be not important at all, and may change whenever implementation is changed, even slightly.

It is NOT a behavior, documented or not !

Nico, January 04, 2007 - 12:35 pm UTC

This is to answer to Michael Friedman :
You said : "That means every system will have undocumented behaviors."

The error in your demonstration is to consider the "occasionnaly sorted result set of a group by clause query" as a behavior !!!

According to you, one could say : "the (undocumented) behavior of a coin is to always come up heads"


Nevertheless ...

Jeff Seeman, January 22, 2007 - 6:24 pm UTC

In an Oracle 10.2.0.2 database I populated a table with one million randomly-generated integers. I then created a second table based on a select distinct of the first. The second table was perfectly sorted. I then created a third table based on a group by of the numeric column in the first table. This third table too was perfectly sorted. Why am I getting "impossible" results if the rule that distinct and group by does not guarantee any particular order is absolute?

To: Jeff Seeman

Michel Cadot, January 24, 2007 - 7:11 am UTC

"Does not guarantee" does mean neither "forbids" nor "impossible result". It means it can be in any order including in sorting order.
Just as simple as this.

Michel

To Jeff

A reader, March 05, 2007 - 3:10 pm UTC

Think of it this way. Oracle is NOT contractually required to return it in a sorted order. It MIGHT, but the database has NO obligation to do so and has NO obligation to use the same behavior in the future. (or in a different execution in the same database) Just because something happens N number of times does not mean that it will always keep happening.

Just put the order by clause on.


optimizer_features_enable

steve, September 26, 2007 - 11:25 am UTC

Hello Tom,

We are experiencing this issue (group by queries no longer
sorted after upgrading to 10.2). A trouble ticket has been
raised with the design group but the fix will take a few weeks.

In the meantime (as a temporary remedy), do you see any
major problem with setting optimizer_features_enable to
10.1.0.5? It seems to correct the issue (again, just as
a temporary solution).

Tom Kyte
September 26, 2007 - 10:04 pm UTC

GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER


why wasn't this found in test????

if you find setting that parameter appears to sometimes fix your problem, go for it.

there is only one fix. only one.

there is an underscore parameter you can ask support about as well to disable hash group by

order by in analytic functions

Anurag, May 20, 2008 - 2:51 am UTC

Hi Tom,

Can an order by in an analytic function do the work of an order by clause in the query?

Consider the following examples:

SQL> drop table t;

Table dropped

SQL> create table t
  2  (i varchar2(10),
  3   j varchar2(10),
  4   k number);

Table created

SQL> insert into t values ('A', 'a', 1);

1 row inserted

SQL> insert into t values ('A', 'b', 1);

1 row inserted

SQL> insert into t values ('B', 'a', 1);

1 row inserted

SQL> insert into t values ('B', 'b', 1);

1 row inserted

SQL> insert into t values ('C', 'a', 1);

1 row inserted

SQL> insert into t values ('C', 'a', 1);

1 row inserted

SQL> commit;

Commit complete


Query 1:
-----------
SQL> select i,
  2         j,
  3         row_number() over(order by i asc, j asc) r1
  4  from   t;

I          J                  R1
---------- ---------- ----------
A          a                   1
A          b                   2
B          a                   3
B          b                   4
C          a                   5
C          a                   6

6 rows selected


Query 2:
-----------
SQL> select i,
  2         j,
  3         row_number() over(order by i asc, j asc) r1
  4  from   t
  5  order  by i, j;

I          J                  R1
---------- ---------- ----------
A          a                   1
A          b                   2
B          a                   3
B          b                   4
C          a                   5
C          a                   6

6 rows selected


Query 3:
-----------
SQL> select i,
  2         j,
  3         row_number() over(order by i asc, j asc) r1,
  4         row_number() over(order by j asc, i asc) r2
  5  from   t;

I          J                  R1         R2
---------- ---------- ---------- ----------
A          a                   1          1
B          a                   3          2
C          a                   6          3
C          a                   5          4
A          b                   2          5
B          b                   4          6

6 rows selected


Query 4:
-----------
SQL> select i,
  2         j,
  3         row_number() over(order by i asc, j asc) r1,
  4         row_number() over(order by j asc, i asc) r2
  5  from   t
  6  order  by i, j;

I          J                  R1         R2
---------- ---------- ---------- ----------
A          a                   1          1
A          b                   2          5
B          a                   3          2
B          b                   4          6
C          a                   5          4
C          a                   6          3

6 rows selected



Logic (please correct me if I am wrong) would tell me that the Query 1 is different from Query 2. 
The order by in the analytic function is used to generate the row_number while the order by clause in the query sorts the records fetched by the query.
Hence there should be no expectation of getting the data sorted based on i and j if I were to use Query 1.

But the results of the queries 3 and 4 show that the results are sorted by the last analytic function called. 
I understand that this is because the analytic functions are called after the where clause but before the order by clause; the optimizer sees that a sorting has already been performed and does not perform another sort.

Should we have any reason to believe that the otimizer will continue to work this way everytime?


Tom Kyte
May 20, 2008 - 11:14 am UTC

...
Can an order by in an analytic function do the work of an order by clause in
the query?

....

Yes - it CAN, but NO you cannot rely on it to do so.
Yes - it CAN, but NO you cannot rely on it to do so.
Yes - it CAN, but NO you cannot rely on it to do so.
Yes - it CAN, but NO you cannot rely on it to do so.
Yes - it CAN, but NO you cannot rely on it to do so.
Yes - it CAN, but NO you cannot rely on it to do so.
Yes - it CAN, but NO you cannot rely on it to do so.
Yes - it CAN, but NO you cannot rely on it to do so.
Yes - it CAN, but NO you cannot rely on it to do so.
Yes - it CAN, but NO you cannot rely on it to do so.
Yes - it CAN, but NO you cannot rely on it to do so.


...
But the results of the queries 3 and 4 show that the results are sorted by the
last analytic function called.
.....

and the last time I flipped a coin, it landed up heads - so therefore, all coins will land heads up right?


IF YOU NEED DATA SORTED THERE IS ONE AND ONLY ONE WAY TO GET THAT.

use order by.


IF the optimizer can skip the actual sort, because some prior step did it, then it will (it won't sort it twice if it already knows it is sorted). BUT - unless and until you add an order by, you cannot be sure the data will be returned sorted.


Empirical observation does not imply "this is assured to be true always"


USE ORDER BY IF YOU NEED IT SORTED.


please - just read the original answer to this question above. I thought I was absolutely unambiguous. Since I must not have been - proven by the fact you are asking this - maybe you can help me out....


What would you have needed to see on this page to understand that without and ORDER BY - the result set is not guaranteed to be sorted in any way shape or form?

Yet another Question

bc, May 20, 2008 - 12:26 pm UTC

I fully understand and always adhere to the fact that If you need data sorted, then use ORDER BY.

However, I am trying to understand why prior to 10g the resultset was was always sorted when grouped by.

APPS@dev  > select * from v$version;

BANNER                                                            
----------------------------------------------------------------  
Oracle9i Enterprise Edition Release 9.2.0.6.0 - 64bit Production  
PL/SQL Release 9.2.0.6.0 - Production                             
CORE 9.2.0.6.0 Production                                        
TNS for Solaris: Version 9.2.0.6.0 - Production                   
NLSRTL Version 9.2.0.6.0 - Production                             

5 rows selected.

APPS@dev  > select   extract(month from created) || ' - ' || to_char(created, 'Month') month_created,
  2           count(*)                                                          object_count
  3  from     all_objects
  4  group by extract(month from created) || ' - ' || to_char(created, 'Month')
  5  /

MONTH_CREATED                  OBJECT_COUNT   
------------------------------ ------------   
1 - January                             114   
10 - October                            167   
11 - November                           233   
12 - December                          1911   
2 - February                            164   
3 - March                                89   
4 - April                               120   
5 - May                               28719   
6 - June                             162833   
7 - July                                185   
8 - August                               96   
9 - September                            78   

12 rows selected.

APPS@prod > select * from v$version;

BANNER                                                             
----------------------------------------------------------------   
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - 64bi   
PL/SQL Release 10.2.0.3.0 - Production                             
CORE 10.2.0.3.0 Production                                         
TNS for Solaris: Version 10.2.0.3.0 - Production                   
NLSRTL Version 10.2.0.3.0 - Production                             

5 rows selected.

APPS@prod > select   extract(month from created) || ' - ' || to_char(created, 'Month') month_created,
  2           count(*)                                                          object_count
  3  from     all_objects
  4  group by extract(month from created) || ' - ' || to_char(created, 'Month')
  5  /

MONTH_CREATED                  OBJECT_COUNT   
------------------------------ ------------   
5 - May                               94395   
4 - April                               161   
12 - December                          1927   
6 - June                             156811   
3 - March                                45   
2 - February                            160   
11 - November                           257   
7 - July                                203   
9 - September                            70   
10 - October                            105   
1 - January                             126   
8 - August                               93   

12 rows selected.


Tom Kyte
May 20, 2008 - 3:36 pm UTC

...
However, I am trying to understand why prior to 10g the resultset was was
always sorted when grouped by.
...


IT WASN'T, did you read any of the links in the original question - they date back well before 10g was even dreamt of and demonstrate the simple fact that:


group by does not, did not have to sort - ever - in any release of any database - ever.

those links demonstrate CLEARLY that forever - group by was not the same as saying order by.


Please stop flipping a coin and saying "it will always be heads", logic, proof - they do not work that way.


you can show as many queries as you like where you say "the data is sorted", as soon as I show you ONE QUERY that is isn't, I have proven wrong the premise that the data is always returned sorted!!!!!!!!! It is called proof by contradiction.

Data was NEVER always returned sort - never never never

Unless and until there is an order by - the data is not sorted and you can not have any expectations otherwise


(10g introduced a new group by approach, hash group by - but the data was never returned ordered by anything on purpose, ever)

Awesome as always ....

bc, May 20, 2008 - 6:01 pm UTC

This is exactly what I was looking for Tom, My current client just migrated from 9i to 10g ( finally ) and there have been a boatload of problems because their developers assumed that oracle would or already had implemented some kind of crystal ball mechanism that would automagically figure out their design / intent and provide a sorted output without using the order by clause.

They also assumed that relational is the synomym for sequential and "when others then commit" is perfectly normal ... :)

Thanks

BC


Tom Kyte
May 20, 2008 - 9:49 pm UTC

... "when others then commit" ...


ouch, that hurts, a lot.

¿bug with ROLLUP, COUNT(DISTINCT ...) and ORDER BY?

MutenRo, June 09, 2008 - 6:47 pm UTC

Hi,

this query is returning the rows in a WRONG order:

SELECT A, B, COUNT(DISTINCT C) AS C
FROM (
SELECT 1 AS A, 3 AS B, 2 AS C FROM DUAL
UNION ALL
SELECT 1 AS A, 3 AS B, 3 AS C FROM DUAL
UNION ALL
SELECT 2 AS A, 1 AS B, 2 AS C FROM DUAL
UNION ALL
SELECT 2 AS A, 2 AS B, 3 AS C FROM DUAL
)
GROUP BY ROLLUP(A), B
ORDER BY B

returns:

A B C
2 1 1
2 2 1
1 3 2
null 1 1
null 2 1
null 3 2

It seems to be a bug because if we use COUNT(C) or SUM(C) instead of COUNT(DISTINCT C) the we get a correct result.

Is it a bug?

Thanks

Tom Kyte
June 09, 2008 - 9:49 pm UTC

I reproduced in 10.2.0.2 (I have to get .4)
Does not reproduce in 11.1.0.6

what release are you on?

¿bug with ROLLUP and ORDER BY?

MutenRo, June 10, 2008 - 4:04 am UTC

I'm using 10.2.0.1.0, I didn't try with 11g, so I guess the fixed it.

Thanks.
Tom Kyte
June 10, 2008 - 7:53 am UTC

oh, I should have given you the workaround


with data
as
(select /*+ materialize */ .... group by rollup(...) )
select * from data order by b;

data never ever sorted without order by?

Pasko, June 10, 2008 - 3:48 pm UTC

Hi Tom,

only related slightly to this thread ;)

i read from some OCP Books that data in a Sorted Hash Cluster is always ordered even without an order-by, in fact the order-by is even ignored in this case.

is that true?

Regards,

Pasko
Tom Kyte
June 10, 2008 - 8:18 pm UTC

it is NOT true.

the order by is never ignored.

It is KNOWN by the optimizer that "we can retrieve the data sorted without having to sort it again, since it was placed there 'in order'"

The OPTIMIZER has been doing that for longer than I've been using Oracle (skipping actually sorting data when it understands "data is already sorted")

Eg:

select * from t where indexed_column between 1 and 2 order by indexed_column;

you might well find the "order by is ignored (NOT REALLY)", the order by will probably be satisfied by accessing the data in order via the index. The order by is NOT IGNORED, it is optimized away - because we know the data is retrieved in a manner that satisfies the sort (big big BIG difference, not ignored, never IGNORED)


tell me, what happens if you join data in a sorted hash cluster with other data? Will it be sorted? If so, by what?

Ok, wait, I'll just use a single table.

ops$tkyte%ORA10GR2> /*
ops$tkyte%ORA10GR2> drop table audit_trail_shc;
ops$tkyte%ORA10GR2> drop cluster shc;
ops$tkyte%ORA10GR2> set echo on
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> CREATE CLUSTER shc
ops$tkyte%ORA10GR2> (
ops$tkyte%ORA10GR2>    ip_address    NUMBER,
ops$tkyte%ORA10GR2>    request_time  timestamp SORT
ops$tkyte%ORA10GR2> )
ops$tkyte%ORA10GR2> single table
ops$tkyte%ORA10GR2> HASHKEYS 1000
ops$tkyte%ORA10GR2> HASH IS ip_address
ops$tkyte%ORA10GR2> SIZE  20480
ops$tkyte%ORA10GR2> /
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create table audit_trail_shc
ops$tkyte%ORA10GR2> ( ip_address number,
ops$tkyte%ORA10GR2>   request_time timestamp sort,
ops$tkyte%ORA10GR2>   bytes number,
ops$tkyte%ORA10GR2>   url   varchar2(255)
ops$tkyte%ORA10GR2> )
ops$tkyte%ORA10GR2> cluster shc (ip_address, request_time);
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> insert into audit_trail_shc (ip_address, request_time, bytes, url )
ops$tkyte%ORA10GR2> select 123456780001 + mod(rownum,255),
ops$tkyte%ORA10GR2>        sysdate-100+rownum/24/60,
ops$tkyte%ORA10GR2>        object_id,
ops$tkyte%ORA10GR2>        rpad( '*', 80+mod(rownum,150), '*' ) url
ops$tkyte%ORA10GR2>   from all_objects;
ops$tkyte%ORA10GR2> */
ops$tkyte%ORA10GR2> select to_char( ip_address,'999999999999999999999' ) ip, request_time, bytes from audit_trail_shc where bytes between 100 and 110;

IP                     REQUEST_TIME                        BYTES
---------------------- ------------------------------ ----------
          123456780106 02-MAR-08 09.54.49.000000 PM          106
          123456780107 02-MAR-08 09.55.49.000000 PM          107
          123456780100 02-MAR-08 09.48.49.000000 PM          100
          123456780108 02-MAR-08 09.56.49.000000 PM          108
          123456780101 02-MAR-08 09.49.49.000000 PM          101
          123456780109 02-MAR-08 09.57.49.000000 PM          109
          123456780102 02-MAR-08 09.50.49.000000 PM          102
          123456780110 02-MAR-08 09.58.49.000000 PM          110
          123456780103 02-MAR-08 09.51.49.000000 PM          103
          123456780104 02-MAR-08 09.52.49.000000 PM          104
          123456780105 02-MAR-08 09.53.49.000000 PM          105

11 rows selected.


does that appear to be sorted by *anything* ??

easier workaround for the Bug MutenRo from Spain hit

M. Rogel, June 11, 2008 - 6:17 am UTC

SELECT A, B, COUNT(DISTINCT C) AS C
FROM (
SELECT 1 AS A, 3 AS B, 2 AS C FROM DUAL
UNION ALL
SELECT 1 AS A, 3 AS B, 3 AS C FROM DUAL
UNION ALL
SELECT 2 AS A, 1 AS B, 2 AS C FROM DUAL
UNION ALL
SELECT 2 AS A, 2 AS B, 3 AS C FROM DUAL
)
GROUP BY ROLLUP(A), B
ORDER BY b+0

Perhaps update the documentation set

Steve, June 11, 2008 - 10:31 am UTC

Tom,
Perhaps you should suggest to the Documentation Group that they add an additional volume to the forthcoming 11g documentation set. It would be titled _Query Results Ordering Concepts_ and after the title page, TOC, etc would consist of one page with one paragraph: "Query results have no order. None. Query results have no implied, implicit, or natural order. The only method for obtained ordered results is to use the ORDER BY clause." And maybe an Appendix with a copy of your "Order in The Court" essay.

Based on what I have experienced recently, sitting here in 2008 (2008 for gosh sakes), I can't see any other way to kill this dangerous assumption.

sPh

sorted hash cluster

A reader, June 11, 2008 - 1:57 pm UTC

Hi Tom,

thanks for your follow-up.
You are right that the order by clause in a sorted hash cluster is NOT ignored.I have tested with your sample data.

Then i think this could be a New Feature Doc Bug:
see: http://download-uk.oracle.com/docs/cd/B19306_01/server.102/b14231/hash.htm#ADMIN01901

According to the Docs , we are supposed to get FIFO output from a sorted hash cluster, but it is NOT so!

Example:
insert into audit_trail_shc (ip_address, request_time, bytes, url )
values ( 11 ,  to_date(`010120081000¿,¿DDMMYYYYHH24MI¿), 111,  `url1¿)
/
insert into audit_trail_shc (ip_address, request_time, bytes, url )
values ( 3 ,  to_date(`010120081100¿,¿DDMMYYYYHH24MI¿), 33,  `url1¿)
/
insert into audit_trail_shc (ip_address, request_time, bytes, url )
values ( 2 ,  to_date(`010120080700¿,¿DDMMYYYYHH24MI¿), 22,  `url1¿)
/
insert into audit_trail_shc (ip_address, request_time, bytes, url )
values ( 1 ,  to_date(`010120081200¿,¿DDMMYYYYHH24MI¿), 11,  `url1¿)
/

SQL> select to_char( ip_address,'999999999999999999999' ) ip, request_time, bytes
  2  from audit_trail_shc
  3  where bytes between 10 and 120;

IP                     REQUEST_TIME                   BYTES
---------------------- ------------------------------ -----
                    11 01.01.08 10:00:00,000000         111
                     3 01.01.08 11:00:00,000000          33
                     2 01.01.08 07:00:00,000000          22
                     1 01.01.08 12:00:00,000000          11

If i change the order of inserts, then i do NOT get the expected result ( FIFO)

SQL> delete audit_trail_shc;

4 Zeilen wurden gel÷scht.

SQL> commit
  2  /

Transaktion mit COMMIT abgeschlossen.

SQL> insert into audit_trail_shc (ip_address, request_time, bytes, url )
  2  values ( 1 ,  to_date('010120081200','DDMMYYYYHH24MI'), 11,  'url1')
  3  /

1 Zeile wurde erstellt.

SQL> insert into audit_trail_shc (ip_address, request_time, bytes, url )
  2  values ( 2 ,  to_date('010120080700','DDMMYYYYHH24MI'), 22,  'url1')
  3  /

1 Zeile wurde erstellt.

SQL> insert into audit_trail_shc (ip_address, request_time, bytes, url )
  2  values ( 3 ,  to_date('010120081100','DDMMYYYYHH24MI'), 33,  'url1')
  3  /

1 Zeile wurde erstellt.

SQL> insert into audit_trail_shc (ip_address, request_time, bytes, url )
  2  values ( 11 ,  to_date('010120081000','DDMMYYYYHH24MI'), 111,  'url1')
  3  /

1 Zeile wurde erstellt.

SQL>
SQL> select to_char( ip_address,'999999999999999999999' ) ip, request_time, bytes
  2  from audit_trail_shc
  3  where bytes between 10 and 120;

IP                     REQUEST_TIME                   BYTES
---------------------- ------------------------------ -----
                    11 01.01.08 10:00:00,000000         111
                     3 01.01.08 11:00:00,000000          33
                     2 01.01.08 07:00:00,000000          22
                     1 01.01.08 12:00:00,000000          11


Tom Kyte
June 11, 2008 - 5:54 pm UTC

without an order by, there can be no expectation of order.

I concur that the statement:
Given the sort order of the data, the following query would return the call records for a specified hash key by oldest record first.

SELECT * WHERE telephone_number = 6505551212; 


is misleading - it would almost certainly by accident do that, but without the order by - it doesn't have to.

In their example, they query by the hash key - if you do that, you will probably accidentally get the data in sorted order by the sort key.

However, if you

Given the sort order of the data, the following query would return the call records for a specified hash key by oldest record first.

SELECT * FROM t1 left join t2 on (...) WHERE t1.telephone_number = 6505551212;

who knows.

there should have been an order by there.

Just make it break

Loz, June 11, 2008 - 7:37 pm UTC

Hey Tom,
Why don't you save yourself some blood pressure problems and suggest to the dev team they add a startup flag on Oracle to deliberately randomise output order when there is no explicit ORDER BY. I know I'd use it (if I used Oracle any more).

Tom Kyte
June 11, 2008 - 9:29 pm UTC

;)

I've always wanted the database to have this feature.

there would be a parameter, it's name would be:

FAST

it would default to false. when set to false, there would be lots of:

if ( !fast )
then
for i in 1 .. 1000 loop x = ln(i); end loop;
end if;

loops in the kernel.

then we could set

FAST=true

how cool would that be... similar concept.

lol

A reader, June 12, 2008 - 3:10 am UTC

about this FAST=TRUE !

Sub query - revisited

KK -$, February 18, 2009 - 5:44 am UTC

Thanks for an excellent article like this!

In response to one of the query copied below, you said:But - the really good news - IF WE DON'T NEED TO SORT, WE WON'T....

So, does the below query returns records in ascending order of ename?

SELECT ename FROM (
SELECT * FROM emp
ORDER BY ename
);


What about subqueries?   December 30, 2006 - 3pm US/Eastern Bookmark | Bottom | TopReviewer: A reader 
Is an outer query guaranteed to get an ordered result set if a subquery sorts the results but the outer query does not explicitly do that? 

Will queries of the following form return deterministic results? Or do i need to include another ORDER BY in the outer query? 


SELECT ename FROM (
  SELECT * FROM emp
  ORDER BY ename
);




Followup   December 30, 2006 - 5pm US/Eastern:

in this case, due to the lack of a "set operator" like ROWNUM in the outer level - it is able to technically ignore the inner order by. 

But - the really good news - IF WE DON'T NEED TO SORT, WE WON'T....


Tom Kyte
February 18, 2009 - 3:35 pm UTC

... So, does the below query returns records in ascending order of ename? ...

that was already answered in the bit you cut and pasted....

in this case, due to the lack of a "set operator" like ROWNUM in the outer
level - it is able to technically ignore the inner order by.



there is no reason it would have to sort, we can look and see that there is nothing (like rownum) that would FORCE us to sort and since there is no order by on the outer bit of the query, we are free to skip it.

it could return the data unsorted if it felt like it, there is no reason we would HAVE to sort this particular set of data.

This shows the order by ename is "skipable" if we felt like it

ops$tkyte%ORA11GR1> create table emp as select * from scott.emp;

Table created.

ops$tkyte%ORA11GR1> exec dbms_stats.gather_table_stats( user, 'EMP' );

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR1>
ops$tkyte%ORA11GR1> @at
ops$tkyte%ORA11GR1> column PLAN_TABLE_OUTPUT format a72 truncate
ops$tkyte%ORA11GR1> set autotrace traceonly explain
ops$tkyte%ORA11GR1> select ename from (select * from emp order by ename);

Execution Plan
----------------------------------------------------------
Plan hash value: 150391907

------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time
------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |    14 |    84 |     4  (25)| 00:00:0
|   1 |  SORT ORDER BY     |      |    14 |    84 |     4  (25)| 00:00:0
|   2 |   TABLE ACCESS FULL| EMP  |    14 |    84 |     3   (0)| 00:00:0
------------------------------------------------------------------------

<b>here is didn't drop the order by, it could have, but it did not, I would not necessarily rely on that, the following shows the order by can be skipped:</b>


ops$tkyte%ORA11GR1> select ename from (select * from emp order by ename) order by empno;

Execution Plan
----------------------------------------------------------
Plan hash value: 150391907

------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time
------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |    14 |   140 |     4  (25)| 00:00:0
|   1 |  SORT ORDER BY     |      |    14 |   140 |     4  (25)| 00:00:0
|   2 |   TABLE ACCESS FULL| EMP  |    14 |   140 |     3   (0)| 00:00:0
------------------------------------------------------------------------

<b>there is only one sort - not a sort for ENAME and then a sort for EMPNO, just one sort - for empno, the sort of ename can be skipped - unless of course we do something that makes it "not skippable" </b>


ops$tkyte%ORA11GR1> select ename, <b>rownum</b> from (select * from emp order by ename) order by empno;

Execution Plan
----------------------------------------------------------
Plan hash value: 1334697481

------------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes | Cost (%CPU)| Time
------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |    14 |   280 |     5  (40)| 00:0
|   1 |  SORT ORDER BY        |      |    14 |   280 |     5  (40)| 00:0
|   2 |   COUNT               |      |       |       |            |
|   3 |    VIEW               |      |    14 |   280 |     4  (25)| 00:0
|   4 |     SORT ORDER BY     |      |    14 |   518 |     4  (25)| 00:0
|   5 |      TABLE ACCESS FULL| EMP  |    14 |   518 |     3   (0)| 00:0
------------------------------------------------------------------------

ops$tkyte%ORA11GR1> set autotrace off

<b>rownum is assured (top-n queries rely on this, it has to happen) to be assigned after the sort of ename, before the sort by empno.  So we sort by ename, assign rownum, then sort by empno</b>




Clever oracle...

Matt, February 24, 2009 - 11:18 am UTC

Just add my 2 cents. We agree that ORDER BY is mandatory, and if your results are sorted just because you have an index, there is no guaranty that it will be always be the same. I just found funy that oracle knows that it doe not need to perform that sort. Here is a query performed on my database. I have an index on N_SUBMISSION_DATE ASC, N_RECORD_STATE, N_UNEXECUTED and perform the following query. As you will notice, statistics states that it didn't execute a sort despite the order by clause :

SQL> SELECT C_INVOCATION_ID, N_SUBMISSION_DATE, C_REQUESTED_SERVICE, C_INVOCATION_STATE, C_TRANSACTION_ID FROM BS_INVOCATION WHERE N_SUBMISSION_DATE>0 AND N_RECORD_STATE = 1 AND N_UNEXECUTED = 0 ORDER BY N_SUBMISSION_DATE;
 
26 rows selected.
 

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=RULE (Cost=3 Card=1 Bytes=176)
   1    0   TABLE ACCESS (BY GLOBAL INDEX ROWID) OF 'BS_INVOCATION' (C
          ost=3 Card=1 Bytes=176)
 
   2    1     INDEX (RANGE SCAN) OF 'FK_BS_INV_RECORD_STATE' (NON-UNIQ
          UE) (Cost=2 Card=346)
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          9  consistent gets
          0  physical reads
          0  redo size
       1623  bytes sent via SQL*Net to client
        514  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         26  rows processed 

Hierarchical query

A reader, May 14, 2009 - 12:25 pm UTC

It is a new twist of the old story.

In our application we have a query:

select * from
( select Element_eid, Level as lvl
from P
start with Ancestor_eid=:1 and SortOrder=:2
connect by Ancestor_eid=prior Element_eid and SortOrder=2
) t, P1 pr, W1 wo
where pr.Project_eid (+) = t.Element_eid and
wo.Work_eid (+) = t.Element_eid ;

Project_eid is a primary key of P1 and Work_eid is a primary key of W1.

It is a simple hierarchical query with 2 joins and of course result set will be in "hierarchical" order.
What else could it be? And it was sorted properly for several years!
But yesterday CBO changed its mind and decided that hash join is better than nested loops and now result set is not sorted at all. Of course it is very interesting why CBO changed its mind, but right now I am trying to figure out how to sort it in "hierarchical" order.

Lev

Tom Kyte
May 14, 2009 - 12:48 pm UTC

it was purely an accident that it was sorted before.

    select *  from
        ( select Element_eid, Level as lvl <b>, rownum OC</b>
          from P
          start with Ancestor_eid=:1 and SortOrder=:2
          connect by Ancestor_eid=prior Element_eid and SortOrder=2
        ) t, P1 pr, W1 wo
      where pr.Project_eid (+) = t.Element_eid and
            wo.Work_eid   (+) = t.Element_eid<b>
      order by OC ;</b>

A reader, May 14, 2009 - 2:25 pm UTC

Thanks Tom,

It works.

Lev

The same old story again

A reader, August 13, 2009 - 5:56 pm UTC

That's what I found in our application:
-- I had to include this index hint to get records returned based on time.
-- Using primary key for the index hint doesn't work.
-- Explicitly order by clause doesn't work either
-- This is done to get the next pending job
 select /*+hint INDEX_ASC ( t t_idx ) */
 ....
 from  jobs T 
 where  AssignedTo = 'NONE' 
  and RowNum < 2;

I think I can guess why order by clause didn't work.

Somehow it is working.
I just don't know - should I fix it now or wait until it's broken?

Thanks,
Lev



Tom Kyte
August 24, 2009 - 7:16 am UTC

it is just waiting to break on you.

IT MUST HAVE ORDER BY else the records do not need to be ordered.


What if it decided to use the index to access table T but did so using an index fast full scan (read index "not sorted")

What if the index is later renamed to something more reasonable?

Unless and until there is an order by - this query is 100% wrong (according to the comments)
select ...
  from (select ..... from jobs where assignedto='NONE' order by DATE_COLUMN)
 where rownum < 2;


Helena Marková, September 09, 2009 - 3:01 am UTC


ORDER BY MONTH #, DISPLAY MONTH NAME

Abhijit, September 29, 2009 - 9:03 am UTC

Hi Tom,
Thanks for the explanation. This is what I have been trying to do. I have been trying to find out the number of objects created in a particular month.

This is my query.
select 
owner, count(object_name), to_char(created,'MON') from dba_objects
where OWNER = 'SPACE'
AND to_char(created,'YYYY') = '2009'
GROUP by owner, to_char(created,'MON')
ORDER By to_char(created,'MON')


This orders the list in this way.
APR
AUG
FEB
JAN
MAR
SEP

If I want to change the order to
JAN
FEB
.

I have to change the above query to
select 
owner, count(object_name), to_char(created,'MON'), to_char(created,'MM') from dba_objects
where OWNER = 'SPACE'
AND to_char(created,'YYYY') = '2009'
GROUP by owner, to_char(created,'MON'), to_char(created,'MM')
ORDER By to_char(created,'MM')


For now, my objective is being satisfied. This is giving me an extra column in a report.

I am curious to know if I can avoid outputting the
month # column and still get the list in the order I need.
Tom Kyte
October 07, 2009 - 6:50 am UTC

  1  select owner, count(object_name), trunc(created,'mm')
  2    from dba_objects
  3   where owner = 'SCOTT'
  4     and trunc(created,'y') = to_date( '01-01-2009', 'dd-mm-yyyy' )
  5   group by owner, trunc(created,'mm')
  6*  order by owner, trunc(created,'mm')
ops$tkyte%ORA11GR2> /

OWNER                          COUNT(OBJECT_NAME) TRUNC(CRE
------------------------------ ------------------ ---------
SCOTT                                           6 01-AUG-09
SCOTT                                           1 01-OCT-09

2 rows selected.


leave dates as dates

A friend of mine at work solved this one.

Abhijit, September 30, 2009 - 9:36 am UTC

<code>
select 
owner, count(object_name), to_char(created,'MON') from dba_objects
where OWNER = 'SPACE'
AND to_char(created,'YYYY') = '2009'
GROUP by owner, to_char(created,'MON'), to_char(created,'MM')
ORDER By to_char(created,'MM')
</code>
Tom Kyte
October 07, 2009 - 8:39 am UTC

they did it wrong.



Why do you hate my grades Tom?

A reader, December 03, 2015 - 7:57 pm UTC

I was really hoping that group by was also an order by....thank Tom....and thanks obama