really ?
Sokrates, October 15, 2006 - 6:17 am UTC
"without an order by, there is absolutely no way you can make any assumptions
regarding the sorted order of data returned by a query"
really ?
no exceptions to this rule ?
what about querying a pipelined table function, as first counterexample ?
for example
select * from table(dbms_xplan.display)
always gave me a pretty well-sorted-looking resultset
*EVERY TIME* I used it
(some 1000s of times so far)
without any order by !
October 15, 2006 - 6:43 am UTC
I flipped a coin......
Anyway, what happens when parallel kicks in? for example.
Yeah, a simple query against a pipelined function will probably return the data in the order the pipelined function "piped" it. so?
Looking at that query, what order will the rows be in? what happens when I rewrite dbms_xplan.display?
IF YOU WANT DATA TO BE SORTED IN SOME ORDER, well, you know the rest.
What about ROLLUP?
Robert Massey, October 15, 2006 - 12:03 pm UTC
I've noticed that ROLLUP seems to always return sorted results, but is that just an accident of the GROUP BY?
If ROLLUP isn't reliable for sorting, a combination of GROUPING functions in the ORDER BY clause would probably be necessary to get the desired sort.
October 15, 2006 - 1:11 pm UTC
ops$tkyte%ORA10GR2> create table emp as select * from scott.emp;
Table created.
ops$tkyte%ORA10GR2> exec dbms_stats.set_table_stats( user, 'EMP', numrows=>100000, numblks => 10000 );
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2> @at
ops$tkyte%ORA10GR2> column PLAN_TABLE_OUTPUT format a72 truncate
ops$tkyte%ORA10GR2> set autotrace on explain
ops$tkyte%ORA10GR2> select /*+ parallel( emp 2 ) */ ename, sum(sal), grouping(ename) from emp group by rollup(ename)
2 /
ENAME SUM(SAL) GROUPING(ENAME)
---------- ---------- ---------------
29025 1
FORD 3000 0
ADAMS 1100 0
BLAKE 2850 0
CLARK 2450 0
JAMES 950 0
SCOTT 3000 0
MILLER 1300 0
KING 5000 0
WARD 1250 0
ALLEN 1600 0
JONES 2975 0
SMITH 800 0
MARTIN 1250 0
TURNER 1500 0
15 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1391315833
------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CP
------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 100K| 1953K| 1717 (
| 1 | PX COORDINATOR | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10001 | 100K| 1953K| 1717 (
| 3 | SORT GROUP BY | | 100K| 1953K| 1717 (
| 4 | PX RECEIVE | | 100K| 1953K| 1717 (
| 5 | PX SEND HASH | :TQ10000 | 100K| 1953K| 1717 (
| 6 | SORT GROUP BY ROLLUP| | 100K| 1953K| 1717 (
| 7 | PX BLOCK ITERATOR | | 100K| 1953K| 1703 (
| 8 | TABLE ACCESS FULL | EMP | 100K| 1953K| 1703 (
------------------------------------------------------------------------
ops$tkyte%ORA10GR2> select ename, sum(sal), grouping(ename) from emp group by rollup(ename)
2 /
ENAME SUM(SAL) GROUPING(ENAME)
---------- ---------- ---------------
ADAMS 1100 0
ALLEN 1600 0
BLAKE 2850 0
CLARK 2450 0
FORD 3000 0
JAMES 950 0
JONES 2975 0
KING 5000 0
MARTIN 1250 0
MILLER 1300 0
SCOTT 3000 0
SMITH 800 0
TURNER 1500 0
WARD 1250 0
29025 1
15 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 52302870
------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time
------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 100K| 1953K| 3114 (2)| 00:00
| 1 | SORT GROUP BY ROLLUP| | 100K| 1953K| 3114 (2)| 00:00
| 2 | TABLE ACCESS FULL | EMP | 100K| 1953K| 3089 (2)| 00:00
------------------------------------------------------------------------
ops$tkyte%ORA10GR2> set autotrace off
Robert Massey, October 15, 2006 - 4:36 pm UTC
I didn't think I was going to get away with that one. :-)
index_desc hint and the order by
Mohamed, October 16, 2006 - 7:59 am UTC
Dear Tom,
you said
without an order by, there is absolutely no way you can make any assumptions
regarding the sorted order of data returned by a query
--- ---------
I have tested a select like this
select /*+ index_desc(aa pk_index_name) */
aa.pk, aa.xxx
from table T aa
and my output data have always been ordered by pk
without a need to have an order by pk
Kind Regars
October 16, 2006 - 8:05 am UTC
so, i flipped a coin and every time is comes up heads.
therefore, coins when flipped will come up heads. I win right?
ARG, UGH, unbelievable - I just don't get the persistence people keep showing with such an obvious thing here.
Tell me - DOES ORACLE HAVE TO OBEY YOUR HINT. does it?
Ok, let us say it even DOES follow your hint, did you, umm, well, read the order in the court link above? You might find it illustrative.
I sort of have an example there that shows an index hint, the index hint being used, and the data - well - the data is sort of not sorted.
IF YOU WANT YOUR DATA SORTED IN SOME ORDER THAT YOU CAN RELY ON YOU WILL USE ORDER BY.
No if and or buts about it, period, it is a fact, anyone that says otherwise is wrong, not only wrong but DANGEROUSLY wrong
If your goal was to retrieve data from your query as fast as possible - then your approach should have been first rows optimization, if an index would could be used to skip the sort - it would skip the sort.
I won't even get into the "binary sort vs linguistic sort" that would take place with 8bit (or multi-byte) data as well - that would return data that is not sorted correctly when using an index.
IOT?
Duke Ganote, October 16, 2006 - 8:39 am UTC
Would an exception be an IOT? There's no heap to scan through, and I can't imagine Oracle would "leap around" the index.
October 16, 2006 - 10:24 am UTC
index full scan on the IOT would retrieve the data accidently by primary key.
index FAST full scan on the IOT would retrieve data as it hits it, in any order.
There is a HEAP to scan through of sorts - there are many ways to access data, index full scan, index range scan, parallel index range scan, partitioning, index fast full scan
<b>IF YOU WANT DATA SORTED, THOU SHALT USE ORDER BY
and if the optimizer can use an index retrieve the data sorted AND it makes sense to use said index - the optimizer WILL skip the ORDER BY (knowing it has been satisfied)
So, the optimizer can skip the ORDER BY - but YOU CANNOT EVER skip the order by.
If you do, it is a BUG</b>
ops$tkyte%ORA9IR2> create table t
2 ( user_id number primary key,
3 username varchar2(30),
4 created date,
5 data char(2000)
6 )
7 organization index;
Table created.
ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> begin
2 for i in 1..20
3 loop
4 insert into t values ( -i, i, sysdate, i );
5 commit;
6 end loop;
7 end;
8 /
PL/SQL procedure successfully completed.
ops$tkyte%ORA9IR2> exec dbms_stats.set_table_stats( user, 'T', numrows => 100000 );
PL/SQL procedure successfully completed.
ops$tkyte%ORA9IR2> set autotrace on explain
ops$tkyte%ORA9IR2> select user_id, username from t;
USER_ID USERNAME
---------- ------------------------------
-20 20
-19 19
-3 3
-2 2
-1 1
-6 6
-5 5
-4 4
-9 9
-8 8
-7 7
-12 12
-11 11
-10 10
-15 15
-14 14
-13 13
-18 18
-17 17
-16 16
20 rows selected.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=4 Card=100000 Bytes=3000000)
1 0 INDEX (FAST FULL SCAN) OF 'SYS_IOT_TOP_31792' (UNIQUE) (Cost=4 Card=100000 Bytes=3000000)
ops$tkyte%ORA9IR2> select /*+ first_rows */ user_id, username from t;
USER_ID USERNAME
---------- ------------------------------
-20 20
-19 19
-18 18
-17 17
-16 16
-15 15
-14 14
-13 13
-12 12
-11 11
-10 10
-9 9
-8 8
-7 7
-6 6
-5 5
-4 4
-3 3
-2 2
-1 1
20 rows selected.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=HINT: FIRST_ROWS (Cost=26 Card=100000 Bytes=3000000)
1 0 INDEX (FULL SCAN) OF 'SYS_IOT_TOP_31792' (UNIQUE) (Cost=26 Card=100000 Bytes=3000000)
ops$tkyte%ORA9IR2> set autotrace off
to Duke Ganote
Michel Cadot, October 16, 2006 - 8:58 am UTC
Same answer with IOT.
SQL> create table t (id primary key) organization index
2 as select object_id from dba_objects where object_id is not null order by dbms_random.value;
Table created.
SQL> exec dbms_stats.gather_table_stats (user, 'T', cascade=>true);
PL/SQL procedure successfully completed.
SQL> explain plan for select * from t;
Explained.
SQL> col PLAN_TABLE_OUTPUT format a78 trunc
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------
Plan hash value: 1035703402
------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)
------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 49167 | 192K| 15 (0)
| 1 | INDEX FAST FULL SCAN| SYS_IOT_TOP_59843 | 49167 | 192K| 15 (0)
------------------------------------------------------------------------------
8 rows selected.
Operation is "INDEX FAST FULL SCAN" that is (approximately) index is scanned in the order of the blocks not in the order of the keys.
October 16, 2006 - 10:27 am UTC
thanks, I added an example to that effect right above before I read this :)
to: Michel Cadot
Duke Ganote, October 16, 2006 - 9:07 am UTC
Thank you! So even if Oracle guaranteed it would follow an index hint, the order is still not guaranteed because the "index is scanned in the order of the blocks not in the order of the keys."
to: Tom
Duke Ganote, October 16, 2006 - 10:42 am UTC
Nice short example on the IOT ordering... thanks!
How did it split?
Arian, October 16, 2006 - 2:21 pm UTC
Hi Tom,
I guess the order on your IOT is because of block splits? But I can't figure out why it comes out the way it does. I reproduced your results in 9.2.0.7 and 10.2.
And I just wonder why the first block would get the 2 lowest numbers (last block in). Followed by the block with the highest (first block in).
I would expect either a reversed order, or an order with -1 in the middle.
Thanks.
Arian
October 16, 2006 - 5:46 pm UTC
ASSM perhaps :) automatic segment space management likes to "spray data about"
but yes, splits and all are going to move the data around, I made it so that only 2 keys per leaf so every 2 inserts - "let us do a reorg of the structure" would happen.
sorted hash clusters
Tom, October 18, 2006 - 3:11 pm UTC
I hate to beat a dead horse, especially when I agree that order by does indeed order the data, but don't you think that it is about time for the documentation folks to agree with us? In Admin Workshop II they told us that using sorted hash clusters will sort your data (within a hash key?) without using order by, but what I think they meant to say was that using sorted hash clusters will skip the sorting step on your data when the order by in your query matches the sort in the hash cluster.
</code>
http://download-east.oracle.com/docs/cd/B14117_01/server.101/b10739/hash.htm#ADMIN01901 <code>
October 18, 2006 - 3:55 pm UTC
You are correct, that is a documentation bug.
The ORDER BY must be included for correctness, the optimizer can decide that the physical sort is not necessary.
consider documentation bug filed.
I detect an Oracle magazine ASK TOM article forthcoming, no?
A reader, October 18, 2006 - 5:18 pm UTC
I know this is sometimes just an interview question, but it appears to be a widespread misconception...
October 18, 2006 - 5:46 pm UTC
you guess it :) I was a little short for my last submission - so this got submitted to fill it out..
Oramag-- AskTom
A reader, October 19, 2006 - 8:58 am UTC
another counterexample
Matthias Rogel, October 19, 2006 - 11:10 am UTC
October 19, 2006 - 2:14 pm UTC
but that isn't sorted, else we would not require ORDER SIBILINGS now would we.
It is returned unsorted, in the hierarchy.
scott%ORA10GR2> select level, rpad('*', 2*level, '*') || ename nm
2 from emp
3 start with mgr is null
4 connect by prior empno = mgr
5 /
LEVEL NM
---------- ---------------
1 **KING
2 ****JONES
3 ******SCOTT
4 ********ADAMS
3 ******FORD
4 ********SMITH
2 ****BLAKE
3 ******ALLEN
3 ******WARD
3 ******MARTIN
3 ******TURNER
3 ******JAMES
2 ****CLARK
3 ******MILLER
14 rows selected.
scott%ORA10GR2>
scott%ORA10GR2> select level, rpad('*', 2*level, '*') || ename nm
2 from emp
3 start with mgr is null
4 connect by prior empno = mgr
5 order SIBLINGS by ename
6 /
LEVEL NM
---------- ---------------
1 **KING
2 ****BLAKE
3 ******ALLEN
3 ******JAMES
3 ******MARTIN
3 ******TURNER
3 ******WARD
2 ****CLARK
3 ******MILLER
2 ****JONES
3 ******FORD
4 ********SMITH
3 ******SCOTT
4 ********ADAMS
14 rows selected.
That data wasn't "sorted" until and unless I sorted it with ORDER BY!
well well, I should have read the link before I worked up the example. Seems I already made that point
siblings and no siblings
Matthias Rogel, October 20, 2006 - 8:03 am UTC
your example had siblings
in which case we need an
ORDER SIBLINGS BY
to determine the order
I said, in a query WITHOUT SIBLINGS
(which means that there are no two distinct rows
with the same level) the order of the result set is
completely determined without any
ORDER BY clause,
namely the data will always be ordered by level
October 20, 2006 - 8:22 am UTC
where is it "well documented"...
anyway, I'll say it one more time for those that didn't see it before:
If you want data sorted in some specific order, then you had better well use ORDER BY or you cannot have ANY REASONABLE expectation of the order of the rows, period
while this connect by trick is neat, I would hardly hold it up as a shining example, if you want data sorted - you better SORT IT.
Mirjana, October 20, 2006 - 9:49 am UTC
Just one thing to comment ...
Recently we did a 8.0 to 10g upgrade to one of our clients databases, and next day all of the application's reports output data became unsorted and thus useless to users.
And all because back then (at the 8.0 version) someone thought that "group by" also meant "order by".
The solution was pretty expensive... re-enter and modify few hundreds of reports to add the "order by"...
October 20, 2006 - 9:55 am UTC
group by did NOT sort in 8.x in general either!!! they got lucky, plain and simple.
add an index - no sort
gather stats, plan changes - no sort
partition - no sort
hash index - no sort
parallel - no sort
unless and until there is an ORDER BY, well, you get my point by now I'm sure...
Group BY
Dan Clamage, October 20, 2006 - 2:53 pm UTC
I think the reason people think GROUP BY returns rows in sorted order is because for small data sets, it tends to do so, 90% (or better) of the time. They figure, hey, it's doing a sort anyway.
I bet there's a LOT of code that should be remediated to include ORDER BY.
I remember one kid who kept saying, "But we inserted the rows into the database in order! It should come out the way we put it in".
October 20, 2006 - 4:57 pm UTC
I remember the dba that asked me how to update a rowid.....
the developer wanted to put the rows back in the right order, seems they got out of order somehow.
documented
Matthias Rogel, October 20, 2006 - 2:53 pm UTC
temporary tables
Anthony, November 01, 2006 - 2:41 pm UTC
Tom, Excellent site.
It appears with a temporary table that the select always returns rows in the same order that they were inserted in:
- can you give a simple example in 10g that this is not the case, we have tried with varying row sizes but failed.
I'd like to demonstrate to some external developers that you always need an order by to guarantee the order - even with temporary tables
Thanks
November 01, 2006 - 6:25 pm UTC
i give up, i just totally give up. honestly. I entirely and totally *give up*
rolls eyes, walks away.
I hope your developers don't work on anything important. honestly, I don't know what else to say.
I
give
up
I just flipped a coin.... it always came up heads. ta-dah.
Refusing to Order By
Mike, November 01, 2006 - 7:20 pm UTC
Stop arguing with such people. Put it in your coding standards: If the rows must be in some order to meet the requirements or to function properly, you must code ORDER BY and there is no substitute and no exception.
One of the purposes of coding standards is to provide protection against those who do not (or will not) know better.
Chuck Jolley, November 02, 2006 - 11:37 am UTC
Is there some misunderstanding about the difference between the words "possibly" and "guaranteed"?
And why is this interesting?
What possible objection could anyone have to simply putting "order by" in their code if they want to guarantee the rows are sorted before they reach the client?
That's the part of this that I don't get.
guarateed
A reader, November 05, 2006 - 10:32 am UTC
rows are always ordered by rowum, that's for sure
you don't need an
"order by rownum"
November 05, 2006 - 11:00 am UTC
cott%ORA10GR2> select job, count(*), rownum from emp group by job, rownum;
JOB COUNT(*) ROWNUM
--------- ---------- ----------
CLERK 1 11
ANALYST 1 13
CLERK 1 1
PRESIDENT 1 9
SALESMAN 1 2
SALESMAN 1 3
SALESMAN 1 10
CLERK 1 12
CLERK 1 14
MANAGER 1 7
MANAGER 1 4
ANALYST 1 8
SALESMAN 1 5
MANAGER 1 6
14 rows selected.
really.
Temporary tables
Andrew Max, November 05, 2006 - 12:54 pm UTC
>> It appears with a temporary table that the select always returns rows
>> in the same order that they were inserted in
>> ...
>> I'd like to demonstrate to some external developers that you always need
>> an order by to guarantee the order - even with temporary tables
Anthony,
why do you need to prove obvious things to someone?
Ok, if they really wanna see an example with a temporary table - let them try this one:
SQL> select * from v$version where rownum = 1;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.1.0.3.0 - Prod
SQL> create global temporary table test
2 (x number, padding varchar2(2000))
3 on commit preserve rows;
Table created.
SQL> insert into test
2 select rownum, rpad('*', 1000, '*')
3 from all_objects
4 where rownum < = 10;
10 rows created.
SQL> select x from test;
X
----------
1
2
3
4
5
6
7
8
9
10
10 rows selected.
SQL> -- So far so good, but let's modify data a little...
SQL> update test
2 set padding = padding || padding
3 where x <= 5;
5 rows updated.
SQL> select x from test;
X
----------
1
3
5
6
7
8
9
10
2
4
10 rows selected.
Kind regards.
really.
A reader, November 06, 2006 - 5:12 am UTC
SQL> select job, count(*), rownum from emp group by job, rownum;
JOB COUNT(*) ROWNUM
--------- ---------- ----------
CLERK 1 1
CLERK 1 11
CLERK 1 12
CLERK 1 14
ANALYST 1 8
ANALYST 1 13
MANAGER 1 4
MANAGER 1 6
MANAGER 1 7
SALESMAN 1 2
SALESMAN 1 3
SALESMAN 1 5
SALESMAN 1 10
PRESIDENT 1 9
14 rows selected.
BUT
SQL> select rownum, "inner".* from (select job, count(*), rownum from emp group by job, rownum) "inner";
ROWNUM JOB COUNT(*) ROWNUM
---------- --------- ---------- ----------
1 CLERK 1 1
2 CLERK 1 11
3 CLERK 1 12
4 CLERK 1 14
5 ANALYST 1 8
6 ANALYST 1 13
7 MANAGER 1 4
8 MANAGER 1 6
9 MANAGER 1 7
10 SALESMAN 1 2
11 SALESMAN 1 3
12 SALESMAN 1 5
13 SALESMAN 1 10
14 PRESIDENT 1 9
14 rows selected.
November 06, 2006 - 10:05 am UTC
what is your point?
I just showed a case where rownum on the outer level of a query will not be ordered when you select from it.
That sort of shows that the statement:
<quote>
rows are always ordered by rowum, that's for sure
you don't need an
"order by rownum"
</quote>
Is not true.
I don't care how many times you flip your coin and see heads, you'll never convince me coins always end up heads - the reason:
Once I saw it land tails.
that is all we need, one counter case.
Not realy...
Jurij, November 06, 2006 - 6:22 am UTC
Didn't you say:
"rows are *always* ordered by rowum, that's for sure" ?
If you say ALLWAYS (and add THAT'S FOR SURE on top of that), then there should be no exceptions to this, right? Well, tom has demonstrated you a simple example that proves you are 100% wrong. Accept it, get ower with it.
Why this thread exists
David Weigel, November 06, 2006 - 11:11 am UTC
It seems that Oracle 10.2 introduced "HASH UNIQUE" for some grouping activities where it used to "SORT UNIQUE". A select with a group-by that happened to sort in older Oracle versions stopped sorting because Oracle changed its manner of grouping.
I discovered this en route to running into Oracle bugs 4926618 and 4604970, which caused a nasty performance problem for the first and caused wrong sum()s for the second. Once I altered the system to cure the bugs with _GBY_HASH_AGGREGATION_ENABLED=FALSE, which replaces "HASH UNIQUE" with "SORT UNIQUE", I noticed the side-effect: things which happened to sort in Oracle 9 and earlier, and stopped sorting in Oracle 10.2, started sorting again.
(Naturally, you wouldn't want to set _GBY_HASH_AGGREGATION_ENABLED=FALSE just to get default sorting back -- who knows if it'll have the same lucky effect in future releases.)
November 06, 2006 - 11:47 am UTC
no, it always had SORT GROUP BY NOSORT
It always had parallel execution.
It always had partitioning.
It always had LOTS OF WAYS.
If you read the original answer, it points to a google thread that is many many many years old.
dbms_xplan revisited
Sokrates, November 07, 2006 - 4:58 am UTC
Hallo Tom
SQL> explain plan for select * from dual;
Explained.
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
| 1 | TABLE ACCESS FULL| DUAL | 1 | 2 | 2 (0)| 00:00:01 |
Plan hash value: 272002086
| 0 | SELECT STATEMENT | | 1 | 2 | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
8 rows selected.
Hmm, not so much what I wanted.
I wanted
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 272002086
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS FULL| DUAL | 1 | 2 | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------
How to achieve this ?
Which "order by" to add to my query ?
(To tell you the truth, I cheated and changed the order the
rows came out by hand, but according to you, I was just lucky and it was pure sheer coincidence that they came out the way I wanted them.
I sometimes use this utility dbms_xplan - I know it sometimes doesn't tell me the right plan, but anyway -
and I want to use it sometimes for demonstration purposes
to other developpers.
But I don't want to be dependent on coincidence, especially
when demonstrating, so what to do ?)
November 07, 2006 - 4:35 pm UTC
I sort of addressed this above, no?
wait - yes.
play with parallel, for example....
for something as trivial and NON-PRODUCTION-ISH and "safe if it totally entirely screws up" as dbms_xplan, I have no problem here.
So Sokrates, would you count on it in a mission critical sort of way?
I'll say it one last time:
If you want to have an expectation, a clear expectation, of the order of rows in a result set, thou shall use order by.
I have an exception...
Michael Friedman, November 26, 2006 - 5:07 am UTC
V$Logminer
The undo and redo SQL are stored in VARCHAR2 columns. If they are more than 4K bytes they get split across multiple rows in v$Logminer.
There is no Row Number column... so we asked Oracle Support how the heck we could get back all the rows in each query in order.
Oracle Support checked with development and then responded that Oracle DB GUARANTEES that these rows will be returned in order in any simple query without Order By, Group By, etc. and that Oracle does not support complex queries on any V$ tables (and they pointed us to chapter and verse in the manual for that last bit).
I still feel very uncomfortable with this approach... but there doesn't seem to be an alternative.
November 26, 2006 - 9:34 am UTC
with v$table
as
(select rownum r, v.* from v$table v),
....
select
you'll likely find if you use that approach, many of your v$ queries are more efficient as well - v$ tables are tricky beasts - no read consistency, few if any indexes...
Does a DISTINCT gaurantee data is ordered though?
A reader, December 08, 2006 - 4:01 pm UTC
What if I do a distinct instead of a Group By, would that mean data will be ordered?
December 09, 2006 - 2:40 pm UTC
NO
order by means data will be ordered.
anything else - NO, the data is NOT sorted for sure. It might accidentally sometimes appear to be sorted.
without an order by, there is absolutely no way you can make any assumptions
regarding the sorted order of data returned by a query
I quite simply do not know how else to say this..... So, I've copied it from above.
Lock the follow up's to this one
A reader, December 09, 2006 - 6:19 pm UTC
I can't believe how much discussion and follow up there has been on this question. It's quite amusing :)
I'm laughing, you (Tom) must be crying judging by the no. of bold comments. Maybe you should turn off the follow ups to this question.
December 09, 2006 - 7:00 pm UTC
I just sit here AMAZED, just totally AMAZED that people continue to ask along the same lines...
yes.
If you want data sorted in some specific order, then you had better well use
ORDER BY or you cannot have ANY REASONABLE expectation of the order of the rows,
period
There is a real issue here though
Mike Friedman, December 09, 2006 - 11:01 pm UTC
Fact is, there is undocumented (and unsupported) database behavior that many systems rely on that changes between versions.
A good example is the Ordering behavior of Group By.
A system could pass intensive detailed unit and system testing on 8i and then die on 10g because of this issue.
It would be nice to have some kind of "Be nasty" flag on init.ora that automatically tried to not provide any unsupported behavior - for example, that deliberately messed up ordering unless there was an explicit "ordery by" for testing purposes. But I don't think it's really practical.
In general, almost any multi-row select(except for batch jobs that don't care about ordering) should have an Order By. It's one of the key things I look for in code reviews.
December 10, 2006 - 7:37 pm UTC
there is no undocumented database behavior here WHATSOEVER.
group by NEVER had to sort and in many cases WOULD NOT sort and even if it did, it was a BINARY sort, not a character set sort.
That system worked by flipping total accident and could have STOPPED working by accident at any second, just a statistics gathering causing a plan to change. You were living on the edge.
this is not "unsupported behavior", this is not "undocumented behavior". I have a word for what this is - but won't use it right now.
ALL selects that care about the ordering of rows should have an order by.
no "in general", no "multi-row", no "almost".
Greg Norris, December 11, 2006 - 8:57 am UTC
It's hard to believe this topic is still alive... I blame it on toxic levels of stuponitron emissions in the atmosphere. :p
What about subqueries?
A reader, December 30, 2006 - 3:26 pm UTC
Is an outer query guaranteed to get an ordered result set if a subquery sorts the results but the outer query does not explicitly do that?
Will queries of the following form return deterministic results? Or do i need to include another ORDER BY in the outer query?
SELECT ename FROM (
SELECT * FROM emp
ORDER BY ename
);
December 30, 2006 - 5:26 pm UTC
in this case, due to the lack of a "set operator" like ROWNUM in the outer level - it is able to technically ignore the inner order by.
But - the really good news - IF WE DON'T NEED TO SORT, WE WON'T....
ops$tkyte%ORA10GR2> set autotrace traceonly explain
ops$tkyte%ORA10GR2> select * from (select ename, empno from scott.emp order by ename) order by ename;
Execution Plan
----------------------------------------------------------
Plan hash value: 150391907
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 14 | 140 | 4 (25)| 00:00:01 |
| 1 | SORT ORDER BY | | 14 | 140 | 4 (25)| 00:00:01 |
| 2 | TABLE ACCESS FULL| EMP | 14 | 140 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------
ops$tkyte%ORA10GR2> select * from (select ename, empno from scott.emp order by ename);
Execution Plan
----------------------------------------------------------
Plan hash value: 150391907
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 14 | 140 | 4 (25)| 00:00:01 |
| 1 | SORT ORDER BY | | 14 | 140 | 4 (25)| 00:00:01 |
| 2 | TABLE ACCESS FULL| EMP | 14 | 140 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------
but you don't need, do you
A reader, December 31, 2006 - 5:55 am UTC
in order to
select * from (select ename, empno from scott.emp order by ename);
,
you don't need to sort, why should you ?
December 31, 2006 - 4:01 pm UTC
I don't understand what point you are trying to make.
Order by dbms_random.value
Arindam Mukherjee, January 01, 2007 - 7:28 am UTC
Respected Mr. Tom,
I always use "Order by <column name> or sequence number of column name in Select list". But in so many examples even in your book also, I see "order by dbms_random.value or dbms_random.random". I am sorry I could not understand what objective it achieves. In this thread, there is also one "order by dbms_random.value".
Question 1: Could you kindly tell me - WHEN and WHY I should use "order by dbms_random.value or dbms_random.random"?
Question 2: What is the difference between "order by dbms_random.value or dbms_random.random" AND "no order by clause in Select as opposed to order by dbms_random.value or dbms_random.random"?
January 01, 2007 - 9:14 am UTC
I wanted the data sorted randomly for the example apparently.
seems straightforward. the examples you looked at where like regarding clustering factors (index metric). If the data is sorted by the key being indexed - clustering factor is low, if the data is not (eg: it is sorted randomly), the clustering factor would be high.
no sorting required
Sokrates, January 02, 2007 - 1:58 am UTC
"select * from (select ename, empno from scott.emp order by ename); "
doesn't need a sort because
"in this case, due to the lack of a "set operator" like ROWNUM in the outer level - it is able to technically ignore the inner order by. "
however, a sort seems to occur.
Could you please clarify ?
Have you an example when Oracle ignores the inner order by ?
January 02, 2007 - 8:04 am UTC
it technically does not need a sort and some day into the future, it might not do one.
thx
Sokrates, January 02, 2007 - 8:11 am UTC
and the bar keeps moving...
Connor, January 02, 2007 - 8:32 am UTC
Its amazing where Oracle finds ways to avoid sorting, eg, when I was timing sorting (and didn't want to fetch the results) I typically used:
SQL> select count(*)
2 from
3 ( select /*+ NO_MERGE */ c
4 from MY_BIG_TABLE
5 order by c ) ;
Problem is - in 10.2, the optimizer looked and said "hey, thats a count(*) - the sort won't be needed even though its a NO_MERGE", so I had to resort to:
SQL> select count(*)
2 from
3 ( select /*+ NO_MERGE */ c, rownum
4 from MY_BIG_TABLE
5 order by c ) ;
to test my sorting...
Bottom Line - relying on anything to sort when its not explicitly asked for is a a recipe for disaster
Surprising how often this comes up...
Mark J. Bobak, January 02, 2007 - 9:14 am UTC
I'm really amazed how often this one comes up, and how frequently it has to be quashed, over and over again...
I think part of the problem is that lots of folks were doing group by, without order by, and getting lucky for many, many years. Then, they upgraded to 10g, and 10g said "Hey, I don't have to do a sort operation for group by any more, I can do a hash instead!" And suddenly people start getting different output, and the sky is falling, and the rest is history....;-)
Eliminating ORDER BY
Jonathan Lewis, January 02, 2007 - 7:23 pm UTC
Connor,
If you want to avoid wasting resources on the redundant rownum, 10g a hint to block the elimination of the "order by". Change your current hint set to read:
/*+
no_merge
qb_name(ordering)
no_eliminate_oby(@ordering)
*/
This names the query block (good practice for 10g) and then blocks "order by elimination".
There are cases where you actually NEED to do this to stop oracle from sub-optimising your queries. There's an example of this on my blog at:
http://jonathanlewis.wordpress.com/2006/11/06/filter-subqueries/
Come on... no system is fully specified
Michael Friedman, January 02, 2007 - 10:31 pm UTC
Tom,
I'm not sure why you denied above that Oracle has undocumented behavior.
No non-trivial system can be fully specified - it's not commerically or practically possible.
That means every system will have undocumented behaviors.
And that also means that when other systems are integrated with such systems they will sometimes deliberately or inadvertantly rely on such behaviors.
Fact of life.
So, for example, the order of rows returned from a query without an Order By on columns with unique values is not specified. But depending on your version and the characteristics of your data it will come back in some order.
And sometimes systems will get built that rely on that ordering, that pass rigorous testing, and then break totally when you do a database upgrade that is theoretically transparent to an application.
This is an issue for anyone developing systems. Facto of life.
The key is to understand this and to know what kinds of things to look for when a DB upgrade breaks your system instead of throwing your hands in the air and complaining that the new version of the DB is a buggy piece of crap since the upgrade should have been transparent.
January 05, 2007 - 7:49 am UTC
where did I deny that? I've said hundreds if not thousands of times - "sorry, that is not documented, not going to talk about it"
It is documented that IF YOU WANT DATA SORTED IN SOME FASHION THEN YOU WILL USE ORDER BY
I'm not getting your point, if your query doesn't have an order by on it at the outer most level, YOU CANNOT ASSUME, PRESUME, WHATEVER - that the data is sorted by anything.
Documented vs. undocumented vs. side effects
A reader, January 03, 2007 - 10:58 am UTC
Documented: the official "public" documentation states how the system behaves.
Undocumented: the official "public" documentation does not state anything, but some internal, non public documentation may exist
Side effect: the system behaves that way because of another documented or undocumented behaviour. It just happens.
I guess Oracle internal documentation describes how a GROUP BY is implemented :) but if the *current* implementation as a side effect leads to some sort of ordering may be not documented at all, it may be not important at all, and may change whenever implementation is changed, even slightly.
It is NOT a behavior, documented or not !
Nico, January 04, 2007 - 12:35 pm UTC
This is to answer to Michael Friedman :
You said : "That means every system will have undocumented behaviors."
The error in your demonstration is to consider the "occasionnaly sorted result set of a group by clause query" as a behavior !!!
According to you, one could say : "the (undocumented) behavior of a coin is to always come up heads"
Nevertheless ...
Jeff Seeman, January 22, 2007 - 6:24 pm UTC
In an Oracle 10.2.0.2 database I populated a table with one million randomly-generated integers. I then created a second table based on a select distinct of the first. The second table was perfectly sorted. I then created a third table based on a group by of the numeric column in the first table. This third table too was perfectly sorted. Why am I getting "impossible" results if the rule that distinct and group by does not guarantee any particular order is absolute?
To: Jeff Seeman
Michel Cadot, January 24, 2007 - 7:11 am UTC
"Does not guarantee" does mean neither "forbids" nor "impossible result". It means it can be in any order including in sorting order.
Just as simple as this.
Michel
To Jeff
A reader, March 05, 2007 - 3:10 pm UTC
Think of it this way. Oracle is NOT contractually required to return it in a sorted order. It MIGHT, but the database has NO obligation to do so and has NO obligation to use the same behavior in the future. (or in a different execution in the same database) Just because something happens N number of times does not mean that it will always keep happening.
Just put the order by clause on.
optimizer_features_enable
steve, September 26, 2007 - 11:25 am UTC
Hello Tom,
We are experiencing this issue (group by queries no longer
sorted after upgrading to 10.2). A trouble ticket has been
raised with the design group but the fix will take a few weeks.
In the meantime (as a temporary remedy), do you see any
major problem with setting optimizer_features_enable to
10.1.0.5? It seems to correct the issue (again, just as
a temporary solution).
September 26, 2007 - 10:04 pm UTC
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
GROUP BY QUERIES NEVER HAD TO SORT - EVER
why wasn't this found in test????
if you find setting that parameter appears to sometimes fix your problem, go for it.
there is only one fix. only one.
there is an underscore parameter you can ask support about as well to disable hash group by
order by in analytic functions
Anurag, May 20, 2008 - 2:51 am UTC
Hi Tom,
Can an order by in an analytic function do the work of an order by clause in the query?
Consider the following examples:
SQL> drop table t;
Table dropped
SQL> create table t
2 (i varchar2(10),
3 j varchar2(10),
4 k number);
Table created
SQL> insert into t values ('A', 'a', 1);
1 row inserted
SQL> insert into t values ('A', 'b', 1);
1 row inserted
SQL> insert into t values ('B', 'a', 1);
1 row inserted
SQL> insert into t values ('B', 'b', 1);
1 row inserted
SQL> insert into t values ('C', 'a', 1);
1 row inserted
SQL> insert into t values ('C', 'a', 1);
1 row inserted
SQL> commit;
Commit complete
Query 1:
-----------
SQL> select i,
2 j,
3 row_number() over(order by i asc, j asc) r1
4 from t;
I J R1
---------- ---------- ----------
A a 1
A b 2
B a 3
B b 4
C a 5
C a 6
6 rows selected
Query 2:
-----------
SQL> select i,
2 j,
3 row_number() over(order by i asc, j asc) r1
4 from t
5 order by i, j;
I J R1
---------- ---------- ----------
A a 1
A b 2
B a 3
B b 4
C a 5
C a 6
6 rows selected
Query 3:
-----------
SQL> select i,
2 j,
3 row_number() over(order by i asc, j asc) r1,
4 row_number() over(order by j asc, i asc) r2
5 from t;
I J R1 R2
---------- ---------- ---------- ----------
A a 1 1
B a 3 2
C a 6 3
C a 5 4
A b 2 5
B b 4 6
6 rows selected
Query 4:
-----------
SQL> select i,
2 j,
3 row_number() over(order by i asc, j asc) r1,
4 row_number() over(order by j asc, i asc) r2
5 from t
6 order by i, j;
I J R1 R2
---------- ---------- ---------- ----------
A a 1 1
A b 2 5
B a 3 2
B b 4 6
C a 5 4
C a 6 3
6 rows selected
Logic (please correct me if I am wrong) would tell me that the Query 1 is different from Query 2.
The order by in the analytic function is used to generate the row_number while the order by clause in the query sorts the records fetched by the query.
Hence there should be no expectation of getting the data sorted based on i and j if I were to use Query 1.
But the results of the queries 3 and 4 show that the results are sorted by the last analytic function called.
I understand that this is because the analytic functions are called after the where clause but before the order by clause; the optimizer sees that a sorting has already been performed and does not perform another sort.
Should we have any reason to believe that the otimizer will continue to work this way everytime?
May 20, 2008 - 11:14 am UTC
...
Can an order by in an analytic function do the work of an order by clause in
the query?
....
Yes - it CAN, but NO you cannot rely on it to do so.
Yes - it CAN, but NO you cannot rely on it to do so.
Yes - it CAN, but NO you cannot rely on it to do so.
Yes - it CAN, but NO you cannot rely on it to do so.
Yes - it CAN, but NO you cannot rely on it to do so.
Yes - it CAN, but NO you cannot rely on it to do so.
Yes - it CAN, but NO you cannot rely on it to do so.
Yes - it CAN, but NO you cannot rely on it to do so.
Yes - it CAN, but NO you cannot rely on it to do so.
Yes - it CAN, but NO you cannot rely on it to do so.
Yes - it CAN, but NO you cannot rely on it to do so.
...
But the results of the queries 3 and 4 show that the results are sorted by the
last analytic function called.
.....
and the last time I flipped a coin, it landed up heads - so therefore, all coins will land heads up right?
IF YOU NEED DATA SORTED THERE IS ONE AND ONLY ONE WAY TO GET THAT.
use order by.
IF the optimizer can skip the actual sort, because some prior step did it, then it will (it won't sort it twice if it already knows it is sorted). BUT - unless and until you add an order by, you cannot be sure the data will be returned sorted.
Empirical observation does not imply "this is assured to be true always"
USE ORDER BY IF YOU NEED IT SORTED.
please - just read the original answer to this question above. I thought I was absolutely unambiguous. Since I must not have been - proven by the fact you are asking this - maybe you can help me out....
What would you have needed to see on this page to understand that without and ORDER BY - the result set is not guaranteed to be sorted in any way shape or form?
Yet another Question
bc, May 20, 2008 - 12:26 pm UTC
I fully understand and always adhere to the fact that If you need data sorted, then use ORDER BY.
However, I am trying to understand why prior to 10g the resultset was was always sorted when grouped by.
APPS@dev > select * from v$version;
BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.6.0 - 64bit Production
PL/SQL Release 9.2.0.6.0 - Production
CORE 9.2.0.6.0 Production
TNS for Solaris: Version 9.2.0.6.0 - Production
NLSRTL Version 9.2.0.6.0 - Production
5 rows selected.
APPS@dev > select extract(month from created) || ' - ' || to_char(created, 'Month') month_created,
2 count(*) object_count
3 from all_objects
4 group by extract(month from created) || ' - ' || to_char(created, 'Month')
5 /
MONTH_CREATED OBJECT_COUNT
------------------------------ ------------
1 - January 114
10 - October 167
11 - November 233
12 - December 1911
2 - February 164
3 - March 89
4 - April 120
5 - May 28719
6 - June 162833
7 - July 185
8 - August 96
9 - September 78
12 rows selected.
APPS@prod > select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - 64bi
PL/SQL Release 10.2.0.3.0 - Production
CORE 10.2.0.3.0 Production
TNS for Solaris: Version 10.2.0.3.0 - Production
NLSRTL Version 10.2.0.3.0 - Production
5 rows selected.
APPS@prod > select extract(month from created) || ' - ' || to_char(created, 'Month') month_created,
2 count(*) object_count
3 from all_objects
4 group by extract(month from created) || ' - ' || to_char(created, 'Month')
5 /
MONTH_CREATED OBJECT_COUNT
------------------------------ ------------
5 - May 94395
4 - April 161
12 - December 1927
6 - June 156811
3 - March 45
2 - February 160
11 - November 257
7 - July 203
9 - September 70
10 - October 105
1 - January 126
8 - August 93
12 rows selected.
May 20, 2008 - 3:36 pm UTC
...
However, I am trying to understand why prior to 10g the resultset was was
always sorted when grouped by.
...
IT WASN'T, did you read any of the links in the original question - they date back well before 10g was even dreamt of and demonstrate the simple fact that:
group by does not, did not have to sort - ever - in any release of any database - ever.
those links demonstrate CLEARLY that forever - group by was not the same as saying order by.
Please stop flipping a coin and saying "it will always be heads", logic, proof - they do not work that way.
you can show as many queries as you like where you say "the data is sorted", as soon as I show you ONE QUERY that is isn't, I have proven wrong the premise that the data is always returned sorted!!!!!!!!! It is called proof by contradiction.
Data was NEVER always returned sort - never never never
Unless and until there is an order by - the data is not sorted and you can not have any expectations otherwise
(10g introduced a new group by approach, hash group by - but the data was never returned ordered by anything on purpose, ever)
Awesome as always ....
bc, May 20, 2008 - 6:01 pm UTC
This is exactly what I was looking for Tom, My current client just migrated from 9i to 10g ( finally ) and there have been a boatload of problems because their developers assumed that oracle would or already had implemented some kind of crystal ball mechanism that would automagically figure out their design / intent and provide a sorted output without using the order by clause.
They also assumed that relational is the synomym for sequential and "when others then commit" is perfectly normal ... :)
Thanks
BC
May 20, 2008 - 9:49 pm UTC
... "when others then commit" ...
ouch, that hurts, a lot.
¿bug with ROLLUP, COUNT(DISTINCT ...) and ORDER BY?
MutenRo, June 09, 2008 - 6:47 pm UTC
Hi,
this query is returning the rows in a WRONG order:
SELECT A, B, COUNT(DISTINCT C) AS C
FROM (
SELECT 1 AS A, 3 AS B, 2 AS C FROM DUAL
UNION ALL
SELECT 1 AS A, 3 AS B, 3 AS C FROM DUAL
UNION ALL
SELECT 2 AS A, 1 AS B, 2 AS C FROM DUAL
UNION ALL
SELECT 2 AS A, 2 AS B, 3 AS C FROM DUAL
)
GROUP BY ROLLUP(A), B
ORDER BY B
returns:
A B C
2 1 1
2 2 1
1 3 2
null 1 1
null 2 1
null 3 2
It seems to be a bug because if we use COUNT(C) or SUM(C) instead of COUNT(DISTINCT C) the we get a correct result.
Is it a bug?
Thanks
June 09, 2008 - 9:49 pm UTC
I reproduced in 10.2.0.2 (I have to get .4)
Does not reproduce in 11.1.0.6
what release are you on?
¿bug with ROLLUP and ORDER BY?
MutenRo, June 10, 2008 - 4:04 am UTC
I'm using 10.2.0.1.0, I didn't try with 11g, so I guess the fixed it.
Thanks.
June 10, 2008 - 7:53 am UTC
oh, I should have given you the workaround
with data
as
(select /*+ materialize */ .... group by rollup(...) )
select * from data order by b;
data never ever sorted without order by?
Pasko, June 10, 2008 - 3:48 pm UTC
Hi Tom,
only related slightly to this thread ;)
i read from some OCP Books that data in a Sorted Hash Cluster is always ordered even without an order-by, in fact the order-by is even ignored in this case.
is that true?
Regards,
Pasko
June 10, 2008 - 8:18 pm UTC
it is NOT true.
the order by is never ignored.
It is KNOWN by the optimizer that "we can retrieve the data sorted without having to sort it again, since it was placed there 'in order'"
The OPTIMIZER has been doing that for longer than I've been using Oracle (skipping actually sorting data when it understands "data is already sorted")
Eg:
select * from t where indexed_column between 1 and 2 order by indexed_column;
you might well find the "order by is ignored (NOT REALLY)", the order by will probably be satisfied by accessing the data in order via the index. The order by is NOT IGNORED, it is optimized away - because we know the data is retrieved in a manner that satisfies the sort (big big BIG difference, not ignored, never IGNORED)
tell me, what happens if you join data in a sorted hash cluster with other data? Will it be sorted? If so, by what?
Ok, wait, I'll just use a single table.
ops$tkyte%ORA10GR2> /*
ops$tkyte%ORA10GR2> drop table audit_trail_shc;
ops$tkyte%ORA10GR2> drop cluster shc;
ops$tkyte%ORA10GR2> set echo on
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> CREATE CLUSTER shc
ops$tkyte%ORA10GR2> (
ops$tkyte%ORA10GR2> ip_address NUMBER,
ops$tkyte%ORA10GR2> request_time timestamp SORT
ops$tkyte%ORA10GR2> )
ops$tkyte%ORA10GR2> single table
ops$tkyte%ORA10GR2> HASHKEYS 1000
ops$tkyte%ORA10GR2> HASH IS ip_address
ops$tkyte%ORA10GR2> SIZE 20480
ops$tkyte%ORA10GR2> /
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create table audit_trail_shc
ops$tkyte%ORA10GR2> ( ip_address number,
ops$tkyte%ORA10GR2> request_time timestamp sort,
ops$tkyte%ORA10GR2> bytes number,
ops$tkyte%ORA10GR2> url varchar2(255)
ops$tkyte%ORA10GR2> )
ops$tkyte%ORA10GR2> cluster shc (ip_address, request_time);
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> insert into audit_trail_shc (ip_address, request_time, bytes, url )
ops$tkyte%ORA10GR2> select 123456780001 + mod(rownum,255),
ops$tkyte%ORA10GR2> sysdate-100+rownum/24/60,
ops$tkyte%ORA10GR2> object_id,
ops$tkyte%ORA10GR2> rpad( '*', 80+mod(rownum,150), '*' ) url
ops$tkyte%ORA10GR2> from all_objects;
ops$tkyte%ORA10GR2> */
ops$tkyte%ORA10GR2> select to_char( ip_address,'999999999999999999999' ) ip, request_time, bytes from audit_trail_shc where bytes between 100 and 110;
IP REQUEST_TIME BYTES
---------------------- ------------------------------ ----------
123456780106 02-MAR-08 09.54.49.000000 PM 106
123456780107 02-MAR-08 09.55.49.000000 PM 107
123456780100 02-MAR-08 09.48.49.000000 PM 100
123456780108 02-MAR-08 09.56.49.000000 PM 108
123456780101 02-MAR-08 09.49.49.000000 PM 101
123456780109 02-MAR-08 09.57.49.000000 PM 109
123456780102 02-MAR-08 09.50.49.000000 PM 102
123456780110 02-MAR-08 09.58.49.000000 PM 110
123456780103 02-MAR-08 09.51.49.000000 PM 103
123456780104 02-MAR-08 09.52.49.000000 PM 104
123456780105 02-MAR-08 09.53.49.000000 PM 105
11 rows selected.
does that appear to be sorted by *anything* ??
easier workaround for the Bug MutenRo from Spain hit
M. Rogel, June 11, 2008 - 6:17 am UTC
SELECT A, B, COUNT(DISTINCT C) AS C
FROM (
SELECT 1 AS A, 3 AS B, 2 AS C FROM DUAL
UNION ALL
SELECT 1 AS A, 3 AS B, 3 AS C FROM DUAL
UNION ALL
SELECT 2 AS A, 1 AS B, 2 AS C FROM DUAL
UNION ALL
SELECT 2 AS A, 2 AS B, 3 AS C FROM DUAL
)
GROUP BY ROLLUP(A), B
ORDER BY b+0
Perhaps update the documentation set
Steve, June 11, 2008 - 10:31 am UTC
Tom,
Perhaps you should suggest to the Documentation Group that they add an additional volume to the forthcoming 11g documentation set. It would be titled _Query Results Ordering Concepts_ and after the title page, TOC, etc would consist of one page with one paragraph: "Query results have no order. None. Query results have no implied, implicit, or natural order. The only method for obtained ordered results is to use the ORDER BY clause." And maybe an Appendix with a copy of your "Order in The Court" essay.
Based on what I have experienced recently, sitting here in 2008 (2008 for gosh sakes), I can't see any other way to kill this dangerous assumption.
sPh
sorted hash cluster
A reader, June 11, 2008 - 1:57 pm UTC
Hi Tom,
thanks for your follow-up.
You are right that the order by clause in a sorted hash cluster is NOT ignored.I have tested with your sample data.
Then i think this could be a New Feature Doc Bug:
see: http://download-uk.oracle.com/docs/cd/B19306_01/server.102/b14231/hash.htm#ADMIN01901
According to the Docs , we are supposed to get FIFO output from a sorted hash cluster, but it is NOT so!
Example:
insert into audit_trail_shc (ip_address, request_time, bytes, url )
values ( 11 , to_date(`010120081000¿,¿DDMMYYYYHH24MI¿), 111, `url1¿)
/
insert into audit_trail_shc (ip_address, request_time, bytes, url )
values ( 3 , to_date(`010120081100¿,¿DDMMYYYYHH24MI¿), 33, `url1¿)
/
insert into audit_trail_shc (ip_address, request_time, bytes, url )
values ( 2 , to_date(`010120080700¿,¿DDMMYYYYHH24MI¿), 22, `url1¿)
/
insert into audit_trail_shc (ip_address, request_time, bytes, url )
values ( 1 , to_date(`010120081200¿,¿DDMMYYYYHH24MI¿), 11, `url1¿)
/
SQL> select to_char( ip_address,'999999999999999999999' ) ip, request_time, bytes
2 from audit_trail_shc
3 where bytes between 10 and 120;
IP REQUEST_TIME BYTES
---------------------- ------------------------------ -----
11 01.01.08 10:00:00,000000 111
3 01.01.08 11:00:00,000000 33
2 01.01.08 07:00:00,000000 22
1 01.01.08 12:00:00,000000 11
If i change the order of inserts, then i do NOT get the expected result ( FIFO)
SQL> delete audit_trail_shc;
4 Zeilen wurden gel÷scht.
SQL> commit
2 /
Transaktion mit COMMIT abgeschlossen.
SQL> insert into audit_trail_shc (ip_address, request_time, bytes, url )
2 values ( 1 , to_date('010120081200','DDMMYYYYHH24MI'), 11, 'url1')
3 /
1 Zeile wurde erstellt.
SQL> insert into audit_trail_shc (ip_address, request_time, bytes, url )
2 values ( 2 , to_date('010120080700','DDMMYYYYHH24MI'), 22, 'url1')
3 /
1 Zeile wurde erstellt.
SQL> insert into audit_trail_shc (ip_address, request_time, bytes, url )
2 values ( 3 , to_date('010120081100','DDMMYYYYHH24MI'), 33, 'url1')
3 /
1 Zeile wurde erstellt.
SQL> insert into audit_trail_shc (ip_address, request_time, bytes, url )
2 values ( 11 , to_date('010120081000','DDMMYYYYHH24MI'), 111, 'url1')
3 /
1 Zeile wurde erstellt.
SQL>
SQL> select to_char( ip_address,'999999999999999999999' ) ip, request_time, bytes
2 from audit_trail_shc
3 where bytes between 10 and 120;
IP REQUEST_TIME BYTES
---------------------- ------------------------------ -----
11 01.01.08 10:00:00,000000 111
3 01.01.08 11:00:00,000000 33
2 01.01.08 07:00:00,000000 22
1 01.01.08 12:00:00,000000 11
June 11, 2008 - 5:54 pm UTC
without an order by, there can be no expectation of order.
I concur that the statement:
Given the sort order of the data, the following query would return the call records for a specified hash key by oldest record first.
SELECT * WHERE telephone_number = 6505551212;
is misleading - it would almost certainly by accident do that, but without the order by - it doesn't have to.
In their example, they query by the hash key - if you do that, you will probably accidentally get the data in sorted order by the sort key.
However, if you
Given the sort order of the data, the following query would return the call records for a specified hash key by oldest record first.
SELECT * FROM t1 left join t2 on (...) WHERE t1.telephone_number = 6505551212;
who knows.
there should have been an order by there.
Just make it break
Loz, June 11, 2008 - 7:37 pm UTC
Hey Tom,
Why don't you save yourself some blood pressure problems and suggest to the dev team they add a startup flag on Oracle to deliberately randomise output order when there is no explicit ORDER BY. I know I'd use it (if I used Oracle any more).
June 11, 2008 - 9:29 pm UTC
;)
I've always wanted the database to have this feature.
there would be a parameter, it's name would be:
FAST
it would default to false. when set to false, there would be lots of:
if ( !fast )
then
for i in 1 .. 1000 loop x = ln(i); end loop;
end if;
loops in the kernel.
then we could set
FAST=true
how cool would that be... similar concept.
lol
A reader, June 12, 2008 - 3:10 am UTC
about this FAST=TRUE !
Sub query - revisited
KK -$, February 18, 2009 - 5:44 am UTC
Thanks for an excellent article like this!
In response to one of the query copied below, you said:
But - the really good news - IF WE DON'T NEED TO SORT, WE WON'T....So, does the below query returns records in ascending order of ename?
SELECT ename FROM (
SELECT * FROM emp
ORDER BY ename
);
What about subqueries? December 30, 2006 - 3pm US/Eastern Bookmark | Bottom | TopReviewer: A reader
Is an outer query guaranteed to get an ordered result set if a subquery sorts the results but the outer query does not explicitly do that?
Will queries of the following form return deterministic results? Or do i need to include another ORDER BY in the outer query?
SELECT ename FROM (
SELECT * FROM emp
ORDER BY ename
);
Followup December 30, 2006 - 5pm US/Eastern:
in this case, due to the lack of a "set operator" like ROWNUM in the outer level - it is able to technically ignore the inner order by.
But - the really good news - IF WE DON'T NEED TO SORT, WE WON'T....
February 18, 2009 - 3:35 pm UTC
... So, does the below query returns records in ascending order of ename? ...
that was already answered in the bit you cut and pasted....
in this case, due to the lack of a "set operator" like ROWNUM in the outer
level - it is able to technically ignore the inner order by.there is no reason it would have to sort, we can look and see that there is nothing (like rownum) that would FORCE us to sort and since there is no order by on the outer bit of the query, we are free to skip it.
it could return the data unsorted if it felt like it, there is no reason we would HAVE to sort this particular set of data.
This shows the order by ename is "skipable" if we felt like it
ops$tkyte%ORA11GR1> create table emp as select * from scott.emp;
Table created.
ops$tkyte%ORA11GR1> exec dbms_stats.gather_table_stats( user, 'EMP' );
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR1>
ops$tkyte%ORA11GR1> @at
ops$tkyte%ORA11GR1> column PLAN_TABLE_OUTPUT format a72 truncate
ops$tkyte%ORA11GR1> set autotrace traceonly explain
ops$tkyte%ORA11GR1> select ename from (select * from emp order by ename);
Execution Plan
----------------------------------------------------------
Plan hash value: 150391907
------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time
------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 14 | 84 | 4 (25)| 00:00:0
| 1 | SORT ORDER BY | | 14 | 84 | 4 (25)| 00:00:0
| 2 | TABLE ACCESS FULL| EMP | 14 | 84 | 3 (0)| 00:00:0
------------------------------------------------------------------------
<b>here is didn't drop the order by, it could have, but it did not, I would not necessarily rely on that, the following shows the order by can be skipped:</b>
ops$tkyte%ORA11GR1> select ename from (select * from emp order by ename) order by empno;
Execution Plan
----------------------------------------------------------
Plan hash value: 150391907
------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time
------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 14 | 140 | 4 (25)| 00:00:0
| 1 | SORT ORDER BY | | 14 | 140 | 4 (25)| 00:00:0
| 2 | TABLE ACCESS FULL| EMP | 14 | 140 | 3 (0)| 00:00:0
------------------------------------------------------------------------
<b>there is only one sort - not a sort for ENAME and then a sort for EMPNO, just one sort - for empno, the sort of ename can be skipped - unless of course we do something that makes it "not skippable" </b>
ops$tkyte%ORA11GR1> select ename, <b>rownum</b> from (select * from emp order by ename) order by empno;
Execution Plan
----------------------------------------------------------
Plan hash value: 1334697481
------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time
------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 14 | 280 | 5 (40)| 00:0
| 1 | SORT ORDER BY | | 14 | 280 | 5 (40)| 00:0
| 2 | COUNT | | | | |
| 3 | VIEW | | 14 | 280 | 4 (25)| 00:0
| 4 | SORT ORDER BY | | 14 | 518 | 4 (25)| 00:0
| 5 | TABLE ACCESS FULL| EMP | 14 | 518 | 3 (0)| 00:0
------------------------------------------------------------------------
ops$tkyte%ORA11GR1> set autotrace off
<b>rownum is assured (top-n queries rely on this, it has to happen) to be assigned after the sort of ename, before the sort by empno. So we sort by ename, assign rownum, then sort by empno</b>
Clever oracle...
Matt, February 24, 2009 - 11:18 am UTC
Just add my 2 cents. We agree that ORDER BY is mandatory, and if your results are sorted just because you have an index, there is no guaranty that it will be always be the same. I just found funy that oracle knows that it doe not need to perform that sort. Here is a query performed on my database. I have an index on N_SUBMISSION_DATE ASC, N_RECORD_STATE, N_UNEXECUTED and perform the following query. As you will notice, statistics states that it didn't execute a sort despite the order by clause :
SQL> SELECT C_INVOCATION_ID, N_SUBMISSION_DATE, C_REQUESTED_SERVICE, C_INVOCATION_STATE, C_TRANSACTION_ID FROM BS_INVOCATION WHERE N_SUBMISSION_DATE>0 AND N_RECORD_STATE = 1 AND N_UNEXECUTED = 0 ORDER BY N_SUBMISSION_DATE;
26 rows selected.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=RULE (Cost=3 Card=1 Bytes=176)
1 0 TABLE ACCESS (BY GLOBAL INDEX ROWID) OF 'BS_INVOCATION' (C
ost=3 Card=1 Bytes=176)
2 1 INDEX (RANGE SCAN) OF 'FK_BS_INV_RECORD_STATE' (NON-UNIQ
UE) (Cost=2 Card=346)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
9 consistent gets
0 physical reads
0 redo size
1623 bytes sent via SQL*Net to client
514 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
26 rows processed
Hierarchical query
A reader, May 14, 2009 - 12:25 pm UTC
It is a new twist of the old story.
In our application we have a query:
select * from
( select Element_eid, Level as lvl
from P
start with Ancestor_eid=:1 and SortOrder=:2
connect by Ancestor_eid=prior Element_eid and SortOrder=2
) t, P1 pr, W1 wo
where pr.Project_eid (+) = t.Element_eid and
wo.Work_eid (+) = t.Element_eid ;
Project_eid is a primary key of P1 and Work_eid is a primary key of W1.
It is a simple hierarchical query with 2 joins and of course result set will be in "hierarchical" order.
What else could it be? And it was sorted properly for several years!
But yesterday CBO changed its mind and decided that hash join is better than nested loops and now result set is not sorted at all. Of course it is very interesting why CBO changed its mind, but right now I am trying to figure out how to sort it in "hierarchical" order.
Lev
May 14, 2009 - 12:48 pm UTC
it was purely an accident that it was sorted before.
select * from
( select Element_eid, Level as lvl <b>, rownum OC</b>
from P
start with Ancestor_eid=:1 and SortOrder=:2
connect by Ancestor_eid=prior Element_eid and SortOrder=2
) t, P1 pr, W1 wo
where pr.Project_eid (+) = t.Element_eid and
wo.Work_eid (+) = t.Element_eid<b>
order by OC ;</b>
A reader, May 14, 2009 - 2:25 pm UTC
Thanks Tom,
It works.
Lev
The same old story again
A reader, August 13, 2009 - 5:56 pm UTC
That's what I found in our application:
-- I had to include this index hint to get records returned based on time.
-- Using primary key for the index hint doesn't work.
-- Explicitly order by clause doesn't work either
-- This is done to get the next pending job
select /*+hint INDEX_ASC ( t t_idx ) */
....
from jobs T
where AssignedTo = 'NONE'
and RowNum < 2;
I think I can guess why
order by clause didn't work.
Somehow it is working.
I just don't know - should I fix it now or wait until it's broken?
Thanks,
Lev
August 24, 2009 - 7:16 am UTC
it is just waiting to break on you.
IT MUST HAVE ORDER BY else the records do not need to be ordered.
What if it decided to use the index to access table T but did so using an index fast full scan (read index "not sorted")
What if the index is later renamed to something more reasonable?
Unless and until there is an order by - this query is 100% wrong (according to the comments)
select ...
from (select ..... from jobs where assignedto='NONE' order by DATE_COLUMN)
where rownum < 2;
Helena Marková, September 09, 2009 - 3:01 am UTC
ORDER BY MONTH #, DISPLAY MONTH NAME
Abhijit, September 29, 2009 - 9:03 am UTC
Hi Tom,
Thanks for the explanation. This is what I have been trying to do. I have been trying to find out the number of objects created in a particular month.
This is my query.
select
owner, count(object_name), to_char(created,'MON') from dba_objects
where OWNER = 'SPACE'
AND to_char(created,'YYYY') = '2009'
GROUP by owner, to_char(created,'MON')
ORDER By to_char(created,'MON')
This orders the list in this way.
APR
AUG
FEB
JAN
MAR
SEP
If I want to change the order to
JAN
FEB
.
I have to change the above query to
select
owner, count(object_name), to_char(created,'MON'), to_char(created,'MM') from dba_objects
where OWNER = 'SPACE'
AND to_char(created,'YYYY') = '2009'
GROUP by owner, to_char(created,'MON'), to_char(created,'MM')
ORDER By to_char(created,'MM')
For now, my objective is being satisfied. This is giving me an extra column in a report.
I am curious to know if I can avoid outputting the
month # column and still get the list in the order I need.
October 07, 2009 - 6:50 am UTC
1 select owner, count(object_name), trunc(created,'mm')
2 from dba_objects
3 where owner = 'SCOTT'
4 and trunc(created,'y') = to_date( '01-01-2009', 'dd-mm-yyyy' )
5 group by owner, trunc(created,'mm')
6* order by owner, trunc(created,'mm')
ops$tkyte%ORA11GR2> /
OWNER COUNT(OBJECT_NAME) TRUNC(CRE
------------------------------ ------------------ ---------
SCOTT 6 01-AUG-09
SCOTT 1 01-OCT-09
2 rows selected.
leave dates as dates
A friend of mine at work solved this one.
Abhijit, September 30, 2009 - 9:36 am UTC
<code>
select
owner, count(object_name), to_char(created,'MON') from dba_objects
where OWNER = 'SPACE'
AND to_char(created,'YYYY') = '2009'
GROUP by owner, to_char(created,'MON'), to_char(created,'MM')
ORDER By to_char(created,'MM')
</code>
October 07, 2009 - 8:39 am UTC
they did it wrong.
Why do you hate my grades Tom?
A reader, December 03, 2015 - 7:57 pm UTC
I was really hoping that group by was also an order by....thank Tom....and thanks obama