Skip to Main Content
  • Questions
  • Select statement sort without order by clause

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, Steve.

Asked: March 07, 2013 - 7:01 am UTC

Answered by: Tom Kyte - Last updated: December 13, 2016 - 1:44 am UTC

Category: Database - Version: 11.2.1

Viewed 10K+ times! This question is

You Asked

Tom,
I am looking in the documentation where it states oracle doesn't guarantee the order of a select statement without an order by clause. I'm having trouble finding in to prove it to the higher ups. I know its a shot in the dark but do you know off hand where that is?

Thanks

and we said...

http://docs.oracle.com/cd/E11882_01/server.112/e26088/statements_10002.htm#SQLRF20039

<quote>
Use the ORDER BY clause to order rows returned by the statement. Without an order_by_clause, no guarantee exists that the same query executed more than once will retrieve rows in the same order.
</quote>



You can read this also

http://asktom.oracle.com/Misc/order-in-court.html

Or you can just show them a single example, that is all it takes. A *single* example.

I'm assuming this is about group by. Just show them:


ops$tkyte%ORA11GR2> select deptno, count(*) from scott.emp group by deptno;

    DEPTNO   COUNT(*)
---------- ----------
        30          6
        20          5
        10          3

ops$tkyte%ORA11GR2> 


here is one reference:

http://docs.oracle.com/cd/E11882_01/server.112/e26088/statements_10002.htm#SQLRF20038

<quote>
The GROUP BY clause groups rows but does not guarantee the order of the result set. To order the groupings, use the ORDER BY clause.

</quote>


and you rated our response

  (18 ratings)

Is this answer out of date? If it is, please let us know via a Review

Reviews

it all depends ...

March 08, 2013 - 3:45 am UTC

Reviewer: Sokrates

Without an order_by_clause, no guarantee exists that the same query executed more than once will retrieve rows in the same order.

It all depends.
Here, it depends on the query.
Of course, there are queries, that guarantee to retrieve the rows in the same order every time they are executed, though they have no order_by_clause.

Trivial examples - there also are nontrivial ones - are queries that guarantee to retrieve at most one row. On sets with cardinality less than or equal to 1 there exists only one ordering.
Tom Kyte

Followup  

March 11, 2013 - 8:11 am UTC

give a non-trivial one.


the one row one doesn't even really begin to count in my opinion.


but give us a non-trivial example please.

March 08, 2013 - 10:18 am UTC

Reviewer: Chuck Jolley from OKC, OK USA

There are NO GUARANTEES.
They may always return rows in the same order, but Oracle makes no GUARANTEE that they will.

documentation bug then ?

March 08, 2013 - 1:34 pm UTC

Reviewer: Sokrates

http://docs.oracle.com/cd/E11882_01/server.112/e26088/queries003.htm#i2053935
discusses hierarchical queries and also the order of the rows returned by them in absence of an order-by-clause.

From what is written there, it follows especially that a hierarchical query with no siblings ( one row per level ) by guarantee returns the rows sorted by level in case no order-by-clause is specified.

So, for example, it is guaranteed, that
select level from dual connect by level <= 2

will always return 1 and then 2 in this order.
Tom Kyte

Followup  

March 11, 2013 - 8:18 am UTC

you have the special case of a bunch of one row queries.

connect by returns hierarchies in a certain order, but the rows within each level are not ordered in any sense.

so, you have that trivial one row example yet again. Each level is unsorted.



March 08, 2013 - 8:02 pm UTC

Reviewer: Chuck Jolley from OKC, OK USA

Where does it say that?
Admittedly, i only scanned it, but I don't see anything like that.
Looks to me like it says no order by, no guaranteed sorting.

And I think it would just be good policy to pretend that that's what it says anyway. Why not just ask for what you want to begin with? What other subject in Oracle draws so many people to the idea of getting accidentally correct results?
Tom Kyte

Followup  

March 11, 2013 - 8:24 am UTC

hiearcharies are returned in order, but data within each level is not - unless you use order siblings by.


In short:

no order by

NO GUARANTEES on the order of rows, period.


yes, if you have a one row result set/level - it will not be sorted - but it doesn't matter since there is only one row. it is not useful to even consider that, it doesn't make sense to consider order when you have a singleton. the concept doesn't even exist.

order of rows in outter table in nested loop join

March 09, 2013 - 4:02 am UTC

Reviewer: reader

Hi,
What about following SQL that uses nested loop to join tables:

select /*+ USE_NL(a b) NO_MERGE(a) */a.id,b.desc
from (select id from table1 order by id desc) a, table2 b
where a.id = b.id

Can we assume that the result set is sorted by a.id?

Thanks.
Tom Kyte

Followup  

March 11, 2013 - 8:33 am UTC

nope


not until you put an order by on there.


what if I drop an index so that suddenly use_nl is meaningless.

furthermore the optimmizer is free to recognize that order by id desc, as it is used, is meaningless and can optimize it away. It as no meaning and does not need to be preserved unless there was a ROWNUM construct on the outer query (and no, just adding "where rownum > 1" isn't sufficient either.


if you want data sorted


YOU USE ORDER BY ON THE RESULT SET.

If the order by can be skipped by the optimizer because it used an index, recognizes the data is already sorted, whatever - fine, but YOU CANNOT SKIP THE ORDER BY.

Only the optimizer can.



Using hints to get sorted data is one of the stupidest things you can do - ever.

you would only be asking for a world of hurt.

to reader

March 09, 2013 - 5:27 pm UTC

Reviewer: Sokrates

no

trivial

March 11, 2013 - 9:34 am UTC

Reviewer: Sokrates

agree, all counterexamples are trivial

Just do it!

March 11, 2013 - 1:49 pm UTC

Reviewer: Jeff

Tom,
I've been following your threads on this topic for years.

Can someone explain to me WHY there is such resistance to just putting the [redacted] ORDER BY clause on the SELECT statement?

People act like there's something offensive about it!

Nobody complains about a WHERE clause or a GROUP BY clause, why do they have such a problem with the ORDER BY?


Tom Kyte

Followup  

March 12, 2013 - 8:16 am UTC

Here is my thought on the subject:

they think it will run faster. they think that if they find a magic way to get the data sorted without using order by, the statement will magically run faster.




the problem with that is:

the optimizer already does that, if it doesn't have to sort, it won't. If it does - it will. An order by does not mean that it will perform a sort - if the data is retrieved in sorted order. However, the day the data isn't retrieved sorted - then the data WILL BE sorted by the database and you'll get the right answer (but the "magic" way will have stopped working).



Here are cases I've seen it over and over:

"we use group by to sort" - no, no you don't. In the past, group by sometimes did a "sort group by" - but even that wasn't "sorted". It was a binary sort - not a character set sort - so depending on your character set - you were probably getting slightly wrong data. IF the binary sort was good enough to sort - the optimizer would SKIP the sort anyway. If the binary sort wasn't - it would sort it for you.

"we use an index, so it is OK - it comes back sorted" - no, no it doesn't. character set issues once again - and what if one day the index IS NOT USED? what then?

"we use an index hint so it will always use that index" - no, no it won't. How about someone "accidentally drops" your table (remember, you are in a shop where they think they don't need order by - I'll bet tables are accidentally dropped every week). So you "undrop it" with flashback table - only that didn't rename the index - so the index name is different. Whoops.


"we have observed that the data always gets returned sorted, every time we run it" - but that doesn't mean it will continue to do so in the future does it. No, it does not.


and so on.

It is pathetic in my opinion. I don't get it either. I've had people get very upset that "group by" didn't return my data sorted all of a sudden. The fact is that group by NEVER had to return the data sorted, a simple change in plan is all it ever took (from AT LEAST version 5 on, seriously).


IF you want data sorted THEN you will use order by, period, end of discussion.

Bug tickets

March 11, 2013 - 2:17 pm UTC

Reviewer: Charlie B. from Fairfax, VA USA

My guess is that developers find out about this the hard way. If they can get their production folks to flip a magic init.ora switch and make it all better, then they've dodged getting a bug ticket.

I can't say that I know anyone who got dinged on a review based on the number of bugs in their code. But I wouldn't be surprised to hear that it happens.

March 12, 2013 - 7:34 am UTC

Reviewer: David Aldridge

Hmmm, I would have thought that if your software vendor states that their system does not guarantee X unless you do Y, then there is no arguable case to the contrary.

Unless the higher-ups have got a secret document from Oracle telling them to ignore that statement, and specifically guaranteeing X regardless of Y, then all they have is an opinion.

There's really nothing to prove.

a doubt

March 12, 2013 - 8:57 am UTC

Reviewer: Sokrates

the optimizer already does that, if it doesn't have to sort, it won't. If it does - it will.

I doubt that.

Take the query
select * from big_table order by dbms_random.value()


In my Oracle version, the optimizer will do a sort.
But, as you stated in
http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:3181424400346795479#3192817700346734242
, there is no need to perform a sort here.
Tom Kyte

Followup  

March 12, 2013 - 9:30 am UTC

come on, be realistic.

You know that currently - it is assigning dbms_random.value() to each row and then sorting.

You know that according to sql - it could change its mind. We could decide one day that we will just assume all functions are statement level deterministic. We currently don't, but we could

and if that was ever to happen - then the statement would become:
select * from big_table order by CONSTANT;

which becomes
select * from big_table


right now, it *doesn't work that way* so right now the sort cannnot be removed.
the sort *cannot* be removed right now.



this has nothing to do with anything we are discussing here, please don't confuse this issue. This is important - you want sorted data, use order by, nothing else works. nothing.

March 12, 2013 - 12:41 pm UTC

Reviewer: Chuck Jolley from OKC, OK USA

Look at the original question.
He has to prove that he needs to ask for a sort to be assured of getting one?
Could it be that DBAs from MANY years ago are now managers and think that is the way it works? I can remember sometimes counting on sorted group by when I first started 15 years ago. There was a kind of "sorting is evil" school of thought back then.


You get it from all camps!

March 12, 2013 - 2:05 pm UTC

Reviewer: Galen Boyer from BOSTON

In a thread about correlated subqueries, some guy is being all preachy
about Oracle not being "relational". Well of course it isn't 100%
perfectly relational, because it, like all other database vendors,
supports SQL. But, the underpinnings of Oracle and all other database
vendors are very much relational, for example, the fact that you must
put an order by on your SQL if you want an ordered result. In this
functionality, Oracle is following exactly what set theory states, the
elements in a set have no order. But, now you have to argue with
people that do not want relational! Jeez...

can we depend on an idnex_asc or indesc_desc hint?

October 26, 2016 - 8:05 pm UTC

Reviewer: Naresh Bhandare from GA, USA

Hello,

Can we depend on the order of rows returned with an index_asc or index_des hint?

e.g.
with a table "ranges", having a column "range_start",
and an index on ranges(range_start)

to get the max value of range_start, if we write:

select /*+ index_desc(r (range_start)) */ range_start
from ranges r where range <= '777627365'
and rownum = 1

will that work?

Connor McDonald

Followup  

October 27, 2016 - 12:28 am UTC

No, you cannot depend on that .... ever !

When I

- drop that index
- rename that index
- mark it unusable
- change the statistics

any or all of these things could mean that hint is no longer applicable...and you will NOT get an error. You will just get a random row.

Doing "select Max(...) from ..." will use an index if appropriate to 'walk down' the 'right hand side' of the index, so that will be very fast *AND* if you drop that index, it would slow down but STILL be correct.

about the previos question

October 26, 2016 - 10:20 pm UTC

Reviewer: Naresh from GA, US

in the previous question,

"max value of range_start"

should be

"max value of range_start <= 'an in put value' "

Thank You

October 30, 2016 - 10:57 am UTC

Reviewer: Naresh from GA, US

Thank You for the response above - some great points there.

I will check the option of using the max.
Connor McDonald

Followup  

October 31, 2016 - 3:34 am UTC

glad we could help

Sometimes I think about ordering without ORDER BY

December 09, 2016 - 1:16 pm UTC

Reviewer: Arsenij from Ukraine

I think about ordering by execution plan, without explicit ORDER BY, when I see "ORA-22813: operand value exceeds system limits".
Because of its explanation:
*Cause: Object or Collection value was too large. The size of the value
might have exceeded 30k in a SORT context, or the size might be
too big for available memory.

For SDO_GEOMETRY it is common to be more than 30k. That's why we do not have officially supported way of sorting for queries that may contain such large objects or collections, do we?
However, most queries do not show this error even with geometries larger than 30k. Why?

Example for getting error:
create table table_geom (unid number, geom MD_SYS.SDO_GEOMETRY);

insert into table_geom 
select 1,
      SDO_GEOMETRY(2002,8307,NULL,SDO_ELEM_INFO_ARRAY(1,2,1),
                   cast(multiset(select coord 
                                 from
                                     (
                                      select mod(OBJECT_ID,100) coord from all_objects where rownum < 100000
                                     union all
                                      select mod(OBJECT_ID,100) coord from all_objects where rownum < 100000
                                     ) ) as mdsys.sdo_ordinate_array)
                  ) as geom 
from dual;

select /*+ PARALLEL */
unid, geom 
from table_geom 
order by unid


ORA-22813: operand value exceeds system limits
22813. 00000 -  "operand value exceeds system limits"
*Cause:    Object or Collection value was too large. The size of the value
           might have exceeded 30k in a SORT context, or the size might be
           too big for available memory.
*Action:   Choose another value and retry the operation.

Connor McDonald

Followup  

December 12, 2016 - 1:47 am UTC

Patch 16882070 might be relevant here - check with Support.

December 12, 2016 - 4:28 pm UTC

Reviewer: John Keymer

Hi,
more out interest than anything else...

With regards to being unable to determine how many times the function is going to be called, I assume that applies when the function is used as part of a predicate, order by, or in the select list itself?
So I shouldn't be surprised if the following code produces 1000 rows containing the same value?

Select dbms_random.value(1,1000) d From Dual Connect By Level <= 1000;


It doesn't incidentally (on my 11.2.0.3 instance I have access to here), although I can easily demonstrate Scalar Subquery Caching with it by doing

Select (Select dbms_random.value(1,1000) From dual) d From Dual Connect By Level <= 1000;


That aside, if I wanted to generate a evenly distributed list of 1000 numbers between 1 and 1000, would I theoretically need to push something into the call to dbms_random.value such as rowid or Level? And if I did that, how could I ensure it's not just optimized out?
I.e. would the optimizer realize that Mod(Level,Level)=1 is a tautology and optimize it out to a constant, thus negating the intentions completely?

Select dbms_random.value(Mod(Level,Level),1000) d From Dual Connect By Level <= 1000;


Just for experimental purposes I tried this:

Select (Select dbms_random.value(1,case when z.rowid is not null then 1000 end) From dual) d From Dual z Connect By Level <= 1000;


thinking it might force the function to get called 1000 times, however I'm assuming the optimizer is clever enough to know that rowid is never going to be null without an outer joined table, and optimizes that out to 1=1, thus enabling SSqC?
Connor McDonald

Followup  

December 13, 2016 - 1:44 am UTC

Putting scalar subquery caching aside, to my knowledge, we can only avoid function execution if the function is deterministic or we've been fiddling with result cache.

So unless we change the definition of dbms_random to be deterministic, then I think you're safe, because scalar subquery caching and result cache are explicit steps taken by the author of the SQL.