Skip to Main Content
  • Questions
  • Select statement sort without order by clause

Breadcrumb

Question and Answer

Connor McDonald

Thanks for the question, Steve.

Asked: March 07, 2013 - 7:01 am UTC

Last updated: December 13, 2016 - 1:44 am UTC

Version: 11.2.1

Viewed 50K+ times! This question is

You Asked

Tom,
I am looking in the documentation where it states oracle doesn't guarantee the order of a select statement without an order by clause. I'm having trouble finding in to prove it to the higher ups. I know its a shot in the dark but do you know off hand where that is?

Thanks

and Tom said...

http://docs.oracle.com/cd/E11882_01/server.112/e26088/statements_10002.htm#SQLRF20039

<quote>
Use the ORDER BY clause to order rows returned by the statement. Without an order_by_clause, no guarantee exists that the same query executed more than once will retrieve rows in the same order.
</quote>



You can read this also

http://asktom.oracle.com/Misc/order-in-court.html

Or you can just show them a single example, that is all it takes. A *single* example.

I'm assuming this is about group by. Just show them:


ops$tkyte%ORA11GR2> select deptno, count(*) from scott.emp group by deptno;

    DEPTNO   COUNT(*)
---------- ----------
        30          6
        20          5
        10          3

ops$tkyte%ORA11GR2> 


here is one reference:

http://docs.oracle.com/cd/E11882_01/server.112/e26088/statements_10002.htm#SQLRF20038

<quote>
The GROUP BY clause groups rows but does not guarantee the order of the result set. To order the groupings, use the ORDER BY clause.

</quote>


Rating

  (18 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

it all depends ...

Sokrates, March 08, 2013 - 3:45 am UTC

Without an order_by_clause, no guarantee exists that the same query executed more than once will retrieve rows in the same order.

It all depends.
Here, it depends on the query.
Of course, there are queries, that guarantee to retrieve the rows in the same order every time they are executed, though they have no order_by_clause.

Trivial examples - there also are nontrivial ones - are queries that guarantee to retrieve at most one row. On sets with cardinality less than or equal to 1 there exists only one ordering.
Tom Kyte
March 11, 2013 - 8:11 am UTC

give a non-trivial one.


the one row one doesn't even really begin to count in my opinion.


but give us a non-trivial example please.

Chuck Jolley, March 08, 2013 - 10:18 am UTC

There are NO GUARANTEES.
They may always return rows in the same order, but Oracle makes no GUARANTEE that they will.

documentation bug then ?

Sokrates, March 08, 2013 - 1:34 pm UTC

http://docs.oracle.com/cd/E11882_01/server.112/e26088/queries003.htm#i2053935
discusses hierarchical queries and also the order of the rows returned by them in absence of an order-by-clause.

From what is written there, it follows especially that a hierarchical query with no siblings ( one row per level ) by guarantee returns the rows sorted by level in case no order-by-clause is specified.

So, for example, it is guaranteed, that
select level from dual connect by level <= 2

will always return 1 and then 2 in this order.
Tom Kyte
March 11, 2013 - 8:18 am UTC

you have the special case of a bunch of one row queries.

connect by returns hierarchies in a certain order, but the rows within each level are not ordered in any sense.

so, you have that trivial one row example yet again. Each level is unsorted.



Chuck Jolley, March 08, 2013 - 8:02 pm UTC

Where does it say that?
Admittedly, i only scanned it, but I don't see anything like that.
Looks to me like it says no order by, no guaranteed sorting.

And I think it would just be good policy to pretend that that's what it says anyway. Why not just ask for what you want to begin with? What other subject in Oracle draws so many people to the idea of getting accidentally correct results?
Tom Kyte
March 11, 2013 - 8:24 am UTC

hiearcharies are returned in order, but data within each level is not - unless you use order siblings by.


In short:

no order by

NO GUARANTEES on the order of rows, period.


yes, if you have a one row result set/level - it will not be sorted - but it doesn't matter since there is only one row. it is not useful to even consider that, it doesn't make sense to consider order when you have a singleton. the concept doesn't even exist.

order of rows in outter table in nested loop join

reader, March 09, 2013 - 4:02 am UTC

Hi,
What about following SQL that uses nested loop to join tables:

select /*+ USE_NL(a b) NO_MERGE(a) */a.id,b.desc
from (select id from table1 order by id desc) a, table2 b
where a.id = b.id

Can we assume that the result set is sorted by a.id?

Thanks.
Tom Kyte
March 11, 2013 - 8:33 am UTC

nope


not until you put an order by on there.


what if I drop an index so that suddenly use_nl is meaningless.

furthermore the optimmizer is free to recognize that order by id desc, as it is used, is meaningless and can optimize it away. It as no meaning and does not need to be preserved unless there was a ROWNUM construct on the outer query (and no, just adding "where rownum > 1" isn't sufficient either.


if you want data sorted


YOU USE ORDER BY ON THE RESULT SET.

If the order by can be skipped by the optimizer because it used an index, recognizes the data is already sorted, whatever - fine, but YOU CANNOT SKIP THE ORDER BY.

Only the optimizer can.



Using hints to get sorted data is one of the stupidest things you can do - ever.

you would only be asking for a world of hurt.

to reader

Sokrates, March 09, 2013 - 5:27 pm UTC

no

trivial

Sokrates, March 11, 2013 - 9:34 am UTC

agree, all counterexamples are trivial

Just do it!

Jeff, March 11, 2013 - 1:49 pm UTC

Tom,
I've been following your threads on this topic for years.

Can someone explain to me WHY there is such resistance to just putting the [redacted] ORDER BY clause on the SELECT statement?

People act like there's something offensive about it!

Nobody complains about a WHERE clause or a GROUP BY clause, why do they have such a problem with the ORDER BY?


Tom Kyte
March 12, 2013 - 8:16 am UTC

Here is my thought on the subject:

they think it will run faster. they think that if they find a magic way to get the data sorted without using order by, the statement will magically run faster.




the problem with that is:

the optimizer already does that, if it doesn't have to sort, it won't. If it does - it will. An order by does not mean that it will perform a sort - if the data is retrieved in sorted order. However, the day the data isn't retrieved sorted - then the data WILL BE sorted by the database and you'll get the right answer (but the "magic" way will have stopped working).



Here are cases I've seen it over and over:

"we use group by to sort" - no, no you don't. In the past, group by sometimes did a "sort group by" - but even that wasn't "sorted". It was a binary sort - not a character set sort - so depending on your character set - you were probably getting slightly wrong data. IF the binary sort was good enough to sort - the optimizer would SKIP the sort anyway. If the binary sort wasn't - it would sort it for you.

"we use an index, so it is OK - it comes back sorted" - no, no it doesn't. character set issues once again - and what if one day the index IS NOT USED? what then?

"we use an index hint so it will always use that index" - no, no it won't. How about someone "accidentally drops" your table (remember, you are in a shop where they think they don't need order by - I'll bet tables are accidentally dropped every week). So you "undrop it" with flashback table - only that didn't rename the index - so the index name is different. Whoops.


"we have observed that the data always gets returned sorted, every time we run it" - but that doesn't mean it will continue to do so in the future does it. No, it does not.


and so on.

It is pathetic in my opinion. I don't get it either. I've had people get very upset that "group by" didn't return my data sorted all of a sudden. The fact is that group by NEVER had to return the data sorted, a simple change in plan is all it ever took (from AT LEAST version 5 on, seriously).


IF you want data sorted THEN you will use order by, period, end of discussion.

Bug tickets

Charlie B., March 11, 2013 - 2:17 pm UTC

My guess is that developers find out about this the hard way. If they can get their production folks to flip a magic init.ora switch and make it all better, then they've dodged getting a bug ticket.

I can't say that I know anyone who got dinged on a review based on the number of bugs in their code. But I wouldn't be surprised to hear that it happens.

David Aldridge, March 12, 2013 - 7:34 am UTC

Hmmm, I would have thought that if your software vendor states that their system does not guarantee X unless you do Y, then there is no arguable case to the contrary.

Unless the higher-ups have got a secret document from Oracle telling them to ignore that statement, and specifically guaranteeing X regardless of Y, then all they have is an opinion.

There's really nothing to prove.

a doubt

Sokrates, March 12, 2013 - 8:57 am UTC

the optimizer already does that, if it doesn't have to sort, it won't. If it does - it will.

I doubt that.

Take the query
select * from big_table order by dbms_random.value()


In my Oracle version, the optimizer will do a sort.
But, as you stated in
http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:3181424400346795479#3192817700346734242
, there is no need to perform a sort here.
Tom Kyte
March 12, 2013 - 9:30 am UTC

come on, be realistic.

You know that currently - it is assigning dbms_random.value() to each row and then sorting.

You know that according to sql - it could change its mind. We could decide one day that we will just assume all functions are statement level deterministic. We currently don't, but we could

and if that was ever to happen - then the statement would become:
select * from big_table order by CONSTANT;

which becomes
select * from big_table


right now, it *doesn't work that way* so right now the sort cannnot be removed.
the sort *cannot* be removed right now.



this has nothing to do with anything we are discussing here, please don't confuse this issue. This is important - you want sorted data, use order by, nothing else works. nothing.

Chuck Jolley, March 12, 2013 - 12:41 pm UTC

Look at the original question.
He has to prove that he needs to ask for a sort to be assured of getting one?
Could it be that DBAs from MANY years ago are now managers and think that is the way it works? I can remember sometimes counting on sorted group by when I first started 15 years ago. There was a kind of "sorting is evil" school of thought back then.


You get it from all camps!

Galen Boyer, March 12, 2013 - 2:05 pm UTC

In a thread about correlated subqueries, some guy is being all preachy
about Oracle not being "relational". Well of course it isn't 100%
perfectly relational, because it, like all other database vendors,
supports SQL. But, the underpinnings of Oracle and all other database
vendors are very much relational, for example, the fact that you must
put an order by on your SQL if you want an ordered result. In this
functionality, Oracle is following exactly what set theory states, the
elements in a set have no order. But, now you have to argue with
people that do not want relational! Jeez...

can we depend on an idnex_asc or indesc_desc hint?

Naresh Bhandare, October 26, 2016 - 8:05 pm UTC

Hello,

Can we depend on the order of rows returned with an index_asc or index_des hint?

e.g.
with a table "ranges", having a column "range_start",
and an index on ranges(range_start)

to get the max value of range_start, if we write:

select /*+ index_desc(r (range_start)) */ range_start
from ranges r where range <= '777627365'
and rownum = 1

will that work?

Connor McDonald
October 27, 2016 - 12:28 am UTC

No, you cannot depend on that .... ever !

When I

- drop that index
- rename that index
- mark it unusable
- change the statistics

any or all of these things could mean that hint is no longer applicable...and you will NOT get an error. You will just get a random row.

Doing "select Max(...) from ..." will use an index if appropriate to 'walk down' the 'right hand side' of the index, so that will be very fast *AND* if you drop that index, it would slow down but STILL be correct.

about the previos question

Naresh, October 26, 2016 - 10:20 pm UTC

in the previous question,

"max value of range_start"

should be

"max value of range_start <= 'an in put value' "

Thank You

Naresh, October 30, 2016 - 10:57 am UTC

Thank You for the response above - some great points there.

I will check the option of using the max.
Connor McDonald
October 31, 2016 - 3:34 am UTC

glad we could help

Sometimes I think about ordering without ORDER BY

Arsenij, December 09, 2016 - 1:16 pm UTC

I think about ordering by execution plan, without explicit ORDER BY, when I see "ORA-22813: operand value exceeds system limits".
Because of its explanation:
*Cause: Object or Collection value was too large. The size of the value
might have exceeded 30k in a SORT context, or the size might be
too big for available memory.

For SDO_GEOMETRY it is common to be more than 30k. That's why we do not have officially supported way of sorting for queries that may contain such large objects or collections, do we?
However, most queries do not show this error even with geometries larger than 30k. Why?

Example for getting error:
create table table_geom (unid number, geom MD_SYS.SDO_GEOMETRY);

insert into table_geom 
select 1,
      SDO_GEOMETRY(2002,8307,NULL,SDO_ELEM_INFO_ARRAY(1,2,1),
                   cast(multiset(select coord 
                                 from
                                     (
                                      select mod(OBJECT_ID,100) coord from all_objects where rownum < 100000
                                     union all
                                      select mod(OBJECT_ID,100) coord from all_objects where rownum < 100000
                                     ) ) as mdsys.sdo_ordinate_array)
                  ) as geom 
from dual;

select /*+ PARALLEL */
unid, geom 
from table_geom 
order by unid


ORA-22813: operand value exceeds system limits
22813. 00000 -  "operand value exceeds system limits"
*Cause:    Object or Collection value was too large. The size of the value
           might have exceeded 30k in a SORT context, or the size might be
           too big for available memory.
*Action:   Choose another value and retry the operation.

Connor McDonald
December 12, 2016 - 1:47 am UTC

Patch 16882070 might be relevant here - check with Support.

John Keymer, December 12, 2016 - 4:28 pm UTC

Hi,
more out interest than anything else...

With regards to being unable to determine how many times the function is going to be called, I assume that applies when the function is used as part of a predicate, order by, or in the select list itself?
So I shouldn't be surprised if the following code produces 1000 rows containing the same value?

Select dbms_random.value(1,1000) d From Dual Connect By Level <= 1000;


It doesn't incidentally (on my 11.2.0.3 instance I have access to here), although I can easily demonstrate Scalar Subquery Caching with it by doing

Select (Select dbms_random.value(1,1000) From dual) d From Dual Connect By Level <= 1000;


That aside, if I wanted to generate a evenly distributed list of 1000 numbers between 1 and 1000, would I theoretically need to push something into the call to dbms_random.value such as rowid or Level? And if I did that, how could I ensure it's not just optimized out?
I.e. would the optimizer realize that Mod(Level,Level)=1 is a tautology and optimize it out to a constant, thus negating the intentions completely?

Select dbms_random.value(Mod(Level,Level),1000) d From Dual Connect By Level <= 1000;


Just for experimental purposes I tried this:

Select (Select dbms_random.value(1,case when z.rowid is not null then 1000 end) From dual) d From Dual z Connect By Level <= 1000;


thinking it might force the function to get called 1000 times, however I'm assuming the optimizer is clever enough to know that rowid is never going to be null without an outer joined table, and optimizes that out to 1=1, thus enabling SSqC?
Connor McDonald
December 13, 2016 - 1:44 am UTC

Putting scalar subquery caching aside, to my knowledge, we can only avoid function execution if the function is deterministic or we've been fiddling with result cache.

So unless we change the definition of dbms_random to be deterministic, then I think you're safe, because scalar subquery caching and result cache are explicit steps taken by the author of the SQL.