Skip to Main Content
  • Questions
  • Order By clause on an indexed column makes query 10x slower

Breadcrumb

Question and Answer

Chris Saxon

Thanks for the question.

Asked: July 20, 2022 - 5:47 pm UTC

Last updated: July 22, 2022 - 12:29 pm UTC

Version: Oracle Database 19c Enterprise Edition Release 19.0.0.0.0

Viewed 100+ times

You Asked

We need to do pagination on a table having 10M+ rows. We are using the seek pagination method described at https://asktom.oracle.com/pls/apex/f?p=100:11:9519106226066:::::

Before doing seek pagination we were doing pagination using offset. so we had a query like

select * from table offset 0 rows fetch next 10000 rows only


this query completed in 8s

when we added a order by clause to fetch the first page of results

select * from table order by commit_time offset 0 rows fetch next 10000 rows only


the query time jumped to 80s!

commit_time is a column of type timestamp and we have indexed it like so:

CREATE INDEX "XXX" ON "XXX" ("COMMIT_TIME") 
  PCTFREE 10 INITRANS 2 MAXTRANS 255 
  STORAGE(
  BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
  TABLESPACE "XXX"  LOCAL
...


it seems like oracle is ordering the rows on the fly and not using any B-tree index. how can we speed up our query?

it is not necessary for us to order the results by commit_time. so we can drop the order by clause but if we don't order the results and paginate using offset e.g. using,

select * from table offset 0 rows fetch next 10000 rows only
select * from table offset 10000 rows fetch next 10000 rows only
select * from table offset 20000 rows fetch next 10000 rows only
select * from table offset 30000 rows fetch next 10000 rows only


and so on...

are we guaranteed that each row is returned once and only once?

and Chris said...

are we guaranteed that each row is returned once and only once?

No. Without an ORDER BY, the database can return rows in any way it wants. If you omit this it's possible you could fetch some rows many times and never access others.

oracle is ordering the rows on the fly and not using any B-tree index

Reading the rows in sorted order via an index will generally only help if you're fetching a (relatively) small number of rows.

For example, this gets 10,000 rows from a 1,000,000 row table (1%) yet uses a full table scan despite an index on the sort column:

create table t ( c1 not null, c2 not null, c3 not null ) as 
  select level c1, 
         date'2020-01-01' + dbms_random.value ( 1, level ) c2, 
         rpad ( 'stuff', 10, 'f' ) c3
  from   dual
  connect by level <= 1000000;
  
create index i 
  on t ( c2 );

set serveroutput off
alter session set statistics_level = all;

set feed only
select * from t
order  by c2
fetch first 10000 rows only;
set feed on

select * 
from   table(dbms_xplan.display_cursor(null, null, 'IOSTATS LAST +coST'));
/*
--------------------------------------------------------------------------------------------------------
| Id  | Operation                | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |      |      1 |        |  8130 (100)|  10000 |00:00:00.50 |    4021 |
|*  1 |  VIEW                    |      |      1 |  10000 |  8130   (1)|  10000 |00:00:00.50 |    4021 |
|*  2 |   WINDOW SORT PUSHED RANK|      |      1 |   1000K|  8130   (1)|  10000 |00:00:00.50 |    4021 |
|   3 |    TABLE ACCESS FULL     | T    |      1 |   1000K|  1115   (1)|   1000K|00:00:00.09 |    4021 |
--------------------------------------------------------------------------------------------------------
*/


If you force the query to read the rows via an index, it does more than twice the logical I/O - over 10,000 gets compared to just over 4,000:

set feed only

select /*+ index ( t ( c2 ) ) */* from t
order  by c2
fetch first 10000 rows only;

set feed on

select * 
from   table(dbms_xplan.display_cursor(null, null, 'IOSTATS LAST'));
/*
---------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |      |      1 |        |  10000 |00:00:00.08 |   10009 |     35 |
|*  1 |  VIEW                         |      |      1 |  10000 |  10000 |00:00:00.08 |   10009 |     35 |
|*  2 |   WINDOW NOSORT STOPKEY       |      |      1 |  10000 |  10000 |00:00:00.07 |   10009 |     35 |
|   3 |    TABLE ACCESS BY INDEX ROWID| T    |      1 |  10000 |  10000 |00:00:00.06 |   10009 |     35 |
|   4 |     INDEX FULL SCAN           | I    |      1 |   1000K|  10000 |00:00:00.02 |     128 |     35 |
---------------------------------------------------------------------------------------------------------
*/


So using an index could make your queries slower.

how can we speed up our query?

Make your fetch sizes smaller. Much smaller. I struggle to see the use case for a fetch size of 10,000. No person is reading that information. Even if this is for a (REST) API, that's still lots of data to send over the network.

Limiting the query to 10 rows instead of 10,000 and it now uses the index:

set feed only

select * from t
order  by c2
fetch first 10 rows only;

set feed on

select * 
from   table(dbms_xplan.display_cursor(null, null, 'IOSTATS LAST'));
/*
------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |      |      1 |        |     10 |00:00:00.01 |       8 |
|*  1 |  VIEW                         |      |      1 |     10 |     10 |00:00:00.01 |       8 |
|*  2 |   WINDOW NOSORT STOPKEY       |      |      1 |     10 |     10 |00:00:00.01 |       8 |
|   3 |    TABLE ACCESS BY INDEX ROWID| T    |      1 |   1000K|     10 |00:00:00.01 |       8 |
|   4 |     INDEX FULL SCAN           | I    |      1 |     10 |     10 |00:00:00.01 |       3 |
------------------------------------------------------------------------------------------------
*/


Combine this with the seek method and the queries should be efficient whichever page in the data set you fetch:

set feed only

select * from t
where  c2 > date'2022-07-21'
order  by c2
fetch first 10 rows only;

set feed on

select * 
from   table(dbms_xplan.display_cursor(null, null, 'IOSTATS LAST'));
/*
------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |      |      1 |        |     10 |00:00:00.01 |      13 |
|*  1 |  VIEW                         |      |      1 |     10 |     10 |00:00:00.01 |      13 |
|*  2 |   WINDOW NOSORT STOPKEY       |      |      1 |     12 |     10 |00:00:00.01 |      13 |
|   3 |    TABLE ACCESS BY INDEX ROWID| T    |      1 |     12 |     10 |00:00:00.01 |      13 |
|*  4 |     INDEX RANGE SCAN          | I    |      1 |        |     10 |00:00:00.01 |       3 |
------------------------------------------------------------------------------------------------
*/


Note - the link you've supplied doesn't point to any page! I'm guessing you're referring to this one https://asktom.oracle.com/pls/apex/asktom.search?tag=comparing-pagination-methods-offset-vs-seek

Rating

  (2 ratings)

Comments

A reader, July 22, 2022 - 12:58 am UTC

Hi Chris,

I fail to understand this: Reading the rows in sorted order via an index will generally only help if you're fetching a (relatively) small number of rows

If the database were using a B-tree index, it would use binary search to quickly seek the position at which the WHERE clause is satisfied. From there on, scanning next N rows should take same time as when we did not have the order by clause.

So I was hoping that when we added order by clause to our query, the extra time should be O(log n) which is the time to do binary search. This time cannot be 10x! 80s vs 8s that we are getting without order by.

What program are you using to show the code in your answer? is it sqlplus?
Chris Saxon
July 22, 2022 - 12:29 pm UTC

it would use binary search to quickly seek the position at which the WHERE clause is satisfied

That's only the first step. The database then has to go to the table to read the data!

Also in this query:

select /*+ index ( t ( c2 ) ) */* from t
order  by c2
fetch first 10000 rows only;


There is no WHERE clause!

The database starts reading the first entry in the index. The index only contains the value C2. So it has to go to the table to read the rest of the columns.

So the database walks through the index, but for each entry it's doing an additional I/O to fetch the row. The binary search is irrelevant, most of the work is looking up the rows in the table.

A full table scan can fetch many rows in one I/O operation.

That's why using the index is so much work (compared to full scanning the table & sorting that).

For more details on this, I suggest watching my videos going into more details on this:

https://www.youtube.com/playlist?list=PL78V83xV2fYlLA-bjMU2ZvUKQOZNrqLEa

What program are you using to show the code in your answer?

I run the examples in SQL Developer.

A reader, July 22, 2022 - 1:12 am UTC

also I am afraid I don't understand the code you have pasted in your answer. could you explain it?
Chris Saxon
July 22, 2022 - 12:29 pm UTC

What exactly are you struggling with?

More to Explore

SQL

The Oracle documentation contains a complete SQL reference.