We need to do pagination on a table having 10M+ rows. We are using the seek pagination method described at
https://asktom.oracle.com/pls/apex/f?p=100:11:9519106226066::::: Before doing seek pagination we were doing pagination using offset. so we had a query like
select * from table offset 0 rows fetch next 10000 rows only
this query completed in 8s
when we added a order by clause to fetch the first page of results
select * from table order by commit_time offset 0 rows fetch next 10000 rows only
the query time jumped to 80s!
commit_time is a column of type timestamp and we have indexed it like so:
CREATE INDEX "XXX" ON "XXX" ("COMMIT_TIME")
PCTFREE 10 INITRANS 2 MAXTRANS 255
STORAGE(
BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
TABLESPACE "XXX" LOCAL
...
it seems like oracle is ordering the rows on the fly and not using any B-tree index. how can we speed up our query?
it is not necessary for us to order the results by commit_time. so we can drop the order by clause but if we don't order the results and paginate using offset e.g. using,
select * from table offset 0 rows fetch next 10000 rows only
select * from table offset 10000 rows fetch next 10000 rows only
select * from table offset 20000 rows fetch next 10000 rows only
select * from table offset 30000 rows fetch next 10000 rows only
and so on...
are we guaranteed that each row is returned once and only once?
are we guaranteed that each row is returned once and only once? No. Without an ORDER BY, the database can return rows in any way it wants. If you omit this it's possible you could fetch some rows many times and never access others.
oracle is ordering the rows on the fly and not using any B-tree indexReading the rows in sorted order via an index will generally only help if you're fetching a (relatively) small number of rows.
For example, this gets 10,000 rows from a 1,000,000 row table (1%) yet uses a full table scan despite an index on the sort column:
create table t ( c1 not null, c2 not null, c3 not null ) as
select level c1,
date'2020-01-01' + dbms_random.value ( 1, level ) c2,
rpad ( 'stuff', 10, 'f' ) c3
from dual
connect by level <= 1000000;
create index i
on t ( c2 );
set serveroutput off
alter session set statistics_level = all;
set feed only
select * from t
order by c2
fetch first 10000 rows only;
set feed on
select *
from table(dbms_xplan.display_cursor(null, null, 'IOSTATS LAST +coST'));
/*
--------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
--------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 8130 (100)| 10000 |00:00:00.50 | 4021 |
|* 1 | VIEW | | 1 | 10000 | 8130 (1)| 10000 |00:00:00.50 | 4021 |
|* 2 | WINDOW SORT PUSHED RANK| | 1 | 1000K| 8130 (1)| 10000 |00:00:00.50 | 4021 |
| 3 | TABLE ACCESS FULL | T | 1 | 1000K| 1115 (1)| 1000K|00:00:00.09 | 4021 |
--------------------------------------------------------------------------------------------------------
*/
If you force the query to read the rows via an index, it does more than twice the logical I/O - over 10,000 gets compared to just over 4,000:
set feed only
select /*+ index ( t ( c2 ) ) */* from t
order by c2
fetch first 10000 rows only;
set feed on
select *
from table(dbms_xplan.display_cursor(null, null, 'IOSTATS LAST'));
/*
---------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads |
---------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 10000 |00:00:00.08 | 10009 | 35 |
|* 1 | VIEW | | 1 | 10000 | 10000 |00:00:00.08 | 10009 | 35 |
|* 2 | WINDOW NOSORT STOPKEY | | 1 | 10000 | 10000 |00:00:00.07 | 10009 | 35 |
| 3 | TABLE ACCESS BY INDEX ROWID| T | 1 | 10000 | 10000 |00:00:00.06 | 10009 | 35 |
| 4 | INDEX FULL SCAN | I | 1 | 1000K| 10000 |00:00:00.02 | 128 | 35 |
---------------------------------------------------------------------------------------------------------
*/
So using an index could make your queries
slower.
how can we speed up our query?Make your fetch sizes smaller. Much smaller. I struggle to see the use case for a fetch size of 10,000. No person is reading that information. Even if this is for a (REST) API, that's still lots of data to send over the network.
Limiting the query to 10 rows instead of 10,000 and it now uses the index:
set feed only
select * from t
order by c2
fetch first 10 rows only;
set feed on
select *
from table(dbms_xplan.display_cursor(null, null, 'IOSTATS LAST'));
/*
------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 10 |00:00:00.01 | 8 |
|* 1 | VIEW | | 1 | 10 | 10 |00:00:00.01 | 8 |
|* 2 | WINDOW NOSORT STOPKEY | | 1 | 10 | 10 |00:00:00.01 | 8 |
| 3 | TABLE ACCESS BY INDEX ROWID| T | 1 | 1000K| 10 |00:00:00.01 | 8 |
| 4 | INDEX FULL SCAN | I | 1 | 10 | 10 |00:00:00.01 | 3 |
------------------------------------------------------------------------------------------------
*/
Combine this with the seek method and the queries should be efficient whichever page in the data set you fetch:
set feed only
select * from t
where c2 > date'2022-07-21'
order by c2
fetch first 10 rows only;
set feed on
select *
from table(dbms_xplan.display_cursor(null, null, 'IOSTATS LAST'));
/*
------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 10 |00:00:00.01 | 13 |
|* 1 | VIEW | | 1 | 10 | 10 |00:00:00.01 | 13 |
|* 2 | WINDOW NOSORT STOPKEY | | 1 | 12 | 10 |00:00:00.01 | 13 |
| 3 | TABLE ACCESS BY INDEX ROWID| T | 1 | 12 | 10 |00:00:00.01 | 13 |
|* 4 | INDEX RANGE SCAN | I | 1 | | 10 |00:00:00.01 | 3 |
------------------------------------------------------------------------------------------------
*/
Note - the link you've supplied doesn't point to any page! I'm guessing you're referring to this one
https://asktom.oracle.com/pls/apex/asktom.search?tag=comparing-pagination-methods-offset-vs-seek