A little background - the optimizer can optimize for one of two goals:
a) initial response time, get the first row as fast as possible.
b) total throughput, get the last row as fast as possible.
So, say for example - you have a query:
select * from big_table where some_condition order by something_not_null_and_indexed;
The way to get the first row as fast as possible might be best achieved by using the index to read individual rows from the table in order and apply the where clause. That will use row by row processing.
The way to get to the last row as fast as possible might be best achieved by using a full scan to read the entire table, sort it and start returning.
The index would probably get me my first record much faster than the full scan would. The full scan, using multiblock IO's and big bulky things would likely return me my last record as fast as possible.
I've set up a table with 1,000,000 rows in it - a copy of all-objects over and over - with an ID column populated by a sequence (and it is the primary key). if I run:
ops$tkyte%ORA11GR2> declare
2 cursor c1 is select /*+ first_rows(25) */ * from big_table.big_table where owner = 'SYS' order by id;
3 cursor c2 is select /*+ all_rows */ * from big_table.big_table where owner = 'SYS' order by id;
4
5 type array is table of c1%rowtype;
6 l_data array;
7 begin
8 open c1;
9 fetch c1 bulk collect into l_data LIMIT 25;
10 close c1;
11
12 open c2;
13 fetch c2 bulk collect into l_data LIMIT 25;
14 close c2;
15 end;
16 /
PL/SQL procedure successfully completed.
<b>the tkprof might show me:</b>
SELECT /*+ first_rows(25) */ * FROM BIG_TABLE.BIG_TABLE WHERE OWNER = 'SYS'
ORDER BY ID
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 0.00 0.00 1 4 0 25
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 0.00 0.00 1 4 0 25
Row Source Operation
---------------------------------------------------
TABLE ACCESS BY INDEX ROWID BIG_TABLE (cr=4 pr=1 pw=0 time=547 us cost=2100 siz
INDEX FULL SCAN BIG_TABLE_PK (cr=3 pr=0 pw=0 time=98 us cost=2099 size=0 card
--------------------------------------------------------------------------------
SELECT /*+ all_rows */ * FROM BIG_TABLE.BIG_TABLE WHERE OWNER = 'SYS'
ORDER BY ID
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 2 0.00 0.00 0 0 0 0
Fetch 1 1.00 1.86 7869 14544 2 25
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 1.00 1.86 7869 14544 2 25
Row Source Operation
---------------------------------------------------
SORT ORDER BY (cr=14544 pr=7869 pw=6296 time=1862560 us cost=13777 size=429878
TABLE ACCESS FULL BIG_TABLE (cr=14544 pr=7862 pw=0 time=404112 us cost=399
That shows that if the optimizer knew you wanted the first 25 as soon as possible - the index in this case was the way to go. However, if you run this code:
ops$tkyte%ORA11GR2> declare
2 cursor c1 is select /*+ first_rows(25) */ * from big_table.big_table where owner = 'SYS' order by id;
3 cursor c2 is select /*+ all_rows */ * from big_table.big_table where owner = 'SYS' order by id;
4
5 type array is table of c1%rowtype;
6 l_data array;
7 begin
8 for x in c1
9 loop
10 null;
11 end loop;
12 for x in c2
13 loop
14 null;
15 end loop;
16 end;
17 /
SELECT /*+ first_rows(25) */ *
FROM BIG_TABLE.BIG_TABLE WHERE OWNER = 'SYS' ORDER BY ID
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 4343 3.32 3.32 14023 25237 0 434221
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4345 3.32 3.32 14023 25237 0 434221
Row Source Operation
---------------------------------------------------
TABLE ACCESS BY INDEX ROWID BIG_TABLE (cr=25237 pr=14023 pw=0 time=2116614 us
INDEX FULL SCAN BIG_TABLE_PK (cr=6421 pr=46 pw=0 time=511456 us cost=2099 siz
--------------------------------------------------------------------------------
SELECT /*+ all_rows */ *
FROM BIG_TABLE.BIG_TABLE WHERE OWNER = 'SYS' ORDER BY ID
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 4343 2.59 2.88 14134 14539 2 434221
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4345 2.59 2.88 14134 14539 2 434221
Row Source Operation
---------------------------------------------------
SORT ORDER BY (cr=14539 pr=14134 pw=6296 time=1978420 us cost=13777 size=42987879
TABLE ACCESS FULL BIG_TABLE (cr=14539 pr=7838 pw=0 time=352168 us cost=3999 siz
the full scan was the fastest way to the last record.
So, to the questions
if Oracle took away first_rows(n) would applications suffer - YES, definitely. An interactive, end user driven application should probably be using first rows (whereas batch, reports - things that need to finish completely before being useful - should use all rows, which is the default)
Would anybody notice it without a shadow of doubt? YES, definitely, I would notice it right away.
What percent of operations get a significant improvement with it and what are those operations? Pretty close to 100% - and those operations are any time you want to get the first set of rows first - fast - to put up on a screen for example and let the end user start playing with them.
Would setting it in sqlplus be beneficial? Only if you are trying to judge what the initial response time of a given query is - otherwise, since sqlplus always wants to get to the last row as fast as possible - not really.
What situations would do better if first rows was set at the system level? I would not suggest that, that changes the default behavior of the database and many tools/products/etc assume the defaults are the defaults. You should avoid setting things at the instance level whenever you can.