Database, SQL and PL/SQL

Getting Just the Rows I Want

Our Oracle expert looks at how many rows a query really reads, the fastest way to commit inside a loop (or not at all), and how to maintain summary tables automatically with a materialized view.

By Tom Kyte

July/August 2001

I have a cursor with an ORDER BY clause that returns 10,000 rows, but I'm restricting it to the first 500 rows with ROWNUM. Does the Oracle database build all 10,000 rows in memory, or does it open the cursor with just the 500 rows? For example:

SELECT empno, empnm
FROM   ( SELECT empno, empnm
         FROM  employee
         ORDER BY empno )
WHERE  ROWNUM < 500;

This is a frequent question—especially with the abundance of Web-based applications trying to "paginate" through result sets without maintaining state.

Generally, the Oracle database does not store an entire result "in memory." Instead, the database tries to answer the query and return the first row before it gets the last row. Consider this:

select * from one_billion_row_table;

If the database attempted to store the result set in memory, we would never be able to ask a query like that. Oracle generally answers the query on the fly whenever possible. In the above case—if EMPNO is indexed and EMPNO is NOT NULL—the optimizer will read the index to process the ORDER BY and stop after reading 500 rows. For example:

ops$tkyte@ORA8I.WORLD> create table emp as
  2  select object_id empno, object_name ename
  3  from all_objects;

Table created. (with 20,000 plus rows)

ops$tkyte@ORA8I.WORLD> desc emp

 Name                    Null?     Type
 ----------------------- --------- --------------
 EMPNO                   NOT NULL  NUMBER
 ENAME                   NOT NULL  VARCHAR2(30)

ops$tkyte@ORA8I.WORLD> create index emp_idx on emp(empno);
Index created.

ops$tkyte@ORA8I.WORLD> select empno, ename
  2  from ( select empno, ename from emp order by empno )
  3  where rownum < 500;

499 rows selected.

SQL_TRACE/TKPROF reports

Rows      Row Source Operation
--------  ----------------------------------------
     499  COUNT STOPKEY
     499   VIEW
     499    TABLE ACCESS BY INDEX ROWID EMP
     499     INDEX FULL SCAN (object id 27418)

So the database starts a full scan of the index, fetching the rows from the table in order, and stops after 499 rows. It never touches the remaining 21,000 or so rows. If, on the other hand, you did something like this:

ops$tkyte@ORA8I.WORLD> select *
  2  from ( select ename, count(*) from emp group by ename )
  3  where rownum < 500;

499 rows selected.

SQL_TRACE/TKPROF shows

Rows      Row Source Operation
--------  ----------------------------------------
     499  COUNT STOPKEY
     499   VIEW
     499    SORT GROUP BY STOPKEY
   21933     TABLE ACCESS FULL EMP

That is, the database must compute the entire GROUP BY result first and only then return the first 499 rows from it. Depending on its size, that intermediate result would typically be materialized to temporary segments and read back from there.
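For the Web pagination case mentioned at the start, the same top-N technique extends to fetching an arbitrary "page" of rows. Here is a minimal sketch, using the EMP table created above; :min_row and :max_row are hypothetical bind variables marking the page boundaries:

select empno, ename
  from ( select a.*, rownum rnum
           from ( select empno, ename
                    from emp
                   order by empno ) a
          where rownum <= :max_row )
 where rnum >= :min_row;

The inner ROWNUM predicate lets the optimizer stop early, just as in the example above; the outer filter on RNUM then discards the rows that come before the start of the page.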


Finding the Fastest Way

Within a large loop, if I want to commit every 1,000 (or million, or billion) records, which is faster: using mod() and then committing, as in:

LOOP
cnt := cnt + 1;
IF ( mod( cnt, 1000 ) ) = 0 THEN
commit;
END IF;
END LOOP;

or testing a counter, committing, and then resetting it with counter := 0, as in:

LOOP
cnt := cnt + 1;
IF cnt = 1000 THEN
commit;
cnt := 0;
END IF;
END LOOP;

Do you want to commit inside the loop in the first place? If you don't have to, the fastest way is not to commit every n rows at all. Configure sufficient rollback and do the work in a single statement. Don't use a loop at all; row-by-row processing is slower than a single INSERT INTO ... SELECT or a single UPDATE statement.

The second-fastest way is to configure sufficient rollback and do the work in a single loop without interim commits. Each time you commit, you must wait for the log writer (LGWR) to flush the redo log buffer to disk. If you just keep going, LGWR does that work in the background, and you wait once at the end instead of once every n rows; the commits are what slow you down. If you do commit every n rows, you also have to work out how to restart the process after a failure, and watch out for ORA-01555, which will almost inevitably get you if you commit while updating the table you are reading from.
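As a point of comparison, here is a minimal sketch of the single-statement approach; the table and column names (BIG_TABLE, PROCESSED_FLAG) are made up purely for illustration:

-- one statement, one commit at the end, no procedural code at all
update big_table
   set processed_flag = 'Y'
 where processed_flag = 'N';

commit;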

As for the code above, we can use the PL/SQL source-code profiler (DBMS_PROFILER) supplied with Oracle8i to analyze it. I created and ran the following procedures:

create or replace procedure do_mod
as
    cnt number := 0;
begin
    dbms_profiler.start_profiler( 'mod' );

    for i in 1 .. 500000
    loop
        cnt := cnt + 1;
        if ( mod( cnt, 1000 ) = 0 )
        then
            commit;
        end if;
    end loop;

    dbms_profiler.stop_profiler;
end;
/

create or replace procedure no_mod
as
    cnt number := 0;
begin
    dbms_profiler.start_profiler( 'no mod' );

    for i in 1 .. 500000
    loop
        cnt := cnt + 1;
        if ( cnt = 1000 )
        then
            commit;
            cnt := 0;
        end if;
    end loop;

    dbms_profiler.stop_profiler;
end;
/

Running the profiler reports, I find

Percentage of time in each module, summarized across runs

UNIT_OWNER  UNIT_NAME      SECS  PERCENTAGE
----------  -------------  ----  ----------
OPS$TKYTE   DO_MOD         8.18       71.67
OPS$TKYTE   NO_MOD         3.23       28.32
SYS         DBMS_PROFILER   .00         .00
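For reference, a summary like the one above can be pulled straight from the profiler's repository tables. This is only a sketch, assuming the PLSQL_PROFILER_UNITS and PLSQL_PROFILER_DATA tables created by the supplied proftab.sql script, and assuming TOTAL_TIME is reported in nanoseconds:

select u.unit_owner, u.unit_name,
       round( sum(d.total_time) / 1000000000, 2 ) secs,
       round( 100 * ratio_to_report( sum(d.total_time) ) over (), 2 ) pct
  from plsql_profiler_units u, plsql_profiler_data d
 where d.runid       = u.runid
   and d.unit_number = u.unit_number
 group by u.unit_owner, u.unit_name
 order by secs desc;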

The profiler output already shows that the MOD routine takes longer: over 8 seconds in that procedure, compared to just over 3 seconds in the one without MOD. To prove that it is the MOD function itself, see Listing 1. Using MOD took about 5.5 seconds, whereas the if ( cnt = 1000 ) test took 1 second, plus the time to do cnt := 0, for a total of about 1.5 seconds.
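Listing 1 is not reproduced here, but the same isolation test is easy to sketch: time each IF construct by itself, with the COMMITs removed, using DBMS_UTILITY.GET_TIME (which returns hundredths of a second). Your figures will of course vary:

-- a sketch of the isolation test (set serveroutput on to see the results)
declare
    cnt     number := 0;
    l_start number;
begin
    -- time the MOD test by itself, no commits
    l_start := dbms_utility.get_time;
    for i in 1 .. 500000
    loop
        cnt := cnt + 1;
        if ( mod( cnt, 1000 ) = 0 ) then null; end if;
    end loop;
    dbms_output.put_line( 'mod test:     ' ||
        ( dbms_utility.get_time - l_start ) / 100 || ' seconds' );

    -- time the plain counter test by itself
    cnt := 0;
    l_start := dbms_utility.get_time;
    for i in 1 .. 500000
    loop
        cnt := cnt + 1;
        if ( cnt = 1000 ) then cnt := 0; end if;
    end loop;
    dbms_output.put_line( 'counter test: ' ||
        ( dbms_utility.get_time - l_start ) / 100 || ' seconds' );
end;
/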


Automatic Summary Tables

I have two tables, TABLE1 and TABLE2, and I want to create a trigger like:

create or replace trigger trig_1
after insert on table1
....
    update table2
       set col1 = .... (based on sum(col2 of table1))
     where key_condition
....

Within this trigger, I want to update TABLE2 based on a summation of a column in TABLE1. But if I do select sum(col1) from table1 inside the trigger, I get a mutating-table error. There is no foreign-key relationship between the tables, and COL1 in TABLE2 and COL2 in TABLE1 are not keys in either table.

This might best be handled by a materialized view—a mechanism whereby the server will maintain summary tables automatically, either upon committing changes to the database, on a recurring (timed) basis, or on demand. This example demonstrates how you might store the rolled-up sum of a column in a detail table in another summary table. We'll have the summary refreshed upon commit to keep the rolled-up summary as current as the last transaction. First, our detail table:

SQL> create table t1 ( a int,
  2                    b int,
  3                    y number,
  4                    primary key(a,b) )
  5  /
Table created.

On this table, we will need to keep a materialized view log to capture the changes made to T1 during our transaction. We'll keep a summary of column Y rolled up by the values of column A—hence those columns must appear in our materialized view log:

SQL> create materialized view log on t1
  2  with rowid ( a, y ) including new values
  3  /
Materialized view log created.

We are ready for our materialized view:

SQL> create materialized view t2
  2  build immediate
  3  refresh on commit
  4  as
  5  select a, count(*) cnt, count(y) cnt_y, sum(y) sum_y
  6    from t1
  7   group by a
  8  /
Materialized view created.

We need to maintain more than just A and SUM(Y) in this case, to allow the database to incrementally refresh this view rather than rebuild it. The counts are small and probably useful to our applications anyway. Now load up the table T1:

SQL> begin
  2      for i in 1 .. 10
  3      loop
  4          for j in 1 .. 10
  5          loop
  6              insert into t1 values ( i, j, dbms_random.random );
  7          end loop;
  8      end loop;
  9  end;
 10  /
PL/SQL procedure successfully completed.

SQL> commit;
Commit complete.

SQL> select t2.a, t2.sum_y,
  2         ( select sum(y) from t1 where t1.a = t2.a ) sum_t1_y
  3    from t2
  4  /

         A       SUM_Y    SUM_T1_Y
---------- ----------- -----------
         1  -2.985E+09  -2.985E+09
         2  -1.788E+09  -1.788E+09
         3   -38754027   -38754027
         4  -4.144E+09  -4.144E+09
         5   164759646   164759646
         6  2717491103  2717491103
         7  6081803620  6081803620
         8  1222195769  1222195769
         9   706431010   706431010
        10  -6.878E+09  -6.878E+09

10 rows selected.

The materialized view T2 automatically kept SUM(Y) for us; compared directly with the actual SUM(Y) computed from T1, the results are identical. We can insert into, update, and delete from T1, and the changes will be reflected in T2 automatically at commit time.
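As a quick sanity check (just a sketch, not a captured session), add one more detail row and confirm that the commit-time refresh keeps the summary current:

insert into t1 values ( 1, 11, 100 );
commit;

select sum_y  from t2 where a = 1;
select sum(y) from t1 where a = 1;

The two queries should return the same value, because T2 was fast-refreshed from the materialized view log when the transaction committed.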

Next Steps

 Oracle Managing Technologist Tom Kyte answers your most difficult Oracle technology questions in Oracle Publishing Online's forum Ask Tom, at asktom.oracle.com. Highlights from that forum appear in this column.

 Materialized views were introduced with Oracle8i and are available in Oracle8i Enterprise and Personal editions. For more information on materialized views, see the Oracle8i Data Warehousing Guide on the Oracle Technology Network.

 
