Anything we can do about the performance of the pickler fetch? I've always had pretty impressive performance from pipelined functions, but I've just seen a performance degradation from turning a row-by-row write into a bulk insert from a pipelined function (the real business code has loads of calculations hence the function).
See the following testcase. First I have a small procedure to dynamically create a view of a variable number of columns. This helps me to test different "row widths".
create procedure prc ( p in number ) authid current_user as
s varchar2(32767);
begin
s := 'create view v as select ';
for i in 1 .. p loop
s := s || '''XXXXXXXXX'' as c' || i || ',';
end loop;
s := rtrim(s,',') || ' from dual connect by rownum <= 10000';
execute immediate s;
end;
/
After executing this procedure, I then create a GTT to be the target of my inserts.
create global temporary table gtt
on commit preserve rows
as select * from v where rownum < 1;
Now I have a pipelined function to select from v and pipe it out. I've used bulk collect etc (this is 9i BTW but the same results are observable on 10.1 with plsql_optimize_level=0).
create package pkg as
type ntt is table of gtt%rowtype;
function pfx (p in sys_refcursor) return ntt pipelined;
end pkg;
/
create package body pkg as
function pfx (p in sys_refcursor) return ntt pipelined is
nt ntt;
begin
loop
fetch p bulk collect into nt limit 100;
exit when nt.count = 0;
for i in 1 .. nt.count loop
pipe row (nt(i));
end loop;
end loop;
close p;
return;
end pfx;
end pkg;
/
And this is my test harness to run at different column numbers...
exec runstats_pkg.rs_start;
begin
for r in (select * from v) loop
insert into gtt values r;
end loop;
end;
/
exec runstats_pkg.rs_middle;
insert into gtt
select * from table(pkg.pfx(cursor(select * from v)));
exec runstats_pkg.rs_stop(1000);
Timing results:
Columns Run1 Run2 %diff
======= ======= ======= =======
30 560 212 264
50 345 363 95
100 555 945 59
150 764 1787 43
The stats outputs are as you'd expect from bulk over row-by-row. Here's the 30 output and 150 output for the extremes.
30 cols...
SQL> exec runstats_pkg.rs_stop(1000);
Run1 ran in 560 hsecs
Run2 ran in 212 hsecs
Run1 ran in 264.15% of the time
Name Run1 Run2 Diff
STAT..consistent changes 11,921 2,718 -9,203
STAT..redo entries 10,051 805 -9,246
LATCH.redo allocation 10,273 805 -9,468
STAT..calls to get snapshot scn: kcmgss 10,034 68 -9,966
STAT..execute count 10,030 20 -10,010
LATCH.library cache 20,601 10,492 -10,109
LATCH.shared pool 10,475 342 -10,133
LATCH.cache buffers lru chain 12,220 0 -12,220
STAT..db block changes 22,099 3,532 -18,567
STAT..recursive calls 20,760 685 -20,075
LATCH.library cache pin 20,307 229 -20,078
LATCH.checkpoint queue latch 22,556 314 -22,242
LATCH.simulator hash latch 43,038 156 -42,882
LATCH.row cache objects 276 60,102 59,826
LATCH.row cache enqueue latch 262 60,102 59,840
LATCH.undo global data 136,599 1 -136,598
STAT..session logical reads 693,804 2,983 -690,821
STAT..db block gets 693,745 2,388 -691,357
STAT..session uga memory 65,408 1,177,344 1,111,936
STAT..session pga memory 65,536 1,179,648 1,114,112
STAT..redo size 1,299,236 122,376 -1,176,860
LATCH.cache buffers chains 1,438,611 11,983 -1,426,628
Run1 latches total versus run2 -- difference and pct
Run1 Run2 Diff Pct
1,717,857 145,218 -1,572,639 1,182.95%
PL/SQL procedure successfully completed.
150 cols...
SQL> exec runstats_pkg.rs_stop(1000);
Run1 ran in 764 hsecs
Run2 ran in 1787 hsecs
Run1 ran in 42.75% of the time
Name Run1 Run2 Diff
STAT..CPU used by this session 767 1,784 1,017
STAT..CPU used when call started 767 1,784 1,017
LATCH.checkpoint queue latch 1,678 3,249 1,571
STAT..consistent gets 66 2,881 2,815
STAT..session logical reads 19,335 15,957 -3,378
STAT..consistent changes 21,019 15,004 -6,015
LATCH.redo allocation 10,073 4,021 -6,052
STAT..redo entries 10,057 4,000 -6,057
STAT..db block gets 19,269 13,076 -6,193
STAT..calls to get snapshot scn: kcmgss 10,036 192 -9,844
LATCH.library cache 20,612 10,603 -10,009
STAT..execute count 10,031 20 -10,011
LATCH.shared pool 10,815 556 -10,259
STAT..db block changes 31,210 19,040 -12,170
LATCH.library cache pin 20,317 306 -20,011
STAT..recursive calls 20,885 685 -20,200
LATCH.cache buffers chains 90,604 65,206 -25,398
LATCH.row cache objects 76 60,126 60,050
LATCH.row cache enqueue latch 62 60,126 60,064
STAT..redo size 1,308,108 526,160 -781,948
STAT..session pga memory max 378,504 3,735,552 3,357,048
STAT..session uga memory max 588,672 4,186,112 3,597,440
STAT..session uga memory 196,224 4,643,968 4,447,744
STAT..session pga memory 196,608 4,653,056 4,456,448
Run1 latches total versus run2 -- difference and pct
Run1 Run2 Diff Pct
158,483 208,263 49,780 76.10%
PL/SQL procedure successfully completed.
Guess which end of the column-width spectrum I'm having to work on :(
If there's anything you can suggest...?