Bulk collect
Santhanam, September 08, 2003 - 9:36 am UTC
Excellent
Java & Blobs
Vinnie, November 13, 2003 - 4:47 pm UTC
Can this be used with Java & Blob Data?
November 13, 2003 - 9:21 pm UTC
what do you mean?
performance
Rukshan Soza, November 14, 2003 - 7:10 am UTC
How can this be quicker than insert into ... select *
November 14, 2003 - 9:18 am UTC
the assumption would be of course that line 7 of the example on "how to bulk bind" would have some amount of nasty procedural code that cannot be performed in SQL alone
(good point though, if you can do it in a SINGLE SQL statement -- by all means -- DO IT)
Java Bulk Insert with a BLOB
Vinnie, November 14, 2003 - 8:08 am UTC
I am looking for an example of the fastest way from within a JAVA app to insert a record which contains a BLOB.
We currently have it working with bind variables doing 1 insert/commit at a time. How can we speed this up? Bulk inserts? Using an Oracle procedure?
November 14, 2003 - 10:11 am UTC
less commits = better. commit when you are DONE working.
any chance you can use sqlldr? (you know, don't write code....)
are you streaming?
are you using the THICK jdbc driver?
whats the SMALLEST example you have that demonstrates HOW you are doing it (small, yet complete -- emphasis on small)
Java Bulk Insert with a BLOB
Vinnie, November 14, 2003 - 10:45 am UTC
Ok, this is a snippet!
insertEventCall = con.prepareCall( "begin insert into ...
    (a, b, c, blob_data) values (?, ?, ?, empty_blob())
    returning blob_data into ?; end;" );
protected void insertEvent()
{
    try
    {
        synchronized( this.con )
        {
            CallableStatement stmt = insertEventCall;
            stmt.setLong  (1, ...);
            stmt.setString(2, ...);
            stmt.setString(3, ...);
            stmt.registerOutParameter(4, java.sql.Types.BLOB);
            stmt.executeUpdate();
            BLOB blob = (BLOB) stmt.getBlob(4);
            OutputStream writer = blob.getBinaryOutputStream();
            byte[] theBytes =
                this.toByteArray( event.getRaw().getElement() );
            writer.write(theBytes);
            writer.flush();
            writer.close();
            con.commit();
        } // end synchronized
...
...
Can we use bulk inserts or call a procedure & pass everything in & Let ORACLE handle it? Examples are greatly appreciated:)
Hope this is enough
November 14, 2003 - 11:19 am UTC
ugh,
parse
bind
execute
commit
terrible.... do this
parse ONCE -- ONCE, not more than ONCE. make the insert a "global" class variable. parse it ONCE
bind/execute over and over and over
commit as infrequently as possible
how big are the blobs?
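The parse once / bind-execute many / commit once shape can be sketched as follows. This is a Python simulation with a made-up stub class standing in for the real connection (the class, method names and counters are illustrative only, not a driver API); it just makes the call counts visible:

```python
# Hypothetical stub standing in for a database connection; the counters
# make the parse/execute/commit pattern visible without a real driver.
class StubConnection:
    def __init__(self):
        self.parses = 0
        self.executes = 0
        self.commits = 0

    def prepare(self, sql):        # "parse" -- should happen ONCE
        self.parses += 1
        return self

    def execute(self, *binds):     # bind/execute -- once per row
        self.executes += 1

    def commit(self):              # commit -- when you are DONE working
        self.commits += 1

con = StubConnection()
stmt = con.prepare("insert into t (a, b, c) values (?, ?, ?)")  # parse ONCE
for row in [(1, 'x', 'y'), (2, 'x', 'y'), (3, 'x', 'y')]:
    stmt.execute(*row)          # bind/execute over and over and over
con.commit()                    # commit as infrequently as possible

print(con.parses, con.executes, con.commits)   # 1 3 1
```

With three rows inserted, the stub records 1 parse, 3 executes and 1 commit -- the shape to aim for regardless of language.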
BLOBs again!
Vinnie, November 14, 2003 - 11:31 am UTC
The insert is only being parsed once. I just show it there to let you know what the statement looked like.
The Blobs vary in size. Average lets say 4k but can grow to 500k at times.
How can I commit infrequently? Can you give me an example of this with the blob?
November 14, 2003 - 4:59 pm UTC
that's why I asked for "a simple, short, example that shows HOW you do what you do"
that's really important, need to see HOW.
to commit less frequently? don't call commit so often. that's the only way.
have you identified the database as being the performance issue here? is it the bottleneck?
looking for comments on types based alternative
Kevin, November 14, 2003 - 12:10 pm UTC
Hi Tom, nice thread. Can you please comment on the differences (resource utilization, etc.) between your first bulk bind example and a variation using object types? I can do my own testing for my specific situations to see which is faster and so on, but I was hoping you could give us a little in-depth knowledge about what is happening differently between the two behind the looking glass. I am wondering about memory consumption on the client vs. server, how many copies of things there might be, and most importantly, whatever you figure I should know about it. Curiosity may have killed the cat but I find satisfaction always brings him back. Your bulk bind example from above was:
ops$tkyte@ORA920LAP> declare
2 type array is table of emp%rowtype index by binary_integer;
3
4 l_data array;
5 begin
6 select * bulk collect into l_data from emp;
7
8 forall i in 1 .. l_data.count
9 insert into emp values l_data(i);
10 end;
11 /
PL/SQL procedure successfully completed.
Using objects one would code something like this:
SQL> desc emp
Name Null? Type
----------------------------- -------- --------------------
EMPNO NOT NULL NUMBER(4)
ENAME VARCHAR2(10)
JOB VARCHAR2(9)
MGR NUMBER(4)
HIREDATE DATE
SAL NUMBER(7,2)
COMM NUMBER(7,2)
DEPTNO NUMBER(2)
SQL> create table new_emp as select * from emp where rownum < 1;
Table created.
SQL> create or replace type o_emp is object
2 (
3 EMPNO NUMBER(4)
4 , ENAME VARCHAR2(10)
5 , JOB VARCHAR2(9)
6 , MGR NUMBER(4)
7 , HIREDATE DATE
8 , SAL NUMBER(7,2)
9 , COMM NUMBER(7,2)
10 , DEPTNO NUMBER(2)
11 )
12 /
Type created.
SQL> show errors
No errors.
SQL> create or replace type c_emp is table of o_emp
2 /
Type created.
SQL> show errors
No errors.
SQL> declare
2 c_emp_v c_emp;
3 begin
4 select cast(multiset(select * from emp) as c_emp) into c_emp_v from dual;
5 insert into new_emp select * from table(cast(c_emp_v as c_emp));
6 end;
7 /
PL/SQL procedure successfully completed.
November 14, 2003 - 5:06 pm UTC
well, neither will affect client memory consumption since the client (sqlplus) isn't really involved.
use runstats
http://asktom.oracle.com/~tkyte/runstats.html
and benchmark it!
I would use the plsql only method unless i needed the extra capabilities of the collection. I believe you'll find the plsql index by table ( a plsql ONLY thing ) to be slightly more efficient in plsql. the collection type is nice to return things to other languages (plsql table works with plsql ONLY) and when you need to use SQL on it.
Simply superb
"Fan", November 14, 2003 - 1:14 pm UTC
Bulk Insert with a BLOB
Vinnie, November 14, 2003 - 7:15 pm UTC
I sent a short snippet a few posts ago. Do you need more?
I do realize that I can commit less. What we would really like to be able to do is do a batch insert, something like addBatch with a certain batch size. But we are not sure if we can do the batch insert with the BLOB because you first have to get the blob locator back from the insert.
Here is the code again:
insertEventCall = con.prepareCall( "begin insert into ...
    (a, b, c, blob_data) values (?, ?, ?, empty_blob())
    returning blob_data into ?; end;" );
protected void insertEvent()
{
    try
    {
        synchronized( this.con )
        {
            CallableStatement stmt = insertEventCall;
            stmt.setLong  (1, ...);
            stmt.setString(2, ...);
            stmt.setString(3, ...);
            stmt.registerOutParameter(4, java.sql.Types.BLOB);
            stmt.executeUpdate();
            BLOB blob = (BLOB) stmt.getBlob(4);
            OutputStream writer = blob.getBinaryOutputStream();
            byte[] theBytes =
                this.toByteArray( event.getRaw().getElement() );
            writer.write(theBytes);
            writer.flush();
            writer.close();
            con.commit();
        } // end synchronized
...
...
Perhaps I am asking the question wrong...let me know.
November 15, 2003 - 9:00 am UTC
the short snippet did not represent reality. so, i spent time telling you how to do some tuning that "already happened" -- but I did not know that, since you posted a simplified example that did not represent what you really did - which is all I asked for:
whats the SMALLEST example you have that demonstrates HOW you are doing it
(small, yet complete -- emphasis on small)
I was just expressing my disappointment in the fact that we didn't get that.
again, tell us the process here.
o how many blobs are you loading
o why AREN'T we using a simple tool (why are we writing code here)
o how long is this taking and how long must it take
Bulk - missing something
Phil, November 14, 2003 - 9:11 pm UTC
SQL> select banner from v$version
2 /
BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.1.0 - Production
PL/SQL Release 9.2.0.1.0 - Production
CORE 9.2.0.1.0 Production
TNS for 32-bit Windows: Version 9.2.0.1.0 - Production
NLSRTL Version 9.2.0.1.0 - Production
SQL> desc big_all_tables
Name Null? Type
----------------------------------------- -------- ----------------------------
OWNER VARCHAR2(30)
TABLE_NAME VARCHAR2(30)
COLUMN_NAME VARCHAR2(30)
DATA_TYPE VARCHAR2(106)
DATA_TYPE_MOD VARCHAR2(3)
DATA_TYPE_OWNER VARCHAR2(30)
DATA_LENGTH NUMBER
DATA_PRECISION NUMBER
DATA_SCALE NUMBER
NULLABLE VARCHAR2(1)
COLUMN_ID NUMBER
DEFAULT_LENGTH NUMBER
NUM_DISTINCT NUMBER
LOW_VALUE RAW(32)
HIGH_VALUE RAW(32)
DENSITY NUMBER
NUM_NULLS NUMBER
NUM_BUCKETS NUMBER
LAST_ANALYZED DATE
SAMPLE_SIZE NUMBER
CHARACTER_SET_NAME VARCHAR2(44)
CHAR_COL_DECL_LENGTH NUMBER
GLOBAL_STATS VARCHAR2(3)
USER_STATS VARCHAR2(3)
AVG_COL_LEN NUMBER
CHAR_LENGTH NUMBER
CHAR_USED VARCHAR2(1)
V80_FMT_IMAGE VARCHAR2(3)
DATA_UPGRADED VARCHAR2(3)
SQL> desc million_all_tables
Name Null? Type
----------------------------------------- -------- ----------------------------
OWNER VARCHAR2(30)
TABLE_NAME VARCHAR2(30)
COLUMN_NAME VARCHAR2(30)
DATA_TYPE VARCHAR2(106)
DATA_TYPE_MOD VARCHAR2(3)
DATA_TYPE_OWNER VARCHAR2(30)
DATA_LENGTH NUMBER
DATA_PRECISION NUMBER
DATA_SCALE NUMBER
NULLABLE VARCHAR2(1)
COLUMN_ID NUMBER
DEFAULT_LENGTH NUMBER
NUM_DISTINCT NUMBER
LOW_VALUE RAW(32)
HIGH_VALUE RAW(32)
DENSITY NUMBER
NUM_NULLS NUMBER
NUM_BUCKETS NUMBER
LAST_ANALYZED DATE
SAMPLE_SIZE NUMBER
CHARACTER_SET_NAME VARCHAR2(44)
CHAR_COL_DECL_LENGTH NUMBER
GLOBAL_STATS VARCHAR2(3)
USER_STATS VARCHAR2(3)
AVG_COL_LEN NUMBER
CHAR_LENGTH NUMBER
CHAR_USED VARCHAR2(1)
V80_FMT_IMAGE VARCHAR2(3)
DATA_UPGRADED VARCHAR2(3)
SQL> @bulk_collect_record
SP2-0310: unable to open file "bulk_collect_record.sql"
SQL> @bulk_coll_record
declare
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel
SQL>
Hello Tom,
I have created another procedure for bulk collect using your site that works fine. This new concept I am not able to use. It just disconnects me.
November 15, 2003 - 9:06 am UTC
9201 is quite a few patch sets ago. 9204 is the current patch set. that is what support will say when you go to file this tar (which is what you must do at this point)
Sorry , missed the code
Phil, November 14, 2003 - 9:13 pm UTC
declare
type array is table of big_all_tables%rowtype index by binary_integer;
l_data array;
begin
select * bulk collect into l_data from big_all_tables;
forall i in 1 .. l_data.count
insert into million_all_tables values l_data(i);
end;
/
November 15, 2003 - 9:07 am UTC
oh my -- is big_all_tables really big????
you are NOT array fetching like 1,000,000 rows are you?????????
that must be:
open c;
loop
fetch c bulk collect into l_data limit N;
forall i in 1 .. l_data.count insert into....
exit when c%notfound;
end loop;
close c;
More Blobs
vinnie, November 15, 2003 - 11:30 am UTC
Ok, lets try again.
We are loading a record being read from another source. The source has a constant flow of data. At times, the rate can be much greater than other times. The data is currently read using a JAVA app. At the current time we can't change that architecture. We currently insert 1 record at a time.
The table structure is:
event_id number
event_type varchar2
event_data blob
We are wondering if we can batch insert these rows, because I know that batch inserts dramatically increase insert performance. We know what the current insert rate is, but feel that with batch inserts it can be drastically increased.
November 15, 2003 - 11:44 am UTC
you can batch rows when the blob is 4000 bytes or less since you can just bind a RAW and the blob will be created implicitly.
as you pointed out with the lob locator however, you need it right after you insert in order to be able to stream to it -- so by definition you cannot batch the inserts since you must "insert, get locator, stream". if you batched -- you would not "get locator"
so, you could set up two statements -- one for blobs of 4000 bytes or less and another for blobs over 4000 bytes. You can set the "batch" size using a method of the prepared statement -- so that the small blobs will be array inserted (and sent anytime you commit). that way you could batch the small rows and just insert the big ones.
you are using the THICK oci driver right?
also -- you have IDENTIFIED a PERFORMANCE ISSUE right?
and you have diagnosed that it is this, the insert?
have you played with cached vs non-cached blobs? (this might be a good candidate for cached blobs)
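The two-statement dispatch described above can be sketched like this. A Python simulation (the 4000 byte cutoff comes from the answer; the function and list names are illustrative): rows with small blobs are queued for the batched RAW-bind statement, rows with large blobs go through the insert / get locator / stream path one at a time:

```python
SMALL_LIMIT = 4000   # blobs at or under this can be bound as RAW and batched

small_batch = []     # rows destined for the array-inserted statement
streamed = []        # rows inserted singly: insert, get locator, stream

def insert_event(row_id, blob_bytes):
    # dispatch on payload size, as described in the answer above
    if len(blob_bytes) <= SMALL_LIMIT:
        small_batch.append(row_id)   # goes through the batched statement
    else:
        streamed.append(row_id)      # needs the locator back immediately

for rid, size in [(1, 100), (2, 500_000), (3, 3_999), (4, 4_001)]:
    insert_event(rid, b"x" * size)

print(small_batch, streamed)   # [1, 3] [2, 4]
```

Rows 1 and 3 (at or under 4000 bytes) get batched; rows 2 and 4 are inserted individually so their locators can be streamed to.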
Bulk inserts, Limit Mising
Phil, November 15, 2003 - 6:25 pm UTC
SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.1.0 - Production
PL/SQL Release 9.2.0.1.0 - Production
CORE 9.2.0.1.0 Production
TNS for 32-bit Windows: Version 9.2.0.1.0 - Production
NLSRTL Version 9.2.0.1.0 - Production
SQL> select count(*) from big_all_tables;
COUNT(*)
----------
31322
SQL> select count(*) from million_all_tables;
COUNT(*)
----------
0
SQL> get bulk_coll_record
1 create or replace procedure insert_million_limit
2 as
3 type array is table of big_all_tables%rowtype index by binary_integer;
4 l_data array;
5 cursor c1 is select
6 OWNER,
7 TABLE_NAME,
8 COLUMN_NAME,
9 DATA_TYPE,
10 DATA_TYPE_MOD,
11 DATA_TYPE_OWNER,
12 DATA_LENGTH,
13 DATA_PRECISION,
14 DATA_SCALE,
15 NULLABLE,
16 COLUMN_ID,
17 DEFAULT_LENGTH,
18 NUM_DISTINCT,
19 LOW_VALUE,
20 HIGH_VALUE,
21 DENSITY,
22 NUM_NULLS,
23 NUM_BUCKETS,
24 LAST_ANALYZED,
25 SAMPLE_SIZE,
26 CHARACTER_SET_NAME,
27 CHAR_COL_DECL_LENGTH,
28 GLOBAL_STATS,
29 USER_STATS,
30 AVG_COL_LEN,
31 CHAR_LENGTH,
32 CHAR_USED,
33 V80_FMT_IMAGE,
34 DATA_UPGRADED from big_all_tables;
35 begin
36 open c1;
37 loop
38 fetch c1 bulk collect into l_data LIMIT 5;
39 forall i in 1 .. l_data.count
40 insert into million_all_tables values l_data(i);
41 exit when c1%notfound;
42 end loop;
43 close c1;
44* end;
45 /
Procedure created.
SQL> execute insert_million_limit
BEGIN insert_million_limit; END;
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel
SQL>
Hello Tom,
I am sorry to bother you again, but again this is the same error.
November 16, 2003 - 10:36 am UTC
read my original answer, up 2 reviews.
Blobs
Vinnie, November 17, 2003 - 8:10 am UTC
Just wanted to say thanks for the help.
I'm sorry I was not explaining or giving the details
required.
This dialog was most useful!!
Bulk insert error
Venkat, November 19, 2003 - 1:18 pm UTC
We recently moved to 9.2 and I was trying your example for bulk insert and got this error. Can you please help? Thanks..
1 declare
2 type array is table of ta%rowtype index by binary_integer;
3 l_data array;
4 begin
5 select * bulk collect into l_data from ta;
6 forall i in 1 .. l_data.count
7 insert into ta values l_data(i);
8* end;
SQL> /
declare
*
ERROR at line 1:
ORA-06550: line 7, column 24:
PLS-00518: This INSERT statement requires VALUES clause containing a parenthesised list of values
ORA-06550: line 7, column 2:
PL/SQL: SQL Statement ignored
ORA-06550: line 7, column 2:
PLS-00435: DML statement without BULK In-BIND cannot be used inside FORALL
November 21, 2003 - 3:15 pm UTC
ops$tkyte@ORA920LAP> create table ta( x int, y int );
Table created.
ops$tkyte@ORA920LAP> insert into ta values ( 1, 1 );
1 row created.
ops$tkyte@ORA920LAP>
ops$tkyte@ORA920LAP> declare
2 type array is table of ta%rowtype index by binary_integer;
3 l_data array;
4 begin
5 select * bulk collect into l_data from ta;
6 forall i in 1 .. l_data.count
7 insert into ta values l_data(i);
8 end;
9 /
PL/SQL procedure successfully completed.
ops$tkyte@ORA920LAP>
i cannot reproduce in 9203/9204?
Continuing the previous question...
A reader, December 18, 2003 - 3:44 pm UTC
Does the compatible parameter have to be 9.2.0.4.0 for this to work? The value in our db is 9.2.0.0.0 (and I got the previous error "PLS-00435: DML statement without BULK In-BIND cannot be used inside FORALL")
v$version values :
...
Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
PL/SQL Release 9.2.0.4.0 - Production
...
Thanks!
December 18, 2003 - 4:19 pm UTC
nope, show us this same output please:
ops$tkyte@ORA920PC> create table ta( x int, y int );
Table created.
ops$tkyte@ORA920PC> insert into ta values ( 1, 1 );
1 row created.
ops$tkyte@ORA920PC> declare
2 type array is table of ta%rowtype index by binary_integer;
3 l_data array;
4 begin
5 select * bulk collect into l_data from ta;
6 forall i in 1 .. l_data.count
7 insert into ta values l_data(i);
8 end;
9 /
PL/SQL procedure successfully completed.
ops$tkyte@ORA920PC> show parameter compat
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
compatible string 9.2.0.0.0
plsql_v2_compatibility boolean FALSE
ops$tkyte@ORA920PC> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
PL/SQL Release 9.2.0.4.0 - Production
CORE 9.2.0.3.0 Production
TNS for Linux: Version 9.2.0.4.0 - Production
NLSRTL Version 9.2.0.4.0 - Production
output...
A reader, December 19, 2003 - 10:44 am UTC
SQL> create table ta( x int, y int );
Table created.
SQL> insert into ta values ( 1, 1 );
1 row created.
SQL> declare
2 type array is table of ta%rowtype index by binary_integer;
3 l_data array;
4 begin
5 select * bulk collect into l_data from ta;
6 forall i in 1 .. l_data.count
7 insert into ta values l_data(i);
8 end;
9 /
insert into ta values l_data(i);
*
ERROR at line 7:
ORA-06550: line 7, column 27:
PLS-00518: This INSERT statement requires VALUES clause containing a parenthesised list of values
ORA-06550: line 7, column 5:
PL/SQL: SQL Statement ignored
ORA-06550: line 7, column 5:
PLS-00435: DML statement without BULK In-BIND cannot be used inside FORALL
SQL> show parameter compat
unknown SHOW option "parameter"
compatibility version NATIVE
SQL> select name, value from v$parameter where name like '%compat%';
NAME
----------------------------------------------------------------
VALUE
----------------------------------------------------------------------------------------------------
compatible
9.2.0.0.0
plsql_v2_compatibility
FALSE
2 rows selected.
SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
PL/SQL Release 9.2.0.4.0 - Production
CORE 9.2.0.3.0 Production
TNS for Solaris: Version 9.2.0.4.0 - Production
NLSRTL Version 9.2.0.4.0 - Production
5 rows selected.
SQL>
December 20, 2003 - 9:25 am UTC
hmm, wonder if the client is the issue (not using the 9i plus client I see)
but -- I cannot reproduce this -- i've connected to 9204 from 734, 806, 817, 9ir2 and could not reproduce this so.....
time to file a tar with support.
Comparing records in pl/sql table
newton, December 22, 2003 - 7:23 am UTC
Tom,
I have this problem that I can't compare more than records here...
Begin
Select <column> Bulk Collect Into lc_id
From <table>;
i := lc_id.First;
n := pcId.First;
/* in pcId Variable...it has only 3 rows */
While i Is Not Null Loop
If pcId.Exists(n) Then
If lc_id(i) <> pcId(n) Then
dbms_output.put_line(i);
Else
n := pcId.Next(n);
End If;
End If;
i := lc_id.Next(i);
End Loop;
/* How to get those records which are not same as lc_id */
December 22, 2003 - 9:13 am UTC
a simple array compare?
you need a double loop. compare each p(n) to each l(i). very straightforward programming exercise.
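As a sketch of that double loop (Python, with illustrative sample data in place of the poster's cursors): each lc_id element is checked against every pcId element, collecting the ones that never match:

```python
lc_id = [10, 20, 30, 40, 50]   # the bulk-collected ids
pc_id = [20, 40, 99]           # the 3-row array from the question

# double loop: for each l(i), scan every p(n); keep the ones never matched
not_in_pc = []
for i_val in lc_id:
    found = False
    for n_val in pc_id:
        if i_val == n_val:
            found = True
            break
    if not found:
        not_in_pc.append(i_val)

print(not_in_pc)   # [10, 30, 50]
```

The same idea transfers directly to a pair of nested PL/SQL loops over the two collections.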
FORALL limit?
Mark, February 24, 2004 - 9:58 pm UTC
Your book states for BULK COLLECT that a 100 row limit "feels about right" in your experience for performance and system resource concerns. The examples on this site use BULK COLLECT with a limit of 100 to load an array, then FORALL to do DML based on the array -- so it too deals with 100 rows. But is this limiting of FORALL "about right" at 100 rows? If I load my array without using BULK COLLECT, should I be concerned about the maximum number of rows I might need to process in the FORALL? Or would it be my rollback segment that defines this upper limit, and whatever fits in that would be OK for the FORALL to deal with from a performance standpoint? Thanks.
February 25, 2004 - 8:31 am UTC
it is a memory thing (limit 1000 will take 10 times the memory limit 100 would).
it is a throughput thing -- think about it like this. You are going to pave a brick driveway. You have a pile of bricks, a wheelbarrow and two people.
You can
a) have one person bring the other a single brick -- brick by brick by brick. One at a time. Not very efficient. the guy laying the bricks (the client application) spends lots of time waiting as the other guy goes back and forth with a single brick at a time.
b) have one person load a couple of hundred bricks in the wheelbarrow. It takes a while to load up this wheelbarrow so the brick layer sits there idle. Additionally, it takes a while to navigate this ton of bricks in the wheelbarrow (the poor guy is struggling to push it over the ground and trying not to spill them all). Finally the brick layer gets a ton of bricks and starts to process them (this takes a while), then the whole process repeats. Lots of "spurty" activity
c) have one person load a reasonable number of bricks in the wheelbarrow. Run with them over to the bricklayer, lay them, get more, and so on. Less bursts, better "flow", the work gets done with lots less roundtrips than A) but with lots less effort than B)
This analogy holds pretty well for array fetching. In fact, in 10g, they have silently implemented the 100 rule in the engine itself for plsql!!! (for fetching anyway). Watch this:
ops$ora10g@ORA10G> alter session set sql_trace=true;
Session altered.
ops$ora10g@ORA10G> begin
2 for x in ( select * from all_objects ) loop null; end loop;
3 end;
4 /
PL/SQL procedure successfully completed.
SELECT *
FROM
ALL_OBJECTS
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.07 0.22 3 127 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 465 14.62 18.72 483 221246 0 46463
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 467 14.70 18.94 486 221373 0 46463
465 fetches, 46k rows.... in 9i and before, there would have been 46k fetches.
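The arithmetic behind that trace is just rows divided by the array size, rounded up -- a quick check (Python):

```python
import math

def fetch_calls(total_rows, array_size):
    # one round trip per batch of array_size rows
    return math.ceil(total_rows / array_size)

print(fetch_calls(46463, 100))   # 465 -- matches the tkprof output above
print(fetch_calls(46463, 1))     # 46463 -- row at a time, 9i and before
```

46,463 rows at 100 per fetch is exactly the 465 fetch calls tkprof reported; the same loop in 9i would have made one call per row.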
but a 'good' limit for FORALL?
Mark, February 25, 2004 - 12:34 pm UTC
Very interesting and neat feature of 10g - and I understand the ideal balance on a BULK COLLECT, trips vs. amount, good analogy. But what about the FORALL update? If I have 1000 rows in an array, should I be concerned that that is so high for one FORALL to work? What about 10,000 rows in the array? Etc.
Thanks.
February 25, 2004 - 12:42 pm UTC
same analogy works in reverse, same picture, different perspective in general.
memory, network slam, spurts of processing.
I like 100
bulk collect for lobs?
reader, March 02, 2004 - 8:05 am UTC
Your insight about bulk insert of lobs was most useful.
For bulk collect is there any reason you can't bulk collect lob locators into a collection? Is it just the standard bulk collect like:
clob_table (id number, doc clob)
DECLARE
TYPE ClobRecTab IS TABLE OF clob_table%ROWTYPE;
clob_recs ClobRecTab;
CURSOR c1 IS
SELECT id, doc FROM clob_table;
BEGIN
OPEN c1;
FETCH c1 BULK COLLECT INTO clob_recs;
CLOSE c1;
END;
March 02, 2004 - 8:59 am UTC
ops$tkyte@ORA920PC> create table t
2 as
3 select owner, view_name, to_lob(text) text
4 from all_views
5 where rownum <= 100;
Table created.
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC> declare
2 type array is table of t%rowtype index by binary_integer;
3 l_rec array;
4 begin
5 select * bulk collect into l_rec from t;
6 end;
7 /
PL/SQL procedure successfully completed.
You can, it just fetches the lob locator. It won't load up the 4gig of lob data into memory or anything like that.
Bulk Binding Exception
Reader, May 03, 2004 - 1:23 pm UTC
Tom,
We are on Oracle 8.1.7.4. I want to update a table using the following:
open c;
loop
fetch c bulk collect into idArray, valArray limit 50;
FORALL i in idArray.first .. idArray.count
update my_table
set some_col = valArray(i)
where pk_col = idArray(i)
;
exit when c%notfound;
commit;
end loop;
close c;
--------------------------------
My problem is: when the cursor c doesn't have any data to fetch in the beginning or after the previous limited fetch,
it generates the following error:
ORA-06502: PL/SQL: numeric or value error ORA-06512:
How should I avoid that? Thanks!
May 03, 2004 - 7:39 pm UTC
what you *really* want to do is:
update ( select t1.some_col, t2.value
from my_table t1, t2
where t1.pk_col = t2.id )
set some_col = value;
commit;
(but your error is the idArray.first is NULL, you want to forall i in 1 .. idArray.count, not idArray.first..idArray.count)
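The reason 1 .. count is the safe form: on an empty collection .FIRST is NULL, so the loop bounds themselves blow up, while 1 .. 0 is just an empty range. A small Python analogue (None standing in for PL/SQL NULL; the function names are illustrative):

```python
def first_index(arr):
    # PL/SQL collection.FIRST is NULL when the collection is empty
    return 1 if arr else None

def forall_range(lo, count):
    # modelling "forall i in lo .. count"
    if lo is None:
        raise ValueError("numeric or value error")  # what ORA-06502 amounts to
    return range(lo, count + 1)

empty = []

# "forall i in 1 .. count": empty range, no error, DML simply skipped
assert list(forall_range(1, len(empty))) == []

# "forall i in first .. count": fails exactly when a fetch returns no rows
try:
    forall_range(first_index(empty), len(empty))
    failed = False
except ValueError:
    failed = True
print(failed)   # True
```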
Bulk Update
Reader, May 04, 2004 - 1:06 am UTC
Tom,
You are the best! Not only did you quickly find where my problem is, you also pointed out a better way!
I used 1..ar.count before, but this time I tried to be a little fancy, replacing the hard coded 1 with array.first. I guess "1" is better because the range immediately evaluates to empty when the array is, thereby bypassing the update statement.
I first learned your update (select ... from tab1, tab2) method from this site too, and used it to update 1.5M rows within 30-40 minutes (a for loop update would take 3-4 hours!). It was great. This time I was so submerged in the bulk collect idea that I didn't even think about it (shame).
My only question about this update approach: will the bulk collect approach with the limit n clause reduce the chance of rollback segment problems (by committing every n rows), with little efficiency loss, given that the updated rows are usually under 200,000 and most likely under 5,000? This is a daily ETL process and we have more than 30 ETL jobs running between midnight and 4 am.
Thanks,
Wayne
May 04, 2004 - 7:19 am UTC
it'll allow rbs to be released and reused which ultimately leads to ORA-1555 after spending lots of time waiting.
I've always been a firm believer that you size RBS for the jobs you do, you don't size your jobs for the rbs you have -- that would be backwards.
doing things procedurally, slow by slow generates more redo, more undo, and takes longer.
J, May 04, 2004 - 11:38 am UTC
Is it just me, or did that guy above just need the parens?
7 insert into ta values (l_data(i));
SQL> declare
2 type array is table of ta%rowtype index by binary_integer;
3 l_data array;
4 begin
5 select * bulk collect into l_data from ta;
6 forall i in 1 .. l_data.count
7 insert into ta values l_data(i);
8 end;
9 /
insert into ta values l_data(i);
*
ERROR at line 7:
ORA-06550: line 7, column 27:
May 04, 2004 - 1:49 pm UTC
ops$tkyte@ORA920LAP> declare
2 type array is table of emp%rowtype index by binary_integer;
3
4 l_data array;
5 begin
6 select * bulk collect into l_data from emp;
7
8 forall i in 1 .. l_data.count
9 insert into emp values l_data(i);
10 end;
11 /
PL/SQL procedure successfully completed.
Nope -- new 9i feature.
LIMIT WITH SELECT CLAUSE
VKOUL, June 18, 2004 - 2:24 pm UTC
I know LIMIT clause can be used with FETCH but
can LIMIT clause be used with SELECT ?
e.g.
SELECT *
BULK COLLECT
INTO collection_name
FROM table_name
LIMIT 100;
June 18, 2004 - 2:34 pm UTC
for a select into, if you just want the first 100, you use "rownum <= 100" (but it really doesn't make sense in most cases)
to bulk collect with limit, you use an explicit cursor:
open c;
fetch c bulk collect into .... LIMIT 100;
fetch c bulk collect into .... LIMIT 100;
close c;
Bulk fetching
Serge Shmygelsky, August 16, 2004 - 5:59 am UTC
Tom,
your example with bricks is very good for understanding how 'bulk collect' works. But could you please explain it in terms of memory and processes? In the documentation we can read that 'The keywords BULK COLLECT tell the SQL engine to bulk-bind output collections before returning them to the PL/SQL engine'. So in your example the 'first worker' is the SQL engine and the second one is the PL/SQL engine. But what processes are behind these engines, and what is this 'wheelbarrow' that can take 100 bricks but can't take all of them at once?
August 16, 2004 - 8:33 am UTC
the "processes" are resources -- cpu (to get all one billion rows -- or just 100), network (to transfer all one billion row -- or just 100), the client patience (to wait for the last of 1 billion rows to be fetched -- or just 100) and so on.
Thanks but not clear anyway
Serge Shmygelsky, August 16, 2004 - 8:57 am UTC
Thanks for quick answer. But sorry - it's not clear.
The task is: I want to get all the values from some query into an array to process them. Processing doesn't begin until all the records are retrieved into this array (nested table).
I can declare in my stored procedure a nested table, which is an area in memory (isn't it?), and fill it in a loop with an unlimited number of values (I suppose 'unlimited' is also limited by memory size). But what prevents me from fetching those values all at once, as the size of the nested table will be the same anyway?
And the total time for transfer over the network will also be the same, as it doesn't matter if I transfer 100,000 records at once or 1,000 times by 100.
Maybe I'm wrong, but I can't get the idea.
August 16, 2004 - 6:58 pm UTC
think steady state -- think flow -- think "nice and steady wins the race"
think what happens to you if you get 100,000 things *at once* versus receiving 100 things over time. think what happens to the conduit that gets you these things. think of the ram needed to buffer on both ends, the huge network burst you just inflicted on everyone.
too much of a good thing is just that. Your 100k records would just "flood things". sort of like trying to fill a bucket with a fire hose.
but hey, here is an idea - parameterize it, make the array size adjustable as you see fit, now you need not worry, you can try everything from 1 to 1,000,000 and see what works best for you.
(it always makes me worry about the state of programming when our algorithm involves loading up over 100,000 things in an array before we can start working with them, but that is another soap box)
Bulk and arrays - ctd
Serge Shmygelsky, August 17, 2004 - 8:47 am UTC
Thanks again. I was thinking about this 'bulk and arrays' all last night :-). So, as I understand, the main problem is not the bulk fetch array size but the total size of the array I'm going to get as a result. Because I can run out of memory (or flood it) not only using 'bulk' but also by putting the same number of values one-by-one in a loop into the same array. Did I get it right?
How can we measure advantages of using different bulk limits?
Could you please give some URL to read more about Oracle memory management?
Thanks. Discussion was very useful for putting things in the mind in correct order.
August 17, 2004 - 8:59 am UTC
this isn't really about "oracle's memory mgmt" so much as "common sense, what must be done".
if you ask for 100,000 rows to be transmitted, we have to get 1..100,000, buffer them, send them, receive them, unbundle them, give them to you. Nothing more or less than you would have to do.
You would *parameterize* your code to accept the "array fetch size" and you would array fetch that many at a time.
Bulk and arrays - ctd 2
Serge Shmygelsky, August 17, 2004 - 10:21 am UTC
Of course you're right about common sense etc. But there are applications that must have all the information before the work begins. For example, on-line rating for mobile customers. In this case you don't have time to fetch data from the database each time a call comes. You should have it in memory -- millions of records.
That's why I'm interested in how 'bulk' works and what Oracle does with memory.
Thanks.
August 17, 2004 - 10:24 am UTC
huh? give me a better example. No way I'd have millions of records sitting in my client. It would take you longer to find them than it would the database to find, retrieve and return them.
?
Serge Shmygelsky, August 17, 2004 - 10:32 am UTC
H-m. It's strange. Because billing systems I saw work exactly this way. They store all the data in the memory.
August 17, 2004 - 10:47 am UTC
and the billing systems I've seen/worked with do not.
databases were sort of *invented* to do what you describe.
Memory - reloaded
Serge Shmygelsky, August 17, 2004 - 11:31 am UTC
So SQL parsing (even if we have shared SQL), execution and fetch, including possible disk i/o operations, is faster than a simple array lookup in a 'c' program? I am not sure of this.
August 17, 2004 - 11:56 am UTC
well, there would be no parsing (parse once, execute many)
In almost all cases, I wonder where programming is going when people re-invent a database. For a simple array lookup, you must have some sequential index -- else you are sort of "reinventing a database". And what about when the data changes, and so on?
Things are not as simple as "get me array entry 55".
explanation bulk bind in your book-Effective Oracle By Design"
A reader, January 11, 2005 - 3:39 pm UTC
Hi Tom,
I recently bought your book. It's an excellent book.
There is something that I didn't understand in it:
On page 564 you wrote, "When I put 56 in EMP the bulk collect version was 380% better." But on the other hand, on page 565 you wrote: "...I recommend using bulk processing only where and when it would have the most dramatic effect... If you did that process once for 50 rows, you would discover they run in about the same amount of time and that the BULK COLLECT actually does more latching."
Can you explain the difference here?
Bye
viki
January 11, 2005 - 4:02 pm UTC
in the one example -- I processed a 56 row table 5,000 times.
I then said "you know what, if you did this thing ONCE, not 5,000 times, it really wouldn't be noticeable -- it is 'not big enough'"
(I said that -- "these LOOK dramatic but that is only because we did this 5,000 times...")
300 rows processed in Oracle 8i
viki, January 11, 2005 - 4:08 pm UTC
Tom, thanks for the answer.
I have to process between 300-400 records, and for each row make some selects and then an insert.
If I do it:
SELECT ... BULK COLLECT into arrays
and then
forall .. insert
will it be faster than processing in a regular cursor loop?
Viki
January 11, 2005 - 4:16 pm UTC
try it out ;)
it would probably make a nominal difference if you:
select ... BULK COLLECT into arrays.....
for i in 1 .. array.count
loop
process data
end loop;
forall i in 1 .. array.count
insert ........
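Put together as a complete block, that outline might look like the following hypothetical sketch - source_table, target_table and their columns are invented for illustration:

```sql
declare
    type num_tab is table of number index by binary_integer;
    l_ids  num_tab;
    l_vals num_tab;
begin
    -- one round trip to fetch everything (fine for a few hundred rows)
    select id, val bulk collect into l_ids, l_vals
      from source_table;

    -- the procedural "process data" step
    for i in 1 .. l_ids.count loop
        l_vals(i) := l_vals(i) * 2;
    end loop;

    -- one bulk-bound insert for all rows
    forall i in 1 .. l_ids.count
        insert into target_table ( id, val ) values ( l_ids(i), l_vals(i) );
end;
/
```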
10g
Yuan, February 22, 2005 - 12:50 pm UTC
<quote>
In fact, in 10g, they have
silently implemented the 100 rule in the engine itself for plsql!!! (for fetching anyway).
</quote>
So does this mean that in 10g, you wouldn't ever bother with BULK COLLECT?
February 22, 2005 - 5:09 pm UTC
works with
for x in ( select * from t ) loop
for x in C loop
not with explicitly OPEN'ed cursors.
so, if you have been coding the way I prefer people to code -- no need.
If you have been doing it the hard way, more code for you to write (or use the easy way)
10g Continued
Yuan, February 23, 2005 - 9:21 am UTC
Then do you think the documentation should be altered? I find this statement a little misleading:
<quote>
Using the keywords BULK COLLECT with a query is a very efficient way to retrieve the result set. Instead of looping through each row, you store the results in one or more collections, in a single operation.
</quote>
-- </code>
http://download-west.oracle.com/docs/cd/B14117_01/appdev.101/b10807/12_tune.htm#sthref1314 <code>
It seems to be advising the longer, more complicated way for no real benefit. At the very least, it doesn't mention the new implicit 100-row limit. The only benefit I can see to still using the longer way is that it gives the option to fine-tune the limit, but you've said that's probably not worth the effort and I tend to trust your judgment.
What would you recommend for new development done in 9i at this point? If we find a significant benefit using the "long way," would we eventually want to refactor the code once we move to 10g? We just recently upgraded to 9i, so I don't expect the 10g move to come any time soon.
February 24, 2005 - 4:29 am UTC
I'll file a doc bug to change that section slightly - bulk collect still has its uses with ref cursors and with select into (when the goal is to get an array of data to send to another routine, for example - or to use in an IN clause later).
If the result set is going to be less than about 100 rows, you might consider using select into and looping over it (you should be pretty sure of the size of the result set, you don't want really really huge arrays there).
Otherwise, if the result set is going to be thousands of records - you would have to seriously consider bulk collect with LIMIT in 9i -- it will make a measurable difference in response time.
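For the 9i case, here is a sketch of the bulk collect with LIMIT pattern being described - the table big_table and the limit of 500 are assumptions:

```sql
declare
    cursor c is select * from big_table;
    type t_tab is table of big_table%rowtype index by binary_integer;
    l_rows t_tab;
begin
    open c;
    loop
        fetch c bulk collect into l_rows limit 500;
        for i in 1 .. l_rows.count loop
            null;  -- process each row here
        end loop;
        -- test %notfound AFTER processing, so the final
        -- partial batch (fewer than 500 rows) is not skipped
        exit when c%notfound;
    end loop;
    close c;
end;
/
```

The LIMIT clause caps memory use at a fixed number of rows no matter how large the result set is, which is what makes this safe for "thousands of records".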
bulk collect into
Robert Boutin, March 03, 2005 - 4:25 pm UTC
Is it possible to return data from a bulk collect, in a pl/sql table indexed by the column I want ?
For example,
***********************************************************
declare
cursor cur_a is
select a, b from t; -- «a» and «b» are number(10)
type l_typ_tab is table of cur_a%rowtype index by binary_integer;
l_tab l_typ_tab;
begin
open cur_a;
fetch cur_a bulk collect into l_tab;
close cur_a;
end;
***********************************************************
I want the result in a table indexed by column « a »
March 03, 2005 - 5:43 pm UTC
no, you need to fetch out a and b and then assign them to an array element by element; the bulk collect always fills the collection contiguously from index 1 .. N
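A sketch of that element-by-element re-keying - fetch contiguously first, then build a second associative array indexed by column a (table t is from the question above):

```sql
declare
    cursor cur_a is select a, b from t;
    type t_contig is table of cur_a%rowtype index by binary_integer;
    type t_by_a   is table of number index by binary_integer;
    l_tab  t_contig;
    l_by_a t_by_a;
begin
    open cur_a;
    fetch cur_a bulk collect into l_tab;   -- always indexed 1 .. count
    close cur_a;

    -- re-index: key each value of b by its a
    for i in 1 .. l_tab.count loop
        l_by_a( l_tab(i).a ) := l_tab(i).b;
    end loop;
end;
/
```

Afterwards l_by_a(some_a_value) gives direct lookup by the column value rather than by fetch position.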
bulk collect for lobs
Alicja, April 13, 2005 - 10:33 am UTC
Hi Tom,
I am trying to bulk collect LOB locators into a collection with the LIMIT option, and I get the error ORA-22922 "nonexistent LOB value" during the last fetch. In the last fetch (which returns fewer rows than the limit), there are rows available, but they are empty. What is the cause of that?
Below the example:
declare
TYPE t_clob IS table of clob;
tab_text t_clob;
cursor c is select object_name from all_objects ;
begin
open c;
loop
fetch c bulk collect into tab_text limit 200 ;
forall i in tab_text.first..tab_text.last
insert into temp
values (tab_text(i));
exit when c%notfound;
end loop ;
close c;
end;
April 13, 2005 - 12:00 pm UTC
first, use
forall i in 1..tab_text.count
(to see why, add "where rownum <= 200" and try it out).
I was able to reproduce your issue in 9iR2 and it looks "wrong". In this case, however, I see no reason to use a table of CLOB; as a workaround I would use a table of VARCHAR2.
In any case though, please submit that to support.
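A sketch of that workaround - a table of VARCHAR2 instead of CLOB, combined with the 1..count form of FORALL recommended above (temp is the table from the question; object_name fits comfortably in a VARCHAR2):

```sql
declare
    type t_text is table of varchar2(4000) index by binary_integer;
    tab_text t_text;
    cursor c is select object_name from all_objects;
begin
    open c;
    loop
        fetch c bulk collect into tab_text limit 200;
        forall i in 1 .. tab_text.count   -- safe even for the short last batch
            insert into temp values ( tab_text(i) );
        exit when c%notfound;
    end loop;
    close c;
end;
/
```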
Bulk collect problem
r, June 09, 2005 - 12:52 pm UTC
Hello Tom,
This is going to be difficult for me to explain, as I can only reproduce this problem in our prod environment.
I have a SQL statement which works OK when run on its own and returns a record, but when I use the same SQL within a PL/SQL block and try to bulk collect it into a variable, I find that the count of the bulk collect variable is zero, i.e. it hasn't been able to fetch the record. However, this problem happens only with some of the records, and in some instances I can see the bulk collect variable having records. This is very strange behaviour.
The bulk collect variable is a table of record type.
Have you come across any scenarios like this or could you give me pointers to investigate further.
Thanks for your time
June 09, 2005 - 6:28 pm UTC
You will have to either
o help me reproduce it
o work with support on it
sorry.
Related to existing bug
r, June 10, 2005 - 10:52 am UTC
Tom,
Thanks for your reply. I have traced my problem to be an existing bug with bulk collect ("No exception raised during bulk collect into collection of records"). I understand that it will be fixed in 10.2.
10G - bulk binding - error handling?
Arul Ramachandran, June 11, 2005 - 10:28 am UTC
Hi Tom,
If 10G does 100 rows of bulk collect automagically for implicit cursors, how does error handling work to trap DML error for a specific row? Does SQL%BULK_EXCEPTIONS work here?
Thanks,
Arul
June 11, 2005 - 1:58 pm UTC
oh, it is for RETRIEVAL only -- forall processing is still very much alive and well.
please elaborate
chris, February 23, 2006 - 6:11 pm UTC
Could you provide the location of the syntax for the update that you described. Also, what is the performance benefit over bulk collect.
thx.
<quote>
what you *really* want to do is:
update ( select t1.some_col, t2.value
from my_table t1, t2
where t1.pk_col = t2.id )
set some_col = value;
commit;
(but your error is the idArray.first is NULL, you want to forall i in 1 ..
idArray.count, not idArray.first..idArray.count)
</quote>
Bulk Collect/One Insert/Several Inserts
Lise, June 07, 2006 - 6:31 am UTC
I want to join two tables depending on 20 different criteria (say name and address match so select, name and postcode match so select, etc.). If a match is found on search criterion 1 then I do not want to look at the other 19 criteria.
I have indexes created for name, postcode etc. on table 1. Table 2, which is the driving table, has indexes available to extract the core rows to be matched against table 1 (around 24k rows selected out of millions).
I can think of three options:
1. One big insert, where I create a temp view holding all core rows in table 2 (need to partition by address to get latest one etc.) and then case/where statements for each search criteria.
2. Bulk collect data and do a forall for each search criteria.
3. Individual insert/select for each search criteria.
I believe 1 to be the best; however, it will then not allow me to use the indexes on table 1. Option 3 is the best in this respect, but then I have to select the core data from table 2 20 times (approx. 276 sec per fetch).
Option 2 will allow me to fetch the core data once, and then use the indexes on table 1.
June 07, 2006 - 7:12 am UTC
how many rows in these two tables first...
What if Huge Volumes of Data?
Suren, June 07, 2006 - 8:44 am UTC
Hey Tom,
Looking at the first example you gave, I have a doubt: what if the volume of data is larger, say 10k records or more? Will that not blow up all the resources and make us wait for a long time, since there is no LIMIT condition on the bulk collect and forall (I don't see a loop opening there)?
Another one: I always see you saying "benchmark".
What exactly do you mean by benchmarking?
Thanks,
Suren
first/last Vs count
kk, June 07, 2006 - 11:46 am UTC
Hi Tom,
Regarding your preference for 1..a.count instead of c.first .. c.last, please have a look at the following:
CREATE OR REPLACE procedure bulk_test_procedure1 as
TYPE t_tab IS table of bulk_test%rowtype ;
tab_text t_tab := t_tab();
cursor c is select object_name from all_objects where object_type = 'TABLE' and rownum <= 200;
begin
open c;
loop
fetch c bulk collect into tab_text ;
--forall i in tab_text.first..tab_text.last
forall i in 1..tab_text.count
insert into bulk_test values tab_text(i);
exit when c%notfound;
end loop ;
close c;
end;
AND
CREATE OR REPLACE procedure bulk_test_procedure1 as
TYPE t_tab IS table of bulk_test%rowtype ;
tab_text t_tab := t_tab();
cursor c is select object_name from all_objects where object_type = 'TABLE' and rownum <= 200;
begin
open c;
loop
fetch c bulk collect into tab_text ;
forall i in tab_text.first..tab_text.last
--forall i in 1..tab_text.count
insert into bulk_test values tab_text(i);
exit when c%notfound;
end loop ;
close c;
end;
I don't see any difference in the bulk_test table. Am I missing anything?
June 07, 2006 - 3:18 pm UTC
since first is always one in this case, and count will always equal last - they are the same
IN THIS CASE, since the bulk collect will put the array elements in starting at 1..count
can you give us an example
KK, June 07, 2006 - 4:03 pm UTC
Hi Tom,
can you please give us an example when it would be a problem using the first/last.
Thanks.
June 07, 2006 - 5:37 pm UTC
it isn't?
Why is there an implied problem? I "prefer" to use 1..array.count, but array.first..array.last works just fine in this case.
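One case where the two forms do differ is an empty collection: .FIRST and .LAST are then NULL, so the FORALL range raises ORA-06502, while 1 .. .COUNT (that is, 1 .. 0) simply executes zero times. A sketch - the target table t and the impossible predicate are assumptions:

```sql
declare
    type t_tab is table of number index by binary_integer;
    l_data t_tab;
begin
    select 1 bulk collect into l_data
      from dual where 1 = 0;            -- fetches zero rows

    forall i in 1 .. l_data.count       -- 1 .. 0: no-op, no error
        insert into t values ( l_data(i) );

    -- forall i in l_data.first .. l_data.last
    --     would be NULL .. NULL here and raise ORA-06502
end;
/
```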
Thanks
suren, June 08, 2006 - 8:10 am UTC
Cool, thanks for your answer - that bit of a hint was enough for me to solve my problem. How about a little softness, Tom?
Well, I know its your kingdom....
Thanks again
June 08, 2006 - 9:20 am UTC
not sure what you mean?
There is a problem with .first and .last in forall
Lise, July 19, 2006 - 11:51 am UTC
If you do a bulk collect and nothing is selected, and you then do a FORALL .FIRST .. .LAST, it falls over with ORA-06502: PL/SQL: numeric or value error.
I therefore prefer to use FORALL 1 .. .COUNT.
Would you use BULK COLLECT without the FORALL? I need to trawl through records, but will save some time using BULK COLLECT.
July 20, 2006 - 7:49 am UTC
I don't know what you mean by "would you use bulk collect without the forall"?
Bulk collect and for all
Lise, July 21, 2006 - 8:26 am UTC
Bulk Collect and FORALL are great together. However, if I cannot make use of FORALL, is it still, performance-wise, a good idea to use Bulk Collect on its own, so to speak?
July 23, 2006 - 7:22 am UTC
depends on the release.
In 9i and before - sure, bulk collecting will make the retrieval faster for large result sets (and take less work in general, use less resources on the server)
In 10g and above however, the constructs:
for x in ( select .... )
loop
and
for x in CURSOR_NAME
loop
will both implicitly array fetch 100 rows at a time (meaning - we don't have to do it ourselves, it happens already). Only if you use the harder:
open cursor_name;
loop
fetch
exit when
end loop
close cursor_name
would you have to do it yourself (all releases)
Bulk update
Srinivas, October 05, 2006 - 8:12 pm UTC
Tom,
Thanks for your service - it helps us a great deal in learning!
I have the following scenario -
declare
var1 number;
var2 number;
var3 number;
cursor c is
select col1,col2,col3 from tab@dblink where col3 > some value
MINUS
select col1,col2,col3 from tab@dblink where col3 > some value and col1=col2
order by col3;
begin
open c;
loop
fetch c into var1,var2,var3;
exit when c%notfound;
update TAB2
set col1=var1
where col2=var2;
end loop;
commit;
close c;
end;
/
This currently executes in 20 mins, I thought bulk binding is always better than the cursor for loop and I created 3 arrays like below -
TYPE col1_tab IS TABLE OF NUMBER(10);
TYPE col2_tab IS TABLE OF NUMBER(10);
TYPE col3_tab IS TABLE OF DATE;
t_col1 col1_tab;
t_col2 col2_tab;
t_col3 col3_tab;
and ran the following -
SELECT col1,col2,col3 BULK COLLECT into t_col1,t_col2,t_col3 from (SELECT col1,col2,col3 FROM tab1@link
WHERE col3> some value
MINUS
SELECT col1,col2,col3
FROM tab1@link
WHERE col3 > some value
AND col1= col2
ORDER BY col3);
FORALL i in 1 .. t_col1.COUNT
UPDATE TAB2
SET col1=t_col1(i)
where col2=t_col2(i);
COMMIT;
But even this approach took 20 minutes - absolutely no improvement!
Please clarify when it would be good to use a cursor for loop and when to use a bulk update (I don't see many examples for updates), and also please show if there is any other way you would do it.
Thanks,
Srinivas
October 06, 2006 - 8:46 am UTC
if 19.9 minutes of the processing is spent running the SQL statement, then bulk binding isn't going to help.
now, is that your REAL query - for real?
why order by? you don't need the order by.
why minus? why not just distinct and a single pass??? the logic is not getting through to me here.
seems that
select distinct col1, col2
from tab1@link
where decode( col1, col2, 0, 1 ) = 1
and col1 > some_value;
(distinct ONLY if needed, I added because MINUS does a distinct)
You really ought to TRACE IT to see where you might be spending your time, there are NO silver bullets remember, you have to fix the problem you are having - and if the problem you are having is "my strange query with minus is inefficient and takes 19 minutes", you will not fix it with array fetching.
the cursor with bulk collect and forall takes the same time
Srinivas, October 05, 2006 - 10:39 pm UTC
and also the following takes the same time -
begin
loop
fetch c bulk collect into t_var1,t_var2,t_var3 limit 100;
forall i in t_var1.FIRST .. t_var1.LAST
UPDATE TAB2
SET col1=t_var1(i)
where col2=t_var2(i);
COMMIT;
exit when c%notfound;
end loop;
commit;
close c;
end;
/
This is 9.2.0.7 version.
Thanks,
Srinivas
agreed...but
Srinivas, October 06, 2006 - 9:37 am UTC
Tom,
This is a real query. It updates the fact table after a load. I agree with your comments, but the whole thing finishes in about 2 minutes when I do:
create table temp_test as
SELECT survivor_id, merged_id, insert_date
FROM merged_company_lst_hist@cdb_select_link
WHERE insert_date >= TO_CHAR(SYSDATE - 2, 'DD-MON-YYYY')
MINUS
SELECT survivor_id, merged_id, insert_date
FROM merged_company_lst_hist@cdb_select_link
WHERE insert_date >= TO_CHAR(SYSDATE - 2, 'DD-MON-YYYY')
AND survivor_id = merged_id
ORDER BY insert_date
UPDATE truck_order_fact_test
SET customer_company_id = (select survivor_id from temp_test where truck_order_fact_test.customer_company_id=temp_test.merged_id)
WHERE exists (select merged_id from temp_test where truck_order_fact_test.customer_company_id = temp_test.merged_id )
This finishes from start to finish in about 2 mins. The query in CTAS returns about 1100 rows.
1. We know that SQL is faster than PL/SQL, but would it be so different?
2. If there is that much of a performance gain, would it be a good approach to create tables like I did and drop them after the operation is done, so that there is no additional maintenance either?
I will however try to rewrite the code as you suggest. This is something I tried and found a big performance gain from, so I wanted to get your opinion on it.
Thanks,
Srinivas
October 06, 2006 - 1:23 pm UTC
create global temporary table temp_test on commit delete rows
as
SELECT survivor_id, merged_id, insert_date
FROM merged_company_lst_hist@cdb_select_link
where 1=0;
alter table temp_test add constraint temp_test_pk primary key(merged_id);
and then:
insert into your global temporary table;
update (select a.customer_company_id, b.survivor_id
from truck_order_fact_test a,
temp_test b
where a.customer_company_id = b.merged_id)
set customer_company_id = survivor_id;
1) why not? why wouldn't it be? You need to find out WHERE TIME IS SPENT, then concentrate on fixing that. sql is going to be faster than plsql.
2) You don't want to create the table each time - use a global temporary table.
You can update a join...
You are a genius!
Srinivas, October 06, 2006 - 11:17 am UTC
Hi Tom,
I rewrote the query as you suggested and it works; however, the update is NOT any faster. It still takes 20 minutes. Just great that you could correct it just by looking at the code.
Please review the other method by which I could get this done in 2 minutes.
Thanks,
Srinivas
October 06, 2006 - 1:32 pm UTC
you sort of need to, well, do what I suggested.
trace it. see where the heck you are spending time.
trace files..
Srinivas, October 10, 2006 - 11:01 am UTC
Tom,
I am not able to work out why there is no difference when using bulk; can you please help me understand?
The tkprof while using bulk is -
DECLARE
TYPE v_survivor_id_tab IS TABLE OF NUMBER(10);
TYPE v_merged_id_tab IS TABLE OF NUMBER(10);
TYPE v_insert_date_tab IS TABLE OF DATE;
v_survivor_id v_survivor_id_tab;
v_merged_id v_merged_id_tab;
v_insert_date v_insert_date_tab;
BEGIN
select distinct survivor_id, merged_id, insert_date BULK COLLECT into v_survivor_id, v_merged_id, v_insert_date
from merged_company_lst_hist@cdb_select_link
where insert_date >= TO_CHAR(SYSDATE - 2, 'DD-MON-YYYY')
and
decode( survivor_id, merged_id, 0, 1 ) = 1;
FORALL i in v_survivor_id.FIRST .. v_survivor_id.LAST
UPDATE truck_order_fact_test
SET customer_company_id=v_survivor_id(i)
where customer_company_id=v_merged_id(i);
COMMIT;
--END LOOP;
-- COMMIT;
--CLOSE c_merged_cust;
END;
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.04 0.11 0 2 3 0
Execute 1 0.00 0.02 0 0 9 1
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 0.04 0.13 0 2 12 1
Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 31
********************************************************************************
SELECT DISTINCT SURVIVOR_ID, MERGED_ID, INSERT_DATE
FROM
MERGED_COMPANY_LST_HIST@CDB_SELECT_LINK WHERE INSERT_DATE >= TO_CHAR(SYSDATE
- 2, 'DD-MON-YYYY') AND DECODE( SURVIVOR_ID, MERGED_ID, 0, 1 ) = 1
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 0.03 0.72 0 0 0 1074
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 0.03 0.73 0 0 0 1074
Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 31 (recursive depth: 1)
********************************************************************************
UPDATE TRUCK_ORDER_FACT_TEST SET CUSTOMER_COMPANY_ID=:B1
WHERE
CUSTOMER_COMPANY_ID=:B2
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 1586.42 1860.44 7014009 9342726 0 0
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 1586.42 1860.44 7014009 9342726 0 0
Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 31 (recursive depth: 1)
Rows Row Source Operation
------- ---------------------------------------------------
0 UPDATE (cr=9342726 r=7014009 w=0 time=1860311032 us)
0 TABLE ACCESS FULL TRUCK_ORDER_FACT_TEST (cr=9342726 r=7014009 w=0 time=1860297962 us)
********************************************************************************
The tkprof output while using traditional cursor for loop is -
DECLARE
v_survivor_id NUMBER(10);
v_merged_id NUMBER(10);
v_insert_date DATE;
CURSOR c_merged_cust IS
SELECT survivor_id, merged_id, insert_date
FROM merged_company_lst_hist@cdb_select_link
WHERE insert_date >= TO_CHAR(SYSDATE - 2, 'DD-MON-YYYY')
MINUS
SELECT survivor_id, merged_id, insert_date
FROM merged_company_lst_hist@cdb_select_link
WHERE insert_date >= TO_CHAR(SYSDATE - 2, 'DD-MON-YYYY')
AND survivor_id = merged_id
ORDER BY insert_date;
BEGIN
OPEN c_merged_cust;
LOOP
FETCH c_merged_cust INTO v_survivor_id, v_merged_id, v_insert_date;
EXIT WHEN c_merged_cust%NOTFOUND;
UPDATE truck_order_fact_test
SET customer_company_id = v_survivor_id
WHERE customer_company_id = v_merged_id;
END LOOP;
COMMIT;
CLOSE c_merged_cust;
END;
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.13 0.23 0 2 3 0
Execute 1 1.05 0.94 0 0 9 1
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 1.18 1.17 0 2 12 1
Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 31
********************************************************************************
SELECT SURVIVOR_ID, MERGED_ID, INSERT_DATE
FROM
MERGED_COMPANY_LST_HIST@CDB_SELECT_LINK WHERE INSERT_DATE >= TO_CHAR(SYSDATE
- 2, 'DD-MON-YYYY') MINUS SELECT SURVIVOR_ID, MERGED_ID, INSERT_DATE FROM
MERGED_COMPANY_LST_HIST@CDB_SELECT_LINK WHERE INSERT_DATE >=
TO_CHAR(SYSDATE - 2, 'DD-MON-YYYY') AND SURVIVOR_ID = MERGED_ID ORDER BY
INSERT_DATE
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.01 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1075 0.83 3.29 0 0 0 578349
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 1077 0.83 3.30 0 0 0 578349
Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 31 (recursive depth: 1)
********************************************************************************
UPDATE TRUCK_ORDER_FACT_TEST SET CUSTOMER_COMPANY_ID = :B2
WHERE
CUSTOMER_COMPANY_ID = :B1
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1074 1072.74 1119.10 2190055 9342726 0 0
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 1075 1072.74 1119.10 2190055 9342726 0 0
Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 31 (recursive depth: 1)
Rows Row Source Operation
------- ---------------------------------------------------
0 UPDATE (cr=9342726 r=2190055 w=0 time=1118741628 us)
0 TABLE ACCESS FULL TRUCK_ORDER_FACT_TEST (cr=9342726 r=2190055 w=0 time=1118728166 us)
********************************************************************************
Thanks,
Srinivas
October 10, 2006 - 12:23 pm UTC
umm, are you missing an index there? it doesn't seem like it should full scan on that company id
Awesome!
Srinivas, October 10, 2006 - 2:13 pm UTC
Thanks much!
I created an index on company_id and that did the trick.
Thanks again!
-Srinivas
Better method
Robert Boutin, October 10, 2006 - 3:32 pm UTC
Hi,
I want to return a PL/SQL table as the result of a select from a PL/SQL function. This is a generic function, to avoid repeating the same query in different packages.
My two tables are v$session and v$process (columns with the same names).
What is the best method to do this (cursor, type, bulk collect, cursor for loop, ...)?
Ideally, I want to use select *.
October 10, 2006 - 8:18 pm UTC
why not return a ref cursor? that is the best way to return a result set from a stored procedure.
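A minimal sketch of that ref cursor approach - the function name and the v$session/v$process join are illustrative:

```sql
create or replace function get_sessions return sys_refcursor
as
    l_rc sys_refcursor;
begin
    open l_rc for
        select s.sid, s.serial#, p.spid
          from v$session s, v$process p
         where s.paddr = p.addr;
    return l_rc;   -- the caller fetches from the cursor directly;
                   -- nothing is copied into intermediate arrays
end;
/
```

The same function then serves every package (and any client tool that can fetch from a cursor), which is exactly the "generic, write the query once" goal described above.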
Bulk Collect and CAST
Chris, October 26, 2006 - 1:47 pm UTC
Tom,
I know you must use an OBJECT type in order to use TABLE and CAST to select data from a PL/SQL table. However, if the data in the PL/SQL table is going to come from a BULK COLLECT, which would presumably fetch into a PL/SQL type (a PL/SQL table of PL/SQL records), what is the best way to do this? I saw an example above where the SELECT statement could use CAST(MULTISET()), but something tells me I'm missing an easier way. There's obviously more code involved than this, but something along these lines:
OPEN c;
FETCH c BULK COLLECT INTO c_tab;
CLOSE c;
... do a bunch of stuff, then copy records to another
pl/sql table applying some SQL logic (ordering etc.) ...
SELECT *
BULK COLLECT INTO c_tab2
FROM TABLE(CAST(c_tab AS myTabType))
ORDER BY 4, 2;
... or get another list with distinct values from a column
SELECT DISTINCT(col1)
BULK COLLECT INTO c_tab3
FROM TABLE(CAST(c_tab AS myTabType));
Basically, once I get my results from a bulk collect, I want to create a couple of other PL/SQL tables from this data after it is massaged.
In my example I am only dealing with PL/SQL tables/types, except that I would have to declare a similar OBJECT type in order to do the CAST. Hence, I'd have what are essentially duplicate declarations (one object type plus one PL/SQL record and table of records).
How should this be done?
October 26, 2006 - 3:26 pm UTC
if you want to use SQL on the stuff, you'll be using SQL types - plain and simple.
bulk collect
Robert Hanrahan, November 15, 2006 - 6:18 am UTC
dear Tom,
I tried this code:
declare
type array is table of USAGE_PREV%rowtype index by binary_integer;
l_data array;
begin
select * bulk collect into l_data from USAGE_PREV;
forall i in 1 .. l_data.count
insert into USAGE_PREV values l_data(i);
commit;
end;
/
It gives this error:
ORA-04030: out of process memory when trying to allocate 16408 bytes (koh-kghu call ,pmuccst: adt/record)
ORA-06512: at line 6
the table has only 99999 records, is there a parameter I have to change?
November 15, 2006 - 7:29 am UTC
why would you do that?
basically, 4030 is returned when the operating system fails to allocate memory to us. Unless you have set ulimit on linux/unix, or you are running on windows (32-bit windows) where the one process is limited to a bit less than 2gb of ram - you have basically "run out of memory"
but that piece of code must be erased, it is entirely the wrong approach.
MERGE.....RETURNING BULK COLLECT ?
klabu, November 15, 2006 - 11:24 am UTC
<10gR2>
SQL> DECLARE
2 TYPE typ_tab_empno IS TABLE OF myemp.empno%TYPE;
3 l_tab typ_tab_empno ;
4 BEGIN
5 MERGE INTO myemp
6 USING( SELECT empno, 'COMEDIAN' col2 FROM myemp ) usesql
7 ON (myemp.empno = usesql.empno )
8 WHEN MATCHED THEN
9 UPDATE SET job = usesql.col2 ; --RETURNING empno BULK COLLECT INTO l_tab ;
10 END;
11 /
Is it possible to use "RETURNING...BULK COLLECT" clause (something like line 9) on a MERGE ?
thanks
</10gR2>
November 16, 2006 - 3:55 am UTC
returning is not part of the merge syntax at this time, no.
but in your case, you can just use an update of a join; no need for merge here.
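While MERGE has no RETURNING clause, a plain UPDATE does, so the keys of the affected rows can be handed back in one step. A sketch against the poster's myemp table (since every row was being matched, the MERGE collapses to a simple update):

```sql
declare
    type typ_tab_empno is table of myemp.empno%type;
    l_tab typ_tab_empno;
begin
    update myemp
       set job = 'COMEDIAN'
    returning empno bulk collect into l_tab;   -- empnos of all updated rows

    dbms_output.put_line( l_tab.count || ' rows updated' );
end;
/
```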
bulk collect
Robert Hanrahan, November 16, 2006 - 11:31 am UTC
sorry for the wrong approach, I'm not a coder :)
I changed the code to this:
declare
type array is table of T%rowtype index by binary_integer;
l_data array;
l_start number default dbms_utility.get_time;
cursor crs is select * from T;
begin
open crs;
loop
fetch crs bulk collect into l_data limit 10000;
forall i in 1 .. l_data.count
insert /*+ APPEND */ into T values l_data(i);
exit when crs%notfound;
end loop;
close crs;
dbms_output.put_line ( 'Elapsed ' || round( (dbms_utility.get_time-l_start)/100, 2 ) );
commit;
end;
/
this way, with the LIMIT clause, it seems to work
November 16, 2006 - 3:28 pm UTC
ugh, you missed the point entirely unfortunately :(
there should BE NO CODE, NONE, JUST AN INSERT.
and that append just makes you look bad :) It does *nothing* (thankfully, be very very thankful it is ignored)
Now, please - erase the code, just use an insert as select, period, that is all.
this is not about "not being a coder", this is about "not knowing the database and how to use it"
this screams "one insert statement, that is all"
bulk collect
Robert Hanrahan, November 17, 2006 - 7:50 am UTC
ok, code erased :>
So, all I have to do is;
insert into t select * from t;
Funny, this was my first approach actually!
someone else told me to do it the other way (pl/sql)
the table t has 516,917,350 rows and they want to get the data over a dblink. What do you think is the best way?
>and that append just makes you look bad :)
ugh? should I be worried? :)
Robert Hanrahan
November 17, 2006 - 8:12 am UTC
insert /*+ append */ into t select * from another_t;
probably - but for over half a billion records via a dblink, I hope you have a lot of patience.
bulk collect
Robert Hanrahan, November 17, 2006 - 1:06 pm UTC
patience? yes I have lots of it... :>
but I think that I'll use exp and imp instead.
thanks for the advice tom
Robert Hanrahan
Bulk Collect key value output
Raj, December 15, 2006 - 5:27 pm UTC
Hi Tom,
I am trying to do an update and an insert into a table:
create or replace procedure exam_po as
TYPE NumList IS TABLE OF NUMBER;
num_tab NumList := NumList(1,0,1,0,3);
lv_num NUMBER;
errors NUMBER;
dml_errors EXCEPTION;
PRAGMA exception_init(dml_errors, -24381);
BEGIN
loop
FORALL Upd IN num_tab.FIRST..num_tab.LAST SAVE EXCEPTIONS
UPDATE FROM emp WHERE sal > 500000/num_tab(Upd);
insert into exam values ( num_tab (Upd) );
commit;
end loop;
EXCEPTION
WHEN dml_errors THEN
errors := SQL%BULK_EXCEPTIONS.COUNT;
dbms_output.put_line('Number of errors is ' || errors);
FOR i IN 1..errors LOOP
sys.dbms_output.put_line('Error ' || i || ' occurred during '||
'iteration ' || SQL%BULK_EXCEPTIONS(i).ERROR_INDEX);
sys.dbms_output.put_line('Oracle error is ' || SQLERRM(-SQL%BULK_EXCEPTIONS(i).ERROR_CODE));
sys.dbms_output.put_line('Error '||SQLERRM(-SQL%BULK_EXCEPTIONS(i).ERROR_CODE)
||' encountered for emp '|| num_tab(i));
END LOOP;
end;
1) While trying to compile, I am getting the following errors:
ora-00904: invalid identifier
pls-00201: identifier 'UPD' must be declared
2) While trying to incorporate the dbms_output message below:
sys.dbms_output.put_line('The value of num_tab(Upd) is '||num_tab(Upd));
I get:
pls-00201: identifier 'UPD' must be declared
3) Apart from this, I want to track which empno row errored out.
Oracle Version is 9.2.0.4
Unix Environment
December 16, 2006 - 5:50 pm UTC
you cannot do "code" inside of a forall
forall is an extension to BULK EXECUTE a SINGLE SQL statement.
your code won't fly at all.
lose the commit - that is a really bad idea there.
and what is up with your infinite loop:
loop
FORALL ....
commit;
end loop;
it is really hard to comment on code that doesn't stand a chance of working even a tiny bit.
how about you try to specify the logic you intend, I see stuff like this:
FORALL Upd IN num_tab.FIRST..num_tab.LAST SAVE EXCEPTIONS
UPDATE FROM emp WHERE sal > 500000/num_tab(Upd);
and just say "huh, why would anyone even consider thinking about doing that - it doesn't seem to make sense"
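For what it's worth, a minimal sketch of how a FORALL with SAVE EXCEPTIONS is normally structured - one SQL statement driven over the whole collection, with no loop, no second statement and no commit inside it. The SET sal = sal clause is a placeholder (the original UPDATE had no SET clause at all); emp and num_tab are carried over from the question:

```sql
DECLARE
    TYPE NumList IS TABLE OF NUMBER;
    num_tab     NumList := NumList(1, 0, 1, 0, 3);
    dml_errors  EXCEPTION;
    PRAGMA EXCEPTION_INIT(dml_errors, -24381);
BEGIN
    -- FORALL bulk-executes exactly ONE DML statement; any procedural
    -- logic (the insert into exam, the commit) has to live outside it
    FORALL i IN num_tab.FIRST .. num_tab.LAST SAVE EXCEPTIONS
        UPDATE emp SET sal = sal WHERE sal > 500000 / num_tab(i);
EXCEPTION
    WHEN dml_errors THEN
        FOR j IN 1 .. SQL%BULK_EXCEPTIONS.COUNT LOOP
            dbms_output.put_line(
                'iteration ' || SQL%BULK_EXCEPTIONS(j).ERROR_INDEX ||
                ' (num_tab value ' ||
                num_tab(SQL%BULK_EXCEPTIONS(j).ERROR_INDEX) || ')' ||
                ' failed with ' ||
                SQLERRM(-SQL%BULK_EXCEPTIONS(j).ERROR_CODE));
        END LOOP;
END;
/
```

With this data, iterations 2 and 4 raise ORA-01476 (divisor is equal to zero), and ERROR_INDEX points back at the offending collection element - which answers question 3 above.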
Bulk collect output values
Raj, December 17, 2006 - 8:45 pm UTC
You are right.. I agree with you and corrected it.
I have the following questions:
1.How can I display or insert into a table of the element value(here num_tab(i))
2. In my exception, I have given the following output
sys.dbms_output.put_line('Error
'||SQLERRM(-SQL%BULK_EXCEPTIONS(i).ERROR_CODE)
||' encountered for emp '|| num_tab(i));
I was expecting the value 0 for the second and fourth records during processing, because the second and fourth values assigned in num_tab are zeros. I was getting ORA-01476: divisor is equal to zero, which was expected.
But I am not getting the appropriate values of num_tab(i) when displaying them in the exception handler.
Reply would be appreciated.
December 17, 2006 - 8:51 pm UTC
1)
dbms_output.put_line( num_tab(i) );
or
insert into t (x) values ( num_tab(i) );
2) your code is not functional at all - it makes no sense. I won't be commenting on it, it does not compile, not worth talking about.
When to commit
Synnove, December 27, 2006 - 6:52 am UTC
Hi Tom,
If I bulk collect 100 records at a time, loop through them and do some updates, capture any exceptions, exit out of the cursor and then get the next 100 records -
when do you see it as feasible to commit? I thought it would be best to do it after each 100 records: if you have processed, say, 1000 records and the 1001st fails, you will have lost them all.
So:
OPEN cursor;
LOOP
FETCH cursor BULK COLLECT INTO collection LIMIT 100;
FOR i IN 1..collection.COUNT
LOOP
BEGIN
-- Processing including DML to same table
EXCEPTION
END;
END LOOP;
COMMIT;
EXIT WHEN cursor%NOTFOUND;
END LOOP;
CLOSE cursor;
December 28, 2006 - 9:21 am UTC
unless your process is RESTARTABLE (e.g. if and when it fails halfway through, you can safely restart it without reprocessing records) - you cannot commit in the for loop.
It is feasible to commit when your logic is such that everything is OK in all cases - errors and otherwise.
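One common way to make such a batch restartable is to record progress in the driving table itself - a sketch, assuming a hypothetical big_table with a processed_flag column you would add for this purpose:

```sql
DECLARE
    CURSOR c IS
        SELECT t.rowid rid, t.*
          FROM big_table t
         WHERE processed_flag = 'N';
    TYPE rec_tab_t IS TABLE OF c%ROWTYPE;
    l_recs  rec_tab_t;
BEGIN
    OPEN c;
    LOOP
        FETCH c BULK COLLECT INTO l_recs LIMIT 100;
        FOR i IN 1 .. l_recs.COUNT LOOP
            -- ... process l_recs(i) here ...
            UPDATE big_table
               SET processed_flag = 'Y'
             WHERE rowid = l_recs(i).rid;
        END LOOP;
        COMMIT;  -- safe only BECAUSE a restart re-selects just the 'N' rows
        EXIT WHEN c%NOTFOUND;
    END LOOP;
    CLOSE c;
END;
/
```

Note this commits across fetches, so a long run can still hit ORA-01555 - but since a rerun picks up exactly where it left off, that is an annoyance rather than a disaster.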
A reader, December 28, 2006 - 10:38 am UTC
And what about committing across fetches? Even if the process is restartable, I would not advise committing inside the loop.
I would prefer this
LOOP -- main loop
v_record_number := 0;
OPEN cursor;
LOOP
FETCH cursor BULK COLLECT INTO collection LIMIT 100;
FOR i IN 1..collection.COUNT
LOOP
BEGIN
v_record_number := v_record_number + 1;
-- Processing including DML to same table
EXCEPTION
END;
END LOOP;
EXIT;
END LOOP;
CLOSE cursor;
COMMIT; -- commit outside the loop
if v_record_number < 100 THEN
EXIT;
end if;
END LOOP; -- end of the main loop
COMMIT;
December 29, 2006 - 9:02 am UTC
correct, it'll increase the odds of an ORA-01555, but as long as the process is "restartable", that is just an annoyance
if the process is not restartable, it is fatal.
bulk collect into table type
Kumar, March 16, 2007 - 7:03 am UTC
Tom:
Can we use bulk collect to load data into a table type created on an object? My code is mentioned below:
SQL> create table test1 (no number, effdt date);
Table created.
SQL> insert into test1 values (1,sysdate);
1 row created.
SQL> insert into test1 values (1,sysdate+1);
1 row created.
SQL> commit;
Commit complete.
SQL> create or replace type test1_obj as object (no number, effdt date)
2 /
Type created.
SQL> create or replace type test1_tab is table of test1_obj
2 /
Type created.
SQL> create or replace procedure testproc
2 (
3 p_test1_tab out test1_tab
4 )
5 is
6 begin
7 select no, effdt bulk collect into p_test1_tab.no, p_test1_tab.effdt from test1 order by no;
8 end;
9 /
Warning: Procedure created with compilation errors.
SQL> sho err
Errors for PROCEDURE TESTPROC:
LINE/COL ERROR
-------- -----------------------------------------------------------------
7/3 PL/SQL: SQL Statement ignored
7/50 PLS-00302: component 'NO' must be declared
7/72 PL/SQL: ORA-00904: : invalid identifier
March 17, 2007 - 2:49 pm UTC
ops$tkyte%ORA10GR2> create or replace procedure testproc
2 (
3 p_test1_tab out test1_tab
4 )
5 is
6 begin
7 select test1_obj( no, effdt )
8 bulk collect into p_test1_tab
9 from test1 order by no;
10 end;
11 /
Procedure created.
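To consume the collection the corrected procedure returns, something along these lines works (a sketch):

```sql
DECLARE
    l_tab  test1_tab;
BEGIN
    testproc(l_tab);
    FOR i IN 1 .. l_tab.COUNT LOOP
        dbms_output.put_line(
            l_tab(i).no || ' ' || to_char(l_tab(i).effdt, 'DD-MON-YYYY'));
    END LOOP;
END;
/
```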
Help
Rajesh, April 23, 2007 - 8:33 am UTC
Hi Tom,
consider the below requirement.
SQL> create table test(dt date)
2 /
Table created.
SQL> create table test_stg as select * from test where 1 = 0
2 /
Table created.
SQL> insert into test select sysdate-5+(rownum) from obj where rownum <= 5
2 /
5 rows created.
Using this data, if I insert into test_stg for the first time it must
insert all rows into the test_stg table, and subsequent inserts must insert
only the new data, not the entire set.
SQL> insert into test_stg select * from test -- This must insert all 5 rows.
Adding a new row into test table
SQL> insert into test values (sysdate+1)
/
SQL> insert into test_stg select * from test -- This insert must put ONLY the above row
AND NOT ALL rows.
How to achieve this?
OK
Rajesh, April 30, 2007 - 4:52 am UTC
Hi Tom,
Any help please?
April 30, 2007 - 9:40 am UTC
don't do it.
You really don't tell us the goal here, what you are trying to achieve, the net result.
You have a "solution", but no problem statement here.