Skip to Main Content

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question.

Asked: September 17, 2001 - 5:13 pm UTC

Answered by: Tom Kyte - Last updated: May 04, 2016 - 1:12 am UTC

Category: Database - Version: 817

Viewed 100K+ times! This question is

You Asked

Hi Tom
I am executing a query that fetches from a cursor first then does an insert based on the number of matching records as follows

CREATE OR REPLACE PROCEDURE test AS
Dt date;
DelID Integer;
LocID Integer;
CURSOR c1 IS select po_id, po_dt, to_number(name)
from Order, Location
Where Order.id = Location.id;
BEGIN
OPEN c1;
LOOP
FETCH c1 Into DelID, Dt, LocID;
EXIT WHEN C1%NOTFOUND;

Insert into Results
select DelID, NVal, prod_id, NULL, qty,
decode(qty, 0, 0, cost/qty), 0,
NULL, 1
from subs, Inv
Where subs.id = LocID
And v_dt = Dt
And Subs.id = Inv.inv_id;

COMMIT;

END LOOP;
CLOSE C1;
END;
/

The subs table contains 128 million records
Inv table contain 40000 records .
Cursor c1 returns 1.3 million records.
This insert has taken 3 days to insert just 4 million records and there are way more than 4 million records.
Im thinking of using direct insert :Insert /* + Append */. Is this going to help me?
I have another question on the placement of commit, I think in hte above query it is getting committed at every one insert , is this right thing to do?
Also should I disable the Index on the Results tables and rebuild it later?
Is there any other way to optimize the above operation?

Thanks
Steve


and we said...

The fastest would be to disable the indexes (mark them unusable) and do this in a SINGLE insert:

insert /*+ append */ into results
select delId, nval, prod_id, null, qtr, decode(qtr,0,0,cost/qty), 0, null, 1
from subs, inv, ( select po_id DelID, po_dt dt, to_number(name) LocID
from order, location
where order.id = location.id ) c1
where subs.id = c1.locId
and v_dt = c1.dt
and subs.id = inv.inv_id;

commit;

and rebuild the indexes using UNRECOVERABLE (and maybe even parallel).

I hate committing in a cursor for loop -- wastes resources, takes longer, generally inefficient.

I really don't like doing procedural code where a single statement will do the work for me.


and you rated our response

  (48 ratings)

Is this answer out of date? If it is, please let us know via a Review

Reviews

September 17, 2001 - 11:47 pm UTC

Reviewer: A reader

Hi Tom
I have altered the table to be in NOLOGGING mode , would this single stmt insert without being in a loop fail because of not being committed
ORA-01555 "snapshot too old

How do I commit after every 10000 records?

Thanks

Tom Kyte

Followup  

September 18, 2001 - 9:08 am UTC

The insert might fail with a 1555 because the query:

select delId, nval, prod_id, null, qtr, decode(qtr,0,0,cost/qty), 0, null, 1
from subs, inv, ( select po_id DelID, po_dt dt, to_number(name) LocID
from order, location
where order.id = location.id ) c1
where subs.id = c1.locId
and v_dt = c1.dt
and subs.id = inv.inv_id;


could not get a consistent read. This would be caused by OTHER sessions modifying subs, inv, order, and location however.

You committing inside of a loop would not prevent the 1555 (in fact, the #1 cause of a 1555 is committing inside of a loop with an open cursor).


Your loop will fail with an ora1555 just as easily as the single insert would. It would indicate that your rollback segments wrap faster then the longest running query you have. The solution for that is to size your rollback segments bigger, you size RBS based not only on your transaction size but your QUERIES as well.

but..

September 18, 2001 - 1:07 am UTC

Reviewer: Nag from USA

If we commit every 10,000 records or so, there is chance that we might still get snapshot too old error. This is because when yu open your cursor, it gets the records as of that point of time. In between when you are working with y our , cursor , if some else changes your records, You will lose your data integrity in the cursor, as a commit, will clear the blocks you had for your records in the rollback segment.

Tom, your comments, and what is the solution.

Tom Kyte

Followup  

September 18, 2001 - 9:10 am UTC

See above.

You will NOT lose data integrity. You get a consistent read from Oracle. The result set is fixed in stone at the point of time the statement begins..

What is unrecoverable option

September 18, 2001 - 1:54 am UTC

Reviewer: Nag from USA

Tom

Kindly explain the 3 phrases you have used in the given context.

wastes resources,
takes longer,
generally inefficient.


Tom Kyte

Followup  

September 18, 2001 - 9:19 am UTC

Unrecoverable/Nologging allows some operations to take place without generating redo information.

The direct path insert using the /*+ APPEND */ hint is one of those.  A create table as select can be another.  An index build yet another.

I say committing in a loop wastes resources because people generally do it in the belief that it SAVES resources.  It has the opposite effect.  You'll generate MORE REDO this way, you'll WAIT longer and more frequently on LGWR, and you'll generally use more system resources.

It takes longer -- well, thats self explainatory.  It takes LONGER.  


Here is an exmaple that sums it up nicely.  

ops$tkyte@ORA817DEV.US.ORACLE.COM> create or replace view stats
  2  as select 'STAT...' || a.name name, b.value
  3        from v$statname a, v$mystat b
  4       where a.statistic# = b.statistic#
  5      union all
  6      select 'LATCH.' || name,  gets
  7        from v$latch;

View created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> drop table t1;
Table dropped.

ops$tkyte@ORA817DEV.US.ORACLE.COM> create table t1 ( x int, y char(2000) );
Table created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> declare
  2      l_x     number := 0;
  3      l_start number;
  4  begin
  5      insert into run_stats select 'before', stats.* from stats;
  6  
  7      l_start := dbms_utility.get_time;
  8      for i in 1 .. 1000
  9      loop
 10                  insert into t1 values ( 1, 'x' );
 11                  commit;
 12      end loop;
 13      dbms_output.put_line( (dbms_utility.get_time-l_start) || ' hsecs' );
 14  
 15      insert into run_stats select 'after 1', stats.* from stats;
 16  
 17      l_start := dbms_utility.get_time;
 18      for i in 1 .. 1000
 19      loop
 20                  insert into t1 values ( 1, 'x' );
 21      end loop;
 22      commit;
 23      dbms_output.put_line( (dbms_utility.get_time-l_start) || ' hsecs' );
 24  
 25      insert into run_stats select 'after 2', stats.* from stats;
 26  end;
 27  /
190 hsecs
27 hsecs

PL/SQL procedure successfully completed.

<b>Ok, so it is CLEAR that it is faster to commit once instead of committing lots... Look at the resources now</b>

ops$tkyte@ORA817DEV.US.ORACLE.COM> select a.name, b.value-a.value run1, c.value-b.value run2,
  2         ( (c.value-b.value)-(b.value-a.value)) diff
  3    from run_stats a, run_stats b, run_stats c
  4   where a.name = b.name
  5     and b.name = c.name
  6     and a.runid = 'before'
  7     and b.runid = 'after 1'
  8     and c.runid = 'after 2'
  9     and (c.value-a.value) > 0
 10     and (c.value-b.value) <> (b.value-a.value)
 11     and a.name like '%redo%'
 12   order by abs( (c.value-b.value)-(b.value-a.value))
 13  /

NAME                                           RUN1       RUN2       DIFF
---------------------------------------- ---------- ---------- ----------
STAT...redo buffer allocation retries             1          0         -1
STAT...redo log space requests                    2          0         -2
STAT...redo log space wait time                 137          0       -137
LATCH.redo writing                             1031          3      -1028
STAT...redo entries                            3578       2124      -1454
LATCH.redo allocation                          3592       2132      -1460
STAT...redo size                            2728780    2345248    -383532

7 rows selected.

The commit once logic used less of everything, time, latches (locks), redo space -- everything. 

Iam saying exactly what you are saying

September 18, 2001 - 9:35 am UTC

Reviewer: Follow up from USA

Tom

"You will NOT lose data integrity. You get a consistent read from Oracle. The result set is fixed in stone at the point of time the statement begins.."

Yes, you do get result set which is fixed in stone. But when you loop, and do some operations, and commit, for every 10,000 records, aint the blocks in the rollback segment for this cursor, get cleared. i.e. if someone else has changed the data selected by your cursor , and committed.


I had enough of snapshot too old -- lets beat it to death

September 18, 2001 - 9:49 am UTC

Reviewer: A reader from USA

Expand on

"It would indicate that your rollback segments wrap faster then the longest running query you have."

and

For some reasons, the dba's are reluctant to assgin a separate rollback segment for each indivudual process, is there any other solution for this.

What can be the dba's concerns.

I have tried to enquire, but either they dont answer, or the issue gets into bureaucracy, from one desk to other.

Here is one frustrated developer, what do I need to do. Most of my processses do what had been discussed here.

September 18, 2001 - 11:19 am UTC

Reviewer: A reader

Hi Tom
The single insert you suggested does not execute and hangs from the sqlplus ,if I assume it is processing then it means it is taking way longer than the insert using the cursor and looping through it.

Also I have noticed that even though we are using the APPEND hint the transaction still generates redo , as I monitored the Redo Size and Redo Entries for the session executing the Insert.
What I am doing in the cursor is a simple join and then the Insert with the Append hint.

CURSOR c1 IS select po_id, po_dt, to_number(name)
from Order, Location
Where Order.id = Location.id;

I also altered the Results table in Nologging mode but it still generates Redo.

Thanks
Steve


Tom Kyte

Followup  

September 18, 2001 - 11:28 am UTC

The query is processing, you can use v$session_longops to monitor its progress.

Since you just got this advice yesterday and you said:

This insert has taken 3 days to insert just 4 million records and there are way more than 4 million records.....

I don't see how you could have determined that its taking longer. I still have 2 days to go don't I? :)

You can take steps to TUNE the query itself -- I just merged your query into it, write it differently after analyzing its execution plan.

The redo being generated will be from the INDEXES on the table. They generate redo. Suggestion was to disable them, bulk load, rebuild them unrecoverable.

Huge Inserts

September 19, 2001 - 2:07 am UTC

Reviewer: Raj from India

Hi Mr.Tom,
can i use Commit like this?. Is cursor for loop itself has implicit commit?.Please let me know

BEGIN
for record1 in cursor1
LOOP
huge insert statements;
END LOOP;
BEGIN
for record2 in cursor2
LOOP
huge insert statements;
END LOOP;
COMMIT;
END;

Tom Kyte

Followup  

September 19, 2001 - 7:23 am UTC

there are no implicit commits in PLSQL. Some SQL statements perform a commit (all DDL) but normally -- there are no implicit commits.

As I've said many many times -- if you can avoid the loop and just do an insert into t select * from something_else -- you'll be better off.

Yes, you can do the above if you want.

Alternatively, use some NDS

September 19, 2001 - 12:00 pm UTC

Reviewer: Keith Miller from UK

I have a "bulk insert" template I use which I've adapted for your query (see below). Be interested to know what you think of this solution Tom.

You could always hash partition the the results table into x partitions, create x seperate bits of code that will select x seperate lots of data and insert into seperate the partitions of the results table. You could then kick off all those inserts at once.

DECLARE
CURSOR cursor1 IS
SELECT /*+ FIRST_ROWS */
delId,
nval,
prod_id,
null,
qtr,
DECODE(qtr,0,0,cost/qty),
0,
null,
1
FROM subs, inv, (SELECT po_id DelID,
po_dt dt,
TO_NUMBER(name) LocID
FROM order, location
WHERE order.id = location.id ) c1
WHERE subs.id = c1.locId
AND v_dt = c1.dt
AND subs.id = inv.inv_id;

field1_array DBMS_SQL.VARCHAR2_TABLE; -- or DBMS_SQL.NUMBER_TABLE or DBMS_SQL.DATE_TABLE
field2_array DBMS_SQL.VARCHAR2_TABLE; -- or DBMS_SQL.NUMBER_TABLE or DBMS_SQL.DATE_TABLE
field3_array DBMS_SQL.VARCHAR2_TABLE; -- or DBMS_SQL.NUMBER_TABLE or DBMS_SQL.DATE_TABLE
field4_array DBMS_SQL.VARCHAR2_TABLE; -- or DBMS_SQL.NUMBER_TABLE or DBMS_SQL.DATE_TABLE
field5_array DBMS_SQL.VARCHAR2_TABLE; -- or DBMS_SQL.NUMBER_TABLE or DBMS_SQL.DATE_TABLE
field6_array DBMS_SQL.VARCHAR2_TABLE; -- or DBMS_SQL.NUMBER_TABLE or DBMS_SQL.DATE_TABLE
field7_array DBMS_SQL.VARCHAR2_TABLE; -- or DBMS_SQL.NUMBER_TABLE or DBMS_SQL.DATE_TABLE
field8_array DBMS_SQL.VARCHAR2_TABLE; -- or DBMS_SQL.NUMBER_TABLE or DBMS_SQL.DATE_TABLE
field9_array DBMS_SQL.VARCHAR2_TABLE; -- or DBMS_SQL.NUMBER_TABLE or DBMS_SQL.DATE_TABLE

v_rowcount PLS_INTEGER;
v_bulk_collect_limit PLS_INTEGER := 50000;

BEGIN
v_rowcount := 0;

OPEN cursor1;

LOOP
FETCH cursor1 BULK COLLECT
INTO field1_array,
field2_array,
field3_array,
field4_array,
field5_array,
field6_array,
field7_array,
field8_array,
field9_array
LIMIT v_bulk_collect_limit;

IF cursor1%NOTFOUND AND v_rowcount = cursor1%ROWCOUNT
THEN
EXIT;
ELSE
v_rowcount := cursor1%ROWCOUNT;
END IF;

FORALL indx IN field1_array.FIRST..field1_array.LAST
INSERT /*+ APPEND */ INTO results
VALUES
(field1_array(indx),
field2_array(indx),
field3_array(indx),
field4_array(indx),
field5_array(indx),
field6_array(indx),
field7_array(indx),
field8_array(indx),
field9_array(indx));

COMMIT;
END LOOP;
CLOSE cursor1;
END;
/


Tom Kyte

Followup  

September 19, 2001 - 5:15 pm UTC

Two things -- it doesn't use NDS (assume that means native dynamic sql?) and the /*+ append */ hint is ignored unless it is an "INSERT ... SELECT", "INSERT ... VALUES" can never use a direct path insert.

It is better then row at a time -- not as good as just doing it in one statement. 50,000 rows could tend to eat gobs of RAM as well -- it's adjustable I see....

How about compromise - say, COMMIT every few hundred-thousand rows?

September 19, 2001 - 7:13 pm UTC

Reviewer: MDz from CA, USA

Knowing the distribution of data, it would be possible to run this insert in batches. Something along these lines:

BEGIN
FOR b IN 1..CEIL(:maxPOId/100000) LOOP
insert /*+ append */ into results
select delId, nval, prod_id, null, qtr,
decode (qtr,0,0,cost/qty), 0, null, 1
from subs, inv, ( select po_id DelID, po_dt dt,
to_number(name) LocID
from order, location
where order.id = location.id
AND PO_ID BETWEEN (b-1)*100000
AND B*100000-1) c1
where subs.id = c1.locId
and v_dt = c1.dt
and subs.id = inv.inv_id;
commit;
end loop;
end;

With a little more work on this script, it could be modified to re-run after a failure, and continue from where it left off. The only downside I see in doing this is that the SELECT for insert may take something like a full table scan/hash join in every iteration, which would then slow down the process...

Thanks for the feedback Tom

September 20, 2001 - 6:34 am UTC

Reviewer: Keith Miller

I dind't mean NDS, meant new bulk processing features, but it was in the middle of a very busy day (that's my excuse and I'm sticking to it!)
I didn't realise about the APPEND hint not working there, so that's useful.

September 20, 2001 - 8:04 pm UTC

Reviewer: Steve

Seems to me this insert is going to take forever to load this data.
I am thinking if I could use Sql loader instead of this insert to load my results table. But Im not too proficient in it.
Im thinking before running the loader to do the following steps

Ill put the result of this cursor into a tmp table

Instead of :
CURSOR c1 IS select po_id, po_dt, to_number(name)
from Order, Location
Where Order.id = Location.id;
Ill use:
Insert into tmp(po_id,po_dt,ord_id)
select po_id, po_dt, to_number(name)
from Order, Location
Where Order.id = Location.id;

I have the huge (128 million rows) SUBS table's data in .dat files. I can formulate the ctl file ,of
course with your help as follows. I dont know right syntax for the following:

load data
infile 'x.dat'
APPEND
into table results
fields terminated by ','
(DelID, (this is the po_id from the tmp table?)
NVal, (this is from a sequence)
prod_id, (this is a column in the .dat file for the subs table)
null (this would be default),
qty, (this is a column in the .dat file for the subs table)
decode(qty, 0, 0, cost/qty), (cost is also a column in the .dat file for the subs table. How to do this?)
0 (this would be default),
NULL (this would be default),
1 (this would be default))

How can I ensure in the ctl file the following:
Where subs.id = tmp.ord_id
And v_dt = tmp.po_dt
So that it loads from the .dat file only matching records between subs data & tmp table rows into the results table
I dont know how to do the above , I would greatly appreciate your help.


A long time, well yes of course it is

September 24, 2001 - 12:05 pm UTC

Reviewer: Keith Miller

Steve,

Of course your load is going to take a while. You are looking at 128 million rows. However you write it even the most efficient code in the world is going to take time to process all this data.

If you are going to use sql*loader, the only way to make that quick is to use a direct path load, but even then you are going to find that it will take time.

Look at hash partitioning your results table and then running several processes in parallel. And before dismissing any of the suggestions out of hand I suggest you try a few.

But remember, this is not a small amount of data so don't expect to find anything that will run in ten minutes!

And furthermore

September 24, 2001 - 12:22 pm UTC

Reviewer: Keith Miller

Just re-read your SQL*Loader query and there is no way you can join the data in the flat file to a table within SQL*Loader. It just doesn't work that way. SQL*Loader can only get data from the file you supply it, not external sources.

You could write the combined results to a flat file so you effectively have the whole results table in flat file format. Then you could use direct path load to load this in. You are not able to apply functions to fields when using direct path so you will have to have the data in the format you wish it to be inserted.

I just can't see though how this could possibly be faster than running an update in a single process.

Face the facts - you have a large load to do, it's going to take time. If that is not acceptable to your superiors then they obviously have no understanding about how long things take with very large volumes of data.

Your Suggestion Required

April 30, 2002 - 10:12 am UTC

Reviewer: pawan from USA

Tom,
I have gone through various answers on your site and think that I have understood why we get ORA 01555 error. I am running into ORA 0155 while running the following query. Can you please suggest an alternative or ways to avoid the error. Also how do I figure out how much rollback will be needed if I know that the SELECT part of the query will result in 210 million records. The SOURCE TABLES ( SOURCE_TB_A and SOURCE_TB_B ) have primary keys on them I have disabled other Indexes. An example to find the RBS required would be great help. Suppose I log on to SQL*plus create similar source tables A & B with only 1000 records and run tghis Insert statement and query the USED_UBLK can I approximate the RBS requirement.
Here is my query

INSERT INTO TARGET_TABLE
SELECT /*+ USE_NL(X, S) INDEX(X) INDEX(S) */
X.XPONENT_CAT,
X.RX_PAYMENT_TP,
X.YR,
X.MTH,
X.DAY,
X.WEEK,
X.TRI,
X.QTR,
X.MKT_ID,
X.IMS_PRESCRIBER_NUM,
X.PRD_GRP_ID,
X.IMS_PAYER_ID,
X.IMS_PLAN_ID,
X.AS_OF_DT,
X.IMS_MTH_NUM,
X.CMTY_NRX_CNT_PTRK,
X.CMTY_TRX_CNT_PTRK,
X.CMTY_NRX_DOLL_PTRK,
X.CMTY_TRX_DOLL_PTRK,
X.CMTY_NRX_VOL_PTRK,
X.CMTY_TRX_VOL_PTRK,
X.CMTY_NRX_THPY_PTRK,
X.CMTY_TRX_THPY_PTRK,
S.TERR_ID
FROM
SOURCE_TB_B S, SOURCE_TB_A X
WHERE
S.IMS_PRESCRIBER_ID=X.IMS_PRESCRIBER_NUM
AND X.MKT_ID =$$MKT_ID
AND EXISTS
(SELECT 'X'
FROM SLS_FORCE_MKT_TB SF
WHERE SF.MKT_ID = X.MKT_ID
AND SF.SLS_FORCE_TP_ID = S.SLS_FORCE_TP_ID)


Thanks for your help


Tom Kyte

Followup  

April 30, 2002 - 11:40 am UTC

You need enough rollback permanently configured so your RBS do not wrap in the time it takes to run that query.

If you estimate it'll take 1 hour (or whatever) to run that query, you need to make sure you don't wrap rollback in 1 hour. In 9i, you would use an undo tablespace with a 1 hour, 15 minute retention period. In 8i and before, you need to monitor v$rollstat to see how often you are currently wrapping and size your RBS upwards to make it so you wrap every 1 hour 15 minutes instead.

Huge insert!

April 06, 2004 - 3:15 pm UTC

Reviewer: Janet A from Atlanta, GA USA

Hi Tom.
I'm using a package to batch load data. I originally have a cursor and then do the insert when I call the cursor in my FOR LOOP. So it was slow. I tried to run it in a separate sql INSERT INTO /*+ APPEND PARALLEL ... table SELECT .. FROM.. and it was fast. Now I tried to remove my cursor, and paste the INSERT INTO..SELECT.. sql in the package. However it is still slow. Does it have something to do with running it in a package?
Thanks!



Tom Kyte

Followup  

April 07, 2004 - 8:45 am UTC

tkprof it and compare the two. tkprof shows all.

Huge Insert

April 06, 2004 - 4:01 pm UTC

Reviewer: Janet A from Atlanta, GA USA

Additional info.
When I ran my insert outside the package, I hardcoded the parameter dates. This is the query that inserted millions of rows fast.
But when I copied in the package, and replaced the dates with variables, it was slow. I changed the dates in the package and hardcoded my inputs and they ran fast.
Any idea why?
I need them in variables because I will need to run it every month on batch by passing the parameter dates.
Thanks.

Tom Kyte

Followup  

April 07, 2004 - 8:48 am UTC

You can definitely use native dynamic sql for this. for large, batch (data warehouse like things) that are run infrequently -- not using binds is acceptable.




Does the direct-load insert work with database link?

June 10, 2004 - 6:47 pm UTC

Reviewer: A reader from Herndon, VA USA

Tom,

Does the direct-load insert (using hint of append and parallel in insert and select statement) work with a database link?

Thanks as always.

Tom Kyte

Followup  

June 10, 2004 - 8:18 pm UTC

yes.

Thanks and more ...

June 11, 2004 - 8:48 am UTC

Reviewer: A reader from Herndon, VA USA

Tom,

Thanks for your quick response. Yesterday I post another question in the web site but forgot to bookmark it. Now I could not find it by searching the key words. Would you mind to give me the URL if you can find it, or answer it here? The topic of the question is: why does the index is not picked up by Oracle optimizer even a hint is used in the select statement. The example of the query is as following:

SELECT /*+ index (outlet_dstservices OUTLET_DSTSERVICES_IX) */
od.corp_id, od.house_id, od.cust_num, od.outlet, cs.service_id, COUNT(*)
FROM CORP_SERVICES cs, OUTLET_DSTSERVICES od
WHERE cs.CORP_ID = od.CORP_ID
AND cs.RATE_CODE = od.RATE_CODE
AND cs.CORP_ID in (select corp_id from INIT_LOAD_CORP)
GROUP BY od.corp_id, od.house_id, od.cust_num, od.outlet, cs.service_id;

Sorry for the extra work. Thanks as always.

Tom Kyte

Followup  

June 11, 2004 - 4:13 pm UTC

take your email address and plug it into "your questions" -- tab on the home page.

ORA-12840 when direct-load insert

June 11, 2004 - 10:37 am UTC

Reviewer: A reader from Herndon, VA USA

Tom,

I tried as following. The data was populated, but I got ORA-12840 and SP2-0612 errors in the execution plan. Could you please explain why? Thanks.

SQL> rollback;

Rollback complete.

Elapsed: 00:00:00.04
SQL> alter session enable parallel dml;

Session altered.

Elapsed: 00:00:00.00
SQL> alter table outlet_dstservices_tmp nologging;

Table altered.

Elapsed: 00:00:00.08
SQL> insert /*+ append parallel (tmp, 12) */
  2  into outlet_dstservices_tmp tmp
  3  select /*+ append parallel (od, 12) */
  4  * from outlet_dstservices@dblinksun240 od;

3368858 rows created.

Elapsed: 00:00:46.07

Execution Plan
----------------------------------------------------------
ERROR:
ORA-12840: cannot access a remote table after parallel/insert direct load txn


SP2-0612: Error generating AUTOTRACE EXPLAIN report

Statistics
----------------------------------------------------------
        199  recursive calls
        322  db block gets
         43  consistent gets
          9  physical reads
      15464  redo size
        608  bytes sent via SQL*Net to client
        656  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          6  sorts (memory)
          0  sorts (disk)
    3368858  rows processed
 

Tom Kyte

Followup  

June 11, 2004 - 4:25 pm UTC

[tkyte@localhost tkyte]$ oerr ora 12840
12840, 00000, "cannot access a remote table after parallel/insert direct load txn"
// *Cause: Within a transaction, an attempt was made to perform distributed
// access after a PDML or insert direct statement had been issued.
// *Action: Commit/rollback the PDML transaction first, and then perform
// the distributed access, or perform the distributed access before the
// first PDML statement in the transaction.

you cannot touch that table until you commit. explain plan, et.al.


Re: ORA-12840 when direct-load insert

June 14, 2004 - 12:03 pm UTC

Reviewer: A reader from Herndon, VA USA

Tom,

What I don't understand here is I didn't touch both outlet_dstservices_tmp and outlet_dstservices tables before I committed the insert. Why do I get the error?

I have another question on parallel_min_servers and parallel_max_servers initial parameters:

How much memory will Oracle allocate for each one I specified in parallel_min_servers, say, if I set parallel_min_servers = 10 and parallel_max_servers = 20? Will the memory constantly dedicate to those parallel servers even though there is no parallel DML? Will it impact overall performance?

Thanks as always.

Tom Kyte

Followup  

June 14, 2004 - 1:03 pm UTC

you insert appended into 

SQL> insert <b>/*+ append parallel (tmp, 12) */
  2  into outlet_dstservices_tmp tmp</b>
  3  select /*+ append parallel (od, 12) */
  4  * from outlet_dstservices@dblinksun240 od;


that is all it took (just fyi, the append in the select is just a comment, not doing anything)


PGA memory is allocated for PQ slaves in the same fashion as they would be for any server process.  It is allocated as needed (as used) and can be freed back in some cases (pga_aggregate_target usage in 9i and above) 

Huge table insert

June 18, 2004 - 7:04 pm UTC

Reviewer: A reader from Herndon, VA USA

Tom,

I have a table, say T1, has about 40 million rows. Most of its VARCHAR2 columns contain leading and trailing spaces and one column contains HEX values. Now I am trying to trim those spaces and convert the HEX column to a new column with DECIMAL value.

Please review the following steps:

1. alter table t1 add (decimal_val varchar2(12));
2. create table T1_tmp as select * from T1 where 1 = 2;
3. alter table T1_tmp nologging;
4. insert /*+ append parallel (t1_tmp, 12) into t1_tmp
select ltrim(rtrim(col1)), ... ltrim(rtrim(col_n)) from t1;
5. drop table t1;
6. alter table t1_tmp rename to t1;
7. create all primary key and constraints;
8. alter table t1 logging;

But when I was in step 4, I got an error of "ORA-01722: invalid number" and the insert statement was rolled back. Apparently there is some data in a particular row violate the column definition after they are converted. Now I want to locate those bad rows. I think the only way to do it is use a FOR loop to catch the error and insert the bad rows into an exception table. But it will significantly slow down the conversion process if a loop is used. Is there a better way for it?

Thanks in advance for your help.

Tom Kyte

Followup  

June 19, 2004 - 8:02 am UTC

what should the data that is being converted into a number from a string "look like". what format are you storing it in.

using that, we can develop a decode or CASE statment to "edit" the data, only to_number it when it matches a known pattern. in fact, using a multi-table insert (assuming 9i) we can put good data into one table and bad data into another.

Re: Huge table insert

June 19, 2004 - 9:45 am UTC

Reviewer: A reader from Herndon, VA USA

Tom,

Thanks for your information. I checked multiple insert syntax, and it seems the insert {all|first} must have WHEN clause specified. I am wondering how to specify the WHEN clause in my case. Say if I have following tables:

SQL> select * from t1;

        ID TEXT
---------- -----
         1 aaa
         2 BBBBB
         3 CCCC

SQL> create table t1_bad as select * from t1 where 1 = 2;
SQL> create table t1_good (id number, text varchar2(4));

When transform data from t1 to t1_good table, I would expect the first and third rows in t1 are inserted into t1_good and the second row inserted into t1_bad.  The insert statement could be as below:
insert first
     when ??? then into t1_good values (id, text)
     else into t1_bad values (id, text)
select * from t1;

Here, how to specify ??? in the WHER clause. Thanks again.
 

Tom Kyte

Followup  

June 19, 2004 - 1:52 pm UTC

insert first
when <condition> then into good
when NOT<condition> then into bad



Re: Huge table insert

June 20, 2004 - 8:25 am UTC

Reviewer: A reader from Herndon, VA USA

Tom,

I know the syntax for the 'insert first ...'. I am asking HOW TO SPECIFY THE CONDITION PARTY in my situation. As I mentioned previously, I have a table of 40 million rows and tried to convert HEX to DECIMAL, trim leading and trailing spaces for all of VARCHAR2 columns. In the middle of the converting process, I got "ORA-01722: invalid number". Now I am trying to locate the bad rows but I don't know what caused the error. Therefore I don’t know what condition I can specify for the WHEN <condition> clause. Thanks.

Tom Kyte

Followup  

June 20, 2004 - 10:23 am UTC

i simply asked you before:

<quote>
what should the data that is being converted into a number from a string "look
like". what format are you storing it in.

using that, we can develop a decode or CASE statment to "edit" the data, only
to_number it when it matches a known pattern. in fact, using a multi-table
insert (assuming 9i) we can put good data into one table and bad data into
another.
</quote>


so, if you would care to answer that already asked question, I'll be glad to help you develop a decode or CASE statement.


So, if you list out how the data should be found in the table, something like:

o it will always be even in length because we pad leading zeros...
o it consists of the upper/lower case letters ABCDEF and the digits 0123456789 in any order.
o there are no whitespaces, decimals or anything

and so on -- I'll tell you what you need (or maybe by simply listing out what you expect, you'd see how to write the case statement yourself -- it might become "obvious")

Re: Huge table insert

June 20, 2004 - 11:49 am UTC

Reviewer: A reader from Herndon, VA USA

Tom,

Here is the table that is going to be converted:

SQL> desc box_inventory 
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 CORP_ID                                   NOT NULL NUMBER(10)
 BOX_NUM                                   NOT NULL VARCHAR2(14)
 BOX_STATUS                                         CHAR(1)
 RATING                                             CHAR(1)
 PPV_ORDER                                          CHAR(1)
 VENDOR_TYPE                                        VARCHAR2(2)
 HIBIT                                              NUMBER(5)
 OUTLET                                             NUMBER(5)
 HOUSE_ID                                           VARCHAR2(6)
 CUST_NUM                                           VARCHAR2(2)
 UNIT_ADDRESS                                       VARCHAR2(16)
 MAC_ADDRESS                                        VARCHAR2(12)

And here is the insert into ... select statement:

insert /*+ append parallel (box_inventory_tmp, 12) */
into   box_inventory_tmp
select /*+ parallel (box_inventory, 12) */
       CORP_ID, ltrim(rtrim(BOX_NUM)), ltrim(rtrim(BOX_STATUS)),
       ltrim(rtrim(RATING)), ltrim(rtrim(PPV_ORDER)),
       decode (VENDOR_TYPE, 'XX', decode(UNIT_ADDRESS, NULL, 'XA', 'XX'), NULL, decode(UNIT_ADDRESS, NULL, 'XA', 'XX'), VENDOR_TYPE),
       HIBIT, OUTLET, ltrim(rtrim(HOUSE_ID)), ltrim(rtrim(CUST_NUM)),
       ltrim(rtrim(UNIT_ADDRESS)),
       decode ( vendor_type,
                'MT', lpad( lpad(to_char(to_number(substr(unit_address, 1, 3)), 'fmXX'), 2, '0') ||
                          lpad(to_char(to_number(substr(unit_address, 4, 10)), 'fmXXXXXXXX'), 8, '0'), 12, '0' ),
                'SA', lpad(substr(unit_address, 5), 12, '0') )
from  box_inventory;

Do you need any additional information? Thanks for your help. 

Tom Kyte

Followup  

June 20, 2004 - 4:03 pm UTC

give me the specification for the values in the field that is causing the problem -- the format they should be in. Like I listed out above:

it is a field with an even length...
it consists of these characters...


give me the data constraint that should have been in place on this column from day one to ensure only valid data was entered.




Re: Huge table insert

June 20, 2004 - 11:44 pm UTC

Reviewer: A reader from Herndon, VA USA

Tom,

My further investigation found that some rows contain the value of ‘A’ in UNIT_ADDRESS column with VENDOR_TYPE = ‘MT’, which causes to_number(substr(unit_address, 1, 3)) to fail => ’invalid number’. Supposedly the UNIT_ADDRESS column contains only digit numbers when VENDOR_TYPE = ‘MT’. The other failure was caused by ltrim(rtrim(BOX_NUM)). Some rows have only blank spaces in BOX_NUM field. When the spaces are trimmed, its value becomes NULL, which violates the NOT NULL constraint.

The data for this table is replicated from other database (maintained by other company) in which we have no control. I think the data might be entered by some applications. I don’t think there are more strict constrains enforced when the data first inserted into the original table. It seems to me that any value could be in VARCHAR2 columns. So it is very difficulty to define the CONDITION clause. I am wondering if there is a general way to insert bad rows into a t_bad table no matter what kind of condition happened?

Thanks.


Tom Kyte

Followup  

June 21, 2004 - 8:14 am UTC

you are missing my point.

WHEN the field is something that you want to convert -- DEFINE what it looks like.

We can look for that pattern.
When we do not see that pattern, we do not to_number it.

Define your VALID pattern.

Re: Huge table insert

June 21, 2004 - 9:13 am UTC

Reviewer: A reader from Herndon, VA USA

Tom,

If converted, the MAC_ADDRESS field should contains 12 HEX characters: 2 leading zeros and 10 digital numbers or 6 (A-F) character letters. Thans.

Tom Kyte

Followup  

June 21, 2004 - 9:36 am UTC

case when substr( mac_address,1,2) = '00'
and
translate(mac_address,'0123456789ABCDEF','0000000000000000') =
rpad( '0', 12, '0' )
then to_number( mac_address, rpad( 'X', 12, 'X' ) )
else null
end


verify starts with 00
verify all characters are 0..9A..F

done.

Re: Huge table insert

June 21, 2004 - 9:19 am UTC

Reviewer: A reader from Herndon, VA USA

Tom,

As I memtioned in the previous e-mail, when the vendor_type = 'MT' the UNIT_ADDRESS is converted to HEX format. So the UNIT_ADDRESS should be all digital numbers when vendor_type = 'MT'.

Regarding committing inside a loop

June 21, 2004 - 12:11 pm UTC

Reviewer: Ian from Amsterdam

Coming back to your earlier follow-up regarding an example proving that committing inside a loop is slower than committing at the end - could you comment on the below?

In our 'real' world (which is a very strange place) - we tend to commit say every 1000 rows - so I took the liberty of added an index to your example (because we tend to have 'em) - and committed inside the loop every 1000 records. 

And I got completely different results. 370 hsecs committing inside the loop - 1130 hsecs outside.

We have to do it this way as we are calling Oracle Applications API's to do the actual inserts and some of our data loads will run for several hours.

Do this make sense to you - it seems contrary to what you demonstrated above. Or am I missing a point somewhere?

Regards

Ian

Connected to:
Oracle9i Enterprise Edition Release 9.2.0.4.0 - 64bit Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.4.0 - Production

SQL> create index t1_n1 on t1(x);

Index created.

SQL> declare
  2      l_count   number := 10000;
  3      l_commit  number := 1000;
  4      l_start   number;
  5  begin
  6      insert into run_stats select 'before', a.* from stats a;
  7  
  8      l_start := dbms_utility.get_time;
  9      for i in 1 .. l_count
 10      loop
 11                  insert into t1 values ( i, 'x' );
 12                  if mod(1,l_commit) = 0 then
 13                     commit;
 14                  end if;
 15      end loop;
 16      dbms_output.put_line( (dbms_utility.get_time-l_start) || ' hsecs' );
 17  
 18      insert into run_stats select 'after 1', a.* from stats a;
 19  
 20      l_start := dbms_utility.get_time;
 21      for i in 1 .. l_count
 22      loop
 23                  insert into t1 values ( i, 'x' );
 24      end loop;
 25      commit;
 26      dbms_output.put_line( (dbms_utility.get_time-l_start) || ' hsecs' );
 27  
 28      insert into run_stats select 'after 2', a.* from stats a;
 29  end;
 30  /
370 hsecs
1130 hsecs

PL/SQL procedure successfully completed.


SQL> select a.name, b.value-a.value run1, 
  2  c.value-b.value run2,
  3         ( (c.value-b.value)-(b.value-a.value)) diff
  4    from run_stats a, run_stats b, run_stats c
  5   where a.name = b.name
  6     and b.name = c.name
  7     and a.runid = 'before'
  8     and b.runid = 'after 1'
  9     and c.runid = 'after 2'
 10     and (c.value-a.value) > 0
 11     and (c.value-b.value) <> (b.value-a.value)
 12     and a.name like '%redo%'
 13   order by abs( (c.value-b.value)-(b.value-a.value))
 14  /

NAME                                           RUN1       RUN2       DIFF
---------------------------------------- ---------- ---------- ----------
STAT...redo buffer allocation retries             0          2          2
STAT...redo log space requests                    0          2          2
STAT...redo log space wait time                   0          8          8
STAT...redo entries                           26736      26682        -54
LATCH.redo writing                              181        258         77
LATCH.redo allocation                         26854      27208        354
STAT...redo size                           25137592   25135488      -2104

7 rows selected. 

Tom Kyte

Followup  

June 21, 2004 - 7:57 pm UTC

something else happened. use tkprof with a level 12 10046 trace. You were waiting for something in one and not the other.

plsql has an interesting commit optimization that could be coming into play here as well. but -- commiting in the for loop can only

o generate more redo
o help you get ora-1555's like mad



Correct as usual

June 22, 2004 - 9:02 am UTC

Reviewer: Ian from Amsterdam

You were correct - the second run spent a lot longer waiting on db file sequential read. I was forgetting that my test database is attached to an ESS that is serving many other instances too.

So I tried it on my laptop - Personal Oracle Database 10g Release 10.1.0.2.0 - Production

I also changed the code so it reflects what we are doing a bit more accurately - for instance we do not commit inside a cursor for loop (No ORA-1555's)

And basically there is no difference between them. Same time - same redo. Even if I drop down to committing every 100 there is very little difference.


Here's what we are doing (simplified)

drop table t_staging1;
create table t_staging1 (x int, status varchar2(10));
create unique index t_staging1_u1 on t_staging1(x);
insert into t_staging1 (x, status)
select rownum, 'NEW'
from all_objects
where rownum <= 10000;
exec dbms_stats.GATHER_TABLE_STATS('scott', 't_staging1', estimate_percent => 20, method_opt => 'for all columns size 1', cascade => true, degree => 2);


drop table t_staging2;
create table t_staging2 (x int, status varchar2(10));
create unique index t_staging2_u1 on t_staging2(x);
insert into t_staging2 (x, status)
select rownum, 'NEW'
from all_objects
where rownum <= 10000;
exec dbms_stats.GATHER_TABLE_STATS('scott', 't_staging2', estimate_percent => 20, method_opt => 'for all columns size 1', cascade => true, degree => 2);

drop table t1;
create table t1 ( x int, y char(2000) );
create index t1_n1 on t1(x);

drop table t2;
create table t2 ( x int, y char(2000) );
create index t2_n1 on t2(x);

create table run_stats ( runid varchar2(15), name varchar2(80), value int );


declare
l_commit NUMBER := 100;
l_start NUMBER;
l_records_processed BOOLEAN;
begin

insert into run_stats
select 'before', 'STAT...' || a.name name, b.value
from v$statname a, v$mystat b
where a.statistic# = b.statistic#
union all
select 'before', 'LATCH.' || name, gets
from v$latch;

l_start := dbms_utility.get_time;

loop
l_records_processed := FALSE;
for rec in (select x
from t_staging1
where status = 'NEW'
and rownum <= l_commit)
loop
l_records_processed := TRUE;

-- Loads of pre-processing and then call Oracle Applications API's
-- which eventually do some inserts

insert into t1 values ( rec.x, 'x' );

update t_staging1
set status = 'OK'
where x = rec.x;

end loop;

if not l_records_processed then
exit;
end if;
commit; -- Outside of cursor for loop

end loop;

dbms_output.put_line( (dbms_utility.get_time-l_start)||' hsecs for run1' );

insert into run_stats
select 'after 1', 'STAT...' || a.name name, b.value
from v$statname a, v$mystat b
where a.statistic# = b.statistic#
union all
select 'after 1', 'LATCH.' || name, gets
from v$latch;

l_start := dbms_utility.get_time;

for rec in (select x
from t_staging2
where status = 'NEW')
loop

-- Loads of pre-processing and then call Oracle Applications API's
-- which eventually do some inserts

insert into t2 values ( rec.x, 'x' );

update t_staging2
set status = 'OK'
where x = rec.x;

end loop;
commit;

dbms_output.put_line( (dbms_utility.get_time-l_start)||' hsecs for run2' );

insert into run_stats
select 'after 2', 'STAT...' || a.name name, b.value
from v$statname a, v$mystat b
where a.statistic# = b.statistic#
union all
select 'after 2', 'LATCH.' || name, gets
from v$latch;

end;
/


And here are the results (including snippets from tkprof with waits).


Run 1 (Commit every 1000)

2529 hsecs for run1

INSERT INTO T1
VALUES
( :B1 , 'x' )


call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 10000 2.20 21.82 3343 3712 50796 10000
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 10001 2.20 21.82 3343 3712 50796 10000

Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: 57 (recursive depth: 1)

Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
db file sequential read 3343 0.32 18.05
log file switch completion 8 0.37 1.42
log buffer space 3 0.07 0.17
********************************************************************************


Run 2 (Commit at end)

2742 hsecs for run2

INSERT INTO T2
VALUES
( :B1 , 'x' )


call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 10000 2.39 23.48 3347 3692 50742 10000
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 10001 2.39 23.48 3347 3692 50742 10000

Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: 57 (recursive depth: 1)

Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
db file sequential read 3347 0.37 19.77
log file switch completion 6 0.42 1.48
log buffer space 1 0.02 0.02
********************************************************************************


And the Statistics

NAME RUN1 RUN2 DIFF
---------------------------------------- ---------- ---------- ----------
STAT...redo log space requests 7 8 1
STAT...redo buffer allocation retries 8 12 4
STAT...redo log space wait time 170 162 -8
STAT...redo entries 47373 47345 -28
LATCH.redo allocation 920 869 -51
LATCH.redo writing 1862 1933 71
STAT...redo size 28469328 28473552 4224

7 rows selected.



So would it be safe to say that:

If you have to do loads of pre-processing
And you are inserting into loads of tables
And you are processing complete entities at a time
And you've had your hands tied behind your back by a third party vendor's APIs
Then it's OK to commit say every 1000 transaction boundaries?

I understand that this is slightly off the thread - but would welcome and value your opinion.

PS - If you recognize the solution to committing outside of a cursor for loop its because it's your solution. :-) Thanks.

PPS - You mentioned that plsql has an interesting commit optimization - any chance you could expound on this? Sounds fascinating.

Regards

Ian

Tom Kyte

Followup  

June 22, 2004 - 9:44 am UTC

I'll add an "and"

AND your process is restartable, so that if it fails after 15 commits, with 15 more to go -- it'll pick up where it left off nicely.

then yes (there is a theory that says instead of every N records, every N minutes makes more sense. You are committing in order to be restartable -- sort of like a checkpoint. Using TIME makes more sense (I want to lose at most 5 minutes of processing -- therefore commit every 5 minutes).


As for the plsql optimization, consider this pro*c piece of code:


void process()
{
EXEC SQL BEGIN DECLARE SECTION;
int i;
EXEC SQL END DECLARE SECTION;

EXEC SQL WHENEVER SQLERROR DO sqlerror_hard();

exec sql alter session
set events '10046 trace name context forever, level 12';
for( i = 0; i <= 10000; i++ )
{
exec sql insert into t values ( :i );
exec sql commit;
}
}

versus this plsql block:

begin
for i in 1 .. 10000
loop
insert into t values ( i );
commit;
end loop;
end;


They look to be about the same right -- well, not really. If we look at the wait events each one incurred (starting with pro*c)

OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.00 0 0 0 0
Execute 10002 2.34 2.45 0 12 20340 10001
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 10003 2.35 2.46 0 12 20340 10001

Misses in library cache during parse: 1

Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 20004 0.00 0.03
SQL*Net message from client 20004 0.03 2.00
log file sync 9988 0.22 5.37

It waited about 10,000 times for a log file sync -- after each and every single commit. Look at plsql though:





OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 2 0.01 0.01 0 0 1 0
Execute 3 1.39 1.24 0 0 0 2
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 5 1.40 1.26 0 0 1 2

Misses in library cache during parse: 1

Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 3 0.00 0.00
SQL*Net message from client 3 15.32 16.01
log file sync 1 0.00 0.00

it waited ONCE for log file sync. why the difference?



well, that is the plsql commit optimization. when you commit in plsql -- plsql says "you know, until I return to the client -- you don't really know what I've been doing so we will just post lgwr and tell him to flush the stuff out to disk but we won't wait. the only time we'll wait is right before we go back to the client -- then we'll make sure all of our commits have actually committed (because before that, you don't really know that we tripped over the commit code in the stored procedure, client doesn't get control back until we are done)"


The use of certain features can show this -- if you do something "non-transactional" (eg: with utl_file -- open, write, close) inside that loop -- eg: write out "last row was || i" after the commit -- if you run that and shutdown abort the instance, there is a wee slim chance that the "i" you observe in your flat file is different from the "i" you see in the database (the "i" in the database could be smaller, the same or larger -- smaller due to commit optimization, the same due to luck, larger becase we may well have inserted/committed that value before utl_file did it's work)

Other things like getting into a distributed transaction will short circut this optimization -- 2pc needs to have the commit really really happen.





Good to hear.

June 22, 2004 - 10:23 am UTC

Reviewer: Ian from Amsterdam

Yes - we are restartable. Thanks for the confirmation.

Interesting idea on the time based commit - will give that some thought.

Thanks also for the illumination on the plsql commit optimization - always fascinating to see 'under the covers'.

Loved the "you know, until I return to the client -- you don't really know what I've been doing" bit. Until v$session_longops came along that was the story of my life! In fact I suspect that might be part of the reason you see so much plsql code with commits all over the place - it used to be the only way to see how far you had got. :-)

Thanks and Regards

Ian



Time-based restart

April 07, 2005 - 5:40 pm UTC

Reviewer: VA from New Jersey, USA

"Using TIME makes more sense (I want to lose at most 5
minutes of processing -- therefore commit every 5 minutes)"

That makes a lot of sense, but how would one go about implementing it? Record counts are easily available in the code, how would I determine if "N minutes of work has occured, time to commit"?

Thanks

Tom Kyte

Followup  

April 07, 2005 - 5:52 pm UTC

every couple hundred records ask 'what time is it', is the difference between now and then greater than N minutes? if so, commit and set then to now.

(don't ask what time it is every time, every couple hundred/thousand, asking what time it is itself can get expensive)

Why Message from Client hi in PL-SQL?

December 18, 2005 - 11:48 pm UTC

Reviewer: naresh from Cyprus

Tom,

In your example above to demostrate PL-SQL commit optimization, the SQL-Ner Message from CLient wait in the PL-sql code was 16 seconds as opposed to 2 seconds in Pro*C.

Was this something peculiar to your setup or ...?

Thanks,
Naresh

Tom Kyte

Followup  

December 19, 2005 - 7:25 am UTC

it wasn't really relevant to anything - it was just how long I paused before hitting the enter key to make it "go".

ORA-12840: cannot access a remote table...

January 06, 2006 - 11:32 am UTC

Reviewer: Stewart W. Bryson from Atlanta, GA

Going back to this for a minute...

I've written a PL/SQL package that uses the dictionary to load numerous tables in a schema according to regular expressions. However, some of these tables are actually views to remote databases, and I am "insulated" from this fact. However, I do need to know whether I have just inserted from a direct-path read from a distributed system to know whether a commit is required. Any way to check whether I am "in the middle of" a direct load txn to know whether a commit is required before I start selecting from another table?

Thanks.

Tom Kyte

Followup  

January 06, 2006 - 2:19 pm UTC

you can select from another table if you've done a direct path load (insert append), you cannot select from the table you've just LOADED.

Your transaction boundaries are constant - regardless of anything else. You either ALWAYS commit there or you don't - you would not ever make the decision to commit or not at a certain point using the "am I in a direct path transaction" logic - your business logic dictates when the transaction is complete - nothing else.

ORA-12840: cannot access a remote table...

January 12, 2006 - 11:06 am UTC

Reviewer: Stewart W. Bryson from Atlanta, GA

SQL> -- here is the DDL for the db_link
SQL> -- the password is the only thing modified in the whole sqlplus output
SQL> select dbms_metadata.get_ddl('DB_LINK', db_link, owner) from dba_db_links where db_link='FNDEV.OAS.VANDERBILT' and owner=USER;

DBMS_METADATA.GET_DDL('DB_LINK',DB_LINK,OWNER)
----------------------------------------------------------------------------------------------------------------------------------------------------------------

  CREATE DATABASE LINK "FNDEV.OAS.VANDERBILT"
   CONNECT TO "BUS_INTEL_RO" IDENTIFIED BY [HIDDEN]
   USING 'fndev';



1 row selected.

Elapsed: 00:00:00.03
SQL> -- here are the two views
SQL> select dbms_metadata.get_ddl(decode(object_type,'MATERIALIZED VIEW','MATERIALIZED_VIEW','DATABASE LINK','DB_LINK',object_type), object_name, owner) as OBJECT_DDL from all_objects where object_name='APPO_PS_PO_HDR_RMT' and object_type not like '%BODY' and object_type not like '%PARTITION' order by object_type;

OBJECT_DDL
----------------------------------------------------------------------------------------------------------------------------------------------------------------

  CREATE OR REPLACE FORCE VIEW "SUPPLYCHAIN"."APPO_PS_PO_HDR_RMT" ("BUSINESS_UNIT", "PO_ID", "CHNG_ORD_BATCH", "PO_TYPE", "PO_STATUS", "HOLD_STATUS", "RECV_STAT
US", "DISP_ACTION", "DISP_METHOD", "CHANGE_STATUS", "PO_DT", "PO_REF", "VENDOR_SETID", "VENDOR_ID", "PYMNT_TERMS_CD", "BUYER_ID", "ORIGIN", "CHNG_ORD_SEQ", "ADD
RESS_SEQ_NUM", "CNTCT_SEQ_NUM", "SALES_CNTCT_SEQ_N", "BILL_LOCATION", "TAX_EXEMPT", "TAX_EXEMPT_ID", "CURRENCY_CD", "CUR_RT_TYPE", "MATCH_ACTION", "MATCH_CNTRL_
ID", "MATCH_STATUS_PO", "MATCH_PROCESS_FLG", "PROCESS_INSTANCE", "APPL_JRNL_ID_ENC", "POST_DOC", "DST_CNTRL_ID", "BCM_HDR_STATUS", "BCM_TRAN_TYPE", "OPRID_MODIF
IED_BY", "ACCOUNTING_DT", "BUSINESS_UNIT_GL", "IN_PROCESS_FLG", "ACTIVITY_DATE", "PO_POST_STATUS", "NEXT_MOD_SEQ_NBR", "ERS_ACTION", "ACCRUE_USE_TAX", "VU_REQ_R
ENEW_DT", "VU_DRC_SETID", "VU_DRC") AS
  SELECT "BUSINESS_UNIT","PO_ID","CHNG_ORD_BATCH","PO_TYPE","PO_STATUS","HOLD_STATUS","RECV_STATUS","DISP_ACTION","DISP_METHOD","CHANGE_STATUS","PO_DT","PO_REF"
,"VENDOR_SETID","VENDOR_ID","PYMNT_TERMS_CD","BUYER_ID","ORIGIN","CHNG_ORD_SEQ","ADDRESS_SEQ_NUM","CNTCT_SEQ_NUM","SALES_CNTCT_SEQ_N","BILL_LOCATION","TAX_EXEMP
T","TAX_EXEMPT_ID","CURRENCY_CD","CUR_RT_TYPE","MATCH_ACTION","MATCH_CNTRL_ID","MATCH_STATUS_PO","MATCH_PROCESS_FLG","PROCESS_INSTANCE","APPL_JRNL_ID_ENC","POST
_DOC","DST_CNTRL_ID","BCM_HDR_STATUS","BCM_TRAN_TYPE","OPRID_MODIFIED_BY","ACCOUNTING_DT","BUSINESS_UNIT_GL","IN_PROCESS_FLG","ACTIVITY_DATE","PO_POST_STATUS","
NEXT_MOD_SEQ_NBR","ERS_ACTION","ACCRUE_USE_TAX","VU_REQ_RENEW_DT","VU_DRC_SETID","VU_DRC"
       FROM sysadm.ps_po_hdr@fndev;



1 row selected.

Elapsed: 00:00:00.06
SQL> select dbms_metadata.get_ddl(decode(object_type,'MATERIALIZED VIEW','MATERIALIZED_VIEW','DATABASE LINK','DB_LINK',object_type), object_name, owner) as OBJECT_DDL from all_objects where object_name='APPO_PS_VENDOR_RMT' and object_type not like '%BODY' and object_type not like '%PARTITION' order by object_type;

OBJECT_DDL
----------------------------------------------------------------------------------------------------------------------------------------------------------------

  CREATE OR REPLACE FORCE VIEW "SUPPLYCHAIN"."APPO_PS_VENDOR_RMT" ("SETID", "VENDOR_ID", "VENDOR_NAME_SHORT", "VNDR_NAME_SHRT_USR", "VNDR_NAME_SEQ_NUM", "NAME1"
, "NAME2", "VENDOR_STATUS", "VENDOR_CLASS", "VENDOR_PERSISTENCE", "REMIT_ADDR_SEQ_NUM", "PRIM_ADDR_SEQ_NUM", "ADDR_SEQ_NUM_ORDR", "REMIT_SETID", "REMIT_VENDOR",
 "CORPORATE_SETID", "CORPORATE_VENDOR", "CUST_SETID", "CUST_ID", "ENTERED_BY", "DB_NUMBER", "AR_NUM", "OLD_VENDOR_ID", "SIC_DIVISION", "SIC_GRP", "SIC_IND_GRP",
 "SIC_IND_CD", "SIC_IND_CD_SEQ", "M1099_SW", "VAT_SW", "VNDR_STATUS_PO") AS
  SELECT "SETID","VENDOR_ID","VENDOR_NAME_SHORT","VNDR_NAME_SHRT_USR","VNDR_NAME_SEQ_NUM","NAME1","NAME2","VENDOR_STATUS","VENDOR_CLASS","VENDOR_PERSISTENCE","R
EMIT_ADDR_SEQ_NUM","PRIM_ADDR_SEQ_NUM","ADDR_SEQ_NUM_ORDR","REMIT_SETID","REMIT_VENDOR","CORPORATE_SETID","CORPORATE_VENDOR","CUST_SETID","CUST_ID","ENTERED_BY"
,"DB_NUMBER","AR_NUM","OLD_VENDOR_ID","SIC_DIVISION","SIC_GRP","SIC_IND_GRP","SIC_IND_CD","SIC_IND_CD_SEQ","M1099_SW","VAT_SW","VNDR_STATUS_PO"
       FROM sysadm.ps_vendor@fndev;



1 row selected.

Elapsed: 00:00:00.04
SQL> insert /*+ APPEND */ into SUPPLYCHAIN.APPO_PS_PO_HDR select * from SUPPLYCHAIN.APPO_PS_PO_HDR_RMT;

153244 rows created.

Elapsed: 00:00:06.99
SQL> -- no commit
SQL> insert /*+ APPEND */ into SUPPLYCHAIN.APPO_PS_VENDOR select * from SUPPLYCHAIN.APPO_PS_VENDOR_RMT;
insert /*+ APPEND */ into SUPPLYCHAIN.APPO_PS_VENDOR select * from SUPPLYCHAIN.APPO_PS_VENDOR_RMT
                                                                                                *
ERROR at line 1:
ORA-12840: cannot access a remote table after parallel/insert direct load txn


Elapsed: 00:00:00.02
SQL> 
SQL>  

Tom Kyte

Followup  

January 12, 2006 - 11:15 am UTC

It would appear you should not be using direct path operations then in your case since sometimes you have this self referencing issue - you DO NOT WANT TO CHANGE YOUR TRANSACTION boundaries to work around this (that breaks your code), you DO want to change your use of direct path operations.

ORA-12840: cannot access a remote table...

January 12, 2006 - 11:24 am UTC

Reviewer: Stewart W. Bryson from Atlanta, GA

I agree with you on the commit issue... I don't want to commit.

So I can't use the direct-path because the tables are related (foreign key) on the source system?

If that's the case, I guess I could check for that information and use the direct-path when there are no relations and regular path when there are.

Thanks.





Tom Kyte

Followup  

January 12, 2006 - 11:40 am UTC

are you really sure you even need direct path. The only thing that'll really do is bypass undo generation on the affected tables - indexes will generate undo/redo. You might find that this optimization isn't buying you sufficient anything to be used in this case at all.

ORA-12840: cannot access a remote table...

January 12, 2006 - 12:03 pm UTC

Reviewer: Stewart W. Bryson from Atlanta, GA

Thanks for spending so much time on my issue. I appreciate it.

Is it indeed the relationship on the source system that causes the error? I cannot find anything in the concepts guide about it.

In this case, you might be correct. I haven't tested how much direct-path helps me in this situation. However, I'm developing a generic package that could be used in lots of other situations. I may have to abandon the direct-path, but I just wanted to know my options before proceeding, and what was actually causing the situation.

Thanks much.

Tom Kyte

Followup  

January 12, 2006 - 12:13 pm UTC

actually, it is that direct path operations are not permitted in the distributed transaction.

ORA-12840: cannot access a remote table...

January 12, 2006 - 12:59 pm UTC

Reviewer: Stewart W. Bryson from Atlanta, GA

Tom Kyte said:

#1: "you can select from another table if you've done a direct path load (insert
append), you cannot select from the table you've just LOADED."

and #2: "actually, it is that direct path operations are not permitted in the distributed transaction."

I don't mean to take all your time, I just don't "get it" yet. As my example shows above, each of my two statements in the transaction are insert appends between two different tables. So I'm not violating statement #1 above. So what causes #2 to be true?

Thanks, again for your time.

Just finished your new book. I'm now waiting patiently for volume 2.

Tom Kyte

Followup  

January 13, 2006 - 10:14 am UTC

It is in the error message:

[tkyte@me ~]$ oerr ora 12840
12840, 00000, "cannot access a remote table after parallel/insert direct load txn"
// *Cause: Within a transaction, an attempt was made to perform distributed
// access after a PDML or insert direct statement had been issued.
// *Action: Commit/rollback the PDML transaction first, and then perform
// the distributed access, or perform the distributed access before the
// first PDML statement in the transaction.


You tried to access a remote table AFTER doing a direct path operation (second select).

ORA-12840: cannot access a remote table...

January 16, 2006 - 11:44 am UTC

Reviewer: Stewart W. Bryson from Atlanta, GA

I get it now.

When you said:

"you can select from another table if you've done a direct path load (insert append), you cannot select from the table you've just LOADED."

... you meant non-distributed transactions. Because I mentioned the possibility to veiws to remote objects in my initial comment, I thought you must have been thinking about them as well.

Thanks for seeing this through.

Huge INserts

March 24, 2006 - 4:00 pm UTC

Reviewer: Manish Sharma from Edison, NJ USA

Hi Tom,
I have partitioned table having around 20-30 million rows in each partition and I have to move this data from 8174 to 9205 on HP-unix.
I am doing this over a dblink and using 'insert /*+ APPEND */ select into' for one partition by one partition and I have a locally partitioned index on the table.

I have put both the tables & index in nologging mode.

Regarding this I have some questions:

1. Is there any way where we can use 'PARALLEL' hint in insert ....select ...over a dblink.

2. AS index is there 'insert /*+ append */ takes a lot of time to move this type of data. In many of your articles you recommended to drop and recreate the index. But this being a locally partitioned index, i can't drop the index partition, to do that I have to drop the table partition and when i add that table partition back, index partition comes back again. Also I can't drop the table partition, as they already exists. I also tried unusable index option, which doesn't lets me insert data into the table partition givin me ORA-01502.

I tried searching your other articles, but I couldn't come across any other option.

Can you pl. advise.

Thanks in advance
Manish

FYI: As my query related to huge inserts to I posted it here.

Tom Kyte

Followup  

March 24, 2006 - 4:17 pm UTC

1) no

2) have you tried disabling the index partition, loading, and rebuilding that partition.

Lightning response

March 24, 2006 - 4:41 pm UTC

Reviewer: Manish Sharma from Edison, NJ USA

That was so quick, do you have any magic button.

For 2) I haven't tried that, can you pl. tell me how to disable the index or expand on it.

Thanks once again
Manish

Tom Kyte

Followup  

March 24, 2006 - 5:56 pm UTC

ops$tkyte@ORA10GR2> CREATE TABLE t
  2  (
  3    dt  date,
  4    x   int,
  5    y   varchar2(25)
  6  )
  7  PARTITION BY RANGE (dt)
  8  (
  9    PARTITION part1 VALUES LESS THAN (to_date('13-mar-2003','dd-mon-yyyy')) ,
 10    PARTITION part2 VALUES LESS THAN (to_date('14-mar-2003','dd-mon-yyyy')) ,
 11    PARTITION junk VALUES LESS THAN (MAXVALUE)
 12  )
 13  /

Table created.

ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> create index t_idx on t(x) local;

Index created.

ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> alter index t_idx modify partition part1 unusable;

Index altered.

ops$tkyte@ORA10GR2> select partition_name, status from user_ind_partitions where index_name = 'T_IDX';

PARTITION_NAME                 STATUS
------------------------------ --------
JUNK                           USABLE
PART1                          UNUSABLE
PART2                          USABLE

ops$tkyte@ORA10GR2> alter session set skip_unusable_indexes=true;

Session altered.

ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> insert /*+ append */ into t
  2  select to_date( '12-mar-2003', 'dd-mon-yyyy'), rownum, rownum
  3    from all_users;

33 rows created.

ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> commit;

Commit complete.

ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> alter index t_idx rebuild partition part1;

Index altered.

ops$tkyte@ORA10GR2> select partition_name, status from user_ind_partitions where index_name = 'T_IDX';

PARTITION_NAME                 STATUS
------------------------------ --------
JUNK                           USABLE
PART1                          USABLE
PART2                          USABLE
 

Great, it worked

March 27, 2006 - 11:26 am UTC

Reviewer: Manish Sharma from Edison, NJ USA

Thanks Tom for your example.
It worked.

Manish

When procedural code is faster?

October 11, 2006 - 10:17 am UTC

Reviewer: Christopher from Portsmouth, NH

I have an insert which runs much faster by looping on a cursor and inserting one row at a time. Should this ever be faster? When the insert is written as a single SQL statement the explain plan of the query is completely different and runs much slower. Why would a query have a different explain plan when it is used in an insert?

Tom Kyte

Followup  

October 11, 2006 - 3:58 pm UTC

got example, bad plans arise from wrong cardinalities, have you compared the autotrace traceonly explain with the tkprof results?

October 11, 2006 - 4:54 pm UTC

Reviewer: A reader

Tom,
Is there any way to skip unique unusable indexes without dropping ?
Thank you,
V






Tom Kyte

Followup  

October 11, 2006 - 8:17 pm UTC

not in sql.

only workarond in sql would be to not use unique index, use unique constraints (which do not need unique indexes)

October 11, 2006 - 5:01 pm UTC

Reviewer: A reader

I had to add some information to my question:
Unique Indexes are created using
create unique indexes not as alter table add constraint.

OK

March 24, 2007 - 2:05 pm UTC

Reviewer: Kamesh

Hi Tom,
We have an application in which a staging table stores
Purchase orders of last 3 months.This table will be truncated and populated
from a set of csv files as a daily job with the latest PO for the most recent
transaction date added and oldest one day PO being removed.
we have a requirement from our Mexico client to stop this
truncating operation and append the latest one day data and then issue
a delete retaining 3 months of PO's at any point of time.

code snippet is as follows:

begin

execute immediate 'Truncate table po_staging'

and followed by an insert as follows

/* First time insert must do an insert of all 3 months of data and
later inserts must append the latest one day PO data specific to Mexico
identified by cnty_nm column */


insert into po_staging
select purchase_order_dt,po_order_id,...
from ....
where purchase_order_dt between :st_date and :en_date
and po.prod_id = pol.prod_id

and cntry_nm = decode(cntry_nm,'MEX',... -- If country = 'mexico'
-- then insert latest Oneday PO data

I should have a where condition like

" where purchase_order_dt > ( select max(purchase_order_dt) from po_staging)

But I feel this where condition may fail for the first insert and hence no rows
would be inserted.This where condition need to be corrected so that it allows the
first insert and allows later inserts to append only latest oneday PO data.

Any Idea you can provide?
Thanks for your time.
Bye!

Huge Insert in procedure

April 17, 2013 - 9:31 am UTC

Reviewer: Rajeshwaran Jeyabal

Tom,
We got a requirement to load high volume of data daily from Transaction table to Reporting table.

So we thought of having a procedure like this, Is this good to "Enable parallel dml" inside a stored procedure like this? do you see any harm in doing this?

We did a benchmark on Pre-prod environment where this procedure got completed in 20 min's (which is okay for us). please let us know your thoughts

rajesh@ORA10G> create or replace procedure
  2  load_data(p_in varchar2)
  3  as
  4  begin
  5     execute immediate ' alter session enable parallel dml ';
  6     insert /*+ append */ into t
  7     select * from all_objects
  8     where owner = p_in ;
  9     commit;
 10     execute immediate ' alter session disable parallel dml ';
 11  end;
 12  /

Procedure created.

Tom Kyte

Followup  

April 22, 2013 - 7:39 pm UTC

this would be fine, yes. it is a good use of a parallel direct path insert.

if you can, you might also consider making the target table "nologging" so as to by pass redo generation for the table. Also, look to your index maintenance and decide if you want to maintain indexes during the load or afterwards..

Insert and UPdate around 3 millions row

May 03, 2016 - 12:27 pm UTC

Reviewer: Love Sharma from Singapore

Hi Tom,
Currently I am using the parallel merge statement to update 2 millions records in a table and it is taking around 20 mins for the same.
Is there a way that I can achieve the same in 10 mins or so.
There are few conditions that I can't use.
1. I can't disable indexes during insert/update as this table mapped with lot of child tables.
2. CTAS approach is not allowed as it involve some risk and would be difficult if it fails on production.
3. I already tried insert into select, bulk update
update table
set col=(select col from tab b)

Thanks in advance




Chris Saxon

Followup  

May 03, 2016 - 3:13 pm UTC

Please submit this as a new question. Be sure to include the following details:

- The statement you're running
- It's execution plan
- Details of the indexes on the tables you're using

Don't forget triggers

May 03, 2016 - 7:05 pm UTC

Reviewer: Evan

Interesting discussion...

Just wanted to add that you should check whether there are triggers on the table you are inserting into. If there are triggers firing on insert, that will slow down the performance considerably and you should disable them.
Chris Saxon

Followup  

May 04, 2016 - 1:12 am UTC

Good point.