
Question and Answer

Tom Kyte

Thanks for the question, Anand Kumar.

Asked: October 10, 2013 - 4:31 am UTC

Last updated: November 01, 2013 - 8:05 pm UTC

Version: 11.2.0.3

Viewed 10K+ times

You Asked

Hi Tom,

We are seeing a performance issue with an insert into table t1 select * from t1,t3 ......; the select is pretty fast, but the insert is taking ~2 hrs to insert 1M rows.

I was able to figure out that it's due to the referential constraints: if I disable them, the insert is 10X faster. The table has 4 referential constraints (3 to small parent tables, 1 to a large parent table). If I disable only the foreign key pointing to the huge table, the insert is 2X faster.


Please suggest how to get around this.


and Tom said...

referential integrity is great for row by row operations. It is *very* fast - faster than any of your developers could code it - but if you do something fast *a lot*, it adds up.


for a big bulk operation it would make sense to:

a) disable constraints
b) bulk load data
c) run queries against freshly loaded data to ensure the newly loaded data conforms to your rules
d) alter the constraints to be "enable rely novalidate"


for a foreign key - enable rely novalidate will be more or less instant. for a primary key/unique constraint, it will have to build an index (probably still faster than maintaining that index row by row during the load)
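
To make those steps concrete, here is a minimal sketch. The names (parent_t, child_t, stage_t, fk_child_parent) are made up for illustration and are not from the question:

-- a) disable the foreign key before the bulk load
alter table child_t disable constraint fk_child_parent;

-- b) bulk load the data (direct path where possible)
insert /*+ append */ into child_t (id, parent_id, payload)
select s.id, s.parent_id, s.payload from stage_t s;
commit;

-- c) verify the freshly loaded rows conform to the rule the constraint enforces,
--    e.g. check for orphaned child rows
select count(*) as orphans
  from child_t c
 where not exists (select null from parent_t p where p.id = c.parent_id);

-- d) re-enable without re-checking the existing rows, and mark it RELY so the
--    optimizer is allowed to trust it
alter table child_t modify constraint fk_child_parent rely enable novalidate;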


Comments

akumar, October 10, 2013 - 6:40 pm UTC

thanks

akumar, October 10, 2013 - 6:42 pm UTC

We are afraid that disabling constraints and enabling them after the insert can change the explain plans on the SQLs involved on this table.

What's your suggestion here?


Tom Kyte
October 10, 2013 - 7:32 pm UTC

if you use rely and allow "query_rewrite_integrity" to be set to trusted - then the optimizer can and will use the constraints to optimize just like it did before.

if you do that - make darn sure you run the queries that validate the data. see http://asktom.oracle.com/Misc/stuck-in-rut.html for why you need to make sure the data conforms to the rules.

otherwise, you'll have to "enable validate" the constraints - which can use parallel query, but will necessarily take longer than "rely enable novalidate" would (except for maybe the primary key which spends most of its time building the index)
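
For illustration, with the same made-up names as in the sketch above (child_t, parent_t and fk_child_parent are assumptions, not from the question), the combination looks roughly like this:

alter session set query_rewrite_integrity = trusted;

-- with fk_child_parent in RELY ENABLE NOVALIDATE state, the optimizer may now
-- trust the constraint for transformations such as join elimination: the join
-- to parent_t below can be removed when only child_t columns are selected
select c.id, c.payload
  from child_t c, parent_t p
 where p.id = c.parent_id;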

Followup Question

akumar, October 11, 2013 - 1:07 am UTC

Hi Tom,

I dropped the constraint and created a deferred constraint which made a huge improvement with insert performance.

do you recommend this approach?

I am curious to know how a deferred constraint can result in better performance, as it still needs to verify the data in the end.

thanks in advance.

Tom Kyte
October 11, 2013 - 1:26 pm UTC

well, the deferred constraint will use a non-unique index (I'm assuming, I have to assume since you don't actually say which constraint - you just say "THE constraint") for the primary key - and will defer checking for uniqueness until the end of the process.

however, a deferrable constraint is "as good as" a disabled and then "enable novalidate" constraint. That is - the issue you were explicitly trying to avoid:

We are afraid that disabling constraints and enabling them after the insert can change the explain
plans on the SQLs involved on this table.


will absolutely be true. If you have a deferrable constraint - the optimizer cannot, will not under any circumstances use that constraint to optimize the query.


at least with a "rely enable novalidate" constraint you can make it so the constraint is used to optimize the query.


No, I do not recommend this approach.
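
For completeness, the deferrable variant described above would look roughly like this (hypothetical names again), keeping in mind Tom's point that the optimizer will not use such a constraint:

-- a deferrable primary key is policed with a non-unique index and, while
-- deferred, uniqueness is only checked at commit rather than row by row
alter table child_t add constraint pk_child_t primary key (id)
  deferrable initially immediate;

set constraint pk_child_t deferred;
-- ... bulk insert here ...
commit;  -- the check happens here; a violation raises an error and the
         -- transaction is rolled back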

Insert performance issue

Girish, October 17, 2013 - 4:50 am UTC

Hi Tom,

Sometimes it is difficult to do steps (a to d) from the response above. Here we have a simple INSERT statement like:

insert into tab_D (
  col1,
  col2,
  col3
)
select
  p.col1,
  p.col2,
  q.col3
from
  tab_C p, tab_D q
where p.col1 = q.col1
  and nvl(p.col1, p.col0) = q.col1;

Here tab_D has 200 million rows and tab_C is a GTT which has, say, around 1000 rows. This insert statement runs for about 1 hr.

We removed the NVL function to make use of the index on col0 and also added a leading(tab_C) hint. This did not help much.

And this insert is in a PL/SQL batch job. The customer is not willing to purge data from tab_D as it is required for his analysis, so that option is out of the question.

Plan in tkprof is as given below:

LOAD TABLE CONVENTIONAL (cr=1059481 pr=806067 pw=0 time=2656862240 us)
  SEQUENCE tab_D_SEQ (cr=713882 pr=545856 pw=0 time=259690501 us)
    HASH JOIN (cr=706090 pr=545850 pw=0 time=190504038 us cost=189815 size=180389700 card=846900)
      TABLE ACCESS FULL tab_C (cr=7 pr=0 pw=0 time=1537 us cost=3 size=113068 card=1229)
      PARTITION HASH ALL PARTITION: 1 64 (cr=706083 pr=545850 pw=0 time=508535786 us cost=189401 size=5948383111 card=49160191)
        TABLE ACCESS FULL tab_D PARTITION: 1 64 (cr=706083 pr=545850 pw=0 time=446087663 us cost=189401 size=5948383111 card=49160191)


Is there a way to improve the performance of the INSERT?

Thanks


Tom Kyte
November 01, 2013 - 8:05 pm UTC

first of all - figure out where the time is spent.

does the query by itself run ok, or does it take forever?

if the query itself runs OK, then you would be looking to make the insert faster

o using direct path perhaps, to bypass the inefficient buffer cache
o looking to bypass many, most, or all of the indexes (it doesn't matter if you find it "difficult", many things are "difficult")
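
As a sketch only (re-using the table and column names from the insert above; the APPEND hint and the parallel DML setting are assumptions about what this environment permits), a direct-path version might look like:

alter session enable parallel dml;   -- optional, only if parallel DML is acceptable

insert /*+ append */ into tab_D (col1, col2, col3)
select p.col1, p.col2, q.col3
  from tab_C p, tab_D q
 where p.col1 = q.col1;
commit;  -- a direct-path loaded table cannot be queried again in the same
         -- transaction until you commit

-- note: if tab_D has enabled triggers or foreign key constraints, Oracle will
-- silently fall back to a conventional insert despite the APPEND hint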


Alot

SQB, October 21, 2013 - 1:53 pm UTC