Compare SQLCOPY, CREATE TABLE, INSERT, BULK COLLECT and BULK INSERT.


Tom Kyte

Thanks for the question, Boopathi.

Asked: April 28, 2002 - 4:50 pm UTC

Last updated: January 17, 2013 - 8:36 am UTC

Version: 8.1.7

Viewed 10K+ times

You Asked

Tom,

Can you please tell me which one would be the fastest and why?

The data is selected using the following query:
select
chb.id dnz_chr_id,
chb.CONTRACT_NUMBER contractNum,
msib.segment1 itemName,
ib.quantity
From
mtl_system_items_b msib,
hz_cust_site_uses_all csu_in,
hz_cust_acct_sites_all cas_in,
csi_item_instances ib,
okc_k_items ci,
okc_k_lines_b clb_product,
okc_k_lines_b clb_service,
okc_k_headers_b chb
where chb.scs_code = 'SERVICE'
and chb.sts_code IN ('ACTIVE', 'OVERDUE', 'QA_HOLD', 'SIGNED')
and clb_service.chr_id = chb.id
and clb_service.lse_id = 1
and clb_service.sts_code IN ('ACTIVE', 'OVERDUE', 'QA_HOLD', 'SIGNED')
and clb_product.cle_id = clb_service.id
and clb_product.lse_id = 9
and clb_product.sts_code IN ('ACTIVE', 'OVERDUE', 'QA_HOLD', 'SIGNED')
and ci.cle_id = clb_product.id
and ci.jtot_object1_code = 'OKX_CUSTPROD'
and ib.instance_id = to_number(ci.object1_id1)
and cas_in.party_site_id(+) = ib.install_location_id
and csu_in.cust_acct_site_id(+) = nvl(cas_in.cust_acct_site_id, -999)
and nvl(csu_in.site_use_code,'SHIP_TO') = 'SHIP_TO'
and msib.inventory_item_id = ib.inventory_item_id
and msib.organization_id = ib.inv_master_organization_id

These are huge 11i tables.

1) SQL*Plus COPY ... INSERT dest_table USING ....
with following
SET COPYCOMMIT 4;
SET ARRAYSIZE 5000;
SET LONG 200

2) INSERT /*APPEND parallel(dest_table,10) */ into dest_table ...

3) CREATE TABLE AS ...

4) Using PLSQL:
BULK COLLECT
FORALL.. BULK INSERT
COMMIT for every 25000 records.

Can you please also explain whether the COMMIT size matters here? Can a commit size be set for #3 without modifying the select statement?

Thanks a lot,

Boopathi

and Tom said...

I would believe the parallel direct path insert OR a parallel create table as select will beat the others.

#4 -- that would be the SLOWEST -- don't even consider it. Committing doesn't save "resources"; it causes you to WAIT more often for log file syncs. It generates MORE redo. It generates MORE undo. You introduce procedural code into something non-procedural. That'll only slow it down.

#1 -- while very efficient -- would work best on a heavily indexed table (not the case here, since you are doing the copy into a new table -- no indexes). It is "C" based (fast). I wouldn't use nearly that arraysize (100 rows is a good "sweet spot") and I wouldn't use COPYCOMMIT; I would size my rollback segments properly. #1 can be very slow if the SQL*Plus session *is not* on the server in question, as the data is pulled TO the client and sent back to the SERVER. Do that on a single machine if you can.

I would go with a create table NOLOGGING + parallel direct path insert personally. It'll be the same as a create table as select. Then, do unrecoverable index builds in parallel on this table.

You DO NOT want to set a commit size. A CTAS and a direct path insert into unindexed tables generate VERY VERY little undo (they load above the high water mark of the table; they do not record the changes row by row). Committing frequently is *bad*, *slow*, *bad*.
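A minimal sketch of that recommended approach (the table, column, index names and degree of parallelism below are placeholders, not from the question):

```sql
-- one-shot copy: direct path load, no redo generated for the data itself
create table dest_table nologging parallel 4
as
select /*+ parallel( src 4 ) */ *
  from source_table src;

-- build any indexes afterwards, also without redo and in parallel
create index dest_table_idx on dest_table( some_col )
  nologging parallel 4;

-- optionally restore the defaults once the load is done
alter table dest_table logging noparallel;
alter index dest_table_idx logging noparallel;
```

Since the load is NOLOGGING, the loaded data cannot be recovered from the redo stream -- a backup of the affected datafiles after the load is prudent.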




Comments

Append hint

A reader, April 29, 2002 - 12:14 am UTC

Tom,
Regarding #2:
a) When the insert is parallelized using the parallel hint, isn't append mode used by default?
b) When using hints -- especially when comparing the performance of various options, and on big tables at that -- the hint syntax is critical (the + is missing in this case). Oracle just treats a malformed hint as a plain comment, and one could end up with a wrong conclusion.
c) What about using parallel DML?


Tom Kyte
April 29, 2002 - 7:32 am UTC

a) no, read:

http://docs.oracle.com/docs/cd/A87860_01/doc/server.817/a76965/c21dlins.htm#2507

b) yes, if you give us garbage in, we'll give garbage out.

c) what about it?
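To illustrate point (b) above: a hint must begin with '+' immediately after the comment opener, otherwise it is an ordinary comment and is silently ignored (dest_table and t here are placeholder names):

```sql
-- a real hint: requests a direct path, parallel insert
insert /*+ append parallel( dest_table, 10 ) */ into dest_table
select * from t;

-- only a comment -- the optimizer ignores it completely,
-- so this is a plain conventional path insert
insert /* append parallel( dest_table, 10 ) */ into dest_table
select * from t;
```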

CTAS vs query

A reader, November 17, 2004 - 10:24 am UTC

CBO, 9.2.0.4

On many occasions, I see that a query runs in less than a minute, but if I do a CTAS with that query it takes forever (more than 10 times longer).

Why is this? The default optimization mode for the CBO is "all rows", so the query is already optimized to return all rows -- so why does the CTAS take so long? The only difference I see in the plan is an extra 'LOAD AS SELECT' at the top.

Thanks

Tom Kyte
November 17, 2004 - 2:02 pm UTC

how many rows? -- remember, a select just fetches and dumps; a CTAS writes

A reader, November 17, 2004 - 2:10 pm UTC

Less than 1000 rows, that's why I was surprised.

Tom Kyte
November 17, 2004 - 2:14 pm UTC

let's see the tkprof.

CTAS vs. Direct Path Insert Oracle 9.2.

Jens, February 25, 2005 - 9:54 am UTC

Hi Tom,

I have a CTAS and a direct-path insert with the same underlying query (it's a somewhat complicated query with some subqueries). I get totally different execution plans. I would have thought that the optimizer would give me the same plan. What could be the reason? I generated the plans in the same session.

Bye,

Jens

Tom Kyte
February 25, 2005 - 5:42 pm UTC

CTAS can do things insert /*+ append */ cannot (especially in PQ operations)

how big are the plans? can you share them (if they are relatively small and digestible)

Execution Plan

Jens, February 28, 2005 - 9:29 am UTC

Ok, here is the select:

CREATE TABLE t_test
AS
SELECT k.kost_buch_wert_id,
k.kart_id,
k.kart_kto_id,
k.verm_vtr_id,
k.verm_vtr_dhln_bio_neu AS VERM_VTR_DHLN_BIO,
k.adnr,
k.kst_id,
k.waehr_knz_id,
k.skr_abr_intv_von,
k.skr_abr_intv_bis,
k.skr_abr_monat,
k.skr_abr_jahr,
k.buch_wert,
k.buch_text,
k.buch_dat,
k.oe_id_neu AS OE_ID,
oe.dhln AS OE_DHLN,
oe.uego_oe_id AS UEGO_OE_ID,
k.ur_dv_system_id,
k.dv_system_id,
k.exp_dat,
k.imp_dat,
k.liefer_id,
k.liefer_id_dgeu,
k.liefer_id_dkeu
FROM (
SELECT k.kost_buch_wert_id,
k.kart_id,
k.kart_kto_id,
k.verm_vtr_id,
v.dhln AS VERM_VTR_DHLN_BIO_NEU,
k.adnr,
k.kst_id,
k.waehr_knz_id,
k.skr_abr_intv_von,
k.skr_abr_intv_bis,
k.skr_abr_monat,
k.skr_abr_jahr,
k.buch_wert,
k.buch_text,
k.buch_dat,
CASE WHEN v.verm_vtr_id IS NOT NULL THEN
v.oe_id
ELSE
NVL(( SELECT DISTINCT /*wg.Fehler in Daten*/
oe_id
FROM bio.s_kst_bio
WHERE kst_id = k.kst_id
AND dkedat = TO_DATE( '04.04.4444', 'DD.MM.YYYY' )
AND dgedat > TO_DATE( '26.01.2005', 'DD.MM.YYYY' )
AND dgbdat <= TO_DATE( '26.01.2005', 'DD.MM.YYYY' )
),
( SELECT oe_id
FROM bio.s_kst_bio kst
WHERE kst.kst_id = k.kst_id
AND kst.dkedat = TO_DATE( '04.04.4444', 'DD.MM.YYYY' )
AND kst.dhln = ( SELECT MAX( dhln )
FROM bio.s_kst_bio kst2
WHERE kst2.kst_id = kst.kst_id
AND kst2.dkedat = TO_DATE( '04.04.4444', 'DD.MM.YYYY' )
)
)
)
END AS OE_ID_NEU,
k.ur_dv_system_id,
k.dv_system_id,
k.exp_dat,
k.imp_dat,
k.liefer_id,
k.liefer_id_dgeu,
k.liefer_id_dkeu
FROM bio.t_kost_buch_wert_alt_bio k,
bio.t_verm_vtr_bio v
WHERE v.verm_vtr_id(+) = k.verm_vtr_id
AND v.dgbdat(+) <= TO_DATE( '26.01.2005', 'DD.MM.YYYY' )
AND v.dgedat(+) > TO_DATE( '26.01.2005', 'DD.MM.YYYY' )
AND v.dkedat(+) = TO_DATE( '04.04.4444', 'DD.MM.YYYY' )
AND k.liefer_id BETWEEN 2402 AND 2403
) k,
bio.s_oe_bio oe
WHERE oe.oe_id = k.oe_id_neu
AND oe.dkedat = TO_DATE( '04.04.4444', 'DD.MM.YYYY' )
AND (
( oe.dgbdat <= TO_DATE( '26.01.2005', 'DD.MM.YYYY' )
AND oe.dgedat > TO_DATE( '26.01.2005', 'DD.MM.YYYY' )
)
OR
( oe.dhln = ( SELECT MAX( dhln )
FROM bio.s_oe_bio
WHERE oe_id = oe.oe_id
AND dkedat = TO_DATE( '04.04.4444', 'DD.MM.YYYY' )
)
AND NOT EXISTS( SELECT 1
FROM bio.s_oe_bio
WHERE oe_id = oe.oe_id
AND dkedat = TO_DATE( '04.04.4444', 'DD.MM.YYYY' )
AND dgbdat <= TO_DATE( '26.01.2005', 'DD.MM.YYYY' )
AND dgedat > TO_DATE( '26.01.2005', 'DD.MM.YYYY' )
)
)
)

------------------------------------------------------------


This is the execution-plan with the CTAS:


Statement Id=4203132 Type=
Cost=2,64023809101798E-308 TimeStamp=28-02-05::12::37:51

(1) CREATE TABLE STATEMENT CHOOSE
Est. Rows: 371.528 Cost: 7.923
LOAD AS SELECT
(13) FILTER
(7) HASH JOIN
Est. Rows: 371.528 Cost: 7.923
(2) TABLE ACCESS FULL BIO.S_OE_BIO [Analyzed]
(2) Blocks: 20 Est. Rows: 367 of 1.127 Cost: 5
Tablespace: BIO
(6) VIEW (Embedded SQL)
Est. Rows: 365.871 Cost: 7.914
(5) HASH JOIN OUTER
Est. Rows: 365.871 Cost: 7.914
(3) TABLE ACCESS FULL BIO.T_KOST_BUCH_WERT_ALT_BIO [Analyzed]
(3) Blocks: 9.064 Est. Rows: 286.165 of 1.729.955 Cost: 1.377
Tablespace: BIO
(4) TABLE ACCESS FULL BIO.T_VERM_VTR_BIO [Analyzed]
(4) Blocks: 39.955 Est. Rows: 173.277 of 1.850.710 Cost: 6.066
Tablespace: BIO
(10) SORT AGGREGATE
Est. Rows: 1
(9) TABLE ACCESS BY INDEX ROWID BIO.S_OE_BIO [Analyzed]
(9) Blocks: 20 Est. Rows: 3 of 1.127 Cost: 4
Tablespace: BIO
(8) UNIQUE INDEX RANGE SCAN BIO.CPK_S_OE [Analyzed]
Est. Rows: 3 Cost: 2
(12) TABLE ACCESS BY INDEX ROWID BIO.S_OE_BIO [Analyzed]
(12) Blocks: 20 Est. Rows: 1 of 1.127 Cost: 4
Tablespace: BIO
(11) UNIQUE INDEX RANGE SCAN BIO.CPK_S_OE [Analyzed]
Est. Rows: 3 Cost: 2



------------------------------------------------------------

This is the execution-plan with the INSERT:

Statement Id=4203132 Type=
Cost=2,64023809101798E-308 TimeStamp=28-02-05::15::25:35

(1) INSERT STATEMENT CHOOSE
Est. Rows: 66.486 Cost: 203.153
LOAD AS SELECT
(23) FILTER
(17) TABLE ACCESS BY INDEX ROWID BIO.S_OE_BIO [Analyzed]
(17) Blocks: 20 Est. Rows: 367 of 1.127 Cost: 3
Tablespace: BIO
(16) NESTED LOOPS
Est. Rows: 66.486 Cost: 203.153
(5) HASH JOIN OUTER
Est. Rows: 65.474 Cost: 6.731
(3) TABLE ACCESS BY INDEX ROWID BIO.T_KOST_BUCH_WERT_ALT_BIO [Analyzed]
(3) Blocks: 15.602 Est. Rows: 65.474 of 2.146.363 Cost: 578
Tablespace: BIO
(2) NON-UNIQUE INDEX RANGE SCAN BIO.X_T_KOST_BUCH_WERT_ALT_2 [Analyzed]
Est. Rows: 65.474 Cost: 101
(4) TABLE ACCESS FULL BIO.T_VERM_VTR_BIO [Analyzed]
(4) Blocks: 40.178 Est. Rows: 177 of 1.858.758 Cost: 6.100
Tablespace: BIO
(15) UNIQUE INDEX RANGE SCAN BIO.CPK_S_OE [Analyzed]
Est. Rows: 3 Cost: 1
(8) SORT UNIQUE
Est. Rows: 1 Cost: 49
(7) TABLE ACCESS BY INDEX ROWID BIO.S_KST_BIO [Analyzed]
(7) Blocks: 187 Est. Rows: 1 of 11.294 Cost: 4
Tablespace: BIO
(6) UNIQUE INDEX RANGE SCAN BIO.CPK_S_KST [Analyzed]
Est. Rows: 2 Cost: 2
(14) NESTED LOOPS
Est. Rows: 1 Cost: 3
(11) VIEW SYS.VW_SQ_1
Est. Rows: 1 Cost: 2
(10) SORT GROUP BY
Est. Rows: 1 Cost: 2
(9) UNIQUE INDEX RANGE SCAN BIO.CPK_S_KST [Analyzed]
Est. Rows: 2 Cost: 2
(13) TABLE ACCESS BY INDEX ROWID BIO.S_KST_BIO [Analyzed]
(13) Blocks: 187 Est. Rows: 1 of 11.294 Cost: 1
Tablespace: BIO
(12) UNIQUE INDEX UNIQUE SCAN BIO.CPK_S_KST [Analyzed]
Est. Rows: 5.647
(20) SORT AGGREGATE
Est. Rows: 1
(19) TABLE ACCESS BY INDEX ROWID BIO.S_OE_BIO [Analyzed]
(19) Blocks: 20 Est. Rows: 3 of 1.127 Cost: 4
Tablespace: BIO
(18) UNIQUE INDEX RANGE SCAN BIO.CPK_S_OE [Analyzed]
Est. Rows: 3 Cost: 2
(22) TABLE ACCESS BY INDEX ROWID BIO.S_OE_BIO [Analyzed]
(22) Blocks: 20 Est. Rows: 1 of 1.127 Cost: 4
Tablespace: BIO
(21) UNIQUE INDEX RANGE SCAN BIO.CPK_S_OE [Analyzed]
Est. Rows: 3 Cost: 2


Tom Kyte
February 28, 2005 - 9:38 am UTC

interesting -- I would not have expected such wide variations in the cardinalities.

can you use explain plan instead? I'm not entirely familiar with the above output (are the '.'s decimal points or thousands separators -- looks like thousands, right?)

I'm assuming there is a large performance difference between the two?

diff between CTAS and insert parallel

Steve, February 28, 2005 - 10:11 am UTC

Tom,

Can you give a simple example for
"
CTAS can do things insert /*+ append */ cannot (especially in PQ operations)"


Thanks!

Steve

Tom Kyte
February 28, 2005 - 10:46 am UTC

has to do with the number of PQ processes that need to be used; the CTAS can be "parent combined with child" instead of pairs of PQ processes. It amounts to extra work with the insert append that the CTAS can skip:

ops$tkyte@ORA9IR2> create table t1 as select * from all_objects where 1=0;
Table created.

ops$tkyte@ORA9IR2> delete from plan_table;
3 rows deleted.

ops$tkyte@ORA9IR2> explain plan for &1;
old   1: explain plan for &1
new   1: explain plan for create table t2 parallel( degree 4) as select /*+ parallel( t1 4) */* from t1
Explained.

ops$tkyte@ORA9IR2> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
-------------------------

--------------------------------------------------------------------------
| Id  | Operation             |Name|Rows|Bytes|Cost| TQ  |INOUT|PQ Distrib|
--------------------------------------------------------------------------
|   0 | CREATE TABLE STATEMENT|    |  82|10496|  1 |     |     |          |
|   1 |  LOAD AS SELECT       |    |    |     |    |16,00|P->S |QC (RAND) |
|   2 |   TABLE ACCESS FULL   |T1  |  82|10496|  1 |16,00|PCWP |          |
--------------------------------------------------------------------------
Note: cpu costing is off
10 rows selected.

ops$tkyte@ORA9IR2> create table t2 as select * from all_objects where 1=0;
Table created.

ops$tkyte@ORA9IR2> delete from plan_table;
3 rows deleted.

ops$tkyte@ORA9IR2> explain plan for &1;
old   1: explain plan for &1
new   1: explain plan for insert /*+ append parallel( t2 4) */ into t2 select /*+ parallel( t1 4) */* from t1
Explained.

ops$tkyte@ORA9IR2> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------

-------------------------------------------------------------------------
| Id  | Operation            |Name|Rows|Bytes|Cost| TQ  |INOUT|PQ Distrib|
-------------------------------------------------------------------------
|   0 | INSERT STATEMENT     |    |  82|10496|   1|     |     |          |
|   1 |  LOAD AS SELECT      |    |    |     |    |     |     |          |
|   2 |   TABLE ACCESS FULL  |T1  |  82|10496|   1|17,00|P->S |QC (RAND) |
-------------------------------------------------------------------------
Note: cpu costing is off
10 rows selected.

New Analyze, new Plans

Jens, February 28, 2005 - 11:21 am UTC

Sorry Tom, I couldn't use any other explain-plan output. It's SQL Navigator (and yes, the '.' is a thousands separator).

I asked the DBA to do a new analyze for the underlying tables, now we have these different plans:

CTAS (runs in 17 Sec., produces 280.000 rows):
----------------------------------------------

Statement Id=4203132 Type=
Cost=2,64023809101798E-308 TimeStamp=28-02-05::16::49:51

(1) CREATE TABLE STATEMENT CHOOSE
Est. Rows: 1.447 Cost: 6.737
LOAD AS SELECT
(14) FILTER
(8) HASH JOIN
Est. Rows: 1.447 Cost: 6.737
(2) TABLE ACCESS FULL BIO.S_OE_BIO [Analyzed]
(2) Blocks: 20 Est. Rows: 11 of 1.127 Cost: 5
Tablespace: BIO
(7) VIEW (Embedded SQL)
Est. Rows: 65.474 Cost: 6.731
(6) HASH JOIN OUTER
Est. Rows: 65.474 Cost: 6.731
(4) TABLE ACCESS BY INDEX ROWID BIO.T_KOST_BUCH_WERT_ALT_BIO [Analyzed]
(4) Blocks: 15.602 Est. Rows: 65.474 of 2.146.363 Cost: 578
Tablespace: BIO
(3) NON-UNIQUE INDEX RANGE SCAN BIO.X_T_KOST_BUCH_WERT_ALT_2 [Analyzed]
Est. Rows: 65.474 Cost: 101
(5) TABLE ACCESS FULL BIO.T_VERM_VTR_BIO [Analyzed]
(5) Blocks: 40.178 Est. Rows: 177 of 1.858.758 Cost: 6.100
Tablespace: BIO
(11) SORT AGGREGATE
Est. Rows: 1
(10) TABLE ACCESS BY INDEX ROWID BIO.S_OE_BIO [Analyzed]
(10) Blocks: 20 Est. Rows: 1 of 1.127 Cost: 4
Tablespace: BIO
(9) UNIQUE INDEX RANGE SCAN BIO.CPK_S_OE [Analyzed]
Est. Rows: 2 Cost: 2
(13) TABLE ACCESS BY INDEX ROWID BIO.S_OE_BIO [Analyzed]
(13) Blocks: 20 Est. Rows: 1 of 1.127 Cost: 4
Tablespace: BIO
(12) UNIQUE INDEX RANGE SCAN BIO.CPK_S_OE [Analyzed]
Est. Rows: 2 Cost: 2


DIRECT-PATH INSERT (after more than 20 min. it breaks with "cannot extend temp segment in temp tablespace" -- and the temp tablespace is more than 10 GB!):
----------------------------------------------------------

Statement Id=4203132 Type=
Cost=2,64023809101798E-308 TimeStamp=28-02-05::17::02:33

(1) INSERT STATEMENT CHOOSE
Est. Rows: 733.618 Cost: 35.226.859
LOAD AS SELECT
(23) FILTER
(8) HASH JOIN OUTER
Est. Rows: 733.618 Cost: 13.195
(6) MERGE JOIN CARTESIAN
Est. Rows: 733.618 Cost: 6.352
(2) TABLE ACCESS FULL BIO.S_OE_BIO [Analyzed]
(2) Blocks: 20 Est. Rows: 11 of 1.127 Cost: 5
Tablespace: BIO
(5) BUFFER SORT
Est. Rows: 65.474 Cost: 6.347
(4) TABLE ACCESS BY INDEX ROWID BIO.T_KOST_BUCH_WERT_ALT_BIO [Analyzed]
(4) Blocks: 15.602 Est. Rows: 65.474 of 2.146.363 Cost: 577
Tablespace: BIO
(3) NON-UNIQUE INDEX RANGE SCAN BIO.X_T_KOST_BUCH_WERT_ALT_2 [Analyzed]
Est. Rows: 65.474 Cost: 100
(7) TABLE ACCESS FULL BIO.T_VERM_VTR_BIO [Analyzed]
(7) Blocks: 40.178 Est. Rows: 177 of 1.858.758 Cost: 6.100
Tablespace: BIO
(11) SORT UNIQUE
Est. Rows: 1 Cost: 48
(10) TABLE ACCESS BY INDEX ROWID BIO.S_KST_BIO [Analyzed]
(10) Blocks: 202 Est. Rows: 1 of 12.123 Cost: 3
Tablespace: BIO
(9) UNIQUE INDEX RANGE SCAN BIO.CPK_S_KST [Analyzed]
Est. Rows: 1 Cost: 2
(17) NESTED LOOPS
Est. Rows: 1 Cost: 3
(14) VIEW SYS.VW_SQ_1
Est. Rows: 1 Cost: 2
(13) SORT GROUP BY
Est. Rows: 1 Cost: 2
(12) UNIQUE INDEX RANGE SCAN BIO.CPK_S_KST [Analyzed]
Est. Rows: 1 Cost: 2
(16) TABLE ACCESS BY INDEX ROWID BIO.S_KST_BIO [Analyzed]
(16) Blocks: 202 Est. Rows: 1 of 12.123 Cost: 1
Tablespace: BIO
(15) UNIQUE INDEX UNIQUE SCAN BIO.CPK_S_KST [Analyzed]
Est. Rows: 12.123
(20) SORT AGGREGATE
Est. Rows: 1
(19) TABLE ACCESS BY INDEX ROWID BIO.S_OE_BIO [Analyzed]
(19) Blocks: 20 Est. Rows: 1 of 1.127 Cost: 4
Tablespace: BIO
(18) UNIQUE INDEX RANGE SCAN BIO.CPK_S_OE [Analyzed]
Est. Rows: 2 Cost: 2
(22) TABLE ACCESS BY INDEX ROWID BIO.S_OE_BIO [Analyzed]
(22) Blocks: 20 Est. Rows: 1 of 1.127 Cost: 4
Tablespace: BIO
(21) UNIQUE INDEX RANGE SCAN BIO.CPK_S_OE [Analyzed]
Est. Rows: 2 Cost: 2


I think there is something wrong with the optimizer. Unfortunately, I can't invest more time in this, because I'm not responsible for that query (and my access is very restricted). The DBA "helped" the responsible developer by running the select with a RULE hint...




Tom Kyte
February 28, 2005 - 12:01 pm UTC

if you do get time -- let me know, I'd like to reproduce because it does look fishy.

I would need

a) export of tables with rows=n
b) export of stats using dbms_stats


if you are willing -- I'd be willing to import that and reproduce and see if the root cause is "obvious" or buried deep.

CTAS to create a partitioned table

A reader, August 21, 2006 - 12:42 pm UTC

Tom,
I wonder if there is any way to create a copy of a partitioned table such that the copy also has identical partitions/sub-partitions?

Thanks!

Tom Kyte
August 27, 2006 - 9:56 am UTC

ops$tkyte%ORA10GR2> CREATE TABLE t
  2  (
  3    dt  date,
  4    x   int,
  5    y   varchar2(30)
  6  )
  7  PARTITION BY RANGE (dt)
  8  (
  9    PARTITION part1 VALUES LESS THAN (to_date('13-mar-2003','dd-mon-yyyy')) ,
 10    PARTITION part2 VALUES LESS THAN (to_date('14-mar-2003','dd-mon-yyyy')) ,
 11    PARTITION junk VALUES LESS THAN (MAXVALUE)
 12  )
 13  /

Table created.

ops$tkyte%ORA10GR2> insert into t select to_date( '12-mar-2003','dd-mon-yyyy')+mod(rownum,3), user_id, username from all_users;

29 rows created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> CREATE TABLE new_t
  2  (
  3    dt  ,
  4    x  ,
  5    y
  6  )
  7  PARTITION BY RANGE (dt)
  8  (
  9    PARTITION part1 VALUES LESS THAN (to_date('13-mar-2003','dd-mon-yyyy')) ,
 10    PARTITION part2 VALUES LESS THAN (to_date('14-mar-2003','dd-mon-yyyy')) ,
 11    PARTITION junk VALUES LESS THAN (MAXVALUE)
 12  )
 13  as
 14  select * from t;

Table created.



You can use dbms_metadata.get_ddl to get the DDL and just remove the datatypes... 
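For example (a sketch, assuming a table T in the current schema):

```sql
-- make SQL*Plus show the whole CLOB that get_ddl returns
set long 100000
set pagesize 0

select dbms_metadata.get_ddl( 'TABLE', 'T' ) from dual;
```

Then edit the column datatypes out of the generated DDL and append the AS SELECT clause.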

CTAS to create a partitioned table

A reader, August 29, 2006 - 4:09 pm UTC

Thanks, Tom.

I even removed the entire column list and the data types (easier to remove lines instead of words) and I need an exact copy anyway:

CREATE TABLE new_t
PARTITION BY RANGE (dt)
(
PARTITION part1 VALUES LESS THAN (to_date('13-mar-2003','dd-mon-yyyy')) ,
PARTITION part2 VALUES LESS THAN (to_date('14-mar-2003','dd-mon-yyyy')) ,
PARTITION junk VALUES LESS THAN (MAXVALUE)
)
as
select * from t;

Another Question

Ashish, April 27, 2007 - 8:57 am UTC

Hi Tom,

We are trying to aggregate data from one table and dump its result into another. Something like:
create table table2 as select ... sum(....) from table1 group by ...


After this, when we check the results, we find that the sum is less than expected (all values in table1 are positive). We suspect that Oracle is somehow failing to pick up some rows and include them in the aggregation.
Table1 has about 200 million rows.

On the other hand, if we run the same query but without the create, i.e.:

select * from (select ... sum(....) from table1 group by ...)
where <some condition to pinpoint the required dataset>

... we get correct results.

Is there a chance that the database/db file is corrupted?
Can you please suggest how we should proceed in solving this problem?

Regards,
Ashish
Tom Kyte
April 27, 2007 - 11:15 am UTC

please utilize support.

CTAS/Direct Path Load with DBLink

A Mitra, June 30, 2010 - 4:07 pm UTC

Hi Tom,
In connection with the first question in the thread, could you tell me whether CTAS or a parallel direct path load will be equally effective over a dblink?

In my scenario, I have a big table with multiple partitions (each partition having approx 150 million records) in one database (SRCDB), from which I want to pull one partition's data to the target database, TGTDB. SRCDB and TGTDB reside on different physical servers connected over a gigabit ethernet network. The source table does not have any index, and I do not have the option to create any index, or any database object for that matter, on the source side. My target table will also be partitioned (with a scheme different from the source side's). I only have a DBLink, which I need to use to fetch the data. My options are:

1. COPY
2. CTAS (also need to create the partition in target)
3. INSERT /*+ APPEND */ (After enabling parallel ddl, dml)

Which would be the best option for me, and why?

Tom Kyte
July 06, 2010 - 2:12 pm UTC

the dblink will become a single thread in any of the parallel cases - you can run parallel on the target and the source, but there will be a single pipe from source to target at some point.

create table as select is generally preferred if the table is to be fully populated one time. It can skip a step that insert append would have to take.

The copy command should probably be avoided for ongoing use as it is deprecated and might not work in the future (something to avoid for ongoing use, if you have a one time need - it might be the right trick to use).

If this is ongoing, probably insert append - it'll direct path load the table, it'll manage the index maintenance as efficiently as possible - without having to full scan the target table once per index (that is, load with indexes enabled to avoid having to scan the table N times after loading)


You don't mention a version, but data pump might also be an option for you to consider, it can do parallel things, direct path things and so on.
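A sketch of the insert append variant over the link (all object names, the link name, the degree of parallelism and the partition-key predicate are hypothetical; note that partition-extended table names cannot be used across a database link, so the one source partition is selected with a predicate on the partition key):

```sql
-- on TGTDB
alter session enable parallel dml;

insert /*+ append parallel( t, 8 ) */ into tgt_table t
select /*+ parallel( s, 8 ) */ *
  from big_table@srcdb_link s
 where part_key >= date '2010-06-01'   -- bounds of the one
   and part_key <  date '2010-07-01';  -- source partition

commit;
```

As noted above, the pull across the link itself will still be a single stream, regardless of the parallel settings on either end.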

CTAS or insert-select

Daniel, January 16, 2013 - 2:18 am UTC

Hi Tom.

I have a large partitioned table with 13 million records.
I have to create another one, applying a function to a LOB field.
If I use CTAS, it takes much longer than the insert-select process.
Can you explain why?
Is it a parallel query issue?

Thanks

This is a test case with 10000 rows:

(Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 )

SQL> create table dd as select C_REFREC, st_centroid(shape) shape  from T1 where 1=2;

Table created.

Elapsed: 00:00:00.13
SQL> sta tr
SQL>  insert into dd  select C_REFREC, st_centroid(shape) shape  from T1 where  st_area(shape) >0 and rownum <10000;

9999 rows created.

Elapsed: 00:00:10.69


| Id  | Operation                     | Name           | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |    TQ  |IN-OUT| PQ Distrib |
---------------------------------------------------------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT              |                |  9999 |  4130K|  2090  (14)| 00:00:11 |       |       |        |      |            |
|*  1 |  COUNT STOPKEY                |                |       |       |            |          |       |       |        |      |            |
|   2 |   PX COORDINATOR FORCED SERIAL|                |       |       |            |          |       |       |        |      |            |
|   3 |    PX SEND QC (RANDOM)        | :TQ10000       |   666K|   268M|  2090  (14)| 00:00:11 |       |       |  Q1,00 | P->S | QC (RAND)  |
|*  4 |     COUNT STOPKEY             |                |       |       |            |          |       |       |  Q1,00 | PCWC |            |
|   5 |      PX BLOCK ITERATOR        |                |   666K|   268M|  2090  (14)| 00:00:11 |     1 |     9 |  Q1,00 | PCWC |            |
|*  6 |       TABLE ACCESS FULL       | RECFE12_120131 |   666K|   268M|  2090  (14)| 00:00:11 |     1 |     9 |  Q1,00 | PCWP |            |

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(ROWNUM<10000)
   4 - filter(ROWNUM<10000)
   6 - filter("SDE"."ST_AREA"("SHAPE")>0)
Statistics
----------------------------------------------------------
      41271  recursive calls
     272821  db block gets
      31218  consistent gets
        347  physical reads
    4505756  redo size
        671  bytes sent via SQL*Net to client
        661  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
       9999  rows processed

SQL> create table dd as select C_REFREC, st_centroid(shape) shape from T1 where st_area(shape) >0 and rownum <10000;

Table created.

Elapsed: 00:02:17.28

Tom Kyte
January 16, 2013 - 11:57 am UTC

trace both, use tkprof



I can say you are comparing apples to flying toaster ovens. Your insert is a conventional path load - using the buffer cache, your CTAS is a direct path load.


I would suspect however different plans.


why test on 10,000 rows when you really want to do millions? You cannot really compare these two at 10,000 rows and expect that to scale up.


It isn't going to be fast in any case -- invoking a user-defined function against a LOB 13 million times. Are you sure you need to do that? What does st_area do exactly?
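A sketch of how both runs could be traced for comparison (the tracefile identifier is an arbitrary label; dbms_monitor is available from 10g on):

```sql
alter session set tracefile_identifier = 'ctas_vs_insert';
exec dbms_monitor.session_trace_enable( waits => true, binds => false )

-- run the insert-select (or the CTAS) here ...

exec dbms_monitor.session_trace_disable
```

Then run tkprof against the resulting trace file on the server and compare the two reports line by line.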

ctas or insert-select

Daniel, January 17, 2013 - 1:52 am UTC

I tested the case with the actual table. First I launched the CTAS, and I stopped it 26 hours later. I could not believe it took so long. The insert-select took 4 hours.
I traced the session, but
"tkprof sys=no explain=user/passw"
doesn't show the execution plan for the CTAS.

I think that the execution plan must be a full scan.

The LOB field stores geographic data. I need to create a new table based on the first one, simplifying the polygonal data into points to draw graphics more efficiently (when I view the data from far enough away, the polygons appear to be points, so to avoid querying a huge amount of data, I want to convert the polygons to points). This is what ST_Centroid does. To achieve this, it calls an external library.
ST_Area returns the area of the polygon. I also need to filter some empty polygons (very few).
The shape (polygon data) is stored in the field as follows:
 SQL> desc SDE.ST_GEOMETRY
 SDE.ST_GEOMETRY is NOT FINAL
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 ENTITY                                             NUMBER(38)
 NUMPTS                                             NUMBER(38)
 MINX                                               FLOAT(64)
 MINY                                               FLOAT(64)
 MAXX                                               FLOAT(64)
 MAXY                                               FLOAT(64)
 MINZ                                               FLOAT(64)
 MAXZ                                               FLOAT(64)
 MINM                                               FLOAT(64)
 MAXM                                               FLOAT(64)
 AREA                                               FLOAT(64) <- this is returned by st_area
 LEN                                                FLOAT(64)
 SRID                                               NUMBER(38)
 POINTS                                             BLOB

METHOD
------
 FINAL CONSTRUCTOR FUNCTION ST_GEOMETRY RETURNS SELF AS RESULT
 Argument Name                  Type                    In/Out Default?
 ------------------------------ ----------------------- ------ --------
 GEOM_STR                       CLOB                    IN
 SRID                           NUMBER                  IN

METHOD
------
 MEMBER FUNCTION ST_AREA RETURNS NUMBER

METHOD
------
 MEMBER FUNCTION ST_LEN RETURNS NUMBER

METHOD
------
 MEMBER FUNCTION ST_LENGTH RETURNS NUMBER

METHOD
------
 MEMBER FUNCTION ST_ENTITY RETURNS NUMBER

METHOD
------
 MEMBER FUNCTION ST_NUMPTS RETURNS NUMBER

METHOD
------
 MEMBER FUNCTION ST_MINX RETURNS NUMBER

METHOD
------
 MEMBER FUNCTION ST_MAXX RETURNS NUMBER

METHOD
------
 MEMBER FUNCTION ST_MINY RETURNS NUMBER

METHOD
------
 MEMBER FUNCTION ST_MAXY RETURNS NUMBER

METHOD
------
 MEMBER FUNCTION ST_MINM RETURNS NUMBER

METHOD
------
 MEMBER FUNCTION ST_MAXM RETURNS NUMBER

METHOD
------
 MEMBER FUNCTION ST_MINZ RETURNS NUMBER

METHOD
------
 MEMBER FUNCTION ST_MAXZ RETURNS NUMBER

METHOD
------
 MEMBER FUNCTION ST_SRID RETURNS NUMBER

METHOD
------
 STATIC FUNCTION GET_RELEASE RETURNS NUMBER

Tom Kyte
January 17, 2013 - 8:36 am UTC

explain plan for create .....;
select * from table(dbms_xplan.display)

will show you the plan for the CTAS.


In 11g, it would just be in the trace file, you wouldn't have to do anything special to see it.

ctas or insert-select

Daniel, January 17, 2013 - 9:42 am UTC

I think that the execution plan is correct. Full scan and filter by st_area:


---------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation                     | Name           | Rows | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |    TQ  |IN-OUT| PQ Distrib |
---------------------------------------------------------------------------------------------------------------------------------------------
|  0 | CREATE TABLE STATEMENT        |                |  666K|  268M |  2206  (13)| 00:00:11 |       |       |        |      |            |
|  1 |  LOAD AS SELECT               | TMP            |      |       |            |          |       |       |        |      |            |
|  2 |   PX COORDINATOR FORCED SERIAL|                |      |       |            |          |       |       |        |      |            |
|  3 |    PX SEND QC (RANDOM)        | :TQ10000       |  666K|  268M |  2090  (14)| 00:00:11 |       |       |  Q1,00 | P->S | QC (RAND)  |
|  4 |     PX BLOCK ITERATOR         |                |  666K|  268M |  2090  (14)| 00:00:11 |     1 |     9 |  Q1,00 | PCWC |            |
|* 5 |      TABLE ACCESS FULL        | RECFE12_120131 |  666K|  268M |  2090  (14)| 00:00:11 |     1 |     9 |  Q1,00 | PCWP |            |
---------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   5 - filter("SDE"."ST_AREA"("SHAPE")>0)

The table isn't small:

select sum(bytes)/1024/1024
from user_segments
where segment_name='RECFE12_120131';
SUM(BYTES)/1024/1024
--------------------
4338

But 24 hours seems to be too much...
And the insert select only takes 4 hours. I can’t understand it.