A reader, April 28, 2002 - 7:59 pm UTC
Append hint
A reader, April 29, 2002 - 12:14 am UTC
Tom,
Regarding 2..
a) When an insert is parallelized using a parallel hint, isn't append mode used by default?
b) When using hints, especially when comparing the performance of various options (and that too for big tables), the hint syntax is critical (the missing '+' in this case), since Oracle silently treats a malformed hint as an ordinary comment, and one could end up drawing the wrong conclusion.
c) What about using parallel DML?
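To make point b) concrete, here is a sketch with hypothetical table names (`big_t`, `stage`) showing how a missing '+' silently turns a hint into a comment, plus the parallel-DML variant from a) and c):

```sql
-- Valid hint: the '+' immediately follows the comment opener,
-- so Oracle attempts a direct-path (append) insert.
INSERT /*+ APPEND */ INTO big_t SELECT * FROM stage;

-- Missing '+': this is just an ordinary comment, so Oracle quietly
-- does a conventional-path insert -- no error, no warning.
INSERT /* append */ INTO big_t SELECT * FROM stage;

-- Parallel DML must be enabled at the session level before a parallel
-- insert will actually execute in parallel (and a parallel insert is
-- direct path by default).
ALTER SESSION ENABLE PARALLEL DML;
INSERT /*+ APPEND PARALLEL(big_t 4) */ INTO big_t
SELECT /*+ PARALLEL(stage 4) */ * FROM stage;
COMMIT;
```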
CTAS vs query
A reader, November 17, 2004 - 10:24 am UTC
CBO, 9.2.0.4
On many occasions I see that a query runs in less than a minute, but if I do a CTAS with that same query it takes forever (more than 10 times longer).
Why is this? The default optimization mode for the CBO is "all rows", so the query is already optimized to return all rows. Why does the CTAS take so long? The only difference I see in the plan is an extra 'LOAD AS SELECT' at the top.
Thanks
November 17, 2004 - 2:02 pm UTC
how many rows -- remember a select just fetches and dumps, a CTAS writes
A reader, November 17, 2004 - 2:10 pm UTC
Less than 1,000 rows, that's why I was surprised.
November 17, 2004 - 2:14 pm UTC
let's see the tkprof.
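A minimal way to generate that tkprof (a sketch; `t_copy` and `slow_view` are hypothetical names, and the raw trace file lands in user_dump_dest):

```sql
ALTER SESSION SET sql_trace = TRUE;
CREATE TABLE t_copy AS SELECT * FROM slow_view;   -- the slow CTAS
ALTER SESSION SET sql_trace = FALSE;
-- then, at the OS prompt, format the raw trace file:
--   tkprof <trace file> report.txt sys=no
```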
CTAS vs. Direct Path Insert Oracle 9.2.
Jens, February 25, 2005 - 9:54 am UTC
Hi Tom,
I have a CTAS and a direct-path insert with the same underlying query (it's a somewhat complicated query with some subqueries). I get totally different execution plans. I would have thought that the optimizer would give me the same plan. What could be the reason? I generated both plans in the same session.
Bye,
Jens
February 25, 2005 - 5:42 pm UTC
CTAS can do things insert /*+ append */ cannot (especially in PQ operations)
how big are the plans? can you share them (if they are relatively small and digestible)?
Execution Plan
Jenes, February 28, 2005 - 9:29 am UTC
Ok, here is the select:
CREATE TABLE t_test
AS
SELECT k.kost_buch_wert_id,
k.kart_id,
k.kart_kto_id,
k.verm_vtr_id,
k.verm_vtr_dhln_bio_neu AS VERM_VTR_DHLN_BIO,
k.adnr,
k.kst_id,
k.waehr_knz_id,
k.skr_abr_intv_von,
k.skr_abr_intv_bis,
k.skr_abr_monat,
k.skr_abr_jahr,
k.buch_wert,
k.buch_text,
k.buch_dat,
k.oe_id_neu AS OE_ID,
oe.dhln AS OE_DHLN,
oe.uego_oe_id AS UEGO_OE_ID,
k.ur_dv_system_id,
k.dv_system_id,
k.exp_dat,
k.imp_dat,
k.liefer_id,
k.liefer_id_dgeu,
k.liefer_id_dkeu
FROM (
SELECT k.kost_buch_wert_id,
k.kart_id,
k.kart_kto_id,
k.verm_vtr_id,
v.dhln AS VERM_VTR_DHLN_BIO_NEU,
k.adnr,
k.kst_id,
k.waehr_knz_id,
k.skr_abr_intv_von,
k.skr_abr_intv_bis,
k.skr_abr_monat,
k.skr_abr_jahr,
k.buch_wert,
k.buch_text,
k.buch_dat,
CASE WHEN v.verm_vtr_id IS NOT NULL THEN
v.oe_id
ELSE
NVL(( SELECT DISTINCT /* due to errors in the data */
oe_id
FROM bio.s_kst_bio
WHERE kst_id = k.kst_id
AND dkedat = TO_DATE( '04.04.4444', 'DD.MM.YYYY' )
AND dgedat > TO_DATE( '26.01.2005', 'DD.MM.YYYY' )
AND dgbdat <= TO_DATE( '26.01.2005', 'DD.MM.YYYY' )
),
( SELECT oe_id
FROM bio.s_kst_bio kst
WHERE kst.kst_id = k.kst_id
AND kst.dkedat = TO_DATE( '04.04.4444', 'DD.MM.YYYY' )
AND kst.dhln = ( SELECT MAX( dhln )
FROM bio.s_kst_bio kst2
WHERE kst2.kst_id = kst.kst_id
AND kst2.dkedat = TO_DATE( '04.04.4444', 'DD.MM.YYYY' )
)
)
)
END AS OE_ID_NEU,
k.ur_dv_system_id,
k.dv_system_id,
k.exp_dat,
k.imp_dat,
k.liefer_id,
k.liefer_id_dgeu,
k.liefer_id_dkeu
FROM bio.t_kost_buch_wert_alt_bio k,
bio.t_verm_vtr_bio v
WHERE v.verm_vtr_id(+) = k.verm_vtr_id
AND v.dgbdat(+) <= TO_DATE( '26.01.2005', 'DD.MM.YYYY' )
AND v.dgedat(+) > TO_DATE( '26.01.2005', 'DD.MM.YYYY' )
AND v.dkedat(+) = TO_DATE( '04.04.4444', 'DD.MM.YYYY' )
AND k.liefer_id BETWEEN 2402 AND 2403
) k,
bio.s_oe_bio oe
WHERE oe.oe_id = k.oe_id_neu
AND oe.dkedat = TO_DATE( '04.04.4444', 'DD.MM.YYYY' )
AND (
( oe.dgbdat <= TO_DATE( '26.01.2005', 'DD.MM.YYYY' )
AND oe.dgedat > TO_DATE( '26.01.2005', 'DD.MM.YYYY' )
)
OR
( oe.dhln = ( SELECT MAX( dhln )
FROM bio.s_oe_bio
WHERE oe_id = oe.oe_id
AND dkedat = TO_DATE( '04.04.4444', 'DD.MM.YYYY' )
)
AND NOT EXISTS( SELECT 1
FROM bio.s_oe_bio
WHERE oe_id = oe.oe_id
AND dkedat = TO_DATE( '04.04.4444', 'DD.MM.YYYY' )
AND dgbdat <= TO_DATE( '26.01.2005', 'DD.MM.YYYY' )
AND dgedat > TO_DATE( '26.01.2005', 'DD.MM.YYYY' )
)
)
)
------------------------------------------------------------
This is the execution-plan with the CTAS:
Statement Id=4203132 Type=
Cost=2,64023809101798E-308 TimeStamp=28-02-05::12::37:51
(1) CREATE TABLE STATEMENT CHOOSE
Est. Rows: 371.528 Cost: 7.923
LOAD AS SELECT
(13) FILTER
(7) HASH JOIN
Est. Rows: 371.528 Cost: 7.923
(2) TABLE ACCESS FULL BIO.S_OE_BIO [Analyzed]
(2) Blocks: 20 Est. Rows: 367 of 1.127 Cost: 5
Tablespace: BIO
(6) VIEW (Embedded SQL)
Est. Rows: 365.871 Cost: 7.914
(5) HASH JOIN OUTER
Est. Rows: 365.871 Cost: 7.914
(3) TABLE ACCESS FULL BIO.T_KOST_BUCH_WERT_ALT_BIO [Analyzed]
(3) Blocks: 9.064 Est. Rows: 286.165 of 1.729.955 Cost: 1.377
Tablespace: BIO
(4) TABLE ACCESS FULL BIO.T_VERM_VTR_BIO [Analyzed]
(4) Blocks: 39.955 Est. Rows: 173.277 of 1.850.710 Cost: 6.066
Tablespace: BIO
(10) SORT AGGREGATE
Est. Rows: 1
(9) TABLE ACCESS BY INDEX ROWID BIO.S_OE_BIO [Analyzed]
(9) Blocks: 20 Est. Rows: 3 of 1.127 Cost: 4
Tablespace: BIO
(8) UNIQUE INDEX RANGE SCAN BIO.CPK_S_OE [Analyzed]
Est. Rows: 3 Cost: 2
(12) TABLE ACCESS BY INDEX ROWID BIO.S_OE_BIO [Analyzed]
(12) Blocks: 20 Est. Rows: 1 of 1.127 Cost: 4
Tablespace: BIO
(11) UNIQUE INDEX RANGE SCAN BIO.CPK_S_OE [Analyzed]
Est. Rows: 3 Cost: 2
------------------------------------------------------------
This is the execution-plan with the INSERT:
Statement Id=4203132 Type=
Cost=2,64023809101798E-308 TimeStamp=28-02-05::15::25:35
(1) INSERT STATEMENT CHOOSE
Est. Rows: 66.486 Cost: 203.153
LOAD AS SELECT
(23) FILTER
(17) TABLE ACCESS BY INDEX ROWID BIO.S_OE_BIO [Analyzed]
(17) Blocks: 20 Est. Rows: 367 of 1.127 Cost: 3
Tablespace: BIO
(16) NESTED LOOPS
Est. Rows: 66.486 Cost: 203.153
(5) HASH JOIN OUTER
Est. Rows: 65.474 Cost: 6.731
(3) TABLE ACCESS BY INDEX ROWID BIO.T_KOST_BUCH_WERT_ALT_BIO [Analyzed]
(3) Blocks: 15.602 Est. Rows: 65.474 of 2.146.363 Cost: 578
Tablespace: BIO
(2) NON-UNIQUE INDEX RANGE SCAN BIO.X_T_KOST_BUCH_WERT_ALT_2 [Analyzed]
Est. Rows: 65.474 Cost: 101
(4) TABLE ACCESS FULL BIO.T_VERM_VTR_BIO [Analyzed]
(4) Blocks: 40.178 Est. Rows: 177 of 1.858.758 Cost: 6.100
Tablespace: BIO
(15) UNIQUE INDEX RANGE SCAN BIO.CPK_S_OE [Analyzed]
Est. Rows: 3 Cost: 1
(8) SORT UNIQUE
Est. Rows: 1 Cost: 49
(7) TABLE ACCESS BY INDEX ROWID BIO.S_KST_BIO [Analyzed]
(7) Blocks: 187 Est. Rows: 1 of 11.294 Cost: 4
Tablespace: BIO
(6) UNIQUE INDEX RANGE SCAN BIO.CPK_S_KST [Analyzed]
Est. Rows: 2 Cost: 2
(14) NESTED LOOPS
Est. Rows: 1 Cost: 3
(11) VIEW SYS.VW_SQ_1
Est. Rows: 1 Cost: 2
(10) SORT GROUP BY
Est. Rows: 1 Cost: 2
(9) UNIQUE INDEX RANGE SCAN BIO.CPK_S_KST [Analyzed]
Est. Rows: 2 Cost: 2
(13) TABLE ACCESS BY INDEX ROWID BIO.S_KST_BIO [Analyzed]
(13) Blocks: 187 Est. Rows: 1 of 11.294 Cost: 1
Tablespace: BIO
(12) UNIQUE INDEX UNIQUE SCAN BIO.CPK_S_KST [Analyzed]
Est. Rows: 5.647
(20) SORT AGGREGATE
Est. Rows: 1
(19) TABLE ACCESS BY INDEX ROWID BIO.S_OE_BIO [Analyzed]
(19) Blocks: 20 Est. Rows: 3 of 1.127 Cost: 4
Tablespace: BIO
(18) UNIQUE INDEX RANGE SCAN BIO.CPK_S_OE [Analyzed]
Est. Rows: 3 Cost: 2
(22) TABLE ACCESS BY INDEX ROWID BIO.S_OE_BIO [Analyzed]
(22) Blocks: 20 Est. Rows: 1 of 1.127 Cost: 4
Tablespace: BIO
(21) UNIQUE INDEX RANGE SCAN BIO.CPK_S_OE [Analyzed]
Est. Rows: 3 Cost: 2
February 28, 2005 - 9:38 am UTC
interesting -- I would not have expected such wide variations in the cardinalities.
can you use explain plan instead? I'm not entirely familiar with the above output (are the '.'s decimal points or thousands separators -- looks like thousands, right?)
I'm assuming there is a large performance difference between the two?
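A plain explain plan of the CTAS, in the conventional dbms_xplan format, can be produced along these lines (a sketch; substitute the real CREATE TABLE ... AS SELECT text):

```sql
DELETE FROM plan_table;
EXPLAIN PLAN FOR
  CREATE TABLE t_test AS SELECT * FROM dual;  -- the real CTAS goes here
SELECT * FROM TABLE(dbms_xplan.display);
```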
diff between CTAS and insert parallel
Steve, February 28, 2005 - 10:11 am UTC
Tom,
Can you give a simple example for
"
CTAS can do things insert /*+ append */ cannot (especially in PQ operations)"
Thanks!
Steve
February 28, 2005 - 10:46 am UTC
has to do with the number of PQ processes that need to be used; the CTAS can be "parent combined with child" instead of needing pairs of PQ processes. It amounts to extra work with the insert append that the CTAS can skip.
ops$tkyte@ORA9IR2> create table t1 as select * from all_objects where 1=0;
Table created.
ops$tkyte@ORA9IR2> delete from plan_table;
3 rows deleted.
ops$tkyte@ORA9IR2> explain plan for &1;
old 1: explain plan for &1
new 1: explain plan for create table t2 parallel( degree 4) as select /*+ parallel( t1 4) */* from t1
Explained.
ops$tkyte@ORA9IR2> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
-------------------------
--------------------------------------------------------------------------
| Id | Operation |Name|Rows|Bytes|Cost| TQ |INOUT|PQ Distrib|
--------------------------------------------------------------------------
| 0 | CREATE TABLE STATEMENT| | 82|10496| 1 | | | |
| 1 | LOAD AS SELECT | | | | |16,00|P->S |QC (RAND) |
| 2 | TABLE ACCESS FULL |T1 | 82|10496| 1 |16,00|PCWP | |
--------------------------------------------------------------------------
Note: cpu costing is off
10 rows selected.
ops$tkyte@ORA9IR2> create table t2 as select * from all_objects where 1=0;
Table created.
ops$tkyte@ORA9IR2> delete from plan_table;
3 rows deleted.
ops$tkyte@ORA9IR2> explain plan for &1;
old 1: explain plan for &1
new 1: explain plan for insert /*+ append parallel( t2 4) */ into t2 select /*+ parallel( t1 4) */* from t1
Explained.
ops$tkyte@ORA9IR2> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------
-------------------------------------------------------------------------
| Id | Operation |Name|Rows|Bytes|Cost| TQ |INOUT|PQ Distrib|
-------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | 82|10496| 1| | | |
| 1 | LOAD AS SELECT | | | | | | | |
| 2 | TABLE ACCESS FULL |T1 | 82|10496| 1|17,00|P->S |QC (RAND) |
-------------------------------------------------------------------------
Note: cpu costing is off
10 rows selected.
New Analyze, new Plans
Jens, February 28, 2005 - 11:21 am UTC
Sorry Tom, I couldn't use any other explain plan; the output is from SQL Navigator (and yes, '.' = ',' -- it is the thousands separator).
I asked the DBA to do a new analyze for the underlying tables, now we have these different plans:
CTAS (runs in 17 Sec., produces 280.000 rows):
----------------------------------------------
Statement Id=4203132 Type=
Cost=2,64023809101798E-308 TimeStamp=28-02-05::16::49:51
(1) CREATE TABLE STATEMENT CHOOSE
Est. Rows: 1.447 Cost: 6.737
LOAD AS SELECT
(14) FILTER
(8) HASH JOIN
Est. Rows: 1.447 Cost: 6.737
(2) TABLE ACCESS FULL BIO.S_OE_BIO [Analyzed]
(2) Blocks: 20 Est. Rows: 11 of 1.127 Cost: 5
Tablespace: BIO
(7) VIEW (Embedded SQL)
Est. Rows: 65.474 Cost: 6.731
(6) HASH JOIN OUTER
Est. Rows: 65.474 Cost: 6.731
(4) TABLE ACCESS BY INDEX ROWID BIO.T_KOST_BUCH_WERT_ALT_BIO [Analyzed]
(4) Blocks: 15.602 Est. Rows: 65.474 of 2.146.363 Cost: 578
Tablespace: BIO
(3) NON-UNIQUE INDEX RANGE SCAN BIO.X_T_KOST_BUCH_WERT_ALT_2 [Analyzed]
Est. Rows: 65.474 Cost: 101
(5) TABLE ACCESS FULL BIO.T_VERM_VTR_BIO [Analyzed]
(5) Blocks: 40.178 Est. Rows: 177 of 1.858.758 Cost: 6.100
Tablespace: BIO
(11) SORT AGGREGATE
Est. Rows: 1
(10) TABLE ACCESS BY INDEX ROWID BIO.S_OE_BIO [Analyzed]
(10) Blocks: 20 Est. Rows: 1 of 1.127 Cost: 4
Tablespace: BIO
(9) UNIQUE INDEX RANGE SCAN BIO.CPK_S_OE [Analyzed]
Est. Rows: 2 Cost: 2
(13) TABLE ACCESS BY INDEX ROWID BIO.S_OE_BIO [Analyzed]
(13) Blocks: 20 Est. Rows: 1 of 1.127 Cost: 4
Tablespace: BIO
(12) UNIQUE INDEX RANGE SCAN BIO.CPK_S_OE [Analyzed]
Est. Rows: 2 Cost: 2
DIRECT-PATH INSERT (after more than 20 min. it fails with "cannot extend temp segment in temp tablespace"; the temp tablespace is more than 10 GB!):
----------------------------------------------------------
Statement Id=4203132 Type=
Cost=2,64023809101798E-308 TimeStamp=28-02-05::17::02:33
(1) INSERT STATEMENT CHOOSE
Est. Rows: 733.618 Cost: 35.226.859
LOAD AS SELECT
(23) FILTER
(8) HASH JOIN OUTER
Est. Rows: 733.618 Cost: 13.195
(6) MERGE JOIN CARTESIAN
Est. Rows: 733.618 Cost: 6.352
(2) TABLE ACCESS FULL BIO.S_OE_BIO [Analyzed]
(2) Blocks: 20 Est. Rows: 11 of 1.127 Cost: 5
Tablespace: BIO
(5) BUFFER SORT
Est. Rows: 65.474 Cost: 6.347
(4) TABLE ACCESS BY INDEX ROWID BIO.T_KOST_BUCH_WERT_ALT_BIO [Analyzed]
(4) Blocks: 15.602 Est. Rows: 65.474 of 2.146.363 Cost: 577
Tablespace: BIO
(3) NON-UNIQUE INDEX RANGE SCAN BIO.X_T_KOST_BUCH_WERT_ALT_2 [Analyzed]
Est. Rows: 65.474 Cost: 100
(7) TABLE ACCESS FULL BIO.T_VERM_VTR_BIO [Analyzed]
(7) Blocks: 40.178 Est. Rows: 177 of 1.858.758 Cost: 6.100
Tablespace: BIO
(11) SORT UNIQUE
Est. Rows: 1 Cost: 48
(10) TABLE ACCESS BY INDEX ROWID BIO.S_KST_BIO [Analyzed]
(10) Blocks: 202 Est. Rows: 1 of 12.123 Cost: 3
Tablespace: BIO
(9) UNIQUE INDEX RANGE SCAN BIO.CPK_S_KST [Analyzed]
Est. Rows: 1 Cost: 2
(17) NESTED LOOPS
Est. Rows: 1 Cost: 3
(14) VIEW SYS.VW_SQ_1
Est. Rows: 1 Cost: 2
(13) SORT GROUP BY
Est. Rows: 1 Cost: 2
(12) UNIQUE INDEX RANGE SCAN BIO.CPK_S_KST [Analyzed]
Est. Rows: 1 Cost: 2
(16) TABLE ACCESS BY INDEX ROWID BIO.S_KST_BIO [Analyzed]
(16) Blocks: 202 Est. Rows: 1 of 12.123 Cost: 1
Tablespace: BIO
(15) UNIQUE INDEX UNIQUE SCAN BIO.CPK_S_KST [Analyzed]
Est. Rows: 12.123
(20) SORT AGGREGATE
Est. Rows: 1
(19) TABLE ACCESS BY INDEX ROWID BIO.S_OE_BIO [Analyzed]
(19) Blocks: 20 Est. Rows: 1 of 1.127 Cost: 4
Tablespace: BIO
(18) UNIQUE INDEX RANGE SCAN BIO.CPK_S_OE [Analyzed]
Est. Rows: 2 Cost: 2
(22) TABLE ACCESS BY INDEX ROWID BIO.S_OE_BIO [Analyzed]
(22) Blocks: 20 Est. Rows: 1 of 1.127 Cost: 4
Tablespace: BIO
(21) UNIQUE INDEX RANGE SCAN BIO.CPK_S_OE [Analyzed]
Est. Rows: 2 Cost: 2
I think there is something wrong with the optimizer. Unfortunately, I can't invest more time in this, because I'm not responsible for that query (and my access is very restricted). The DBA "helped" the responsible developer by running the select with a RULE hint...
February 28, 2005 - 12:01 pm UTC
if you do get time -- let me know, I'd like to reproduce because it does look fishy.
I would need
a) export of tables with rows=n
b) export of stats using dbms_stats
if you are willing -- I'd be willing to import that and reproduce and see if the root cause is "obvious" or buried deep.
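The statistics part of that could be captured roughly like this (a sketch; MY_STATS is a hypothetical stat-table name):

```sql
-- b) export the optimizer statistics into a regular table that can
--    travel with the export file
BEGIN
   dbms_stats.create_stat_table( ownname => user, stattab => 'MY_STATS' );
   dbms_stats.export_table_stats( ownname => 'BIO',
                                  tabname => 'T_KOST_BUCH_WERT_ALT_BIO',
                                  stattab => 'MY_STATS',
                                  statown => user );
END;
/
-- a) then export the table definitions without data (rows=n), and
--    MY_STATS with its rows, so the stats can be loaded on the other
--    side with dbms_stats.import_table_stats
```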
CTAS to create a partitioned table
A reader, August 21, 2006 - 12:42 pm UTC
Tom,
I wonder if there is any way to create copy of a partitioned table such that the copy also has the identical partitions/sub-partitions?
Thanks!
August 27, 2006 - 9:56 am UTC
ops$tkyte%ORA10GR2> CREATE TABLE t
2 (
3 dt date,
4 x int,
5 y varchar2(30)
6 )
7 PARTITION BY RANGE (dt)
8 (
9 PARTITION part1 VALUES LESS THAN (to_date('13-mar-2003','dd-mon-yyyy')) ,
10 PARTITION part2 VALUES LESS THAN (to_date('14-mar-2003','dd-mon-yyyy')) ,
11 PARTITION junk VALUES LESS THAN (MAXVALUE)
12 )
13 /
Table created.
ops$tkyte%ORA10GR2> insert into t select to_date( '12-mar-2003','dd-mon-yyyy')+mod(rownum,3), user_id, username from all_users;
29 rows created.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> CREATE TABLE new_t
2 (
3 dt ,
4 x ,
5 y
6 )
7 PARTITION BY RANGE (dt)
8 (
9 PARTITION part1 VALUES LESS THAN (to_date('13-mar-2003','dd-mon-yyyy')) ,
10 PARTITION part2 VALUES LESS THAN (to_date('14-mar-2003','dd-mon-yyyy')) ,
11 PARTITION junk VALUES LESS THAN (MAXVALUE)
12 )
13 as
14 select * from t;
Table created.
You can use dbms_metadata.get_ddl to get the DDL and just remove the datatypes...
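For example (a sketch; SET LONG just keeps SQL*Plus from truncating the returned CLOB):

```sql
SET LONG 100000
-- returns the complete CREATE TABLE statement, partition clauses
-- included; edit out the datatypes to turn it into a CTAS
SELECT dbms_metadata.get_ddl( 'TABLE', 'T' ) FROM dual;
```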
CTAS to create a partitioned table
A reader, August 29, 2006 - 4:09 pm UTC
Thanks, Tom.
I even removed the entire column list along with the datatypes (it is easier to remove whole lines than single words), since I need an exact copy anyway:
CREATE TABLE new_t
PARTITION BY RANGE (dt)
(
PARTITION part1 VALUES LESS THAN (to_date('13-mar-2003','dd-mon-yyyy')) ,
PARTITION part2 VALUES LESS THAN (to_date('14-mar-2003','dd-mon-yyyy')) ,
PARTITION junk VALUES LESS THAN (MAXVALUE)
)
as
select * from t;
Another Question
Ashish, April 27, 2007 - 8:57 am UTC
Hi Tom,
We are trying to aggregate data from one table and dump its result into another. Something like:
create table table2 as select ... sum(....) from table1 group by ...
After this, when we check the results, we find that the sum is less than expected (all values in table1 are positive). We suspect that Oracle is somehow failing to pick up some rows and include them in the aggregation.
Table1 has about 200 million rows.
On the other hand, if we run the same query but without the create, i.e.:
select * from (select ... sum(....) from table1 group by ...)
where <some condition to pinpoint the required dataset>
... we get correct results.
Is there a chance that the database/db file is corrupted?
Can you please suggest how we should proceed in solving this problem?
Regards,
Ashish
April 27, 2007 - 11:15 am UTC
please utilize support.
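Before opening the service request, a cross-check along these lines may help pin the discrepancy down (a sketch with hypothetical column names `grp`, `val` and `sum_val`):

```sql
-- the aggregate computed directly against the source...
SELECT grp, SUM(val) FROM table1 GROUP BY grp ORDER BY grp;

-- ...versus what the CTAS actually stored
SELECT grp, sum_val FROM table2 ORDER BY grp;

-- a mismatch here, with no concurrent DML on table1, is exactly
-- what support will want to see
```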
CTAS/Direct Path Load with DBLink
A Mitra, June 30, 2010 - 4:07 pm UTC
Hi Tom,
In connection with the first question in the thread, could you tell me whether CTAS or a parallel direct-path load will be equally effective over a dblink?
In my scenario, I have a big table with multiple partitions (each partition having approx. 150 million records) in one database (SRCDB), from which I want to pull one partition's data to the target database, TGTDB. SRCDB and TGTDB reside on different physical servers connected over a gigabit ethernet network. The source table does not have any index, and I do not have the option to create any index, or any database object for that matter, on the source side. My target table will also be partitioned (with a scheme different from the source side's). I only have a dblink, which I need to use to fetch the data. My options are:
1. COPY
2. CTAS (also need to create the partition in target)
3. INSERT /*+ APPEND */ (After enabling parallel ddl, dml)
Which should be the best option for me and why?
July 06, 2010 - 2:12 pm UTC
the dblink will become a single thread in any of the parallel cases - you can run parallel on the target and the source, but there will be a single pipe from source to target at some point.
create table as select is generally preferred if the table is to be fully populated one time. It can skip a step that insert append would have to take.
The COPY command should probably be avoided for ongoing use, as it is deprecated and might not work in the future (if you have a one-time need, though, it might be the right trick to use).
If this is ongoing, probably insert append: it'll direct-path load the table and manage the index maintenance as efficiently as possible, without having to full scan the target table once per index (that is, load with the indexes enabled to avoid having to scan the table N times after loading).
You don't mention a version, but data pump might also be an option for you to consider, it can do parallel things, direct path things and so on.
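The insert-append option over the link might look like this sketch (the table names, the dblink name `srcdb_link` and the partition predicate are all hypothetical):

```sql
ALTER SESSION ENABLE PARALLEL DML;

INSERT /*+ APPEND */ INTO tgt_table
SELECT *
  FROM src_table@srcdb_link
 WHERE part_date >= DATE '2010-06-01'   -- limits the pull to one
   AND part_date <  DATE '2010-07-01';  -- source partition's data
COMMIT;
```

Parallelism on either end will not widen the dblink itself; there is still a single pipe between the two servers.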
CTAS or insert-select
Daniel, January 16, 2013 - 2:18 am UTC
Hi Tom.
I have a large partitioned table with 13 million records.
I have to create another one, applying a function to a LOB field.
If I use CTAS, it takes much longer than the insert-select process.
Can you explain why?
Is it a parallel query issue?
Thanks
This is a test case with 10,000 rows:
(Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 )
SQL> create table dd as select C_REFREC, st_centroid(shape) shape from T1 where 1=2;
Table created.
Elapsed: 00:00:00.13
SQL> sta tr
SQL> insert into dd select C_REFREC, st_centroid(shape) shape from T1 where st_area(shape) >0 and rownum <10000;
9999 rows created.
Elapsed: 00:00:10.69
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | TQ |IN-OUT| PQ Distrib |
---------------------------------------------------------------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | 9999 | 4130K| 2090 (14)| 00:00:11 | | | | | |
|* 1 | COUNT STOPKEY | | | | | | | | | | |
| 2 | PX COORDINATOR FORCED SERIAL| | | | | | | | | | |
| 3 | PX SEND QC (RANDOM) | :TQ10000 | 666K| 268M| 2090 (14)| 00:00:11 | | | Q1,00 | P->S | QC (RAND) |
|* 4 | COUNT STOPKEY | | | | | | | | Q1,00 | PCWC | |
| 5 | PX BLOCK ITERATOR | | 666K| 268M| 2090 (14)| 00:00:11 | 1 | 9 | Q1,00 | PCWC | |
|* 6 | TABLE ACCESS FULL | RECFE12_120131 | 666K| 268M| 2090 (14)| 00:00:11 | 1 | 9 | Q1,00 | PCWP | |
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<10000)
4 - filter(ROWNUM<10000)
6 - filter("SDE"."ST_AREA"("SHAPE")>0)
Statistics
----------------------------------------------------------
41271 recursive calls
272821 db block gets
31218 consistent gets
347 physical reads
4505756 redo size
671 bytes sent via SQL*Net to client
661 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
9999 rows processed
SQL> create table dd as select C_REFREC, st_centroid(shape) shape from T1 where st_area(shape) >0 and rownum <10000;
Table created.
Elapsed: 00:02:17.28
January 16, 2013 - 11:57 am UTC
trace both, use tkprof
I can say you are comparing apples to flying toaster ovens. Your insert is a conventional path load - using the buffer cache, your CTAS is a direct path load.
I would suspect, however, that the plans are different.
why test on 10,000 rows when you really want to do millions? You cannot really compare these two at 10,000 rows and expect that to scale up.
It isn't going to be fast in any case: invoking a user-defined function against a LOB 13 million times. Are you sure you need to do that? What does st_area do, exactly?
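Tracing the two runs separately could look like this sketch (the tracefile_identifier just tags the trace file names so the two runs are easy to tell apart):

```sql
ALTER SESSION SET tracefile_identifier = 'ctas_run';
ALTER SESSION SET events '10046 trace name context forever, level 8';
CREATE TABLE dd_ctas AS
   SELECT c_refrec, st_centroid(shape) shape
     FROM t1
    WHERE st_area(shape) > 0 AND rownum < 10000;
ALTER SESSION SET events '10046 trace name context off';
-- repeat with tracefile_identifier = 'insert_run' for the insert,
-- then format each file:  tkprof <trace file> <out file> sys=no
```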
ctas or insert-select
Daniel, January 17, 2013 - 1:52 am UTC
I tested the case with the actual table. First I launched the CTAS, and it stopped 26 hours later; I could not believe it took so long. The insert-select took 4 hours.
I traced the session, but
"tkprof sys=no explain=user/passw"
doesn't show the execution plan for the CTAS.
I think the execution plan must be a full scan.
The LOB field stores geographic data. I need to create a new table based on the first one, simplifying the polygonal data into points so that graphics can be drawn more efficiently (when the data is viewed from far enough away, the polygons appear to be points, so to avoid querying a huge amount of data I want to convert the polygons to points). This is what ST_Centroid does; to achieve it, it calls an external library.
ST_Area returns the area of the polygon. I also need to filter out some empty polygons (very few).
The shape (polygon data) is stored in the field as follows:
SQL> desc SDE.ST_GEOMETRY
SDE.ST_GEOMETRY is NOT FINAL
Name Null? Type
----------------------------------------- -------- ----------------------------
ENTITY NUMBER(38)
NUMPTS NUMBER(38)
MINX FLOAT(64)
MINY FLOAT(64)
MAXX FLOAT(64)
MAXY FLOAT(64)
MINZ FLOAT(64)
MAXZ FLOAT(64)
MINM FLOAT(64)
MAXM FLOAT(64)
AREA FLOAT(64) <- this is returned by st_area
LEN FLOAT(64)
SRID NUMBER(38)
POINTS BLOB
METHOD
------
FINAL CONSTRUCTOR FUNCTION ST_GEOMETRY RETURNS SELF AS RESULT
Argument Name Type In/Out Default?
------------------------------ ----------------------- ------ --------
GEOM_STR CLOB IN
SRID NUMBER IN
METHOD
------
MEMBER FUNCTION ST_AREA RETURNS NUMBER
METHOD
------
MEMBER FUNCTION ST_LEN RETURNS NUMBER
METHOD
------
MEMBER FUNCTION ST_LENGTH RETURNS NUMBER
METHOD
------
MEMBER FUNCTION ST_ENTITY RETURNS NUMBER
METHOD
------
MEMBER FUNCTION ST_NUMPTS RETURNS NUMBER
METHOD
------
MEMBER FUNCTION ST_MINX RETURNS NUMBER
METHOD
------
MEMBER FUNCTION ST_MAXX RETURNS NUMBER
METHOD
------
MEMBER FUNCTION ST_MINY RETURNS NUMBER
METHOD
------
MEMBER FUNCTION ST_MAXY RETURNS NUMBER
METHOD
------
MEMBER FUNCTION ST_MINM RETURNS NUMBER
METHOD
------
MEMBER FUNCTION ST_MAXM RETURNS NUMBER
METHOD
------
MEMBER FUNCTION ST_MINZ RETURNS NUMBER
METHOD
------
MEMBER FUNCTION ST_MAXZ RETURNS NUMBER
METHOD
------
MEMBER FUNCTION ST_SRID RETURNS NUMBER
METHOD
------
STATIC FUNCTION GET_RELEASE RETURNS NUMBER
January 17, 2013 - 8:36 am UTC
explain plan for create .....;
select * from table(dbms_xplan.display)
will show you the plan for the CTAS.
In 11g, it would just be in the trace file, you wouldn't have to do anything special to see it.
ctas or insert-select
Daniel, January 17, 2013 - 9:42 am UTC
I think that the execution plan is correct. Full scan and filter by st_area:
------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | TQ |IN-OUT| PQ Distrib |
------------------------------------------------------------
| 0 | CREATE TABLE STATEMENT | | 666K| 268M| 2206 (13)| 00:00:11 | | | | | |
| 1 | LOAD AS SELECT | TMP | | | | | | | | | |
| 2 | PX COORDINATOR FORCED SERIAL| | | | | | | | | | |
| 3 | PX SEND QC (RANDOM) | :TQ10000 | 666K| 268M| 2090 (14)| 00:00:11 | | | Q1,00 | P->S | QC (RAND) |
| 4 | PX BLOCK ITERATOR | | 666K| 268M| 2090 (14)| 00:00:11 | 1 | 9 | Q1,00 | PCWC | |
|* 5 | TABLE ACCESS FULL | RECFE12_120131 | 666K| 268M| 2090 (14)| 00:00:11 | 1 | 9 | Q1,00 | PCWP | |
PLAN_TABLE_OUTPUT
------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - filter("SDE"."ST_AREA"("SHAPE")>0)
The table isn't small:
select sum(bytes)/1024/1024
from user_segments
where segment_name='RECFE12_120131';
SUM(BYTES)/1024/1024
--------------------
4338
But 24 hours seems to be too much...
And the insert select only takes 4 hours. I can’t understand it.