Underlying mechanism of Exchanging the partitions

Tom Kyte

Thanks for the question, Ravi.

Asked: January 01, 2001 - 11:22 pm UTC

Last updated: July 02, 2013 - 5:06 pm UTC

Version: 815

Viewed 10K+ times

You Asked

Dear Tom,

I have a partitioned table with 9 partitions. Each partition is about 10 GB in size. I also have a set of 9 conversion tables (non-partitioned tables), one for each partition of the partitioned table. The non-partitioned tables and the partitioned table are identical, except that the non-partitioned (conversion) tables have extra columns.

Each table also has 4 indexes defined on it; the indexes on the partitioned table are all local, and no non-partitioned/global indexes are defined. Now, I am tasked with exchanging the partitions with the non-partitioned tables, along with the indexes.

I have tested the same at small sizes, but I have not worked with such huge sizes, so I would like to understand the underlying mechanism of exchanging partitions.

When an object is renamed in Oracle, it does not take long. Similarly, when I exchange a partition with a non-partitioned table, will it be as fast as renaming an object, or does Oracle have to rebuild all the segments, which would take time?

If it is going to work just like renaming, then what is the difference between:

the partition and the corresponding non-partitioned table being in different tablespaces, or

the partition and the corresponding non-partitioned table being in the same tablespace?

Could you please give an idea of how much time it might take for one partition?

Each partition is 10G in size and has 4 indexes of 1G each in size.

Could you please suggest any other method? Your suggestions are always highly valuable.

Thanks
Ravi





and Tom said...

If you are planning on exchanging the table with the extra columns with a partition, be prepared for:

*
ERROR at line 3:
ORA-14096: tables in ALTER TABLE EXCHANGE PARTITION must have the same number of columns

You cannot do that. You'll have to add the columns to the partitioned table first.
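
For example, a minimal sketch of that preparation step (table and column names are illustrative):

alter table part_table add ( extra_col1 number, extra_col2 varchar2(30) );

-- the column lists now match, so the exchange can proceed
alter table part_table exchange partition p1 with table conversion_table;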

A partition exchange is simply an exchange of segments -- no movement of data takes place. It is a data dictionary update. For example:

tkyte@TKYTE816> CREATE TABLE new_table2000
2 ( x int,
3 y int,
4 z DATE
5 )
6 PARTITION BY RANGE (z)
7 (
8 PARTITION newtab_1999_h1 VALUES LESS
9 THAN(to_date('30-jun-1999','dd-mon-yyyy')),
10 PARTITION newtab_1999_h2 VALUES LESS
11 THAN(to_date('31-dec-1999','dd-mon-yyyy'))
12 )
13 /

Table created.

tkyte@TKYTE816>
tkyte@TKYTE816> insert into NEW_table2000 values ( -1, -1, '15-jun-1999' );

1 row created.

tkyte@TKYTE816> insert into NEW_table2000 values ( -2, -2, '15-dec-1999' );

1 row created.

tkyte@TKYTE816>
tkyte@TKYTE816>
tkyte@TKYTE816> create table temp ( x int, y int, z date ) tablespace tools;

Table created.

tkyte@TKYTE816>
tkyte@TKYTE816> select segment_name, partition_name,
2 tablespace_name, extent_id
3 from user_extents
4 where segment_name in ( 'NEW_TABLE2000', 'TEMP' );

SEGMENT_NAME PARTITION_NAME TABLE EXTENT_ID
--------------- --------------- ----- ----------
TEMP TOOLS 0
NEW_TABLE2000 NEWTAB_1999_H2 USERS 0
NEW_TABLE2000 NEWTAB_1999_H1 USERS 0

tkyte@TKYTE816>
tkyte@TKYTE816> alter table new_table2000
2 exchange partition newtab_1999_h1
3 with table temp;

Table altered.

tkyte@TKYTE816>
tkyte@TKYTE816> select segment_name, partition_name,
2 tablespace_name, extent_id
3 from user_extents
4 where segment_name in ( 'NEW_TABLE2000', 'TEMP' );

SEGMENT_NAME PARTITION_NAME TABLE EXTENT_ID
--------------- --------------- ----- ----------
TEMP USERS 0
NEW_TABLE2000 NEWTAB_1999_H2 USERS 0
NEW_TABLE2000 NEWTAB_1999_H1 TOOLS 0


shows that the extents from TEMP and NEW_TABLE2000 (in different tablespaces) were simply "swapped".

The time to do 1 partition will be a function of the number of extents... You will need to do it when no one is performing DML or queries on the table (cursors must be invalidated and such)


Comments

Transportable Tablespaces, Partition Exchange, and Data Warehouse

Lisa, May 04, 2004 - 11:55 am UTC

Tom,

I am working on a strategy to accomplish incremental additions to a data warehouse. I have the following scenario:

Every 2 weeks data feeds will be loaded into a staging (STG) database where the data is transformed/scrubbed to be inserted into the data warehouse (DW). Each load gets a load key, which ultimately goes into the DW. The DW target tables will be partitioned by this load key.

Therefore, my thought process is as follows:

1) Have the ETL process on the STG database load into a set of skeleton DW tables having the same structure as the DW target tables (but no partitioning). Assume tablespace STG_TAB

2) Build similar index structures on the skeleton STG tables as the DW tables. Assume tablespace STG_IND

3) Export transport_tablespace=y the STG_TAB and STG_IND tablespaces and Import these tablespaces into the DW database.

4) Assume that the DW tables are partitioned by the load key. Add a new partition for the new load key data to be added. Tablespace DW_TAB

5) DW indexes are LOCAL, tablespace DW_IND

6) Simply add the new load key data by doing a partition switch, swapping the tables/indexes in the transported tablespaces with the new partitions. This automatically converts the STG skeleton table and index into the new partitions of the DW table and index.

My problem is that when I run my test case, the new partitions remain in the transported STG_TAB and STG_IND tablespaces, instead of being moved to DW_TAB and DW_IND. The Oracle manual states "All of the segment attributes of the two objects (including tablespace and logging) are also exchanged." This does not appear to be happening. You state in this thread that nothing is actually moved, just metadata pointers are switched, which is consistent with what I see.

Am I required to then move the new exchanged partitions into the DW_TAB and DW_IND tablespaces manually after performing the partition switch?

Also, do you see a benefit to having each partition in the DW tables stored in a separate tablespace anyway?

I am new to data warehousing, please advise!

Tom Kyte
May 04, 2004 - 1:58 pm UTC

all segment attributes are exchanged

if the partition was in tablespace "A" with logging mode "X"
and the table was in tablespace "B" with logging mode "Y"

when you are done, the "reverse" will be true -- the attributes will have "switched"

the only way to get the table (now a partition) into the "other tablespace" would be to physically MOVE it there.

why do the partitions have to be in that other tablespace? would it not be more efficient to just leave them where they are?

Tablespaces should be used for purely administrative purposes. Here, they seem to fit the bill nicely.
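
If the partitions really must end up in the other tablespace, a minimal sketch of the physical move (partition, index and tablespace names are illustrative):

alter table dw_table move partition p_load1 tablespace dw_tab;

-- the move marks the local index partition unusable, so rebuild it
alter index dw_table_ix1 rebuild partition p_load1 tablespace dw_ind;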


Complicated by different block sizes

Lisa, May 05, 2004 - 8:43 am UTC

Tom, thanks for your response.

Thinking further, my example gets more complicated due to differences between the STG and DW databases.

The STG database will experience heavy insert activity as a result of the ETL loads; therefore I am thinking that a small "OLTP-like" block size (2k-4K) is appropriate.

However, the data warehouse will have a 32k block size to accommodate heavy scan/read activity.

Therefore, it seems that I will have to "move" the table to a new tablespace anyway once the tablespace is imported to the DW before doing the partition switch.

Unless you know of a more elegant way to change the block size of a table?

Even more, I am concerned about the partitioning strategy. Partitioning by load key allows us to add new data to the warehouse by utilizing the TTS and partition switch. This partition strategy may also be useful for backup and recovery. I am initially thinking about keeping the latest load key partitions in a separate tablespace from the past load partitions. Previous load key partitions can be backed up as a group in a single tablespace (or separately in separate tablespaces, but what a pain to recover), but the last load partition will not be backed up since the TTS process could be repeated.

My concern is that the concept of a load key is purely internal; therefore I never expect queries to be run against the DW based upon a single load key. Therefore, partition elimination is unlikely to occur. Could sub-partitions within the load_key partitions potentially help, and how hard would it be to change this later if not implemented initially?

Also, since the size of the loads will vary based upon the dynamic nature of the ETL feeds, the load key partitions will not be uniformly balanced in size. Is this a problem?

Regards,

Lisa


Tom Kyte
May 05, 2004 - 9:15 am UTC

the only way to change the block size will be a rebuild of the object -- no magic we can do there.

the queries you run against this data -- would they *ever* benefit from partition elimination? I mean do they always/frequently use something in their predicates that you could partition by.

You'll partition to achieve:

o ease of admin
o higher availability
o performance

you need to prioritize your goal -- decide which of the above is *the most important* -- and then partition to suit that (with as little negative impact on the others as possible). It is rare that you would achieve all three simultaneously.

non-equisized partitions are OK.

Histograms lost?

A reader, November 10, 2004 - 2:58 pm UTC

Oracle 9.2.0.4

Histograms don't seem to be carried over to the partitioned table.

create table t(i int,j int,k int) partition by range(i) (partition p1 values less than (10),partition p2 values less than (20));

create table t2(i int,j int,k int);

insert into t2 select rownum i,rownum j,rownum k from all_objects where rownum<10;

exec dbms_stats.gather_table_stats(user,'t2',method_opt=>'for all columns');

SQL> select count(*) from user_tab_histograms where table_name='T2';

  COUNT(*)
----------
        27

SQL> alter table t exchange partition p1 with table t2;

Table altered.

SQL> select count(*) from t;

  COUNT(*)
----------
         9

SQL> select count(*) from t2;

  COUNT(*)
----------
         0

SQL> select count(*) from user_tab_histograms where table_name='T2';

  COUNT(*)
----------
         0

SQL> select count(*) from user_tab_histograms where table_name='T'; 

  COUNT(*)
----------
         0

SQL> select last_analyzed from user_tables where last_analyzed is not null;

no rows selected

1. The histograms are not carried over.
2. Why doesn't last_analyzed show that the table was analyzed?

Is this a bug or am I missing something? Thanks 

Tom Kyte
November 10, 2004 - 8:17 pm UTC

no bug, you haven't any GLOBAL stats for the partitioned table, you only have LOCAL stats at the partition level.


ops$tkyte@ORA9IR2> l
  1  select partition_name, column_name, count(*) from user_PART_histograms where table_name = 'T'
  2* group by partition_name, column_name
ops$tkyte@ORA9IR2> /
 
PARTITION_NAME                 COLUMN_NAM   COUNT(*)
------------------------------ ---------- ----------
P1                             I                   9
P1                             J                   9
P1                             K                   9
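
If global statistics (including table-level histograms) are wanted after the exchange, they have to be gathered; a minimal sketch:

exec dbms_stats.gather_table_stats( user, 'T', granularity => 'GLOBAL' );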
 

A reader, November 11, 2004 - 9:33 am UTC

OK, then how are "global queries" -- of the form select count(*) from table, or select ... from table where the predicate does not involve partitioning columns -- evaluated? Would they do an FTS, i.e. a full scan of all partitions?

Specifically, if I do

select count(*) from table where part_key='A'; it would do an index FFS of the local index partition for 'A'.

But if I do

select count(*) from table; would it do an index FFS on the N index partitions, or would it do a full table scan?

In general, if I always intend to access only one partition at a time, I don't need global indexes, right?

Thanks

Tom Kyte
November 11, 2004 - 10:32 am UTC

not sure where you are going with the first paragraph.  it sounds like you are trying to pick up the conversation -- but not sure where it left off really?!? 

the answer to 

"Would they do a FTS i.e. full partition scan of all partitions?"

is -- well, it totally depends.  select count(*) would be looking for an index on a non-nullable column to fast full scan.  the select from predicate would be looking at everything available and would use whatever plan made sense (insufficient data to comment on what type of plan it might use, could be an index range scan of any column for all I know)



even the specific question doesn't provide sufficient data -- it could look for a non-partitioned index, it could use global indexes, it could use a local index.  but it'll be able to do any of them if it likes

ops$tkyte@ORA9IR2> CREATE TABLE t
  2  (
  3    temp_date  date,
  4    x          int,
  5    y int,
  6    constraint t_pk primary key(temp_date) using index LOCAL
  7  )
  8  PARTITION BY RANGE (temp_date)
  9  (
 10    PARTITION part1 VALUES LESS THAN (to_date('13-mar-2003','dd-mon-yyyy')) tablespace users,
 11    PARTITION part2 VALUES LESS THAN (to_date('14-mar-2003','dd-mon-yyyy')) tablespace tools,
 12    PARTITION junk VALUES LESS THAN (MAXVALUE)
 13  )
 14  /
 
Table created.
 
ops$tkyte@ORA9IR2> exec dbms_stats.set_table_stats( user, 'T', numrows=>1000000, numblks=>10000 );
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> set autotrace traceonly explain
ops$tkyte@ORA9IR2> select count(*) from t;
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=4 Card=1)
   1    0   SORT (AGGREGATE)
   2    1     PARTITION RANGE (ALL)
   3    2       INDEX (FAST FULL SCAN) OF 'T_PK' (UNIQUE) (Cost=4 Card=1000000)
 
 
 
ops$tkyte@ORA9IR2> set autotrace off


If you always access a SINGLE partition and that is clear in the predicate (eg: that constraint is ALWAYS in the predicate), then you may well not need global indexes.
 

Time to exchange a partition

A reader, November 11, 2004 - 3:42 pm UTC

As my partitioned table gets more and more partitions, I find that it takes longer and longer to do the 'exchange partition'.

This is in spite of doing 'without validation'.

How long should the 'exchange partition' statement take? Would the size of the tables involved have anything to do with this? What about the number of partitions in my partitioned table?

Thanks

Tom Kyte
November 11, 2004 - 4:25 pm UTC

trace it, see what it is doing.

might be a constraint/unique/something getting checked - show us an example (pare it down to as small as possible and trace it)
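
A minimal sketch of tracing the operation (run from the session doing the exchange; names are illustrative):

alter session set events '10046 trace name context forever, level 12';
alter table t exchange partition p1 with table temp without validation;
alter session set events '10046 trace name context off';
-- then run tkprof on the resulting trace file and inspect the recursive SQL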

Constraints

A reader, November 12, 2004 - 8:00 am UTC

Yes, the partitioned table does have unique constraints (the PK is local though).

But still, the time to exchange one partition shouldn't increase as I add more partitions? All these are partition-local operations; I don't have any global indexes on the table.

Come to think of it, this is for DW-type archival; the source system already has PK/FK/UK constraints, so I don't really need them here.

What if I keep all the indexes but drop the constraints, would that change the query plans?

Thanks

Tom Kyte
November 12, 2004 - 8:08 am UTC

one word for you:

trace

*find out what it is doing*, until then -- we are guessing. once we KNOW we can suggest. until we know, we can do nothing.

Suggestion

Jonathan Lewis, November 12, 2004 - 5:28 pm UTC

This is an article of mine that may answer the original question.

http://www.dbazine.com/jlewis17.shtml

To JPL

A reader, November 15, 2004 - 9:13 am UTC

Thanks, Jonathan. I didn't mention it, but I had already read your article before posting my question. Specifically, I read your "stop press" section, which said that the unique constraint check bug had been fixed in 9.2.0.4 (the release I am on)?

Anyway, as Tom suggested, I will set up a test case with tracing and see what is going on and post my results.

Thanks

Partition Exchange While Querying Data

Jasbir Kular, November 15, 2004 - 6:28 pm UTC

Hi Tom,

You had said...
<quote>
The time to do 1 partition will be a function of the number of extents... You
will need to do it when no one is performing DML or queries on the table
(cursors must be invalidated and such)
</quote>

I tested partition exchange while the source table and target table were being queried and it worked fine, no errors. I am using version 9.2.0.5. Did your statement on no queries only apply to version 8.1.5?

Thanks.


Tom Kyte
November 15, 2004 - 9:23 pm UTC

queried will "generally" work -- but the queries will continue against the "old partition that is now a table" -- that is, the queries won't flip over to the new partition, they will complete against the old partition (which is now a table, and could be dropped and overwritten resulting in invalid rowid....)

How to handle this scenario?

A reader, December 14, 2004 - 2:19 am UTC

Hi Tom,
Okay, exchange partition is a dictionary update. But I have a scenario where we create staging tables in a different tablespace than the target tables (which are partitioned, with each partition in a different tablespace for ease of maintenance -- transportable tablespaces, historical partitions in read only, etc.). Each month, we load data into the staging area and just want to exchange the staging-area partition with the target's current partition in the target tablespace. Do you think that once the exchange partition is successful we then need to move the partition segment into the original tablespace, or is there a better way?
PS: DB version 9R2.
Thanks

Tom Kyte
December 14, 2004 - 8:58 am UTC

why not make the target partition the staging area? since it is new data, you are just loading it.

So, you don't have a single "stage", but rather you create stage in the tablespace you want the data to ultimately be in, load it up, do whatever to do and then exchange.

Else, you would have to "move" all of the data after you process it.

TBS movement

A reader, December 14, 2004 - 2:54 pm UTC

Hi Tom,
The staging area is there to clean the data with RI. Once the data is clean we just want to load it into the target table(s), and we thought exchange partition (without validation) would be the best option. Since each month's load for the target goes into a different partition, we may now have to find some way to build the staging table in the tablespace we expect that target table partition to use.
Thanks

Tom Kyte
December 15, 2004 - 1:00 pm UTC

you are missing my point


create "staging area" in the target tablespace (where you ultimately want the data to reside)

scrub it.

alter exchange -- now that stage table becomes the newly added partition and is in the "right place"
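
A minimal sketch of that flow (table, partition and tablespace names are illustrative):

create table stage ( x int, y date ) tablespace dw_tab_jan;

-- load and scrub STAGE here...

alter table target add partition p_jan
  values less than (to_date('01-feb-2005','dd-mon-yyyy')) tablespace dw_tab_jan;

alter table target exchange partition p_jan with table stage;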




Clarification on cursor invalidation and partition exchange

Jeff White, October 13, 2005 - 5:16 pm UTC

In your reply to the initial question you stated...

<quote>
The time to do 1 partition will be a function of the number of extents... You
will need to do it when no one is performing DML or queries on the table
(cursors must be invalidated and such)
</quote>

A little later you stated in another reply...
<quote>
queried will "generally" work -- but the queries will continue against the "old
partition that is now a table" -- that is, the queries won't flip over to the
new partition, they will complete against the old partition (which is now a
table, and could be dropped and overwritten resulting in invalid rowid....)
</quote>

You put "generally" in quotes, which leads me to believe there are situations where queries will fail. I am looking into doing a partition exchange against a table that is actively being queried, but do not want to create problems for the users. Are there cases, outside of the old partition being dropped, where queries would fail? I am not worried about causing hard parses after the exchange, as this is an OLAP environment.

Thanks.

Tom Kyte
October 13, 2005 - 9:10 pm UTC

....and could be dropped and overwritten resulting in invalid rowid....



I believe that is it, if the old data sticks around, the query should be able to run to completion (assuming no 1555's of course..)

Concurrent partition exchange

Rob, December 19, 2005 - 3:52 pm UTC

Tom:

I have a partitioned table and two separate threads that will each load a different scratch table. After loading its scratch table, each thread will attempt to exchange its scratch table with a (different) partition in the partitioned table. Can two partition exchanges occur concurrently on the same table, or will there be an ORA-54 error?

Rob

Tom Kyte
December 19, 2005 - 4:20 pm UTC

they will serialize; you might not get an ora-54, but it could "time out" ultimately.

Murali, March 31, 2006 - 5:15 am UTC

Tom:

Doubts about exchanging a hash-partitioned table with a range-hash partition.

I was working with this example...

CREATE TABLE t1 (i NUMBER, j NUMBER)
PARTITION BY HASH(i)
(PARTITION p1, PARTITION p2);

CREATE TABLE t2 (i NUMBER, j NUMBER)
PARTITION BY RANGE(j)
SUBPARTITION BY HASH(i)
(
PARTITION p1 VALUES LESS THAN (10)
(SUBPARTITION t2_pls1,
SUBPARTITION t2_pls2)
,
PARTITION p2 VALUES LESS THAN (20)
(SUBPARTITION t2_p2s1,
SUBPARTITION t2_p2s2)
);

When I run this COMMAND --
ALTER TABLE t1 EXCHANGE PARTITION p1 WITH TABLE t2 WITH VALIDATION;

I am getting this ERROR MESSAGE...

ERROR at line 1:
ORA-14095: ALTER TABLE EXCHANGE requires a non-partitioned, non-clustered table

Can you give me the cause of this problem?

This is an example given in the "Oracle9i Database Administrator's Guide", page 567.

Tom Kyte
March 31, 2006 - 12:17 pm UTC

you have to turn the individual partitions of T2 into tables (exchange them with a table) and then use the table itself - you cannot do this "en masse"


in fact, you cannot turn the partition p1 into a partition with subpartitions, it doesn't work that way.

Murali, March 31, 2006 - 1:36 pm UTC

Tom:
Thanks for your quick response.

So this is an Oracle limitation, i.e. you can't exchange partitions between two partitioned tables directly.

Can you give one good example of how to exchange partition data between partitioned tables?

If I use insert /*+ append */ into tab1 select * from tab2 to move 40 million rows, it seems to take a long time.


Tom Kyte
March 31, 2006 - 1:42 pm UTC

you are changing the physical structure here - this goes beyond a partition to partition exchange.

But yes, you would have to make a partition a table, a table the partition (which makes the other partition a table) and then that table a partition again.

table1.partition1 exchanged with TEMP_TABLE (table1 now has empty partition, temp_table is full of the stuff that was in table1)

TEMP_TABLE exchanged with table2.partition2 (temp_table now has data from table2.partition2. table2.partition2 now has data from table1.partition1)

TEMP_TABLE exchanged with table1.partition1 (to go full circle)

Murali, April 02, 2006 - 8:36 am UTC

Tom:

Thanks for your suggestion.

Can you give one example of exchanging partitions between partitioned tables?






Tom Kyte
April 02, 2006 - 11:28 am UTC

ops$tkyte@ORA10GR2> CREATE TABLE t1
  2  (
  3    dt  date,
  4    x   int,
  5    y   varchar2(25)
  6  )
  7  PARTITION BY RANGE (dt)
  8  (
  9    PARTITION part1 VALUES LESS THAN (to_date('13-mar-2003','dd-mon-yyyy')) ,
 10    PARTITION part2 VALUES LESS THAN (to_date('14-mar-2003','dd-mon-yyyy')) ,
 11    PARTITION junk VALUES LESS THAN (MAXVALUE)
 12  )
 13  /

Table created.

ops$tkyte@ORA10GR2> CREATE TABLE t2
  2  (
  3    dt  date,
  4    x   int,
  5    y   varchar2(25)
  6  )
  7  PARTITION BY RANGE (dt)
  8  (
  9    PARTITION part1 VALUES LESS THAN (to_date('13-mar-2003','dd-mon-yyyy')) ,
 10    PARTITION part2 VALUES LESS THAN (to_date('14-mar-2003','dd-mon-yyyy')) ,
 11    PARTITION junk VALUES LESS THAN (MAXVALUE)
 12  )
 13  /

Table created.

ops$tkyte@ORA10GR2> CREATE TABLE t3
  2  (
  3    dt  date,
  4    x   int,
  5    y   varchar2(25)
  6  )
  7  /

Table created.

ops$tkyte@ORA10GR2> insert into t1 select to_date('12-mar-2003','dd-mon-yyyy')+mod(rownum,4), rownum, 'T1' from all_users;

34 rows created.

ops$tkyte@ORA10GR2> insert into t2 select to_date('12-mar-2003','dd-mon-yyyy')+mod(rownum,4), rownum, 'T2' from all_users;

34 rows created.

ops$tkyte@ORA10GR2> select y, count(*) from t1 group by y;

Y                           COUNT(*)
------------------------- ----------
T1                                34

ops$tkyte@ORA10GR2> select y, count(*) from t2 group by y;

Y                           COUNT(*)
------------------------- ----------
T2                                34

ops$tkyte@ORA10GR2> alter table t1 exchange partition part1 with table t3;

Table altered.

ops$tkyte@ORA10GR2> alter table t2 exchange partition part1 with table t3;

Table altered.

ops$tkyte@ORA10GR2> alter table t1 exchange partition part1 with table t3;

Table altered.

ops$tkyte@ORA10GR2> select y, count(*) from t1 group by y;

Y                           COUNT(*)
------------------------- ----------
T2                                 8
T1                                26

ops$tkyte@ORA10GR2> select y, count(*) from t2 group by y;

Y                           COUNT(*)
------------------------- ----------
T2                                26
T1                                 8

 

Murali, April 02, 2006 - 11:11 pm UTC

Tom:

You are really great.

An easy and fantastic example to understand.

Thanks

Murali, April 05, 2006 - 2:08 am UTC

Tom:

When I exchange a partition between a partitioned table and a non-partitioned table, why do all my partitioned table's indexes become unusable?

Is it mandatory to rebuild all my indexes if we exchange a partition?



Tom Kyte
April 05, 2006 - 5:43 pm UTC

they do not all become unusable.

a global index might - is that what you mean?


Murali, April 06, 2006 - 9:57 am UTC

<tom say>
Followup:
they do not all become unusable.

a global index might - is that what you mean?
</tom say>

Yes, my indexes on the table are GLOBAL indexes.

My table has 20 million records and 8 partitions, and all of my indexes are global, i.e. the indexes are not partitioned.

How do I avoid the index unusable state if my indexes are global? Rebuilding an index takes almost 24 hours.

If I partition the indexes the same way as the table, can I avoid the unusable state? Which do you recommend?





Tom Kyte
April 07, 2006 - 3:45 pm UTC

You can have global indexes maintained during the exchange.

Yes - the exchange will obviously take longer as we are maintaining the index.

But - yes, there is no downtime as far as the end users querying the data are concerned since the index doesn't go invalid (so who cares if it takes longer... it is not unavailable).

for example:

ops$tkyte@ORA10GR2> CREATE TABLE t
  2  (
  3    dt  date,
  4    x   int,
  5    y   varchar2(25)
  6  )
  7  PARTITION BY RANGE (dt)
  8  (
  9    PARTITION part1 VALUES LESS THAN (to_date('13-mar-2003','dd-mon-yyyy')) ,
 10    PARTITION part2 VALUES LESS THAN (to_date('14-mar-2003','dd-mon-yyyy')) ,
 11    PARTITION junk VALUES LESS THAN (MAXVALUE)
 12  )
 13  /
 
Table created.
 
ops$tkyte@ORA10GR2> create index t_idx on t(x) global;
 
Index created.
 
ops$tkyte@ORA10GR2> insert into t select to_date( '12-mar-2003' )+mod(rownum,3), rownum, rownum from all_users;
 
42 rows created.
 
ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> CREATE TABLE t2
  2  (
  3    dt  date,
  4    x   int,
  5    y   varchar2(25)
  6  )
  7  /
 
Table created.
 
ops$tkyte@ORA10GR2> insert into t2 select to_date( '12-mar-2003' ), rownum, rownum from all_users;
 
42 rows created.
 
ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> select status from user_indexes where index_name = 'T_IDX';
 
STATUS
--------
VALID
 
ops$tkyte@ORA10GR2> alter table t exchange partition part1 with table t2;
 
Table altered.
 
ops$tkyte@ORA10GR2> select status from user_indexes where index_name = 'T_IDX';
 
STATUS
--------
UNUSABLE
 
<b>that is what you have.... But</b>

ops$tkyte@ORA10GR2> alter index t_idx rebuild;
 
Index altered.
 
ops$tkyte@ORA10GR2> select status from user_indexes where index_name = 'T_IDX';
 
STATUS
--------
VALID
 
ops$tkyte@ORA10GR2> alter table t exchange partition part1 with table t2 UPDATE GLOBAL INDEXES; 
Table altered.
 
ops$tkyte@ORA10GR2> select status from user_indexes where index_name = 'T_IDX';
 
STATUS
--------
VALID

<b>that is what you can do...</b>


Now, you would not locally partition an index necessarily WITHOUT LOOKING AT YOUR DESIGN.  there are huge differences between local and global and you chose one over the other at design time to make data retrieval more efficient.  Going local might remove the unusable aspect but kill runtime performance.

If you have access to my latest book (Expert Oracle Database Architecture), I cover these topics in more detail. 

undo for truncate / exchange partition

John, November 21, 2006 - 4:10 pm UTC

Hi Tom:

just a quick question here: will the statement "alter table ... truncate partition" generate undo? I have a partition with over 100M records, and it took a long time (I did not let the statement complete, as I was afraid it would drag down the performance of the production server for too long; based on session_longops, it would take more than 24 hrs) to truncate the partition. I know truncating a non-partitioned table does not generate any undo; shouldn't truncating a partition be the same?

rssize under v$undostat for the transaction kept on increasing, so I am confused. Thanks for the help.


Tom Kyte
November 22, 2006 - 4:05 pm UTC

all DDL generates undo, undo for the DDL statement (as it updates the dictionary)

here, it will generate undo for the updates to the dictionary, but it will not generate undo for the data that is actually truncated.


the truncate does have to checkpoint the blocks that are dirty in the buffer cache before it does its job -- was this a very actively modified partition?

rssize??

truncate table

John, November 22, 2006 - 5:36 pm UTC

Thanks, Tom, for the quick response; I really appreciate your kindness.

What you told me matches what I got from Metalink: undo is only for the changes to the data dictionary.

The partition has around 45,000 blocks (4M/block), about 100M records, and is NOT actively updated (it is a range partition based on month, and what I intend to truncate is not the active month).

I am just puzzled by the speed of the truncation (estimated to be around 36 hours to finish) and the amount of undo (it used up the initial 500M of undo space after running for 2 hours; rssize is the field under v$rollstat). Shouldn't undo for the data dictionary be relatively small? Thanks for any help given.


Tom Kyte
November 24, 2006 - 6:03 pm UTC

anything like an "update indexes" in that truncate command. what does the command look like?

truncate table

John, November 25, 2006 - 10:16 am UTC

Hi Tom, I did not ask for an update of the global index; the command is just "alter table .. truncate partition ..;"

When I check session_longops, total work is 3,288,456 blocks, with an estimated 216,902 seconds to complete.




truncate table

John, November 25, 2006 - 11:00 am UTC

Hi Tom: I found the root cause -- it is the materialized view log (based on rowid) that was causing the trouble. It seems that when I drop / exchange a partition, Oracle does a full table scan to update the materialized view log.

I realized this after seeing that Oracle actually spent all the time on the materialized view log table, instead of the table I asked about. After I dropped the materialized view log, the exchange partition finished within 10 seconds.

It seems a little bit odd, as I can drop and rebuild the materialized view log pretty quickly; I am unsure why the Oracle backend process doesn't adopt the same approach :)

Thanks for all the help given to me.
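
A minimal sketch of checking for such a log before exchanging (assuming the partitioned table is named T):

select master, log_table from user_mview_logs where master = 'T';

-- if a log exists, dropping it (and recreating it after the maintenance
-- window) avoids the full scan described above
drop materialized view log on t;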


heap table to partitioned table

A reader, March 08, 2007 - 9:28 am UTC

Hi

I have read in many places that exchange is a good way to convert a heap table into a partitioned table.

I have now experienced that it is not such a good idea.

Since I am exchanging into a single partition, I then need to split it to get more partitions. Splitting is really a pain; it generates so much redo.

I suggest that anyone who wants to partition tables use insert /*+ append */ ... select instead.
Tom Kyte
March 08, 2007 - 11:09 am UTC

split need not generate redo.

and I have never heard that "to convert a heap into a partitioned table - exchange is good" myself.

If the heap was to be placed into a SINGLE partition and new partitions added over time, then it would be good.

If the heap was to be spread out over many partitions - the exchange would simply not make sense - a create table as select would be the correct approach.
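
A minimal sketch of that approach (names and boundary values are illustrative):

create table t_part
partition by range (dt)
( partition p1 values less than (to_date('01-jan-2007','dd-mon-yyyy')),
  partition p2 values less than (maxvalue)
)
as
select * from t_heap;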


split does generate redo

A reader, March 08, 2007 - 5:08 pm UTC

As far as I know, splitting partitions does generate redo!
Tom Kyte
March 08, 2007 - 8:41 pm UTC

you can do these things nologging.

creating an index - normally generates redo in archivelog mode. unless you do it nologging.

create a table as select - normally generates redo ........

and so on. Lots of things normally generate redo
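
A minimal sketch of a split with logging suppressed (names are illustrative; nologging skips redo only for the data movement, and the affected partitions should be backed up afterwards if you run in archivelog mode):

alter table t split partition p_max
  at (to_date('01-jan-2007','dd-mon-yyyy'))
  into ( partition p_2006 nologging, partition p_max nologging );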

Exchanging partitions between different tablespaces

natasha, April 04, 2007 - 6:29 am UTC

Hi Tom.

I have a big (about 100 GB) table with two partitions stored in tablespace1 and tablespace2, and I want to use alter table ... exchange partition to delete old data from my table. I create a new non-partitioned table with the same columns, but stored in a different tablespace (tablespace3). The data file of tablespace3 may even be on a different physical device.
Would exchanging partitions work fast in that case? I can't understand how changing the data dictionary can move data from one tablespace to another.

Thanks for your time
Tom Kyte
April 04, 2007 - 10:26 am UTC

exchange partitions goal is to NOT move a single byte of data.

if you did an exchange of a partition in tablespace2 with a table in tablespace3, when you were done, you would have a TABLE in tablespace2 (not a partition anymore) and a partition in tablespace3 (not a table anymore).

We just point to the new segments, we don't move the data.

Karteek, May 03, 2007 - 5:55 am UTC

I think it is not possible to exchange two partitions of two different tables directly...

In our case we are not looking at availability. All that we want to do is...

we have 2 tables - staging and Live - both are hash partitioned

in order to do an incremental data load, instead of directly doing update/delete/insert operations on the Live table, we want to make all the necessary changes to the Staging table and then exchange partitions between the 2 tables

Staging table:
A1
A2
A3
A4

Live table:
B1
B2
B3
B4

build all local indexes on the Staging table and exchange partitions: A1<->B1, A2<->B2, ... (if a direct alter table exchange partition doesn't work, I want to use a non-partitioned conversion table and proceed with the swap, like P1<->N, P1<->P2, N<->P2, where P is table1's partition and N is the non-partitioned table)

To me, all of this looks odd... why exchange ALL the partitions? Can't I simply interchange the names of my Staging and Live tables? I don't need a single partition or a subset of partitions -- I need all partitions of the Staging table, i.e. the entire table.

Please give your feedback Tom...

- Karteek
Tom Kyte
May 03, 2007 - 5:47 pm UTC

I'm not following the problem here.

you exchange a single table partition
with a single table

period. that is it, if you wanted to 'swap' the above two tables you would:

drop live;
rename stage to live;


Karteek, May 03, 2007 - 10:38 pm UTC

OK... before I drop Live and rename Stage to Live: if I build all my indexes on the Stage table and also analyze the schema, I think the statistics and index status of the Stage table will still be valid even after I rename it. Am I correct, Tom?

If I am right, we can control the tables since we know their names clearly. But how do I maintain the other objects of the table, say the INDEXES?

tablename/indexname:

I am keeping one more table for backup, instead of simply dropping the Live table:
                  Staging        Live        Backup
before rename      A/ia          B/ib         C/ic
after rename     dropped         B/ia         C/ib


Here I know which table is of what type (staging/live/backup). But how do I know about the indexes? Next time (maybe after a week) when I create the Staging table (and the same continues for several weeks)... which index names should I use for the Staging table? Do you suggest any best practice or logic?

- Karteek
Tom Kyte
May 04, 2007 - 12:52 pm UTC

stats and indexes are all valid after a rename, yes.


you can rename indexes via alter index as well.
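
A minimal sketch of the combined swap (table and index names are illustrative):

alter table live rename to backup_t;
alter table staging rename to live;
alter index staging_ix rename to live_ix;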

Karteek, May 05, 2007 - 1:53 pm UTC

Thank you Tom!

Converting a non partitioned table to a partitioned table

Shahram Samardar, May 10, 2007 - 7:43 pm UTC

Hi

Is there any way to convert a non partitioned table to a partitioned table by alter table commands?

What is the recommendation to do this with fewer steps?

Thanks
Tom Kyte
May 11, 2007 - 11:31 am UTC

well, I would presume that you want the data to be in more than one partition - which means row 1 goes into partition A, row 2 goes into partition B and so on

which means - recreate.

you can use create table as select to take an existing table and create a partitioned table

you can use dbms_redefinition to do this online


We can even do this:

ops$tkyte%ORA10GR2> create table t1 ( x int, y int );

Table created.

ops$tkyte%ORA10GR2> CREATE TABLE t2 ( x int, y int )
  2  partition by range ( x )
  3  ( partition p values less than (maxvalue) )
  4  /

Table created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> insert into t1 select rownum, rownum from all_users;

39 rows created.

ops$tkyte%ORA10GR2> commit;

Commit complete.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select 't1' nm, count(*) from t1 union all select 't2', count(*) from t2;

NM   COUNT(*)
-- ----------
t1         39
t2          0

ops$tkyte%ORA10GR2> alter table t2
  2  exchange partition p with table t1
  3  -- other options here for indexes, validation and the like
  4  /

Table altered.

ops$tkyte%ORA10GR2> alter table t1 rename to t1_old;

Table altered.

ops$tkyte%ORA10GR2> alter table t2 rename to t1;

Table altered.

ops$tkyte%ORA10GR2> alter table t1_old rename to t2;

Table altered.

ops$tkyte%ORA10GR2> select 't1' nm, count(*) from t1 union all select 't2', count(*) from t2;

NM   COUNT(*)
-- ----------
t1         39
t2          0

ops$tkyte%ORA10GR2> select dbms_metadata.get_ddl('TABLE', 'T1' ) from dual;

DBMS_METADATA.GET_DDL('TABLE','T1')
-------------------------------------------------------------------------------

  CREATE TABLE "OPS$TKYTE"."T1"
   (    "X" NUMBER(*,0),
        "Y" NUMBER(*,0)
   ) PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255
  STORAGE(
  BUFFER_POOL DEFAULT)
  TABLESPACE "USERS"
  PARTITION BY RANGE ("X")
 (PARTITION "P"  VALUES LESS THAN (MAXVALUE)
  PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255
  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
  TABLESPACE "USERS" NOCOMPRESS )


exchange partition

Vikas Atrey, July 16, 2007 - 4:48 am UTC

Exchange partition works only on one partition in one go.

So if we have to convert a huge non-partitioned table into a partitioned table with 8 partitions quickly, the only way to go is:

1. Use "exchange partition" as suggested.
2. Then split the partition until we get the required number of partitions.

Or is there another way?

Tom Kyte
July 17, 2007 - 11:24 am UTC

this would not apply???? you have to reorganize all of the data - split it out 8 ways. you do not have 8 tables to exchange with.


dbms_redefinition - an online redefine.

this would make it appear instantaneous to the end users (no downtime)

but it might take a while to do - but you do not care, it is available.



splitting is not going to be an ONLINE operation - it is DDL..... it would perhaps be the slowest way I could think to do it.
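
A minimal dbms_redefinition sketch of that online approach; it assumes an empty, pre-created partitioned interim table T_PART with the same column list as T (names are illustrative):

declare
  l_errors number;
begin
  dbms_redefinition.can_redef_table( user, 'T' );
  dbms_redefinition.start_redef_table( user, 'T', 'T_PART' );
  -- copies indexes, triggers, constraints and grants to the interim table
  dbms_redefinition.copy_table_dependents( user, 'T', 'T_PART',
                                           num_errors => l_errors );
  dbms_redefinition.finish_redef_table( user, 'T', 'T_PART' );
end;
/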

Why is this failing?

Chinni, May 26, 2009 - 10:21 am UTC

Hi Tom,
I encountered the following error when I tried exchange partition.

ORA-14130: UNIQUE CONSTRAINTS mismatch IN ALTER TABLE EXCHANGE PARTITION

while executing exchange partition command

ALTER TABLE RCM EXCHANGE PARTITION PART_20052009 WITH TABLE RCM_TEMP EXCLUDING INDEXES WITHOUT VALIDATION

Main table has : two non unique local indexes, and one unique(pk) index
temporary table : has No indexes.

This exchange is working fine in two databases (one 10.2.0.4 and another 9.2.0.8) but not in another database (9.2.0.8). I found this in my UAT environment. There have been no object-level changes to these tables (except that I rebuilt the PK index and gathered stats on RCM). I am afraid it might happen in production as well. Could you please help me identify the problem? Why is it working in one db and not in the other?

Could you also let me know what is meant by "EXCLUDING INDEXES"? Do I really need to add the indexes on the temp table as well?

thanks for your help.
Tom Kyte
May 26, 2009 - 10:26 am UTC

you would need to provide us with an example.

and in the construction of the example, you will almost certainly figure out what you did wrong - what the difference IS in UAT (because there will be a difference)


You are correct

Chinni, May 27, 2009 - 1:56 am UTC

Hi Tom,
Yes, you are correct. There was actually a difference between the two environments: there was some data in the RCM_TEMP table in UAT, whereas this table is empty in the other environments. I cleared that data in UAT and it worked fine.

Thank you very much for your time. But a small question: when I specify "excluding indexes", can I ignore differences in indexes between the two tables -- I mean, when I only move the data from the main table to the temp table?


Tom Kyte
May 27, 2009 - 8:12 am UTC

give me a "for example" to work with.

I don't know what you mean in your last paragraph.

Exchange partition

A reader, March 19, 2010 - 6:45 pm UTC

Hi Tom,
Is there any way with exchange partition to exchange only nominated column(s) of data? For example, say I have a table with a key and two amount columns. With the end-of-day run we want to exchange all columns (no issue), but with the intra-day run we want to refresh only one amount column (by key); the other amount column's value remains the same as at end of day.

Thanks
Tom Kyte
March 20, 2010 - 9:22 am UTC

think about it.

close your eyes and envision what happens during a partition exchange.

does any data actually move? (no, it does not)

we just rename segments - the partition becomes a table, the table becomes the partition - no data is moved.

In order to do what you ask, you have to MOVE data, hence it will not be a partition exchange.

ORA-14292: Partitioning type of table must match subpartitioning type of composite partition

Balaji, April 07, 2010 - 5:06 am UTC

Hi Tom,
What will be the problem here,

create table TEMP_partition
(
empno number(10),
name varchar2(20),
join_date date
)
partition by range(join_date)
subpartition by hash(empno) subpartitions 1
(
partition part_1 values less than (to_date('01-jan-2009','dd-mon-yyyy')) (subpartition part1_sub1)
,partition part_2 values less than (to_date('01-may-2010','dd-mon-yyyy'))(subpartition part2_sub1)
)
/

create table composite_partition_1
(
empno number(10),
name varchar2(20),
join_date date
)
partition by range(join_date)
subpartition by hash(empno) subpartitions 1
(
partition part_1 values less than (to_date('01-jan-2009','dd-mon-yyyy'))(subpartition part1_sub1)
,partition part_2 values less than (to_date('01-may-2010','dd-mon-yyyy'))(subpartition part2_sub2)
)
/

create table composite_partition_2
(
empno number(10),
name varchar2(20),
join_date date
)
partition by range(join_date)
subpartition by hash(empno) subpartitions 1
(
partition part_1 values less than (to_date('01-jan-2009','dd-mon-yyyy'))(subpartition part1_sub1)
,partition part_2 values less than (to_date('01-may-2010','dd-mon-yyyy'))(subpartition part2_sub2)
)
/
insert into composite_partition_1
values(2,'composite_partiti_1',sysdate)
/

insert into composite_partition_2
values(1,'composite_partiti_2',sysdate)
/


alter table composite_partition_1 exchange partition PART_2 with table TEMP_partition
excluding indexes without validation

ORA-14292: Partitioning type of table must match subpartitioning type of composite partition
Tom Kyte
April 12, 2010 - 8:33 pm UTC

you have a range/hash partitioned table that you are trying to exchange with a hash partitioned partition - you cannot exchange

range/hash
with
hash

ops$tkyte%ORA11GR2> create table TEMP_partition
  2  (
  3  empno number(10),
  4  name varchar2(20),
  5  join_date date
  6  )
  7  partition by hash(empno) partitions 1
  8  /

Table created.

ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> drop table composite_partition_1;

Table dropped.

ops$tkyte%ORA11GR2> create table composite_partition_1
  2  (
  3  empno number(10),
  4  name varchar2(20),
  5  join_date date
  6  )
  7  partition by range(join_date)
  8  subpartition by hash(empno) subpartitions 1
  9  (
 10  partition part_1 values less than
 11  (to_date('01-jan-2009','dd-mon-yyyy'))(subpartition part1_sub1)
 12  ,partition part_2 values less than
 13  (to_date('01-may-2010','dd-mon-yyyy'))(subpartition part2_sub2)
 14  )
 15  /

Table created.

ops$tkyte%ORA11GR2> alter table composite_partition_1 exchange partition PART_2 with table
  2  TEMP_partition
  3  excluding indexes without validation
  4  /

Table altered.




you can exchange a hash partitioned table with a hash partitioned partition.

What about OLTP ?

Frank, March 22, 2011 - 6:26 pm UTC

Hello Tom,

a whole lot of details are covered here about "partition exchange", but I'm still missing something:

is "partition exchange" recommended for tables in an OLTP environment as well, where in 95% of the cases only read-only access is granted to the applications ?

A bit of detail: a huge number of tables receives data from an external application; the number of records is between 0 and 50,000,000, and these must be loaded into Oracle (10gR2) while applications read the tables. (I need to "replace" the tables completely, online; it makes no sense to merge the data.)

I'm thinking about creating a partitioned table for each, with only one partition -- values less than (maxvalue) -- to be able to do the exchange with "standalone" tables. However, I have some doubts, since I know that on a busy day thousands or even tens of thousands of statements (selects) run simultaneously against these tables...
It would be great if you could answer the following questions:

- Is partition exchange officially able to handle such a volume of simultaneous statements/transactions when it's executed, without giving:
- wrong results/datasets,
- an error message?
- Is there any delay in the partition exchange in case of such transaction volumes ? (Should not take minutes...)
- Do you have a trick for exchanging multiple tables at once? (Views over a couple of tables...)
- Is there anything else that should be in focus here ?
- Any other thoughts or suggestions ?

Thanks.
Tom Kyte
March 23, 2011 - 8:29 am UTC

If I were doing a complete refresh of the data, I might use synonyms instead.

Have two tables, T1 and T2. Right now synonym S points to T1. Queries use S.

I get new data, load into T2.

When all done, indexed, etc - I create or replace synonym S for T2; New queries start using T2, already executing queries continue to use T1.

When I'm sure all T1 queries are done - I can truncate T1.

Tomorrow, we load T1 and swap again.
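
The whole cycle, as a minimal sketch using the T1/T2/S names above:

truncate table t2;
-- load and index T2 here...
create or replace synonym s for t2;   -- new queries see T2 from here on
-- once no running query still references T1:
truncate table t1;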

...
- Do you have a trick for exchanging multiple tables at once? (Views over a couple of tables...)
.....

that will be problematic, it can be done but will be expensive. You would use DELETE on the tables, then use a conventional path load into these tables - upon commit, new queries would see new data, already running queries would see old data.

50,000,000 records is pretty small, on today's machines - that delete+insert might take some amount of resources, but it wouldn't be anywhere near the realm of 'out of the question'



Another option would be to use partitioning and views or fine grained access control to access the tables.

You would have a table T with partitions P1 and P2, it would be a list partition. Table T would have a column "which_one" that is NOT NULL and has a check constraint for being in ('P1','P2').

The view would look like:

select * from T where which_one = sys_context( 'my_global_application_context', 'which_one' );

where my_global_application_context would be just that - a global application context (you can search for that in the documentation on otn.oracle.com -> documentation). You would store in a table the value to be used for which_one and an after startup on database trigger would read that table and populate this context.

So, everyone would be reading from partition p1 right now (since the value is assumed to be 'P1')

You would load P2 - for as many table T's as you need, you can direct load them, you can load them and commit each one - whatever, no one will see partition P2 right now.

When they are all loaded, you can then update the table and run the procedure that populates the context. Immediately, new queries would see all of the P2's and start using them. Existing already running queries would still use P1 until they are done.

You could use fine grained access control instead of a view to accomplish this same thing.
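
A minimal sketch of the context-plus-view plumbing described above (package, context and view names are illustrative):

create or replace context my_global_application_context
  using my_ctx_pkg accessed globally;

create or replace package my_ctx_pkg as
  procedure set_which_one( p_val in varchar2 );
end;
/
create or replace package body my_ctx_pkg as
  procedure set_which_one( p_val in varchar2 ) is
  begin
    -- a globally accessed context value is visible to all sessions
    dbms_session.set_context( 'my_global_application_context',
                              'which_one', p_val );
  end;
end;
/
create or replace view v_t as
select * from t
 where which_one = sys_context( 'my_global_application_context', 'which_one' );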


OLTP reloaded

Frank, March 24, 2011 - 3:59 pm UTC

Hello Tom,

Recreation of synonyms is what we have now; this is something we want to eliminate. (There have been issues in our environments: alert logs show no errors, but business errors indicate that something goes wrong when switching synonyms, especially when synonyms in the local database reference synonyms on the remote database that are switched against different tables.) Have you ever experienced such an issue?

The other 2 ideas are OK; the suggestion to have 2 partitions would imply extending the primary key and the unique indexes. According to the documentation, "online queries" are possible during a direct-path insert, but I cannot see detailed information about the indexes; only one sentence suggests that "index maintenance is done at the end of the direct-path insert".

Can you tell me what is going on at the technical level? (I want to understand what performance or other impacts this implies and whether this has any negative effect on the real-time queries.)

Thanks.

Tom Kyte
March 25, 2011 - 10:45 am UTC

... but businness errors indicate that something goes wrong when switching synonymys, ...

that is not very descriptive. "something goes wrong"


The other 2 ideas are ok, the suggestion to have 2 partitions would imply the extension of the primary key and the unique indexes. According to the documentation "online queries" are possible during the direct path insert, but I cannot see detailed information about the indexes, only one sentence suggest that "index maintenance is done at the and of the direct-path insert"

indexes are entirely maintained. Beware that a unique index could be left in a direct load (invalid) state if your data has duplicates.

In a direct path load we:

o write all of the new data above the high water mark of the existing segment. Your existing segment will be empty so this isn't really relevant. We'll have a lock on the segment you are direct pathing into - so if you insert /*+ append */ into t partition(p1); - we'll only lock partition p1, p2 could actually be modified if that is a need while p1 is being loaded. At no time would queries be blocked

o while we write the data above the high water mark, we build a 'mini index' of just the data loaded. If the segment is not empty - at the end of the load, right before the insert finishes - we merge this index with the existing index. That won't be your case since you'll be using local indexes only and there will be no existing index.

Your real time queries will feel no pain as they won't even be touching the segments being loaded into.

OLTP forever :-)

Frank, March 28, 2011 - 3:39 pm UTC

Hello Tom,

... Followup March 23, 2011 - 8am Central time zone:
... Existing already running queries would still use P1 until they are done.

OK, but for how long shall I keep the "old" segments? Can I issue an "ALTER TABLE XXXXXX DROP PARTITION XXXXXX_1 UPDATE INDEXES" right after "switching views", or shall I wait and monitor the usage of the "old" segments?


... Followup March 25, 2011 - 10am Central time zone:
... That won't be your case since you'll be using local indexes only and there will be no existing index.

I intend to have primary keys and unique indexes, so there will be existing indexes. Or was this a suggestion? (If I do not "bother" with global indexes/constraints, then a huge amount of validation must be done before loading things into the database.)

Would you please share your thoughts?

Thanks.

Tom Kyte
March 29, 2011 - 3:49 am UTC

if you drop the partitions - their space becomes available for reuse, and if their space is reused, the already running queries that were reading it would get the dreaded ora-08103 object no longer exists. It would be similar to getting an ora-1555 snapshot too old. If you want them to finish, you would wait for them to complete before making that space "reusable"

there would be NO global indexes here - remember, you are swapping the data out - there is no need for uniqueness ACROSS these two partitions, you'll definitely have "dups" across the partitions.

You would have local indexes ONLY. Your primary key would have the column we are adding to the table added to it - we talked about that...

Internals of Exchange

A reader, January 24, 2012 - 2:01 am UTC

Hi Tom,

In your first comment you stated that the time for an EXCHANGE is a function of how many extents the staging table has to swap.

However, I did some tests which seem to confirm that there is no noticeable difference between 15K extents and 15 extents of the same object. Can you please show us how the time can rise significantly?

Second question: while doing the EXCHANGE I noticed that the partitioned TABLE gets a SHARE ROW EXCLUSIVE lock (lmode=3), with the partition and the staging table both locked in EXCLUSIVE mode (lmode=6).

Also, the internal dictionary objects HIST_HEAD$, HISTGRM$, MLOG$, SUPEROBJ$, TAB_STATS$, TAB$ and TABPART$ are locked in SHARE EXCLUSIVE mode.

How come EXCHANGE works fine with two different partitioned tables being exchanged with different staging tables concurrently, whereas EXCHANGEs on the same partitioned table concurrently (multiple partition exchanges) face issues with "library cache lock"?

I see that only one of them proceeds with wait event "db file sequential read" while the other waits on "library cache lock", and this situation reverses itself between the sids in question.

Can you please explain in detail the workings of EXCHANGE? My idea was that if the recursive calls need to update the dictionary segments, then ALTER TABLE EXCHANGE should work in a serialized fashion across the database, not at the table level.

Your thoughts please?

thanks



Tom Kyte
January 24, 2012 - 1:32 pm UTC

I believe I mis-spoke on that one. It just updates the segment level information, it doesn't matter how many extents there are. The exchange will just basically swap object names. The extent information isn't touched there - sorry about that.


How come EXCHANGE works fine with two different partition tables getting
exchanged with different staging tables concurrently where as the EXCHANGE on
the same partition table concurrently (multiple partitions exchanges) face
issues with "Library cache Lock".


because we only allow one DDL statement on an object at a time.


Can you please explain us in details the working of EXCHANGE? My idea was if
the recursive calls need update to the dictionary segments then the ALTER TABLE
EXCHANGE should work in a serialized fashion across database, and not at the
table level.


why - we only need to serialize the access to the tables being swapped.

Concurrent Exchange

A reader, January 25, 2012 - 9:05 am UTC

Hi tom,

Will the updates to dictionary objects (e.g. tab$, seg$) and their dependent objects not make the exchange work in a serialized manner?

If one of the sessions is the holder of the latch for obj$, how can the other session acquire the latch on the library cache for the same object?

That is not clear to me; that is the reason I thought it would be serialized across the database, but that is not the case.

Can you please share your thoughts on how Oracle solves this internally.

thanks
Tom Kyte
January 26, 2012 - 10:09 am UTC

obj$ doesn't use "latches", latches are used for very short, very fast operations against memory structures.

we serialize at the object level using locks and latches when appropriate, just like you would in your applications for various "things".

Not sure what else to say here - we just use locks and latches (and mutexes) to serialize when we have to.

Concurrent Exchange

a Reader, January 28, 2012 - 8:54 pm UTC

Hi Tom,

When Oracle needs to update a particular block of a table it needs to hold the latch to update the contents of the block to make it current.

Is this statement not true?

If no latches are required to update a particular table block's data, then how come we are seeing "library cache lock" when we run multiple sessions to do concurrent exchanges on two partitions of the same table?

Can you please explain in detail the library cache and row cache internals when multiple recursive calls are made for the ALTER TABLE EXCHANGE statement involving multiple dictionary objects.

Thanks
Tom Kyte
January 31, 2012 - 5:40 pm UTC

when oracle needs to update the bits and bytes on a block - we get the block in current mode - only one transaction can have a block in current mode at any given time.


Forget latches, forget the mechanism for a minute, the rule is "one transaction has a block in current mode at a time - there is serialization happening there"

whether it be a latch, a lock, a mutex....


"library cache lock" when we are running multiple sessions
to do concurrent exchange on two partitions of the same table.


that is an entirely different situation! You are trying to do DDL on the table, you are not modifying any blocks. We only permit, in general, one DDL at a time - sometimes at the system level (only one create tablespace can happen at a time), sometimes at the segment level (only one partition exchange can happen to a table at a time).


Efficient partition exchange

Josh A, March 02, 2012 - 1:23 pm UTC

Hi Tom,
This thread has been very helpful in understanding why I'm seeing so many library cache lock waits. Now I'm wondering if it is worth trying to minimize them.

I have a process that spawns a number of jobs; each does an insert into a new table to be exchanged and then a partition exchange on the original partitioned table. Is it worthwhile to rewrite this process to serialize the exchange partition step, or will it take a similar amount of time and simply not show the library cache lock wait event?


Thanks
Tom Kyte
March 02, 2012 - 1:49 pm UTC

if you serialize or we serialize there will still be serialization. I would not foresee any performance benefit in you doing the serialization versus just letting the database do it in this case.

alter table exchange partition - reorganize tablespace

diao, August 13, 2012 - 2:51 pm UTC

Hi Tom!

Is exchanging partitions a good method to reorganize a user's default tablespace in Oracle 9.2, when afterwards I must drop the user's old default tablespace? Is it better than dbms_redefinition or alter table move?
Tom Kyte
August 17, 2012 - 1:55 pm UTC

exchange partition cannot be used to change a tablespace - you still need to read and write all of the data.

alter table move partition

or

dbms_redefinition


alter table if you can take an outage

dbms_redefinition if you cannot

Concurrent Exchange Partition

Hemant K Chitale, July 02, 2013 - 8:52 am UTC

You indicate that two sessions attempting to EXCHANGE PARTITION concurrently will serialise.

Thus, these two statements from two different sessions :
ALTER TABLE t1 EXCHANGE PARTITION p100 WITH TABLE p100_target_tbl;
ALTER TABLE t1 EXCHANGE PARTITION p150 WITH TABLE p150_target_tbl;

would serialise.

Would this scale to 'n' (say 8 or 10) sessions attempting that many different partitions concurrently ?
Tom Kyte
July 02, 2013 - 5:06 pm UTC

define what you mean by "scale" in this case.

you already know that it will serialize - if 8 or 10 sessions do it, they will all serialize. that "scales" - it is a perfect slope (everyone's response time will be N*tim, where N = their place in line and tim = time to exchange a single partition). Perfect linear scaling :) but I'm not sure what you mean by scale in this context.

Exchange slow

A reader, February 05, 2015 - 8:52 pm UTC

Hi Tom,

We have a purging process which does :

1) create temp table as main table
2) create indexes on temp table matching local indexes on main table
3) exchange partition to be dropped
4) drop partition
5) drop temp table

Monitoring the process from gv$session_longops, I see at least 3 "stages" (OPNAME):

Table scan
Sort/Merge
Sort output

Since it takes too long, especially the last "stage" (even hours for just one partition of 100M rows), I did some research and added:

a) disable PK keep index

between steps 2 and 3 above, according to the doc:

http://docs.oracle.com/cd/B19306_01/server.102/b14231/partiti.htm

but that didn't work; am I missing anything?

The simplest table has no "incoming" FKs and just a global PK (not a local index), which is NOVALIDATE and DISABLED.

exchange command is :

alter table ANALYZER_ATTRIBUTE_INSTANCE exchange partition SYS_P2339 with table ZZPARTDROP including indexes without validation update indexes;

db version is 11.2.0.3.0, 6 nodes rac.

Please advise on what can be done to improve the process (changing the PK to LOCAL is not viable, and migrating to 12c to do the delete on the PK in the background is not planned yet).

Thanks,
Bertran Saragusti.
