Skip to Main Content
  • Questions
  • Merging multiple tables with LOBs into one table

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, David.

Asked: May 07, 2012 - 8:45 am UTC

Last updated: May 07, 2012 - 9:18 am UTC

Version: 11.2.0

Viewed 1000+ times

You Asked

Hi Tom,

We have an application that uses two tables in an 11.2 database to store document information. The CURRENT_INFO table shows the current version of a document; the ARCHIVE_INFO shows the history information. The two tables have identical definitions (and, though not shown, identical index definitions):

create table CURRENT_INFO (
    sys_id varchar(32) not null,
    details clob,
    summary clob,
    updated_by varchar2(40) DEFAULT NULL,
    updated_on timestamp DEFAULT NULL,
    created_by varchar2(40) DEFAULT NULL,
    created_on timestamp DEFAULT NULL,
    mod_count number(11) DEFAULT NULL
);

create table ARCHIVE_INFO (
    sys_id varchar(32) not null,
    details clob,
    summary clob,
    updated_by varchar2(40) DEFAULT NULL,
    updated_on timestamp DEFAULT NULL,
    created_by varchar2(40) DEFAULT NULL,
    created_on timestamp DEFAULT NULL,
    mod_count number(11) DEFAULT NULL
);


(Note: I've given the tables and columns simple names for the purposes of this question, but the idea is the same).

A vendor has a goal of migrating everything to a single table called ALL_INFO, and I agree with them -- the current layout is inefficient and redundant. Every time a new record is inserted into the CURRENT_INFO table, the application copies the old version of the record to the ARCHIVE_INFO table. That means a lot of LOB movement, lots of redo, basically a lot of work when all we want is to insert a record.

To combine the two tables into a single entity, we could use CTAS. However, both the CURRENT_INFO and ARCHIVE_INFO tables have over 80GB of LOB data associated with them, spread over four columns. (Note the tables themselves, not including the LOB data, combine to take up only 1GB of space). Since an outage window is required and we have limitations on the length of the window, I'm trying to come up with faster alternatives.

It seems to me the ideal would be to copy as little data as possible. One early thought was to rename the larger of the two tables to ALL_INFO, and then do an insert ... select from the other table. That would cut our DML roughly in half, but it still requires we allocate space to hold a copy of the second table.

But now I'm wondering: is there a way we can use (exchange?) partitions to combine the two tables into a single table, and then somehow "remove" the partition key, so we don't have to copy any LOB values?

For example, in pseudocode:
create table ALL_INFO ( sys_id varchar(32) not null, ..., ) PARTITION BY LIST ("RECORD_TYPE") ;
alter table CURRENT_INFO add (RECORD_TYPE) and set value = 'C' /* C = Current */
alter table ARCHIVE_INFO add (RECORD_TYPE) and set value = 'A' /* A = Archive */
alter table ALL_INFO exchange partition (add the CURRENT_INFO table)
alter table ALL_INFO exchange partition (add the ARCHIVE_INFO table)
rebuild the indexes so they're global
remove the partition key
and now we have one table with all the data.

Updating every row in the tables to set RECORD_TYPE shouldn't be terrible, relatively, since the LOBs are by far larger than the rest of the table and are not stored inline.
But I'm not sure that we can take two segments and combine them into one. I also haven't found a way to drop a key column from a partitioned table.

Any thoughts on a way to do this so as to minimize both downtime and space requirements? What if we had more than two tables with identical definitions?

Thanks.

and Tom said...

You cannot really remove the partition key. You'd have to merge the two partitions into one. Anything that would take a partitioned table to non-partitioned would involve rewriting the table.

That said, if you wanted - you could add a column to each table - your A and C values - and then create a partitioned table - and just simply KEEP the partitioned table.

You'd put non-partitioned indexes on them - so index access to this table would be more or less equivalent to index access of a non-partitioned table.

And a full scan would be no different.


You could call this new partitioned table by some name - and create a view of it that simply does not select out the virtual column with the name of the table of you want for the view.


If you had N tables with the same definition to merge - you could take the same approach, just add a column to each and create a partitioned table that has the right structure and alter exchange in each of the tables.


ops$tkyte%ORA11GR2> create table current_emp
  2  (
  3   EMPNO        NUMBER(4) NOT NULL,
  4   ENAME        clob,
  5   JOB          VARCHAR2(9),
  6   MGR          NUMBER(4),
  7   HIREDATE     DATE,
  8   SAL          NUMBER(7,2),
  9   COMM         NUMBER(7,2),
 10   DEPTNO       NUMBER(2)
 11  )
 12  /

Table created.

ops$tkyte%ORA11GR2> insert into current_emp select * from scott.emp;

14 rows created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> create table archive_emp as select * from current_emp;

Table created.

ops$tkyte%ORA11GR2> update archive_emp set empno = mod(empno,1000);

14 rows updated.


<b>so there are our current and archive tables...</b>


ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> alter table current_emp add status varchar2(1);

Table altered.

ops$tkyte%ORA11GR2> update current_emp set status = 'C';

14 rows updated.

ops$tkyte%ORA11GR2> alter table current_emp modify status NOT NULL;

Table altered.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> alter table archive_emp add status varchar2(1);

Table altered.

ops$tkyte%ORA11GR2> update archive_emp set status = 'A';

14 rows updated.

ops$tkyte%ORA11GR2> alter table archive_emp modify status NOT NULL;

Table altered.

<b>we add the status flags...</b>


ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> create table emp_t
  2  (
  3   EMPNO        NUMBER(4) NOT NULL,
  4   ENAME        clob,
  5   JOB          VARCHAR2(9),
  6   MGR          NUMBER(4),
  7   HIREDATE     DATE,
  8   SAL          NUMBER(7,2),
  9   COMM         NUMBER(7,2),
 10   DEPTNO       NUMBER(2),
 11   status       varchar2(1) default 'C' not null
 12  )
 13  partition by list(status)
 14  ( partition part_c values ( 'C' ),
 15    partition part_a values ( 'A' )
 16  )
 17  /

Table created.

<b>create the partitioned table..</b>

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> alter table emp_t exchange partition part_c
  2  with table current_emp without validation;

Table altered.

ops$tkyte%ORA11GR2> alter table emp_t exchange partition part_a
  2  with table archive_emp without validation;

Table altered.

<b>move the data in</b>

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> alter table emp_t add constraint emp_pk primary key(empno);

Table altered.

ops$tkyte%ORA11GR2> alter table emp_t add constraint emp_fk_emp foreign key(mgr) references emp_t(empno);

Table altered.

<b>do whatever we need to the table...</b>

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> create view emp
  2  as
  3  select empno, ename, job, mgr, hiredate, sal, comm, deptno
  4    from emp_t
  5  /

View created.

<b>and use a view in the application</b>

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> select *
  2    from emp;

     EMPNO ENAME      JOB              MGR HIREDATE         SAL       COMM     DEPTNO
---------- ---------- --------- ---------- --------- ---------- ---------- ----------
      7369 SMITH      CLERK           7902 17-DEC-80        800                    20
      7499 ALLEN      SALESMAN        7698 20-FEB-81       1600        300         30
      7521 WARD       SALESMAN        7698 22-FEB-81       1250        500         30
      7566 JONES      MANAGER         7839 02-APR-81       2975                    20
      7654 MARTIN     SALESMAN        7698 28-SEP-81       1250       1400         30
      7698 BLAKE      MANAGER         7839 01-MAY-81       2850                    30
      7782 CLARK      MANAGER         7839 09-JUN-81       2450                    10
      7788 SCOTT      ANALYST         7566 09-DEC-82       3000                    20
      7839 KING       PRESIDENT            17-NOV-81       5000                    10
      7844 TURNER     SALESMAN        7698 08-SEP-81       1500          0         30
      7876 ADAMS      CLERK           7788 12-JAN-83       1100                    20
      7900 JAMES      CLERK           7698 03-DEC-81        950                    30
      7902 FORD       ANALYST         7566 03-DEC-81       3000                    20
      7934 MILLER     CLERK           7782 23-JAN-82       1300                    10
       369 SMITH      CLERK           7902 17-DEC-80        800                    20
       499 ALLEN      SALESMAN        7698 20-FEB-81       1600        300         30
       521 WARD       SALESMAN        7698 22-FEB-81       1250        500         30
       566 JONES      MANAGER         7839 02-APR-81       2975                    20
       654 MARTIN     SALESMAN        7698 28-SEP-81       1250       1400         30
       698 BLAKE      MANAGER         7839 01-MAY-81       2850                    30
       782 CLARK      MANAGER         7839 09-JUN-81       2450                    10
       788 SCOTT      ANALYST         7566 09-DEC-82       3000                    20
       839 KING       PRESIDENT            17-NOV-81       5000                    10
       844 TURNER     SALESMAN        7698 08-SEP-81       1500          0         30
       876 ADAMS      CLERK           7788 12-JAN-83       1100                    20
       900 JAMES      CLERK           7698 03-DEC-81        950                    30
       902 FORD       ANALYST         7566 03-DEC-81       3000                    20
       934 MILLER     CLERK           7782 23-JAN-82       1300                    10

28 rows selected.


Is this answer out of date? If it is, please let us know via a Comment

More to Explore

VLDB

If you are new to partitioning, check out Connor McDonald's introduction series here.

VLDB

Documentation set on VLDB and Partitioning.