Archival of static partitions of a large table
Question and Answer

Connor McDonald

Thanks for the question, Stuart.

Asked: July 29, 2021 - 3:06 am UTC

Last updated: August 26, 2021 - 4:06 am UTC

Version: 18c

Viewed 1000+ times

You Asked

Hi,

We have a large table in our data warehouse (2.2 TB, over 5 billion records), range-partitioned by month. The table includes a CLOB column. Each month, as the partition becomes static, it is moved to a separate tablespace and compressed. That tablespace can then be excluded from overnight backups for performance reasons.
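For context, the monthly move-and-compress step is roughly as follows (illustrative only; the partition, tablespace and LOB column names match the DDL and tablespaces shown further down):

-- move one static month to the archive tablespace, compressed,
-- bringing the LOB segment along and maintaining indexes
ALTER TABLE dwstglm.fct_lm_requests
  MOVE PARTITION requests_201606
  TABLESPACE dwstglmdataro COMPRESS
  LOB (url) STORE AS (TABLESPACE dwstglmdataro)
  UPDATE INDEXES;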

We are coming close to running out of storage capacity and would like to archive the older partitions to AWS S3 storage and drop the partitions from the database until more on-prem storage becomes available at which point we want to bring the older data back into the database.

I had thought that having the older partitions in a separate tablespace, including data segments and corresponding LOB segments would mean that the data would be self-contained and the datafiles could simply be archived and retrieved at a later point in time.

However, I have found there is a global LOB index which spans partitions in both tablespaces.

I'm considering moving the older partitions into a separate table, but would welcome any suggestions for a better approach.

Regards,
Stuart.

Hi, this table is range-partitioned, not list-partitioned. Here's the DDL; I've omitted most of the columns and partitions for brevity:

CREATE TABLE DWSTGLM.FCT_LM_REQUESTS (
 FCT_LM_REQUESTS_KEY   NUMBER,
 ID                    VARCHAR2(50 CHAR),
 TIMESTAMP             TIMESTAMP,
 TIMESTAMP_YEAR        NUMBER,
 TIMESTAMP_MONTH       VARCHAR2(4000 CHAR),
 TIMESTAMP_DAY         DATE,
 USER_ID               VARCHAR2(4000 CHAR),
~
 ASSIGNMENT_ID         VARCHAR2(4000 CHAR),
 URL                   CLOB,
 USER_AGENT            VARCHAR2(4000 CHAR),
~
 HTTP_VERSION          VARCHAR2(4000 CHAR),
 EXTRACT_DATE          DATE
)
PARTITION BY RANGE(TIMESTAMP_MONTH)
(PARTITION REQUESTS_MIN
VALUES LESS THAN ('2016-06') TABLESPACE DWSTGLMDATA NOLOGGING,
PARTITION REQUESTS_201606
VALUES LESS THAN ('2016-07') TABLESPACE DWSTGLMDATA NOLOGGING,
PARTITION REQUESTS_201607
VALUES LESS THAN ('2016-08') TABLESPACE DWSTGLMDATA NOLOGGING,
~
VALUES LESS THAN ('2017-06') TABLESPACE DWSTGLMDATA NOLOGGING,
PARTITION REQUESTS_201706
VALUES LESS THAN ('2017-07') TABLESPACE DWSTGLMDATA NOLOGGING,
PARTITION REQUESTS_MAX
VALUES LESS THAN (MAXVALUE) TABLESPACE DWSTGLMDATA NOLOGGING);

select index_name from user_lobs where table_name = 'FCT_LM_REQUESTS';
SYS_IL0007478673C00015$$

select index_name, partitioning_type, locality from user_part_indexes
where index_name = 'SYS_IL0007478673C00015$$';
SYS_IL0007478673C00015$$ RANGE LOCAL

select index_name, partition_name, high_value, tablespace_name from user_ind_partitions where index_name = 'SYS_IL0007478673C00015$$';

SYS_IL0007478673C00015$$ SYS_IL_P52953 '2016-06' DWSTGLMDATA
SYS_IL0007478673C00015$$ SYS_IL_P27543 '2016-07' DWSTGLMDATARO
SYS_IL0007478673C00015$$ SYS_IL_P27544 '2016-08' DWSTGLMDATARO
~
SYS_IL0007478673C00015$$ SYS_IL_P52952 '2021-06' DWSTGLMDATARO
SYS_IL0007478673C00015$$ SYS_IL_P47213 '2021-07' DWSTGLMDATA
~
SYS_IL0007478673C00015$$ SYS_IL_P47219 '2022-01' DWSTGLMDATA
SYS_IL0007478673C00015$$ SYS_IL_P47220 MAXVALUE DWSTGLMDATA


I see that this is a partitioned global index. The partitions from 2016 to May 2021 have been moved to the DWSTGLMDATARO tablespace, along with their corresponding LOB segments, with the min, max and remaining 2021 partitions in DWSTGLMDATA.

The intention was to take the DWSTGLMDATARO tablespace offline and copy the datafiles to S3, then drop the partitions it contains and the tablespace to recover the space.

The reason I'm concerned is that our DBA was looking at whether converting to a transportable tablespace would be a good option, so he ran the check utility, which returned:

ORA-39910: Partitioned Global index DWSTGLM.SYS_IL0007478673C00015$$ in tablespace DWSTGLMDATARO points to partition REQUESTS_MIN of table DWSTGLM.FCT_LM_REQUESTS in tablespace DWSTGLMDATA outside of transportable set.
ORA-39921: Default Partition (Table) Tablespace DWSTGLMDATA for FCT_LM_REQUESTS not contained in transportable set.
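For reference, the check he ran was presumably something along these lines (DBMS_TTS; the same messages show up in the TRANSPORT_SET_VIOLATIONS view):

-- check whether the archive tablespace is self-contained
EXEC DBMS_TTS.TRANSPORT_SET_CHECK('DWSTGLMDATARO', TRUE);

SELECT violations FROM transport_set_violations;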

Regards,
Stuart.

and Connor said...

OK - take a look at this link

https://asktom.oracle.com/pls/apex/asktom.search?tag=use-transportable-tablespace-to-archive-old-data

which I think matches your scenario.

Ping us via comment if you need more details.


Comments

Stuart Smith, August 03, 2021 - 4:45 am UTC

Thanks. I've set up a test scenario with a small number of partitions, a subset of columns and a small volume of data per partition. I had considered moving the partitions to be archived into separate tables, so it's good to have that confirmed. I need to ensure that the exchange carries across the LOB segment and LOB index associated with the data segment.
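The exchange step I'm testing looks roughly like this (a sketch only; 18c's CREATE TABLE ... FOR EXCHANGE builds a table with a matching shape, including the LOB column, and the LOB segment and its index belong to the exchanged segment so they swap with it):

-- build a matching empty table for the exchange
CREATE TABLE dwstglm.archive_requests_201606
  FOR EXCHANGE WITH TABLE dwstglm.fct_lm_requests;

-- swap the populated partition into the archive table
ALTER TABLE dwstglm.fct_lm_requests
  EXCHANGE PARTITION requests_201606
  WITH TABLE dwstglm.archive_requests_201606
  INCLUDING INDEXES WITHOUT VALIDATION;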

Since the datafiles are on an NFS share rather than ASM, can we simply copy the datafiles rather than using a TTS?

After the data has been archived, do we drop the tables containing the archived data and the tablespace containing it? We will then need to test that we can bring the archive back into the database.

Stuart Smith, August 05, 2021 - 5:06 am UTC

Hi Connor,

Following your advice regarding the need to exchange table partitions with non-partitioned tables before archiving, we have adopted a different approach.

We are now intending to create a set of archive tables, then expdp each one separately. The dumps can then be compressed before archiving, so they are a lot smaller.
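As a rough sketch, each export can be driven from the expdp command line or from the PL/SQL Data Pump API; for example (ARCHIVE_DIR is a hypothetical directory object):

DECLARE
  h     NUMBER;
  state VARCHAR2(30);
BEGIN
  -- one table-mode export job per archive table
  h := DBMS_DATAPUMP.OPEN(operation => 'EXPORT', job_mode => 'TABLE');
  DBMS_DATAPUMP.ADD_FILE(h, 'archive_requests_201606.dmp', 'ARCHIVE_DIR');
  DBMS_DATAPUMP.METADATA_FILTER(h, 'SCHEMA_EXPR', '= ''DWSTGLM''');
  DBMS_DATAPUMP.METADATA_FILTER(h, 'NAME_EXPR', '= ''ARCHIVE_REQUESTS_201606''');
  DBMS_DATAPUMP.START_JOB(h);
  DBMS_DATAPUMP.WAIT_FOR_JOB(h, state);
END;
/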

To bring the data back, we would import the archive table, insert into the main table, and then compress the partition segment. This has been tested and works fine; the LOB segment and LOB index are created as expected.
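The restore path I tested looks roughly like this (after the archive table has been re-imported with impdp; illustrative names):

-- repopulate the empty partition from the imported archive table
INSERT /*+ APPEND */ INTO dwstglm.fct_lm_requests
SELECT * FROM dwstglm.archive_requests_201606;

COMMIT;

-- recompress the repopulated partition in the archive tablespace
ALTER TABLE dwstglm.fct_lm_requests
  MOVE PARTITION requests_201606
  TABLESPACE dwstglmdataro COMPRESS;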

Among many other benefits, this approach means that the archived partitions remain defined in the data dictionary but use no space, so no DDL is required on the main table. The partitions just sit there empty until we want to bring back archived data.

Thanks again for your advice.
Connor McDonald
August 13, 2021 - 7:01 am UTC

Check out hybrid partitioned tables as well.

You might be able to read your Data Pump exports directly.
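A simplified sketch (hybrid partitioned tables arrived in 19c; the column list is trimmed, and restrictions such as LOB column support and the external dump file format would need checking against the documentation):

CREATE TABLE dwstglm.fct_lm_requests_hybrid (
  fct_lm_requests_key NUMBER,
  timestamp_month     VARCHAR2(4000 CHAR),
  user_id             VARCHAR2(4000 CHAR)
)
EXTERNAL PARTITION ATTRIBUTES (
  TYPE ORACLE_DATAPUMP
  DEFAULT DIRECTORY archive_dir
)
PARTITION BY RANGE (timestamp_month) (
  -- archived months live in external dump files
  PARTITION requests_201606 VALUES LESS THAN ('2016-07')
    EXTERNAL LOCATION ('requests_201606.dmp'),
  -- current months stay as ordinary internal partitions
  PARTITION requests_max VALUES LESS THAN (MAXVALUE)
);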

Stuart Smith, August 13, 2021 - 9:04 am UTC

Hi Connor, thanks, I will check it out. In terms of accessing archived data directly, I think the latency between S3 storage and the on-prem database would preclude that option. Personally, I would fork-lift our data warehouse into the Oracle cloud, but that is an ongoing conversation.
Cheers, Stuart.
Connor McDonald
August 16, 2021 - 4:47 am UTC

ADW is pretty sweet for warehouse requirements. Lots of goodies out of the box.

Stuart Smith, August 23, 2021 - 8:31 am UTC

Is there any way to use the partition exchange functionality in reverse, i.e. exchange a table with a table partition? Exchanging partitions to be archived into tables, exporting them and gzipping the dump files is working well; the result is about a third of the size of the table partitions, even though the table partitions are compressed. I'm just wondering about the best way to get the archived data back into the partitioned table. I've tested importing the archive table and then repopulating the partition using insert-as-select, but I'm interested to know if there's a better way.
Connor McDonald
August 26, 2021 - 1:55 am UTC

Partition exchange is bi-directional, i.e. you can create an empty partition and then exchange a populated table with it.

But of course, this still means that table needs to be *loaded*, which means Data Pump or similar.

Stuart Smith, August 26, 2021 - 2:35 am UTC

Thanks Connor, the initial exchange into the archive table leaves the partitions empty. I just haven't seen the reverse syntax anywhere.
Going out is:
alter table dwstglm.fct_lm_requests exchange partition requests_201606 with table dwstglm.archive_requests_201606;

Would I be correct in thinking this is to put it back in?
alter table dwstglm.archive_requests_201606 exchange with table dwstglm.fct_lm_requests exchange partition requests_201606;
Connor McDonald
August 26, 2021 - 4:06 am UTC

The syntax is the same in both cases because it is a swap.

We don't leave the partition "empty"; we do a swap, e.g.

case 1)

before:
table empty
partition 200 rows

after:
table 200 rows
partition empty

case 2)

before:
table 100 rows
partition 200 rows

after:
table 200 rows
partition 100 rows

Hope that makes sense.
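So, once the archive table has been re-imported, the same statement you used to archive the partition also brings the data back:

-- same syntax in both directions; the segments are simply swapped
alter table dwstglm.fct_lm_requests
  exchange partition requests_201606
  with table dwstglm.archive_requests_201606;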

Stuart Smith, August 26, 2021 - 4:44 am UTC

Thanks again Connor, that makes perfect sense and works well. Much better than insert-select from the table. I know that to bring the data back we would need to import the dump, but the space saving is substantial: the gzipped dump file is about 33% of the size of the compressed database table. The import even brings the archive table back compressed, so it's gunzip, import, exchange and done.
