Compressing new partitions of existing very large tables
Question and Answer

Connor McDonald

Thanks for the question, Berton.

Asked: August 10, 2020 - 2:01 pm UTC

Last updated: January 28, 2021 - 9:47 am UTC

Version: 12.1

Viewed 1000+ times

You Asked

We have quite a few very large tables that are interval partitioned (one partition per day). At any given point we store 120 days' worth of data, i.e. we do daily partition maintenance and move data older than 120 days to archival systems or purge it.

My question is: is it possible to compress new partitions as they are added to the source table, or do we need to create a new table and move the data?

Here is an example of what the source table looks like:

create table my_very_large_table
(
  id                        number(20) not null,
  processedxml              clob,
  updatedon                 date default trunc( current_date )
)
lob ( processedxml ) store as securefile 
( 
   tablespace my_tablespace
   enable storage in row chunk 8192
)
partition by range( updatedon )
interval( numtodsinterval( 1, 'day' ))
(
   partition p_default values less than ( to_date( '01/01/2020', 'mm/dd/yyyy' ))
)
tablespace my_tablespace;


We have 120 days' worth of data in this table, and several hundred thousand rows are added every day. Can I alter the partition attributes so that new partitions are created compressed?


Thanks

BC, MT MI

and Connor said...

You can alter the *default* settings for a table (applied to new partitions), e.g.

alter table my_partitioned_table modify default attributes compress for all operations;
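As a sketch of how this plays out against the table from the question (partition and table names come from the original DDL; on 12c, `ROW STORE COMPRESS ADVANCED` is the current spelling of the older `COMPRESS FOR ALL OPERATIONS` syntax — verify the clause against your version's documentation):

```sql
-- Change the table defaults so future interval partitions are created compressed
alter table my_very_large_table
  modify default attributes row store compress advanced;

-- Existing partitions are unaffected; compress them individually if needed
alter table my_very_large_table
  move partition p_default row store compress advanced update indexes;

-- Verify: newly created partitions should show COMPRESSION = 'ENABLED'
select partition_name, compression, compress_for
from   user_tab_partitions
where  table_name = 'MY_VERY_LARGE_TABLE'
order  by partition_position;
```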




Comments

compress on lobs

Rajeshwaran, Jeyabal, August 13, 2020 - 5:21 am UTC

Given this very skinny table of three columns, we suspect the bulk of the data will be in the LOB segments.

So it is worth benchmarking by compressing the LOB segments and measuring the benefits.

Since you are using SecureFiles, deduplication is also possible; please check that too.
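A hedged sketch of what that benchmark might look like, using the names from the original question (the compression level is just one setting to try, not a recommendation):

```sql
-- Compress and deduplicate the SecureFile LOB for one partition,
-- then compare segment sizes before and after.
alter table my_very_large_table
  move partition p_default
  lob ( processedxml ) store as securefile
  ( compress medium deduplicate )
  update indexes;

-- Measure the LOB segment size afterwards
select s.partition_name, s.bytes/1024/1024 as size_mb
from   user_segments s
where  s.segment_name = ( select segment_name
                          from   user_lobs
                          where  table_name = 'MY_VERY_LARGE_TABLE' );
```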

sort the data and load to get better compression

Paul, November 03, 2020 - 6:04 am UTC

This is Oracle 12.2 on Exadata.

We have a huge partitioned table in which one of the columns, "JSON", is defined as extended VARCHAR2(32767). We want to compress the older partitions to reduce space usage and accommodate future growth.

Questions:

1) What type of compression is suitable for JSON data stored in an extended VARCHAR2(32K) column?

2) To get the best compression ratio, how do we identify the columns with few distinct values, then sort on them and load the data?

One other suggestion is to create the exchange table with the appropriate compression, from <partition> order by <column with few distinct values>.


Thanks!
Connor McDonald
November 04, 2020 - 3:26 am UTC

You are going to be a little stuck here because varchar2(32k) sits "between" the typical compression mechanisms.

- For 'normal' columns, we look for repeated values to compress
- For 'clobs' we recommend securefile compression

You can see this with a simple demo


SQL>
SQL> create table t ( x varchar2(32767));

Table created.

SQL>
SQL> variable x varchar2(32000)
SQL> exec :x := '{"json":"data"}';

PL/SQL procedure successfully completed.

SQL> exec begin for i in 1 .. 6 loop :x := :x || :x || :x; end loop; end;

PL/SQL procedure successfully completed.

SQL> exec :x := :x || :x;

PL/SQL procedure successfully completed.

SQL> select length(:x) from dual;

LENGTH(:X)
----------
     21870

SQL> insert into t select :x
  2  from dual
  3  connect by level <= 10000;

10000 rows created.

SQL>
SQL> commit;

Commit complete.

SQL>
SQL> exec dbms_stats.gather_table_stats('','T');

PL/SQL procedure successfully completed.

SQL>
SQL> select blocks from user_tables
  2  where table_name = 'T';

    BLOCKS
----------
        88

SQL>
SQL> select bytes from user_segments
  2  where segment_name = (
  3    select segment_name from user_lobs
  4    where  table_name = 'T' );

     BYTES
----------
 285409280

SQL>
SQL> alter table t move row store compress advanced;

Table altered.

SQL>
SQL> exec dbms_stats.gather_table_stats('','T');

PL/SQL procedure successfully completed.

SQL>
SQL> select blocks from user_tables
  2  where table_name = 'T';

    BLOCKS
----------
        68

SQL>
SQL> select bytes from user_segments
  2  where segment_name = (
  3    select segment_name from user_lobs
  4    where  table_name = 'T' );

     BYTES
----------
 251789312


So we don't get a lot of benefit from compression. Internally the varchar2(32767) is likely stored as a SecureFile, but because it's not a "genuine" LOB column, you don't get access to CLOB compression:

SQL> alter table t move lob ( x ) store as securefile ( compress high );
alter table t move lob ( x ) store as securefile ( compress high )
                         *
ERROR at line 1:
ORA-14692: STORE AS clause is not allowed for extended character type column


SQL>


My recommendation would be to look at using a clob/blob and taking advantage of securefile compression.

To find columns with few distinct values, look at NUM_DISTINCT in USER_TAB_COLUMNS; there is also the Compression Advisor to gauge the benefit you might get.

https://asktom.oracle.com/pls/apex/asktom.search?tag=oracle-advanced-compression
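As a sketch of both suggestions (table name hypothetical): the NUM_DISTINCT query is straightforward, and the Compression Advisor (`DBMS_COMPRESSION.GET_COMPRESSION_RATIO`) can estimate the ratio before you commit to moving a large segment — check the exact signature against your version's documentation:

```sql
-- 1) Columns with few distinct values (candidates to sort by before loading).
--    Assumes statistics are current on the hypothetical MY_BIG_TABLE.
select column_name, num_distinct
from   user_tab_columns
where  table_name = 'MY_BIG_TABLE'
order  by num_distinct;

-- 2) Estimate the advanced-compression ratio without moving the data.
declare
  l_blkcnt_cmp   pls_integer;
  l_blkcnt_uncmp pls_integer;
  l_row_cmp      pls_integer;
  l_row_uncmp    pls_integer;
  l_cmp_ratio    number;
  l_comptype_str varchar2(100);
begin
  dbms_compression.get_compression_ratio(
    scratchtbsname => 'USERS',
    ownname        => user,
    objname        => 'MY_BIG_TABLE',
    subobjname     => null,
    comptype       => dbms_compression.comp_advanced,
    blkcnt_cmp     => l_blkcnt_cmp,
    blkcnt_uncmp   => l_blkcnt_uncmp,
    row_cmp        => l_row_cmp,
    row_uncmp      => l_row_uncmp,
    cmp_ratio      => l_cmp_ratio,
    comptype_str   => l_comptype_str );
  dbms_output.put_line('Estimated compression ratio: ' || l_cmp_ratio);
end;
/
```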

compression

Paul, November 05, 2020 - 7:47 am UTC

Hi Connor,

Thanks.

It looks like the size of the JSON data stored in the table is much less than 4K, hence the entire JSON document is stored in-row within the table segment itself. This is a partitioned table.

SQL> select /*+ PARALLEL(a,8) */ min(length(json)), max(length(json))
  2  from MKT.V_QUOTE_JSON partition (V_QUOTE_JSON_200501) a;

MIN(LENGTH(JSON)) MAX(LENGTH(JSON))
----------------- -----------------
              232              1102

SQL> select table_owner, table_name, column_name, partition_name, lob_partition_name,
  2         in_row, securefile, segment_created, tablespace_name
  3  from dba_lob_partitions
  4  where table_owner = 'MKT' and table_name = 'V_QUOTE_JSON'
  5  and column_name = 'JSON' and partition_name = 'V_QUOTE_JSON_200501';

TABLE_OWNER        : MKT
TABLE_NAME         : V_QUOTE_JSON
COLUMN_NAME        : JSON
PARTITION_NAME     : V_QUOTE_JSON_200501
LOB_PARTITION_NAME : SYS_LOB_P55584
IN_ROW             : YES
SECUREFILE         : YES
SEGMENT_CREATED    : YES
TABLESPACE_NAME    : MKT_DATA

SQL> select owner, segment_name, partition_name, segment_type, bytes/1024/1024 size_mb
  2  from dba_segments
  3  where owner = 'FMC' and partition_name = 'SYS_LOB_P55584';

OWNER          : MKT
SEGMENT_NAME   : SYS_LOB0000487570C00008$$
PARTITION_NAME : SYS_LOB_P55584
SEGMENT_TYPE   : LOB PARTITION
SIZE_MB        : .125


Original uncompressed partition segment size: 71.19 GB

With Advanced compression - it compressed to 55.82GB

With Query High compression - it came down to 7.12 GB


SQL> alter table MKT.V_QUOTE_JSON move partition V_QUOTE_JSON_200501
  2  compress for query high parallel 8 update indexes;

Table altered.

In this case I can compress it with QUERY HIGH compression, and for historical partitions I can go with an even higher compression level to reduce space usage.

Of course, I have to find out from the developers why they are using extended VARCHAR2(32767) when the data fits in VARCHAR2(4000).

Thanks.

Connor McDonald
November 05, 2020 - 7:48 am UTC

Thanks for getting back to us.

Convert JSON column from VARCHAR2(32767) to VARCHAR2(4000)

Paul, January 26, 2021 - 8:27 am UTC

I have a JSON table with a JSON column defined as extended VARCHAR2(32767); it created a LOB segment. We are reducing the column size to VARCHAR2(4000) as the data will not grow beyond 4K. The table is truncated.

After reducing the size to VARCHAR2(4000), the LOB segment is still not going away.

How do I get rid of the LOB segment?

Thanks

Connor McDonald
January 28, 2021 - 9:19 am UTC

If the table is empty, then you can drop the column and re-add it:


SQL>
SQL> create table t ( x int, y varchar2(32767));

Table created.

SQL>
SQL> insert into t values (1,rpad('x',32000,'x'));

1 row created.

SQL> select * from user_lobs
  2  where table_name = 'T'
  3  @pr
==============================
TABLE_NAME                    : T
COLUMN_NAME                   : Y
SEGMENT_NAME                  : SYS_LOB0000127817C00002$$
TABLESPACE_NAME               : USERS
INDEX_NAME                    : SYS_IL0000127817C00002$$
CHUNK                         : 8192
PCTVERSION                    :
RETENTION                     :
FREEPOOLS                     :
CACHE                         : YES
LOGGING                       : YES
ENCRYPT                       : NO
COMPRESSION                   : NO
DEDUPLICATION                 : NO
IN_ROW                        : YES
FORMAT                        : ENDIAN NEUTRAL
PARTITIONED                   : NO
SECUREFILE                    : YES
SEGMENT_CREATED               : YES
RETENTION_TYPE                : DEFAULT
RETENTION_VALUE               :

PL/SQL procedure successfully completed.

SQL>
SQL> truncate table t;

Table truncated.

SQL> select * from user_lobs
  2  where table_name = 'T'
  3  @pr
==============================
TABLE_NAME                    : T
COLUMN_NAME                   : Y
SEGMENT_NAME                  : SYS_LOB0000127817C00002$$
TABLESPACE_NAME               : USERS
INDEX_NAME                    : SYS_IL0000127817C00002$$
CHUNK                         : 8192
PCTVERSION                    :
RETENTION                     :
FREEPOOLS                     :
CACHE                         : YES
LOGGING                       : YES
ENCRYPT                       : NO
COMPRESSION                   : NO
DEDUPLICATION                 : NO
IN_ROW                        : YES
FORMAT                        : ENDIAN NEUTRAL
PARTITIONED                   : NO
SECUREFILE                    : YES
SEGMENT_CREATED               : YES
RETENTION_TYPE                : DEFAULT
RETENTION_VALUE               :

PL/SQL procedure successfully completed.

SQL>
SQL> alter table t modify y varchar2(1000);

Table altered.

SQL>
SQL> select * from user_lobs
  2  where table_name = 'T'
  3  @pr
==============================
TABLE_NAME                    : T
COLUMN_NAME                   : Y
SEGMENT_NAME                  : SYS_LOB0000127817C00002$$
TABLESPACE_NAME               : USERS
INDEX_NAME                    : SYS_IL0000127817C00002$$
CHUNK                         : 8192
PCTVERSION                    :
RETENTION                     :
FREEPOOLS                     :
CACHE                         : YES
LOGGING                       : YES
ENCRYPT                       : NO
COMPRESSION                   : NO
DEDUPLICATION                 : NO
IN_ROW                        : YES
FORMAT                        : ENDIAN NEUTRAL
PARTITIONED                   : NO
SECUREFILE                    : YES
SEGMENT_CREATED               : YES
RETENTION_TYPE                : DEFAULT
RETENTION_VALUE               :

PL/SQL procedure successfully completed.

SQL>
SQL> alter table t drop column y;

Table altered.

SQL> alter table t add y varchar2(1000);

Table altered.

SQL>
SQL> select * from user_lobs
  2  where table_name = 'T';

no rows selected

SQL>
SQL>
SQL>


Paul, January 26, 2021 - 9:04 am UTC


Also, if I try to increase the column size from VARCHAR2(4000) to VARCHAR2(5000), it does not allow me to, due to a virtual column being present. See below.

How do I work around this? I am unable to drop the extended stats on the virtual column, as it says they are not droppable.



SQL> alter table ESPT.DLY_PRC_JSON_TEST modify JSON VARCHAR2(5000) ;
alter table ESPT.DLY_PRC_JSON_TEST modify JSON VARCHAR2(5000)
*
ERROR at line 1:
ORA-54031: column to be dropped or modified is used in a virtual column
expression


SQL> select owner, table_name, column_name, data_default, hidden_column, virtual_column
  2  from dba_tab_cols
  3  where table_name = 'DLY_PRC_JSON_TEST' and hidden_column = 'YES';

OWNER          : ESPT
TABLE_NAME     : DLY_PRC_JSON_TEST
COLUMN_NAME    : SYS_IME_OSON_000100213A52BE67
DATA_DEFAULT   : OSON("JSON" FORMAT JSON , 'ime' RETURNING RAW(32767) NULL ON ERROR)
HIDDEN_COLUMN  : YES
VIRTUAL_COLUMN : YES

SQL> select owner, table_name, extension_name, extension, droppable
  2  from dba_stat_extensions
  3  where table_name = 'DLY_PRC_JSON_TEST'
  4  and extension_name = 'SYS_IME_OSON_000100213A52BE67';

OWNER          : ESPT
TABLE_NAME     : DLY_PRC_JSON_TEST
EXTENSION_NAME : SYS_IME_OSON_000100213A52BE67
EXTENSION      : (OSON("JSON" FORMAT JSON , 'ime' RETURNING RAW(32767) NULL ON ERROR))
DROPPABLE      : NO
SQL>
SQL> alter table ESPT.DLY_PRC_JSON_TEST modify JSON VARCHAR2(5000) ;
alter table ESPT.DLY_PRC_JSON_TEST modify JSON VARCHAR2(5000)
*
ERROR at line 1:
ORA-54031: column to be dropped or modified is used in a virtual column expression


Connor McDonald
January 28, 2021 - 9:47 am UTC

Can we see the full DDL for the table?
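One avenue that may be worth checking while gathering the DDL (an assumption, not a confirmed diagnosis): the `SYS_IME_OSON_...` hidden column looks like one captured by Database In-Memory Expressions rather than a user-created virtual column, in which case dropping the captured expression may remove the blocker:

```sql
-- Assumption: the blocking SYS_IME_* column was created by In-Memory
-- Expression capture. Verify the DBMS_INMEMORY procedure and its
-- parameters against your version's documentation before running.
begin
  dbms_inmemory.ime_drop_expressions(
    schema_name     => 'ESPT',
    table_name      => 'DLY_PRC_JSON_TEST',
    ime_column_name => 'SYS_IME_OSON_000100213A52BE67' );
end;
/

-- Then retry the column change
alter table ESPT.DLY_PRC_JSON_TEST modify JSON varchar2(5000);
```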
