Algorithm for compressed table or index

Question and Answer

Tom Kyte

Thanks for the question, A Reader.

Asked: June 12, 2005 - 2:29 pm UTC

Last updated: November 14, 2005 - 1:26 pm UTC

Version: 10.1.0.2

Viewed 1000+ times

You Asked

I've run into someone who says he knows exactly how the compress table algorithm works. He says it will only compress, on a per-block level, any column value that duplicates another column's value. For example,
if there was a value for NAME that was "Pete" and 3 rows later there was a NAME that said "Pete Sampras", it would not be compressed; but if there was a LASTNAME of "Pete", it would be (or a NAME, for that matter). And this is only done on a per-block level because the goal here is speed: to pass a rather simple compression algorithm to the CPU and speed things up, not to zip up the whole table. Is he 100% correct? In the ballpark? He also says that indexes can use the row as a whole to compress, or a combination of more than one column, while tables cannot. This he wasn't sure about. Can you clarify?

and Tom said...

he was right up to the very end (assuming you relayed that last part correctly).

segment compression (compressed tables) works at the block level and only works for bulk operations like alter table move, create table as select, insert /*+ append */, sqlldr direct=true...

It works by streaming the data onto a block and, while doing that, looking for repetitive data. It factors out the repeating data it finds and stores it in a symbol table on that block. It replaces the string with an index into this symbol table in the data on the block itself. That way, each block is a self-contained "thing"; you only need that block to decompress it.
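To make that concrete, here is a minimal sketch (assuming access to ALL_OBJECTS and a scratch schema; the table names are illustrative): build the same data with and without COMPRESS and compare segment sizes.

-- a sketch: compare block counts for an uncompressed vs. a compressed copy
create table t_nocomp          as select * from all_objects;
create table t_comp   compress as select * from all_objects;

select segment_name, blocks
  from user_segments
 where segment_name in ('T_NOCOMP', 'T_COMP');

The compressed copy should show noticeably fewer blocks, since repeated values are stored once per block in the symbol table.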

Now for the last comment -

..
He also says that indexes can use the row as a whole to compress, or combination of more than one column to compress, while tables cannot.
.....

that is sort of backwards - the TABLE can use the entire row and all of the rows on the block to compress.

The INDEX compression works by removing redundant leading key entries. If you have a copy of ALL_OBJECTS, a compressed index on:

allobjects(owner,object_type,object_name)

would compress much better than an index on

allobjects(object_name,object_type,owner)


because owner/object_type repeat a lot (SCOTT owns lots of tables, so scott/table repeats over and over), whereas object_name is almost unique -- no repetition. Index key compression works by factoring out repeats on the leading edge of just the indexed columns.
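A quick way to see that effect (a sketch; the table and index names are illustrative): build both column orderings with COMPRESS 2 and compare segment sizes.

create table t as select * from all_objects;

-- leading columns repeat heavily: lots to factor out
create index t_idx_good on t(owner, object_type, object_name) compress 2;

-- leading column is nearly unique: almost nothing to factor out
create index t_idx_bad on t(object_name, object_type, owner) compress 2;

select segment_name, blocks
  from user_segments
 where segment_name in ('T_IDX_GOOD', 'T_IDX_BAD');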


Comments

Is array insert a bulk operation?

Lars Stampe Villadsen, June 12, 2005 - 3:45 pm UTC

You state that compression only works for bulk operations. We are using array inserts from OCI in our application. Do these fall into the bulk group?

Thanks,

Tom Kyte
June 12, 2005 - 10:59 pm UTC

I'll clarify: direct-path operations, which do not use the buffer cache.


alter table t move
insert /*+ APPEND */
create table as select
sqlldr direct

but not a conventional path insert (e.g., imp uses array binds, but imp cannot import "compressed" data)

So back to the table

A Reader, June 12, 2005 - 3:58 pm UTC

So I have a three-part followup -- part of which may be an RTFM question, I do realize.

So if I have a table that is 
create table testcomp (a number, firstname varchar2(20),
lastname varchar2(20),city varchar2(20),state varchar2(2),anothercolumn varchar2(20))

 insert into testcomp values (1,'Peter','Smith','A City','CT','Peters');


 insert into testcomp values (2,'Todd','Smith','A City','CT','Peters');

Can I assume in the compressed version that the symbol table will contain ONE entry for "Smith%A City%CT%Peters"
or will it contain 4 entries, 
1) "Smith"
2) "A City"
3) "CT"
4) "Peters"
or will it contain 4 entries like
1) "Peter"
2) "Smith"
3) "A City"
4) "CT"
and the row containing Peters will be a table lookup concatenated with "s". 

I assume version 1. Is this true?

Now if I do the following

analyze table testcomp compute statistics;

select avg_row_len,num_rows from user_tables where table_name = 'TESTCOMP';

AVG_ROW_LEN   NUM_ROWS
----------- ----------
         34          2

SQL> alter table testcomp compress;

Table altered.

SQL>  select avg_row_len,num_rows from user_tables where table_name = 'TESTCOMP';

AVG_ROW_LEN   NUM_ROWS
----------- ----------
         34          2

Or this 

create table testcomp (a number, firstname varchar2(20),
lastname varchar2(20),city varchar2(20),state varchar2(2),anothercolumn varchar2(20))
compress
SQL> /

Table created.

insert into testcomp values (1,'Peter','Smith','A City','CT','Peters');

1 row created.

insert into testcomp values (2,'Todd','Smith','A City','CT','Peters');

1 row created.

analyze table testcomp compute statistics;

Table analyzed.

SQL> select avg_row_len,num_rows from user_tables where table_name = 'TESTCOMP';

AVG_ROW_LEN   NUM_ROWS
----------- ----------
         34          2

Why is it the same?  (That's the RTFM question probably).

Third part: if one of the columns is either a primary or foreign key, does that affect the compression algorithm? It seems it would add work to have to decompress for integrity -- or maybe not.

Thanks,
D.  

Tom Kyte
June 12, 2005 - 11:08 pm UTC

no substrs, just entire fields
    1) "Smith"
    2) "A City"
    3) "CT"
    4) "Peters"



SQL> alter table testcomp compress;
Table altered.

that just sets an attribute for the next direct-path operation. it doesn't touch the existing data AT ALL.



In order to have the data compressed, you would have to "move" it.  A conventional path insert will NOT be compressed.


ops$tkyte@ORA9IR2> create table testcomp (a number, firstname varchar2(20),
  2  lastname varchar2(20),city varchar2(20),state varchar2(2),anothercolumn
  3  varchar2(20) )
  4  compress
  5  /
 
Table created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> insert into testcomp
  2  select rownum, 'nm'||rownum, 'Smith', 'A City', 'CT', 'Peters'
  3    from all_objects;
 
30686 rows created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> exec show_space( 'TESTCOMP' )
Unformatted Blocks .....................               0
FS1 Blocks (0-25)  .....................               3
FS2 Blocks (25-50) .....................               0
FS3 Blocks (50-75) .....................               0
FS4 Blocks (75-100).....................              26
Full Blocks        .....................             151
Total Blocks............................             256
Total Bytes.............................       2,097,152
Total MBytes............................               2
Unused Blocks...........................              64
Unused Bytes............................         524,288
Last Used Ext FileId....................               9
Last Used Ext BlockId...................           5,512
Last Used Block.........................              64
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> alter table testcomp move;
 
Table altered.
 
ops$tkyte@ORA9IR2> exec show_space( 'TESTCOMP' )
Unformatted Blocks .....................               0
FS1 Blocks (0-25)  .....................               0
FS2 Blocks (25-50) .....................               0
FS3 Blocks (50-75) .....................               0
FS4 Blocks (75-100).....................               0
Full Blocks        .....................              75
Total Blocks............................              88
Total Bytes.............................         720,896
Total MBytes............................               0
Unused Blocks...........................               5
Unused Bytes............................          40,960
Last Used Ext FileId....................               9
Last Used Ext BlockId...................           5,704
Last Used Block.........................               3
 
PL/SQL procedure successfully completed.


the 'insert' just inserted (uncompressed), the alter table move compressed, and an insert /*+ append */ would have compressed as well.

Also, don't expect the row length to change; it'll be what it is.


Data on the block is cached "COMPRESSED"; resolving things via a symbol table "isn't hard".

Back to the table Part II

A Reader, June 12, 2005 - 4:00 pm UTC

Actually, it would be the 4-entry version (example 2), not example 1, correct? He did say it was on a column-by-column basis, not a combination of columns, but then you said he had reversed the index and table concepts.

Tom Kyte
June 12, 2005 - 11:09 pm UTC

you just lost me there. not sure what you mean.

Skip the scrambled reply

A reader, June 13, 2005 - 1:13 am UTC

No.. I understand what you mean by one entry per column value. Ignore the quick and confused followup. What I was saying was that there wouldn't be just one entry for several concatenated columns; there would be an entry for each column value. The index can compress the leading part if it is repetitive. If, let's say, the table had a repetitive leading part, it would have one entry per column of repeated data, not one for the whole repeated entry, right? Hope that makes sense.

Tom Kyte
June 13, 2005 - 10:45 am UTC

indexes have "leading edges"

tables just have "blocks"


when compressing an index, it looks for repeating columns on the leading edge. That index on allobjects(owner,object_type,object_name) would store on the leaf blocks something like this:


+----------------------------------------------+
| owner=SCOTT, object_type=TABLE, EMP, DEPT,   |
| BONUS, T1, T2, .... TN                       |
+----------------------------------------------+

instead of:

+----------------------------------------------+
| SCOTT/TABLE/EMP    SCOTT/TABLE/DEPT          |
| SCOTT/TABLE/BONUS  .....                     |
+----------------------------------------------+


with a table, the "order" would not matter -- it considers all columns in the table (normal columns, not "large objects") and factors out repeating data IN the row and ACROSS rows. All of the data going onto the block is considered.


A reader, June 13, 2005 - 8:43 pm UTC

Tom,

According to one of our DB2 group folks, data is actually compressed (like gzip) if the compress option is used for a table.

I am not sure how true that is.

Thanks.

Tom Kyte
June 13, 2005 - 9:08 pm UTC

I'm doubtful unless the table is read-only. But they should be able to point you to documentation, don't you think?

(and importantly -- why does it matter? if we can compress 2:1 or 3:1, do you really care whether we used Huffman bit compression or an RLE scheme or whatever?)

RE : Table Compression

A reader, July 12, 2005 - 9:41 pm UTC

Hi Tom,

I attended the NYOUG meeting on July 12 at the Oracle office where you mentioned table compression. I searched through Expert One-on-One Oracle but could not find any reference to it in that book.
a. From which version of Oracle is table and index compression available?
b. What are the activities that do not affect table and index compression?
c. Will normal DML such as insert/update/delete have a performance impact with table and index compression?
d. What are the scenarios under which this is efficient?

Your feedback/example in this regard will be highly helpful

Tom Kyte
July 13, 2005 - 10:59 am UTC

Expert One-on-One Oracle covers 8i Release 3 and before -- segment space compression is a 9iR2 and above feature.

index compression -- 8i (and in Expert One-on-One)
segment space compression -- 9iR2 (and in Effective Oracle by Design)

here is a short excerpt from effective Oracle by design

<quote>
Use Table Compression for Read-Only/Read-Mostly Tables
The ability to compress a table started in Oracle8i with IOTs. Since an IOT is really a special kind of index—one that supports index key compression—we could compress it using that technique.
True table-level compression for more general table types is new with Oracle9i Release 2. It is similar to index key compression in that it factors out repetitive information, but it differs in many ways. Table compression works on the aggregate-block level, not row by row as index key compression does; that is, when Oracle is building a database block, it is looking for repeating values across all columns and rows on that block. The values need not be on the leading edge of the row; they can be anywhere in the row.
The major difference between table and index compression is when each may be used. Index key compression works equally well on systems where the table is being modified frequently and on read-only or read-mostly systems. The index is maintained in its compressed state in both environments. However, table compression works well only in a read-only or read-mostly environment. It is not a feature you will use on your active transaction tables (although you can use it in your transactional systems, as you’ll see). Table compression works only for bulk operations as well. A normal INSERT or UPDATE statement will not compress data. Instead, you need to use one of the following:
· CREATE TABLE AS SELECT
· INSERT /*+ APPEND */ (direct-path insert)
· SQLLDR direct=y (direct-path load)
· ALTER TABLE MOVE
Compressing a table does not prohibit you from using normal DML against it. It is just that the newly added or modified information will not be stored compressed. It would take a rebuild of that segment in order to compress it. So, table compression is most likely to be valuable in these situations:
· Large amounts of static reference information that is read-only or read-mostly
· Data warehousing environments where bulk operations are common
· Audit trail information stored in partitioned tables, where you can compress last month’s auditing information at the beginning of a new month
Table compression should not be considered for most transactional tables where the data is heavily updated. There, table compression would be defeated, as each update would tend to “decompress” the row(s).
Table compression in Oracle is achieved by factoring out repeating data found on a database block and creating a symbol table
</quote>
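For the audit-trail case in that excerpt, compressing a closed-out partition is a one-statement rebuild. A sketch, assuming a range-partitioned table AUDIT_TRAIL with a monthly partition P_2005_06 and a local index AUDIT_TRAIL_IDX (all names hypothetical):

-- rebuild just last month's partition in compressed form
alter table audit_trail move partition p_2005_06 compress;

-- the move marks that local index partition UNUSABLE; rebuild it
alter index audit_trail_idx rebuild partition p_2005_06;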

Pls. clarify

A reader, July 13, 2005 - 4:55 pm UTC

Tom,

This is regarding the example you provided for accessing documents for a given user, where u described the physics of the data layout.
1. You mention that in the case of an IOT the data will be sorted and data for the same key value should be close together, thus reducing the I/O. Since an IOT is based on the PK, how can an IOT allow duplicates? i.e., how can multiple documents exist for a given user_id in that example? Please clarify my understanding.
2. You mention that for an IOT the data will be sorted and the blocks will be in close proximity, whereas for a cluster the data need not be sorted but rows with the same keys will be in the same block. By this, do u refer to the data blocks as laid out in the datafile on disk? i.e., in the case of an IOT, the data blocks will be next to each other in the database file on disk, whereas for a cluster, the data blocks may not be next to each other in the database file on disk, but records with the same keys will be put in the same blocks. Please confirm my understanding.


Tom Kyte
July 13, 2005 - 5:04 pm UTC

"u"?

1) Document names must be unique within a username in my example; the primary key was

(username, document_name)

if that was not the primary key, then there must have been a third attribute added to make it unique (perhaps a folder name).


2) I was saying the "rows" will be in close proximity. The blocks on disk might be miles apart; that is not relevant. What is relevant is that my 100 documents are stored near each other on 3 or 4 blocks -- NOT on 100 separate blocks.
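A sketch of the IOT from that example (column names and sizes are assumptions), showing the two-column primary key that makes document names unique per user:

create table documents
( username      varchar2(30),
  document_name varchar2(100),
  contents      varchar2(4000),
  constraint documents_pk primary key (username, document_name)
)
organization index;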

How is Hash Cluster different

A reader, July 13, 2005 - 5:30 pm UTC

Tom,

In the above discussion, you mentioned that the blocks in the file may be "miles apart". So my questions are: How does an "IOT" differ from "table clusters", and what do u mean by saying that in the case of "clusters" the data will not be sorted but will be stored in the same database blocks, while in the case of an IOT the data will be sorted and stored in close proximity? I guess I am not understanding the term "rows are close to each other" vs. "the actual blocks in the file".

An example will be very helpful

Tom Kyte
July 14, 2005 - 9:59 am UTC

"u"?


An IOT is an index.

A hash cluster is a hash table.

An IOT will store things "sorted"
A hash cluster is great to access something by key.

An IOT can range scan. (select * from iot where pk > :x)
A hash cluster cannot (using the hash key).

Say you have a zip code to state table. Finite size -- you know about how big it is. You can size it. You FREQUENTLY query "select * from zip2state where zip = :x". That is a great example for a hash cluster: 1 IO to turn that zip into its related information.


Now, say you have stock quotes coming in. You have to keep an indefinite amount of them (you do not know how big this will get; they might say, 3 years from now, THAT IS ENOUGH, and start aging some out -- or not). You cannot size this nicely as a hash cluster, and besides -- you frequently query:

select ... from stocks where symbol = :x and dt between :y and :z;

well, the key is (symbol,dt). The IOT can range scan to find the records of interest (you have thousands of records per symbol, but you analyze the last 2 or 3 weeks a lot, and the last two or three weeks are together on the same blocks -- very efficient). A hash cluster would have been built on SYMBOL only, so that query would have to look at all stock records to find the 2-3 weeks of interesting data (or use a conventional index to access it).


Do you have Expert One-on-One Oracle or Effective Oracle by Design? I give a couple more conceptual examples in there.

I find it easiest to list out "what do I do to this data", "now, given that a HASH table works like this and an IOT like that (meaning you have to have a good foundation in THOSE facts), which would do best".



do you know the size of your table (approximately at least), and is it a lookup (equality, not range scan)? Sounds like a hash cluster.

Do you not know the size? Or do you need it sorted? Do you need range scans? Sounds more like an IOT.
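Sketches of both structures for the examples above (all sizes and column definitions are assumptions, not a prescription):

-- hash cluster: fixed-size lookup table, accessed by equality on ZIP
create cluster zip_cluster ( zip varchar2(5) )
single table hashkeys 100000 size 64;

create table zip2state
( zip   varchar2(5) primary key,
  state varchar2(2)
)
cluster zip_cluster(zip);

-- IOT: unbounded growth, range scanned by (symbol, dt)
create table stocks
( symbol varchar2(10),
  dt     date,
  price  number,
  constraint stocks_pk primary key (symbol, dt)
)
organization index;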

When can we get Expert 10G

Broke Andy, July 14, 2005 - 6:01 am UTC

Tom,
I am maybe one of two people in the Oracle world who hasn't bought your book yet. But after reading the posts on this site, I have to get it.
When do you expect Expert 10g to be released? Just want to know if I should wait for that one or just run out and grab the first edition (which I believe covers up to 8.1.7). Are the differences just based on the new features, or have you included new ideas and techniques as well?

living on pocket change right now, so can't buy both ;)

Thanks

Tom Kyte
July 14, 2005 - 10:42 am UTC

Right now, July 2005, I would say "wait" -- the first volume will be printed for September 2005, and it'll include a CD of the first edition in its entirety.

The plan going forward for Expert One-on-One is that it will be two volumes; the first volume includes the CD, and each will list at about $45 USD (so they should be in the mid 20's on bookpool and related sites).

First volume is 15 chapters long and is loosely the first nine chapters of the original.



Thank you Tom.

Andy, July 14, 2005 - 9:16 pm UTC


Pls help in clarifying this

A reader, July 27, 2005 - 8:12 pm UTC

Tom,

In your new book 'Effective Oracle By Design' you discussed 9iR2 segment compression.
1. My understanding of segment compression is that it will be useful in cases where the data itself repeats (i.e., a smaller number of distinct values). Is that understanding correct?
2. In the discussion on pg. 436, the compression ratio was higher when the data was sorted by the TIMESTAMP column (since it had few distinct values and avg_col_len = 19). But my question is: would it have compressed even better if it had been sorted on OWNER and then TIMESTAMP? Please clarify.
3. Also, on page 438, you state that maximum compression would be achieved by sorting on OBJECT_NAME. Would it not be achieved by sorting on OWNER (28 distinct values) and then on OBJECT_NAME (17,130 distinct values)? Also, in the following paragraph, you mention that the 'average row length' is very wide. Should it not be 'average column length', or am I missing something? Please also help me understand which columns to choose for sorting.

Tom Kyte
July 27, 2005 - 8:54 pm UTC

1) correct

2) test it out -- it very well MAY HAVE, if timestamp repeated a lot within an owner (which it probably does for SYS). Think about the data.

3) but owner was only 6 bytes; object_name was *big* -- 24 bytes. And we had almost 2 million rows with only 17,000 distinct object_names -- for that data, object_name was "really good to compress" because it had low distinct cardinality and was really wide.

yes, it should have been average column width
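A sketch of the idea (BIG_TABLE is a stand-in for the table being loaded): physically ordering the data by the wide, low-cardinality column during the direct-path load packs the most repeats onto each block.

-- sorting on the wide, repetitive column maximizes repeats per block
create table t_sorted compress as
select * from big_table order by object_name;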

DDL and compressed tables

Ajeet, August 01, 2005 - 10:00 am UTC

Tom,
someone told me that in Oracle 9i there is a patch available which will allow "alter table add column" on a compressed table without uncompressing it. Is this true, or can you point me to a document which gives details of this?

Thanks
Ajeet

Tom Kyte
August 01, 2005 - 10:40 am UTC

latest 9i patch set:

C:\Documents and Settings\tkyte>sqlplus /

SQL*Plus: Release 9.2.0.7.0 - Production on Mon Aug 1 10:30:21 2005

Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.


Connected to:
Oracle9i Enterprise Edition Release 9.2.0.7.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.7.0 - Production

ops$xp9ir2\tkyte@ORA9IR2W> drop user "OPS$XP9IR2/TKYTE";

User dropped.

ops$xp9ir2\tkyte@ORA9IR2W> create table t compress as select * from all_objects;

Table created.

ops$xp9ir2\tkyte@ORA9IR2W> alter table t add x number;
alter table t add x number
                  *
ERROR at line 1:
ORA-22856: cannot add columns to object tables
 

How to find out the space gain by compressing a table?

Pradeep G, September 07, 2005 - 10:40 am UTC

Hi Tom,

Thank you very much for your contributions to DBA community.

I have one question regarding the compressed tables.

I have a few compressed tables in the database. These tables were compressed a year ago.
After that, some data changes (inserts, deletes, etc.) have been performed on these tables.

Now I want to know which tables are good candidates for compressing again.

Basically, I want to know what space gain I can get on each table by compressing it. Is there a way to find this out?


Thanks a lot in advance,
Pradeep
Kerala
India.

Tom Kyte
September 07, 2005 - 1:43 pm UTC

there is no way to know before you actually do it. sorry.
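One pragmatic workaround, since you do have to actually do it to know: compress a sample copy and extrapolate. A sketch (the 10% sample and the names are arbitrary, and the estimate is rough, since block-level repetition varies across the table):

-- compress a 10% sample and compare it to 10% of the original's size
create table t_estimate compress as
select * from big_table sample (10);

select segment_name, blocks
  from user_segments
 where segment_name in ('BIG_TABLE', 'T_ESTIMATE');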

Thanks

Pradeep G, September 08, 2005 - 2:51 am UTC

Thanks Tom for the response.

difference in space used

Ajeet, November 09, 2005 - 5:39 am UTC

Please allow me to ask this question here -- it is not 100% related to this thread, but it is somewhat related.

I will describe the problem, starting with the table creation and data load scripts.
I have created 4 tables with exactly the same structure, except that:

Table T_CP is composite partitioned (range-list, 54 list subpartitions per range partition)
Table T_P is composite partitioned (range-list, 5 list subpartitions per range partition)
Table T_P1 is range partitioned on two columns
Table T_NP is not partitioned at all.

Below are the table creation scripts and the script for the tablespace I used for the above tables.

create tablespace tts_ex4
datafile 'D:\ttsex4_01.dbf' size 3 m autoextend on next 2 m maxsize UNLIMITED
extent management local autoallocate
segment space management auto ;
create table t_cp(invoice_num number(9) NOT NULL,
invoice_qty number(9) not null,
invoice_amt number(14,2) not null,
fiscal_week number(2) not null,
fiscal_year number(4) not null
)
pctfree 10 pctused 80 INITRANS 1 MAXTRANS 255 TABLESPACE tts_ex4
partition by range (fiscal_year)
subpartition by list (fiscal_week)
SUBPARTITION TEMPLATE
(SUBPARTITION CWINVSCH_SUB01 VALUES (1),
SUBPARTITION CWINVSCH_SUB02 VALUES (2),
SUBPARTITION CWINVSCH_SUB03 VALUES (3),
SUBPARTITION CWINVSCH_SUB04 VALUES (4),
SUBPARTITION CWINVSCH_SUB05 VALUES (5),
SUBPARTITION CWINVSCH_SUB06 VALUES (6),
SUBPARTITION CWINVSCH_SUB07 VALUES (7),
SUBPARTITION CWINVSCH_SUB08 VALUES (8),
SUBPARTITION CWINVSCH_SUB09 VALUES (9),
SUBPARTITION CWINVSCH_SUB10 VALUES (10),
SUBPARTITION CWINVSCH_SUB11 VALUES (11),
SUBPARTITION CWINVSCH_SUB12 VALUES (12),
SUBPARTITION CWINVSCH_SUB13 VALUES (13),
SUBPARTITION CWINVSCH_SUB14 VALUES (14),
SUBPARTITION CWINVSCH_SUB15 VALUES (15),
SUBPARTITION CWINVSCH_SUB16 VALUES (16),
SUBPARTITION CWINVSCH_SUB17 VALUES (17),
SUBPARTITION CWINVSCH_SUB18 VALUES (18),
SUBPARTITION CWINVSCH_SUB19 VALUES (19),
SUBPARTITION CWINVSCH_SUB20 VALUES (20),
SUBPARTITION CWINVSCH_SUB21 VALUES (21),
SUBPARTITION CWINVSCH_SUB22 VALUES (22),
SUBPARTITION CWINVSCH_SUB23 VALUES (23),
SUBPARTITION CWINVSCH_SUB24 VALUES (24),
SUBPARTITION CWINVSCH_SUB25 VALUES (25),
SUBPARTITION CWINVSCH_SUB26 VALUES (26),
SUBPARTITION CWINVSCH_SUB27 VALUES (27),
SUBPARTITION CWINVSCH_SUB28 VALUES (28),
SUBPARTITION CWINVSCH_SUB29 VALUES (29),
SUBPARTITION CWINVSCH_SUB30 VALUES (30),
SUBPARTITION CWINVSCH_SUB31 VALUES (31),
SUBPARTITION CWINVSCH_SUB32 VALUES (32),
SUBPARTITION CWINVSCH_SUB33 VALUES (33),
SUBPARTITION CWINVSCH_SUB34 VALUES (34),
SUBPARTITION CWINVSCH_SUB35 VALUES (35),
SUBPARTITION CWINVSCH_SUB36 VALUES (36),
SUBPARTITION CWINVSCH_SUB37 VALUES (37),
SUBPARTITION CWINVSCH_SUB38 VALUES (38),
SUBPARTITION CWINVSCH_SUB39 VALUES (39),
SUBPARTITION CWINVSCH_SUB40 VALUES (40),
SUBPARTITION CWINVSCH_SUB41 VALUES (41),
SUBPARTITION CWINVSCH_SUB42 VALUES (42),
SUBPARTITION CWINVSCH_SUB43 VALUES (43),
SUBPARTITION CWINVSCH_SUB44 VALUES (44),
SUBPARTITION CWINVSCH_SUB45 VALUES (45),
SUBPARTITION CWINVSCH_SUB46 VALUES (46),
SUBPARTITION CWINVSCH_SUB47 VALUES (47),
SUBPARTITION CWINVSCH_SUB48 VALUES (48),
SUBPARTITION CWINVSCH_SUB49 VALUES (49),
SUBPARTITION CWINVSCH_SUB50 VALUES (50),
SUBPARTITION CWINVSCH_SUB51 VALUES (51),
SUBPARTITION CWINVSCH_SUB52 VALUES (52),
SUBPARTITION CWINVSCH_SUB53 VALUES (53),
SUBPARTITION CWINVSCH_SUBmax VALUES (default)
)
(PARTITION CWINVSCH_1999 VALUES LESS THAN ( 2000),
PARTITION CWINVSCH_2000 VALUES LESS THAN ( 2001),
PARTITION CWINVSCH_2001 VALUES LESS THAN ( 2002),
PARTITION CWINVSCH_2002 VALUES LESS THAN ( 2003),
PARTITION CWINVSCH_2003 VALUES LESS THAN ( 2004),
PARTITION CWINVSCH_2004 VALUES LESS THAN ( 2005),
PARTITION CWINVSCH_2005 VALUES LESS THAN (2006),
PARTITION CWINVSCH_2006 VALUES LESS THAN ( 2007),
PARTITION CWINVSCH_max VALUES LESS THAN ( maxvalue)
);

then

create table t_p(invoice_num number(9) NOT NULL,
invoice_qty number(9) not null,
invoice_amt number(14,2) not null,
fiscal_week number(2) not null,
fiscal_year number(4) not null
)
pctfree 10 pctused 80 INITRANS 1 MAXTRANS 255 TABLESPACE tts_ex4
partition by range (fiscal_year)
subpartition by list (fiscal_week)
SUBPARTITION TEMPLATE
(SUBPARTITION CWINVSCH_SUB01 VALUES (1,2,3,4,5,6,7,8,9,10,11,12,13),
SUBPARTITION CWINVSCH_SUB02 VALUES (14,15,16,17,18,19,20,21,22,23,24,25,26),
SUBPARTITION CWINVSCH_SUB03 VALUES (27,28,29,30,31,32,33,34,35,36,37,38,39),
SUBPARTITION CWINVSCH_SUB04 VALUES (40,41,42,43,44,45,46,47,48,49,50,51,52,53),
SUBPARTITION CWINVSCH_SUBmax VALUES (default)
)
(PARTITION CWINVSCH_1999 VALUES LESS THAN ( 2000),
PARTITION CWINVSCH_2000 VALUES LESS THAN ( 2001),
PARTITION CWINVSCH_2001 VALUES LESS THAN ( 2002),
PARTITION CWINVSCH_2002 VALUES LESS THAN ( 2003),
PARTITION CWINVSCH_2003 VALUES LESS THAN ( 2004),
PARTITION CWINVSCH_2004 VALUES LESS THAN ( 2005),
PARTITION CWINVSCH_2005 VALUES LESS THAN (2006),
PARTITION CWINVSCH_2006 VALUES LESS THAN ( 2007),
PARTITION CWINVSCH_max VALUES LESS THAN ( maxvalue)
);
--
create table t_p1(invoice_num number(9) NOT NULL,
invoice_qty number(9) not null,
invoice_amt number(14,2) not null,
fiscal_week number(2) not null,
fiscal_year number(4) not null
)
pctfree 10 pctused 80 INITRANS 1 MAXTRANS 255 TABLESPACE tts_ex4
partition by range (fiscal_year,fiscal_week)
(PARTITION CWINVSCH_1999 VALUES LESS THAN ( 2000,53),
PARTITION CWINVSCH_2000 VALUES LESS THAN ( 2001,53),
PARTITION CWINVSCH_2001 VALUES LESS THAN ( 2002,53),
PARTITION CWINVSCH_2002 VALUES LESS THAN ( 2003,53),
PARTITION CWINVSCH_2003 VALUES LESS THAN ( 2004,53),
PARTITION CWINVSCH_2004 VALUES LESS THAN ( 2005,53),
PARTITION CWINVSCH_2005 VALUES LESS THAN (2006,53),
PARTITION CWINVSCH_2006 VALUES LESS THAN ( 2007,53),
PARTITION CWINVSCH_max VALUES LESS THAN ( maxvalue,maxvalue)
);

create table t_np(invoice_num number(9) NOT NULL,
invoice_qty number(9) not null,
invoice_amt number(14,2) not null,
fiscal_week number(2) not null,
fiscal_year number(4) not null
)
pctfree 10 pctused 80 INITRANS 1 MAXTRANS 255 TABLESPACE tts_ex4
;

Then I load data into the above tables using the following script (changing the table name for each table):

begin
for x in (select rownum r from all_objects where rownum <= 53) loop
insert /*+ append */ into t_cp
select rownum ,rownum*x.r ,rownum*x.r*10,x.r,'2002'
from all_objects where rownum <=101;
commit ;
end loop ;
for x in (select rownum r from all_objects where rownum <= 53) loop
insert /*+ append */ into t_cp
select rownum ,rownum*x.r ,rownum*x.r*10,x.r,'2003'
from all_objects where rownum <=101;
commit ;
end loop ;
for x in (select rownum r from all_objects where rownum <= 53) loop
insert /*+ append */ into t_cp
select rownum ,rownum*x.r ,rownum*x.r*10,x.r,'2004'
from all_objects where rownum <= 101 ;
commit ;
end loop ;
for x in (select rownum r from all_objects where rownum <= 53) loop
insert /*+ append */ into t_cp
select rownum ,rownum*x.r ,rownum*x.r*10,x.r,'2005'
from all_objects where rownum <= 101 ;
commit ;
end loop ;
end ;
/

Then I run dbms_stats on each of the 4 tables after the data load, using:

begin
dbms_stats.gather_table_stats
( ownname => user,estimate_percent => 5,
tabname => 'T_CP',
method_opt => 'FOR ALL INDEXED COLUMNS size 1',
degree=>4,granularity=>'ALL',cascade=>TRUE );
end;
/

Now when I run this query, I get the results below:

ngta3zt@EVNODSD2> select table_name,blocks from user_tables where table_name in
  2  ('T_CP','T_P','T_P1', 'T_NP')
  3  /

TABLE_NAME                         BLOCKS
------------------------------ ----------
T_CP                                  848
T_NP                                  224
T_P                                   264
T_P1                                  239


My first question is: why is the space used by these tables so different -- especially for table T_CP, whose block count is very high compared to the other tables in question? Is it due to the number of segments used by this table?
My database is Oracle 9i and the block size is 8K.

Then, when I compress the above tables and run dbms_stats again, I get the following results:

ngta3zt@EVNODSD2> select table_name,blocks from user_tables where table_name in
  2  ('T_CP','T_P','T_P1', 'T_NP');

TABLE_NAME                         BLOCKS
------------------------------ ----------
T_CP                                  848
T_NP                                   54
T_P                                   100
T_P1                                   64


To compress the subpartitioned table I use:

alter table t_cp compress ;
begin
for x in ( select * from user_tab_subpartitions where table_name = 'T_CP' )
loop
execute immediate 'alter table T_CP move subpartition ' || x.subpartition_name ;
end loop;
end;
/
To compress the partitioned tables I do the same thing, except that I do a move partition in the above script.

My questions again are:

1. Why is there so much difference in the space used by these tables, even though they have the same amount of data?

2. Will the space savings in a subpartitioned table be less than in a partitioned table -- assuming I can have the same table either partitioned or subpartitioned?

thanks again
Ajeet

Tom Kyte
November 10, 2005 - 5:16 pm UTC

each partition will have at least one extent -- and your tables have very different numbers of partitions (segments), especially the first.

This accounts for the differences in sizes: a partition is a segment, a segment consumes some amount of space, and the more partitions you have, the more 'wastage' you potentially observe.
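You can see that per-segment overhead directly (a sketch against the tables above):

-- each (sub)partition is its own segment, with at least one extent allocated
select segment_name, count(*) segments, sum(blocks) allocated_blocks
  from user_segments
 where segment_name in ('T_CP', 'T_P', 'T_P1', 'T_NP')
 group by segment_name;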

How to add a column to a compressed-partitioned table

Ebrahim, November 14, 2005 - 10:10 am UTC

Tom,

We have a table in production (9.2.0.4 on Solaris, datawarehouse environment) that is partitioned by year and has a mix of compressed and uncompressed partitions. It has approximately 130 million rows.
Now there is a need to add a column to the table. There is no way we could upgrade to 10g at this point. Would you suggest any clever methods to do this?

We won't be able to uncompress all partitions at once since we do not have that much of space, but we could uncompress a few partitions at a time if needed.

Thanks for your help.

Ebrahim.

Tom Kyte
November 14, 2005 - 1:26 pm UTC

how about just compressing everything you have? Create a new table as select (adding the column), drop the old table, rename the new table?
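A sketch of that approach (names hypothetical; the partitioning clause, indexes, grants, and constraints would all have to be re-created on the new table):

create table audit_new
compress
-- re-specify the PARTITION BY clause here, marking partitions COMPRESS as desired
as
select t.*, cast(null as number) new_col
  from audit_old t;

drop table audit_old;
rename audit_new to audit_old;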



