
Question and Answer

Tom Kyte

Thanks for the question, Riaz.

Asked: March 25, 2003 - 7:29 am UTC

Answered by: Tom Kyte - Last updated: March 09, 2020 - 10:55 am UTC

Category: Database - Version: 8.1.7

Viewed 10K+ times!

You Asked

Hello Tom !

What are the pros and cons of compressed indexes/tables? Is there any performance overhead in this regard? I heard that compressed indexes save space but cause increased CPU time... is that true?

Regards



Riaz Shahid

and we said...

...and they reduce CPU time because they take less space!

If you have to scan 100 blocks -- do 100 latches, 100 consistent gets -- that takes a certain amount of CPU.

Now, if it is compressed -- you might have to do only 60 blocks: 60% of the CPU as before. Sure, this is offset to some degree by the fact that each block takes a bit longer to process, but most of the time it is a total offset.

But -- if you can only cache 70 blocks, one of the two will do NO physical IO and the other will -- all of a sudden, you are doing better already. It is like getting more RAM for free as well: your buffer cache can hold X% more data -- without changing size.

There are pros and cons. A con -- you store more index entries per leaf block, increasing contention on heavily modified tables perhaps. A pro -- you store more index entries per leaf block, dramatically reducing consistent gets (Logical IO), latching, and increasing buffer cache efficiency.


Benchmark, Benchmark, Benchmark. Try it out -- see what it does for you.

If you have my book Expert one on one Oracle -- see the chapter on indexes, I cover them there.

search for

compress

on this site and read the first couple of hits -- other articles about compression are there.


and you rated our response (54 ratings)

Reviews

Some Confusion

March 25, 2003 - 7:09 pm UTC

Reviewer: Riaz Shahid from PRAL, Islamabad, Pakistan

Hello Tom !

Thanks for your prompt reply. But I was confused by "and reduce CPU time cause they take less space!"

Check this out and advise what's correct.

www.aaoug.org/pres/Index%20Compression.ppt

Regards

Riaz


Tom Kyte

Followup  

March 25, 2003 - 9:09 pm UTC

the reduced cpu time comes from 

if you have 100 uncompressed blocks to process, that'll be 100 logical IO.  Each logical IO consumes CPU time.

Now, you compress, you have 60 blocks. You have 40% less CPU consumed by LIO (and that is *SIGNIFICANT*). Fewer cache buffers chains latches, less contention, less work. So, you have to spend a tad bit more time processing each block -- that is typically offset by the rest.


The stuff in that presentation looks good -- however, it leaves out some things.  Most importantly -- what happens as you SCALE UP.

So, you get to the end and see it takes a little more CPU. That's bad -- but wait, is it? They reduced consistent gets (LIOs) by more than 50% in many cases. In a real system, with lots of people accessing the data -- this is hugely significant. What do we do for an LIO? We latch -- lots. Latches = locks = serialization device = scalability inhibitor.


Oh, and the PIOs went way down as well. Suppose I was IO bound.


Compression -- it (like everything else) is a tool.  Used properly -- awesome.  Used on a unique single column index -- stinky (as they showed).  


Overall -- I liked the presentation, well thought out (cept for the multi-user case, they didn't play that up enough).  One big thing missing -- scripts to reproduce the results with...


SLIDE 55 -- love it, everyone should do that.  Perfect.  



Also -- if you read my book "Expert one on one Oracle", the chapter on indexes -- I do put this to test myself in "Myth: It is Best to put the most discriminating elements first?"  You will see you can also find 0% change in CPU.  Here is a small test:

ops$tkyte@ORA920> create table t1
  2  as
  3  select * from all_objects ;

Table created.

ops$tkyte@ORA920>
ops$tkyte@ORA920> create index t1_idx on
  2  t1(object_type,owner,object_name)
  3  compress 2;

Index created.

ops$tkyte@ORA920>
ops$tkyte@ORA920> create table t2
  2  as
  3  select * from all_objects ;

Table created.

ops$tkyte@ORA920>
ops$tkyte@ORA920> create index t2_idx on
  2  t2(object_type,owner,object_name);

Index created.

ops$tkyte@ORA920>
ops$tkyte@ORA920> @trace
ops$tkyte@ORA920> alter session set events '10046 trace name context forever, level 12';

Session altered.

ops$tkyte@ORA920> declare
  2      l_rec t1%rowtype;
  3  begin
  4      for x in ( select * from t1 )
  5      loop
  6          select * into l_rec from t1
  7          where object_type = x.object_type
  8            and owner = x.owner
  9            and object_name = x.object_name
 10                    and rownum = 1;
 11          select * into l_rec from t2
 12          where object_type = x.object_type
 13            and owner = x.owner
 14            and object_name = x.object_name
 15                    and rownum = 1;
 16      end loop;
 17  end;
 18  /

PL/SQL procedure successfully completed.


SELECT * from t1 where object_type = :b3 and owner = :b2 and object_name = :b1
          and rownum = 1

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute  30616     11.37      14.50          0          0          0           0
Fetch    30616      3.63       6.04         91      91848          0       30616
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    61233     15.00      20.55         91      91848          0       30616

Rows     Row Source Operation
-------  ---------------------------------------------------
  30616  COUNT STOPKEY
  30616   TABLE ACCESS BY INDEX ROWID T1
  30616    INDEX RANGE SCAN T1_IDX (object id 45535)
********************************************************************************
SELECT * from t2 where object_type = :b3 and owner = :b2 and object_name = :b1
          and rownum = 1

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute  30616     11.32      12.66          0          0          0           0
Fetch    30616      3.42      11.63        433      91848          0       30616
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    61233     14.74      24.30        433      91848          0       30616

Rows     Row Source Operation
-------  ---------------------------------------------------
  30616  COUNT STOPKEY
  30616   TABLE ACCESS BY INDEX ROWID T2
  30616    INDEX RANGE SCAN T2_IDX (object id 45537)


No significant difference in response times CPU wise -- but less PIO did give me better elapsed times.  However,

ops$tkyte@ORA920> exec show_space( 'T1_IDX', user, 'INDEX' );
Free Blocks.............................
Total Blocks............................256
Total Bytes.............................2097152
Total MBytes............................2
Unused Blocks...........................95
Unused Bytes............................778240
Last Used Ext FileId....................9
Last Used Ext BlockId...................54920
Last Used Block.........................33

PL/SQL procedure successfully completed.

ops$tkyte@ORA920> exec show_space( 'T2_IDX', user, 'INDEX' );
Free Blocks.............................
Total Blocks............................256
Total Bytes.............................2097152
Total MBytes............................2
Unused Blocks...........................34
Unused Bytes............................278528
Last Used Ext FileId....................9
Last Used Ext BlockId...................55816
Last Used Block.........................94

PL/SQL procedure successfully completed.

it is smaller.... 


(personally, I've yet to see the wide CPU disparities reported -- but I can believe it. I would just need the "test case", like I provide)

 

Still One Question

March 28, 2003 - 2:01 am UTC

Reviewer: Riaz Shahid from PRAL, Islamabad, Pakistan

Hello Tom !

Thanks for your valuable advice. I read somewhere that only those tables should be compressed which are updated infrequently. Is that true? If yes, then what is the true definition of "infrequently updated tables"?

Thanks again

Tom Kyte

Followup  

March 28, 2003 - 7:14 am UTC

well, as with all ROT (rules of thumb), you have to look at the intent.

I like to say -- the compressed data should be "read mostly" or "read only".

Now, suppose I have an audit trail. Audit trails are write intensive. Obviously they are not a candidate for compression. But wait -- I think they are perfect. Here is how:

create audit trail partitioned by month.
current month is write intensive -- old months are read only.

At the end of the month you will:

add new partition for next month.

create table last_month_compressed as select * from audit partition(last_month) order by <least selective columns to achieve maximum compression>;

alter table audit exchange last-months-partition with last_month_compressed;

drop oldest month



There -- it is perfect for a sliding window of data. The old months are packed really tight. The current month is uncompressed. Next month that uncompressed data will be compressed.

So you see -- it is really at the segment level, not the "table" level and it is a tool that you can apply in different circumstances, under different conditions.
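Sketched out with illustrative names (AUDIT_TRAIL, monthly partitions, and an ORDER BY you would pick from your own data -- none of these come from the question), the monthly roll might look like:

```sql
-- add the new partition for next month
alter table audit_trail add partition p_2003_05
  values less than (to_date('2003-06-01','yyyy-mm-dd'));

-- rebuild last month compressed, sorted on the least selective
-- columns to achieve maximum compression
create table last_month_compressed
compress
as
select * from audit_trail partition (p_2003_04)
 order by audit_action, username;

-- swap the compressed copy in for the uncompressed partition
alter table audit_trail
  exchange partition p_2003_04 with table last_month_compressed;

-- slide the window: drop the oldest month
alter table audit_trail drop partition p_2002_05;
```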

Great

March 31, 2003 - 10:27 am UTC

Reviewer: Riaz Shahid from PRAL, Islamabad, Pakistan

Thats a fine example. I think the same holds for indexes (i.e; indexes that are frequently updated shouldn't be compressed). Am i right ???

Riaz

Tom Kyte

Followup  

March 31, 2003 - 11:00 am UTC

wrong -- totally different concept here -- indexes and tables. Similar (both compress, both compress via redundant-data factoring) but different.

ops$tkyte@ORA920> declare
  2      l_rec t1%rowtype;
  3  begin
  4      for x in ( select * from t1 )
  5      loop
  6                  update t1
  7             set object_type = lower(object_type)
  8           where object_type = x.object_type
  9             and owner = x.owner
 10             and object_name = x.object_name;
 11                  update t2
 12             set object_type = lower(object_type)
 13           where object_type = x.object_type
 14             and owner = x.owner
 15             and object_name = x.object_name;
 16      end loop;
 17  end;
 18  /

PL/SQL procedure successfully completed.


and using the same tables as before -- we see:

UPDATE t1
           set object_type = lower(object_type)
         where object_type = :b3
           and owner = :b2
           and object_name = :b1

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          1          0           0
Execute  29967     26.50      29.79        307      81104     193278       29967
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    29968     26.51      29.79        307      81105     193278       29967

Rows     Row Source Operation
-------  ---------------------------------------------------
      0  UPDATE  (cr=81158 r=307 w=0 time=15582063 us)
  29967   INDEX RANGE SCAN T1_IDX (cr=75002 r=167 w=0 time=3245296 us)(object id 32507)
--------------------------------------------------------------------------------
UPDATE t2
           set object_type = lower(object_type)
         where object_type = :b3
           and owner = :b2
           and object_name = :b1

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          1          0           0
Execute  29967     26.08      29.07        772      92672     217154       29967
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    29968     26.08      29.07        772      92673     217154       29967

Rows     Row Source Operation
-------  ---------------------------------------------------
      0  UPDATE  (cr=92747 r=772 w=0 time=16015230 us)
  29967   INDEX RANGE SCAN T2_IDX (cr=85919 r=231 w=0 time=4135654 us)(object id 32509) 

Really Great

April 01, 2003 - 3:04 am UTC

Reviewer: Riaz Shahid from PRAL, Islamabad, Pakistan


Thats really great. Great help from your side.

Thanks

Great

April 28, 2003 - 4:52 am UTC

Reviewer: Riaz Shahid from PRAL, Islamabad, Pakistan

Hello Tom !

thats great. But i've a little confusion. Consider:

-- Script: uncompressed_indexes.sql
-- Purpose: to identify indexes that would benefit from compression
-- For: 8.1 and above

column index_name format a54

select
u.name ||'.'|| o.name index_name,
decode(
sign(s.full_save - s.one_save),
-1, 1,
decode(s.cols, 1, 1, 2)
) min_compress,
decode(
sign(s.full_save - s.one_save),
-1, greatest(1, least(s.max_ok, s.cols - 1)),
s.cols
) max_compress
from
(
select
x.obj#,
x.cols,
x.leaf_bytes,
x.full_save,
x.max_ok,
h1.avgcln * (x.rowcnt - h1.null_cnt - h1.distcnt) - 4 * h1.distcnt
one_save
from
( select
i.obj#,
i.cols,
i.rowcnt,
(sum(h.avgcln) + 10) * i.rowcnt leaf_bytes,
sum(h.avgcln) * (i.rowcnt - i.distkey) - 4 * i.distkey full_save,
max(decode(sign(i.rowcnt - 2 * h.distcnt), -1, 0, ic.pos#)) max_ok
from
sys.ind$ i,
sys.icol$ ic,
sys.hist_head$ h
where
i.leafcnt > 1 and
i.type# in (1,4,6) and -- exclude special types
bitand(i.property, 8) = 0 and -- exclude compressed
ic.obj# = i.obj# and
h.obj# = i.bo# and
h.intcol# = ic.intcol#
group by
i.obj#,
i.cols,
i.rowcnt,
i.distkey
) x,
sys.icol$ c1,
sys.hist_head$ h1
where
c1.obj# = x.obj# and
c1.pos# = 1 and
h1.obj# = c1.bo# and
h1.intcol# = c1.intcol#
) s,
sys.obj$ o,
sys.user$ u
where
greatest(s.full_save, s.one_save) > 0.05 * s.leaf_bytes and
o.obj# = s.obj# and
o.owner# != 0 and
u.user# = o.owner#
order by
greatest(s.full_save, s.one_save) desc
/

I ran it on my DB and found a few index names which may benefit from compression. Is that the right method to do this? Please advise.

Regards

Riaz


Tom Kyte

Followup  

April 28, 2003 - 7:57 am UTC

it would be really cool just to say "yes" or "no", but -- I won't do that.


Here is what you need to do....


in your TEST environment,

o measure your current usage of space by these candidates
o recreate your indexes using index key compression.
o remeasure the space (but don't get excited, just measure and remember)
o run your test system to scale for a while. find and fix any issues that might
arise. You might find you need more initrans (more rows/block), you might
find you have a hot spot you didn't before. You might even find a "product
issue" or two that need patching.
o remeasure your index space again


Then

o make sure the savings is worth it to you (first measure vs LAST measure, after the indexes had time to "grow" due to use)

o make sure everything works

o make sure performance is at least as good as, if not better than, before.
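For the two "measure your index space" steps, a simple dictionary query is enough to record the before/after numbers (the segment names here are just placeholders for your candidate indexes):

```sql
select segment_name, blocks, bytes/1024/1024 mbytes
  from user_segments
 where segment_name in ('YOUR_IDX_1', 'YOUR_IDX_2')
 order by segment_name;
```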



Very helpful

May 19, 2003 - 4:05 pm UTC

Reviewer: Haiqing from NY USA

Hello Tom,
Thanks a lot for your detailed demonstration. They are very helpful.
It would be most appreciated if you could show us your pl/sql "show_space".
Thanks again Tom!
Best regards.

Haiqing


June 17, 2003 - 10:54 am UTC

Reviewer: A reader

Hi Tom,
Can we use compress table for CLOB and XMLTYPE?

Thanks

Tom Kyte

Followup  

June 17, 2003 - 12:47 pm UTC

LOBs are not compressible.

ALTER TABLE MOVE COMPRESS

July 01, 2004 - 8:26 pm UTC

Reviewer: A reader

In 9.2, what does ALTER TABLE t MOVE COMPRESS do? It doesn't seem to reduce the space usage by a whole lot.

Does it really repack/compress existing data?

What would it do to new incoming data?

Thanks

Tom Kyte

Followup  

July 01, 2004 - 10:01 pm UTC

please see above.

if the table does not have lots of repeating data in the blocks, it won't compress "well"

compress typically works best on data sorted by the most repetitive, large columns.

but it does not need that (I alter my log partitions, move them each month with compress -- get 1/2 to 1/3 the storage afterwards)


new incoming data would be stored "not compressed"
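To restate that as commands (a sketch -- the table, partition, and index names are made up), a periodic MOVE is what repacks rows that arrived via conventional path since the last move:

```sql
-- repack the whole segment; rows present at move time get compressed
alter table t move compress;

-- or, for a partitioned log table, one old partition at a time
alter table log_table move partition p_2004_06 compress;

-- a MOVE changes rowids, so rebuild the now-unusable indexes
alter index t_idx rebuild;
```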

documents

July 13, 2004 - 12:53 am UTC

Reviewer: abc

HI Tom

I have 2-3 documents on dbms_stat and other concepts and am interested in sharing them with all, so please suggest

Tom Kyte

Followup  

July 13, 2004 - 11:24 am UTC

find a website to post them on? there are lots of places out there. I won't host them as I would have to personally "ok" them and there are only so many hours in the day.

Why is the index bigger when compressed?

May 17, 2005 - 5:28 pm UTC

Reviewer: A reader

Hi Tom, could you please explain this to me.
I had an index with COMPRESS 7, and I wanted to see how much I was saving by recreating it uncompressed -- and surprise, it was smaller when I uncompressed it.
I recreated it compressed and again it was bigger.
Why does this happen?


Using compress it is:
SELECT SEGMENT_NAME, SUM(BYTES)/1024/1024 FROM DBA_SEGMENTS
WHERE SEGMENT_NAME = 'CST_HCF_FECHA_STATUS_RO2_A'
GROUP BY SEGMENT_NAME --243.4375
ORDER BY 2 DESC
271.953125

Not using compress it is:
SELECT SEGMENT_NAME, SUM(BYTES)/1024/1024 FROM DBA_SEGMENTS
WHERE SEGMENT_NAME = 'CST_HCF_FECHA_STATUS_RO2_A'
GROUP BY SEGMENT_NAME --243.4375
ORDER BY 2 DESC
243.4375 MB

I think because there are several columns, it should be smaller

CREATE UNIQUE INDEX fon.CST_HCF_FECHA_STATUS_RO2_A ON fon.hicuentasf_ro2
(
hcf_fecha ASC,
hcf_codcli ASC,
hcf_ctacorr ASC,
hcf_status ASC,
hcf_dbcuo ASC,
hcf_crcuo ASC,
hcf_iva ASC,
hcf_monto_iva ASC
)
PCTFREE 3
INITRANS 2
MAXTRANS 255
TABLESPACE tbl_readonly_idx
STORAGE (
INITIAL 65536
NEXT 81920
PCTINCREASE 0
MINEXTENTS 1
MAXEXTENTS 2147483645
)
NOLOGGING
COMPRESS 7
/



CREATE UNIQUE INDEX fon.CST_HCF_FECHA_STATUS_RO2_A ON fon.hicuentasf_ro2
(
hcf_fecha ASC,
hcf_codcli ASC,
hcf_ctacorr ASC,
hcf_status ASC,
hcf_dbcuo ASC,
hcf_crcuo ASC,
hcf_iva ASC,
hcf_monto_iva ASC
)
PCTFREE 3
INITRANS 2
MAXTRANS 255
TABLESPACE tbl_readonly_idx
STORAGE (
INITIAL 65536
NEXT 81920
PCTINCREASE 0
MINEXTENTS 1
MAXEXTENTS 2147483645
)
NOLOGGING
/

Describing FON.HICUENTASF_RO....
NAME Null? Type
------------------------------- --------- -----
HCF_CODCLI NOT NULL VARCHAR2(13)
HCF_CTACORR NOT NULL NUMBER(2,0)
HCF_FECHA NOT NULL DATE
HCF_DBCUO NOT NULL NUMBER(17,8)
HCF_CRCUO NOT NULL NUMBER(17,8)
HCF_IVA NOT NULL NUMBER(10,5)
HCF_MONTO_IVA NOT NULL NUMBER(10,2)
HCF_STATUS NOT NULL VARCHAR2(3)
HCF_CIUDAD NOT NULL VARCHAR2(3)
Thank you

Tom Kyte

Followup  

May 17, 2005 - 6:34 pm UTC

analyze index CST_HCF_FECHA_STATUS_RO2_A validate structure;

select * from index_stats;

look at the opt_cmpr_count and opt_cmpr_pctsave -- they'll tell you the best compression count and how much you can expect to save (in 9i and above; in 8i, you cannot do this trick).

If the index does not have a lot of leading-edge repetition, it is very easy for the index to get "bigger".

May 17, 2005 - 6:37 pm UTC

Reviewer: A reader

Thanks Tom. Does this mean "if it is bigger when compressed, leave it uncompressed"?
Or, even if it is bigger, could there be a benefit to leaving it compressed?



Tom Kyte

Followup  

May 18, 2005 - 8:27 am UTC

it means use analyze index (warning, 'offline' operation, not to be done during times of modifications) to compute the optimum compression count and how much you would save with it given your data.  Then consider compressing the index based on the pct saved (is it worth it, do you get something from it) using the compression count provided.  Consider:


ops$tkyte@ORA9IR2> create table t
  2  as
  3  select * from all_objects;
 
Table created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create index t_idx1 on t(object_id,object_name,owner,object_type);
 
Index created.
 
ops$tkyte@ORA9IR2> create index t_idx2 on t(object_type,owner,object_name,object_id);
 
Index created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> analyze index t_idx1 validate structure ;
 
Index analyzed.
 
ops$tkyte@ORA9IR2> select opt_cmpr_count, opt_cmpr_pctsave from index_stats;
 
OPT_CMPR_COUNT OPT_CMPR_PCTSAVE
-------------- ----------------
             0                0
 
<b>that index is NOT compressable, compressing it would make it larger, why?  the leading edge is unique, there are no opportunities for removing repeating leading edge values</b>

ops$tkyte@ORA9IR2> analyze index t_idx2 validate structure ;
 
Index analyzed.
 
ops$tkyte@ORA9IR2> select opt_cmpr_count, opt_cmpr_pctsave from index_stats;
 
OPT_CMPR_COUNT OPT_CMPR_PCTSAVE
-------------- ----------------
             2               26
 
<b>that index is compressable, but it only makes sense to use compress 2, since object_name is almost unique and the ability to remove repeating values effectively stops there</b>

 

May 18, 2005 - 10:20 am UTC

Reviewer: A reader

This is really a nice trick Tom, thank you.

Hi Tom, following from your book the fact that the order of columns in the index doesn't matter, I wanted to see whether the order of columns matters for compression, so I analyzed the indexes and used the compression suggested by Oracle.

The conclusion is: distinct order, distinct compression. And to get the best compression, put the least selective columns before the most selective columns. That is where I got the best compression: 178.59375 MB.

The question is: which column, in which statistics view, gives me the level of selectivity of each column?

Thank you


hcf_fecha ASC,
hcf_codcli ASC,
hcf_ctacorr ASC,
hcf_status ASC,
hcf_dbcuo ASC,
hcf_crcuo ASC,
hcf_iva ASC,
hcf_monto_iva ASC

No Compressed 243.4375
Optimum Compression 1
Compressed 205.78125

hcf_monto_iva ASC,
hcf_fecha ASC,
hcf_codcli ASC,
hcf_ctacorr ASC,
hcf_status ASC,
hcf_dbcuo ASC,
hcf_crcuo ASC,
hcf_iva ASC


No Compressed 243.359375
Optimum Compression 1
Compressed 226.71875

hcf_iva ASC,
hcf_monto_iva ASC,
hcf_fecha ASC,
hcf_codcli ASC,
hcf_ctacorr ASC,
hcf_status ASC,
hcf_dbcuo ASC,
hcf_crcuo ASC

No Compressed 243.4375
Optimum Compression 1
Compressed 226.25


hcf_crcuo ASC,
hcf_iva ASC,
hcf_monto_iva ASC,
hcf_fecha ASC,
hcf_codcli ASC,
hcf_ctacorr ASC,
hcf_status ASC,
hcf_dbcuo ASC


No Compressed 243.4375
Optimum Compression 3
Compressed 214.453125

hcf_dbcuo ASC,
hcf_crcuo ASC,
hcf_iva ASC,
hcf_monto_iva ASC,
hcf_fecha ASC,
hcf_codcli ASC,
hcf_ctacorr ASC,
hcf_status ASC

No Compressed 243.515625
Optimum Compression 4
Compressed 200.9375

hcf_dbcuo ASC,
hcf_crcuo ASC,
hcf_monto_iva ASC,
hcf_iva ASC,
hcf_codcli ASC,
hcf_fecha ASC,
hcf_status ASC,
hcf_ctacorr ASC

No Compressed 243.59375
Optimum Compression 5
Compressed 178.59375



Tom Kyte

Followup  

May 18, 2005 - 11:23 am UTC

<quote>order of columns in the index don't matter</quote>

WHERE DID YOU SEE THAT!

I never said that. I said that the myths of "put most selective first" and so on are MYTHS.

I showed that if you retrieve by "where x = :x and y = :y" it won't matter if y or x is first.

But I very much said "the QUESTIONS you ask of the data mandate the order of the columns"



Do you use this index for only "where c1 = ? and c2 = ? and c3 = ? ...."

You need, must, have to look AT THE QUESTIONS -- that will tell you the order the columns should appear in.

May 18, 2005 - 11:35 am UTC

Reviewer: A reader

You're right Tom, sorry.

Tom, even though this could seem a silly question, could you please suggest the best and fastest way to test the performance of a table with a compressed versus an uncompressed index, without running the application?

Please, thank you.

Tom Kyte

Followup  

May 18, 2005 - 12:46 pm UTC

well, in the book I did that -- tested what happens (how much CPU/IO) with each. You can use the same techniques -- but without having the application going, it'll just be measurements of things.



May 18, 2005 - 1:24 pm UTC

Reviewer: A reader

Which book, which chapter?

Tom Kyte

Followup  

May 18, 2005 - 1:29 pm UTC

the book you were reading above

"...following from your book the fact..."

expert one on one, tons of examples on many things -- chapter on indexes.

May 18, 2005 - 2:11 pm UTC

Reviewer: A reader

Thanks

for table?

May 18, 2005 - 4:33 pm UTC

Reviewer: Sam

Your suggestion to analyze index and then query index_stats to determine whether to compress the index or not is very useful. Thanks.
Is there a way to see this for a table to determine whether to compress it or not?

Tom Kyte

Followup  

May 18, 2005 - 4:56 pm UTC

no, you have to sort of "do it" to see it for the segment compression
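A quick way to "do it" without touching the real table is to build throwaway copies and compare segment sizes (a sketch, assuming a source table T you are considering compressing):

```sql
create table t_comp   compress as select * from t;  -- compressed copy
create table t_nocomp          as select * from t;  -- uncompressed copy

select segment_name, blocks
  from user_segments
 where segment_name in ('T_COMP', 'T_NOCOMP');

drop table t_comp;
drop table t_nocomp;
```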

Overhead when adding columns on a compressed table

May 19, 2005 - 5:09 am UTC

Reviewer: N.Venkatagiri from India

I would like to share one of our experiences with compressed tables. We needed to add a column to a compressed table.
You cannot do that directly.
Doing so will cause:

SQL>  ALTER TABLE TEST1 ADD(NO_TEST2 NUMBER(10));
 ALTER TABLE TEST1 ADD(NO_TEST2 NUMBER(10))
                       *
ERROR at line 1:
ORA-22856: cannot add columns to object tables

So we have to uncompress the table.

SQL>   ALTER TABLE TEST1 NOCOMPRESS;

Table altered.

We have to move the table.
SQL>   ALTER TABLE TEST1 MOVE;

Table altered.

Add the column. 

SQL> ALTER TABLE TEST1 ADD(NO_TEST1 NUMBER(10));

Table altered.

Compress the table 

SQL> ALTER TABLE TEST1 COMPRESS;

Table altered.

Move the table.

SQL>   ALTER TABLE TEST1 MOVE;

Table altered.

 

Index compressed and Memory

May 24, 2005 - 9:28 am UTC

Reviewer: A reader

Hi Tom, please: once a compressed index is loaded into memory, is it compressed in memory, or is it decompressed in memory and compressed only on disk?
Could you please show this with an example for 9i if possible.
Thank you in advance.

Tom Kyte

Followup  

May 24, 2005 - 10:39 am UTC

the block is what is cached, and the block is cached in its 'compressed' format.

All information needed is self contained on that single block and just 'accessed'.

It would be impossible for us to explode the single index block into "n" blocks in the buffer cache, we'd never be able to find them again.

Thanks

May 24, 2005 - 11:28 am UTC

Reviewer: A reader


Compression plus Clustering

July 11, 2005 - 4:36 pm UTC

Reviewer: David from VA, USA

Tom,
I have learnt a great deal about compressed tables from this site. I have also read about table clustering which is kind of similar to compression in the sense that similar data in certain columns is physically stored together thereby reducing I/O.

My question is , have you used or know any drawbacks/cons of using both compression and clustering. I mean, two tables which are frequently accessed together with join keys are both compressed and clustered. Compression is done on a non-unique column with few distinct values and clustering is done on the primary key/foreign key column.

Will one offset the benefits of the other ??

Tom Kyte

Followup  

July 11, 2005 - 5:54 pm UTC

clustering prevents direct path operations.
compression mandates direct path operations.

you won't have a clustered compressed table.

Moving a compressed table

September 06, 2005 - 9:38 am UTC

Reviewer: VA from New Jersey, USA

I have a compressed table in Database A (a test database). The data has been verified by QA.

What is the best way to "move" this compressed table to the Production database and preserve the "compressed-ness"?

Thanks

Tom Kyte

Followup  

September 06, 2005 - 9:43 am UTC

o transport it.

o dump it, and reload it using an external table and CREATE TABLE AS SELECT.

o exp/imp it and then ALTER TABLE T MOVE to compress it.

o create table as select over a dblink or direct path insert

are some of the ways - you have to do a BULK operation
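For example, the dblink variant -- which also lets you keep the ORDER BY that drives the compression -- might look like this (the link name and sort columns are illustrative, not from the question):

```sql
create table t
compress
as
select *
  from t@test_db_link
 order by least_selective_col, next_least_selective_col;
```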

Moving a compressed table

September 06, 2005 - 9:56 am UTC

Reviewer: VA from New Jersey, USA

o transport it

The tablespace contains tons of other stuff which I don't want to touch, so this option is out.

o dump it, and reload it using an external table and CREATE TABLE AS SELECT.

This is viable, thanks.

o exp/imp it and then ALTER TABLE T MOVE to compress it.

Not sure I understand this. "imp" uses regular SQL inserts, not direct path, so it wouldn't compress the data going in, right? After imp is done, if I just ALTER TABLE T MOVE it, I can't specify my magical ORDER BY clause (the one that results in the huge compression saving for my data). So this option is also out?

o create table as select over a dblink or direct path insert

This is also viable, thanks.

Tom Kyte

Followup  

September 06, 2005 - 9:59 am UTC

... Not sure I understand this. ....

right, that is why after the IMP, you would have to alter table T move to rebuild


If the sort is really important, external table or dblink - using SQL to sort it - would be recommended.

October 01, 2005 - 9:13 pm UTC

Reviewer: Bryan

Tom
We have a partitioned, compressed table with indexes. We want to add some columns to this table. Since the table is huge,
uncompressing it and adding columns is difficult. Any suggestions?
thanks


Tom Kyte

Followup  

October 02, 2005 - 10:32 am UTC

well, in order to add these columns with some data (assuming you want data in them, else you would not be adding them), it will require a rebuild anyway, since the update to put data in there will necessarily uncompress each row and migrate it!

My suggestion would be to use a CREATE TABLE AS SELECT to create a new partitioned, compressed table that includes all of these new columns; index it, grant it, constrain it (this could be done in an enable rely novalidate mode -- just put the constraints on, no need to check/validate them as the data is known to be clean in the source table) and then drop the old and rename the new.

the very act of adding columns with data (the only useful thing I can think of) would decompress and migrate each row -- hence a rebuild is going to be unavoidable. Compressed tables use pctfree = 0 (no room for new data) and ANY update necessarily uncompresses the row and migrates it.
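A sketch of that CREATE TABLE AS SELECT approach (every name here is illustrative, and the partitioning clause would mirror your existing table's):

```sql
create table t_new
partition by range (log_date)
( partition p_old values less than (to_date('2005-01-01','yyyy-mm-dd')),
  partition p_max values less than (maxvalue) )
compress
as
select t.*, cast(null as number(10)) new_col
  from t;

-- index it, grant it, constrain it (enable rely novalidate), then:
drop table t;
rename t_new to t;
```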

One other question about compressed tables/indexes

March 18, 2006 - 12:19 pm UTC

Reviewer: Chris Taylor from Wilkesboro, NC

Now I know tables and indexes are much different in their storage requirements/storage methodologies. The question I have is:

If new table data (inserts/updates) gets stored uncompressed when added to a table, does this hold true for indexes? I have 3 of your books but couldn't find it explicitly mentioned for indexes, so I'm assuming the answer is no due to:

1.) Indexes are maintained differently
and
2.) You didn't explicitly say new data would be stored uncompressed in indexes :) (Of course, I could have missed that part somewhere)

Thanks,
Chris Taylor

Tom Kyte

Followup  

March 18, 2006 - 4:43 pm UTC

index key compression works in READ/WRITE environments - the index key compression is ALWAYS happening, constantly.

segment compression works only for bulk direct path operations such as insert /*+ append */, alter table move, create table as select, sqlldr direct=y. It does not compress data that is added/modified in the table using conventional path operations.
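In concrete terms (a sketch; T, STAGE, and the index columns are made-up names):

```sql
-- segment (table) compression: only bulk direct path operations compress
insert /*+ append */ into t select * from stage;  -- rows stored compressed
insert into t select * from stage;                -- rows stored UNcompressed

-- index key compression: applied continuously, conventional DML included
create index t_idx on t(owner, object_type, object_name) compress 2;
```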



One doubt

August 22, 2006 - 6:50 am UTC

Reviewer: Ik from BG

Hi Tom,

From the oracle documentation link (given below)

</code> http://download-uk.oracle.com/docs/cd/B19306_01/server.102/b14200/statements_1008.htm#BABHFFHI <code>

It is mentioned that "Oracle Database compresses only nonpartitioned indexes that are nonunique or unique indexes of at least two columns." This seemed confusing to me. What about partitioned indexes?

For existing partitioned indexes (LOCAL), do we need to drop and recreate the indexes in order to convert them to compressed mode?

Please let me know.

Thanks,


Tom Kyte

Followup  

August 27, 2006 - 2:13 pm UTC

this was in regards to an index rebuild (you need to append the # anchor to the URL to jump to that section of the page).

You would have to recreate the partitioned index with compress - yes. You cannot have some partitions compressed and others not (with index key compression).
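A sketch of recreating a local partitioned index with key compression (the SALES table and the index name are hypothetical):

```sql
drop index sales_cust_idx;

create index sales_cust_idx
  on sales (cust_id, sale_date)
  local          -- one index partition per table partition
  compress 1;    -- factor out the repeating leading CUST_ID column
```

The COMPRESS attribute applies to the index as a whole - every partition of the local index is compressed the same way.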




CPU Usage

September 04, 2006 - 9:40 am UTC

Reviewer: Ik from BG

Tom,

I have a few questions regarding compression

(a) This is the flow of a data block as i understand it

Data files >>------>> Buffer Cache>>----->>User

A block read from a data file gets read into the buffer cache and then decompressed on the fly while being passed to the user.

What happens if there is a sort involved (sort in memory or sort to disk)? Oracle has to read the information in the block, and for that it has to decompress it, right? So how would the flow be - would decompression happen before sorting? Could you please clarify.

(c) I came across an odd case where a query (against a compressed table) had its LIO and PIO reduced by half, but the elapsed time (and CPU time - from v$sql) was much higher. I am comparing a compressed table against a non-compressed table having exactly the same data and the same indexes, and the queries have similar execution plans. The DBAs have retorted that Oracle compression slows down queries, and I am not able to explain why the queries underperform with compression. What should I look for here? Waits are very minimal (negligible).

Thanks a ton





Tom Kyte

Followup  

September 04, 2006 - 10:26 am UTC

the data is stored on the block on disk and in the cache "compressed"

when a user requests information from the block, the user process reads the block and dereferences all of the pointed-to information - the act of just accessing the row on a block will "uncompress" it. The user process gets back THE ROW with all of the data in place (you need the block to get the values, so getting a compressed row back would be useless)

so, it would "decompress" before the sort because we sort rows, not blocks.

I would expect the LIO and maybe even the PIO (but definitely the LIO) to be reduced by the compression factor if you
a) full scan the table
b) sort it
c) then return it

(eg: did not use an index to read it sorted) - that is because IO from TEMP (temp memory OR temp disk) is not LIO, once you copied it out into temp to sort it, the LIO's are over for that block.


show me a tkprof please, I've not noticed a difference as you state.

Thanks

September 06, 2006 - 6:33 am UTC

Reviewer: Ik from BG

Tom,

Pardon me if this is naive

You said "when a user requested information from the block, the user process reads the
block and dererences all of the pointed to information "

Does this mean that every time a row from a block is accessed, it gets decompressed as a whole? Does the entire block get decompressed, or just the row alone?

If the user has requested a full table scan, every row in the table has to be read. How many times does decompression occur here - one decompression per block, or one per row?

How is it different for an index access?

I want to know how many times Oracle would attempt to decompress a given block.

I will post the trace files as soon as i have them again.

Thanks,



Tom Kyte

Followup  

September 06, 2006 - 7:59 am UTC

just the data required by the client process.

the data, as it is retrieved from the block, is "decompressed" - but decompressed here just means "the pointer was dereferenced".

It is NOT at all like the decompression you are used to (zip for example), it is a very simple "dereferencing of a pointer"



Would it vary based on number of columns?

September 06, 2006 - 12:03 pm UTC

Reviewer: Ik from BG

Tom,

So, would selecting 4 columns from a compressed table be different from selecting 10 columns? More overhead involved for 10 columns?

Thanks,



Tom Kyte

Followup  

September 06, 2006 - 3:58 pm UTC

selecting just the columns you need will always consume the least resources (we stop looking at the other columns). compressed or not.

June 14, 2007 - 7:36 pm UTC

Reviewer: A reader

In the following example, I measured the effect table compression has on the query performance. After the table compression, the number of data blocks used is reduced by quite a bit (2757 vs 832). However, the logical I/Os and the elapsed time for doing a full table scan did not get reduced by much. Can you explain why that's the case? Thanks.

SELECT * FROM v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.2.0 - Prod
PL/SQL Release 10.2.0.2.0 - Production
CORE    10.2.0.2.0      Production
TNS for 32-bit Windows: Version 10.2.0.2.0 - Production
NLSRTL Version 10.2.0.2.0 - Production


CREATE TABLE x AS
SELECT * FROM all_objects;

INSERT INTO x SELECT * FROM x;
INSERT INTO x SELECT * FROM x;

commit;

exec show_space('X');

Free Blocks.............................
Total Blocks............................2816
Total Bytes.............................23068672
Total MBytes............................22
Unused Blocks...........................64
Unused Bytes............................524288
Last Used Ext FileId....................12
Last Used Ext BlockId...................343185
Last Used Block.........................64


set timing on
set autotrace traceonly

SELECT * FROM x;

192664 rows selected.

Elapsed: 00:00:10.51

Execution Plan
----------------------------------------------------------
Plan hash value: 2941724873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   198K|    38M|   604   (2)| 00:00:08 |
|   1 |  TABLE ACCESS FULL| X    |   198K|    38M|   604   (2)| 00:00:08 |
--------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement


Statistics
----------------------------------------------------------
          6  recursive calls
          1  db block gets
      15462  consistent gets
          0  physical reads
        176  redo size
    9395431  bytes sent via SQL*Net to client
     141669  bytes received via SQL*Net from client
      12846  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     192664  rows processed


ALTER TABLE x MOVE COMPRESS;

exec show_space('X');

Free Blocks.............................
Total Blocks............................896
Total Bytes.............................7340032
Total MBytes............................7
Unused Blocks...........................64
Unused Bytes............................524288
Last Used Ext FileId....................7
Last Used Ext BlockId...................456457
Last Used Block.........................64


SELECT * FROM x;

192664 rows selected.

Elapsed: 00:00:10.46

Execution Plan
----------------------------------------------------------
Plan hash value: 2941724873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   221K|    42M|   190   (4)| 00:00:03 |
|   1 |  TABLE ACCESS FULL| X    |   221K|    42M|   190   (4)| 00:00:03 |
--------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement


Statistics
----------------------------------------------------------
          5  recursive calls
          0  db block gets
      13692  consistent gets
        810  physical reads
        116  redo size
    9395431  bytes sent via SQL*Net to client
     141669  bytes received via SQL*Net from client
      12846  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     192664  rows processed


Tom Kyte

Followup  

June 14, 2007 - 8:32 pm UTC

the first query did no physical IO at all (everything was found in the cache)

The arraysize is what will control the LIO's for a full scan like that.


set arraysize 500
run test again.
set arraysize 2
run test again.



June 20, 2007 - 7:53 pm UTC

Reviewer: A reader

I'm not quite sure if I follow that. I think I understand that if I increase the size of the arraysize, the logical IOs will be reduced. However, when the arraysize is the same, why would the elapsed time be almost the same when the table sizes are so different?

Also just curious, does arraysize have a counterpart in JDBC?
Tom Kyte

Followup  

June 21, 2007 - 10:20 am UTC

you timed elapsed time from the point of view of the sqlplus client.

that includes all network time and everything.

2000 logical IO's - pretty fast in single user mode (no contention on cache buffers chains, pretty fast)

network activity - that is likely the gating factor here - so I would not expect anything different.


prefetch is a "method" (also known as a subroutine in old speak) associated with connections and statements in JDBC that allows you to control the default array size - it is 10 by default with the Oracle JDBC drivers.

Posting My New Test Result

June 20, 2007 - 8:11 pm UTC

Reviewer: A reader

SQL> 
SQL> set arraysize 2
SQL> 
SQL> DROP TABLE x;

Table dropped.

Elapsed: 00:00:00.14
SQL> 
SQL> CREATE TABLE x AS
  2  SELECT * FROM all_objects;

Table created.

Elapsed: 00:00:02.81
SQL> 
SQL> INSERT INTO x SELECT * FROM x;

48148 rows created.

Elapsed: 00:00:00.56

SQL> INSERT INTO x SELECT * FROM x;

96296 rows created.

Elapsed: 00:00:00.28

SQL> 
SQL> commit;

Commit complete.

Elapsed: 00:00:00.01
SQL> 
SQL> exec show_space('X');
Free Blocks.............................
Total Blocks............................2816
Total Bytes.............................23068672
Total MBytes............................22
Unused Blocks...........................64
Unused Bytes............................524288
Last Used Ext FileId....................10
Last Used Ext BlockId...................377097
Last Used Block.........................64

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.09
SQL> 
SQL> set timing on
SQL> set autotrace traceonly
SQL> 
SQL> SELECT * FROM x;

192592 rows selected.

Elapsed: 00:00:47.64

Execution Plan
----------------------------------------------------------
Plan hash value: 2941724873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   230K|    44M|   604   (2)| 00:00:08 |
|   1 |  TABLE ACCESS FULL| X    |   230K|    44M|   604   (2)| 00:00:08 |
--------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement


Statistics
----------------------------------------------------------
          4  recursive calls
          0  db block gets
      97765  consistent gets
          0  physical reads
          0  redo size
   19987963  bytes sent via SQL*Net to client
    1059630  bytes received via SQL*Net from client
      96297  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     192592  rows processed

SQL> 
SQL> set arraysize 15
SQL> 
SQL> /

192592 rows selected.

Elapsed: 00:00:09.29

Execution Plan
----------------------------------------------------------
Plan hash value: 2941724873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   230K|    44M|   604   (2)| 00:00:08 |
|   1 |  TABLE ACCESS FULL| X    |   230K|    44M|   604   (2)| 00:00:08 |
--------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      15372  consistent gets
          0  physical reads
          0  redo size
    9389051  bytes sent via SQL*Net to client
     141614  bytes received via SQL*Net from client
      12841  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     192592  rows processed

SQL> 
SQL> 
SQL> set arraysize 500
SQL> 
SQL> /

192592 rows selected.

Elapsed: 00:00:01.87

Execution Plan
----------------------------------------------------------
Plan hash value: 2941724873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   230K|    44M|   604   (2)| 00:00:08 |
|   1 |  TABLE ACCESS FULL| X    |   230K|    44M|   604   (2)| 00:00:08 |
--------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       3072  consistent gets
          0  physical reads
          0  redo size
    7807393  bytes sent via SQL*Net to client
       4620  bytes received via SQL*Net from client
        387  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     192592  rows processed

SQL> 
SQL> 
SQL> ALTER TABLE x MOVE COMPRESS;

Table altered.

Elapsed: 00:00:05.71
SQL> 
SQL> exec show_space('X');
Free Blocks.............................
Total Blocks............................896
Total Bytes.............................7340032
Total MBytes............................7
Unused Blocks...........................65
Unused Bytes............................532480
Last Used Ext FileId....................8
Last Used Ext BlockId...................322825
Last Used Block.........................63

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.09
SQL> 
SQL> set arraysize 2
SQL> 
SQL> SELECT * FROM x;

192592 rows selected.

Elapsed: 00:00:47.84

Execution Plan
----------------------------------------------------------
Plan hash value: 2941724873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   225K|    43M|   190   (4)| 00:00:03 |
|   1 |  TABLE ACCESS FULL| X    |   225K|    43M|   190   (4)| 00:00:03 |
--------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement


Statistics
----------------------------------------------------------
          5  recursive calls
          0  db block gets
      96789  consistent gets
        809  physical reads
         72  redo size
   19987963  bytes sent via SQL*Net to client
    1059630  bytes received via SQL*Net from client
      96297  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     192592  rows processed

SQL> 
SQL> set arraysize 15
SQL> 
SQL> /

192592 rows selected.

Elapsed: 00:00:09.40

Execution Plan
----------------------------------------------------------
Plan hash value: 2941724873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   225K|    43M|   190   (4)| 00:00:03 |
|   1 |  TABLE ACCESS FULL| X    |   225K|    43M|   190   (4)| 00:00:03 |
--------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      13595  consistent gets
          0  physical reads
          0  redo size
    9389051  bytes sent via SQL*Net to client
     141614  bytes received via SQL*Net from client
      12841  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     192592  rows processed

SQL> 
SQL> 
SQL> set arraysize 500
SQL> 
SQL> /

192592 rows selected.

Elapsed: 00:00:01.92

Execution Plan
----------------------------------------------------------
Plan hash value: 2941724873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   225K|    43M|   190   (4)| 00:00:03 |
|   1 |  TABLE ACCESS FULL| X    |   225K|    43M|   190   (4)| 00:00:03 |
--------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       1199  consistent gets
          0  physical reads
          0  redo size
    7807393  bytes sent via SQL*Net to client
       4620  bytes received via SQL*Net from client
        387  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     192592  rows processed

Tom Kyte

Followup  

June 21, 2007 - 10:21 am UTC

so, do you have a question about the results?

as stated - increase array size and for this specific query (a full table scan), the logical IO's will go down. That is what you observed.

JDBC Prefetch Method

June 21, 2007 - 12:46 pm UTC

Reviewer: A reader

"prefetch is a "method" (also known as a subroutine in old speak) associated with connections and statements in jdbc that allow you to control the default array size - it is "10" by default with oracle jdbc drivers."

So I can simulate the effect that prefetch has on a SQL by running the same SQL in SQL*Plus using different arraysize?
Tom Kyte

Followup  

June 22, 2007 - 10:04 am UTC

umm, yes.

Number of Data Blocks vs Logical I/O

June 21, 2007 - 1:14 pm UTC

Reviewer: A reader

Oh ok. I think I understand it a little bit more now. To get a more accurate test of measuring the effect a compressed table has on reducing the logical I/Os, I'll need to take the arraysize out of the equation. Instead of running "SELECT * FROM x", I'll run "SELECT COUNT(*) FROM x" since it will always return only 1 record.

Is this the correct understanding?

The test below shows that the ratio of data blocks and logical I/O's between the uncompressed and compressed tables are very similar.


SQL> 
SQL> DROP TABLE x;

Table dropped.

Elapsed: 00:00:00.15
SQL> 
SQL> CREATE TABLE x AS
  2  SELECT * FROM all_objects;

Table created.

Elapsed: 00:00:02.75
SQL> 
SQL> INSERT INTO x SELECT * FROM x;

48150 rows created.

Elapsed: 00:00:00.54
SQL> INSERT INTO x SELECT * FROM x;

96300 rows created.

Elapsed: 00:00:00.28
SQL> 
SQL> commit;

Commit complete.

Elapsed: 00:00:00.03
SQL> 
SQL> exec show_space('X');
Free Blocks.............................
Total Blocks............................2816
Total Bytes.............................23068672
Total MBytes............................22
Unused Blocks...........................64
Unused Bytes............................524288
Last Used Ext FileId....................10
Last Used Ext BlockId...................377097
Last Used Block.........................64

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.09
SQL> 
SQL> set timing on
SQL> set autotrace traceonly
SQL> 
SQL> SELECT COUNT(*) FROM x;

Elapsed: 00:00:00.04

Execution Plan
----------------------------------------------------------
Plan hash value: 989401810

-------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Cost (%CPU)| Time     |
-------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |   600   (1)| 00:00:08 |
|   1 |  SORT AGGREGATE    |      |     1 |            |          |
|   2 |   TABLE ACCESS FULL| X    |   230K|   600   (1)| 00:00:08 |
-------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement


Statistics
----------------------------------------------------------
          6  recursive calls
          1  db block gets
       2799  consistent gets
          0  physical reads
        176  redo size
        412  bytes sent via SQL*Net to client
        385  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> 
SQL> 
SQL> 
SQL> ALTER TABLE x MOVE COMPRESS;

Table altered.

Elapsed: 00:00:05.65
SQL> 
SQL> exec show_space('X');
Free Blocks.............................
Total Blocks............................896
Total Bytes.............................7340032
Total MBytes............................7
Unused Blocks...........................63
Unused Bytes............................516096
Last Used Ext FileId....................8
Last Used Ext BlockId...................323721
Last Used Block.........................65

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.09
SQL> 
SQL> SELECT COUNT(*) FROm x;

Elapsed: 00:00:00.28

Execution Plan
----------------------------------------------------------
Plan hash value: 989401810

-------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Cost (%CPU)| Time     |
-------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |   187   (2)| 00:00:03 |
|   1 |  SORT AGGREGATE    |      |     1 |            |          |
|   2 |   TABLE ACCESS FULL| X    |   190K|   187   (2)| 00:00:03 |
-------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement


Statistics
----------------------------------------------------------
          5  recursive calls
          0  db block gets
        885  consistent gets
        811  physical reads
         72  redo size
        412  bytes sent via SQL*Net to client
        385  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> 
SQL> spool off

Tom Kyte

Followup  

June 22, 2007 - 10:06 am UTC

you have recursive SQL in there messing stuff up (that LIO is counted in your LIO with autotrace)

but - for a count(*) against a single table full scan - you would expect the LIO count to be equal to the allocated and used blocks of the table - so yes, you would expect a 100% relationship between the tablesize and the LIO's performed.

index takes more space than table

January 04, 2008 - 12:56 pm UTC

Reviewer: jasdeep from las vegas, usa

I have a question/observation that I need to sort out.
I have seen that the indexes on a big table take more space than the table itself.
eg: we have a table (9 columns) of 3 GB in size, a total of 5 indexes on different columns, and separate dedicated tablespaces for this table and for its indexes.
The tablespace for the table has taken 3 GB of space, and all the indexes in total have taken 11 GB.

Tom Kyte

Followup  

January 04, 2008 - 1:15 pm UTC

you have 5 indexes - any single index is not larger than the table, but in aggregate - sure, why wouldn't they be? why couldn't they be????

Why does this seem strange to you?


Remember, the index has the data being indexed (the key) PLUS the rowid in it. The table - just has the data.

This does not seem "strange" to me, you have a table with LOTS of indexes on it - each of these indexes could have multiple columns, plus the rowid, plus the fact that an index is a complex data structure.

Compression+partition+M-Views

April 09, 2008 - 9:37 pm UTC

Reviewer: MN from USA

Hi Tom,

I have a table which is partitioned by range on transaction date; all old partitions (previous quarters) get compressed on a timely basis. Are there any performance issues if I create MViews on this table? The table operates in a heavy-insert, no-update mode.

Regards,

Tom Kyte

Followup  

April 10, 2008 - 10:38 am UTC

... Are there any performance issues if I create MViews on this table? The table operates in a heavy-insert, no-update mode.
...

how in the world could anyone give a meaningful answer to that.


Strange value in column OPT_CMPR_PCTSAVE

January 13, 2009 - 10:28 am UTC

Reviewer: Loïc from Paris

Hi Tom,
I ran the analyze index XXX validate structure and I get strange values in the column INDEX_STATS.OPT_CMPR_PCTSAVE over 100 (255, 246...).

As I understand the content of this column, I can get the estimated size (of the index once compressed on the first INDEX_STATS.OPT_CMPR_COUNT columns) like that:

select BTREE_SPACE * (1-(OPT_CMPR_PCTSAVE/100)) from INDEX_STATS;

But following is what I get:

SQL> analyze index INN_CB_MSG_5 validate structure;
 
Index analyzed
 
SQL> select name, btree_space, opt_cmpr_count, opt_cmpr_pctsave, btree_space*(1-(opt_cmpr_pctsave/100)) as estimated_gain from index_stats;
 
NAME                           BTREE_SPACE OPT_CMPR_COUNT OPT_CMPR_PCTSAVE ESTIMATED_GAIN
------------------------------ ----------- -------------- ---------------- --------------
INN_CB_MSG_5                     134354968              4              252  -204219551.36
 
Index definition: 
create index INN_CB_MSG_5 on CB_MSG (CREATION_DATE, STATUS_TYPE, DIRECTION, TECHNICAL_TYPE);

Is it a problem of numeric overflow, modulo 100 or a bug? (I'm using version 9.2.0.8 on HP-UX).

Thanks for your insights!
Loïc

Tom Kyte

Followup  

January 16, 2009 - 3:50 pm UTC

looks like it is saying "it'll get larger", not smaller.

can you show us how to create the table (need datatypes) and describe to us the contents of the table now - eg: can I recreate this?

More information

January 19, 2009 - 11:55 am UTC

Reviewer: Loïc from Paris

Strangely (and sadly), I can't reproduce this for that given index.

I'm basing my tests upon a production database copy and with the
new *physical copy* it doesn't happen anymore on this index but
I've got this behavior for other indexes.

> looks like it is saying "it'll get larger", not smaller.

Okay, I'll test it (on an other index):

The following table is used to check for duplicate messages (rows) loaded via files (external tables).

create table EF_UNIQUE_MESSAGES
(
  BRANCH_DB_ID    NUMBER(15) not null,          -- location (Paris, London...)
  FILE_TYPE_DB_ID NUMBER(15) not null,          -- file type ID
  FUNCTIONAL_KEY  VARCHAR2(1000) not null,      -- functional key of the message (concatenated business fields)
  MSG_ID          NUMBER(15) not null,          -- message ID inside one loaded file (generally rownum)
  LINE_START      NUMBER(15) not null,          -- for multi-lines message, start line in the original file
  LINE_END        NUMBER(15) not null,          -- for multi-lines message, end line in the original file
  CREATION_DATE   DATE default sysdate not null -- field for purge of this table
)
tablespace DATA
  pctfree 10
  pctused 40
  initrans 1
  maxtrans 255
  storage
  (
    initial 64K
    minextents 1
    maxextents unlimited
  );

-- Unique key, we handle duplicates via ORA-00001 (PL/SQL bulk collect insert)
alter table EF_UNIQUE_MESSAGES
  add constraint PK_EF_UNIQUE_MESSAGES primary key (BRANCH_DB_ID, FILE_TYPE_DB_ID, FUNCTIONAL_KEY)
  using index 
  tablespace INDX
  pctfree 10
  initrans 2
  maxtrans 255
  storage
  (
    initial 64K
    minextents 1
    maxextents unlimited
  );


This table contains: 3,367,660 rows.

Repartition is as following:

SQL> SELECT BRANCH_DB_ID, FILE_TYPE_DB_ID, COUNT(*)
  2    FROM EF_UNIQUE_MESSAGES
  3   GROUP BY BRANCH_DB_ID, FILE_TYPE_DB_ID
  4  ORDER BY 3 DESC;
 
    BRANCH_DB_ID  FILE_TYPE_DB_ID   COUNT(*)
---------------- ---------------- ----------
               1              142    1182803
               1              147    1038128
               1              141     596084
               1              143     211418
               1              144     170839
               1              145     168388


Index statistics:

SQL> analyze index PK_EF_UNIQUE_MESSAGES validate structure;
 
Index analyzed

SQL> select name, btree_space, opt_cmpr_count, opt_cmpr_pctsave,
  2  btree_space*(1-(opt_cmpr_pctsave/100)) as estimated_gain from index_stats;
 
NAME                           BTREE_SPACE OPT_CMPR_COUNT OPT_CMPR_PCTSAVE ESTIMATED_GAIN
------------------------------ ----------- -------------- ---------------- --------------
PK_EF_UNIQUE_MESSAGES            321900268              2              248  -476412396.64


I'll now apply the recommendation of compressing only the first two columns (which seems good according to the OPT_CMPR_COUNT column):

SQL> alter table EF_UNIQUE_MESSAGES drop constraint PK_EF_UNIQUE_MESSAGES cascade;
 
Table altered

SQL> create unique index PK_EF_UNIQUE_MESSAGES on EF_UNIQUE_MESSAGES (BRANCH_DB_ID, FILE_TYPE_DB_ID, FUNCTIONAL_KEY) compress 2
  2    tablespace INDX
  3    pctfree 10
  4    initrans 2
  5    maxtrans 255
  6    storage
  7    (
  8      initial 64K
  9      minextents 1
 10      maxextents unlimited
 11    );
 
Index created
 
SQL> alter table EF_UNIQUE_MESSAGES
  2    add constraint PK_EF_UNIQUE_MESSAGES primary key (BRANCH_DB_ID, FILE_TYPE_DB_ID, FUNCTIONAL_KEY)
  3    using index
  4    tablespace INDX
  5    pctfree 10
  6    initrans 2
  7    maxtrans 255
  8    storage
  9    (
 10      initial 64K
 11      minextents 1
 12      maxextents unlimited
 13    );
 
Table altered

SQL> analyze index PK_EF_UNIQUE_MESSAGES validate structure;
 
Index analyzed

SQL> select name, btree_space, opt_cmpr_count, opt_cmpr_pctsave,
  2  btree_space*(1-(opt_cmpr_pctsave/100)) as estimated_gain from index_stats;
 
NAME                           BTREE_SPACE OPT_CMPR_COUNT OPT_CMPR_PCTSAVE ESTIMATED_GAIN
------------------------------ ----------- -------------- ---------------- --------------
PK_EF_UNIQUE_MESSAGES            224419180              2                0      224419180


According to this test, the resulting index uses 69.7% of the original index space.

Could the strange values:

OPT_CMPR_PCTSAVE ESTIMATED_GAIN
---------------- --------------
             248  -476412396.64


come from the fact that the table had many rows deleted thus the index is not 100% fresh and
the index validation process can't give accurate results?

Thanks for your time,
Loïc

Tom Kyte

Followup  

January 19, 2009 - 9:39 pm UTC

well, the compression here probably does nothing.

the first two columns - very small. the third - up to 1000 bytes - much larger.

the 70% "used space" is entirely due to the rebuild (10% pctfree reintroduced into each block, and each block is TEMPORARILY 90% full - very temporarily, I'm sure)

You'll delete and add more stuff over time and it'll go right back to about 50-70% full - most indexes are on randomly arriving data (yours is, I suspect, based on the columns in it - meaning you put stuff on the right, on the left, in the middle, and so on all of the time; it is not just increasing on the right like a sequence or a date would)

This index, just looking at it, would not really benefit from compression. Most of the data is in the 3rd almost unique field.

March 23, 2009 - 11:17 am UTC

Reviewer: Venkat from MO, USA

I have a partitioned IOT table that has about 850 million rows; the IOT is currently uncompressed. I want to compress the oldest partitions to save space. The old data is hardly used and no inserts happen against it.

I saw a mention above in your followup about exchanging the oldest partition with compressed data created as a table.

I am trying a test to create a small table that is IOT and trying to get the oldest partition to be compressed.

Definition of simple test table (partitioned, IOT)
==================================================
create table  test1
(
 col1 number not null,
 col2 date not null,
 col3 varchar2(30),
 constraint tstpk
 primary key (col1,col2)
)
ORGANIZATION INDEX NOCOMPRESS
tablespace TBS_200406 overflow tablespace TBS_200406
  PARTITION BY RANGE (COL1)
(
  PARTITION P_10 VALUES LESS THAN (10)
    LOGGING
    NOCOMPRESS
    TABLESPACE TBS_200406
    PCTFREE    10
    INITRANS   2
    MAXTRANS   255
    STORAGE    (
                INITIAL          64K
                MINEXTENTS       1
                MAXEXTENTS       2147483645
                BUFFER_POOL      DEFAULT
               )
  ,
and so on..... total of 10 partitions.
);
I inserted some data, about 100 rows per partition.

Simple move to compress does not work.
======================================
SQL> alter table test1 move compress ;
alter table test1 move compress
            *
ERROR at line 1:
ORA-28660: Partitioned Index-Organized table may not be MOVEd as a whole

Move partition level to compress - error
========================================
SQL> alter table test1 move partition p_10 tablespace tbs_200406 compress nologging parallel ;
alter table test1 move partition p_10 tablespace tbs_200406 compress nologging parallel
            *
ERROR at line 1:
ORA-28659: COMPRESS must be specified at object level first

creating a compressed temporary table and exchange partition with the temporary table (oldest partition)
==================================================
drop table vm.oldp1_comp
Table dropped.

create table
oldp1_comp(col1,  col2 , col3 , primary key(col1,col2))
organization index
parallel  (degree 2)
compress
overflow tablespace tbs_200406
as select * from vm.test1 partition (p_10)
   ;
Table created.

alter table oldp1_comp move compress
   ;
Table altered.

alter table test1 exchange partition p_10 with table oldp1_comp
                                                     *
ERROR at line 1:
ORA-28665: table and partition must have same compression attribute


Any ideas how I can go about this, or is this perhaps not possible?
Regards,
Venkat

Tom Kyte

Followup  

March 26, 2009 - 1:12 pm UTC

IOTs can only do index key compression - tables have "compression at the block level"; indexes compress by removing redundant leading-edge columns from the index key.

Your index key is (col1,col2) - col1 is a number (small)

Unless col1 repeats A LOT, I mean a huge "a lot", there will be no compression here.



ops$tkyte%ORA10GR2> create table t1
  2  ( col1, col2, data, constraint NOCOMPRESS_PK primary key(col1,col2))
  3  organization index
  4  as
  5  select 1, sysdate+rownum, rpad('*',20,'*')
  6    from all_objects
  7  /

Table created.

ops$tkyte%ORA10GR2> analyze index NOCOMPRESS_PK validate structure;

Index analyzed.

ops$tkyte%ORA10GR2> create table t2 as select * from index_stats;

Table created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> drop table t1 purge;

Table dropped.

ops$tkyte%ORA10GR2> create table t1
  2  ( col1, col2, data, constraint COMPRESS_1_PK primary key(col1,col2))
  3  organization index
  4  compress 1
  5  as
  6  select 1, sysdate+rownum, rpad('*',20,'*')
  7    from all_objects
  8  /

Table created.

ops$tkyte%ORA10GR2> analyze index compress_1_pk validate structure;

Index analyzed.

ops$tkyte%ORA10GR2> insert into t2 select * from index_stats;

1 row created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> drop table t1 purge;

Table dropped.

ops$tkyte%ORA10GR2> create table t1
  2  ( col1, col2, data, constraint compress_pk primary key(col1,col2))
  3  organization index
  4  compress
  5  as
  6  select 1, sysdate+rownum, rpad('*',20,'*')
  7    from all_objects
  8  /

Table created.

ops$tkyte%ORA10GR2> analyze index compress_pk validate structure;

Index analyzed.

ops$tkyte%ORA10GR2> insert into t2 select * from index_stats;

1 row created.


ops$tkyte%ORA10GR2> select name, lf_rows, lf_blks, lf_rows_len, lf_blk_len from t2;

NAME                              LF_ROWS    LF_BLKS LF_ROWS_LEN LF_BLK_LEN
------------------------------ ---------- ---------- ----------- ----------
NOCOMPRESS_PK                       50021        272     1950819       7996
COMPRESS_1_PK                       50022        252     1800792       7992
COMPRESS_PK                         50022        252     1800792       7992




no real difference.... Now, if they were tables, it would be a different story:

ops$tkyte%ORA10GR2> drop table t1 purge;

Table dropped.

ops$tkyte%ORA10GR2> drop table t2 purge;

Table dropped.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create table t1
  2  ( col1, col2, data, constraint t1_pk primary key(col1,col2))
  3  as
  4  select 1, sysdate+rownum, rpad('*',20,'*')
  5    from all_objects
  6  /

Table created.

ops$tkyte%ORA10GR2> create table t2
  2  ( col1, col2, data, constraint t2_pk primary key(col1,col2))
  3  compress
  4  as
  5  select 1, sysdate+rownum, rpad('*',20,'*')
  6    from all_objects
  7  /

Table created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> exec show_space( 'T1' );
...
Full Blocks        .....................             257
Total Blocks............................             384
Total Bytes.............................       3,145,728

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2> exec show_space( 'T2' );
...
Full Blocks        .....................              95
Total Blocks............................             104
Total Bytes.............................         851,968

PL/SQL procedure successfully completed.


Thanks a lot and Regards

March 26, 2009 - 3:59 pm UTC

Reviewer: Venkat from MO, USA

Appreciate the response. Thanks a lot for the enlightening answer and the example comparing the index with the table.

Regards
Venkat

compress

April 20, 2009 - 3:32 am UTC

Reviewer: A reader

greetings thomas,

and thanks like always.

1) if i compress a table or an index and it becomes bigger, what are the side effects besides becoming bigger?
Example: does that affect future dml statements, etc.?

2) if the table or the index becomes bigger, how do i make it as it was?
a) i have tried to uncompress.
b) i have tried to uncompress and then compress it.
but a and b have failed.

3) i have a partitioned table that is frequently inserted into, and then compressed. is there any way to find which partitions need compression or not?

please advice.
Tom Kyte

Followup  

April 21, 2009 - 12:58 pm UTC

1) it is just bigger, that is all. Not sure what "does that affect future dml statements" could be looking for. They (the segments) are just bigger. The net effect of compressing is that the object is stored in the format compressed segments use - if that is smaller, so be it; if it is larger, so be it. There are not any more or fewer 'things' that happen because it is bigger (or smaller).

2) my car won't start, I tried, but it failed.

When you answer my question, I'll be glad to answer yours. We have exactly the same level of detail available to us.


You tell us *NOTHING* about what you did or did not do. You describe NOTHING about the process you went through. You do not tell us what 'FAIL' means to you. No one, repeat no one could answer you.


here is my example showing table/index before compression, after compression, and after rebuilding non-compressed (these are the steps that show you how to 'uncompress' something):

ops$tkyte%ORA10GR2> create table t as select * from all_objects;

Table created.

ops$tkyte%ORA10GR2> create index t_idx on t(timestamp);

Index created.

ops$tkyte%ORA10GR2> select segment_name,blocks from dba_segments where owner = user and segment_name in ('T','T_IDX');

SEGMENT_NAME                       BLOCKS
------------------------------ ----------
T                                     768
T_IDX                                 256

ops$tkyte%ORA10GR2> alter table t compress;

Table altered.

ops$tkyte%ORA10GR2> alter table t move;

Table altered.

ops$tkyte%ORA10GR2> alter index t_idx rebuild compress 1;

Index altered.

ops$tkyte%ORA10GR2> select segment_name,blocks from dba_segments where owner = user and segment_name in ('T','T_IDX');

SEGMENT_NAME                       BLOCKS
------------------------------ ----------
T                                     256
T_IDX                                  96

ops$tkyte%ORA10GR2> alter table t nocompress pctfree 10;

Table altered.

ops$tkyte%ORA10GR2> alter table t move;

Table altered.

ops$tkyte%ORA10GR2> alter index t_idx rebuild nocompress;

Index altered.

ops$tkyte%ORA10GR2> select segment_name,blocks from dba_segments where owner = user and segment_name in ('T','T_IDX');

SEGMENT_NAME                       BLOCKS
------------------------------ ----------
T                                     768
T_IDX                                 256




3) none of them "need" compression - it is a choice. You would know if a partition would be a candidate for compression by understanding the application and how it uses the partitions. If you have partitions that are range partitioned by date and after a while you don't insert into a partition (like an audit trail for example), then compressing the older read only/read mostly partitions would make sense and you would KNOW based on the application itself.

Until 11g, when you have OLTP compression (inserts via regular insert values can be compressed, not just insert /*+ APPEND */), compressing a partition that is constantly inserted into probably doesn't make sense, as it would be an offline operation (it takes a MOVE and then a rebuild of all indexes).
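For a heap-organized (non-IOT) partitioned table, the offline path described above can be sketched like this - the table, partition, and index names are hypothetical:

```sql
-- Hedged sketch; SALES, P_2008_Q1 and the index names are placeholders.
-- MOVE ... COMPRESS rewrites the partition's blocks in compressed format
-- and marks the dependent local index partition and any global indexes
-- UNUSABLE, so they must be rebuilt afterwards.
alter table sales move partition p_2008_q1 compress;

alter index sales_local_idx rebuild partition p_2008_q1;  -- local index partition
alter index sales_global_idx rebuild;                     -- global index
```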

compression in 11g

March 14, 2010 - 4:45 am UTC

Reviewer: Nicksa from sa

Hi Tom,
I have tested the 11g oltp compression and am not really happy with the results compared to the claims in the oracle pdf document, where it talks about 2-3x rates.
My results:
1. I did not get more than 30% compression on avg. (The same tables yield 50% compression on avg from basic compression: alter table move compress...)
2. queries performing full scans also take a bit longer on oltp compressed tables.
3. likewise cpu also increased a bit.
I WONDER who would use oltp compression at a) the additional licensing cost of $10k per cpu, b) the additional machine cpu cost (which increases oracle licensing), c) full scans taking more time.
I mean this is a disastrous feature unless oracle proves it works for most cases.
Anyway, my question after reading the whole thread is:
I have a warehouse type database which i would like to compress using the old conv compression method. my db is 10g.

1. Can i run one command to put the whole schema (or the relevant tablespaces) in compress mode, rather than go table by table?

2. If the above is possible, will dml transactions against compressed data work fine (i know, slower and with extra cpu)?

3. will the new data be automatically compressed or do i have to change the source code of the inserts and updates?

4. If the above is not true, that means I will pick the large tables from the database and compress them manually every month end (the previous month's partitions). If i do that, will my (global) indexes become unusable so that i have to rebuild them?

any other issues? basically i have a warehouse type db of 900 gb and i wanted to save on space.
cheers

Tom Kyte

Followup  

March 15, 2010 - 11:01 am UTC

1) it is a function of the data and how it arrives. There is a trade off between getting maximum compression (direct path) and getting very good compression without impacting performance in a huge way.

Here is a quick example:

ops$tkyte%ORA11GR2> create table all_objects_unload
  2  organization external
  3  ( type oracle_datapump
  4    default directory TMP
  5    location( 'allobjects.dat' )
  6  )
  7  as
  8  select a.*, dbms_random.random rnd from dba_objects a, (select 1 from dual connect by level <= 5);

Table created.



some data, not huge, but some data.

ops$tkyte%ORA11GR2> create table uncompressed
  2  pctfree 0
  3  as
  4  select *
  5    from all_objects_unload
  6   order by owner, object_type, object_name;

Table created.

ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> exec dbms_stats.gather_table_stats( user, 'UNCOMPRESSED' );

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> clear screen
ops$tkyte%ORA11GR2> create table compressed_direct
  2  COMPRESS
  3  as
  4  select *
  5    from all_objects_unload
  6   order by owner, object_type, object_name;

Table created.

ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> exec dbms_stats.gather_table_stats( user, 'COMPRESSED_DIRECT' );

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> clear screen
ops$tkyte%ORA11GR2> create table compressed_conventional
  2  COMPRESS FOR ALL OPERATIONS
  3  pctfree 0
  4  as
  5  select *
  6    from all_objects_unload
  7   where 1=0;

Table created.

ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> begin
  2      for x in (select * from all_objects_unload
  3                 order by owner, object_type, object_name )
  4      loop
  5          insert into compressed_conventional values X;
  6      end loop;
  7      commit;
  8  end;
  9  /

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> exec dbms_stats.gather_table_stats( user, 'COMPRESSED_CONVENTIONAL' );




load it all and measure:

ops$tkyte%ORA11GR2> select uncompressed,
  2         compressed_direct,
  3         to_char( round(compressed_direct/uncompressed*100,2), 'fm999.00')||'%' cd_pct,
  4         compressed_conventional,
  5         to_char( round(compressed_conventional/uncompressed*100,2), 'fm999.00')||'%' cc_pct
  6    from (
  7  select max(decode(table_name,'COMPRESSED_DIRECT',blocks,null)) compressed_direct,
  8         max(decode(table_name,'UNCOMPRESSED',blocks,null)) uncompressed,
  9         max(decode(table_name,'COMPRESSED_CONVENTIONAL',blocks,null)) compressed_conventional
 10    from user_tables
 11   where table_name in ( 'COMPRESSED_DIRECT', 'COMPRESSED_CONVENTIONAL', 'UNCOMPRESSED' )
 12         )
 13  /

UNCOMPRESSED COMPRESSED_DIRECT CD_PCT   COMPRESSED_CONVENTIONAL CC_PCT
------------ ----------------- -------- ----------------------- --------
        5165              1087 21.05%                      1252 24.24%


direct path - almost (but not quite) 5:1; conventional path - more than 4:1. We sorted by owner, object_type, object_name on purpose - to achieve pretty good compression AND to reduce the resources needed to retrieve data by owner or by (owner, object_type) - if that was what we needed.


do it again, loading the data randomly sorted - that is, not organized by anything - just randomly arriving data:

ops$tkyte%ORA11GR2> create table compressed_direct
  2  COMPRESS
  3  as
  4  select *
  5    from all_objects_unload
  6   order by rnd;
Table created.

ops$tkyte%ORA11GR2> exec dbms_stats.gather_table_stats( user, 'COMPRESSED_DIRECT' );
PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> create table compressed_conventional
  2  COMPRESS FOR ALL OPERATIONS
  3  pctfree 0
  4  as
  5  select *
  6    from all_objects_unload
  7   where 1=0;
Table created.

ops$tkyte%ORA11GR2> begin
  2      for x in (select * from all_objects_unload order by rnd )
  3      loop
  4          insert into compressed_conventional values X;
  5      end loop;
  6      commit;
  7  end;
  8  /
PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> exec dbms_stats.gather_table_stats( user, 'COMPRESSED_CONVENTIONAL' );

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> clear screen
ops$tkyte%ORA11GR2> select uncompressed,
  2         compressed_direct,
  3         to_char( round(compressed_direct/uncompressed*100,2), 'fm999.00')||'%' cd_pct,
  4         compressed_conventional,
  5         to_char( round(compressed_conventional/uncompressed*100,2), 'fm999.00')||'%' cc_pct
  6    from (
  7  select max(decode(table_name,'COMPRESSED_DIRECT',blocks,null)) compressed_direct,
  8         max(decode(table_name,'UNCOMPRESSED',blocks,null)) uncompressed,
  9         max(decode(table_name,'COMPRESSED_CONVENTIONAL',blocks,null)) compressed_conventional
 10    from user_tables
 11   where table_name in ( 'COMPRESSED_DIRECT', 'COMPRESSED_CONVENTIONAL', 'UNCOMPRESSED' )
 12         )
 13  /

UNCOMPRESSED COMPRESSED_DIRECT CD_PCT   COMPRESSED_CONVENTIONAL CC_PCT
------------ ----------------- -------- ----------------------- --------
        5165              3003 58.14%                      3142 60.83%


About 1.7:1 compression - in both cases - not bad. Given the nature of the data.



2) A bit longer in total? Give us an example (numbers, metrics, tkprofs...)

3) And the time for physical IOs did what? What was the net/net - better or worse? The goal isn't "reduce all resources", the goal is "reduce the amount of storage we take". I've seen a small cpu hit for direct path compression and a small cpu hit for conventional path compression - almost always overshadowed by a decrease in overall runtime due to decreased physical IO.





part 2, assuming by "old conv compression" you meant "old direct path compression", since conventional path compression is "new"

1) no, it is segment by segment
2) yes, you can do dml to compressed data, always have been able to.
3) only direct path operations - direct path load, alter table t move, create table as select, insert /*+ append */ as select - would load the data compressed. dml compression is new.


and at the end, you yourself said why people would use an option to the database - in order to "save on space" and the cost associated with that. In your case, you have a pretty small database to start with - there might not be much in the way of savings on such a small sized system.


compressed table partition query performance

January 05, 2011 - 4:43 pm UTC

Reviewer: Dan from NJ

Hi Tom:
I am seeing performance difference between the same table in two different schema.

Schema1 has a partitioned table in which all table partition/index partitions are compressed. Query performance looks good in schema1.

I took data pump export of this same table from schema1 and imported into schema2. After import, this table is identical in schema2 in terms of number of rows/segment size/compression/stats etc.

But when i run this query, it takes longer in SCHEMA2.
Query against SCHEMA1 runs fast.

I see "high consistent gets" in the case of SCHEMA2 than SCHEMA1. I already checked the partitioned are compressed as part of data pump import.

I am not able to figure what else is causing this slowness.
Thanks for your inputs.

Please find below the TKPROF report on both cases:

===================================
Query against SCHEMA1 (running ok):
===================================

SELECT /*+ ROWID(TRDPS_CED_FAC_PRCS) */ *
FROM RIMS.TRDPS_CED_FAC_PRCS
WHERE RIMS.TRDPS_CED_FAC_PRCS.rowid >= 'AAASpgAAGAACGnFAAA' and RIMS.TRDPS_CED_FAC_PRCS.rowid < 'AAASpgAAGAACGoFAAA'
AND (RIMS_DT<=to_date('20101221', 'YYYYMMDD')
AND ((TRANSACTION_TYP_CD IN (18, 22, 29, 30, 36 )
AND GL_EFFECTIVE_DT=to_date('20101201', 'YYYYMMDD'))
OR (TRANSACTION_TYP_CD IN (15, 21, 23, 31, 32, 33, 44, 34 )
AND GL_EFFECTIVE_DT>=to_date('20070701', 'YYYYMMDD')
AND GL_EFFECTIVE_DT<=to_date('20101201', 'YYYYMMDD')))
AND REIN_DIRECTION_CD='CED'
AND MIS780_ASCO_CODE='11'
AND GL_EFFECTIVE_DT='01-MAY-2008'
AND DOCUMENT_TYPE_CD='MCH'
AND POLICY_NO='0003474937'
AND transaction_typ_cd IN (31, 32, 33 ))

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.01 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.07 0.08 61 535 0 2
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.08 0.10 61 535 0 2

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 92

Rows Row Source Operation
------- ---------------------------------------------------
2 PARTITION RANGE ALL PARTITION: 1 13 (cr=535 pr=61 pw=0 time=54712 us)
2 TABLE ACCESS BY ROWID RANGE TRDPS_CED_FAC_PRCS PARTITION: 1 13 (cr=535 pr=61 pw=0 time=54733 us)


Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 2 0.00 0.00
SQL*Net more data to client 4 0.00 0.00
db file scattered read 5 0.00 0.01
SQL*Net message from client 2 10.02 10.03
********************************************************************************

====================================
Query against SCHEMA2 (takes longer):
=====================================

SELECT /*+ ROWID(TRDPS_CED_FAC_PRCS) */ *
FROM RIMS.TRDPS_CED_FAC_PRCS
WHERE RIMS.TRDPS_CED_FAC_PRCS.rowid >= 'AAASpgAAGAACGnFAAA' and RIMS.TRDPS_CED_FAC_PRCS.rowid < 'AAASpgAAGAACGoFAAA'
AND (RIMS_DT<=to_date('20101221', 'YYYYMMDD')
AND ((TRANSACTION_TYP_CD IN (18, 22, 29, 30, 36 )
AND GL_EFFECTIVE_DT=to_date('20101201', 'YYYYMMDD'))
OR (TRANSACTION_TYP_CD IN (15, 21, 23, 31, 32, 33, 44, 34 )
AND GL_EFFECTIVE_DT>=to_date('20070701', 'YYYYMMDD')
AND GL_EFFECTIVE_DT<=to_date('20101201', 'YYYYMMDD')))
AND REIN_DIRECTION_CD='CED'
AND MIS780_ASCO_CODE='11'
AND GL_EFFECTIVE_DT='01-MAY-2008'
AND DOCUMENT_TYPE_CD='MCH'
AND POLICY_NO='0003474937'
AND transaction_typ_cd IN (31, 32, 33 ))

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.01 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.07 0.08 61 535 0 2
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.08 0.10 61 535 0 2

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 92

Rows Row Source Operation
------- ---------------------------------------------------
2 PARTITION RANGE ALL PARTITION: 1 13 (cr=535 pr=61 pw=0 time=54712 us)
2 TABLE ACCESS BY ROWID RANGE TRDPS_CED_FAC_PRCS PARTITION: 1 13 (cr=535 pr=61 pw=0 time=54733 us)


Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 2 0.00 0.00
SQL*Net more data to client 4 0.00 0.00
db file scattered read 5 0.00 0.01
SQL*Net message from client 2 10.02 10.03
********************************************************************************

Tom Kyte

Followup  

January 06, 2011 - 7:06 am UTC

according to what you just posted, they are identical.

compressed table and ROWID Hint

January 05, 2011 - 10:52 pm UTC

Reviewer: Dan from NJ

Hi Tom:

I figured out the problem. The query against SCHEMA1 was using a small ROWID range whereas the query against SCHEMA2 was using a large rowid range, which is why that query was consuming more logical io and more time.

I have question about the query using ROWID hint.

a) Is ROWID Hint still supported in 10gR2 and above ?

The below note says: ROWID is deprecated since 10g Release 1.
Refer the following note: QREF: SQL Statement HINTS [ID 29236.1]

b) The query in the above post with the /*+ ROWID */ hint returns wrong data (that is, -337.2) when the data actually is null. If we remove the /*+ ROWID */ hint and run the query, we get the correct output (that is, NULL) for the column. The actual column datatype is NUMBER(18,4).

Do you know of any issue with using ROWID hint ?

I actually dumped the column data using DUMP(column); it shows NULL.

Thanks for your inputs.

Tom Kyte

Followup  

January 06, 2011 - 7:48 am UTC

a) I fail to see any difference between the two above.

the rowid hint probably still works, but would not be supported, as per the note


b) need a test case to reproduce with.

Table and key compression

September 02, 2011 - 8:54 am UTC

Reviewer: A reader

We have a lot of focus on being able to save space in our different dbs.
I would like to suggest making use of table and index compression.
My current understanding is that key compression is all good, while table
compression might cause slow performance.
We do a lot of table reorgs and index rebuilds just to save space. I know
your general opinion is against them. But do you think in these
circumstances key compression and table compression are good?
Key compression is a 'rare' always-good, right? (never say always)
Tom Kyte

Followup  

September 02, 2011 - 9:54 am UTC

Never say never
never say always
I always say...


Index key compression is in general a positive thing (analyze index validate structure and review OPT_CMPR_COUNT and OPT_CMPR_PCTSAVE in index_stats to see if it makes sense - beware, that validate structure will LOCK the entire index!).
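That workflow, with a hypothetical index name, might look like:

```sql
-- Hedged sketch; T_IDX is a placeholder. ANALYZE ... VALIDATE STRUCTURE
-- locks the index while it runs, so do this in a quiet window.
analyze index t_idx validate structure;

-- index_stats holds one row, describing the index just validated
select opt_cmpr_count, opt_cmpr_pctsave from index_stats;

-- if the projected saving is worthwhile, rebuild using the suggested
-- prefix length (here assuming OPT_CMPR_COUNT came back as 2)
alter index t_idx rebuild compress 2;
```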

It does make the index structure more complex, so it could possibly make retrieval take a bit more cpu - however, since the index is smaller, you have fewer blocks to process in general, so this is usually a fair and even tradeoff.

It does put more leaf rows per block - you might find yourself hitting ITL waits (v$segment_statistics will tell you) more than you did before (rebuild with a higher initrans - which of course will consume more space). You might find yourself with higher buffer busy waits (contention).
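A quick way to check for that contention on a given segment - with a hypothetical index name, and assuming statistics_level is TYPICAL or ALL so segment statistics are collected:

```sql
-- Hedged sketch; T_IDX is a placeholder index name.
select statistic_name, value
  from v$segment_statistics
 where owner = user
   and object_name = 'T_IDX'
   and statistic_name in ('ITL waits', 'buffer busy waits');
```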


Table compression comes in two flavors - in 9iR2 EE and above you have 'basic' compression. This only works with direct path operations (alter table t move, create table as select, insert /*+ append */, sqlldr direct=y). Any updates or conventional-path inserts will result in non-compressed data.

It will make creation of the data take longer - since it uses CPU to do the compression - but at runtime, the performance of the compressed table will be at least as good and usually *better* than the uncompressed version.


In 11gR1 and above you have a database option - the advanced compression option. It works with normal inserts/updates as well as direct path operations and is implemented in a manner that does not noticeably impact raw insert/update performance (maybe not at all). Again - retrieval will be as good if not better after compressing.


Compression would definitely be something to consider, benchmark and evaluate if you are heavily space conscious.

compression ratio achieved.

November 09, 2011 - 11:43 pm UTC

Reviewer: Amir Riaz

Hi Tom,

I have a few tables with 250 million rows; the average batch size to be loaded is 25 GB/day. I have to keep 3 years of historical data, so:

per month the load is: 25 x 30 = 750 GB
per year: roughly 9 TB
with 3 years of data: roughly 27 TB.

We have resolved our performance problems, so performance is not an issue. But now we are out of storage and I want to compress the data.

Oracle 11gR2 has OLTP compression and data warehouse compression (basic compression); I think I have to go with basic compression. But what compression ratios can I achieve with basic compression and OLTP compression, and how can I increase the compression ratio?
Tom Kyte

Followup  

November 10, 2011 - 1:17 pm UTC

You can use the compression advisor in DBMS_COMPRESSION to see how compressible your data might be.
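A minimal sketch of calling the advisor in 11gR2 follows - the table name and scratch tablespace are placeholders, and the parameter names can differ slightly between releases, so check the PL/SQL packages reference for your version:

```sql
-- Hedged sketch; MY_TABLE and USERS are placeholders.
set serveroutput on
declare
  l_blkcnt_cmp    pls_integer;
  l_blkcnt_uncmp  pls_integer;
  l_row_cmp       pls_integer;
  l_row_uncmp     pls_integer;
  l_cmp_ratio     number;
  l_comptype_str  varchar2(100);
begin
  dbms_compression.get_compression_ratio(
    scratchtbsname => 'USERS',       -- scratch tablespace for the sampling
    ownname        => user,
    tabname        => 'MY_TABLE',
    partname       => null,
    comptype       => dbms_compression.comp_for_oltp,
    blkcnt_cmp     => l_blkcnt_cmp,
    blkcnt_uncmp   => l_blkcnt_uncmp,
    row_cmp        => l_row_cmp,
    row_uncmp      => l_row_uncmp,
    cmp_ratio      => l_cmp_ratio,
    comptype_str   => l_comptype_str );
  dbms_output.put_line( 'estimated compression ratio: ' || l_cmp_ratio );
end;
/
```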


To increase the compression ratio - you can presort your information. For example (all_objects_unload in this example is an external table that is just a copy of all_objects for demonstration purposes)


ops$tkyte%ORA11GR2> create table all_objects_unload
  2  organization external
  3  ( type oracle_datapump
  4    default directory TMP
  5    location( 'allobjects.dat' )
  6  )
  7  as
  8  select a.*, dbms_random.random rnd from stage a;

Table created.

ops$tkyte%ORA11GR2> !ls -l /tmp/allobjects.dat
-rw-rw---- 1 ora11gr2 ora11gr2 7938048 Nov 10 14:11 /tmp/allobjects.dat


ops$tkyte%ORA11GR2> create table uncompressed
  2  pctfree 0
  3  as
  4  select *
  5    from all_objects_unload
  6   order by owner, object_type, object_name;

Table created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> exec dbms_stats.gather_table_stats( user, 'UNCOMPRESSED' );

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> create table compressed_direct
  2  COMPRESS
  3  as
  4  select *
  5    from all_objects_unload
  6   order by owner, object_type, object_name;

Table created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> exec dbms_stats.gather_table_stats( user, 'COMPRESSED_DIRECT' );

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> create table compressed_conventional
  2  COMPRESS FOR ALL OPERATIONS
  3  pctfree 0
  4  as
  5  select *
  6    from all_objects_unload
  7   where 1=0;

Table created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> begin
  2      for x in (select * from all_objects_unload
  3                 order by owner, object_type, object_name )
  4      loop
  5          insert into compressed_conventional values X;
  6      end loop;
  7      commit;
  8  end;
  9  /

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> exec dbms_stats.gather_table_stats( user, 'COMPRESSED_CONVENTIONAL' );

PL/SQL procedure successfully completed.


ops$tkyte%ORA11GR2> select uncompressed,
  2         compressed_direct,
  3         to_char( round(compressed_direct/uncompressed*100,2), 'fm999.00')||'%' cd_pct,
  4         compressed_conventional,
  5         to_char( round(compressed_conventional/uncompressed*100,2), 'fm999.00')||'%' cc_pct
  6    from (
  7  select max(decode(table_name,'COMPRESSED_DIRECT',blocks,null)) compressed_direct,
  8         max(decode(table_name,'UNCOMPRESSED',blocks,null)) uncompressed,
  9         max(decode(table_name,'COMPRESSED_CONVENTIONAL',blocks,null)) compressed_conventional
 10    from user_tables
 11   where table_name in ( 'COMPRESSED_DIRECT', 'COMPRESSED_CONVENTIONAL', 'UNCOMPRESSED' )
 12         )
 13  /

UNCOMPRESSED COMPRESSED_DIRECT CD_PCT   COMPRESSED_CONVENTIONAL CC_PCT
------------ ----------------- -------- ----------------------- --------
        1028               459 44.65%                       496 48.25%





so, out of the box - without thinking about the data - just loading it "as is" in this case - we see about a 2:1 (a little more) compression ratio on the data...


ops$tkyte%ORA11GR2> drop table compressed_direct;

Table dropped.

ops$tkyte%ORA11GR2> create table compressed_direct
  2  COMPRESS
  3  as
  4  select *
  5    from all_objects_unload
  6   order by rnd;

Table created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> exec dbms_stats.gather_table_stats( user, 'COMPRESSED_DIRECT' );

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> drop table compressed_conventional;

Table dropped.

ops$tkyte%ORA11GR2> create table compressed_conventional
  2  COMPRESS FOR ALL OPERATIONS
  3  pctfree 0
  4  as
  5  select *
  6    from all_objects_unload
  7   where 1=0;

Table created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> begin
  2      for x in (select * from all_objects_unload order by rnd )
  3      loop
  4          insert into compressed_conventional values X;
  5      end loop;
  6      commit;
  7  end;
  8  /

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> exec dbms_stats.gather_table_stats( user, 'COMPRESSED_CONVENTIONAL' );

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> select uncompressed,
  2         compressed_direct,
  3         to_char( round(compressed_direct/uncompressed*100,2), 'fm999.00')||'%' cd_pct,
  4         compressed_conventional,
  5         to_char( round(compressed_conventional/uncompressed*100,2), 'fm999.00')||'%' cc_pct
  6    from (
  7  select max(decode(table_name,'COMPRESSED_DIRECT',blocks,null)) compressed_direct,
  8         max(decode(table_name,'UNCOMPRESSED',blocks,null)) uncompressed,
  9         max(decode(table_name,'COMPRESSED_CONVENTIONAL',blocks,null)) compressed_conventional
 10    from user_tables
 11   where table_name in ( 'COMPRESSED_DIRECT', 'COMPRESSED_CONVENTIONAL', 'UNCOMPRESSED' )
 12         )
 13  /

UNCOMPRESSED COMPRESSED_DIRECT CD_PCT   COMPRESSED_CONVENTIONAL CC_PCT
------------ ----------------- -------- ----------------------- --------
        1028               593 57.68%                       622 60.51%





if the data arrives all scrambled up - the compression drops off a bit. We compress by removing repeating data on a block - by scrambling the data in this case, we've decreased the amount of repeating data that arrives together.

However, if we take a second to understand our data - we might be able to help it compress more. I know there is a varchar2(19) column in this table that is NOT NULL and has relatively low distinct cardinality - if I were to sort by that as I loaded - it would repeat a lot. It is wide and highly repetitive - best for compressing:


ops$tkyte%ORA11GR2> drop table compressed_direct;

Table dropped.

ops$tkyte%ORA11GR2> create table compressed_direct
  2  COMPRESS
  3  as
  4  select *
  5    from all_objects_unload
  6   order by timestamp, owner, object_type, status;

Table created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> exec dbms_stats.gather_table_stats( user, 'COMPRESSED_DIRECT' );

PL/SQL procedure successfully completed.


ops$tkyte%ORA11GR2> drop table compressed_conventional;

Table dropped.

ops$tkyte%ORA11GR2> create table compressed_conventional
  2  COMPRESS FOR ALL OPERATIONS
  3  pctfree 0
  4  as
  5  select *
  6    from all_objects_unload
  7   where 1=0;

Table created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> begin
  2      for x in (select * from all_objects_unload
  3                     order by timestamp, owner, object_type, status )
  4      loop
  5          insert into compressed_conventional values X;
  6      end loop;
  7      commit;
  8  end;
  9  /

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> exec dbms_stats.gather_table_stats( user, 'COMPRESSED_CONVENTIONAL' );

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> select uncompressed,
  2         compressed_direct,
  3         to_char( round(compressed_direct/uncompressed*100,2), 'fm999.00')||'%' cd_pct,
  4         compressed_conventional,
  5         to_char( round(compressed_conventional/uncompressed*100,2), 'fm999.00')||'%' cc_pct
  6    from (
  7  select max(decode(table_name,'COMPRESSED_DIRECT',blocks,null)) compressed_direct,
  8         max(decode(table_name,'UNCOMPRESSED',blocks,null)) uncompressed,
  9         max(decode(table_name,'COMPRESSED_CONVENTIONAL',blocks,null)) compressed_conventional
 10    from user_tables
 11   where table_name in ( 'COMPRESSED_DIRECT', 'COMPRESSED_CONVENTIONAL', 'UNCOMPRESSED' )
 12         )
 13  /

UNCOMPRESSED COMPRESSED_DIRECT CD_PCT   COMPRESSED_CONVENTIONAL CC_PCT
------------ ----------------- -------- ----------------------- --------
        1028               428 41.63%                       496 48.25%






So, if your main goal is "compress it as much as possible", consider sorting it. I say consider because you'll now have "declustered" some of the data if it was naturally clustered before - you'll have rearranged the rows in order on disk - and if you are using indexes to retrieve the data - you might be increasing the time it takes to retrieve that data.
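To pick a sort column for this, one could start from the optimizer statistics - wide columns with few distinct values compress best when loaded sorted. A hypothetical sketch (the table name is taken from the example above; ordering by num_distinct/avg_col_len as a heuristic is my suggestion, not a documented recommendation):

```sql
-- candidate sort keys for compression: wide, repetitive, mostly non-null
select column_name, num_distinct, avg_col_len, num_nulls
  from user_tab_columns
 where table_name = 'ALL_OBJECTS_UNLOAD'
 order by num_distinct asc, avg_col_len desc;
```

Columns near the top of that list (low num_distinct, high avg_col_len) are the ones most likely to pay off in an ORDER BY during the load.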

row length

December 20, 2011 - 8:31 am UTC

Reviewer: lh from finland

Hi

I looked at a partitioned table where most of the partitions were compressed by direct path compression.

The rowlength in the statistics looked like the rowlength of an uncompressed row.

Is rowlength used by the optimizer when making decisions about the access path? How exactly does compression affect the optimizer? Only by changing the number of blocks and the indexes' clustering factor?


Tom Kyte

Followup  

December 20, 2011 - 9:07 am UTC

rowlength affects the amount of data flowing from one step in a plan to the next - which in turn will affect the potential amount of memory/temp space a query might need - which can affect the plan.


Compression would affect the optimizer by reducing the number of blocks in the table - the smaller the table, the cheaper a full scan will be, for example:

ops$tkyte%ORA11GR2> create table t as select * from all_objects;

Table created.

ops$tkyte%ORA11GR2> exec dbms_stats.gather_table_stats( user, 'T' );

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> set autotrace traceonly explain
ops$tkyte%ORA11GR2> select * from t;

Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 72219 |  6911K|   289   (1)| 00:00:04 |
|   1 |  TABLE ACCESS FULL| T    | 72219 |  6911K|   289   (1)| 00:00:04 |
--------------------------------------------------------------------------

ops$tkyte%ORA11GR2> set autotrace off
ops$tkyte%ORA11GR2> alter table t compress;

Table altered.

ops$tkyte%ORA11GR2> alter table t move;

Table altered.

ops$tkyte%ORA11GR2> exec dbms_stats.gather_table_stats( user, 'T' );

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> set autotrace traceonly explain
ops$tkyte%ORA11GR2> select * from t;

Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 72219 |  6911K|    86   (2)| 00:00:02 |
|   1 |  TABLE ACCESS FULL| T    | 72219 |  6911K|    86   (2)| 00:00:02 |
--------------------------------------------------------------------------

ops$tkyte%ORA11GR2> set autotrace off



that is the main one - the number of blocks. But yes, it would also tend to lower the clustering factor of indexes in some cases - the lower the clustering factor was initially, the more it is likely to drop. An index with a high clustering factor won't drop as much as one with a lower clustering factor will.
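One way to see that effect yourself - a sketch, assuming the table T from the example above has at least one index (the index names here are whatever yours happen to be) - is to compare the clustering factors before and after the compress/move:

```sql
-- gather stats including index stats, then inspect the clustering factors
exec dbms_stats.gather_table_stats( user, 'T', cascade => true );

select index_name, clustering_factor, leaf_blocks
  from user_indexes
 where table_name = 'T';
```

Run it once before the alter table compress/move and once after - the difference in clustering_factor (and leaf_blocks, for compressed indexes) shows how the physical reorganization changed what the optimizer sees.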

Rowlength in compressed tables

December 21, 2011 - 1:15 am UTC

Reviewer: lh from finland

Thanks for Your answer. Still wondering the value of rowlength in compressed tables.


create table lhdemo(f1 number,
f2 varchar2(200 char),
f3 char(1000 char));

declare
begin
for i in 1..100000 loop
insert into LHDEMO(f1,f2,f3) values (i,'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx','dummy');
end loop;
commit;
dbms_stats.gatheR_table_stats(user,'LHDEMO');
end;
/
select compression, avg_row_len, NUM_rows, blocks from user_tables where table_name='LHDEMO';
DISABLED 2107 100000 4053

alter table lhdemo move tablespace users compress;
execute dbms_stats.gatheR_table_stats(user,'LHDEMO');
select compression, avg_row_len, NUM_rows, blocks from user_tables where table_name='LHDEMO';
ENABLED 2107 100000 42



Tom Kyte

Followup  

December 21, 2011 - 8:28 am UTC

select *
from compressed_table
order by unindexed_column;


please come up with an estimate for the temp space/sort workarea necessary for that query. I'll take rowlength - after decompressing since temp is not compressed - and use that.

Like I said - it is used to estimate the size of the data flowing from one step in a plan to the next.

rowlength affects the amount of data flowing from one step in a plan to the next - which in turn will affect the potential amount of memory/temp space a query might need - which can affect the plan.
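As a back-of-envelope illustration of that estimate (a sketch - the real optimizer computation is more involved than this), the uncompressed row length times the row count bounds the volume of data the sort has to handle:

```sql
-- rough upper bound on sort input volume: rows * average (uncompressed) row length
select table_name,
       num_rows,
       avg_row_len,
       round( num_rows * avg_row_len / 1024 / 1024 ) est_sort_input_mb
  from user_tables
 where table_name = 'LHDEMO';
```

With the numbers from the review above (100000 rows * 2107 bytes) that is roughly 200MB of sort input - even though the compressed table occupies only 42 blocks on disk. That is exactly why avg_row_len reflecting the decompressed length is the useful behavior.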

followup

December 21, 2011 - 1:28 am UTC

Reviewer: lh from finland

Hi,

Read Your answer again. Do You mean that rowlength is used for internal processing - giving information about memory usage when handling rows - and that it does not affect the actual data access, i.e. how blocks are fetched? So rowlength does not (and should not) represent the actual size of the row in the data block.

But if a query is not fetching all the columns of the table, what is used then when calculating memory usage?


Tom Kyte

Followup  

December 21, 2011 - 8:33 am UTC

So rowlength does
not (and it should not) represent the actual size of row in datablock.


not in a compressed table, correct.



But if query is not fetching all the columns of table what is used then when
calculating memory usage


it can use avg_col_len - avg_col_len works in the same fashion. In this example, you KNOW the avg_col_len on disk isn't 41 after compressing - the X's all disappear. But avg_col_len represents the length in memory, after decompressing...


ops$tkyte%ORA11GR2> create table t
  2  as
  3  select rpad( 'x', 40, 'x' ) x, a.*
  4    from all_objects a
  5  /

Table created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> exec dbms_stats.gather_table_stats( user, 'T' );

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> select avg_col_len from user_tab_columns where table_name = 'T' and column_name = 'X';

AVG_COL_LEN
-----------
         41

ops$tkyte%ORA11GR2> select blocks from user_tables where table_name = 'T';

    BLOCKS
----------
      1477

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> alter table t compress;

Table altered.

ops$tkyte%ORA11GR2> alter table t move;

Table altered.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> exec dbms_stats.gather_table_stats( user, 'T' );

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> select avg_col_len from user_tab_columns where table_name = 'T' and column_name = 'X';

AVG_COL_LEN
-----------
         41

ops$tkyte%ORA11GR2> select blocks from user_tables where table_name = 'T';

    BLOCKS
----------
       308

ops$tkyte%ORA11GR2> 

followup

December 23, 2011 - 5:36 am UTC

Reviewer: lh from finland

Thanks for Your answer. This made it clear to me.


Just for the record, the documentation says

AVG_ROW_LEN*
Average length of a row in the table (in bytes)

in ALL_TABLES section in
Oracle® Database Reference
11g Release 2 (11.2)
Part Number E25513-01

http://docs.oracle.com/cd/E11882_01/server.112/e25513/statviews_2117.htm#i1592091


Tom Kyte

Followup  

December 23, 2011 - 8:49 am UTC

and it is true. It is the average length of the row in that table. It does not say "this is the average number of bytes on disk in a block".

If you were to get a row from the table, on average that row's width would be... the avg_row_len.

Dropping colums from a Compressed table

December 09, 2012 - 2:14 am UTC

Reviewer: Kishore from INDIA

Hi Tom,

Greetings!

In one of our databases, COMPRESSION is ENABLED FOR OLTP for all the tables. Per a few requirements, one of those tables needs to drop a few columns and add a few columns very frequently.

As compression is ENABLED for this particular table, whenever we issue a statement to DROP a column, it is being SET as an UNUSED column. Soon the count in dba_unused_col_tabs will reach 1000, and as that is the maximum number of columns allowed for a table, further operations will be affected.

Can you please let us know if there is a way to drop the columns completely, without them being set to UNUSED, when we issue ALTER TABLE <tab_name> DROP COLUMN <col_name>?

Please let us know.
Tom Kyte

Followup  

December 14, 2012 - 2:21 pm UTC

can you describe the situation in which you feel you need to add/drop columns at such a rate that you'll hit this practical limit?


(read about dbms_redefinition for online rebuilds of segments - or you can use alter table move to perform an offline operation - which is what you would need to do).

but really - tell us why you feel you need to do this in the first place, I'm very curious (as I've never heard of anyone doing this..... Maybe there is a better approach to take)

Dropping colums from a Compressed table

December 18, 2012 - 7:20 pm UTC

Reviewer: Kishore from INDIA

Tom,

Thanks for your reply.

Even using dbms_redefinition or an alter table move operation, the dropped columns are not being dropped permanently. We can still see them in dba_unused_col_tabs, as the respective table is a compressed one (OLTP).

The approach followed by this process is to drop and add a few columns in the table very frequently. This is because the table is used for reporting, and every time they run the report some columns will be dropped and some will be added. The count is not always the same - it is based on how the customer wants to view the report.
E.g. sometimes they drop 10 columns and add only 5 new ones.

So, in this case, as they frequently use this report, the 1000-column mark is reached quickly. Also, the virtual columns feature is not applicable here.

So, we are looking for a way for these columns to be dropped permanently instead of just being set to UNUSED.

The proposed approach by us is to do the following steps:

1) Make the respective table as an UNCOMPRESSED one
2) Drop the respective columns from the table.
3) Make the table back to compressed mode with OLTP before adding the new set of columns.

At this point, we are looking for a way to bypass this process of the columns being set to UNUSED when dropped from these compressed tables (OLTP compression).

Thank you so much for your prompt response once again.

Regards,
Kishore.
Tom Kyte

Followup  

December 19, 2012 - 8:42 am UTC

This is because, they use it for reporting purpose and
everytime they run the report, some columns will be dropped and some will be
added. Count is not always the same. Its based on how the customer wants to
view the report.


why wouldn't you use a table with some varchar2's, some dates, some numbers and in your program just remember what columns are what.

I wouldn't ever consider adding/dropping as you are for a reporting table.

If you wanted to do DDL, the most I would do is create a new view


create view report
as
select c1 as whatever, c2 as somethingelse, c99 as mynumber, c120 as mydate ...
from t;


if you needed a different set of column types, build a new view - or just have your SQL generate against table T using the column types you need and avoid the DDL altogether.


Some things never change

December 19, 2012 - 8:51 am UTC

Reviewer: A reader

"why wouldn't you use a table with some varchar2's, some dates, some numbers and in your program just remember what columns are what."

That is the approach I used in Oracle 4 in 1986 or so.

December 31, 2012 - 2:25 am UTC

Reviewer: Girish Kale

Hello Tom,

I've a related question to index compression but in the context of heavy deletes.

I analyzed all the indexes and found that some of them are good candidates for compression, with OPT_CMPR_COUNT between 1 and 4 and likely storage savings (OPT_CMPR_PCTSAVE) between 48 and 59%.

I was wondering to what extent my deletes will get boost if I compress these indexes.

Thanks in advance for your valuable advice.

Best,
Girish

Tom Kyte

Followup  

January 04, 2013 - 11:29 am UTC

pretty much nothing helps deletes - they are the slowest of all DML operations.

we'll still have to maintain the index and maintaining the index might be a little harder because the leaf blocks are compressed. So that could add CPU time to the delete.

On the other hand, we might have to do a few fewer logical and physical IO's so that might help the delete

On the other hand, there would be more leaf rows per block, so if the deletes are done by many sessions - you might increase contention on a block and hit buffer busy waits.


It would probably in most cases do nothing for your deletes - neither good nor bad.
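For completeness, the OPT_CMPR_COUNT / OPT_CMPR_PCTSAVE figures mentioned above come from validating the index structure - a sketch, using a placeholder index name IDX_T:

```sql
-- populate index_stats for one index (index_stats holds a single row,
-- visible only to the current session, and only until the next validate)
analyze index idx_t validate structure;

select name, opt_cmpr_count, opt_cmpr_pctsave
  from index_stats;

-- if the projected savings look worthwhile, rebuild compressed using
-- the suggested prefix-column count, e.g.:
-- alter index idx_t rebuild compress 2;
```

Note that ANALYZE ... VALIDATE STRUCTURE locks the underlying table by default, so on a busy system you would want to do this in a maintenance window (or use the ONLINE option, which however does not populate index_stats).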

Uncompress tables and indexes.

March 06, 2020 - 4:06 pm UTC

Reviewer: Sushil Tripathee from TX, USA

The information provided is very helpful. I have lots of tables, and the indexes associated with those tables, that are compressed. I am looking to uncompress the tables and indexes. I plan to uncompress the indexes first, since I can uncompress indexes online, and then uncompress the tables. Do I need to rebuild the indexes after uncompressing the tables?
Chris Saxon

Followup  

March 09, 2020 - 10:55 am UTC

You need to move the table to change the compression of existing data.

Provided you're on a version that supports online table move:

alter table ... move online;


The indexes will remain valid. Otherwise they'll be UNUSABLE and you'll need to rebuild them.

Check the status with:

select index_name, status 
from   user_indexes
where  table_name = '...';
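Putting that together - a sketch of the whole sequence, assuming a version with online table move (12.2 or later; MY_TABLE and MY_INDEX are placeholder names):

```sql
-- uncompress the table data; ONLINE keeps the indexes valid (12.2+)
alter table my_table move nocompress online;

-- verify nothing was left unusable
select index_name, status
  from user_indexes
 where table_name = 'MY_TABLE';

-- if any index shows UNUSABLE (e.g. after an offline move),
-- rebuild it, optionally dropping its compression at the same time:
-- alter index my_index rebuild nocompress online;
```

On versions without online move, the same move statement without ONLINE works, but every index on the table will be marked UNUSABLE and must be rebuilt afterwards.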