Skip to Main Content

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question.

Asked: December 20, 2001 - 9:40 pm UTC

Last updated: October 07, 2009 - 8:31 am UTC

Version: 8i

Viewed 10K+ times! This question is

You Asked

Hi,

How to determine if an index needs to be rebuilt if the optimizer mode is RULE?

Thanx.

and Tom said...

I'm a big believer that indexes do not need to be rebuilt except in very perculiar cases. 99% of index rebuilds are a waste of time.

Some people look at the number of deleted entries in the index and rebuild based on that (analyze index index_name validate structure and then query index_stats to see statistics on the index such as deleted entries and such)...

I say that might be misleading. If 50% of the entries are deleted (leaving "holes" in the index) -- should we rebuild? Maybe -- maybe not. depends on how the index is used and what data it indexes. Consider this:

create table emp ( empno int, ename varchar2(25) );

create index emp_idx1 on emp(empno);
create index emp_idx2 on emp(ename)


Now, suppose empno is filled in via a sequence. Its always increasing and we never reuse a value. In our system, we actually delete rows when we fire someone (a real HR system would never do that). So, emp_idx1 develops "holes" over time that will never be reused. It we find it to be X% "empty" over time-- it might be a candidate for a rebuild. We'll never reuse that deleted space (unless we end up deleting all index entries on a block).

But, the index on ename -- that's a little more chaotic. Say we have employees:

Bill
Bob
Mary

Say we fire Bob, now we have:

Bill
X-Bob (deleted entry)
Mary

Should we rebuild an index like that? Probably NOT, the reason -- we are going to hire Hillary:

Bill
Hillary
Mary

Hillary will reuse Bobs slot. There is no reason to get rid of that slot -- only to have to rebuild it again later. If the data you are indexing has a good probability of reusing a slot like that -- rebuilding can actually slow you down over time (it takes time to split a block -- with empty entries -- the chances we need to split are reduced. If you rebuild -- they'll go up -- you unsplit everything, got rid of the space. Now we have to re-split again -- every time you rebuild). I find most indexes hit a steady state -- if you rebuild them, they'll go back into that state over time. So rebuilding is somewhat self defeating in that case.

Make sure to BENCHMARK yourself here. Before rebuilding -- capture statistics, performance metrics, IO's, etc -- against that index. After rebuilding -- compare the results. If you got nothing back, no increased performance, decreased IO, etc -- don't rebuild it again later -- it doesn't buy you anything.

Rating

  (38 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Index rebuild

Rob Katofiasc, October 04, 2002 - 12:28 pm UTC

What metrics should I be looking at to determine if the index rebuild helped or not? Can you give a brief example of what data you would capture before and after the index rebuild.

Tom Kyte
October 04, 2002 - 6:46 pm UTC

Keep stats pack reports. Look at your top sql queries for example and see if the IO's go down, transaction rates go up, IO's decrease, performance improves.


If you felt the index needed rebuilding -- you must have had some material numbers that made you do it -- see if those numbers improve

Should we re-create index before Analyzing tables/schema

Syed, October 07, 2002 - 7:22 am UTC

We are using Cost based optimizer and some time we do large
data insert. At that time we need to analyze tables/schema to keep the statistics current. But before Analyzing table/schema is it necessary to re-create index ?

Please clarify.


Tom Kyte
October 07, 2002 - 7:39 am UTC

NO

Sagi, October 07, 2002 - 8:03 am UTC

Hi Tom !

Can you give an example please.

Thanx.

Regards,
Sagi

Tom Kyte
October 07, 2002 - 8:14 am UTC

example of what exactly?

Index Rebuild

Randy Bollock, October 07, 2002 - 12:52 pm UTC

Rebuilding of index helped me in couple of occasions. Let me share my experience.

1. One fine morning users started to complain they cannt access a part of the system. I looked into the query and submitted the same query to find the error. It was clearly pointing to index corruption, ORA - 1502. I have no clue how on earth that could happen. There was apperently no h/w error reported. The database was running on HP AutoRaid(RAID 0, 1). Rebuilt the index and everything came back well.

2. Migrated 7.3.4 database to 8.1.7. Wanted to use the LMT. But found no better way to use LMT rather than rebuilding the indexes into LMT.

Thanks,

Randy

Would coalesce do the same thing?

Peter Tran, August 05, 2003 - 2:57 pm UTC

Tom,

In your example above, you wrote:

[quote]
Now, suppose empno is filled in via a sequence. Its always increasing and we never reuse a value. In our system, we actually delete rows when we fire someone (a real HR system would never do that). So, emp_idx1 develops "holes"
over time that will never be reused. It we find it to be X% "empty" over time-- it might be a candidate for a rebuild. We'll never reuse that deleted space (unless we end up deleting all index entries on a block).
[/quote]

Wouldn't "alter index <index_name> coalesce;" do the same thing?

Thanks,
-Peter

Tom Kyte
August 05, 2003 - 2:59 pm UTC

yes (not the same thing but the end result is what we are after -- reclaim space)

Index rebuild releases space???

Ashwani, August 06, 2003 - 8:38 am UTC

Dear tom,
One of the dba from our site told us that the rebuild of index will release space in tablespace in which index is present...

Is that a fact??

we ran the following query after a rebuild and acutally found that free space had increased

select a.TABLESPACE_NAME,
a.BYTES "Size",
(a.BYTES-b.BYTES) "Used",
round(((a.BYTES-b.BYTES)/a.BYTES)*100,2) "PUsed",
b.BYTES "Free",
round((1-((a.BYTES-b.BYTES)/a.BYTES))*100,2) "PFree"
from sys.orasnap_ddf a, sys.orasnap_dfs b
where a.TABLESPACE_NAME = b.TABLESPACE_NAME
order by a.tablespace_name


Regards

Tom Kyte
August 06, 2003 - 9:02 am UTC

ask this dba "well, for how long? I mean, the index got fat and used that space, isn't it just going to grow all over again? haven't you just reclaimed some space that it will just have to allocate all over again?"


The question is "for how long", did you actually sell some of your disks on EBay since they are all empty? Or, do you still have them cause the indexes naturally grow back to their equilibrium size?

read this:

</code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:6601312252730 <code>

those guys "reclaimed space" and then spent the next week or so painstakingly "reallocating it" only to "reclaim it" again. vicious loop they got into there.

is there a need for rebuild

Ashwani, August 08, 2003 - 1:20 am UTC

hi Tom,
follwing is the health report of our database for a day

We r using oracle APPS 11.5.5 with DB 8.1.7

No. of Sessions
---------------
122


Buffer Hit Ratio
----------------
50


Logical reads Physical Reads Physical Writes
------------- -------------- ---------------
1.8447E+19 1.8447E+19 1.8447E+19


Data Dictionary Hit Ratio


Data Dict. Gets Data Dict. Cache Misses DATA DICT CACHE HIT RATIO
--------------- ----------------------- -------------------------
7870953 340130 95.6786681


Library Cache Ratio


executions Cache Misses (1-(SUM(RELOADS)/SUM(PINS)))*100 LIBRARY CACHE PIN HIT RATIO LIBRARY CACHE GET HIT RATIO
---------- ------------ -------------------------------- --------------------------- ---------------------------
118433124 31851 99.9731063 99.645 98.6516038


NAMESPACE Get Hit Ratio Pin Hit Ratio Reloads
--------------- ------------- ------------- ----------
SQL AREA 92 99 8143
TABLE/PROCEDURE 99 99 23702
BODY 99 99 0
TRIGGER 99 99 6
INDEX 91 83 0
CLUSTER 98 98 0
OBJECT 100 100 0
PIPE 99 99 0

8 rows selected.


Memory


NAME VALUE
-------------------- ----------
Fixed Size 73620
Variable Size 559747072
Database Buffers 1864499200
Redo Buffers 2105344
----------
sum 2426425236


NAME BYTES
-------------------------- ----------
free memory 32849892
free memory 14442496


Tot SQL since startup SQL executing now
--------------------- -----------------
37699679 24


If these are > 1% of Total Number of Requests for Data
then extra rollback segments are needed.

CLASS COUNT
------------------ ----------
free list 1
system undo block 0
system undo header 0
undo block 428
undo header 34

Total Number of Requests for Data

SUM(VALUE)
----------
3.6893E+19

The star DBA to increase performance rebuild the indexes and is Also asking to increase the size of SGA..
it is already very huge in Size..
Pls guide in the right direction..

Ashwani

Tom Kyte
August 10, 2003 - 11:28 am UTC

wow, thats a lot of numbers.

too bad no one could determine the health of anything based on them.

Your DBA actually sees:

Logical reads Physical Reads Physical Writes
------------- -------------- ---------------
1.8447E+19 1.8447E+19 1.8447E+19

and doesn't think "wow, something is wrong there, better look into that". tell that dba to read:

</code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:6047280202769 <code>

they have absolutely NO CLUE WHATSOEVER as to the true hit ratio. NOT that the hit ratio is very meaningful (tell them after reading that link to goto hotsos.com and read the "are you still using cache hit ratios".


Ask the star DBA for the scientific reasoning behind rebuilding indexes and SGA increases. They must have some solid technical reasoning beyond "well, everyone knows a bigger sga is better and indexes need to be rebuilt"

If they don't, well......


Is there a need for rebuild corrected??

Ashwani, August 11, 2003 - 6:21 am UTC

Dear tom,
i do not think the bug u have listed in the link is applicable to us .. Sorry i potrayed a wrong picture.. The correct pic is below


No. of Sessions
---------------
63


Buffer Hit Ratio
----------------
99.33


Logical reads Physical Reads Physical Writes
------------- -------------- ---------------
1749924 8833837 71181


Data Dictionary Hit Ratio


Data Dict. Gets Data Dict. Cache Misses DATA DICT CACHE HIT RATIO
--------------- ----------------------- -------------------------
91987 6948 92.4467588


Library Cache Ratio


executions Cache Misses (1-(SUM(RELOADS)/SUM(PINS)))*100 LIBRARY CACHE PIN HIT RATIO LIBRARY CACHE GET HIT RATIO
---------- ------------ -------------------------------- --------------------------- ---------------------------
7229659 11 99.9998478 99.932 93.5214788


NAMESPACE Get Hit Ratio Pin Hit Ratio Reloads
--------------- ------------- ------------- ----------
SQL AREA 84 99 11
TABLE/PROCEDURE 95 99 0
BODY 94 93 0
TRIGGER 88 88 0
INDEX 97 97 0
CLUSTER 96 96 0
OBJECT 100 100 0
PIPE 99 99 0

8 rows selected.


Memory


NAME VALUE
-------------------- ----------
Fixed Size 73620
Variable Size 559747072
Database Buffers 1864499200
Redo Buffers 2105344
----------
sum 2426425236


NAME BYTES
-------------------------- ----------
free memory 448321852
free memory 15347712


Tot SQL since startup SQL executing now
--------------------- -----------------
2423600 8


INSTANCE_NUMBER INSTANCE_NAME HOST_NAME VERSION STARTUP_T STATUS PAR THREAD# ARCHIVE LOG_SWITCH_ LOGINS SHU DATABASE_STATUS INSTANCE_ROLE
--------------- ---------------- ---------------------------------------------------------------- ----------------- --------- ------- --- ---------- ------- ----------- ---------- --- ----------------- ------------------
1 PROD chiia4ood01 8.1.7.1.0 10-AUG-03 OPEN NO 1 STARTED ALLOWED NO ACTIVE PRIMARY_INSTANCE


If these are > 1% of Total Number of Requests for Data
then extra rollback segments are needed.

CLASS COUNT
------------------ ----------
free list 0
system undo block 0
system undo header 0
undo block 1
undo header 2

Total Number of Requests for Data

SUM(VALUE)
----------
10584138


Pls gude me now on the health report of the db and is there a need for increase of SGA

Tom Kyte
August 11, 2003 - 7:45 am UTC

gee, I wonder where the other made up data came from huh?


but now I wonder how this all adds up:

Buffer Hit Ratio
----------------
99.33


Logical reads Physical Reads Physical Writes
------------- -------------- ---------------
1749924 8833837 71181


tell me, how you have a 99% plus hit ratio when you appear to do 6 PIO's for every 1 LIO.



these are meaningless numbers -- you can make ZERO decisions based on this data.

use statspack



Tuning Pack - tablespace map

Jennifer Chen, March 01, 2004 - 4:28 pm UTC

Hi Tom,

We are using Oracle Tuning Pack -> Tablespace map -> Tablespace Analysis to determine whether or not an index needs to be rebuilt.

The tablespace analysis returns and following alerts and raises red flags on indexes that are fragmented:

Space Alerts and Warnings The following space management problems where discovered in tablespace ALIAS_LI.TypeSegment NameAlert StatusIndexALIAS.IX_INCIDENT_CTNALERTALERT: Index suffering from fragmentation. IndexALIAS.IX_INCIDENT_OCAALERTALERT: Index suffering from fragmentation. IndexALIAS.IX_INCIDENT_TCNALERTALERT: Index suffering from fragmentation.

At this point, we would need to rebuild these indexes. My management can't stand these red flags on the database :).

10g AWR

Marvun, May 04, 2004 - 11:40 am UTC

Hi

I just read that 10 does Automatic Maintenance Tasks based on AWR informations to carry out tasks such as as optimizer statistics refresh and rebuilding indexes.

I am trying to find in the docs what´s the rule 10g uses to determine when the indexes has to be rebuilt but so far no good. Cant find anything. Do you know what rule is 10g using to determine index rebuild? Should we use that rule for 8i and 9i?

Tom Kyte
May 04, 2004 - 1:52 pm UTC

I don't know what rule they use either (takes the auto out of auto that way). none of the auto stuff is really documented -- it changes frequently (with virtually every release)

I'm not a huge fan of automated rebuilds of indexes personally.

Indexes should be rebuild

A reader, May 05, 2004 - 3:32 pm UTC

Hi Tom, I got the last time I ask you a question about indexes, indexes don't need to be rebuild, now I found an article that says

"When should you rebuild an index"
</code> http://searchoracle.techtarget.com/tip/1,289483,sid41_gci910442,00.html <code>
for example says
" I like to use the rule of thumb that if the blevel column in DBA_INDEXES is greater 4, it's time to rebuild the index."

Is this a rule to say "should be rebuild"

Thanks Tom

Tom Kyte
May 05, 2004 - 3:44 pm UTC

if the blevel is 5
and you rebuild
and the blevel stays 5
hmmm, what then?

or it goes to 4 but a week later it is 5 (and it worked really hard to get there again).

did the have perhaps an incomplete article? are they missing the part where you check to see if you did more good than harm to your system? that what you did had any sort of positive effect anywhere?



(funny, the followup by sanjay is a cut and paste from something I wrote, small world)

A reader, May 05, 2004 - 4:13 pm UTC

Thanks Tom

Rebuild Index?

Danny Chen, July 22, 2004 - 2:26 pm UTC

What about if the a big table got a lot of insert/delete , and over a period of time, will it affect the query performance that scan the index (like Range Scan)?

Tom Kyte
July 23, 2004 - 8:15 am UTC

really depends -- what if the index is on "name" and you insert/update/delete lots of names

will that index have lots of white space? well, no - not really. you delete "george" but insert "greg" (greg reuses georges space). you update mary to beth, delete becky, and insert merideth. no "white space".

so -- most indexes, no -- not a problem.

index on a sequence populated column X (increasing value only).

now you delete 99% of the "X" values over time (from the low value on up) -- so for example you at one time had the values 1..1000 in the table, after a "purge" you have 1, 101, 201, 301, ... 901 left.

you might have lots of white space left over -- meaning a "select * from t where x < 1000" might read 9 index blocks (assuming 100 rows / leaf block to make the math easy). If you coalesced the index (not rebuilt here, coalesce) -- you would put all 9 rows remaining under 1,000 on a single block.

So - it really depends on

a) how the data is "added/modified"
b) how big your range scans are or even ranges you scan
c) the very way in which you use the data


In most cases -- no, in certain rare cases -- a coalese does just fine, a rebuild can be used as well but is a little more drastic (requiring about 2x the space to be performed)

Rebuild Index?

Danny Chen, July 28, 2004 - 1:46 pm UTC

In your followup:
Quote
will that index have lots of white space? well, no - not really. you delete
"george" but insert "greg" (greg reuses georges space). you update mary to
beth, delete becky, and insert merideth. no "white space".
End Quote

But according to the Oralce University, Database Administration clase note:
"Deleting a row results only in a logical deletion of the index entry. The space used by the deleted row is not available for new entries unitl all the entries in the block are deleted."

So in some degree, there will be a lot of white space lost due to large random rows deletion. Base on this, I think the index rebuild is necessary.

Tom Kyte
July 28, 2004 - 7:37 pm UTC

that is false - that statement is patently FALSE and provably so.

just create a table t ( x int )

create an index t_idx on t(x);

insert into t values ( 1 );
insert into t values ( 2 );
insert into t values ( 9999999999999999999999999);

now, delete from t where x=2, put in 3, then delete 3, put in 4 and so on.

if what they said was true - by the time you got to 9999999999999999999999999, you'd expect the index to be pretty large no? well, it won't be, in fact, it'll have 3 entries.

so, in some degree, back to square one.

Index Rebuild ?

Richard Foote, July 28, 2004 - 11:06 pm UTC

To Danny and others,

For more info on index rebuilding "myths", have at look at a presentation of mine at www.actoug.org.au/Downloads/oracle_index_internals.pdf

Indexes rarely require a rebuild and the presentation hopefully goes some way to explaining why.

Cheers

Richard

Steve, November 03, 2004 - 11:24 am UTC

We need to keep rebuilding an index against one of our tables. What happens is that the index is rebuilt, queries run faster and we can see that the CBO is using the index. After a couple of days, users complain that their queries are running slow again and an autotrace shows full table scans. We rebuild the index and everything speeds up again and autotrace shows the index being used.

Clearly rebuilding the index every couple of days is unsatisfactory and I think it must be hiding a more fundmental issue. What should we be looking for/at?

Tom Kyte
November 04, 2004 - 1:25 am UTC

you should be describing what you do to this table, what happens to this table.

otherwise -- no one could answer anything.

so, whats up, what exactly do you do to this table. describe how it is used, describe how/when statistics are gathered. throw us a "bone"

Thanks to Richard Foote! And to unbreakable Tom ,of course! ;-)

Franco, November 04, 2004 - 4:55 am UTC

I found Richard's document extremely interesting, clear picture of Tom's reasons against blind rebuild of indexes all over!
Thanks again to Tom and to all the clever contributors of this invaluable website!!

Maintain index structure during table move

Bob B, November 09, 2004 - 2:40 pm UTC

We are moving our tablespace from DMT to LMT. Is there a way to do this move and rebuild the indexes without changing the structure of the index? Essentially, it would be a tablespace move that updates the index each time a row is moved.

Most of the indexes seem to be in a stable state, so it doesn't seem necessary to rebuild each one (performance hit #1) and then wait for them to restabilize (performance hit #2). Performance hit #3 (updating index during move) would seem more simple and natural to measure.

Tom Kyte
November 09, 2004 - 3:30 pm UTC

well, maintaining them would be performance hit #1,001 plus some. every single rowid is changed -- radically, the entire index would be basically "destroyed".


it is necessary to rebuild them. The rowid is an intristic part of the index -- to the point where it is actually part of the KEY (for non-unique indexes)! and the data is sorted by rowid (and they rowids are all a-changing)




Figured that

Bob B, November 09, 2004 - 3:53 pm UTC

That was my hunch, just wanted to check on it. Based on what you say, seems like if there were a method, it would have to change the structure of non-unique indexes either way.

Tom Kyte
November 09, 2004 - 4:06 pm UTC

and unique ones, the rowids are stored in there as well.

your indexes would basically explode to 2x their size as we would have to effectively delete each entry and insert them again in the same transaction.

Simulations

Vikas Khanna, November 18, 2004 - 9:02 am UTC

Hi Tom,

I did tried to simulate that the Index Rebuilds are a total waste of time for indexes which are on numbers inserted in a table via a sequence.

SQL>create table employee ( empno int, ename varchar2(25)); 
SQL> create index emp_empno on employee(empno) tablespace indx;

Now, suppose as employees join an organization empno is filled in via a sequence from the application.
Its always increasing.

SQL> CREATE SEQUENCE "SCOTT"."EMP_SEQ" INCREMENT BY 1 START WITH 1 MAXVALUE 1.0E28 MINVALUE 1 NOCYCLE CACHE 20 NOORDER

When an employee leaves us, suppose the application deletes his entries from the system.The deleted entries will never be reused again.

begin
for x in 1..1000000 loop
Insert into employee values (emp_seq.nextval,dbms_random.random);
end loop;
commit;
end;

Used dbms_random.random function to insert employee names as they can be very random as per their nature. At some moment Allen is joining us, on others King, Ford is joining us. 

sql> Analyze index emp_empno validate structure;
Height    Blocks    Leaf Blocks    Lf_Rows        Del_lf_rows
3    2304    2226        10000000    0

Now lets delete every 10th employee from the organisation

SQL>Delete from employees where mod(empno,10) = 0
LetÂ’s see the effect after having 100000 holes which are never to be used in case of empno.
Query execution as per Indexes can take 4 different access paths
Index Fast Full Scan    
Index Full Scan 
Index Range Scan 
If not rebuild then the operations listed above will get affected.

Let us illustrate an example of Index Range Scan 
SQL> Select empno, ename from employee where empno between 1 and 100000; -- This will impact the performance as the number of leaf  blocks scanned will be more as compared to when the index has been rebuild.

Before Deletion:

Elapsed: 00:02:15.02
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'EMPLOYEE'
   2    1     INDEX (RANGE SCAN) OF 'EMP_EMPNO' (NON-UNIQUE)

Statistics
----------------------------------------------------------
            279    recursive calls
              0    db block gets
          13855    consistent gets
            513    physical reads
              0    redo size
        2514457    bytes sent via SQL*Net to client
          73825    bytes received via SQL*Net from client
           6668    SQL*Net roundtrips to/from client
              4    sorts (memory)
              0    sorts (disk)
     100000  rows processed

After Deletion:

SQL>Analyze index emp_empno validate structure;
SQL> Select * from index_stats;

Height    Blocks    Leaf Blocks    Lf_Rows        Del_lf_rows
3    2304    2226        900000        0    

SQL>Select * from employee where empno between 1 and 100000;
90000 rows selected.

Elapsed: 00:01:51.04

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'EMPLOYEE'
   2    1     INDEX (RANGE SCAN) OF 'EMP_EMPNO' (NON-UNIQUE)

Statistics
----------------------------------------------------------
        279  recursive calls
          0  db block gets
      12513  consistent gets
        514  physical reads
          0  redo size
    2263924  bytes sent via SQL*Net to client
      66488  bytes received via SQL*Net from client
       6001  SQL*Net roundtrips to/from client
          4  sorts (memory)
          0  sorts (disk)
      90000  rows processed

After Coalescing:

SQL>Analyze index emp_empno validate structure;
SQL> Select * from index_stats;
Height    Blocks    Leaf Blocks    Lf_Rows        Del_lf_rows
3    2304    2226        900000        0    

SQL> Select * from employee where empno between 1 and 100000

90000 rows selected.
Elapsed: 00:03:42.00

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'EMPLOYEE'
   2    1     INDEX (RANGE SCAN) OF 'EMP_EMPNO' (NON-UNIQUE)

Statistics
----------------------------------------------------------
         77  recursive calls
          0  db block gets
      12483  consistent gets
        288  physical reads
          0  redo size
    2263924  bytes sent via SQL*Net to client
      66488  bytes received via SQL*Net from client
       6001  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      90000  rows processed

After Rebuild:
SQL>Analyze index emp_empno validate structure;
SQL> Select * from index_stats;
Height    Blocks    Leaf Blocks    Lf_Rows        Del_lf_rows
3    2048    2004        900000        0    

90000 rows selected.
Elapsed: 00:04:51.04
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'EMPLOYEE'
   2    1     INDEX (RANGE SCAN) OF 'EMP_EMPNO' (NON-UNIQUE)

Statistics
----------------------------------------------------------
        279  recursive calls
          0  db block gets
      12492  consistent gets
        491  physical reads
          0  redo size
    2263924  bytes sent via SQL*Net to client
      66488  bytes received via SQL*Net from client
       6001  SQL*Net roundtrips to/from client
          4  sorts (memory)
          0  sorts (disk)
      90000  rows processed

Questions : 
1. Why del_lf_rows is still showing a value of 0 after performing the delete operation on the table?
2. How can execution time get reduced after the deletion has been performed,since the number of index blocks are same which has to be read again and again to retrieve the data values lying in the table? Those White spaces still exists with in the leaf blocks.
3.After coalescing the index no. of index blocks remain same however the elapsed time has increased? This is even strange as the non white entries might have coalesced into various leaf blocks leaving behind the leaf blocks with all white spaces?
4.After the index has been rebuild the no of leaf blocks to scan have reduced, hence lesser number of index blocks to read via range scan. Why the elapsed time is still shooting up?

Please note, for all the cases the tablespace having data & index have been offlined and made online.
Also the shared pool was flushed.

Would appreciate a reply on these questions.

Regards,


     

Tom Kyte
November 18, 2004 - 10:51 am UTC

did you read the original answer?

<quote>
I did tried to simulate that the Index Rebuilds are a total waste of time for
indexes which are on numbers inserted in a table via a sequence.
</quote>

why -- that is not an accurate statement, in fact, in the original answer above I said:

<quote>
Now, suppose empno is filled in via a sequence. Its always increasing and we
never reuse a value. In our system, we actually delete rows when we fire
someone (a real HR system would never do that). So, emp_idx1 develops "holes"
over time that will never be reused. It we find it to be X% "empty" over time--
it might be a candidate for a rebuild. We'll never reuse that deleted space
(unless we end up deleting all index entries on a block).
</quote>

You are trying to prove the opposite of what i stated...

Scheduled Coalesce

A reader, November 18, 2004 - 12:57 pm UTC

What are your thoughts on a scheduled coalesce of indices that are in the swiss cheese category?

i.e. monotonic surrogate primary key that gets deleted from the middle quite often.

Tom Kyte
November 18, 2004 - 2:18 pm UTC

give it a go but measure the before and after..... (eg: see what it does, see if it was worth it...)

Simulations

Vikas Khanna, November 19, 2004 - 6:08 am UTC

Hi Tom,

I was not trying to prove you wrong.

Rather for a index which has been built on a column with values inserted via a sequence, it should be beneficial for that index to be rebuild as we will never reuse those deleted entries.

It will grow and grow with many white spaces within the many leaf blocks which understandbly is a total waste as per space usage and for performance reasons when we do :

Fast Full Scan
Index Full Scan
Index Range Scan

However,as per simulation exercise the timings taken to range scan after the index rebuild is MORE than what it used to be before rebuilding.

Also please answer the questions already addressed.

Thanks



Tom Kyte
November 19, 2004 - 11:16 am UTC

<quote>
Rather for a index which has been built on a column with values inserted via a 
sequence, it should be beneficial for that index to be rebuild as we will never 
reuse those deleted entries.
</quote>


which is what I said above...



don't use elapsed time in sqlplus to time things.  Your network, your load on the system (server), billions of factors contribute to that.  rerun the test 5 times and you'll get 5 conflicting answers.

USE TKPROF and timed_statistics and sql_trace.  there you can see the real work performed.  but lets look at your situation in a perfectly controlled environment (just me, myself and I)


ops$tkyte@ORA9IR2> create table employee ( empno int, ename varchar2(35));
 
Table created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> insert /*+ APPEND */ into employee select rownum, object_name from big_table.big_table;
 
1000000 rows created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create index emp_empno on employee(empno);
 
Index created.
 
ops$tkyte@ORA9IR2> delete from employee where mod(empno,10) = 0;
 
100000 rows deleted.
 
ops$tkyte@ORA9IR2> commit;
 
Commit complete.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> set timing on
ops$tkyte@ORA9IR2> set autotrace traceonly
ops$tkyte@ORA9IR2> select * from employee e1 where empno between 1 and 100000;
 
90000 rows selected.
 
Elapsed: 00:00:00.92
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'EMPLOYEE'
   2    1     INDEX (RANGE SCAN) OF 'EMP_EMPNO' (NON-UNIQUE)
 
 
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      19248  consistent gets
        624  physical reads
      37380  redo size
    3451637  bytes sent via SQL*Net to client
      66488  bytes received via SQL*Net from client
       6001  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      90000  rows processed
 
ops$tkyte@ORA9IR2> @trace
ops$tkyte@ORA9IR2> alter session set events '10046 trace name context forever, level 12';
 
Session altered.
 
Elapsed: 00:00:00.02
ops$tkyte@ORA9IR2> select * from employee e1 where empno between 1 and 100000;
 
90000 rows selected.
 
Elapsed: 00:00:01.84
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'EMPLOYEE'
   2    1     INDEX (RANGE SCAN) OF 'EMP_EMPNO' (NON-UNIQUE)
 
 
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      18652  consistent gets
          0  physical reads
          0  redo size
    3451637  bytes sent via SQL*Net to client
      66488  bytes received via SQL*Net from client
       6001  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      90000  rows processed
 
ops$tkyte@ORA9IR2> @traceoff
ops$tkyte@ORA9IR2> alter session set events '10046 trace name context off';
 
Session altered.
 
Elapsed: 00:00:00.04
ops$tkyte@ORA9IR2> alter index emp_empno coalesce;
 
Index altered.
 
Elapsed: 00:00:00.08
ops$tkyte@ORA9IR2> select * from employee e2 where empno between 1 and 100000;
 
90000 rows selected.
 
Elapsed: 00:00:00.73
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'EMPLOYEE'
   2    1     INDEX (RANGE SCAN) OF 'EMP_EMPNO' (NON-UNIQUE)
 
 
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      18652  consistent gets
          0  physical reads
          0  redo size
    3451637  bytes sent via SQL*Net to client
      66488  bytes received via SQL*Net from client
       6001  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      90000  rows processed
 
ops$tkyte@ORA9IR2> @trace
ops$tkyte@ORA9IR2> alter session set events '10046 trace name context forever, level 12';
 
Session altered.
 
Elapsed: 00:00:00.03
ops$tkyte@ORA9IR2> select * from employee e2 where empno between 1 and 100000;
 
90000 rows selected.
 
Elapsed: 00:00:01.72
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'EMPLOYEE'
   2    1     INDEX (RANGE SCAN) OF 'EMP_EMPNO' (NON-UNIQUE)
 
 
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      18652  consistent gets
          0  physical reads
          0  redo size
    3451637  bytes sent via SQL*Net to client
      66488  bytes received via SQL*Net from client
       6001  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      90000  rows processed
 
ops$tkyte@ORA9IR2> @traceoff
ops$tkyte@ORA9IR2> alter session set events '10046 trace name context off';
 
Session altered.
 
Elapsed: 00:00:00.03
ops$tkyte@ORA9IR2> set autotrace off
ops$tkyte@ORA9IR2>


<b>tkprof says....</b>

select * from employee e1 where empno between 1 and 100000
                                                                                                       
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch     6001      0.71       0.68          0      18652          0       90000
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     6003      0.71       0.68          0      18652          0       90000
                                                                                                       
Rows     Row Source Operation
-------  ---------------------------------------------------
  90000  TABLE ACCESS BY INDEX ROWID EMPLOYEE (cr=18652 r=0 w=0 time=384701 us)
  90000   INDEX RANGE SCAN EMP_EMPNO (cr=6210 r=0 w=0 time=127938 us)(object id 34023)
********************************************************************************
select * from employee e2 where empno between 1 and 100000
                                                                                                       
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch     6001      0.64       0.66          0      18652          0       90000
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     6003      0.64       0.66          0      18652          0       90000
                                                                                                       
                                                                                                       
Rows     Row Source Operation
-------  ---------------------------------------------------
  90000  TABLE ACCESS BY INDEX ROWID EMPLOYEE (cr=18652 r=0 w=0 time=368914 us)
  90000   INDEX RANGE SCAN EMP_EMPNO (cr=6210 r=0 w=0 time=120904 us)(object id 34023)


<b>case 2 did less work, took less time on the server</b>
 

Some more clarifications required!!

Vikas Khanna, November 23, 2004 - 4:52 am UTC

Hi Tom,

Please find my findings on the above simulation.

Before Deletion:

Height    Blocks    Leaf Blocks    Lf_Rows        Del_lf_rows
3    2176    1999        10000000    0    

SQL> Select empno, ename from employee where empno between 1 and 100000; 
SQL TRACE
Call count cpu  elapsed  disk  query    current   rows
---- ----- ---  -------  ----  -----    -------   ----    
Parse    1 0.00    0.00     0      0          0      0
Execute  1 0.00    0.01     0      0          0      0
Fetch 6668 1.45    8.02   488  20698          0 100000
---- ----- ---  -------  ----  -----    -------   ----    
total 6670 1.45    8.03   488  20698          0 100000

Execution Plan
----------------------------------------------------------
0      SELECT STATEMENT Optimizer=CHOOSE
1    0   TABLE ACCESS (BY INDEX ROWID) OF 'EMPLOYEE'
2    1     INDEX (RANGE SCAN) OF 'EMP_EMPNO' (NON-UNIQUE)

Elapsed times include waiting on following events:
Event waited on             Times   Max. Wait  Total Waited
---------------             Waited  ---------  -------------
SQL*Net message to client   6668    0.00       0.02
db file sequential read     488     0.09       5.85
SQL*Net message from client 6668    0.51     199.51

After Deletion:

Height    Blocks    Leaf Blocks    Lf_Rows        Del_lf_rows
3    2176    1999        9000000            0    

SQL TRACE
Call count cpu  elapsed  disk  query    current   rows
---- ----- ---  -------  ----  -----    -------   ----    
Parse    1 0.00    0.00     0      0          0      0
Execute  1 0.00    0.03     0      0          0      0
Fetch 6001 0.88    1.11     0  12470          0  90000
---- ----- ---  -------  ----  -----    -------   ----    
total 6003 0.88    1.15     0  12470          0  90000

Execution Plan
----------------------------------------------------------
0      SELECT STATEMENT Optimizer=CHOOSE
1    0   TABLE ACCESS (BY INDEX ROWID) OF 'EMPLOYEE'
2    1     INDEX (RANGE SCAN) OF 'EMP_EMPNO' (NON-UNIQUE)

Elapsed times include waiting on following events:
Event waited on             Times   Max. Wait  Total Waited
---------------             Waited  ---------  -------------
SQL*Net message to client   6001    0.00       0.02
SQL*Net message from client 6001    0.54     108.25

After Deletion with colaesce:

Height    Blocks    Leaf Blocks    Lf_Rows        Del_lf_rows
3    2176    1999        9000000        0    

SQL TRACE
Call count cpu  elapsed  disk  query    current   rows
---- ----- ---  -------  ----  -----    -------   ----    
Parse    1 0.00    0.00     0      0          0      0
Execute  1 0.00    0.03     0      0          0      0
Fetch 6001 0.95    1.01     0  12470          0  90000
---- ----- ---  -------  ----  -----    -------   ----    
total 6003 0.95    1.02     0  12470          0  90000

Execution Plan
----------------------------------------------------------
0      SELECT STATEMENT Optimizer=CHOOSE
1    0   TABLE ACCESS (BY INDEX ROWID) OF 'EMPLOYEE'
2    1     INDEX (RANGE SCAN) OF 'EMP_EMPNO' (NON-UNIQUE)

Elapsed times include waiting on following events:
Event waited on             Times   Max. Wait  Total Waited
---------------             Waited  ---------  -------------
SQL*Net message to client   6001    0.00       0.02
SQL*Net message from client 6001    0.58     339.98

After Deletion with rebuild:

Height    Blocks    Leaf Blocks    Lf_Rows        Del_lf_rows
3    2048    2004        9000000        0    

SQL TRACE
Call count cpu  elapsed  disk  query    current   rows
---- ----- ---  -------  ----  -----    -------   ----    
Parse    1 0.00    0.02     0      0          0      0
Execute  1 0.00    0.03     0      0          0      0
Fetch 6001 1.20    2.95   201  18460          0  90000
---- ----- ---  -------  ----  -----    -------   ----    
total 6003 0.95    1.02     0  12470          0  90000

Execution Plan
----------------------------------------------------------
0      SELECT STATEMENT Optimizer=CHOOSE
1    0   TABLE ACCESS (BY INDEX ROWID) OF 'EMPLOYEE'
2    1     INDEX (RANGE SCAN) OF 'EMP_EMPNO' (NON-UNIQUE)

Elapsed times include waiting on following events:
Event waited on             Times   Max. Wait  Total Waited
---------------             Waited  ---------  -------------
SQL*Net message to client   6001    0.00       0.02
db file sequential read         201    0.06       1.32   
SQL*Net message from client 6001    0.58     193.24

Environment:
1.Single user working on the database.
2.Working on Oracle9i Enterprise Edition Release 9.2.0.1.0

Questions:
1.Why 'Select * from index_stats' is showing del_lf_rows as 0 when the deletion to the employee table has already taken place?
2.Why the wait event 'db file sequential read' is missing from the tkprof output in the case - II & Case III scenario?
3.Why elapsed time has increased in the case of Case IV scenario when the index has been rebuilt. The CPU elapsed should have gone down when an index on a column is monotonically getting populated with a sequence number.
The index range scan should have scanned lesser no. of blocks as the index has been compacted.
4. How come the leaf blocks have increased in Case IV after the rebuild option, this could be a determent factor in answering Question 3 as the index blocks are scanned via a link list starting from one leaf block and ending with some other leaf block where the leaf entry ends.

Regards, 

 

Tom Kyte
November 23, 2004 - 7:19 am UTC

1) because (contrary to the myths out there) index space is reused, indexes are pretty efficient beasts. there was no white space is all that is saying.

2) because you didn't do any physical IO. that is clearly shown. You got everything from the buffer cache.

3) umm, you had waits on physical IO that time. It is called the "luck of the draw". had you found it all in the buffer cache, it would have been different. you didn't (that is the real world, sometimes you will -- sometimes not)

4) rebuild does more work than coalesce. coalesce takes adjacent blocks and says "can we merge them", if so -- it does, else it moves on. rebuild does just that -- rebuilds. you are comparing apples to flying toaster ovens.

Needs some more understanding

Vikas Khanna, November 24, 2004 - 5:36 am UTC

Hi Tom,

1.Why there will be no white spaces when a deletion has taken place. I have not reinserted any data into the table for the leaf entries to be reused.

The index_stats should show Lf_rows as 100000 and del_lf_rows as 10000.

As an example,I again performed on Dept table and the index_stats shows the correct results.

SQL> Select * from dept;
    DEPTNO DNAME          LOC
---------- -------------- -------------
        10 ACCOUNTING     NEW YORK
        20 RESEARCH       DALLAS
        30 SALES          CHICAGO
        40 OPERATIONS     BOSTON

SQL> Insert into dept values (50,'QUARK','MOHALI');
1 row created.
SQL> COMMIT;
Commit complete.
SQL> ANALYZE INDEX PK_DEPT VALIDATE STRUCTURE;
Index analyzed.

SQL> Select Height,Blocks,Name,lf_rows,del_lf_rows,del_lf_rows_len from index_stats;

HEIGHT BLOCKS NAME     LF_ROWS DEL_LF_ROWS DEL_LF_ROWS_LEN
------ ------ -----    ------- ----------  -----------     1          8  PK_DEPT  5       0           0   

SQL> DELETE FROM DEPT WHERE DEPTNO = 50;
1 row deleted.
SQL> COMMIT;
Commit complete.
SQL> ANALYZE INDEX PK_DEPT VALIDATE STRUCTURE;
Index analyzed.
SQL> Select Height,Blocks,Name,lf_rows,del_lf_rows,del_lf_rows_len from index_stats;
HEIGHT BLOCKS NAME     LF_ROWS DEL_LF_ROWS DEL_LF_ROWS_LEN
------ ------ -----    ------- ----------  -----------     1          8  PK_DEPT  5       1           13

I again performed another simulation with all_objects table.

SQL> create table allobjects as select * from all_objects;
Table created.

SQL> create index allobjects_id on allobjects(object_id);
Index created.

SQL> select count(*) from allobjects;
  COUNT(*)
----------
     29104
SQL> Select Height,Blocks,Name,lf_rows,del_lf_rows,del_lf_rows_len from index_stats;

HEIGHT BLOCKS NAME       LF_ROWS DEL_LF_ROWS DEL_LF_ROWS_LEN
------ ------ ----       ------- ----------- ---------------  2    72     ALLOBJECTS_ID 29104           0              0
SQL> analyze index allobjects_id validate structure;
Index analyzed.

SQL> delete from allobjects where mod(object_id,2) = 0 ;
14551 rows deleted.
SQL> commit;
Commit complete.
SQL> analyze index allobjects_id validate structure;
Index analyzed.
SQL> Select Height,Blocks,Name,lf_rows,del_lf_rows,del_lf_rows_len from index_stats;

HEIGHT BLOCKS NAME       LF_ROWS DEL_LF_ROWS DEL_LF_ROWS_LEN
------ ------ ----       ------- ----------- ---------------  2    72     ALLOBJECTS_ID 26160      11607    181044      

In this case, the index after validating is showing lf_rows as 26160 & del_lf_rows as 11607 which is quite surprising?
Please let us know what's the alogrithm according to which the index behaves. Does deletion of data in the index, itself readjust the leaf index entries.?

Suppose we have an index block having these leaf enries,
1
2
3 <-- If this gets deleted, will 4 jump up automatically to reclaim the space released by 3.My understanding was the white space created during deletion will be reused by any insertion for data between 3 and 4 in near future.Is that understanding wrong??
4
5

2) Thanks.Got your point.

3)<<umm, you had waits on physical IO that time.>> Is the reason behind so many waits occuring this time is that we had rebuilded the index, against the table & this new copy of index is still residing in datafiles as no one has cached them up in the buffer cache. Please correct me if I am wrong.

4)Please elaborate it with an example which will let me know the exact difference between coalesce and rebuild. To me it means 

if we have an index for 3 leaf blocks such as :

LF-Blk1   LF-Blk2  LF-Blk3
1         6        11
2         7        12
3         8        13
4         9        14
5        10        15

and we delete 3,7,9,11,13,15 entries from the table then we have index which has white space entries 

LF-Blk1   LF-Blk2  LF-Blk3
1         6        11
2                  12
          8         
4                  14
5        10        

Coalesce will look for adjacent blocks to see if the merger can take place so index leaf blocks will take up the shape of 

LF-Blk1   LF-Blk2  LF-Blk3
1         6        14
2         8        
         10         
4        11        
5        12        

The entries for leaf block 2 can not be merged with leaf block 1 as two deleted entries need to fit in a index block with only one deleted entry. However the leaf block 3 has fitted in the leaf block 2.

With a rebuild option, the index leaf blocks should look like this:

LF-Blk1   LF-Blk2  
1         8        
2         10        
4         11         
5         12        
6         14       
 
thus reducing the no. of leaf blocks.

Is my understanding correct? Please explain?

Thanks

 

Tom Kyte
November 24, 2004 - 7:52 am UTC

1) because indexes are complex, robust structures that most of the time take care of themselves. you will NOT find a 1:1 correlation between delete leaf rows and the number of rows you delete.

there is nothing saying otherwise, so your basic assumption is flawed (clearly, you proved that!)


Indexes are horribly complex beasts and the way they behave changes subtly from release to release. how they behave given single row deletes by a single transaction vs bulk deletes in a single session vs single row deletes by many concurrent users vs bulk deletes by many concurrent users vs mixture of single/multi-row deletes by single/concurrent users -- all different.

splitting, read consistency needs, many other factors come into play.


the reason you had IO waits on physical IO was because the data you needed was no longer in the buffer cache. Why wasn't it there? again, millions of reasons -- perhaps it was simply because another user needed space in the buffer cache.



I don't know how to say it any simplier than:

a coalesce will take two adjacent blocks (in 10g, algorithm is slightly modified but lets deal with 9i and before for simplicity sake here) and ask itself "hey, if I was to merge these two leaf blocks into one block -- would it all fit???"

o yes: do it and leave one fullish leaf block in the index proper and put the other now empty block on the freelist

o no: skip the leftmost block, make the right hand block we were just looking at the "left" block and get the next rightmost block and do it over. (eg: if you have leaf blocks 1, 2, 3, 4, 5 -- try to merge 1,2 and then 2,3 and then 3,4 and so on


a rebuild says "forget about the existing index, lets just rebuild the entire structure from the ROOT BLOCK on down"

Some more data to understand point 3)

Vikas Khanna, November 24, 2004 - 6:48 am UTC

Hi Tom,

I thought the same way,so I cached the index blocks and again ran the query IInd time with the hope that all the blocks are cached and there will be no waits for 'db file sequential read'.

Select *
from
employee with_rebuild_cached_blocks where emp_no between 1 and 100000


call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.01 0 0 0 0
Fetch 6001 1.08 1.73 0 12459 0 90000
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 6003 1.08 1.75 0 12459 0 90000

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 59

Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 6001 0.00 0.02
SQL*Net message from client 6000 0.56 245.80

Again the time taken is MORE that what it took when the index was just coalesced. Appreciate if you clarify all doubts.

Time taken to range scan an index after REBUILD should be LESS than what it should be before rebuild on a column where the deleted leaf entries are never going to be reused.

Thanks


Tom Kyte
November 24, 2004 - 7:56 am UTC

run it 5 times, you'll get 5 different runtimes. the difference between 1.08 and 1.00 is "nothing" -- meaningless.

If you are interested in "why"

Effective Oracle by Design (superficial coverage of why the timings have errors in them)

Optimizing Oracle Performance (Milsap) non-superficial coverage of why the timings have errors in them, in fact -- part of the crux of the entire book. An excellent book.

www.actoug.org.au/Downloads/oracle_index_internals.pdf

Richard Foote, November 24, 2004 - 7:05 am UTC

Hi Vikas,

Take a look at the presentation of mine that's mentioned earlier in this thread (www.actoug.org.au/Downloads/oracle_index_internals.pdf) as it will answer many of your questions.

The point that you're missing in relation to deleted rows is that Oracle "cleans" them out in many cases during delayed block cleanout. All trace of the deleted index entries are removed from the block. Therefore stats such as del_lf_rows only document those deleted rows that have *not* been cleaned out (because Oracle doesn't bother to clean out the deleted entries on the portion of blocks processed in memory during commit operations). Note though that these deleted entries do get cleaned out if (say) a subsequent insert is performed in the leaf block. Note also that leaf blocks that have been emptied of all their leaf entries are placed on the freelist and recycled. So this "white" space has a good chance of being reused in many scenarios.

Those poor souls who use some ratio of the deleted row statistics in index_stats as justification for a rebuild are not even considering the whole picture. Let's just keep it our little secret as it might shock them ;)

Again, look at the presentation for more details.

Cheers

Richard Foote



Tom Kyte
November 24, 2004 - 7:57 am UTC

Thanks Richard -- I reference that (one of my favorite presentations) elsewhere.

On one of the pages -- I showed where the del_lf_rows "ratio" was awesome -- but the index really was in dire need of a cleanup, and later the del_lf_rows "ratio" was rather not as good -- but that index was perfect already.

A reader, March 03, 2005 - 11:20 am UTC

the document www.actoug.org.au/Downloads/oracle_index_internals.pdf
is junk

Tom Kyte
March 03, 2005 - 11:31 am UTC

absolutely brillant followup. (and I thought you were giving up on this site -- your comment in january said so.... but hey, welcome back)

In fact, I am so convinced by your clear arguments and dissertation that I will straight away set up a job to rebuild all of my indexes every 5 minutes. I mean if a rebuild is good -- doing it really frequently should be the best.



Geez. I guess if you had something intelligent to say about it you would have. In light of that -- all you'll do is have more people actually click thru to

</code> http://www.actoug.org.au/Downloads/oracle_index_internals.pdf <code>

and read a rather well put together technical explaination of the facts as they are. (i'll make it easier, it is now a link)


You know, it is funny -- when something doesn't agree with something you've been led to believe, and probably have staked part of your job on -- and that something is so clearly laid out, people attack with such compelling arguments like "it stinks", or "what junk". No mention of any sort of factually wrong information, no mention of caveats not considered -- no nothing actionable, just "it doesn't agree with what I 'know' to be true".

Well, it ain't what we don't know that gets us into trouble, it's the things we know that just ain't so that get us in trouble (loosely quoted from Artemus Ward, a humorist in the past...)

clicking the link

Frank, March 03, 2005 - 12:14 pm UTC

Very funny, I read the negative follow-up above and decided to take a look at the presentation, only to read your comment on returning
<quote>all you'll do is have more people actually click thru to ...</quote>
LOL

one doubt

Jagjeet Singh, March 17, 2005 - 5:59 am UTC


case 1 :

"Now, suppose empno is filled in via a sequence. Its always increasing and we
never reuse a value. In our system, we actually delete rows when we fire
someone (a real HR system would never do that). So, emp_idx1 develops "holes"
over time that will never be reused. It we find it to be X% "empty" over time--
it might be a candidate for a rebuild. We'll never reuse that deleted space
(unless we end up deleting all index entries on a block)"

Case 2:

" But, the index on ename -- that's a little more chaotic. Say we have employees:

Bill
Bob
Mary

Say we fire Bob, now we have:

Bill
X-Bob (deleted entry)
Mary

Should we rebuild an index like that? Probably NOT, the reason -- we are going
to hire Hillary:

Bill
Hillary
Mary

Hillary will reuse Bobs slot"

Sir,
what is the diff. in two test cases. If oracle will not use old slot in first case if we don't use lower value.

Then in second case Bob start with "B" and Hilary start with "H". how can index reuse the slot of B.

Thanks,
Js

Tom Kyte
March 17, 2005 - 9:12 am UTC

The slot was not for "B", the slot was for any value that falls between "Bob" and "Mary" -- any value, not the exact value that was there.


That is the difference, if you have an increasing sequence -- you'll never have a value that falls between two old values.

please ignore my previous question

Jianhui, June 23, 2005 - 5:13 pm UTC

Sorry about this, I have found following place is more relavant to this question, if you could, please remove my post from this thread.

</code> http://asktom.oracle.com/pls/ask/f?p=4950:8:6382127215492936598::::: <code>


Delete rows by Date, Insert rows by PK Sequence_Num

Charlie Zhu, January 11, 2006 - 4:58 pm UTC

Hi Tom,

Say there is an PK index, the value generated by Sequence Number.
I delete the data row by row by a date column,
and continue inserting new rows.

Most of our tables get this data changing model.

Should I rebuild the PK Index?

Thanks a lot.

Tom Kyte
January 12, 2006 - 10:49 am UTC

depends, do you run queries of the form:


select * from t where pk = :x;

they won't really benefit from a rebuild.


if you query:

select min(pk) from t;
select /*+ first_rows */ * from t order by pk;

they *might* - since you have a "sweeper" index and may well have some near empty leaf blocks on the left hand side of the index.

PK (Sequence Num) index, delete by an Update Date column

Charlie Zhu, January 12, 2006 - 1:45 pm UTC

It's 90 Millions rows table,
it gets 1M rows INSERT and DELETE each daily.
but DELETE by a row_update_date date flag column(randomly).

I wonder will the ReBuild PK index compact the PK index space used?
or I just leave the space maintenance job to Oracle db?

Yes, we run below so frequently, it's our product(dimension) table.
select * from t where pk = :x;

We seldon run
select min(pk) from t;
select /*+ first_rows */ * from t order by pk;


Tom Kyte
January 13, 2006 - 10:17 am UTC

if you seldom run the last ones, I would likely not bother - you might "reclaim some space" right after the rebuild, but the index will almost certainly grow to reclaim it back again anyway (eg: fighting a losing battle)

Index rebuild is only justified when you gain on blevel.

Sai, March 11, 2006 - 8:42 pm UTC

Do index rebuilds only if you think new blevel would be atleast one down from the original blevel, and that new index wouldn't get back to the original blevel pretty soon.

It all depends on the way you access and modify key values of that index.

Thanks,
Sai
</code> http://sai-oracle.blogspot.com/ <code>



Tom Kyte
March 12, 2006 - 1:43 pm UTC

what if the blevel doesn't change - but - the consistent gets done by your most popular query (or at least a very popular query) go way down for a long time?


Your initial sentence in that entry on
http://sai-oracle.blogspot.com/2006/03/how-to-predict-index-blevel-and-when.html

<quote>
Depth of an index would determine how many consistent gets are required for Index lookups in any joins and in index range scans.
</quote>

is not correct.  The blevel plus one predicts the minimum number of consistent gets required for the first key value.  It is also (plus one) the least number of consistent gets that could be necessary to retrieve all of the key values.

It is by no means a predictor of the maximum.


Not even close.


ops$tkyte@ORA10GR2> create table t ( x char(40) constraint t_pk primary key, y int );

Table created.

ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> insert into t
  2  select to_char( rownum, 'fm' || rpad('0',40,'0') ), rownum
  3    from all_objects;

49819 rows created.

ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> exec dbms_stats.gather_table_stats( user, 'T', cascade=>true );

PL/SQL procedure successfully completed.

ops$tkyte@ORA10GR2> select blevel from user_indexes where index_name = 'T_PK';

    BLEVEL
----------
         2

ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> set autotrace traceonly
ops$tkyte@ORA10GR2> select min(x) from t;


Execution Plan
----------------------------------------------------------
Plan hash value: 2094033419

-----------------------------------------------------------------------------------
| Id  | Operation                  | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |      |     1 |    41 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE            |      |     1 |    41 |            |          |
|   2 |   INDEX FULL SCAN (MIN/MAX)| T_PK | 49819 |  1994K|     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------------


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          3  consistent gets
          0  physical reads
          0  redo size
        447  bytes sent via SQL*Net to client
        385  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

ops$tkyte@ORA10GR2> select min(x) from t;


Execution Plan
----------------------------------------------------------
Plan hash value: 2094033419

-----------------------------------------------------------------------------------
| Id  | Operation                  | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |      |     1 |    41 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE            |      |     1 |    41 |            |          |
|   2 |   INDEX FULL SCAN (MIN/MAX)| T_PK | 49819 |  1994K|     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------------


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          3  consistent gets
          0  physical reads
          0  redo size
        447  bytes sent via SQL*Net to client
        385  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

ops$tkyte@ORA10GR2> set autotrace off
ops$tkyte@ORA10GR2> delete from t where  rownum <= 10000;

10000 rows deleted.

ops$tkyte@ORA10GR2> commit;

Commit complete.

ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> exec dbms_stats.gather_table_stats( user, 'T', cascade=>true );

PL/SQL procedure successfully completed.

ops$tkyte@ORA10GR2> select blevel from user_indexes where index_name = 'T_PK';

    BLEVEL
----------
         2

ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> set autotrace traceonly
ops$tkyte@ORA10GR2> select min(x) from t;


Execution Plan
----------------------------------------------------------
Plan hash value: 2094033419

-----------------------------------------------------------------------------------
| Id  | Operation                  | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |      |     1 |    41 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE            |      |     1 |    41 |            |          |
|   2 |   INDEX FULL SCAN (MIN/MAX)| T_PK | 39812 |  1594K|     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------------


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         63  consistent gets
          0  physical reads
          0  redo size
        447  bytes sent via SQL*Net to client
        385  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

ops$tkyte@ORA10GR2> select min(x) from t;


Execution Plan
----------------------------------------------------------
Plan hash value: 2094033419

-----------------------------------------------------------------------------------
| Id  | Operation                  | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |      |     1 |    41 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE            |      |     1 |    41 |            |          |
|   2 |   INDEX FULL SCAN (MIN/MAX)| T_PK | 39812 |  1594K|     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------------


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         63  consistent gets
          0  physical reads
          0  redo size
        447  bytes sent via SQL*Net to client
        385  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

ops$tkyte@ORA10GR2> set autotrace off


<b>I do however agree very much with:

It all depends on the way you access and modify key values of that index.
</b> 

Re:Index rebuild is only justified when you gain on blevel.

Sai, March 12, 2006 - 2:34 pm UTC

Thanks Tom, I've corrected my statement on that link.

Tom Kyte
March 13, 2006 - 12:00 am UTC

well, I disagree that blevel would be one of the prime indicators.

And it is very hard to determine when a new level will be spawned. Given a specific test case - sure, but "in general", not really.

b-levels and the cost of b-tree index access

Steven Hankin, March 13, 2006 - 10:44 am UTC

Supposedly the cost of a b-tree index access (Lewis, 2006) can be derived as follows;

c = blvl + (lblk * insel) + (cf * tbsel)

where;

blvl = B Level
lblk = # leaf blocks
insel = index selectivity
cf = cluster factor
tbsel = table selectivity

So having the indexes leading columns in the 'where' clause (insel) and ensuring the index is selective (insel) is a good start to reducing cost.

Also, having a low cluster factor is helpful (maybe reordering the data in the table, at the risk of adversely affecting other indexes) and ensuring the table selectivity is high (usual b-tree index requirement) are all good.

But the B Level by it's very nature (representing a logarithmic growth compared to table size) and the fact that it is not multiplied with any other property means it is always going to be relatively small and inconsequential.

And this is a good thing and a reason why we like using b-tree indexes.

references:
Lewis, J. (2006) Cost-Based Oracle Fundamentals, pp 69. Apress


Bitmap Index Rebuild

Jatin, July 30, 2009 - 3:46 am UTC

Hi Tom

Is there a difference in opinion regarding rebuilding bitmap indexes. Maybe because these go out of shape even on small DML operation (which ideally should not be done on columns with bitmap indexes)?
Tom Kyte
August 03, 2009 - 5:14 pm UTC

Bitmaps should

a) never be used in any system where concurrent modification is necessary against the indexed information

b) with single row DML statements, they respond well to big bulk statements (modify hundreds or hundreds of thousands of rows in a single statement)


If you do single row operations on bitmap indexed data - that falls into the "don't do that category". You will need to perform maintenance on such indexes...

Daniel Stolf, August 03, 2009 - 5:58 pm UTC

If i might add, Jonathan Lewis has a wonderfull post about how you can lose space to ITL if you have a burst of concurrent DML at an index.

http://jonathanlewis.wordpress.com/2009/07/28/index-explosion/

rebuild indexes

A reader, September 30, 2009 - 7:22 am UTC

Dear Tom,

We have 2 data base db1 and db2. They have the same tables (same name, same structure, same indexes, etc..)

The first data base db1 is an OLTP date base in which we are doing mainly insert and updates. Every day also via overnight jobs we are deleting data from db1 that are greater than 7 days.
We keep 7 days online in db1.

Every day we are archiving data coming from db1 into db2 via overnight batch jobs (before the delete in db1). Tables in db2 are mainly inserted and updated sometimes. There is no delete in db2

What could you please suggest about the indexes in both db1 and db2?

The same indexes exist in both db1 and db2. In db2 they are never deleted while in db1 they are deleted every 7 days

Thanks a lot
Tom Kyte
October 07, 2009 - 8:31 am UTC

... We have 2 data base db1 and db2. They have the same tables (same name, same
structure, same indexes, etc..)
...

unless you never insert/update/delete/merge into these tables
and unless you created db2 from a backup of db1

db1 and db2 are nothing alike, period.



I don't know what it means for an index to be "deleted", this sentence doesn't make sense to me:

... The same indexes exist in both db1 and db2. In db2 they are never deleted while
in db1 they are deleted every 7 days
...


I don't know what you are looking for me to suggest? Index space is efficiently reused in indexes, it is highly likely that you need to rebuild anything. You don't have what I call a sweeper index (read about that by SEARCHING for that word sweeper on this site). You delete ALL of the old data, you do not purge "most but not all" of the old data - so your indexes on monotomically increasing data are OK. As are your indexes on randomly arriving data like "last_name" and so on are OK. We'll reuse the space you freed up by deleting the next time you insert.


More to Explore

Performance

Get all the information about database performance in the Database Performance guide.