Skip to Main Content
  • Questions
  • how to separate index table from data tablespace

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, Lilian .

Asked: July 03, 2001 - 10:49 am UTC

Last updated: August 25, 2010 - 1:11 pm UTC

Version: 8.1.6

Viewed 10K+ times! This question is

You Asked

Hi Tom:

in our production there is a tablespace called USER_DATA which included both all data tables and index tables, i am new in oracle dba career, i would like to know if i am going to move these index tables from USER_DATA tablespace and create a new index tablespace to hold these index tables:

1. how should i do it? would you show me some examples?
2. how do i decide a size of index tablespace?
3. if i do rebuild index monthly, would it clean fragmentation of the tablesapce?

thanks in advance.

an


and Tom said...

1) create the new tablespace and then:

alter index INDEX_NAME rebuild tablespace OTHER_TABLESPACE;

this can be on "online" as well.

2) you need to do that based on the size of the indices themselves. See how big they are now (select sum(bytes) from dba_segments where tablespace_name = 'USER_DATA' and segment_type = 'INDEX' );


3) don't schedule rebuilds of indexes on a periodic basis. An index rebuild is a rare thing to have to do. Do it only to indexes that need it and only when they need it.

Rating

  (26 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

oracle dba

L An, July 03, 2001 - 1:29 pm UTC

thank you very much for your quick response, and it's veru helpful.

Separating Data and Indexes improves performance

A reader, April 23, 2004 - 3:26 pm UTC

Hi Tom, please

Why "Separating Data and Indexes improves performance" is a myth?


I don't understand exactly what is this myth about?

</code> http://www.niall.litchfield.dial.pipex.com/OracleMyths.zip <code>

Thanks

Tom Kyte
April 23, 2004 - 3:35 pm UTC

tell you what, why don't you try to tell us why it would make a difference -- we can star there.

I mean, if you have been doing something as a "rule", you must have good reasons for doing so right?


so -- what would be the technical reasons, in 2004 (not 1987) for doing this.

Even in 1987 the technical reasons were weak (having small and lots of extents would have been a better way to accomplish what this was attempting to do but that would have broken another "rule", that of "few extents")....



A reader, April 23, 2004 - 6:17 pm UTC

The reason is not fragmentation,

else the possibility to move tables and indexes in two separated tablespaces.
Then set every one in a distinct physical disk, to allow "simultaneous" access, I read that in somewhere in the documentation.
In a small database that doesn't have real big contention, this will allow, in theory, speed up access to data, becase then you'll can to read "at the same time" the data and the indexes.

Reading
</code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:2913600659112 <code>

Because my database is small (about 4G) , and never an oltp, I think this is a good idea, originally we did it, even when our customer use logical disks instead of physical disks, because there are not any serious performance problem, but maybe is better another solution?

:) Thanks

Tom Kyte
April 23, 2004 - 7:55 pm UTC

we do not access them "simultaneously" though.

You do not ever read the index and table data "simultaneously"

we need to read index root block IN ORDER to read
the index branch block IN ORDER to read
the index leaf block IN ORDER to get the rowid to read
the TABLE data

it is all very serial and all 100% random IO (eg: don't even start mentioning "head position", to go from root to branch to leaf to table is 4 totally random IO's regardless of WHAT disks organization you have in place).


the goal is to achieve even io. if you use small extents, and create a single tablespace with files on all physical devices -- we'll allocate extents in a round robin fashion on the disks -- achieving "poor mans raid" -- this is what the "split indexes from tables" was -- back in the day when a big schema had like 10 tables and 50meg of data total.


Index seperate tablespaces how index works

nn, April 26, 2004 - 4:07 am UTC

Tom

You are really genius , i must say GURU
Tom Dont you provide any training or hire asst. DBA's

A reader, April 26, 2004 - 9:03 am UTC

Thanks Tom, sorry to ask again,
Only to be sure I understood.

The point is, as I understand, (in the case you have two physical disk for the database) :
1) There is not any important advantage to separate in distinct tablespaces and in distinct physical disks to indexes and data .
2) Is better to separate data and indexes, in distinct tablespaces, in distinct physical disks, based in other criteria like concurrent access, for example: customer data in one disk and financial data in other disk, because they are accessed at the same time.

Thanks


Tom Kyte
April 26, 2004 - 9:38 am UTC

1) correct
2) incorrect (especially today)

what you want to do is have even IO over BOTH disks regardless of system. what if customers use the system from 9-5 and finance from 5-midnight. be a bummer to have a single disk per type of data.


Even IO
over ALL devices.

A reader, April 26, 2004 - 1:17 pm UTC

Thanks Tom

A reader, April 27, 2004 - 3:32 pm UTC

Hi Tom,

Searching a reason to keep indexes separated from data.
I got from Tim G., an interesting reason to separate index from tables, because indexes can be recreated from tables you can move all indexes to another tablespace.
This will reduce the possibility to have a damage in critical data.

For example our customers don't have dbas and keeping indexes separated from data, in other disk can help to protect more the data tablespace and if the damage come to the index tablespace, you know you can rebuild any index and at last you can rebuild every one (for a small database is about 1-2 hour, so this is not an impossible task), meanwhile they can survive using fullscans.

So This is a interesting idea, if one day one datafile have to "die", because the destiny had choosen to damage several disk blocks , the possibility to be the index datafile instead of the data datafile is rasonable, and if the damage hit the index, the problem is not as serious.



Tom Kyte
April 28, 2004 - 2:02 pm UTC

That -- that is something I've said as well :)

The reasons for separating index from data (X from Y in general) are not related to performance at all.

They are related to ADMINISTRATIVE or PERSONAL (because you like to) reasons only.



A reader, April 28, 2004 - 5:57 pm UTC

thanks

seperate index and data tablespaces

A reader, June 12, 2004 - 5:39 am UTC

Hi

I agree that separating data tablespace and index tablespace would not improve read performance. How about insert/loading data? When you insert does Oracle updates at the same time data and index or it update first data then index (serial access)?

Also I think separating index and data tablespaces can sometimes make administrative tasks easier. I mean even Oracle own databases separate index from data....

Tom Kyte
June 12, 2004 - 9:46 am UTC

the updates are done to the buffer cache and written out in the background.


That last paragraph nails it -- that is the only reason to even consider separating. because you "want to".

Me, I prefer for many applications to have a single tablespace -- easier to transport, backup, PITR, manage.

how to separate index table from data tablespace

Doug, July 19, 2004 - 3:18 pm UTC

I believe some of the confusion comes from Oracle's own training material.

If you look at the current books for 9i on page 3-4 of the Performance & Tuning Training Manual, it states the following:

Tables and indexes should be split into separate tablespaces, because indexes and tables are often inserted into and read from simultaneously.

I would definitely think this to mean, put data and indexes in separate tablespaces.

Regards,
Doug

Tom Kyte
July 19, 2004 - 4:43 pm UTC

yes, that finally got corrected in the current docs.

Using Multipe SANS devices

Steve Gillis, November 24, 2004 - 11:03 am UTC

I am currently working on an application that has about 20 tablespaces. In current standards I think it would qualify as a small application, say 100 tables with no more the 5 million rows in the big tables (only 5 - 10 tables are **big** tables, and by today's standards those aren't even big). I am trying to convince them to move to a single tablespace for all objects, both indexes and tables and I am getting the following roadblocks:

1. "If we use multiple SANS devices it would be faster to have the indexes on one sans and tables on another."

Based on the fact that index -> table access is serial I would assume that having tablespaces spread across sans might actually slow things down. Any truth to this?

2. Having a single tablespace makes it more likely that if the tablespace gets corrupted all data will be lost. With multiple the chance of loosing everything is lessened.

If a tablespace gets corrupted (I have never seen one) will all the data in the data files be lost? Or is there any truth to this argument at all?

Tom Kyte
November 24, 2004 - 12:19 pm UTC

1) on some disk arrays there is truth to this:

place objects that do single block IO on this device
place objects that do multi-block IO on this device

The reason -- read aheads that kick in on some storage arrays, if they detect that you frequently do large multi-block IO's, they start anticipating that and read ahead. If not, they just do single block IO. Now, if you have a mixture of the two you either

a) get pinged because your single block IO's are really doing multi-block IO's or
b) your multi-block IO's don't go as fast as they could because the read ahead does not kick in.

So.... if you never anticipate lots of full scans (and remember, we full scan tables, we fast full scan indexes -- just because something is an index does not imply single block IO, just because something is a table does not imply multi-block IO) -- a single device is all you want.

If you have both kinds of IO going on and you can point at an object and say "thou art of the multiblock IO class" and vice versa -- split them out that way.

(so sometimes table/index split works by accident -- but not because indexes and tables are like matter and anti-matter, but because of this read-ahead that can happen if your storage array's do that).


You use tablespaces for ADMINISTRATIVE purposes. Some people separate index from data because they don't backup index, just tables (they recreate the indexes in the even of a failure).

Tablespaces do not get "corrupt"

Blocks do (and rman can fix a block)
Files do (and rman can fix a file)

and if corruption of all files means "you lost data", it really means "you need new DBA's cause the ones you got cannot perform the ONLY THING they have to be able to perform -- a recovery.....






Thanks

Steve Gillis, November 24, 2004 - 1:10 pm UTC

Thanks a lot for the info above, it is very useful.

I do have one more question regarding the serial nature of index -> table access. Is index -> table access always serial, or are different types of index scans handled differently, such as index-range scan and fast full index scans?

Again thanks for all the help you provide on this site.

Tom Kyte
November 24, 2004 - 1:49 pm UTC

we read an index to read a table.

we would need the quantum version of Oracle (coming in version 20 i believe) in order to be able to read in a single thread the index and the TABLE BLOCK corresponding to that index simultaneously.

we use an index to go to a table

single block io -- read root block
single block io -- read branch block
single block io -- read leaf block which has row id
single block io -- read table block


1, 2, 3, 4.... in order.


index range scan is 1,2,3,4,3,4,3,4,3,4,3,4..... (in general)

some index range scans are

1,2,3,3,3,3 --> sort rowids collected --> single block ios by those rowids.


a fast full scan is when we pretend the index IS the table (and we don't need the table, but even if we did it would again be "index" gets us to "table")

Thanks again

Steve Gillis, November 24, 2004 - 2:10 pm UTC

Once again i really appreaciate all your help here. Your responces have been helping me invalidate all the myths that Oracle is difficult to manage. I admint I have been only working with Oracle for about 2 years but it seems like it is very much overmanaged. With proper configuration, and knowledge of all the features( and there are a lot) it doesn't seem like there is that much to manage at all.

1,2,3,4,3,4...

Jan, November 25, 2004 - 3:35 am UTC

this just imply me that if I have a control over physical disks
then I should separate a table and indexes on a different disks (=> put it on diff. tablespaces due to performance reasons => follow a myth). Yes, the system is multi-users and datablocks may not be spread contiguously but there is still better chances for better I/O time then having it on one disk (e.g. users may not everytime compete, they works could be serial by coincidence).

1,2,3,4,3,4 - why not 1,2,3,3,3,3,4,4,4,4,3,3,3,3,4,4,4,4.. ?



Tom Kyte
November 25, 2004 - 9:46 am UTC

how about this concept:

1 tablespace
N files in that tablespace.

or

1 tablespace
1 big old striped volume with tons of physical devices


Now that 1,2,3,4 sequence could be hitting disk 1, disk 2, disk 3, disk 4


how does putting an "index" over there and a table "over here" achieve even IO?

===>>> striping <<<==== does that. So, give me one logical volume nicely striped and put it there. Nirvana.


In a normal index range scan, it is:

root -> branch(s) -> leaf, get a row, leaf, get a row, leaf, get a row

why -- because you are fetching a row at a time or you are nested loops joining or whatever.


or it can be

root -> branch(s) -> leaf(s) (gather all rowids, sort them by file/block and then access all of the table rows)

with array fetching it can simulate

1,2,3,4,4,4,4,4
3,4,4,4,4,4
3,4,4,4,4,4



so, data and index on same filesystems ..

A reader, August 25, 2006 - 3:15 pm UTC

Tom,
I totally agree and now understand your reason for NOT separating data and index tablespaces. So, to drive the nail in coffin, please confim it this configuration is acceptable?

Lets assume, there are 4 filesystems(ora01-ora04) and 2 tablespaces(DATA1 and INDEX1)

DATA1
=====
/ora01/data1_1.dbf
/ora02/data1_2.dbf
/ora03/data1_3.dbf
/ora04/data1_4.dbf

INDEX1
======
/ora01/index1_1.dbf
/ora02/index1_2.dbf
/ora03/index1_3.dbf
/ora04/index1_4.dbf


Tom Kyte
August 27, 2006 - 8:47 pm UTC

let's assume that.

now what?

now ....

A reader, August 28, 2006 - 4:29 pm UTC

Thanks.

All I wanted to be clear is that by assuming the filesystems are all in seperate LUNS(striped), then the above configuration is acceptable, right ?

Just need a clarification on that point.

Thanks.

Tom Kyte
August 28, 2006 - 5:30 pm UTC

One LUNS - striped - would be fine?

Have a doubt..

A reader, August 28, 2006 - 8:43 pm UTC

Great material to read. I have learn a lot from this. I just have one doubt.

Tom, how does oracle then achieve parallel processing when reading (select) and doing dml. You said all is a serial process, then how does parallel dml/select works ?

Thanks!

Tom Kyte
August 29, 2006 - 6:52 am UTC

we can do a full scan in parallel for example because we break the table up into non-overlapping bits and each bit is processed by a separate process/thread.


however, you cannot IN PARALLEL (at the same time) do an index range scan AND the table access by index rowid. You need to have completed the range scan PRIOR to performing the table access by index rowid.


So, a query like:

select * from t where x between 100 and 200;

Oracle can break that up into virtual queries that look like (conceptually)

select * from t where x >= 100 and x < 150;
select * from t where x >= 150 and x <= 200;


to do a "parallel two" - each of the "parallel processes" would execute their bit of the query. But even in this case, each would be doing a serial process (read index on X, find rowid, table access by index rowid)

one LUN ?

A reader, August 29, 2006 - 2:39 pm UTC

Thanks Tom.
So, are you saying that we should create only 1 LUN on the whole system and put all the datafiles in that same mount point ?

So, is the below configuration/location of the files OK?

DATA1
=====
/ora01/data1_1.dbf
/ora01/data1_2.dbf
/ora01/data1_3.dbf
/ora01/data1_4.dbf

INDEX1
======
/ora01/index1_1.dbf
/ora01/index1_2.dbf
/ora01/index1_3.dbf
/ora01/index1_4.dbf


Tom Kyte
August 29, 2006 - 4:44 pm UTC

I'm saying:

please achieve even IO distribution over all of your devices.


that is all, you can do it with 4 luns, you can do it (probably easier) with one big one there (and just go with two files).

OK, Thanks much Tom.

A reader, August 30, 2006 - 12:09 pm UTC


How Can I check from remote place that DBfiles and Idxfiles are seperate or not?

Maulesh Jani, November 17, 2006 - 9:24 am UTC

Hi TOm,
Thanks for your valuable support . As I understand to seperate the Index and Data files on Seperate Tablespace will not give more benifit untill we put it on different disks .
Now I am having one problem that from Remote-Place how can I check that this are on different disks. I knows the whole path for all such tablespace but not been able to find out where they are mounted ! Can you please hepl me on this .
O.S. is HP-Unix and Oracle version is 9.2.0.5.0 .

Regards
MJani


Tom Kyte
November 19, 2006 - 3:52 pm UTC

why do you understand that separately indexes and data PERIOD in any case WOULD be good from a performance perspective.

explain the science please.

I believe you have been mislead, but - hey - try this: convince me.

PITR - many apps in one tbs?

Tom Fox, November 19, 2006 - 10:42 pm UTC

Tom,

Reading through this thread, and saw this response:

</code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:901906930328#19624973885275 <code>

Me, I prefer for many applications to have a single tablespace -- easier to transport, backup, PITR, manage.

If you have many apps in one tablespace, and one of those apps require a restore (say an app admin deleted data and committed and did the big whoops!), then you would have to restore each application's data to a prior point in time (aka, pre-whoops). Wouldn't that be a good reason to split your apps into multiple tablespaces?

Tom Kyte
November 20, 2006 - 2:54 am UTC

opps, the word many should have been "MY"


yes, it was a tablespace PER application

separate Index and Data Tablespace

Anil, November 20, 2006 - 1:11 pm UTC

Tom,

When data load is happening thru loader using Direct path. I Mean in that case does it make sense to have separate Index and Data tablespace

regards,
Anil


Tom Kyte
November 20, 2006 - 1:52 pm UTC

create tablespace i datafile '/d01/i.dbf';
create tablespace d datafile '/d01/d.dbf';

do you think that by having two tablespaces - anything would be gained performance wise given the above? I'll put indexes in I, data in D.

but, they are on the same volume - so who cares.


and if /d01 is massively striped, really who cares.


and if



create tablespace i datafile '/d01/i.dbf';
create tablespace d datafile '/d02/d.dbf';

but /d01 and /d02 share the same underlying storage - again, WHO CARES.


You want IO distribution, think "io", not tablespaces (collections of segments), think about the underlying IO infrastructure

OK, agreed, but how about backup?

J. Sieben, October 19, 2008 - 7:44 am UTC

Tom,
I learn a lot from what you're saying in this post. The only concern that comes up for me is that one reason for sharing tables (and indexes) among several tablespaces is the ability to do a fine grained backup strategy.
Let's assume that we have busy tables and others which are seldomly used. Wouldn't it be a valid approach to separate these tables into two distinct tablespaces to allow for, let's say, incremental backup of the busy tables instead of a cumulative backup strategy for the others?

Best regards,

J├╝rgen
Tom Kyte
October 21, 2008 - 11:23 am UTC

... The only concern that comes
up for me is that one reason for sharing tables (and indexes) among several
tablespaces is the ability to do a fine grained backup strategy.
...

fine, that is great, that is exactly what I've been saying!

From 2004 for example:

That -- that is something I've said as well :)

The reasons for separating index from data (X from Y in general) are not related to performance at
all.

They are related to ADMINISTRATIVE or PERSONAL (because you like to) reasons only.


you use tablespaces to make your life better, easier, happier. Tablespaces are an administrative things, a book-keeping thing, an organization tool - use them to make your job easier.

Using automatic storage management

A reader, February 05, 2009 - 3:13 am UTC

You said
"You want IO distribution, think "io", not tablespaces (collections of segments), think about the
underlying IO infrastructure"

In case of using Automatic Storage management(ASM), even thinking about IO also does not really matter, isn't it ?

-Vinu
Tom Kyte
February 05, 2009 - 10:37 am UTC

... In case of using Automatic Storage management(ASM), even thinking about IO also does not really matter, isn't it ? ....


of course it does - ASM is a tool to build and manage a striped and maybe redundant file system for a database.

Do you want just one big volume for everything? (probably not)

It makes it easier for the DBA to reallocate storage on the fly without incurring database downtime.

It does not remove the need to think, that is coming in version 23, the sentient version of Oracle.

Vinu, February 09, 2009 - 7:23 am UTC

Thanks for the response. I liked the last line while understanding the need for thinking. :)

"It does not remove the need to think, that is coming in version 23, the sentient version of Oracle. "

-Vinu

i am curious

tony, August 22, 2010 - 7:07 am UTC

let us say

index is written after the data

D1,D2 and D3 are data to write
I1,I2 and I3 are indexes to write

D1 is written at 1s
I1 and D2 are written at the same timeline 2s
I2 and D3 are written at the same timeline 3s
I3 is written at 4s

all processes takes 1s to complete(let this go)

time in seconds
1s 2s 3s 4s
user1 D1 I1
user2 D2 I2
user3 D3 I3

can we not say what happens in 2s or 3s are concurrent

if they are on seperate disk, i guess that IO is distributed when (I1 is to D2) and (I2 is to D3)
but not (D1 is to I1), (D2 is to I2) and (D3 is to I3)

what do you think?
Tom Kyte
August 25, 2010 - 1:11 pm UTC

what do I think about what??

I do not see any 'question' here.

If this is about recovery - don't worry - we got it covered. Redo is written to disk upon commit - any committed change can be replayed if the write to the datafile gets "lost" for any reason.