November 24, 2009 - 2:30 pm UTC
... I was wondering about creating the index in a 16K tablespace to reduce this high
Umm, that would probably INCREASE it. Close your eyes and envision what must happen.
You have a really large index - say it is on last name (not always increasing: sometimes you insert names starting with A, sometimes with Z, sometimes with K - all over the place).
Therefore, any of the leaf blocks could be touched (unlike a sequence-populated field, where the inserts would all go on the right-hand side).
And in actuality - ALL of the index blocks are being touched - since you have 20 sessions inserting randomly scattered data.
And they do not all fit in the cache with everything else going on.
If you make the blocks larger - they still all have to fit, but now each one takes twice as much space (true, there are half as many, but the total is the same, so that doesn't solve anything).
Now, since the blocks are larger, you likely increase the probability that when you go to look for block 'X' and do not find it, the cache will be full - and you'll have to flush out some index blocks, which you'll need to re-read again soon, and so on. But now you are paging with 16K blocks, not 8K blocks, so the units of IO are larger. You might wait for fewer db file sequential reads - but you'll wait LONGER for each one, because each is twice as big.
So, what to do?
Do not rebuild the index; just set up a keep pool for it and let it have its own memory region. If you make that pool large enough, you could well reduce the total number of re-reads on the index.
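A minimal sketch of that approach - the index name and pool size here are assumptions, not from the original post, and you'd size the pool against the index's actual segment size:

```sql
-- Size the keep pool (requires an spfile; 2G is a placeholder value)
ALTER SYSTEM SET db_keep_cache_size = 2G SCOPE=BOTH;

-- Direct the hot index's blocks into the keep pool
ALTER INDEX emp_last_name_idx STORAGE (BUFFER_POOL KEEP);

-- Check how large the index actually is before sizing the pool
SELECT bytes / 1024 / 1024 AS mb
  FROM user_segments
 WHERE segment_name = 'EMP_LAST_NAME_IDX';
```

The point of the keep pool is that the index no longer competes with the rest of the workload for buffer cache space.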
Partition the table by some increasing value, and locally partition the index that is giving you a problem. For example, suppose the records are all inserted with a default of SYSDATE in some column. Partition the table by that column into weekly partitions (or daily, monthly, hourly - I don't know, you figure out what keeps them small enough). You will now have the index partitioned by week, and your inserts only touch the current week (they all hit the newest partition). That index segment probably fits into the cache. This will help the inserts - but you need to make sure it doesn't hurt the queries!!! If your queries do not use this date column in their predicate, we'll have to index range scan ALL of the index partitions.
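As a sketch, assuming Oracle 11g+ interval partitioning and hypothetical table/column names (ORDERS, CREATED_ON are my inventions for illustration):

```sql
-- Weekly interval partitioning on a DEFAULT SYSDATE column:
-- new partitions are created automatically as time moves on
CREATE TABLE orders
( order_id    NUMBER,
  last_name   VARCHAR2(40),
  created_on  DATE DEFAULT SYSDATE
)
PARTITION BY RANGE (created_on)
INTERVAL (NUMTODSINTERVAL(7, 'DAY'))
( PARTITION p_initial VALUES LESS THAN (DATE '2010-01-01') );

-- LOCAL gives one index segment per partition; the inserts
-- only ever touch the newest segment
CREATE INDEX orders_last_name_idx ON orders (last_name) LOCAL;
```

Queries that include CREATED_ON in their predicate can prune to one partition; queries on LAST_NAME alone would have to probe every index partition.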
Instead of 20 processes doing conventional inserts, have a single process do a 20-way parallel direct path insert. We maintain the indexes very differently with a parallel direct path insert - each of the insert threads will create a mini index on just its data and then merge that in bulk with the existing index. This minimizes re-reads of blocks in a huge way - because we'll process ALL of the A's for a thread at once, then the B's, then the C's, and so on.
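That could look like the following sketch - the table names are assumptions (a staging table feeding the real one), and the degree of 20 is just the number from the example above:

```sql
-- Parallel DML is disabled by default; it must be enabled per session
ALTER SESSION ENABLE PARALLEL DML;

-- APPEND requests a direct path load; PARALLEL sets the degree
INSERT /*+ APPEND PARALLEL(orders, 20) */ INTO orders
SELECT * FROM staging_orders;

-- Direct path changes are not visible (even to this session)
-- until the transaction commits
COMMIT;
```

Note the trade-off: direct path inserts load above the high-water mark and lock the table against other concurrent DML, so this suits batch loading, not 20 independent OLTP sessions.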
Those are three ideas for you to think about.