It wasn't replication - we wouldn't talk about replication (being somewhat opposed to the concept - especially multi-master - ugh)
It was about index contention on an index on a sequence number. You have a table T:
create table t ( x int primary key, .... );
and you insert s.nextval into X. All inserts will go into the same right hand side block. Now, if you have several sessions doing this at the same time - they will all naturally try to hit the same right hand side block - but they cannot really do that *at the same exact instant*. They will be serializing their bits and bytes modifications on that block. They will appear to be at the same time, but you know they cannot be. You'll have buffer busy waits.
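To make the setup concrete, here is a minimal sketch. The payload column, the sequence cache size and the 'some data' literal are assumptions for illustration only - the original table definition is elided. The last query is just one place you might look for the symptom.

```sql
-- Hypothetical setup: payload column and cache size are assumptions.
create sequence s cache 1000;

create table t
( x       int primary key,    -- the primary key index is fed by the sequence
  payload varchar2(100)       -- stand-in for the elided columns
);

-- Every session runs this. Each new X is larger than the last one,
-- so every index entry lands in the right-most leaf block.
insert into t ( x, payload ) values ( s.nextval, 'some data' );
commit;

-- Segment-level view of the symptom: which of my segments
-- have accumulated buffer busy waits?
select object_name, object_type, value
  from v$segment_statistics
 where owner = user
   and statistic_name = 'buffer busy waits'
   and value > 0
 order by value desc;
```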
So, you would like to fix this - to reduce or remove the buffer busy waits.
Three solutions were presented:
a) reverse key index. This would solve the buffer busy waits by spreading the inserts across the entire breadth of the index. All of the numbers that end with 1 would now start with 1, the ones that end with 2 would start with 2, and so on (after reversing the bytes). So, the index would be hit on the left hand side, the middle, the right hand side - all over the place. A sequence value like 5431531 would not be anywhere near 5431532, which would not be anywhere near 5431533, and so on. Contention = gone.
But.... Now the entire index had better be in the buffer cache whereas before - only the hot right hand side needed to be in the cache for efficient inserts. If the index cannot fit into the cache we might well have turned a buffer busy wait into a physical IO wait, which could be even worse (the remedy is worse than the symptoms).
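A reverse key index could be set up roughly like this - a sketch only, assuming the same hypothetical table T with a made-up payload column, and T_PK as an index name of my choosing:

```sql
-- Hypothetical table; the payload column stands in for the elided columns.
create table t
( x       int,
  payload varchar2(100)
);

-- Build the unique index with its key bytes reversed, so consecutive
-- sequence values are scattered across many leaf blocks.
create unique index t_pk on t ( x ) reverse;

-- Let the primary key constraint use that index.
alter table t add constraint t_pk primary key ( x ) using index t_pk;

-- An existing, conventional index can be converted in place instead:
-- alter index t_pk rebuild reverse;
```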
b) globally hash partition the index. This would again solve the buffer busy waits by spreading the inserts into N insertion points where N is the number of partitions you used. So, if you used 64 partitions - we'll have 64 small indexes instead of one large one. Each of the 64 hash partitions will have a "slightly warmer than cold" right hand side we'll be inserting into. Each right hand side will see 1/64th of the number of inserts per second - a dramatic reduction in contention.
And you don't need the entire index in cache, just the 64 small right hand sides...
But..... You are now using partitioning for 'tuning', instead of for ease of administration or query performance. We'd rather use it there - we don't necessarily want to partition for something like this (it works, but removes your ability to partition for something else). Also, it requires the partitioning option. And lastly - in a RAC environment - this wouldn't work very well because you'd be constantly pinging those slightly warmer than cold right hand side blocks back and forth between the instances.
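For reference, a globally hash partitioned index with 64 partitions might look like the following. Again a sketch under assumptions: the table and its payload column are hypothetical, T_PK is a name I picked, and you need the partitioning option for this to work at all.

```sql
-- Hypothetical table; the payload column stands in for the elided columns.
create table t
( x       int,
  payload varchar2(100)
);

-- 64 hash partitions = 64 small indexes, each with its own
-- right hand side taking about 1/64th of the inserts.
create unique index t_pk on t ( x )
  global partition by hash ( x )
  partitions 64;

-- Let the primary key constraint use the partitioned index.
alter table t add constraint t_pk primary key ( x ) using index t_pk;
```

A power of two for the partition count is the usual choice with hash partitioning, since it keeps the partitions evenly sized.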
c) use a new type of key, a scalable key of your own design. Using either three columns (good idea) or a single column (works but less flexible, might impose a size limit on your key and consumes maximum space) - make a key composed of:
1) instance_id - the number 1, 2, 3, 4, .....
2) mod(session/process_id,some_small_number) so you get things like 23, 56, 89 and so on - just one, two or three digits
3) a sequence number
So, if your key starts with instance id, each node in a RAC cluster would have its own subtree of the index to insert into - so you don't have many internode collisions on the inserts.
and if your key continues with something about the session/process - we'll spread the inserts out across a few insertion points within each instance (so no/reduced intra-instance/node collisions)
and the sequence will make it all unique.
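The three column flavor of that key could be sketched as below. Everything named here is an assumption for illustration: the column names, the payload column, the modulus of 16, and the use of SYS_CONTEXT to get the instance and session ids (one way to get them, not the only one). It also assumes the sequence S from above already exists.

```sql
-- Hypothetical table: column names and sizes are assumptions.
create table t
( instance_id  number(2)  not null,   -- 1, 2, 3, 4, ...
  sess_bucket  number(3)  not null,   -- mod(session id, small number)
  seq          number     not null,   -- sequence value, makes the key unique
  payload      varchar2(100),         -- stand-in for the elided columns
  constraint t_pk primary key ( instance_id, sess_bucket, seq )
);

-- Each instance inserts into its own subtree of the index, and within an
-- instance the sessions are spread over (at most) 16 insertion points.
insert into t ( instance_id, sess_bucket, seq, payload )
values ( to_number( sys_context( 'userenv', 'instance' ) ),
         mod( to_number( sys_context( 'userenv', 'sid' ) ), 16 ),
         s.nextval,
         'some data' );
commit;
```

The modulus is the knob here: bigger means more insertion points per instance (less contention), but also more warm blocks to keep in the cache.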
The "best" way would probably be (c) in most cases - however, it requires you to have thought of these things at design time. (how often does that happen ;) )
After that - the hash partitioned index is nice, but you need an environment where it is possible.
after that - the reverse key index is looking good - if you have the extra cache, but if you don't - you'll probably be better off *leaving the problem alone* or *reducing the amount of concurrency* (remember this:
http://www.youtube.com/watch?v=xNDnVOCdvQ0 )
does that clear it up?