Thanks for the question, Keith.

Asked: April 08, 2001 - 10:52 pm UTC

Last updated: June 13, 2017 - 9:53 am UTC

Version: Oracle 8i 8.1.7

Viewed 10K+ times! This question is

You Asked

Tom,

I have many primary keys that are sequences. Thinking of using reverse keys. How do I change the index created by the primary key? Rebuild as a reverse key?

Thanks.

Keith Cutler

More info:

Reading through the books I received with my classroom training for Enterprise DBA Part 2: Performance and Tuning, reverse keys are suggested for use with columns using sequences. I have very many sequences used to create primary key values. That is why I posed this question. Why would you suggest not using a reverse key? Thanks as always.

and Tom said...

why not:

reverse key indexes void the ability of an index range scan.
They only work with exact equality.

They create the worst case index structure (if we always insert on one side, that actually generates a very good index structure. If you are interested in this, see "Practical Oracle8i" by Jonathan Lewis -- an EXCELLENT book IMO. He goes over this in some detail)

why you would:

you are using OPS and need to remove a hot spot from an index on a table every node inserts into. Period. thats the only reason.

Rating

(44 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

But how are the keys reversed

Praveen, May 05, 2003 - 1:43 am UTC

Hi Tom,

Learned a lot from your site. I have doubt here. I was going through some question while preparing for my OCP fundamentals I paper when I came across this question. It says:
The value stored in an index for a column is '596849'. The DBA then issues the alter index reverse statement. What does the data in the index now look like?
A. '948695'

B. '849596'

C. '596849'

D. '695948'

And the answer which is said correct is '695948' but I thought the answer to be '849596'. My doubt is how the keys are reversed and stored in the reverse key index?

May 05, 2003 - 7:37 am UTC

well, not knowing if that is a varchar for sure or not, I tested with a number and a varchar:

create table t ( x number, y varchar2(10) );
insert into t values ( 596849, '596849' );

create index t1_idx on t(x);
create index t2_idx on t(y);

i dumped the blocks, the t1_idx had these bytes:
col 0; len 4; (4):  c3 3c 45 32       

and t2_idx had:
col 0; len 6; (6):  35 39 36 38 34 39

<b>and if you 

  1  select dump(596849,16),
  2         chr(to_number('35','xx'))||
  3         chr(to_number('39','xx'))||
  4         chr(to_number('36','xx'))||
  5         chr(to_number('38','xx'))||
  6         chr(to_number('34','xx'))||
  7         chr(to_number('39','xx'))
  8*   from dual
ops$tkyte@ORA920> /

DUMP(596849,16)          CHR(TO
------------------------ ------
Typ=2 Len=4: c3,3c,45,32 596849

You can see how that data matches up

Then I:</b>

alter index t1_idx rebuild reverse;
alter index t2_idx rebuild reverse;

<b>and dumped again...</b>

col 0; len 4; (4):  32 45 3c c3
col 0; len 6; (6):  39 34 38 36 39 35

<b>which is:

 1  select chr(to_number('39','xx'))||
  2         chr(to_number('34','xx'))||
  3         chr(to_number('38','xx'))||
  4         chr(to_number('36','xx'))||
  5         chr(to_number('39','xx'))||
  6         chr(to_number('35','xx'))
  7*   from dual
ops$tkyte@ORA920> /

CHR(TO
------
948695

it reverses the entire string -- I fail to see how (d) could be the correct answer.  The only answer that could be correct would be (a)</b>

reverse key index

A reader, September 11, 2003 - 6:20 pm UTC

Sir,
In your book expert one-on-one you say that reverse key indexes are for use mainly in OPS environment. In these forums you do recommend using sequences, if possible. In
situations, where sequences are used in *single* instance oracle in indexkey generation, would you say that reverse key indexes are the way to go in a concurrent environment assuming the loss of range scan is not a issue? Basically I dont get the caveat 'Only in OPS' since single instance situations could also face the same problem if multiple processes are hitting the same index block.

Rgds..

September 11, 2003 - 8:19 pm UTC

we are pretty good at concurrency controls and row level locking.

what sort of rates are you talking about here?

Reverse Key Index

A reader, September 11, 2003 - 10:55 pm UTC

Sorry, I do not have the numbers since it was a theoritical thought that crossed my mind. Put it other way, I was trying to get your esteemed opinion/thought to this question:
From a reverese-key-index block point of view what difference *it* sees being accessed thru a multi-instance server process Vs a single instance server process to perform less for the single-instance server process. I am sure i am missing something.

Thanks as always...

September 12, 2003 - 9:49 am UTC

for 99.999999% of implementations it is far far beyond where you have to go.

most people never incurr the sort of contention in a single instance you might need in order to really think about this feature. initrans and multiple freelists are what most people need to look at if anything.

More clarification on Reverse Index

Anil Pant, September 12, 2003 - 5:45 am UTC

Hello,
When I had read abt the Reverse key Index I was understood that it stores in reverse order. Whether its varchar or number. And ur statement 'it reverses the entire string -- I fail to see how (d) could be the correct answer. The only answer that could be correct would be (a)' also say the same thing.
What will happen if index on date in reverse order ? Also I could under stand how u arrived that answer is (a). I mean this query

select chr(to_number('39','xx'))||
chr(to_number('34','xx'))||
chr(to_number('38','xx'))||
chr(to_number('36','xx'))||
chr(to_number('39','xx'))||
chr(to_number('35','xx'))
from dual

CHR(TO
------
948695

September 12, 2003 - 10:21 am UTC

dates are fixed width 7 byte values.

they will just be reversed.

that query just takes the ascii codes (in hex) and returns the ascii character assoicated with them.

Range scan

Reader, September 12, 2003 - 8:09 am UTC

"reverse key indexes void the ability of an index range scan.
They only work with exact equality.'

Does it mean if my query needs to do range scan on cols with reverse key index, does oracle ignore using the reverse key indexes on those columns and go for full table scan instead? Thanks for clarification.

I have bought your new book, Effective Oracle by Design .. This weekend is going to be a reading weekend....

September 12, 2003 - 10:25 am UTC

if you create a reverse key index on a number column "N" then

where n between 1 and 50

*cannot* range scan that index. the data is no longer stored sorted. we could fast full scan that index, we could full scan that index - but we could not really range scan it.

HOWEVER, if you create a reverse key index on a pair of columns (N,M) then:

where n = 5 and m between 1 and 50

can range scan -- we'll range scan to find N=5 and look at each of the N=5 values in the index for M between 1 and 50...

but in general, NO range scanning with reverse keys.

Reverse Key Index and RBO

Vivek Sharma, March 27, 2004 - 12:51 pm UTC

Dear Tom,

Does RBO make use of Reverse Key Index ? I have a table in which the primary key is populated by sequence and I see lots of Buffer Busy waits on the primary key Index and hence I suggested to drop and recreate it as a revarse key index. Also before suggesting, I had made sure that the queries on this table and on this column are used with equality and not range scan. But immediately after recreating this index as a Reverse Key, the queries started doing Full Table Scan. The optimizer_mode is choose but the schemas are not analyzed hence the optimizer is RBO.

When the Index was dropped and recreated as NORMAL, the Index was used.

Hence my question are

1. Does RBO does not recognize Reverse Key Index ?
and 2.Mine is Oracle 8174 database with Single Instance. So this type of Index is only useful in OPS ?

Regards
Vivek

March 27, 2004 - 4:44 pm UTC

RBO cannot see a reverse key index, no.

This index can be useful in other cases (to spread the insert activity out, to prevent "sweeping indexes" where you delete much of but not all of the left hand side (so you do not need to rebuild)...

but you have to use the CBO

Reverse Key

Pravin Ningoo, June 23, 2004 - 6:27 am UTC

all the info was very helpful...

i have 2 tables that are partitioned on a column generated by a sequence number, the transactions is more than 2 million records a day. The environment is a RAC with SAN Storage.

i join the 2 partitioned tables for AN ID (no range scans).

would a reverse key index improve performance in JOINs?

cheers
Pravin

June 23, 2004 - 9:21 am UTC

doubtful. if anything, it would slow it down, you have to perform the extra step of reversing the bytes.

Reverse Index

Desmond Campbell, June 25, 2004 - 11:06 am UTC

I wanted Tom to expand on his comment below about reverse index as I though that inserting on one side only would
create an unbalance index tree.

"They create the worst case index structure (if we always insert on one side, that actually generates a very good index structure."

June 25, 2004 - 4:14 pm UTC

you have an index built on an increasing value.

When Oracle goes to split an index block, there is an optimization (known as the 90/10 split).  If we notice that you are always inserting increase values (as you would with a primary key on a sequence), we do a 90/10 split -- 90% of the data goes "left", 10% of the data goes right.

What you end up with is a perfectly balanced index (not possible for it to be anything but, no matter what) with all of the blocks on the "left" 90% full and the one block on the right somewhere between 90 and 10% full.  That is, nicely compact.  Index utilization is running high -- very little white space.

Now, you reverse key it.  entries are no longer increasing, they go all over the place.  Oracle is performing more of a 50/50 split now.  On average -- about half of your index structure might be white space!.

ops$tkyte@ORA9IR2> create table t ( x int, y int );
 
Table created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create index t_idx_1 on t(x);
 
Index created.
 
ops$tkyte@ORA9IR2> create index t_idx_2 on t(y) reverse;
 
Index created.
 
ops$tkyte@ORA9IR2> create sequence s start with 999999999000000000;
 
Sequence created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> insert into t select s.nextval, s.nextval from big_table.big_table where rownum <= 100000;
 
100000 rows created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> analyze index t_idx_1 validate structure;
 
Index analyzed.
 
ops$tkyte@ORA9IR2> create table t2 as select * from index_stats;
 
Table created.
 
ops$tkyte@ORA9IR2> analyze index t_idx_2 validate structure;
 
Index analyzed.
 
ops$tkyte@ORA9IR2> insert into t2 select * from index_stats;
 
1 row created.
 
ops$tkyte@ORA9IR2> set echo off
HEIGHT                        : "2"
BLOCKS                        : "384"
NAME                          : "T_IDX_1"
PARTITION_NAME                : ""
LF_ROWS                       : "100000"
LF_BLKS                       : "276"
LF_ROWS_LEN                   : "2198988"
LF_BLK_LEN                    : "7996"
BR_ROWS                       : "275"
BR_BLKS                       : "1"
BR_ROWS_LEN                   : "4948"
BR_BLK_LEN                    : "8028"
DEL_LF_ROWS                   : "0"
DEL_LF_ROWS_LEN               : "0"
DISTINCT_KEYS                 : "100000"
MOST_REPEATED_KEY             : "1"
BTREE_SPACE                   : "2214924"
USED_SPACE                    : "2203936"
PCT_USED                      : "100"
ROWS_PER_KEY                  : "1"
BLKS_GETS_PER_ACCESS          : "3"
PRE_ROWS                      : "0"
PRE_ROWS_LEN                  : "0"
OPT_CMPR_COUNT                : "0"
OPT_CMPR_PCTSAVE              : "0"
-----------------
HEIGHT                        : "2"
BLOCKS                        : "512"
NAME                          : "T_IDX_2"
PARTITION_NAME                : ""
LF_ROWS                       : "100000"
LF_BLKS                       : "446"
LF_ROWS_LEN                   : "2198988"
LF_BLK_LEN                    : "7996"
BR_ROWS                       : "445"
BR_BLKS                       : "1"
BR_ROWS_LEN                   : "4430"
BR_BLK_LEN                    : "8028"
DEL_LF_ROWS                   : "0"
DEL_LF_ROWS_LEN               : "0"
DISTINCT_KEYS                 : "100000"
MOST_REPEATED_KEY             : "1"
BTREE_SPACE                   : "3574244"
USED_SPACE                    : "2203418"
PCT_USED                      : "62"
ROWS_PER_KEY                  : "1"
BLKS_GETS_PER_ACCESS          : "3"
PRE_ROWS                      : "0"
PRE_ROWS_LEN                  : "0"
OPT_CMPR_COUNT                : "0"
OPT_CMPR_PCTSAVE              : "0"
-----------------
 
PL/SQL procedure successfully completed.


see how t_idx_2 has lots more lf_blks, btree_space is higher, but used_space is the same...

Reverse index on foreign keys

Purushoth, January 27, 2006 - 6:10 pm UTC

90% of time when I search for something about Oracle I found it here. Thanks really for all your information and insight of "how things work inside oracle". I have a question on using reverse key in RAC 9i. We have lot of sequence generated primary key and indexed foreign key relations. However we have made the foreign key indexes as reverse indexes in addition to reverse primary key indexes. Reading this topic I have been thinking isn't good to have normal indexes on the foreign key?. Since foreign key indexes are not going to be skewed and may not result in index block contention is it safe to make them normal indexes?. Are there any issues we should be concerned on having normal indexes on foreign key in RAC apart from block contention?

Thanks.

January 28, 2006 - 12:55 pm UTC

it "depends".

do you tend to create child records at the same time as the parent - then you have the same issue on the child table as you do the parent.

Not often

Purushoth, January 28, 2006 - 7:23 pm UTC

Tom, We don't create child records while creating the parent record for 80% of the tables. Even when we create child records it's not all records. These child records are inserted over a period of time. Is there are any other issues apart block contention/skewedness (I read another topic here you explained how sequence generated primary key is good for space consumption) in RAC should we consider before making the foreign key as normal indexes?

Thanks!

January 29, 2006 - 8:23 am UTC

if you are not having a right hand side index (skewed would not really be a proper term here), whereby all of the inserts are always pretty much hitting just the right hand side - but rather that the insertion points are all over the place (as they would be if you create child records over time), then the benefit of the reverse key index as a way to reduce contention doesn't come into the picture and it would be unlikely that you are receiving benefit from it.

Clustering Factor on Reverse Index

Arindam Mukherjee, January 04, 2007 - 12:20 am UTC

Respected Mr. Tom,

I have an index on "Date" and "Number" column where "Date" is nonunique and "Number" is quite unique generated from Sequence. Let's assume to look the entries in Index like below.

17-feb-06, 1, rowid,
17-feb-06, 2, rowid,
19-feb-06, 3, rowid,
19-feb-06, 4, rowid,
20-feb-06, 5, rowid,
21-feb-06, 6, rowid,

Now if I rebuild it with "Reverse" cluse i.e. "Reverse Index", my understanding, the leading column "Date" in the index should close together even after reversing its byte order as it is NONUNIQUE. The UNIQUE column "Number" is associated with the leading column "Date" in the index. Index should be sorted FIRST on the leading column "Date" and then "Number".

But how does the clustering factor differ after having reverse index on this combination on Oracle 9i / 10g? Hope I am missing some logic or I don't know.

Please help me get it.

January 05, 2007 - 8:59 am UTC

if you use a reverse key index, you will not be range scanning it really - so the clustering factor is not going to be relevant.

Some clarification needed on reverse key index

Asim, June 21, 2007 - 11:33 am UTC

Hi Tom,
I was trying to execute the same sort of example in this thread to check the index value in reverse order after rebuild reverse. But don't know what I am doing wrong. May be I am confused.

See the output below and the last two dump queries still give me blocks in the same order even after doing rebuild reverse -

SQL> create table asim_temp ( x number, y varchar2(10));

Table created.

SQL>
SQL> insert into asim_temp values ( 453224, '453224' );

1 row created.

SQL>
SQL> commit;

Commit complete.

SQL>
SQL> select * from asim_temp;

X Y
---------- ----------
453224 453224

SQL>
SQL> create index asim_temp_idx on asim_temp(x);

Index created.

SQL>
SQL> create index asim_temp_idy on asim_temp(y);

Index created.

SQL>
SQL>
SQL> select dump(453224,16) from dual;

DUMP(453224,16)
------------------------
Typ=2 Len=4: c3,2e,21,19

SQL>
SQL> select dump('453224',16) from dual;

DUMP('453224',16)
-------------------------------
Typ=96 Len=6: 34,35,33,32,32,34

SQL>
SQL>
SQL>
SQL>
SQL> select dump('453224',16),
2 chr(to_number('34','xx'))||
3 chr(to_number('35','xx'))||
4 chr(to_number('33','xx'))||
5 chr(to_number('32','xx'))||
6 chr(to_number('32','xx'))||
7 chr(to_number('34','xx'))
8 from dual;

DUMP('453224',16) CHR(TO
------------------------------- ------
Typ=96 Len=6: 34,35,33,32,32,34 453224

SQL>
SQL>
SQL> alter index asim_temp_idx rebuild reverse;

Index altered.

SQL> alter index asim_temp_idy rebuild reverse;

Index altered.

SQL>
SQL> select dump(453224,16) from dual;

DUMP(453224,16)
------------------------
Typ=2 Len=4: c3,2e,21,19

SQL>
SQL> select dump('453224',16) from dual;

DUMP('453224',16)
-------------------------------
Typ=96 Len=6: 34,35,33,32,32,34
(WHY IS IT IN THE SAME ORDER - "34,35,33,32,32,34" INSTEAD OF "34,32,32,33,35,34")????????
SQL>
========================================================

Also I have another question -

If I issue a statement like
create index asim_1 on asim_temp(x asc) REVERSE;

Does oracle sorts the values first in ascending order and then build the reverse index or vice versa?

Thanks,
Asim

June 21, 2007 - 11:42 am UTC

the reversal is 100% transparent, not sure what you are looking for via the reverse function???

with a reverse key index - you'll never see the data "reversed" - only we do.

with a reverse key index, oracle will

a) reverse the bytes
b) insert the value sorted by the reverse bytes into the index

Reverse Key Index

Asim, June 21, 2007 - 12:07 pm UTC

Hi Tom,
Thank you very much for being so prompt.

This is what I was thinking I should get out of the last query in the above thread -
SQL> select dump('453224',16) from dual;

DUMP('453224',16)
-------------------------------
Typ=96 Len=6: 34,32,32,33,35,34

Thanks for confirming the order of the reverse key index with asc as sorting order.

Thanks,
Asim

June 22, 2007 - 10:02 am UTC

why do you think that.

you are comparing strings to numbers

you won't see reversed bytes UNLESS you reverse them yourself (it is *transparent*)

Reverse key Index

Asim, June 22, 2007 - 11:23 am UTC

Hi Tom,
Thanks for the clarification.
I got it.

Asim

Not using RAC

Mette Stephansen, September 24, 2007 - 8:56 am UTC

Hi Tom,

We have a large table (almost half a TB) consisting of audit data. We have a Java application with up to 25 threads inserting data into this table. The table has a sequence as primary key. It is partitioned on date and random number of thread 1-25 (no of concurren threads).

We do not have RAC :-)

Could you explain if a reverse key would help performance of this application (it is kind of a batch application, runnig async from the online app that generated the audit data. The app. reads from jms queues and writes into Oracle). We add approx 10G a day, and deletes approx 10GB a day (using partition drops).

Maybe remove the partitioning on random(1..25), change to reverse key index and keep the partitioning on date?

Any suggestions?

Best regards from
Mette

September 26, 2007 - 1:40 pm UTC

you do not say if you have identified this table as being a performance issue

or, if you have, why it is - what sort of issue you are hitting

therefore, short of using a crystal ball, no one could comment on what will make you faster

we could list hypothetical after hypothetical, but that just wastes time :)

Upps sorry

Mette Stephansen, September 27, 2007 - 4:40 am UTC

Hi Tom,

The performance of this java application IS indeed a performance issue. It takes off recored of a JMS datastore slower than new records coming in. So on a bad day messages pile up in the JMS stores, slowing down performace of the BEA Server, making other apps slowing down etc etc etc

So the app owner need to speed up the performance of the aplication (and they dont want to make it a true batch application, with ie. 20000 inserts before the commit - and just one thread, sadly).

In the beginning (with just a few 100 thousand messages a day) we did not have problems, but now we have issues - and day by day we have more and more messages coming in (and out)

In the row they insert there is a CLOB (containg an XML (not always valid)). Mostly this is smaller than the 4K boundry for "inline storage of lobs".

At the moment we are just looking into ideas for makeing performance better. It takes quite a bit of space and time (not to mention planning) to make a "real test" with loads of rows into this app - that is why we try to make a catalog of possible changes instead of just a "plain old test".

best regards
Mette

September 27, 2007 - 7:17 am UTC

whose JMS are you using?

eg: if you use Oracles, we use a database feature called "AQ" - advanced queues - which have attributes that make enqueue/dequeue very concurrent.

did the programmers build their own queue (in which case, tuning is going to be very difficult, if even possible) or are they using builtin capabilities?

Upps sorry

Mette Stephansen, September 27, 2007 - 4:40 am UTC

JMS - Reverse key question response

Mette Stephansen, September 27, 2007 - 7:57 am UTC

Hi Tom

I know for a fact that they don't use Oracles, I think they use Bea's (it resides on a BEA server).

My impression from their app is that the JMS datastore resides in "flatfiles" in the OS, but internally administrated by BEA, using 2-phase commits against JMS and Oracle.

We were just looking for possible quick wins, not having to tell them to recode/redesign (cause they wont). And then that "reverse key" thing popped up, since the 25 threads must use the same index blocks all the time (inserting by a sequence).

Mette

September 28, 2007 - 3:11 pm UTC

well, one cannot just say "use a reverse key index", for the application might range scan on that attribute - and then you'd be deader in the water than you are now.

You would be best served by finding out what the java application is WAITING ON (eg: tracing it, see the waits)

We can then see what we see there. trace a typical dequeue session that is experiencing poor performance, see what it DOES, see what is WAITS for and do not under any circumstances fix things that are not identified as being broken.

Stefan Pastrilov, April 22, 2008 - 2:46 pm UTC

How can I dump index block as the example of your answer about reverse-key index ?

April 23, 2008 - 5:59 pm UTC

http://www.google.com/search?q=dump+index+block+oracle

Reverse Key Index on Varchar2 column

Maverick, August 13, 2008 - 3:38 pm UTC

We are doing some searches from our App and one search is on Last_name. we currently use
last_name_upper like '%JAMES%' and since it doesn't use index, we are planning to build a reverse key index on this column last_name_upper. Please let me know if our theory works!!

Index idx_1 on last_name_upper
Index idx_2 on last_name_upper reverse [not sure about syntax]
so, if I am searching on %JAMES%, it does JAMES% and SEMAJ% which are going to use index.

now, can we do like this?

[pseudo code]

reverse_p_value Varchar2(50):=UPPER(REVERSE(p_value)); -- This will reverse the string

last_name_upper like p_value% or
last_name_upper like reverse_p_value%..

Something like this..

Will this work? or My understnding of Reverse Key index is not correct?

August 18, 2008 - 9:39 am UTC

absolutely will NOT work

and if you do %XXXX% (leading and trailing, it would not do *a thing* for you. You just moved the percent.

see:

http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:37336026927381#37360199308431

Got more questions

Maverick, August 18, 2008 - 11:09 am UTC

Tom, I did not entirely understand your reply. I came to know I will not be able to do that after posting the questiosn here as I cannot have two different indexes on same column. But your response was 'ABSOLUTELY WILL NOT WORK'. Can you elaborate why it will not work?
If I need '%string%', logically, isn't it same as
string% and gnirts% together?
But I am sure I cannot do that as I cannot create two indexes on same but there will be a workaround for it like reverse string on another column and doing an OR on both columns will use index.

Also, I followed the link and tried the CONTEXT Search like it was mentioned. It's working fine but got problems with most common strings. eg: in a last_name I might try
'%string%' and this might match atleast 7000 rows in my table . I keep getting this error:

ORA-29902: error in executing ODCIIndexStart() routine
ORA-20000: Oracle Text error:
DRG-51030: wildcard query expansion resulted in too many terms

Now, I have to restrict users to limit their searches. Any workaround for this one?

Thanks,

August 20, 2008 - 9:53 am UTC

you cannot use a btree index for the leading % search, reverse or not - it will NOT work.

I would look at the requirement here - if a leading percentage search returns with that error - then, well - maybe you just return that to the end user and say "please be a little more exact - this will return so much data your head cannot use it, it will be a waste of all of our time"

The question is - for last name, do you really need a leading %???

Client requested Leading %

Maverick, August 20, 2008 - 10:22 am UTC

Tom, thanks for your response, but our clients requested to do a wild card search on last_name as it is very important search element and users might not always remember exact last_name or exact first_name. So, is there any way to avoid that error message I mentioned above with Text Search other than putting more filters?

Thanks again for your help.

August 20, 2008 - 10:39 am UTC

more filters - think about it, the expansion list is huge, perhaps your code can:

a) try the text query

b) when that fails with "too long", just full scan the table and grab the first N records you find - you are hitting TONS OF DATA, the full scan is preferred.

How about Index Fast Full Scan?

kbalfe, August 20, 2008 - 10:28 am UTC

I just ran the following query on our Oracle 9iR2 database.

Select count(*)
From Persons
Where lastname like '%SMITH%'

It used an Index Fast Full Scan per the following:

2 SORT AGGREGATE Bytes: 8 Cardinality: 1
1 INDEX FAST FULL SCAN NON-UNIQUE MERID.PERSON_FNAME_LNAME Cost: 867 Bytes: 391,808 Cardinality: 48,976

(If I use the Upper function on lastname, however, it becomes a Full Table Scan -- about an order of magnitude slower elapsed.)

2 SORT AGGREGATE Bytes: 8 Cardinality: 1
1 TABLE ACCESS FULL MERID.PTPERSON Cost: 3,814 Bytes: 391,808 Cardinality: 48,976

So Oracle, in this case, did use an Index FFS for a leading '%' query, and it did speed up the query. Am I missing something?

August 20, 2008 - 10:45 am UTC

yes, it FULL SCANNED.

it fast full scanned the index, the entire index from start to finish. It used the index as a skinny version of the table.

Now, select * from persons where lastname like '%SMITH%', did it use the index?

probably not.

Index FFS vs FBI

Jmv, August 20, 2008 - 1:04 pm UTC

Tom,

I think kbalfe, above was asking why an "INDEX FAST FULL SCAN" was used for this statement:
Select count(*)
From Persons
Where lastname like '%SMITH%'

Versus, why a "TABLE ACCESS FULL" was used for an Upper on lastname (query not included with post above) but interpreted to be as follows:
Select count(*)
From Persons
Where upper(lastname) like '%SMITH%'

And in answer to kbalfe, the first query utilizes a scan of the index MERID.PERSON_FNAME_LNAME to search for '%SMITH%'. Such that 'John SMITH' would match and 'Jane Smith' would not.

Which is a different from the second query, wherein you are asking that all lastnames be converted to upper case and then compared with '%SMITH%'. Since there is no Function Based Index on Persons( UPPER(lastname)), the Optimizer has to do a full table scan for all lastnames, convert to upper case, then examine for a match to the result set. Such that 'John SMITH' and 'Jane Smith' would match.

So to make the second query more responsive, you may want to investigate the function based indexes as a way to improve performance. "Standard disclaimers apply...your mileage may vary..." ie an index may or may not help, it depends. Or cleanse the data to be of a standard formating eg all uppercase.

Hope that helps clarify.

August 20, 2008 - 1:14 pm UTC

I ignored that altogether since it *just doesn't matter*, they don't want to count rows, these indexes are not useful.

the question wasn't going to get them any closer to the answer....

Agreed, I need to put more filters..

Maverick, August 20, 2008 - 2:15 pm UTC

But,
I need to send a count to application and it is displayed for client. So, I cannot simple select some N Rows. I think only option I see is to put more filters.

Also,can you help me understand this?

CREATE TABLE TEST_INDEX
(
ID INTEGER,
NAME VARCHAR2(20 BYTE)
);

--inserted some 20000 rows in which string NAME1 matches 6111 records;

CREATE INDEX NAME_IDX ON TEST_INDEX
(NAME)
INDEXTYPE IS CTXSYS.CONTEXT;

set autotrace on;
select count(*) from test_index where upper(name) like '%NAME1%';

output
-------

COUNT(*)
----------
6111

1 row selected.

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer Mode=ALL_ROWS (Cost=20 Card=1 Bytes=10)
1 0 SORT AGGREGATE (Card=1 Bytes=10)
2 1 TABLE ACCESS FULL MYSCHEMA.TEST_INDEX (Cost=20 Card=1 K Bytes=12 K)

Statistics
----------------------------------------------------------
0 recursive calls
0 physical write total multi block requests
0 gcs messages sent
0 db block gets from cache
0 redo entries
0 java session heap collected bytes
0 cursor authentications
0 queries parallelized
0 Parallel operations not downgraded
0 Parallel operations downgraded to serial
1 rows processed

set autotrace on;
select count(*) from test_index where contains(name,'%NAME1%')>0;

output
-------

COUNT(*)
----------
6111

1 row selected.

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer Mode=ALL_ROWS (Cost=4 Card=1 Bytes=14)
1 0 SORT AGGREGATE (Card=1 Bytes=14)
2 1 DOMAIN INDEX MYSCHEMA.NAME_IDX (Cost=4 Card=12 Bytes=168)

Statistics
----------------------------------------------------------
12252 recursive calls
0 physical write total multi block requests
0 gcs messages sent
0 db block gets from cache
0 redo entries
0 java session heap collected bytes
1 cursor authentications
0 queries parallelized
0 Parallel operations not downgraded
0 Parallel operations downgraded to serial
1 rows processed

With LIKE and UPPER Function on it, COST is high but there are 0 RECUSIVE CALLS,
whereas using CONTAINS, COST is very less but RECURSIVE CALLS are high..

Which is a better plan?
what do we look at-> Cost or Recursive calls?
Why CONTAINS is doing many recursive calls

Time wise they is not much difference. I know data is too small to make any conclusion, but as far as cost/recursive calls,
why CONTAINS is not on par with LIKE?

August 21, 2008 - 8:26 am UTC

Oracle text has a function to "guess" the number of hits and frankly, if it exceeds what Text was willing to do - you should just say "you are seeing 1-10 of LOTS AND LOTS OF JUNK THAT YOU WILL NEVER LOOK AT"

counting rows - ugh, I cannot tell you how much I hate that. You realize in order to count the rows, you have to VISIT THEM ALL BEFORE SHOWING THE FIRST ONE

My proposed solution stands:

a) run the text query
b) if it fails, just run where last_name like :bind and full scan.

Index Fast Full Scan

kbalfe, August 20, 2008 - 4:10 pm UTC

Perhaps there is a misunderstanding re: the phrase "use the index". I interpreted an Index Fast Full Scan as "using the index".

Indeed, "Select * from persons where lastname like '%SMITH%" did a Full Table Scan. But "Select Lastname, Firstname persons where lastname like '%SMITH%" used the index because both columns are in the index.

I don't know Maverick's precise requirements -- they were not entirely clear to me. But isn't the "skinny table" approach to querying (using an Index Fast Full Scan) a legit way to improve performance under some circumstances? Of course, all of the queried columns must be in the index, but sometimes this is possble.

August 21, 2008 - 8:30 am UTC

a fast full scan uses the index as a skinny table - it reads the index unordered, it is a full scan.

is it legit in some cases? sure, but I don't see it applying here.

your answer kind of confused me :-)

Maverick, August 21, 2008 - 8:49 am UTC

"Oracle text has a function to "guess" the number of hits and frankly, if it exceeds what Text was willing to do - you should just say "you are seeing 1-10 of LOTS AND LOTS OF JUNK THAT YOU WILL NEVER LOOK AT"

what does this mean? I was wondering why LIKE is much better in performance than CONTAINS and I couldn't get exactly what you are trying to say in regards to that
I tried with much less rows [say 77, 95 etc] and still LIKE was performaning better than TEXT Search. If it failed on small rows and cannot retrive lot of rows, than why it is used at all? Sorry! But could not understand it properly..

"counting rows - ugh, I cannot tell you how much I hate that. You realize in order to count the rows, you have to VISIT THEM ALL BEFORE SHOWING THE FIRST ONE "

Well, if you have a reason to count total number of rows, then how else would you do that instead of counting? You mean count(*) is bad and I need to use count(column_name)? Can you be more specific?

August 21, 2008 - 9:22 pm UTC

I'm saying

never ever count the rows - never tell an end user "here are rows 1 thru 10 of 122,232,213" - tell them "here are 1-10 of more than 10 rows, hit next to see more"

Be like google.

Google doesn't have a page 100, they just say "stop it already, get serious"
Google doesn't tell you the real count, they say "there are lots, maybe millions"

Google gets it right, because they want to get you the results fast and in a scalable fashion, they know that by counting to the last hit - they would never work - never. So they never do that.

Neither should you.

Run the text query - if the user input was so vague as to have text say "sorry, too many", DO THE FULL SCAN QUERY without an index (if you dare). Else use the output from the text queries.

as for the last paragraph, I'm trying to tell you DO NOT COUNT. I do not know how to say it more clear. In order to find out how many rows would be returned by a query, you have to get to the last one. Now, think about this - text is saying "whoa there fellow, too many terms match this, this is nuts" - and you want to now count them all - you'll hit most of the table, you'll count a huge number and the end user will NEVER EVER look at that data. What a complete waste of cpu/io resources.

Index Fast Full Scan not apply?

kbalfe, August 21, 2008 - 11:28 am UTC

Yes, it's pretty clear that an Index Fast Full Scan is a full scan, but of an index.

I guess I see the Index Fast Full Scan as possibly applying here. I don't think Maverick's specification of his situation is clear or complete enough to really know.

I agree that counting records before doing something else is almost always a bad idea. It's not clear why he needs to do that either.

Counting rows

Maverick, August 22, 2008 - 9:11 am UTC

Well, Counting rows and displaying to user is the idea of our .NET Designers/developers who want to show how many total reacords are there in the system for that search and so are our USERS. You can explain this to .NET Developers to not show entire count but you can never win. You can educate your client not to go for full count, but not all the users [clients] are in terms with that.
My scenario, we have current application which is a client server based app and it shows total number of records in that. We are moving to WEB App. and customers want the same to be displayed there, to please their end users. Now you cannot really argue here and tell them they are being stupid and wasting CPU resources, now can we? So, alternative is to provide them what they want [that's why we are getting paid for :-)]

forget about Google. Google is worried about user base in entire world and we are worried about our small user base.

Let's for our discussion assume, not showing total counts, is not possible..so what's the best possible solution for this?

requirement again:
I have to show total number of records existing for a last_name search which is a wild card search.

Also, Tom, I still did not know which is a better PLAN: less COST or less number of RECURSIVE CALLS. Any Ideas?

Kbalfe, now I guess, you have an idea why I need full count :-)

Thanks for your help.

August 22, 2008 - 2:51 pm UTC

If you want to waste your computer resources and have really slow responding applications - do whatever you like.

If you want to hear my opinion on how to do it, read above.

Do not forget about google, what they do in this respect makes sense for a single user system as well as a multi-user system. It *makes sense*

I answered this already and will just reiterate it:

a) run the text query
b) if that fails, go ahead if you dare and run the full scan with %||:x||% as your input.

You want the query that returns the information with the least work. That - that will be a)

If a) fails, then b) is your option.

Counting rows

Maverick, September 16, 2008 - 1:48 pm UTC

Tom, just to continue our discussion on previous topic, I did a search on Google for "Oracle" and I got results with 1 of about 130,000,000. Doesn't that mean Google is doing a count as well? They cannot blindly say some random number.

so, don't you think they are also doing the same thing I was planning to do?

September 16, 2008 - 11:45 pm UTC

oh, so, hmmm

you think there are exacty 130,000,000 - all zeros - or do you think

THEY ARE SO GUESSING

tell you what, please go to page 100 and report back what you see. When you get to page 100 on google and report back, we'll continue discussing. You may use any approach you want to get to page 100 - but you have to let us in on how you did it - OK??

and you have never searched for something to have them say "you are looking at 1-10 of ABOUT (about - they say ABOUT for crying out loud) 42" only to hit page forward and have them say "you are looking at 11-14 of 14" (eg: they sort of change their MIND when they hit the end of the internet)

they totally

GUESS

(like I do, on my search)

eg:

 Results 1 - 10 of about 498 for asktom +obfuscation +rules. (0.37 seconds)

so I click on page 10

 Results 81 - 81 of 81 for asktom +obfuscation +rules. (0.10 seconds)

whoops....

and even when I do the "we omitted some, click here to see them all" it goes from

 Results 1 - 10 of about 572 for asktom +obfuscation +rules. (0.46 seconds)

 Results 221 - 230 of 230 for asktom +obfuscation +rules. (0.60 seconds)

whoops again..

So, no, I don't think they are completely and utterly wasting resources for no reason - they are doing what I said they do.

THEY GUESS and they don't really care if they guess wrong.

Doesn't really make sense in an application world

Maverick, September 17, 2008 - 10:05 am UTC

If I just guess 200 for a user search and display only one value, it looks very wrong

You are saying do not really count, just give a blind guess [????]
If a user searches on last name, I would say 1 of 10 of about 200.
and if my users click next page, I would say sorry there is only one record. So what happened to my remaining search records? They would come back and put a help desk ticket ..which you really can't do with Google search. They don't really care for how many they are bringing as they don't take your feedback and your grievances. But our users are going to pound us with help desk tickets.

What should I really say at that time?

September 17, 2008 - 11:48 am UTC

so what if it looks wrong.

Look - all the user needs to know is:

a) you are looking at N through M
b) there are some before N (if they are not on page 1)
c) there are some after M (if there is)

If you use the word 'about' or 'approximately' - I fail to see your point.

If say "1-10 of ABOUT 200" and they hit next page and you say

"11-14 of 14, not 200, sorry about that, we guessed"

then all is good.

I really think you should say:

"you are looking at 1 through 10 of MORE THAN 10, hit page forward to see more (and if you want the last page, please just click on the column heading and sort the data backwards so the last page becomes the first page)"

All anyone needs is:

a) to know they are looking at N-M
b) there is more
c) how to get to the last page if that is relevant

I am deadly serious here, the other approach of count the rows will force you to

COUNT THE ROWS, ALL OF THEM, TO THE END, THAT IS RUN THE ENTIRE QUERY, TO BUILD A COUNT OF ROWS, A COUNT OF A LOT OF ROWS THEY WILL NEVER EVER LOOK AT - EVER

that is, you will have wasted more resources than you actually expend to do useful work (90% of your effort will be counting rows the user will NEVER LOOK AT, 10% of your effort might actually be used)

So, if you want to expend an order of magnitude more cpu, io, etc resources - to give them a NUMBER THAT ISN'T ACCURATE EITHER - go right ahead. Don't ask how to make it go faster though, it won't go faster

(and the number will be wrong because you are probably querying data that changes, so by the time you count the answer might well have changed....)

How to guess the number of rows to be returned with text search

Vipen Koul, December 04, 2009 - 9:48 am UTC

Hi Tom,

How does one get the "guessed" number of rows to be returned with CONTAINS clause ?

Thanks

December 04, 2009 - 4:40 pm UTC

http://docs.oracle.com/docs/cd/B19306_01/text.102/b14218/cqrypkg.htm#CCREF0904

Primary Keys with sequences

A reader, June 06, 2011 - 1:30 am UTC

Hi Tom
The below text is from Performance Tuning Guide 11g Release 1 (11.1) B28274-02:
"Serializing within Indexes - Use of sequences, or timestamps, to generate key values that are indexed themselves can lead to database hotspot problems, which affect response time and throughput.This is usually the result of a monotonically growing key that results in a right-growing index. To avoid this problem, try to generate keys that insert over the full range of the index. This results in a well-balanced index that is more scalable and space efficient. You can achieve this by using a reverse key index or using a cycling sequence to prefix and sequence values."

Do we experience the same hotspot problem when we have primary keys on columns filled with values from growing sequence. If yes , Please suggest what should be the correct way to handle such situations.

- Thanks

June 06, 2011 - 9:50 am UTC

Do we experience the same hotspot problem when we have primary keys on columns
filled with values from growing sequence.

It depends. You need to have a high volume insert scenario to have this be a big issue. If you have LOTS of concurrent inserts into a table with an indexed sequence - you'll have everyone trying to simultaneously modify the same right hand index block. You'll see this as a buffer busy wait (or gc waits in RAC) event.

You can "solve" this by many methods...

a) reverse key index. if you reverse the bytes and then put the value in the index - then the number one will not be next to the number two and so on - you'll have spread the inserts out all over the breadth of the index structure, not just the right hand side.

b) global hash partitioning the index. If you take the index and hash partition it into say 128 partitions - you'll have decreased the buffer busy wait contention by a factor of about 128

c) using sys_guid() instead of a sequence. The values will be distributed all over the place.

d) using a key generated 'intelligently', for example:

to_char(instance_id,'00') || to_char(process_id,'00000') || to_char(sequence.nextval,'999999999999')

In that fashion - each RAC instance would work on its own portion of the index (decreasing gc waits) because the instance id would have each instance working on its own section of the index. And within an instance - each session would be working on its own bit of the index (cutting down on buffer busy waits).

Each of these solutions more or less rules out operations on the surrogate key that involve <, >, between - because the key values are NOT sorted anymore - but that should be OK since you don't tend to use those operations on surrogate keys - you use equals pretty much exclusively.

It depends...

Dana, June 06, 2011 - 11:00 am UTC

I had a scenario very much like this in a RAC cluster. The indexes were "hot" and generated high gc buffer busy waits.

As a test case, I generated simple tables using a regular b*tree, reverse key, and hash clustered indexes.

Then for each index type, I created a simple procedure with a for loop that executed a million inserts, and executed the procedure in multiple sessions on each node of the cluster.

Interestingly, the regular b*tree remained the best performer, followed by hash clustered, then reverse key.

My test was flawed, as I did not replicate the client coding. The client code held transactions open longer. I ran directly on the server with local connections.

In this case, what the client is doing and how seems to have a greater impact on contention than the physical structure of indexes.

It depends...

June 06, 2011 - 11:56 am UTC

it always depends. That is why I wrote:

It depends. You need to have a high volume insert scenario to have this be a big issue. If you have LOTS of concurrent inserts into a table with an indexed sequence - you'll have everyone trying to simultaneously modify the same right hand index block. You'll see this as a buffer busy wait (or gc waits in RAC) event.

If you don't see those events - then they are not the cause of any of your waits - so the techniques to reduce those waits won't apply...

it always depends :)

Primary Key Sequence Number Contention

Rob, February 08, 2012 - 11:29 am UTC

Hi Tom,

Following on from previous posts, I'd like to hear your views on this situation and whether a reverse key index would be suitable and whether or not performance will improve, degrade or stay the same.

- We have a sequence used for the primary key of a table
- The table has a very high number of inserts ~ 300 per second
- For every insert there are 3 updates to the same row
- The DB runs fine but then we hit a spike in contention and see wait events "enq: TX - index contention" and "buffer busy waits"
- This can be at peak load or at low load, it doesn't seem to have a distinctive pattern
- Running Oracle 10.2.0.4 on a Windows platform
- Single instance, not RAC
- Primary key is in a ASSM tablespace, with 128M extent sizes, using only 1 extent per day

Would a reverse key index help to alleviate the contention? Any ideas why this shows in non peak times? I thought it would make more sense for it to me in peak times if it's due to a contention issue.

Thanks in advance,
Rob

February 08, 2012 - 12:26 pm UTC

it can happen at any time, you have the perfect storm for contention. It would only take a few people doing something at about the same time to get people into a queue - maybe even have to 'wait' (go to sleep) and experience a longer wait.

A reverse key index might work IF

a) you only use equals when querying this field (no range scans on this single column reverse key index...)

b) you have the memory to keep the index entirely in memory. right now only the hot right hand side has to be there. With a reverse key index - every leaf block is subject to insertion. You could end up turning a buffer busy wait (short) into a physical IO wait (longer)

A global hash partitioned (into say 64 partitions) index might work IF

a) you only use equals....

b) you have partitioning :)

c) you have the memory to hold the 64 hot right hand sides

A new method for surrogate keys is used. You could:

ops$tkyte%ORA11GR2> select to_char(instance_number,'fm09' ) || to_char( mod(sid,100), 'fm000') || to_char(s.nextval,'fm000000000000' )
  2    from v$instance,(select sid from v$mystat where rownum=1)
  3  /

TO_CHAR(INSTANCE_NUM
--------------------
01092000001234567

1 row selected.

ops$tkyte%ORA11GR2> /

TO_CHAR(INSTANCE_NUM
--------------------
01092000001234568

1 row selected.

ops$tkyte%ORA11GR2> /

TO_CHAR(INSTANCE_NUM
--------------------
01092000001234569

1 row selected.

that way, if you ever do go RAC, you'll have each instance inserting into its own subtree in the index (01, 02, 03, .... for each instance). Contention free across instances.

And within each instance - you'll have the individual sessions inserting into their own subtree. Contention free within an instance....

Primary Key Sequence Number Contention Continued

Rob, February 09, 2012 - 5:34 am UTC

Thanks for your response, Tom. We are going to test the reverse key index solution, as we don't do any range scan lookups on the PK index and it doesn't involve a code change in the Application. We can also pin the index into the keep pool - it's only.

I wanted to double check that I'm right in assuming that the fact that the index is in a tablespace which has 128M extents and is only using 1 extent doesn't matter because it's at the block level that it will be reading into the buffer cache so there will be no wasted space? The only advantage I can see of putting it into a smaller extent size tablespace is to save space on disk?

Thanks,
Rob

February 09, 2012 - 5:53 am UTC

do not pin the index.

first - you cannot 'pin' it, it'll be managed like everything else.

second, setting up a keep pool for this would prevent you from using the memory for the greater good of the database.

Just let it go into the default pool with everything else and just make sure your sga settings are appropriate for the amount of data you want to cache.

also, you should reconsider your tablespace setup, use locally managed with SYSTEM ALLOCATED extents.

We'll only buffer blocks that actually have data. We won't cache blocks from the initial extent that don't have anything on them - correct.

Rob, February 09, 2012 - 6:33 am UTC

Thanks for your feedback.

I'm not quite sure I understand why not to assign it to the KEEP pool. My thinking is that we want all the blocks in cache so that they are always in the buffer cache after their first use. We already have a KEEP pool and I'm sure that we can accommodate this index in there. If we leave it in the DEFAULT pool, it runs the risk of ageing out, or are you saying that because the blocks will be so hot they won't get aged out and putting it to the KEEP pool will, potentially, waste memory?

Re. the second point: Our SGA is around 120GB, but we'd like to have it larger to cache all the data in the entire DB but we can't do that. So there is always going to be data which doesn't sit in the cache.

Thanks, I'll have a look at the System Allocated extents and we already have locally managed tablespaces.

Cheers,
Rob

February 09, 2012 - 9:42 am UTC

if you use it, they will be in the cache. if you do not use it, they will not be in the cache. The default cache works just fine 99.999999999% of the time.

It runs the same risk of aging out of the KEEP pool. The keep pool (and recycle pool for that fact) age out and flush blocks in a manner identical to the default pool.

reverse index on Partitioned table

Umair, February 22, 2012 - 2:18 am UTC

Hi Tom,
We are using RAC solution on 10.2.0.4, we are facing performance issue because of one sql query which is doing heavy insert and updates, with ADDM report we have observed ROW LEVEL CACHE issue… currently we are using standard indexing method on this particular partitioned table, when we have raised issue with oracle they have suggested for reverse indexes. This table have total 11 indexes out of which 10 are standard indexes and 1 partition indexed, can you please assist how we can implement the reverse index on Partitioned indexed. We also need to know does this reverse index will impact our performance issue…

Thanks & Regards,
Umair

February 23, 2012 - 12:44 am UTC

how many cpus do you have and how many concurrent sessions do you have?

share the details( the numbers ) you have with me here as well.

umair, February 27, 2012 - 2:52 am UTC

Extremely Sorry for replying late,

The no. of cpu's are 4 on both RAC node and concurrent session value is set as 0,

SQL> show parameter LICENSE_MAX_SESSIONS

NAME                                 TYPE        VALUE
------------------------------------ ----------- 
license_max_sessions                 integer     0
license_sessions_warning             integer     0
license_max_users                    integer     0

February 28, 2012 - 6:28 am UTC

I don't want a parameter setting, I want to know how many active sessions you have during your peak processing time.

v$session where status = 'ACTIVE'

or select * from v$license.

reverse key index

Tony, December 12, 2012 - 1:57 pm UTC

Great discussion Tom, we have RAC with 2 nodes and we see contention during heavy inserts on a table and we were thinking about reverse key index (for primary key generated by sequence). Your comments and sql below looks great but I can't understand how it will help in removing index contention

ops$tkyte%ORA11GR2> select to_char(instance_number,'fm09' ) || to_char( mod(sid,100), 'fm000') || 
to_char(s.nextval,'fm000000000000' )
  2    from v$instance,(select sid from v$mystat where rownum=1)
  3  /

December 17, 2012 - 3:31 pm UTC

close your eyes and envision the index....

the index is a big tree with lots of leaf nodes across the bottom.

If you have 2 nodes - then the beginning of all of your index keys would start with 01 or 02, followed by a 3 digit number followed by a 12 digit number.

So, all of your keys would look either like:

01xxxssssssssssss
02xxxssssssssssss

so, on one node - you would only generate keys starting with 01, they would go into the 01 section of the index tree - never the 02 section - and vice versa. So the leaf blocks for 01 entries would STAY in node one, all of the writes to 01 leaf nodes would always take place in node 1 (and vice versa)

so no more gc contention on leaf nodes - since node 2 will never write a 01 key and node 1 will never write a 02 key. Each will have their own "half" of the index.

Purging and sequential indexes

Dan Jerghiuta, December 20, 2013 - 8:31 pm UTC

Tom,

Your explanation of 90/10 filling is very useful. You also made a statement that unless you purge only part of the left side you still have a perfect structure.
Can you explain how the tree remains balanced when old data (smaller sequence numbers) is purged regularly?

Jess, June 12, 2017 - 1:49 pm UTC

Tom said, "you are using OPS and need to remove a hot spot [...] thats the only reason".

We have a 8 digit column, col1, (numeric, integer) on a table of few millions records. The business most commonly search the records using the last 3 digits of col1. Each search normally brings back 1 result. Now and again, however, the business search for the whole string (it's not a sequenced number, so the leading edge isn't always the same).

I was thinking this would be a great candidate for a reverse index: we would reverse the string, so the search for '%123' would become '321%'.

Having read this thread several times though, I still have a few gaps and uncertainties...

1. Is it worthwhile to get into reverse indexes here (given that there might be a need to index the whole column, for example if the GUI could trigger either an exact search on the full string or a reverse substring search) or to just have a regular b-tree index on the column and a function-based index, albeit still using 'reverse' as the function being applied, whether on the whole string or on the substring?

2. Tom's response to one of the posts says "reverse key index might work if you only use equals when querying this field"... But if we index (whether genuine 'reverse' index or function-based substring reverse) only the last 3 digits, that leaves a bit of technical debt, as someone would need to re-index when the numbers go up. I don't like the idea of forcing the business to only ever put in 3 digits in the GUI and doing exact match on reverse. Do both reverse index and function-based substring reverse handle the "like 123%' queries poorly?

3. With either (1) or (2), how does the database know to use this index (other than us hinting it in the query)?

4. The table is not partitioned at the moment, but we're considering range-partitioning it by created_date as part of this release (with a global index on col1 we search by). How does this help/hamper the reverse or function-based indexing if at all?

Many thanks... This reverse index thing is far from streamlined under the covers...

June 12, 2017 - 3:00 pm UTC

If you use a reverse key index you still write your queries in the same way. It only affects how Oracle Database stores values in the index. If you flip %123 to 321% you'll get different results!

create table t (
  x int, 
  y varchar2(10)
);

insert into t values (1, '123456');
insert into t values (2, '654321');

select * from t
where  y like '%456';

X  Y       
1  123456  

select * from t
where  y like '654%';

X  Y       
2  654321

So your queries stay the same. Now, you could use the reverse function and place that in an index. Which would help your queries searching the last values e.g.:

create index i on t (reverse(y));

select * from t
where  reverse(y) like '654%';

X    Y       
1    123456

But this is a completely separate concept to reverse key indexes.

A reverse key index is only useful if you compare the column using "=". If you use like, it'll do a full index scan.

Partitioning and it's impact on indexing is a big topic. You can read up on these at:

https://docs.oracle.com/database/122/CNCPT/partitions-views-and-other-schema-objects.htm#CNCPT112
https://richardfoote.wordpress.com/category/partitioning/

Or look at Connor's partitioning presentation from (look for ukoug_partitioning.pdf on the resources tab).

A reader, June 12, 2017 - 3:14 pm UTC

Tom,
Welcome back

June 12, 2017 - 4:04 pm UTC

It's Chris here, but thanks anyway ;)

Jess, June 12, 2017 - 7:13 pm UTC

Thanks Chris, I think I see where I got myself confused with the reverse index...

Sounds like the original suspicion that a function-based index might do the trick is, in fact, the way to go, provided the API reverses the input.

This being the case, would there be a material difference between allowing users to essentially search for "reverse(col1) like '123%'" and "reverse(col1) = 123"? Doesn't seem to be the case...

(Also, strictly speaking, do we need to worry about implicit conversions here?)

As far as partitioning, sounds like a global function-based index on reverse(col1) should do just fine, unless I've missed something obvious...

June 13, 2017 - 9:53 am UTC

This being the case, would there be a material difference between allowing users to essentially search for "reverse(col1) like '123%'" and "reverse(col1) = 123"?

Other than they do completely different things?

"reverse(col1) = 123" will only match the value 321.

"reverse(col1) like '123%'" will match any value that ends 321.

And the optimizer is likely to estimate a different number of rows for each, which could result in different plans.

(Also, strictly speaking, do we need to worry about implicit conversions here?)

Yes. You have to to_char your number if you want to pass it to reverse:

create table t (
  x int 
);

create index i on t ( reverse (x) );  

ORA-00932: inconsistent datatypes: expected CHAR got NUMBER

create index i on t ( reverse (to_char(x)) );

Index I created.

And in the general case you might have padding issues. "1" and "0000001" both represent the number 1. But the latter will turn up in a search for like '100%' whereas the former won't.

Make sure you test the global index to ensure it does what you expect. I don't know enough about your data to comment much further...

Live SQL

Share and learn SQL and PL/SQL; free access to the latest version of Oracle Database!

Dev Gym

Classes, workouts and quizzes on Oracle Database technologies. Expertise through exercise!

Reverse Key

Breadcrumb

Question and Answer

You Asked

and Tom said...

Rating

Comments

But how are the keys reversed

reverse key index

Reverse Key Index

More clarification on Reverse Index

Range scan

Reverse Key Index and RBO

Reverse Key

Reverse Index

Reverse index on foreign keys

Not often

Clustering Factor on Reverse Index

Some clarification needed on reverse key index

Reverse Key Index

Reverse key Index

Not using RAC

Upps sorry

Upps sorry

JMS - Reverse key question response

Reverse Key Index on Varchar2 column

Got more questions

Client requested Leading %

How about Index Fast Full Scan?

Index FFS vs FBI

Agreed, I need to put more filters..

Index Fast Full Scan

your answer kind of confused me :-)

Index Fast Full Scan not apply?

Counting rows

Counting rows

Doesn't really make sense in an application world

How to guess the number of rows to be returned with text search

Primary Keys with sequences

It depends...

Primary Key Sequence Number Contention

Primary Key Sequence Number Contention Continued

reverse index on Partitioned table

reverse key index

Purging and sequential indexes

More to Explore

Live SQL

Dev Gym