what is the advantage of using functional index here?
A reader, November 24, 2003 - 7:51 pm UTC
Why normal index wouldn't do
To reader: this index is a lot smaller
msc, November 25, 2003 - 2:53 am UTC
This (function based) index does not contain all 'N' values - it contains only single 'Y' record. So index will be quite small and fast for 'Y' values (of course you would use this only if you have few 'Y'-s and a lot of 'N'-s).
You could also use NULL to mean 'N' and skip functional index. I personally would rather use NULL approach.
smaller?
mikito harakiri, November 25, 2003 - 2:39 pm UTC
Yes, but who cares about memory these days? You'll spend much more onto developers time who would puzzle what fancy predicate in
select * from t1 where (case when c2 = 'Y' then c2 else null end) = 'Y'
is all about. I thought that Tom's book is against writing unreadable, unportable (cause you won't find the index NULL "feature" in any competitor's RDBMS), unreliable (cause oracle might decide to remove this bug in the future version) code.
November 25, 2003 - 3:31 pm UTC
it is not memory -- it is maintanence and the usability of the index itself (whether the optimizer will choose to use it).
maintaining the index with millions of "N's" and few Y's is bad (impacts DML)
using the index becomes iffy.
If your developers
a) do not comment
b) do not document
c) do not read manuals
d) eg: don't do there job
a simple "select * from dual" becomes pretty obscure too. I would never in a billion years not do something smart because someone might be dumb and not understand it.
My book was about EXPLOITING THE PRODUCT YOU BOUGHT to its 100% fullest.
It is decidely against doing things in a database independent fashion, decidedly against that (i went as far as to say "DON'T do that -- if you bought sqlserver, exploit the heck out of it. if you bought Oracle, use it to it's extreme).
Entirely Null entries are not indexed
It is documented to that effect
It is not a bug
Oracle DBA
David, November 25, 2003 - 4:37 pm UTC
Hi Tom,
I am facing a performance issue regarding to creating a function index. If the function "key value" is more then boolean, still a workaround for using the function index.
for example:
select C1, C2 from T
where MOD(C1, :V01)='XYZ'
Value ":V01" could be a number from 0 to 10.
Thanks in advance,
David
November 25, 2003 - 6:37 pm UTC
not going to work with binds. function based indexes must be deterministic.
you would need 10 indexes here.
Oracle DBA
David, November 25, 2003 - 10:06 pm UTC
Now, I got. Thanks for quick response.
--David
Function -based Index on Y/N Completed Flag
Peter, November 26, 2003 - 6:52 am UTC
Tom,
Excellent advice as ever, thanks very much.
Just to clarify for the other posters, this was a very simplified case to try and figure out why my FBI was not working.
In the real world, the flag C2 has several values ( Y, N, null, A, B, C, etc ).
For the most part, our system queries all the M's, C's, S's etc, so a Full Table Scan is used.
A B-Tree index exists on C2, and is used when our query just wants the Y's, which is great, but the index is huge.
I was looking for a way to cut down on the size of the index by using a FBI, and now I have it.
FBI Question ..
A reader, December 03, 2003 - 3:28 pm UTC
Hi Tom,
Does fbi work with rbo ?
December 03, 2003 - 6:45 pm UTC
nope, not even a little
Impact on DML
A reader, August 05, 2004 - 7:04 pm UTC
In general, what is the impact of FBI on DML against that table? The function is called/evalated for every row that affects that table, right?
Sort of like a row-level trigger? IMO a row-level trigger kills performance!
Is this something to consider before adding FBIs even though they benefit queries? i.e. consider the cost (DML) vs the benefit (query) before deciding?
Would native PLSQL compilation help here?
Your thoughts appreciated.
Thanks
August 05, 2004 - 9:31 pm UTC
remember - how many times is a row inserted?
vs
queried?
everything -- EVERYTHING is a trade off. common sense (which isn't) must be EMPLOYED here.
I wrote about this:
</code>
http://asktom.oracle.com/~tkyte/article1/index.html
I too can find evil lurking everywhere.
I too can find goodness around every corner.
if you modify the column 1% of the times you query the funtion the column - what do you think might be "the right approach"?
http://asktom.oracle.com/~tkyte/article1/index.html <code>
and remember please "O's" don't count.
for every example of "performance getting killed" by a trigger, I'll provide five examples of "triggers saving the day"
not that triggers are all about goodness,
or evil
or anything in between.
they are a *tool* to be wielded appropriately.
Impact on DML
va, August 06, 2004 - 9:27 am UTC
Great article on FBI. Very comprehensive and educational, as usual.
Couple of questions:
1. You said that I need to substr the FBI expression when the function returns varchar2. I found this to be not (always) true.
Connected to:
Oracle9i Enterprise Edition Release 9.2.0.4.0 - 64bit Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.4.0 - Production
SQL> create table t (i int);
Table created.
SQL> create or replace function f(p in int) return varchar2 deterministic
2 as
3 begin
4 return 'blah';
5 end;
6 /
Function created.
SQL> create index fbi on t(f(i));
Index created.
SQL> show parameter db_block_size
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
db_block_size integer 8192
How is this explained? 4000 is more than 1/3 of my block size, so how is Oracle able to create the index?
2. In my case, given my data distribution and the logic in my function, it returns only 2-3 distinct values against my 500K+ table. i.e. it has very low selectivity. So, even after creating the FBI, the CBO doesnt really use it. But I see that it *is* able to get a perfect estimate of the cardinality of any predicate involving this indexed column and that really helps the overall query (especially when this table is joined to other tables, views, etc)
In other words, the default selectivity that the CBO estimates when using a indexed column is 1%, as you demonstrated in
http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:1547006324238#20331466182478
In this case, my selectivity is closer to 50%.
Is this a appropriate use of FBI i.e. use it just to help cardinality estimates but not really use the index itself?
3. Even if I dont use FBI, but instead do
associate statistics with functions myfn default selectivity 50;
I get the same result i.e. the CBO assigns the proper cardinality and the query performs great.
Which is better, the associate statistics or FBI?
4. Regarding disabling FBIs during direct path sqlldr, I see the 'skip_index_maintenance' option to sqlldr. Is this what you are referring to? Disable all index maintenance and rebuild all indexes later? Or is there a way to disable just the FBI and rebuild that later?
Thanks a lot
Thanks
August 06, 2004 - 10:17 am UTC
1) i said you ALWAYS SHOULD, not that you ALWAYS NEED TO.
2) ok
3) seems #3 here would be more efficient for you - no index to maintain.
4) depends on the release of oracle as well (insert /*+ APPEND */ into t select * from external table is a direct path load in 9i that can maintain the index)
skipping index maintainence is one way.
setting the fbi unusable and skipping unusable indexes is another.
Size of FBI
va, August 06, 2004 - 10:02 am UTC
I cant understand the following
select INDEX_NAME, INDEX_TYPE, UNIQUENESS, COMPRESSION, LEAF_BLOCKS, DISTINCT_KEYS, AVG_LEAF_BLOCKS_PER_KEY, AVG_DATA_BLOCKS_PER_KEY, CLUSTERING_FACTOR, NUM_ROWS
from user_indexes
where index_name in ('I1','ACCNT#ACCNT_ID_PK')
I1 is a FBI and other one is a PK. Look at the results of this
INDEX_NAME INDEX_TYPE UNIQUENESS COMPRESSION LEAF_BLOCKS DISTINCT_KEYS AVG_LEAF_BLOCKS_PER_KEY AVG_DATA_BLOCKS_PER_KEY CLUSTERING_FACTOR NUM_ROWS
ACCNT#ACCNT_ID_PK NORMAL UNIQUE DISABLED 64 30306 1 1 726 30306
I1 FUNCTION-BASED NORMAL NONUNIQUE DISABLED 54 3 18 404 1213 29675
The FBI has just 3 distinct values but the PK index has 30306! So why is the storage (leaf blocks) needed by them almost the same? Even when I look at user_segments, they occupy the same size.
Why is the FBI so large?
Yes, in this case, I did do the substr() in the index expression
select * from user_ind_expressions where index_name='I1'
INDEX_NAME TABLE_NAME COLUMN_EXPRESSION COLUMN_POSITION
I1 ACCNT SUBSTR("UTIL"."MY_FN"("MY_COL"),1,1) 1
Thanks
August 06, 2004 - 10:23 am UTC
ummm, it is not the number of distinct values here -- it would be the number of rows that is relevant.
you can save *some* space with index key compression -- but you still have some 30,000 rowids to stuff in there.
tell me -- if you put into a table -- one billion number 99999999's, would you expect it to take more or less space than a one billion row table with unique numbers in it? (and why?)
A reader, August 06, 2004 - 11:01 am UTC
"tell me -- if you put into a table -- one billion number 99999999's, would you expect it to take more or less space than a one billion row table with unique numbers in it? (and why?)"
create table t1 (i int);
create table t2 (i int);
Stuff t1 with billion 9999999s and t2 with 1..9999999?
You are asking if the size would be different?
All INTs takes the same storage, so the size of the 2 tables should be identical, right?
August 06, 2004 - 11:49 am UTC
in oracle, ints are numbers. numbers are a varying length string - 0 to 22 bytes in length.
a billion 99999999s would probably take more room ;)
but the point was, it is the number of rows.... not the number of distinct values.
FBI
A reader, November 26, 2004 - 3:17 pm UTC
I have a table with millions of rows, a kind of workflow table where process periodically wakes up and needs the "pending" rows from the table.
create index i on t(case when status='P' then status end);
select * from t where case when status='P' then status end = 'P'
Works great. The index size is tiny, just the pending rows are in the index, the query is super fast, so far so good.
Now, what if some txn activity suddenly dumps 1000s of P rows in the table.
How would the CBO know that now a FTS would be more appropriate given that the data distribution has changed?
Most likely it wouldnt...so what is the workaround? How can I get the best of both worlds?
Thanks
November 26, 2004 - 3:23 pm UTC
<quote>
How would the CBO know that now a FTS would be more appropriate given that the
data distribution has changed?
</quote>
the same way the CBO gets to know *anything* -- statistics. Stats up to date -> optimizer knows.
A reader, November 26, 2004 - 3:32 pm UTC
Right, I do use the MONITORING option, gather stale, etc, but only once a day.
If some unexpected txn activity, dumps lots of rows, there is not going to be a opportunity to gather stats.
Would optimizer dynamic sampling help here? Quickly check if the stats are ok or drastically different from reality?
Thanks
November 26, 2004 - 4:01 pm UTC
the dynamic sampling is taken during the hard parse -- dumping lots of data in there won't cause cursors to reparse.
If you have an "unexpected" thing dump lots of stuff in there unexpectedly - that is something I would look into. why is this unexpected thing doing that and why cannot it either set the stats (let us know what it did) or gather them.
FBI and PK index
A reader, November 26, 2004 - 4:24 pm UTC
On that table with the FBI for the pending column, I need to process rows in the order they were created it. The PK of the table is a sequence-generated key.
So, I added a 'order by pk' to the query.
Now suddenly, it doesnt use the FBI anymore, it does a FTS on the table! Even when the rows returned are 10 out of a million. Why is this?
Thanks
November 26, 2004 - 4:37 pm UTC
*example* please.
ops$tkyte@ORA9IR2> create table t
2 as
3 select case when mod(rownum,500) = 0 then 'P' else 'X' end my_status,
4 all_objects.*
5 from all_objects
6 /
Table created.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create index t_idx on t( decode(my_status,'P',my_status), object_id );
Index created.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> exec dbms_stats.gather_table_stats( user, 'T', cascade=>TRUE );
PL/SQL procedure successfully completed.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> set autotrace traceonly explain
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select *
2 from t
3 where decode(my_status,'P',my_status) = 'P'
4 order by object_id;
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=59 Card=57 Bytes=5472)
1 0 SORT (ORDER BY) (Cost=59 Card=57 Bytes=5472)
2 1 TABLE ACCESS (BY INDEX ROWID) OF 'T' (Cost=52 Card=57 Bytes=5472)
3 2 INDEX (RANGE SCAN) OF 'T_IDX' (NON-UNIQUE) (Cost=2 Card=57)
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> set autotrace off
ops$tkyte@ORA9IR2>
Any new updated features or restrictions added in oracle 10
David, April 14, 2005 - 12:23 pm UTC
Tom,
All this about FBI is very helpful. I just have a very general question: Is there any new updated features or restrictions on functional indexes added in oracle 10?
Thanks a lot.
April 14, 2005 - 1:21 pm UTC
they are pretty much the same in 10 as 9ir2
FBI on grouped columns
Kumar, July 12, 2006 - 1:19 pm UTC
I have a table with data :
c1 c2
a 10
a 10
b 20
b 80
...
can column c2 be functioned indexed using sum() grouped on c1.
-- sum(c2) group by c1
July 12, 2006 - 3:56 pm UTC
eh? not following you on this one.
is what you want a precomputed aggregate of C2 by C1? If so, that might be a materialized view, but an index - no.
wrong cardinality
Katrin Toedt, February 08, 2007 - 8:26 am UTC
Hello Tom!
I'm struggling for a few days with a wrong cardinality calculation:
Verbunden mit:
Oracle9i Release 9.2.0.8.0 - Production
JServer Release 9.2.0.8.0 - Production
SQL> set timing on
SQL> CREATE TABLE TB_TEST
2 (
3 ZS_ID,
4 ZS_STATUS
5 )
6 AS SELECT ROWNUM,
7 'NICHT_AKTIV'
8 FROM ALL_OBJECTS,
9 ALL_OBJECTS
10 WHERE ROWNUM <= 1200000;
Tabelle wurde angelegt.
Abgelaufen: 00:02:24.07
SQL> UPDATE TB_TEST SET zs_status = 'Klärfall'
2 WHERE zs_id BETWEEN 500 AND 2000;
1501 Zeilen wurden aktualisiert.
Abgelaufen: 00:00:01.09
SQL> commit;
Transaktion mit COMMIT abgeschlossen.
Abgelaufen: 00:00:00.00
SQL> CREATE INDEX fidx_neu ON TB_TEST
2 (
3 CASE WHEN zs_status = 'Klärfall' THEN zs_status ELSE NULL END
4 );
Index wurde angelegt.
Abgelaufen: 00:00:05.04
SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS( USER, 'TB_TEST', CASCADE=>TRUE );
PL/SQL-Prozedur wurde erfolgreich abgeschlossen.
Abgelaufen: 00:00:10.03
SQL> set autotrace on
SQL> SELECT COUNT(*)
2 FROM TB_TEST zs
3 WHERE (CASE WHEN zs.zs_status = 'Klärfall' THEN zs.zs_status ELSE NULL END) = 'Klärfall';
COUNT(*)
----------
1501
Abgelaufen: 00:00:00.00
Ausführungsplan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=12)
1 0 SORT (AGGREGATE)
2 1 INDEX (RANGE SCAN) OF 'FIDX_NEU' (NON-UNIQUE) (Cost=1 Card=1200000 Bytes=14400000)
Statistiken
----------------------------------------------------------
7 recursive calls
0 db block gets
8 consistent gets
0 physical reads
0 redo size
414 bytes sent via SQL*Net to client
499 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL>
SQL> set autotrace off
SQL> set linesize 200
SQL>
SQL> SELECT index_name, table_name, sample_size, num_rows, clustering_factor,last_analyzed
2 FROM ALL_INDEXES
3 WHERE INDEX_NAME = 'FIDX_NEU';
SAMPLE_SIZE NUM_ROWS CLUSTERING_FACTOR LAST_ANA
----------------- ----------------------------
1501 1501 5 08.02.07
Why does the optimizer calculate Card=1200000???
I have to join this outcomme with other tables and due to this wrong cards I got terrible results...
Thanks a lot!
Katrin
February 08, 2007 - 10:53 am UTC
Might be a time for dynamic sampling
ops$tkyte%ORA9IR2> SELECT COUNT(*)
2 FROM TB_TEST zs
3 WHERE (CASE WHEN zs.zs_status = 'Klrfall' THEN zs.zs_status ELSE NULL END) = 'Klrfall';
COUNT(*)
----------
1501
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=12)
1 0 SORT (AGGREGATE)
2 1 INDEX (RANGE SCAN) OF 'FIDX_NEU' (NON-UNIQUE) (Cost=1 Card=1200000 Bytes=14400000)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
6 consistent gets
0 physical reads
0 redo size
407 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
ops$tkyte%ORA9IR2> SELECT /*+ dynamic_sampling(zs 3) */ COUNT(*)
2 FROM TB_TEST zs
3 WHERE (CASE WHEN zs.zs_status = 'Klrfall' THEN zs.zs_status ELSE NULL END) = 'Klrfall';
COUNT(*)
----------
1501
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=12)
1 0 SORT (AGGREGATE)
2 1 INDEX (RANGE SCAN) OF 'FIDX_NEU' (NON-UNIQUE) (Cost=1 Card=6000 Bytes=72000)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
6 consistent gets
0 physical reads
0 redo size
407 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
ops$tkyte%ORA9IR2> SELECT /*+ dynamic_sampling(zs 10) */ COUNT(*)
2 FROM TB_TEST zs
3 WHERE (CASE WHEN zs.zs_status = 'Klrfall' THEN zs.zs_status ELSE NULL END) = 'Klrfall';
COUNT(*)
----------
1501
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=12)
1 0 SORT (AGGREGATE)
2 1 INDEX (RANGE SCAN) OF 'FIDX_NEU' (NON-UNIQUE) (Cost=1 Card=1501 Bytes=18012)
Statistics
----------------------------------------------------------
17 recursive calls
0 db block gets
3694 consistent gets
2538 physical reads
0 redo size
407 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
ops$tkyte%ORA9IR2> set autotrace off
cardinality
DarrenL, February 08, 2007 - 1:01 pm UTC
it interesting if you repeat the test case using different length constants in the FBI
CREATE INDEX fidx_neu ON TB_TEST
(
CASE WHEN zs_status = 'Klärfall' THEN 'xxxxxxx' ELSE NULL END
);
then use the matching where clause..
if the length of xxxxxxx is < 7 then the cardinality is ok..anything over 6 and the card = number of rows in my tests.
advantage is gone...at least as soon as you have to join...
Katrin, February 08, 2007 - 2:23 pm UTC
Tom, thanks for the fast reponse, but...
the advantage is away:
6 consistent gets
to
3694 consistent gets
2538 physical reads
I made the mistake to test the index with
create index [...] compute statistics
then the card is perfect...
So does anyone know why the card calculation goes so wrong?
Katrin
February 08, 2007 - 4:21 pm UTC
give a for example please.
Why not use a Bitmap Index
pShah, July 23, 2007 - 5:48 pm UTC
If the values are going to be a limited set, wouldn't a Bitmap Index be bettter in this case?
July 24, 2007 - 9:31 am UTC
bitmap indexes are appropriate in read only, read mostly (eg; bulk load once a day/week whatever) environment.
A Y/N completed flag - single row inserts, single row updates - this system would lock up tighter than a drum if you used a bitmap index on it.
the simple update of "N" to "Y" would have the effect of locking hundreds/thousands of indexed entries - massive serialization, deadlocks all over the place, and the bitmap index would grow to an outrageous size quickly (yes, even in 10g)
they are not appropriate in a transactional system.