Skip to Main Content

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, Rakesh.

Asked: July 24, 2006 - 8:30 pm UTC

Last updated: July 31, 2013 - 4:13 pm UTC

Version: 9.2.0.7

Viewed 10K+ times! This question is

You Asked

Hi Tom,

I have a question regarding bitmap index and appreciate if can get some guidance as why it is needed?

I understand that Bitmap Index are needed for columns that have very low cardinality, for example gender column in employee table might be a good candidate for BITMAP index as it can have only 2 possible values "Male or Female".

The question/doubt I have is if there are say 30 millions rows in Employee table then why I would like to have an index (bitmap idex) on employee.gender column since there would be say 15 millions male and 15 millions female employee?

What value or usage it can have searching through such an index?

Appreciate if could help me understand the benefits and usage of Bitmap indexes.

Thanks a lot for your time and help.

Regards,
Rakesh

and Tom said...

bitmap indexes are *never needed*.

bitmap indexes - like any index - are optional.

Just because a column is "low cardinality" does NOT mean it is a candidate for a bitmap index. It might be, it might not be.

here is a snippet on this from my latest book

<quote src=expert oracle database architecture>

When Should You Use a Bitmap Index?
....

Bitmap indexes are extremely useful in environments where you have lots of ad hoc queries, especially queries that reference many columns in an ad hoc fashion or produce aggregations such as COUNT. For example, suppose you have a large table with three columns: GENDER, LOCATION, and AGE_GROUP. In this table, GENDER has a value of M or F, LOCATION can take on the values 1 through 50, and AGE_GROUP is a code representing 18 and under, 19-25, 26-30, 31-40, and 41 and over. You have to support a large number of ad hoc queries that take the following form:

Select count(*)
from T
where gender = 'M'
and location in ( 1, 10, 30 )
and age_group = '41 and over';

select *
from t
where ( ( gender = 'M' and location = 20 )
or ( gender = 'F' and location = 22 ))
and age_group = '18 and under';

select count(*) from t where location in (11,20,30);

select count(*) from t where age_group = '41 and over' and gender = 'F';

You would find that a conventional B*Tree indexing scheme would fail you. If you wanted to use an index to get the answer, you would need at least three and up to six combinations of possible B*Tree indexes to access the data via the index. Since any of the three columns or any subset of the three columns may appear, you would need large concatenated B*Tree indexes on

* GENDER, LOCATION, AGE_GROUP: For queries that used all three, or GENDER with LOCATION, or GENDER alone
* LOCATION, AGE_GROUP: For queries that used LOCATION and AGE_GROUP or LOCATION alone
* AGE_GROUP, GENDER: For queries that used AGE_GROUP with GENDER or AGE_GROUP alone

To reduce the amount of data being searched, other permutations might be reasonable as well, to decrease the size of the index structure being scanned. This is ignoring the fact that a B*Tree index on such low cardinality data is not a good idea.

Here the bitmap index comes into play. With three small bitmap indexes, one on each of the individual columns, you will be able to satisfy all of the previous predicates efficiently. Oracle will simply use the functions AND, OR, and NOT, with the bitmaps of the three indexes together, to find the solution set for any predicate that references any set of these three columns. It will take the resulting merged bitmap, convert the 1s into rowids if necessary, and access the data (if you are just counting rows that match the criteria, Oracle will just count the 1 bits). Let’s take a look at an example. First, we’ll generate test data that matches our specified distinct cardinalities—index it and gather statistics. We’ll make use of the DBMS_RANDOM package to generate random data fitting our distribution:

ops$tkyte@ORA10G> create table t
2 ( gender not null,
3 location not null,
4 age_group not null,
5 data
6 )
7 as
8 select decode( ceil(dbms_random.value(1,2)),
9 1, 'M',
10 2, 'F' ) gender,
11 ceil(dbms_random.value(1,50)) location,
12 decode( ceil(dbms_random.value(1,5)),
13 1,'18 and under',
14 2,'19-25',
15 3,'26-30',
16 4,'31-40',
17 5,'41 and over'),
18 rpad( '*', 20, '*')
19 from big_table.big_table
20 where rownum <= 100000;
Table created.

ops$tkyte@ORA10G> create bitmap index gender_idx on t(gender);
Index created.

ops$tkyte@ORA10G> create bitmap index location_idx on t(location);
Index created.

ops$tkyte@ORA10G> create bitmap index age_group_idx on t(age_group);
Index created.

ops$tkyte@ORA10G> exec dbms_stats.gather_table_stats( user, 'T', cascade=>true );
PL/SQL procedure successfully completed.
Now we’ll take a look at the plans for our various ad hoc queries from earlier:
ops$tkyte@ORA10G> Select count(*)
2 from T
3 where gender = 'M'
4 and location in ( 1, 10, 30 )
5 and age_group = '41 and over';

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=5 Card=1 Bytes=13)
1 0 SORT (AGGREGATE)
2 1 BITMAP CONVERSION (COUNT) (Cost=5 Card=1 Bytes=13)
3 2 BITMAP AND
4 3 BITMAP INDEX (SINGLE VALUE) OF 'GENDER_IDX' (INDEX (BITMAP))
5 3 BITMAP OR
6 5 BITMAP INDEX (SINGLE VALUE) OF 'LOCATION_IDX' (INDEX (BITMAP))
7 5 BITMAP INDEX (SINGLE VALUE) OF 'LOCATION_IDX' (INDEX (BITMAP))
8 5 BITMAP INDEX (SINGLE VALUE) OF 'LOCATION_IDX' (INDEX (BITMAP))
9 3 BITMAP INDEX (SINGLE VALUE) OF 'AGE_GROUP_IDX' (INDEX (BITMAP))

This example shows the power of the bitmap indexes. Oracle is able to see the location in (1,10,30) and knows to read the index on location for these three values and logically OR together the “bits” in the bitmap. It then takes that resulting bitmap and logically ANDs that with the bitmaps for AGE_GROUP='41 AND OVER' and GENDER='M'. Then a simple count of 1s and the answer is ready.

ops$tkyte@ORA10G> select *
2 from t
3 where ( ( gender = 'M' and location = 20 )
4 or ( gender = 'F' and location = 22 ))
5 and age_group = '18 and under';

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=77 Card=507 Bytes=16731)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T' (TABLE) (Cost=77 Card=507 …
2 1 BITMAP CONVERSION (TO ROWIDS)
3 2 BITMAP AND
4 3 BITMAP INDEX (SINGLE VALUE) OF 'AGE_GROUP_IDX' (INDEX (BITMAP))
5 3 BITMAP OR
6 5 BITMAP AND
7 6 BITMAP INDEX (SINGLE VALUE) OF 'LOCATION_IDX' (INDEX (BITMAP))
8 6 BITMAP INDEX (SINGLE VALUE) OF 'GENDER_IDX' (INDEX (BITMAP))
9 5 BITMAP AND
10 9 BITMAP INDEX (SINGLE VALUE) OF 'GENDER_IDX' (INDEX (BITMAP))
11 9 BITMAP INDEX (SINGLE VALUE) OF 'LOCATION_IDX' (INDEX (BITMAP))

This shows similar logic: the plan shows the OR’d conditions are each evaluated by AND-ing together the appropriate bitmaps and then OR-ing together those results. Throw in another AND to satisfy the AGE_GROUP='18 AND UNDER' and we have it all. Since we asked for the actual rows this time, Oracle will convert each bitmap 1 and 0 into rowids to retrieve the source data.
In a data warehouse or a large reporting system supporting many ad hoc SQL queries, this ability to use as many indexes as make sense simultaneously comes in very handy indeed. Using conventional B*Tree indexes here would not be nearly as usual or usable, and as the number of columns that are to be searched by the ad hoc queries increases, the number of combinations of B*Tree indexes you would need increases as well.

However, there are times when bitmaps are not appropriate. They work well in a read-intensive environment, but they are extremely ill suited for a write-intensive environment. The reason is that a single bitmap index key entry points to many rows. If a session modifies the indexed data, then all of the rows that index entry points to are effectively locked in most cases. Oracle cannot lock an individual bit in a bitmap index entry; it locks the entire bitmap index entry. Any other modifications that need to update that same bitmap index entry will be locked out. This will seriously inhibit concurrency, as each update will appear to lock potentially hundreds of rows preventing their bitmap columns from being concurrently updated. It will not lock every row as you might think—just many of them. Bitmaps are stored in chunks, so using the earlier EMP example we might find that the index key ANALYST appears in the index many times, each time pointing to hundreds of rows. An update to a row that modifies the JOB column will need to get exclusive access to two of these index key entries: the index key entry for the old value and the index key entry for the new value. The hundreds of rows these two entries point to will be unavailable for modification by other sessions until that UPDATE commits.
</quote>

Rating

  (20 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

why rebuild bitmap indexes,

A reader, September 29, 2006 - 3:09 pm UTC

I have heard that after updating the columns that has bitmap index, it will not appear in the SELECT statement until you rebuild the bitmap index.

Is it a Oracle bug, or a feature of bitmap index?

thanks,


Tom Kyte
September 30, 2006 - 7:58 am UTC

it is a lie.

so, none of the above.

Bitmap index candidate

Yoav, October 02, 2006 - 1:55 pm UTC

Hi Tom,

I gust start working as a dba in a data warehouse project.
I found that there isnt any single bitmap or bitmap join index .

I thought about checking each index distinct keys and if it
has less then 5% from the number of rows in the table,
then to check it as a candidate to bitmap index.

Is this approach seem reasonable ?
Can You suggest an alternative way ?
Thanks.


Tom Kyte
October 02, 2006 - 4:41 pm UTC

nope, it does not.

Understanding the design
What the end users ask
How and if a bitmap could be useful

those would be all good things first.

Bitmap indexes in a DWH

Philippe, October 03, 2006 - 2:42 am UTC

In a data warehouse, I use bitmap indexes :

- on (some) foreign key columns in the FACT tables : data warehouse queries often put restrictions on the DIMENSION tables joined to the FACT table. So, by placing restrictions on dimensions you get the keys needed to access the fatc table... and use the bitmap indexes. In general, the cardinality of the dimensions is lower than the cardinality of the fact.

- on a few columns of some dimensions

Keep in mind that the update (insert, update or delete on he underlying table) of a bitmap index is EXPENSIVE. So, before updating the tables, set the indexes to UNUSABLE and REBUILD them afterwards.


Tom Kyte
October 03, 2006 - 6:28 am UTC

sure, you can USE them, however:

data warehouse DOES NOT IMPLY bitmap indexes

you can have a warehouse with or without them. And using some "formula" to decide where and if you use them is just so wrong in so many ways.

A reader, October 03, 2006 - 4:56 am UTC


Bitmap Index & Bitmap Entry Locking

jagjal, October 04, 2006 - 9:23 am UTC

Hi tom,

"Oracle
cannot lock an individual bit in a bitmap index entry; it locks the entire
bitmap index entry"

the above extract from your quote seems to fly over my head .. kindly explain in depth

Tom Kyte
October 04, 2006 - 5:09 pm UTC

A key in a bitmap index points to many rows (hence the name - bitmap, a bunch of bits of 0's and 1's that point to rows).

So, if you create bitmap index bm on emp(deptno)

the key entry for deptno=10 will point to MANY rows. If you update one bit in that key (transfer one emp from deptno=10 to deptno=20) the entire BITMAP pointed to by the key entry deptno=10 will be locked. In effect you cannot update the other "deptno=10" values, since they would need to lock the same bitmap.

Bitmap not being used

A reader, October 09, 2006 - 3:56 pm UTC

Tom, are there any parameters that affect the selection of a bitmap index ?? Because there's a difference in your example above if I run it on a 9iR2 db, however on a 10gR2 I can replicate your example just ok. Please, see below..

SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.7.0 - Production
PL/SQL Release 9.2.0.7.0 - Production
CORE    9.2.0.7.0       Production
TNS for Compaq Tru64 UNIX: Version 9.2.0.7.0 - Production
NLSRTL Version 9.2.0.7.0 - Production

  1  create table t
  2  ( gender not null,
  3    location not null,
  4    age_group not null,
  5    data
  6  )
  7  as
  8  select decode( ceil(dbms_random.value(1,2)),
  9                 1, 'M',
 10                 2, 'F' ) gender,
 11         ceil(dbms_random.value(1,50)) location,
 12         decode( ceil(dbms_random.value(1,5)),
 13                 1,'18 and under',
 14                 2,'19-25',
 15                 3,'26-30',
 16                 4,'31-40',
 17                 5,'41 and over'),
 18         rpad( '*', 20, '*')
 19    from dual
 20* connect by level <= 100000
SQL> /

Table created.

SQL> create bitmap index gender_idx on t(gender);

Index created.

SQL> create bitmap index location_idx on t(location);

Index created.

SQL> create bitmap index age_group_idx on t(age_group);

Index created.

SQL> exec dbms_stats.gather_table_stats( user, 'T', cascade=>true );

PL/SQL procedure successfully completed.

SQL> set autotrace traceonly explain
SQL>
SQL> r
  1  select count(*)
  2      from T
  3     where gender = 'M'
  4       and location in ( 1, 10, 30 )
  5*      and age_group = '41 and over'

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=5 Card=1 Bytes=13)
   1    0   SORT (AGGREGATE)
   2    1     BITMAP CONVERSION (COUNT)
   3    2       BITMAP AND
   4    3         BITMAP INDEX (SINGLE VALUE) OF 'GENDER_IDX'
   5    3         BITMAP OR
   6    5           BITMAP INDEX (SINGLE VALUE) OF 'LOCATION_IDX'
   7    5           BITMAP INDEX (SINGLE VALUE) OF 'LOCATION_IDX'
   8    5           BITMAP INDEX (SINGLE VALUE) OF 'LOCATION_IDX'
   9    3         BITMAP INDEX (SINGLE VALUE) OF 'AGE_GROUP_IDX'


SQL> r
  1  select *
  2    from t
  3   where (   ( gender = 'M' and location = 20 )
  4          or ( gender = 'F' and location = 22 ))
  5*    and age_group = '18 and under'

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=55 Card=510 Bytes=16830)
   1    0   TABLE ACCESS (FULL) OF 'T' (Cost=55 Card=510 Bytes=16830)



 

Tom Kyte
October 09, 2006 - 9:12 pm UTC

the clustering factor (estimated for bitmaps) coupled with the estimated cardinality made this one say "let us just full scan, it'll be better than hitting index, hitting table, hitting index, hitting table"

If you made the table "wider" (more real world perhaps), it would eventually change it's mind and say "full scanning is more expensive"

Bitmap speed

Billy, October 10, 2006 - 4:04 am UTC

A simple real-world like example of what a bitmap can do ito speed.

This query is from a data warehouse fact table (name obfuscated below to protect the innocent ;-) ). 

A simple select count of the entire table. Yes, it is not a common query that would be run - but that is not the point I'm trying to make. Simply showing that just how effective a bitmap index can be in reducing I/O for certain types of queries.

And of course, it also serves well when dealing with  consultants and "experts" of other database vendors on-site. Kind of a like a quick kick to the nuts. I just love to ask an innocent-like question, baiting the trap, and then go "oh, so can then achieve this type of performance from your database?".  :-)

SQL> select count(*) from the_big_fact_table;

COUNT(*) 
---------- 
965934425

Elapsed: 00:00:20.77

This SELECT counted 46,506,231 rows per second. But looking under the hood (execution plan, amount of I/O done, using  PQ), it seems to be.. well, normal.

That's one thing I like about Oracle. You can prove it. Put data and code together for a show and tell. Unlike 'them others database folk' that simply make claims without providing any real backup for their statements. 

Hmm.. but then we also have a few of these folk too that makes claims regarding Oracle performance and tuning and practises and never back up their statements with evidence. A pity.
 

Bitmap - When to avoid

Piet de Visser, October 10, 2006 - 4:44 am UTC

Tom, Readers,

That was good info. Thanks.

As an addition, in this article:
</code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:6042872196732 <code>
you explain when Bitmaps are Not Appropriate.

Notably, bitmaps are to be avoided in single-row-DML situations.

I've just demonstrated to our dev/dba team how the well-intended bitmaps on status fields will cause ORA-00060 (Deadlock) errors when concurrent usage (OLTP style) will be released in full on the database.
A conventional (btree) index will do the job without causing ORA-00060.

Regards,
PdV




Bitmap join is not being used

Serge Shmygelskyy, June 23, 2010 - 5:14 pm UTC

Dear Tom,

I have very strange behavior of the bitmap join index. There are 2 tables:

Master table
22:26:41 SHMYG@bscs9mig> select count(*) from contract_all;

  COUNT(*)
----------
  28144031

Number of columns in the table:
23:01:50 SHMYG@bscs9mig> select count(*) from all_tab_columns where table_name = 'CONTRACT_ALL' and owner = 'SYSADM';

  COUNT(*)
----------
        83

'Dimension' table
22:26:55 SHMYG@bscs9mig> select count(*) from rateplan;

  COUNT(*)
----------
        38

23:03:23 SHMYG@bscs9mig> select count(*) from all_tab_columns where table_name = 'RATEPLAN' and owner = 'SYSADM';

  COUNT(*)
----------
        13

The table CONTRACT_ALL references RATEPLAN by the column TMCODE (foreign key exists)

I have the following query:

22:27:27 SHMYG@bscs9mig> select ra.service_mix_type, count(ca.*), 100*ratio_to_report(count(*)) over()
from contract_all ca, rateplan ra
where ra.tmcode = ca.tmcode
group by ra.service_mix_type;

SERVICE_MIX_TYPE COUNT(CA.*) 100*RATIO_TO_REPORT(COUNT(*))OVER()
---------------- --------------- -----------------------------------
               1         1415896                           5,0308927
               2        26728135                          94,9691073

As you can see, the data in the column is skewed.

I am creating the bitmap join index:
22:27:51 SHMYG@bscs9mig> create bitmap index shmyg_idx on contract_all(rateplan.service_mix_type) from contract_all, rateplan where rateplan.tmcode = contract_all.tmcode compute statistics;

But whatever I do (calculate statistics for all columns, hidden columns, skewed columns, delete statistics - whatever), the index never got used:

22:30:18 SHMYG@bscs9mig> select contract_all.co_id from contract_all, rateplan where rateplan.tmcode = contract_all.tmcode and rateplan.service_mix_type=2;

Execution Plan
----------------------------------------------------------
Plan hash value: 1218489720

-----------------------------------------------------------------------------------
| Id  | Operation          | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |              |  3078K|    44M|   105K (33)| 00:02:36 |
|*  1 |  HASH JOIN         |              |  3078K|    44M|   105K (33)| 00:02:36 |
|*  2 |   TABLE ACCESS FULL| RATEPLAN     |     4 |    24 |     2   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| CONTRACT_ALL |    28M|   241M|   103K (32)| 00:02:33 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("RATEPLAN"."TMCODE"="CONTRACT_ALL"."TMCODE")
   2 - filter("RATEPLAN"."SERVICE_MIX_TYPE"=2)

22:30:24 SHMYG@bscs9mig> select contract_all.co_id from contract_all, rateplan where rateplan.tmcode = contract_all.tmcode and rateplan.service_mix_type=1;

Execution Plan
----------------------------------------------------------
Plan hash value: 1218489720

-----------------------------------------------------------------------------------
| Id  | Operation          | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |              |    28M|   402M|   105K (33)| 00:02:36 |
|*  1 |  HASH JOIN         |              |    28M|   402M|   105K (33)| 00:02:36 |
|*  2 |   TABLE ACCESS FULL| RATEPLAN     |    34 |   204 |     2   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| CONTRACT_ALL |    28M|   241M|   103K (32)| 00:02:33 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("RATEPLAN"."TMCODE"="CONTRACT_ALL"."TMCODE")
   2 - filter("RATEPLAN"."SERVICE_MIX_TYPE"=1)



In the meantime if I create the tables with just 2 columns each from the scratch, the index is used perfectly
21:57:01 SHMYG@bscs9mig> create table contract_all (co_id number, tmcode number);

22:36:02 SHMYG@bscs9mig> create table rateplan (tmcode number, service_mix_type number, constraint pk_rateplan primary key (tmcode));

22:36:36 SHMYG@bscs9mig> alter table contract_all add constraint fk_co_rp foreign key (tmcode) references rateplan;


22:37:25 SHMYG@bscs9mig> create bitmap index shmyg_idx1 on contract_all(rateplan.service_mix_type) from contract_all, rateplan where rateplan.tmcode = contract_all.tmcode compute statistics;

22:38:03 SHMYG@bscs9mig> explain plan for select  co_id from contract_all ca, rateplan rt where rt.tmcode = ca.tmcode and rt.service_mix_type = 1;

22:38:12 SHMYG@bscs9mig> @p

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2151760595

---------------------------------------------------------------------------------------------
| Id  | Operation                    | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |              |     1 |    26 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID | CONTRACT_ALL |     1 |    26 |     2   (0)| 00:00:01 |
|   2 |   BITMAP CONVERSION TO ROWIDS|              |       |       |            |          |
|*  3 |    BITMAP INDEX SINGLE VALUE | SHMYG_IDX1   |       |       |            |          |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("CA"."SYS_NC00003$"=1)

Note
-----
   - dynamic sampling used for this statement
22:38:13 SHMYG@bscs9mig> explain plan for select  co_id from contract_all ca, rateplan rt where rt.tmcode = ca.tmcode and rt.service_mix_type = 2;
22:38:27 SHMYG@bscs9mig> @p

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2151760595

---------------------------------------------------------------------------------------------
| Id  | Operation                    | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |              |     1 |    26 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID | CONTRACT_ALL |     1 |    26 |     2   (0)| 00:00:01 |
|   2 |   BITMAP CONVERSION TO ROWIDS|              |       |       |            |          |
|*  3 |    BITMAP INDEX SINGLE VALUE | SHMYG_IDX1   |       |       |            |          |
---------------------------------------------------------------------------------------------

I have tried creating the tables with the same number of columns but empty:

22:38:29 SHMYG@bscs9mig> create table contract_all as select * from sysadm.contract_all where rownum < 1;

22:58:38 SHMYG@bscs9mig> create table rateplan as select * from sysadm.rateplan where rownum < 1;

22:58:54 SHMYG@bscs9mig> alter table rateplan add constraint pk_rateplan primary key (tmcode);

22:59:21 SHMYG@bscs9mig> alter table contract_all add constraint fk_co_rp foreign key (tmcode) references rateplan;

22:59:30 SHMYG@bscs9mig> create bitmap index shmyg_idx1 on contract_all(rateplan.service_mix_type) from contract_all, rateplan where rateplan.tmcode = contract_all.tmcode compute statistics;

22:59:38 SHMYG@bscs9mig> explain plan for select  co_id from contract_all ca, rateplan rt where rt.tmcode = ca.tmcode and rt.service_mix_type = 2;
22:59:43 SHMYG@bscs9mig> @p

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3167159950

---------------------------------------------------------------------------------------------
| Id  | Operation                    | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |              |     1 |    52 |     3   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                |              |     1 |    52 |     3   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL          | CONTRACT_ALL |     1 |    26 |     2   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS BY INDEX ROWID| RATEPLAN     |     1 |    26 |     1   (0)| 00:00:01 |
|*  4 |    INDEX UNIQUE SCAN         | PK_RATEPLAN  |     1 |       |     0   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("RT"."SERVICE_MIX_TYPE"=2)
   4 - access("CA"."TMCODE"="RT"."TMCODE")

Note
-----
   - dynamic sampling used for this statement
22:59:44 SHMYG@bscs9mig> explain plan for select  co_id from contract_all ca, rateplan rt where rt.tmcode = ca.tmcode and rt.service_mix_type = 1;
23:00:42 SHMYG@bscs9mig> @p

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3167159950

---------------------------------------------------------------------------------------------
| Id  | Operation                    | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |              |     1 |    52 |     3   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                |              |     1 |    52 |     3   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL          | CONTRACT_ALL |     1 |    26 |     2   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS BY INDEX ROWID| RATEPLAN     |     1 |    26 |     1   (0)| 00:00:01 |
|*  4 |    INDEX UNIQUE SCAN         | PK_RATEPLAN  |     1 |       |     0   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("RT"."SERVICE_MIX_TYPE"=1)
   4 - access("CA"."TMCODE"="RT"."TMCODE")

Note
-----
   - dynamic sampling used for this statement

And the index is not used again

So it looks like number of columns (or, rather, row length, I suppose), has some influence on the usage of the index. How can we explain that and how can I have Oracle using index?

Thanks in advance,
Serge

Tom Kyte
June 24, 2010 - 7:40 am UTC

no table creates, no matter how long you made it all - and it is a tad bit too long to digest - no look

simplify the example (make smaller) yet 100% complete. create table statements - use dbms_stats.set_table_stats so we can reproduce and then I'll take a look. This isn't really a "review" or a question on the original question - so please make it easier to glance at and comment on.

Serge Shmygelskyy, June 24, 2010 - 3:07 pm UTC

Dear Tom,

sorry, I thought it wouldn't be good to put 'create table' with more than 80 columns. But here it goes:

CREATE TABLE "RATEPLAN"
( "TMCODE" NUMBER(*,0) NOT NULL ENABLE,
"DES" VARCHAR2(30) NOT NULL ENABLE,
"SHDES" VARCHAR2(5) NOT NULL ENABLE,
"TMGLOBAL" CHAR(1) DEFAULT 'N' NOT NULL ENABLE,
"REC_VERSION" NUMBER(*,0) DEFAULT 0 NOT NULL ENABLE,
"PRERATED_TAP_RP_IND" VARCHAR2(1) NOT NULL ENABLE,
"EXTRNL_RATEPLAN_ID" NUMBER(*,0),
"PDE_RATEPLAN" VARCHAR2(1),
"PROVISION_EXT" CHAR(1),
"IMPORT_EXPORT_FLAG" NUMBER(*,0) DEFAULT 0 NOT NULL ENABLE,
"SPECIAL_USAGE_IND" CHAR(1),
"SERVICE_MIX_TYPE" NUMBER(*,0) NOT NULL ENABLE,
"OFFRE_ENTERPRISE_IND" CHAR(1),
CONSTRAINT "PK_RATEPLAN" PRIMARY KEY ("TMCODE")
)
/


CREATE TABLE "CONTRACT_ALL"
( "CO_ID" NUMBER(*,0) NOT NULL ENABLE,
"CUSTOMER_ID" NUMBER(*,0) NOT NULL ENABLE,
"TYPE" VARCHAR2(1),
"PLCODE" NUMBER(*,0),
"SCCODE" NUMBER(*,0),
"SUBM_ID" NUMBER(*,0),
"CO_SIGNED" DATE,
"CO_EQU_TYPE" VARCHAR2(20),
"CO_REP_BILL" VARCHAR2(40),
"CO_REP" VARCHAR2(40),
"CO_REP_BILL_IDNO" VARCHAR2(30),
"CO_REP_IDNO" VARCHAR2(30),
"CO_INSTALLED" DATE,
"CO_ARCHIVE" VARCHAR2(1),
"CO_DIR_ENTRY" VARCHAR2(1),
"CO_OPERATOR_DIR" VARCHAR2(1),
"CO_PSTN_DIR" VARCHAR2(1),
"CO_ASS_SERV" VARCHAR2(1),
"CO_ASS_EQU" VARCHAR2(1),
"CO_CRD_CHECK" VARCHAR2(1),
"CO_CRD_CHK_END" DATE,
"CO_CRD_CHK_START" DATE,
"CO_CRD_CLICKS" NUMBER(*,0),
"CO_CRD_CLICKS_DAY" NUMBER(*,0),
"CO_CRD_DAYS" NUMBER(*,0),
"CO_COMMENT" VARCHAR2(60),
"CO_RESERVED" DATE,
"CO_EXPIR_DATE" DATE,
"CO_ACTIVATED" DATE,
"CO_ENTDATE" DATE NOT NULL ENABLE,
"CO_MODDATE" DATE,
"CO_USERLASTMOD" VARCHAR2(16),
"TMCODE" NUMBER(*,0),
"TMCODE_DATE" DATE,
"CO_CRD_D_TR1" NUMBER(*,0),
"CO_CRD_D_TR2" NUMBER(*,0),
"CO_CRD_D_TR3" NUMBER(*,0),
"CO_CRD_P_TR1" NUMBER(*,0),
"CO_CRD_P_TR2" NUMBER(*,0),
"CO_CRD_P_TR3" NUMBER(*,0),
"ECCODE_LDC" NUMBER(*,0),
"PENDING_ECCODE_LDC" NUMBER(*,0),
"ECCODE_LEC" NUMBER(*,0),
"PENDING_ECCODE_LEC" NUMBER(*,0),
"CO_REQUEST" NUMBER(*,0),
"DEALER_ID" NUMBER(*,0),
"NOT_VALID" VARCHAR2(1),
"ARPCODE" NUMBER(*,0),
"CO_ADDR_ON_IBILL" VARCHAR2(1),
"CO_CRD_AMOUNT" FLOAT(126),
"CO_CRD_AMOUNT_DAY" FLOAT(126),
"PRODUCT_HISTORY_DATE" DATE,
"CO_CONFIRM" VARCHAR2(1),
"CO_EXT_CSUIN" VARCHAR2(50),
"TRIAL_END_DATE" DATE,
"CO_IB_CDR_FLAG" VARCHAR2(1),
"CURRENCY" NUMBER(*,0),
"REC_VERSION" NUMBER(*,0) DEFAULT 0 NOT NULL ENABLE,
"AN_PACKAGE_ID" NUMBER(*,0),
"CO_REL_TYPE" VARCHAR2(1),
"PENDING_TMCODE" NUMBER(*,0),
"CONTRACT_TEMPLATE" VARCHAR2(1),
"CO_CODE" VARCHAR2(30) NOT NULL ENABLE,
"BUSINESS_UNIT_ID" NUMBER(*,0) NOT NULL ENABLE,
"DESCRIPTION" VARCHAR2(100),
"AGREEMENT_TYPE" VARCHAR2(2),
"EXT_PRODUCT_ID" NUMBER(*,0),
"REMOTE_PLCODE" NUMBER(*,0),
"ECCODE" NUMBER(*,0),
"HPLMN_BID_GROUP_ID" NUMBER(*,0),
"VPLMN_BID_GROUP_ID" NUMBER(*,0),
"HOME_PARTY_ID" NUMBER(*,0) NOT NULL ENABLE,
"CONTRACT_TYPE_ID" NUMBER(*,0) NOT NULL ENABLE,
"TRIAL_START_DATE" DATE,
"FOH_AGGREGATION_FLAG" CHAR(1),
"GL_ACC_PACK_ID" NUMBER(*,0),
"CDS_BILL_MEDIUM" NUMBER(*,0) DEFAULT 1 NOT NULL ENABLE,
"CDS_GENERATION" CHAR(1),
"CHARGE_PART_NUM" NUMBER(*,0),
"CH_STATUS" CHAR(1) DEFAULT 'o',
"SVP_FILE" VARCHAR2(5),
"CH_STATUS_VALIDFROM" DATE,
"OFFRE_ENTERPRISE_IND" CHAR(1),
CONSTRAINT "PKCONTRACT_ALL" PRIMARY KEY ("CO_ID"),
CONSTRAINT "FK_CONTRALL_RATEPL" FOREIGN KEY ("TMCODE")
REFERENCES "RATEPLAN" ("TMCODE")
)
/

sql> exec dbms_stats.set_table_stats(user, 'CONTRACT_ALL', numrows=>28144031, numblks=>760749, AVGRLEN=>165);

sql> exec dbms_stats.set_table_stats(user, 'RATEPLAN', numrows=>38, numblks=>4, AVGRLEN=>41);

sql> create bitmap index shmyg_idx on contract_all(rateplan.service_mix_type) from contract_all, rateplan where rateplan.tmcode = contract_all.tmcode;

sql> explain plan for select co_id from contract_all ca, rateplan rt where rt.tmcode = ca.tmcode and rt.service_mix_type = 2;

sql> @p

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 1218489720

-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 281K| 13M| 92325 (24)| 00:02:17 |
|* 1 | HASH JOIN | | 281K| 13M| 92325 (24)| 00:02:17 |
|* 2 | TABLE ACCESS FULL| RATEPLAN | 1 | 26 | 2 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL| CONTRACT_ALL | 28M| 697M| 90345 (22)| 00:02:14 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("RT"."TMCODE"="CA"."TMCODE")
2 - filter("RT"."SERVICE_MIX_TYPE"=2)

sql> explain plan for select co_id from contract_all ca, rateplan rt where rt.tmcode = ca.tmcode and rt.service_mix_type = 1;

sql> @p

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 1218489720

-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 281K| 13M| 92325 (24)| 00:02:17 |
|* 1 | HASH JOIN | | 281K| 13M| 92325 (24)| 00:02:17 |
|* 2 | TABLE ACCESS FULL| RATEPLAN | 1 | 26 | 2 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL| CONTRACT_ALL | 28M| 697M| 90345 (22)| 00:02:14 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("RT"."TMCODE"="CA"."TMCODE")
2 - filter("RT"."SERVICE_MIX_TYPE"=1)

sql> explain plan for select count(co_id) from contract_all ca, rateplan rt where rt.tmcode = ca.tmcode and rt.service_mix_type = 1;

sql> @p

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 1855272813

-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 1 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 13 | | |
| 2 | BITMAP CONVERSION COUNT | | 281K| 3572K| 1 (0)| 00:00:01 |
|* 3 | BITMAP INDEX SINGLE VALUE| SHMYG_IDX | | | | |
-----------------------------------------------------------------------------------------

sql> explain plan for select ca.* from contract_all ca, rateplan rt where rt.tmcode = ca.tmcode and rt.service_mix_type = 1;

sql> @p

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 444702056

---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 281K| 44M| 106K (1)| 00:02:37 |
| 1 | TABLE ACCESS BY INDEX ROWID | CONTRACT_ALL | 281K| 44M| 106K (1)| 00:02:37 |
| 2 | BITMAP CONVERSION TO ROWIDS| | | | | |
|* 3 | BITMAP INDEX SINGLE VALUE | SHMYG_IDX | | | | |
---------------------------------------------------------------------------------------------

Oracle version is 10.2.0.4
Tom Kyte
July 06, 2010 - 9:55 am UTC

I thought it wouldn't be good to put 'create table' with more than 80
columns.


you were correct - it wouldn't be. Especially since you need maybe 5 columns in each table to demonstrate with?????????


do you see the estimated cardinalities there?


I'll still need a bit of help, on 10.2.0.4 with your test case so far, I observe:

ops$tkyte%ORA10GR2> exec dbms_stats.set_table_stats(user, 'CONTRACT_ALL', numrows=>28144031, numblks=>760749, AVGRLEN=>165);

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> exec dbms_stats.set_table_stats(user, 'RATEPLAN', numrows=>38, numblks=>4, AVGRLEN=>41);

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create bitmap index shmyg_idx on contract_all(rateplan.service_mix_type) from contract_all, rateplan where rateplan.tmcode = contract_all.tmcode;

Index created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> set autotrace traceonly explain
ops$tkyte%ORA10GR2> select  co_id from contract_all ca, rateplan rt where rt.tmcode = ca.tmcode and rt.service_mix_type = 2;

Execution Plan
----------------------------------------------------------
Plan hash value: 3893157349

---------------------------------------------------------------------------------------------
| Id  | Operation                    | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |              |   281K|  7145K| 49973   (1)| 00:06:08 |
|   1 |  TABLE ACCESS BY INDEX ROWID | CONTRACT_ALL |   281K|  7145K| 49973   (1)| 00:06:08 |
|   2 |   BITMAP CONVERSION TO ROWIDS|              |       |       |            |          |
|*  3 |    BITMAP INDEX SINGLE VALUE | SHMYG_IDX    |       |       |            |          |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("CA"."SYS_NC00084$"=2)

ops$tkyte%ORA10GR2> select  co_id from contract_all ca, rateplan rt where rt.tmcode = ca.tmcode and rt.service_mix_type = 1;

Execution Plan
----------------------------------------------------------
Plan hash value: 3893157349

---------------------------------------------------------------------------------------------
| Id  | Operation                    | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |              |   281K|  7145K| 49973   (1)| 00:06:08 |
|   1 |  TABLE ACCESS BY INDEX ROWID | CONTRACT_ALL |   281K|  7145K| 49973   (1)| 00:06:08 |
|   2 |   BITMAP CONVERSION TO ROWIDS|              |       |       |            |          |
|*  3 |    BITMAP INDEX SINGLE VALUE | SHMYG_IDX    |       |       |            |          |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("CA"."SYS_NC00084$"=1)

ops$tkyte%ORA10GR2> select  count(co_id) from contract_all ca, rateplan rt where rt.tmcode = ca.tmcode and rt.service_mix_type = 1;

Execution Plan
----------------------------------------------------------
Plan hash value: 3625186868

-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |     1 |    13 |     1   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE             |           |     1 |    13 |            |          |
|   2 |   BITMAP CONVERSION COUNT   |           |   281K|  3572K|     1   (0)| 00:00:01 |
|*  3 |    BITMAP INDEX SINGLE VALUE| SHMYG_IDX |       |       |            |          |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("CA"."SYS_NC00084$"=1)

ops$tkyte%ORA10GR2> select ca.* from contract_all ca, rateplan rt where rt.tmcode = ca.tmcode and rt.service_mix_type = 1;

Execution Plan
----------------------------------------------------------
Plan hash value: 3893157349

---------------------------------------------------------------------------------------------
| Id  | Operation                    | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |              |   281K|    44M| 49998   (1)| 00:06:09 |
|   1 |  TABLE ACCESS BY INDEX ROWID | CONTRACT_ALL |   281K|    44M| 49998   (1)| 00:06:09 |
|   2 |   BITMAP CONVERSION TO ROWIDS|              |       |       |            |          |
|*  3 |    BITMAP INDEX SINGLE VALUE | SHMYG_IDX    |       |       |            |          |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("CA"."SYS_NC00084$"=1)

Cardinalities

Serge Shmygelskyy, July 12, 2010 - 4:43 pm UTC

Hi Tom,

thanks for the answer.

Yes, I see the cardinalities. And yes, it seems they're not really correct. But how can I fix the situation? I tried collecting simple statistics, histogram statistics for skewed columns, for all columns, collecting statistics for index, whatever. It doesn't help. This is a real DB with real - not artificial - data. And my results are not the same as yours
Tom Kyte
July 19, 2010 - 10:57 am UTC

are your actual values flowing through each step of the plan the same as the estimated.

do you have any init.ora parameter values set.


if you run YOUR test case, and you do not see what I see - why is that - what have you set on your system - I set nothing non-standard.

Serge Shmygelskyy, July 20, 2010 - 6:59 am UTC

Hi Tom,

What parameters should I consider? The DB was also created with the standard set. I can put it here but not sure it makes any sense

I have tried dropping the index and recreating it and now the explain plan shows me that if I use the index, it'll select exactly half of the data which could have been correct as RATEPLAN.SERVICE_MIX_TYPE contains only 2 values but it is skewed

12:52:15 > select  /*+ INDEX(ca shmyg_idx) */ co_id from contract_all ca, rateplan rt where rt.tmcode = ca.tmcode and rt.service_mix_type = 1;

Execution Plan
----------------------------------------------------------
Plan hash value: 444702056

---------------------------------------------------------------------------------------------
| Id  | Operation                    | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |              |    14M|   120M|  1005K  (1)| 03:21:11 |
|   1 |  TABLE ACCESS BY INDEX ROWID | CONTRACT_ALL |    14M|   120M|  1005K  (1)| 03:21:11 |
|   2 |   BITMAP CONVERSION TO ROWIDS|              |       |       |            |          |
|*  3 |    BITMAP INDEX SINGLE VALUE | SHMYG_IDX    |       |       |            |          |
---------------------------------------------------------------------------------------------

Meanwhile without the hint execution plan is the same as before
12:44:26 SHMYG> select  co_id from contract_all ca, rateplan rt where rt.tmcode = ca.tmcode and rt.service_mix_type = 1;

Execution Plan
----------------------------------------------------------
Plan hash value: 1218489720

-----------------------------------------------------------------------------------
| Id  | Operation          | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |              |    28M|   402M|   169K  (2)| 00:33:49 |
|*  1 |  HASH JOIN         |              |    28M|   402M|   169K  (2)| 00:33:49 |
|*  2 |   TABLE ACCESS FULL| RATEPLAN     |    34 |   204 |     3   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| CONTRACT_ALL |    28M|   241M|   168K  (2)| 00:33:47 |
-----------------------------------------------------------------------------------


Thanks in advace

Tom Kyte
July 23, 2010 - 7:07 am UTC

if you run your example, what do you see - do you see what I see or not. can we stick with something I can reproduce myself.

run your example - just like I did - and compare (you don't have to post the entire result, just point out differences)

I asked about parameters to make sure you haven't set any yourself - what non-default parameters do you have set, if any?

Serge Shmygelskyy, July 26, 2010 - 5:30 am UTC

Hi Tom,

for me HASH JOIN is always used. No matter what I do. Here is the init.ora the instance was created with:

*.compatible='10.2.0.4.0'
*.db_block_size=8192
*.db_file_multiblock_read_count=128
*.job_queue_processes=50
*.open_cursors=520
*.pga_aggregate_target=4500M
*.processes=500
*.sessions=500
*.sga_max_size=8500M
*.sga_target=8500M
*.undo_management='AUTO'
*.undo_tablespace='UNDOTBS1'
*.undo_retention=180

The only parameter which was set to non-default - db_file_multiblock_read_count. But even if I change it to the default value, execution plan is the same
Tom Kyte
July 26, 2010 - 8:27 am UTC

where is your cut and paste like mine??



*.db_file_multiblock_read_count=128

you should not be setting that, it should be left UNSET in 10gr2 and above.



Serge Shmygelskyy, July 27, 2010 - 7:18 am UTC

Hello Tom,

thanks for your time. Here is the copy-paste after I have removed the db_file_multiblock_read_count parameter.

12:51:16 SHMYG@bscs9r1> select num_rows, avg_row_len from dba_tables where table_NAME = 'CONTRACT_ALL';

  NUM_ROWS AVG_ROW_LEN
---------- -----------
  28144031         165

12:51:19 SHMYG@bscs9r1> select num_rows, avg_row_len from dba_tables where table_NAME ='RATEPLAN';

  NUM_ROWS AVG_ROW_LEN
---------- -----------
        38          41
12:52:27 SHMYG@bscs9r1>create bitmap index shmyg_idx on contract_all(rateplan.service_mix_type) from contract_all, rateplan where rateplan.tmcode = contract_all.tmcode;

12:53:14 SHMYG@bscs9r1> set autotrace traceony explain
select  co_id from contract_all ca, rateplan rt where rt.tmcode = ca.tmcode and  rt.service_mix_type = 2;

Execution Plan
----------------------------------------------------------
Plan hash value: 1218489720

-----------------------------------------------------------------------------------
| Id  | Operation          | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |              |  3078K|    44M| 23399  (19)| 00:02:56 |
|* 1 | HASH JOIN | | 3078K| 44M| 23399 (19)| 00:02:56 |
|*  2 |   TABLE ACCESS FULL| RATEPLAN     |     4 |    24 |     2   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| CONTRACT_ALL |    28M|   241M| 23149  (18)| 00:02:54 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("RT"."TMCODE"="CA"."TMCODE")
   2 - filter("RT"."SERVICE_MIX_TYPE"=2)

select  co_id from contract_all ca, rateplan rt where rt.tmcode = ca.tmcode and 
rt.service_mix_type = 1;

Execution Plan
----------------------------------------------------------
Plan hash value: 1218489720

-----------------------------------------------------------------------------------
| Id  | Operation          | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |              |    28M|   402M| 23399  (19)| 00:02:56 |
|* 1 | HASH JOIN | | 28M| 402M| 23399 (19)| 00:02:56 |
|*  2 |   TABLE ACCESS FULL| RATEPLAN     |    34 |   204 |     2   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| CONTRACT_ALL |    28M|   241M| 23149  (18)| 00:02:54 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("RT"."TMCODE"="CA"."TMCODE")
   2 - filter("RT"."SERVICE_MIX_TYPE"=1)

12:54:21 SHMYG@bscs9r1> 
select  count(co_id) from contract_all ca, rateplan rt where rt.tmcode = ca.tmcode and rt.service_mix_type = 1;

Execution Plan
----------------------------------------------------------
Plan hash value: 1855272813

-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |     1 |     3 |   738   (0)| 00:00:06 |
|   1 |  SORT AGGREGATE             |           |     1 |     3 |            |          |
|   2 |   BITMAP CONVERSION COUNT   |           |    14M|    40M|   738   (0)| 00:00:06 |
|*  3 |    BITMAP INDEX SINGLE VALUE| SHMYG_IDX |       |       |            |          |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("CA"."SYS_NC00085$"=1)


12:56:24 SHMYG@bscs9r1> select /*+ INDEX (ca shmyg_idx) */ co_id from contract_all ca, rateplan rt where rt.tmcode = ca.tmcode and rt.service_mix_type = 1;

Execution Plan
----------------------------------------------------------
Plan hash value: 444702056

---------------------------------------------------------------------------------------------
| Id  | Operation                    | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 14M| 120M| 936K (1)| 01:57:13 |
| 1 | TABLE ACCESS BY INDEX ROWID | CONTRACT_ALL | 14M| 120M| 936K (1)| 01:57:13 |
|   2 |   BITMAP CONVERSION TO ROWIDS|              |       |       |            |          |
|*  3 |    BITMAP INDEX SINGLE VALUE | SHMYG_IDX    |       |       |            |          |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("CA"."SYS_NC00085$"=1)


Cardinality looks OK now, but the main point - look at the time which is needed for bitmap index access. Can it be related to the system statistics?
Tom Kyte
July 27, 2010 - 12:38 pm UTC

was this done with the example you provided me?

I don't see the table creates and dbms_stats.set calls?

I want to know what I'm comparing to here. It cannot be to what I ran since our card=values are so so so VERY VERY VERY far off from each other.

Serge Shmygelskyy, July 28, 2010 - 5:32 am UTC

Hi Tom,

sorry for taking your time and thanks for it. I think we can close the question as I found the root of the issue.

As I said - these tables I've mentioned, are the real ones with the real data. I cannot do anything to them. So I have decided to run the tests in parallel - on the real tables and the 'fake' ones created by the same DDL.

Finally I managed to get the same result as you on the 'fake' tables but not on the real ones. As I said in the previous post, the estimated time for accessing the real bitmap index was huge. And the only way to achieve the same plans for real and 'fake' tables is to delete bitmap index statistics for real ones. After that the plans are identical

I'm not sure this is a really good approach but nevertheless it leads to the desired result. Anyway I'm going to check what is possible to do with this index stats

Thanks again for your time

Bitmap join index - 10GR2 - Expert Oracle database Architecture

Rajeshwaran Jeyabal, August 25, 2010 - 8:06 am UTC

rajesh@10GR2> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Prod
PL/SQL Release 10.2.0.1.0 - Production
CORE    10.2.0.1.0      Production
TNS for 32-bit Windows: Version 10.2.0.1.0 - Production
NLSRTL Version 10.2.0.1.0 - Production

Elapsed: 00:00:00.01
rajesh@10GR2> create table sales
  2  as
  3  select object_id as cust_id,
  4        dbms_random.random as sales_id,
  5        abs(dbms_random.random)*100 as sales_amt,
  6        created        as txn_date
  7  from dba_objects
  8  where object_id is not null;

Table created.

Elapsed: 00:00:01.56
rajesh@10GR2>
rajesh@10GR2> insert /*+ append */ into sales select cust_id,dbms_random.random ,abs(dbms_random.random)*100,txn_date from sales;

56871 rows created.

Elapsed: 00:00:01.37
rajesh@10GR2> commit;

Commit complete.

Elapsed: 00:00:00.03
rajesh@10GR2> insert /*+ append */ into sales select cust_id,dbms_random.random,abs(dbms_random.random)*100,txn_date from sales;

113742 rows created.

Elapsed: 00:00:02.67
rajesh@10GR2> commit;

Commit complete.

Elapsed: 00:00:00.01
rajesh@10GR2> insert /*+ append */ into sales select cust_id,dbms_random.random,abs(dbms_random.random)*100,txn_date from sales;

227484 rows created.

Elapsed: 00:00:05.57
rajesh@10GR2> commit;

Commit complete.

Elapsed: 00:00:00.03
rajesh@10GR2> insert /*+ append */ into sales select cust_id,dbms_random.random ,abs(dbms_random.random)*100,txn_date from sales;

454968 rows created.

Elapsed: 00:00:10.50
rajesh@10GR2> commit;

Commit complete.

Elapsed: 00:00:00.04
rajesh@10GR2>
rajesh@10GR2>
rajesh@10GR2> create table customers
  2  as
  3  select object_id as cust_id,
  4        object_name as cust_name,
  5        created as dob,
  6        abs(dbms_random.random)*100 as salary
  7  from dba_objects
  8  where object_id is not null;

Table created.

Elapsed: 00:00:01.10
rajesh@10GR2>
rajesh@10GR2> alter table customers add constraint cust_pk primary key(cust_id);

Table altered.

Elapsed: 00:00:00.29
rajesh@10GR2> alter table sales add constraint sales_pk foreign key(cust_id) references customers(cust_id);

Table altered.

Elapsed: 00:00:02.67
rajesh@10GR2> alter table customers modify cust_name not null;

Table altered.

Elapsed: 00:00:00.04
rajesh@10GR2>
rajesh@10GR2>
rajesh@10GR2> create bitmap index sales_bm_idx on sales(c.cust_id,c.cust_name)
  2  from sales s,customers c
  3  where s.cust_id = c.cust_id;

Index created.

Elapsed: 00:00:08.31
rajesh@10GR2> begin
  2     dbms_stats.gather_table_stats(ownname=>user,tabname=>'CUSTOMERS',cascade=>true,method_opt=>'for all indexed columns');
  3     dbms_stats.gather_table_stats(ownname=>user,tabname=>'SALES',cascade=>true,method_opt=>'for all indexed columns');
  4  end;
  5  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:01.09

rajesh@10GR2>
rajesh@10GR2>
rajesh@10GR2> select count(c.cust_name)
  2  from sales s,customers c
  3  where s.cust_id = c.cust_id
  4  and   c.cust_name = 'I_PROCEDURE1';

COUNT(C.CUST_NAME)
------------------
                16

Elapsed: 00:00:00.15
rajesh@10GR2>
rajesh@10GR2> desc sales;
 Name                                                                    Null?    Type
 ----------------------------------------------------------------------- -------- -------------------------------------------------
 CUST_ID                                                                          NUMBER
 SALES_ID                                                                         NUMBER
 SALES_AMT                                                                        NUMBER
 TXN_DATE                                                                         DATE

rajesh@10GR2> desc customers;
 Name                                                                    Null?    Type
 ----------------------------------------------------------------------- -------- -------------------------------------------------
 CUST_ID                                                                 NOT NULL NUMBER
 CUST_NAME                                                               NOT NULL VARCHAR2(128)
 DOB                                                                              DATE
 SALARY                                                                           NUMBER

rajesh@10GR2> alter table sales modify cust_id not null;

Table altered.

Elapsed: 00:00:00.09
rajesh@10GR2>
rajesh@10GR2> set autotrace traceonly explain;
rajesh@10GR2>
rajesh@10GR2> select count(c.cust_name)
  2  from sales s,customers c
  3  where s.cust_id = c.cust_id
  4  and   c.cust_name = 'I_PROCEDURE1';
Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
Plan hash value: 3348672949

---------------------------------------------------------------------------------
| Id  | Operation           | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |           |     1 |    56 |  1015   (3)| 00:00:13 |
|   1 |  SORT AGGREGATE     |           |     1 |    56 |            |          |
|*  2 |   HASH JOIN         |           |  9022 |   493K|  1015   (3)| 00:00:13 |
|*  3 |    TABLE ACCESS FULL| CUSTOMERS |   569 | 24467 |    89   (2)| 00:00:02 |
|   4 |    TABLE ACCESS FULL| SALES     |   902K|    11M|   918   (2)| 00:00:12 |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("S"."CUST_ID"="C"."CUST_ID")
   3 - filter("C"."CUST_NAME"='I_PROCEDURE1')

rajesh@10GR2> set autotrace off;
rajesh@10GR2>
rajesh@10GR2>
rajesh@10GR2>
rajesh@10GR2> drop index sales_bm_idx;

Index dropped.

Elapsed: 00:00:00.87
rajesh@10GR2> create bitmap index sales_bm_idx on sales(c.cust_name)
  2  from sales s,customers c
  3  where s.cust_id = c.cust_id;

Index created.

Elapsed: 00:00:06.89
rajesh@10GR2> set autotrace traceonly explain;
rajesh@10GR2>
rajesh@10GR2>
rajesh@10GR2> select count(c.cust_name)
  2  from sales s,customers c
  3  where s.cust_id = c.cust_id
  4  and   c.cust_name = 'I_PROCEDURE1';
Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
Plan hash value: 2763409380

--------------------------------------------------------------------------------------------
| Id  | Operation                   | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |              |     1 |    13 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE             |              |     1 |    13 |            |          |
|   2 |   BITMAP CONVERSION COUNT   |              |    28 |   364 |     3   (0)| 00:00:01 |
|*  3 |    BITMAP INDEX SINGLE VALUE| SALES_BM_IDX |       |       |            |          |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("S"."SYS_NC00005$"='I_PROCEDURE1')

rajesh@10GR2> set autotrace off;
rajesh@10GR2>


Questions
Tom, why the above query is not able to use the bitmap join index (BJ composite index) to answer the query?
However if i drop that index and create Index on single column, the optimizer is able to use that.
Tom Kyte
August 26, 2010 - 12:47 pm UTC

using a 10053 trace, it (the optimizer) does not currently believe the concatenated index would be a "index only" access - so, it looks like a current implementation restriction.

Bitmap index to B*Tree index

Dhruva, July 03, 2013 - 10:40 am UTC

Hi Tom, a practical business example: when an insurance shop is set up, only an handful of policies (as dimensions) constitute almost all the policy transaction fact rows, which is where a bitmap index will be appropriate.

As the business progresses, more and more policies come on board and get written to the transactions fact table and gradually the distinct cardinality becomes higher (e.g. 250,000 distinct policies for 25,000,000 transactions) and starts to exceed 1%.

It is sensible to drop the bitmap index and create a B*Tree index instead? Is this normal practice from a DWH maintenance point of view?

Thanks
Tom Kyte
July 16, 2013 - 12:27 pm UTC

... constitute almost all the policy transaction fact rows, which is where a bitmap index will be appropriate. ...


why? why would a bitmap be appropriate?


the number of distinct values does not tell you whether a bitmap or a b*tree would be appropriate.


you use a bitmap index to

o identify a small number of rows out of a large number of rows
o so you can merge many of them together to build a new bitmap index on the fly


If you just "where" on "policy_id = ?" - then a bitmap or b*tree could be considered (you are just retrieving rows). If you "where policy_id = :x and this = :y and that = :z and theother = :a ...." and you have bitmaps on this that and theother thing - that would be nice since we could merge them all together.


Probably I did not clarify

Dhruva, July 18, 2013 - 1:48 pm UTC

The index I am talking about is on the foreign key surrogate column on the fact table that references the policy dimension table.

Q1) Do bitmap indexes on foreign key columns, that only have predicates in join conditions, follow the same consideration as any other columns - "Bitmap indexes are most appropriate on low distinct cardinality data (i.e., data with relatively few discrete values when compared to the cardinality of the entire set)"?

The queries are likely to be along the lines of:
SELECT d.policy_no, SUM(trans_amount)
FROM   d_policy d, f_policy_trans f
WHERE  d.sid_policy = f.sid_policy
AND    f.trans_date BETWEEN ADD_MONTHS(SYSDATE, -6) AND SYSDATE
GROUP BY d.policy_no;
So in the first year there are only 1,000 policies but 500,000 transactions, so I get a value of 0.002 (0.2%) if I run something like
SELECT COUNT(DISTINCT sid_policy)/COUNT(*)
FROM f_policy_trans;
As the business progresses, the ratio starts to creep up as I take on board more policies, but not necessarily more transactions, making a B-Tree index look more likely. It is not fiction, it has happened at my organisation which I have joined recently and we are restructuring the physical layout of the data warehouse.

Q2) Should we have such a monitoring in place? Is it worth it?

Thanks as always
Tom Kyte
July 18, 2013 - 5:37 pm UTC

bitmap indexes do not have to be on low distinct cardinality columns, no. See https://communities.bmc.com/docs/DOC-10061 the section on "Low Cardinality Columns"

see also http://richardfoote.wordpress.com/2010/02/18/myth-bitmap-indexes-with-high-distinct-columns-blow-out/


what is the technical reason you are restructuring the physical layout?


About foreign keys

Dhruva, July 22, 2013 - 11:00 am UTC

Thanks Tom,

I am still struggling to find out if creating bitmap indexes on foreign key columns (regardless of cardinality) on a fact table is a good idea.

Most of the illustrations talk about the benefits when we filter on bitmap indexed columns by values. In case of foreign key columns we will not be directly filtering on the columns but will only use them for joining with the dimension table.

There will be filters on the columns of the dimension table but we'll be addressing those with bitmap join indexes.

We are restructuring the physical design since this was designed several years ago by OLTP cowboy developers who did not bother to read up about best practices for a data warehouse. We are including other factors such as use of NOLOGGING complementing the backup/recovery strategy, use of RELY constraints for query rewrite, CDC, etc. It's a monumental exercise for a warehouse that has 50+ source applications.

Please do comment on the "creation of bitmap index on foreign key column" question.
Tom Kyte
July 31, 2013 - 4:13 pm UTC

http://docs.oracle.com/cd/B19306_01/server.102/b14223/schemas.htm#CIHGHEFB

if you want to do star transformations, you will have bitmap indexes on your foreign keys.

Please ignore

Dhruva, July 22, 2013 - 3:36 pm UTC

Perhaps you should ignore this, I just remembered that bitmap indexes on FK columns are a pre-requisite for star transformations...

More to Explore

PL/SQL demos

Check out more PL/SQL tutorials on our LiveSQL tool.

PL/SQL docs

PL/SQL reference manual from the Oracle documentation library