Home>Question Details



Hariharan -- Thanks for the question regarding "Bitmap Indexes", version 10.02

Submitted on 20-Oct-2009 8:00 Central time zone
Last updated 23-Nov-2009 10:01

You Asked

Hi Tom,

Good Day

I was reading a book written by a popular author (I would prefer not to mention the name of the book and its author). He says:

In discussion on bitmap indexes, the following was written:

"Concatenated B*tree indexes were being used and the distinct values for each of these columns numbered less than 200. By replacing the b*tree indexes with bitmap indexes, the overall performance improvement of the system was incredible"...

Further the author also says...

Bitmap indexes do not update quickly, so the best approach is to only use them in database that update all rows in a nightly batch. In this fashion, the bitmap is dropped, changes applied to the table and the bitmap rebuilt.

In view of the above, can you please clarify the following points for us?

a) Do we create bitmap indexes when the distinct numbers are 200? Is it not high?
b) Why Bitmap indexes do not update quickly? Is it only for update or any DML activities?
c) I also heard many people telling me that number of bitmap indexes to be created on a table should be limited. Is this true?
d) Any new books on Oracle Database by you in near future?

Thanks for helping us

Hari

and we said...

a) 200 distinct values means the key would in general (on average) return 0.5% of the table.

A b*tree index would be great for that - a bitmap index would be great for that. Whether it was a 100,000 row table or a 100,000,000 row table.


However, I think there is missing information. I put the quote into google and it is clear where it comes from:

http://www.remote-dba.net/t_bitmap_index_tuning.htm


The entire quote was:

An ad-hoc query system was experiencing slow query performance. The system was read-only 
except for a 30-minute window at night for data loading. Inspection of the SQL showed 
complex combinational WHERE clauses like:

WHERE color=’BLU’ and make=’CHEVY’ and year=1997 and doors=2;

Concatenated B-tree indexes were being used and the distinct values for each of these 
columns numbered less than 200. By replacing the b-tree indexes with bitmap indexes, the 
overall performance improvement of the system was incredible.

 Queries that had taken as much as three seconds to run were reduced to runtimes of under 
one-tenth of a second. 


Now, if I had a concatenated index on (color,make,year,doors) - I don't really care if it was a b*tree or bitmap index - it would perform approximately the same.

If I had to just "count" things - both would be able to count via the index (without touching the table). The bitmap index might be smaller than the b*tree index - but not to such a degree that it would turn a 3 second query into a 1/10th of a second query.

If I had to access the table - both would perform *about the same*, the bitmap might even go a tad bit slower CPU wise since it has to convert bitmaps into rowids whereas the b*tree index would have the rowids already.

Wouldn't it be nice if the book had an example, one that showed the difference before and after? And then an explanation as to WHY it was so massively improved.



Here is how a bitmap index could be very useful. Think of an ad-hoc situation. You have a table with 20 attributes. All of them are searched on - but you never know which ones will be used. Maybe I like to query on C1, C5, C20 and you like to use C1..C5, C17. Maybe someone else likes to use another set. Maybe I use equals on C1, you use > on C1 (I would like C1 to be near the front of a b*tree index - you would like it to be near the end in the list of columns).

Now, you have these 20 attributes and the end users pick what they want to query against at runtime and they get to pick HOW they want to query against them (=, <, >, etc). I defy you to come up with a set of concatenated b*tree indexes that would be useful for each and every query that would benefit from an index.

Now, if I used single column bitmap indexes - having say 20 indexes, one each on each of the columns - we can put them together and create new indexes that never existed by ANDing and ORing them together.

Here is my write up on this


http://asktom.oracle.com/pls/asktom/z?p_url=ASKTOM%2Edownload_file%3Fp_file%3D695006012050
5500493&p_cat=bitmap.pdf&p_company=822925097021874




b) See, the fact that you ask this means "things that talk about Oracle should explain why they say the things they say - if someone writes 'bitmap indexes do not update quickly' - they should qualify that.

Bitmaps in fact do update rather quickly. However, they also tend to SERIALIZE and do not respond well to slow by slow (row by row) updates. They are suitable for read only/read mostly data that is modified in BULK (not row by row)


They tend to serialize because a bitmap index by definition is an index such that a single key value in the index points to 10's or 100's or 1000's of rows in a table. If you had a bitmap index on the JOB column in the EMP table, and you issued

update emp set job = 'CLERK' where ename = 'SCOTT' -- scott is an analyst


you would lock an index key of ANALYST (to change a 1 to a 0, scott isn't an analyst anymore) and an index key of CLERK (to change a 0 to a 1...). However, that ANALYST key points to hundreds of other EMP records - they would be effectively locked, you could not (probably) insert another analyst right now as that key is locked. Same for the clerk key - you could not delete someone that was a clerk and you'd have a hard time updating the JOB column for clerks and analysts, and creating a new record with those values would block as well.



c) ask them why, ask them for the science. The right number of indexes on a table TOTALLY DEPENDS ON THE USE CASES.


d) nothing concrete right now, but lots of answers to your bitmap questions - so you can UNDERSTAND them, to see how they work, what they do and don't do (so you can decide for yourself WHEN and WHERE to use them - rather then "it was incredible once - maybe it'll be incredible again some day") is in Expert Oracle Database Architecture right now.

Reviews    
5 stars Bitmap Indexes   October 20, 2009 - 8am Central time zone
Reviewer: Hariharan from Chennai, India
Brilliant Tom. 

I actually do not want to specify the author name as I remember seeing some heated arguments 
exchanged on the "Precepts". 

But you found out.

Good Explanation. 

Thanks Again


4 stars Hmmmmm   October 20, 2009 - 9am Central time zone
Reviewer: djb from UK
'I actually do not want to specify the author name'

Have to say, it was pretty obvious !


5 stars Don't worry, Hari   October 20, 2009 - 9am Central time zone
Reviewer: Martijn Hoekstra 
The author is currently far too busy over here: 
http://forums.oracle.com/forums/message.jspa?messageID=3838839#3838839 and here: 
http://forums.oracle.com/forums/message.jspa?messageID=3836101#3836101

;)

(very interesting and informing reads by the way...)


5 stars   October 20, 2009 - 12pm Central time zone
Reviewer: A reader 
Hello Tom,

<qoute>

the bitmap might even go a tad bit slower CPU wise since it has to convert bitmaps into rowids 
whereas the b*tree index would have the rowids already

<qoute>


Is it not that bitmap index in leaf block have start row-id and end row-id ? Am I correct ?

could please clarify your above comment 


Thanks 






Followup   October 22, 2009 - 4pm Central time zone:

the bitmap index has a rowid range (start/stop) and a bitmap

in order to get the rowid for the i'th row in the bitmap, you must do some "math" on the start/stop rowids - perform some function f(start,stop,bitmap,i)

the b*tree just has the rowid there.
5 stars Attachment link   October 21, 2009 - 12am Central time zone
Reviewer: Raajesh from India
Hi Tom,

I checked the PDF link that you have provided and got a doubt in that.

"If a session modifies the indexed data, then all of the rows that index entry points to are 
effectively locked in most cases. Oracle cannot lock an individual bit in a bitmap index entry; it 
locks the entire bitmap index entry"

Is this an implementation limitation or is there a better way to address this?

Can you explain the overhead of this individual bit locking if it is possible?

Thanks,
Raajesh


Followup   October 22, 2009 - 4pm Central time zone:

It is pretty much in the definition of a bitmap index - in these structures, a single key points to hundreds or thousands of rows.

That is exactly what sets them apart from a b*tree or inverted list, or r*tree or any of the other indexing schemes.

They are all tools
They are all appropriate some time
They are all inappropriate other times
None are best
None are worst
None is better than the other

they are just *tools*

Understand how they work and you can use them safely.

Just like you would not use that table top saw without reading the directions (well, hope you wouldn't)
4 stars Excellent Explanation   October 21, 2009 - 1am Central time zone
Reviewer: A reader 
Excellent Explanation about bitmap indexes. 

In an OLTP multi-user system with update, delete and insert you have better not implement Bitmap 
indexes at all in order to avoid deadlocks.

Houri Mohamed


5 stars Bitmap Indexes   November 16, 2009 - 7am Central time zone
Reviewer: Hari from Chennai, India
Hi Tom,

Good Day.

We need some clarifications on the following scenario from you. 

Database: Oracle 10gR2
OS: Solaris 10

There is a huge table called TRADES in one of the products in our company, which is partitioned on 
the basis of transaction_date. Almost around 300K records gets loaded into this table daily and 
around 80-90 million records are there as of now. 

There are four boolean columns like ISCANCEL, ISACTIVE, ISTRADER and ISLOSS, which are often used 
in most of the queries.

My DBA and system architect recommending to create bitmap index on these four columns to improve 
performance. 

Approximately 300 partitions are currently available in this table. Is it okay to create bitmap 
indexes on these columns?

If you need any other information, please let me know

Thanks

Hari



Followup   November 23, 2009 - 10am Central time zone:

..
My DBA and system architect recommending to create bitmap index on these four
columns to improve performance.
.....

How do they anticipate it improving performance? Give me a for example "for example, we have this query..."

For you see, if there are only two values - it is likely that the bitmap index would retrieve MILLIONS of records from the table. You would need to use all four columns together - frequently - and have us bitmap OR and AND them together to find a small set of rows. I'm skeptical - so give me a for example...

When you say 300k records get loaded - is that a bulk load of 300k records or is this a transactional table.


5 stars Bitmap Indexes   November 19, 2009 - 9am Central time zone
Reviewer: Hari from Chennai, India
Hi Tom,

Good Day.

I was reading a chapter written by Jonathan Lewis and found the following statement:

Updates to bitmapped columns, and general insertion/deletion of data can degrade the quality of the 
indexes quite dramatically.

In order to check this fact, I did the following:

With Bitmap Index

create table bitmapcheck as select rownum id, mod(rownum,10) bt_col, mod(rownum,10) bit_col, 
rpad('x',200) pad from all_objects;
Table created.
create index btindex on bitmapcheck(bt_col);
Index created.
create bitmap index bitindex on bitmapcheck(bit_col);
Index created.
intxqa > insert into bitmapcheck select * from bitmapcheck;
41291 rows created.
Execution Plan
----------------------------------------------------------
Plan hash value: 52221000
---------------------------------------------------------------------------------
| Id  | Operation         | Name        | Rows  | Bytes | Cost (%CPU)| Time|
---------------------------------------------------------------------------------
|   0 | INSERT STATEMENT  |             | 82490 |    11M|   559   (1)| 00:00:07|
|   1 |  TABLE ACCESS FULL| BITMAPCHECK | 82490 |    11M|   559   (1)| 00:00:07|
---------------------------------------------------------------------------------
Note
-----
   - dynamic sampling used for this statement
Statistics
----------------------------------------------------------
        716  recursive calls
      98204  db block gets
       4665  consistent gets
         95  physical reads
   22488228  redo size
        555  bytes sent via SQL*Net to client
        577  bytes received via SQL*Net from client
          4  SQL*Net roundtrips to/from client
          3  sorts (memory)
          0  sorts (disk)
      41291  rows processed

Without Bitmap Index

create table bitmapcheck as select rownum id, mod(rownum,10) bt_col, mod(rownum,10) bit_col, 
rpad('x',200) pad from all_objects;

Table created.

create index btindex on bitmapcheck(bt_col);
Index created.

intxqa > insert into bitmapcheck select * from bitmapcheck;
41293 rows created.
Execution Plan
----------------------------------------------------------
Plan hash value: 52221000
---------------------------------------------------------------------------------
| Id  | Operation         | Name        | Rows  | Bytes | Cost (%CPU)| Time|
---------------------------------------------------------------------------------
|   0 | INSERT STATEMENT  |             | 82490 |    11M|   559   (1)| 00:00:07|
|   1 |  TABLE ACCESS FULL| BITMAPCHECK | 82490 |    11M|   559   (1)| 00:00:07|
---------------------------------------------------------------------------------
Note
-----
   - dynamic sampling used for this statement
Statistics
----------------------------------------------------------
        670  recursive calls
      97416  db block gets
       4448  consistent gets
         81  physical reads
   21830272  redo size
        562  bytes sent via SQL*Net to client
        577  bytes received via SQL*Net from client
          4  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
      41293  rows processed

Pardon me for the improper formatting. 

Now, when comparing the results except for a very few differences in Consistent Gets and Physical 
Reads, I do not see any major difference. 

In what situation bitmap indexes affect the DML operations adversely?

Thanks

Hari



Write a Review
 


All information and materials provided here are provided "as-is"; Oracle disclaims all express and implied warranties, including, the implied warranties of merchantability or fitness for a particular use. Oracle shall not be liable for any damages, including, direct, indirect, incidental, special or consequential damages for loss of profits, revenue, data or data use, incurred by you or any third party in connection with the use of this information or these materials.

About Oracle | Legal Notices and Terms of Use | Privacy Statement