Skip to Main Content
  • Questions
  • Using bitmap indexes in OLTP database with mostly inserts

Breadcrumb

May 4th

Question and Answer

Chris Saxon

Thanks for the question, Mats.

Asked: February 20, 2017 - 1:19 pm UTC

Last updated: February 20, 2017 - 4:18 pm UTC

Version: 12.1.0.2.0

Viewed 1000+ times

You Asked

Hi,

We have a table for logging metadata about processed messages in a production system.
The table have approx 32M rows today and 25 columns. The total number of rows is expected to be around 100M in the future.
When the processing of a message starts, one row is inserted to the table.
When the processing finishes, the same row is updated to fill in some more information such as processingstatus, processingtime etc.
The update occur within 20-1000ms after the insert. After the initial insert/update the row is never again updated/deleted.
It is estimated that the number of inserted rows per day will be around 1M.
The table is partitioned in order to allow for dropping data older than X months.
The referenced LiveSQL represents a simplified version of the logging table with a bitmap index defined.

For analysis we want to be able to filter the table for multiple combinations of columns, some of them with just a small number of possible values (such as processingstatus) and some with more or less unique values.
The filtering usually also include a time range (using created column).

It seems hard to define BTree indexes to allow for effective filtering on arbitrary columns but adding bitmap indexes on the filterable columns seems to work pretty well for this.

My questions are:
- Is it a good or bad idea to use bitmap indexes in the scenario above?
- I've read about risks with deadlock's or slowness due to locking issues when using bitmap indexes in OLTP with concurrent inserts/updates/deletes and I can understand that happening when data is modified more arbitrarily but is it a problem in this case when each row is inserted once, updated once (shortly after and by the same thread) and then never modified again?
- If there really are performance/deadlock issues using bitmap indexes in our scenario, are there any other better alternatives other than using a separate offline database for the filtering/analysis?

Best regards,
Mats

with LiveSQL Test Case:

and Chris said...

Bitmap indexes are one of the few cases in Oracle Database where an insert in one session can block an insert in another. For example, if in session 1 you run:

create table my_log ( 
    Id INTEGER GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY, 
    A VARCHAR(65), -- A number of filterable columns with relatively few distinct values 
    B VARCHAR(65), -- 
    ProcessingStatus VARCHAR(65), 
    ProcessingTime INTEGER, 
    CREATED TIMESTAMP(6) DEFAULT CURRENT_TIMESTAMP NOT NULL 
) partition by range (created) interval (interval '1' day) (
  partition p0 values less than (date'2017-01-01')
);

CREATE BITMAP INDEX MY_LOG_BIDX1 ON My_Log (PROCESSINGSTATUS) LOCAL;

select sys_context('USERENV', 'SID') from dual;

SYS_CONTEXT('USERENV','SID')  
54   

INSERT INTO my_log (A, created) VALUES ('XYZ', date'2016-12-31');


And session 2 you run:

select sys_context('USERENV', 'SID') from dual;

SYS_CONTEXT('USERENV','SID')
51

INSERT INTO my_log (A, created) VALUES ('XYZ', date'2016-12-31');


You'll see that session 2 is blocked:

select 'SID=' || s1.sid || ' is blocking SID=' || s2.sid blocking_status
from v$lock l1, v$session s1, v$lock l2, v$session s2
where s1.sid=l1.sid and s2.sid=l2.sid
and l1.BLOCK=1 and l2.request > 0
and l1.id1 = l2.id1
and l2.id2 = l2.id2 ;

BLOCKING_STATUS            
SID=54 is blocking SID=51  


So bitmap indexes are a big threat to scalablitly. Even if you're only inserting the row and doing one update. If you're going to have multiple sessions inserting rows, they're best avoided.

So what could you do instead? Here's a few options:

- Analyze your queries to create BTree indexes as appropriate
- Split current and old rows into separate tables. Then have a job to move data from the current to the history/archive table. Because you can have one process to do this, concurrency isn't an issue. Thus the history/archive table can have bitmap indexes with minimal scalability risks
- Test In-Memory to see if it gives the performance you need without indexes

Fortunately you're on 12c, so there is another option open to you:

Partial Indexes

This enables you to specify whether or not Oracle Database indexes values in a particular partition. So you can index the old partitions but not the current one:

drop index my_log_bidx1;
alter table my_log modify default attributes indexing off;
create bitmap index my_log_bidx1 on my_log (processingstatus) local indexing partial;

insert into my_log (created, PROCESSINGSTATUS) 
  select trunc(sysdate, 'y')+rownum, 'X' from dual connect by level <= 50;
 
select count(*) from user_tab_partitions
where  table_name = 'MY_LOG';

COUNT(*)  
51   

select status, count(*), min(partition_position), max(partition_position)
from   user_ind_partitions
where  index_name = 'MY_LOG_BIDX1'
group  by status;

STATUS    COUNT(*)  MIN(PARTITION_POSITION)  MAX(PARTITION_POSITION)  
USABLE    1         1                        1                        
UNUSABLE  50        2                        51   


Notice that most of the index partitions are "UNUSABLE". So they're not indexed. You'll need to have a process which enables indexing when a current partition becomes old. You do this by altering the partition to set "indexing on". For example:

begin
  for ps in (
    select * from user_tab_partitions
    where  table_name = 'MY_LOG'
    and    partition_position < 51
  ) loop
    execute immediate 'alter table my_log modify partition ' || ps.partition_name || ' indexing on';
  end loop;
end;
/

select status, count(*), min(partition_position), max(partition_position)
from   user_ind_partitions
where  index_name = 'MY_LOG_BIDX1'
group  by status;

STATUS    COUNT(*)  MIN(PARTITION_POSITION)  MAX(PARTITION_POSITION)  
USABLE    50        1                        50                       
UNUSABLE  1         51                       51


Now only the last (current) partition is unindexed. So you can happily insert rows in both sessions! You'll still need to figure out how to manage queries against the current partition. But this should be a more manageable problem.

Note you can mimic this behaviour in 11g for local (not global) indexes:

https://danischnider.wordpress.com/2016/06/28/partial-indexes-part-1-local-partial-indexes/

For further reading on partial indexes, see:

https://docs.oracle.com/database/121/VLDBG/GUID-256BA7EE-BF49-42DE-9B38-CD2480A73129.htm#VLDBG14102
https://richardfoote.wordpress.com/2013/07/08/12c-partial-indexes-for-partitioned-tables-part-i-ignoreland/

Rating

  (1 rating)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Mats, February 23, 2017 - 1:37 pm UTC


More to Explore

VLDB

If you are new to partitioning, check out Connor McDonald's introduction series here.

VLDB

Documentation set on VLDB and Partitioning.