Skip to Main Content
  • Questions
  • Multiple concatenated column indexes or single column indexes?

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, Steve.

Asked: December 19, 2007 - 4:42 pm UTC

Last updated: October 08, 2008 - 9:14 pm UTC

Version: 10.1.0.5

Viewed 10K+ times! This question is

You Asked

When I design tables, I've always added indexes to the columns that were used in WHERE clauses in queries. If the columns were used in multiple WHERE-clause combinations, I index each one of the columns that are being used. I found an application which has a table that has 34 indexes on it, plus a primary key, most all of them are made up of multiple columns in which most of the columns are the same. Isn't it better to just create individual indexes on these often-used columns instead of creating a concatenated index on each possible set of WHERE-clauses?

I've included the table and index DDL as an example:

CREATE TABLE TESTTAB
(
  PGM_ID                NUMBER(6)               NOT NULL,
  UW_CD                 VARCHAR2(4 BYTE)        NOT NULL,
  PGM_NAME              VARCHAR2(100 BYTE),
  PGM_TYPE              VARCHAR2(50 BYTE),
  TEACHER_CERT_FL       CHAR(1 BYTE)            DEFAULT 'N'                   NOT NULL,
  TEACHER_CERT_ONLY_FL  CHAR(1 BYTE)            DEFAULT 'N'                   NOT NULL,
  BASE_ACAD_LEVEL       CHAR(1 BYTE)            NOT NULL,
  ACAD_LEVEL            CHAR(1 BYTE)            NOT NULL,
  SUBMAJOR_REQUIRED_FL  CHAR(1 BYTE),
  EFFECTIVE_TERM        VARCHAR2(7 BYTE),
  PGM_END_TERM          VARCHAR2(7 BYTE),
  PIPELINE_BEGIN_TERM   VARCHAR2(7 BYTE),
  PIPELINE_END_TERM     VARCHAR2(7 BYTE),
  PUBLISH_BEGIN_DT      DATE,
  PUBLISH_END_DT        DATE,
  INTERDIS_PGM_FL       CHAR(1 BYTE)            DEFAULT 'N'                   NOT NULL,
  JOINT_PGM_FL          CHAR(1 BYTE)            DEFAULT 'N'                   NOT NULL,
  COLLAB_PGM_FL         CHAR(1 BYTE)            DEFAULT 'N'                   NOT NULL,
  COOP_PGM_FL           CHAR(1 BYTE)            DEFAULT 'N'                   NOT NULL,
  OFF_CAMPUS_PGM_FL     CHAR(1 BYTE)            DEFAULT 'N'                   NOT NULL,
  DUAL_DEGREE_FL        CHAR(1 BYTE),
  PGM_STATUS            CHAR(1 BYTE)            NOT NULL,
  PGM_STATUS_DT         DATE                    NOT NULL,
  DATA_STATUS           CHAR(1 BYTE)            NOT NULL,
  DATA_STATUS_DT        DATE                    NOT NULL,
  UPDT_BY               VARCHAR2(30 BYTE)       DEFAULT USER                  NOT NULL,
  UPDT_DT               DATE                    DEFAULT SYSDATE               NOT NULL,
  OK_TO_REVIEW_FL       CHAR(1 BYTE)            DEFAULT 'N'                   NOT NULL,
  OK_TO_REVIEW_DT       DATE,
  CAMPUS_REVIEW_BY      VARCHAR2(30 BYTE),
  CAMPUS_REVIEW_DT      DATE,
  CAMPUS_APPRV_FL       CHAR(1 BYTE),
  UWSA_NOTIFIED_FL      CHAR(1 BYTE),
  UWSA_NOTIFIED_DT      DATE,
  HELP_REVIEW_BY        VARCHAR2(30 BYTE),
  HELP_REVIEW_DT        DATE,
  HELP_APPRV_FL         VARCHAR2(20 BYTE),
  VERSION_NO            NUMBER(3)               NOT NULL,
  IS_SUBPGM             CHAR(1 BYTE)            DEFAULT 'N'                   NOT NULL,
  HAS_SUBPGM            CHAR(1 BYTE)            DEFAULT 'N'                   NOT NULL,
  IS_SPLIT_MAJOR        CHAR(1 BYTE)            DEFAULT 'N'                   NOT NULL,
  UWSA_MAJOR_CODE       NUMBER(5),
  IS_LATEST_VERSION     CHAR(1 BYTE)            DEFAULT 'Y'                   NOT NULL,
  GENERIC_NAME          VARCHAR2(100 BYTE),
  PREPROFESSIONAL_FL    CHAR(1 BYTE)            DEFAULT 'N'                   NOT NULL,
  PERM_PGM_ID           NUMBER(6),
  PREV_PGM_ID           NUMBER(6)               NOT NULL,
  EFFECTIVE_TERM_NUM    NUMBER(7),
  PGM_END_TERM_NUM      NUMBER(7),
  ALT_DELIV_FL          VARCHAR2(1 BYTE)        DEFAULT 'N'                   NOT NULL,
  ALT_DELIV_ONLY_FL     VARCHAR2(1 BYTE)        DEFAULT 'N'                   NOT NULL
);

CREATE INDEX TESTTAB_OLDPGMVERSION_NDX ON TESTTAB
(PREV_PGM_ID, VERSION_NO, PGM_ID);

CREATE INDEX TESTTAB_LEVELSTATUSEND_NDX ON TESTTAB
(BASE_ACAD_LEVEL, PGM_STATUS, PGM_END_TERM, PUBLISH_END_DT, PUBLISH_BEGIN_DT, 
PGM_ID);

CREATE INDEX TESTTAB_STATUSTYPE_NDX ON TESTTAB
(PGM_STATUS, PGM_TYPE, PGM_ID);

CREATE INDEX TESTTAB_STATUSUWTYPE_NDX ON TESTTAB
(PGM_STATUS, UW_CD, PGM_TYPE, PGM_ID);

CREATE INDEX TESTTAB_STATUSENDUWPID_NDX ON TESTTAB
(PGM_STATUS, PGM_END_TERM, UW_CD, PGM_ID);

CREATE INDEX TESTTAB_LEVELLVERUWPID_NDX ON TESTTAB
(BASE_ACAD_LEVEL, UW_CD, IS_LATEST_VERSION, PGM_ID);

CREATE INDEX TESTTAB_LVLSTATUSENDUWTYPE_NDX ON TESTTAB
(BASE_ACAD_LEVEL, PGM_STATUS, PGM_END_TERM, PUBLISH_END_DT, PUBLISH_BEGIN_DT, 
UW_CD, PGM_ID);

CREATE INDEX TESTTAB_LVERPID_NDX ON TESTTAB
(IS_LATEST_VERSION, PGM_ID);

CREATE INDEX TESTTAB_LVERTYPE_NDX ON TESTTAB
(IS_LATEST_VERSION, PGM_TYPE);

CREATE INDEX TESTTAB_STATUS_UW_PGMID_NDX ON TESTTAB
(BASE_ACAD_LEVEL, PGM_STATUS, UW_CD, PGM_ID);

CREATE INDEX TESTTAB_LVLLVERSTATUSTYPE_NDX ON TESTTAB
(BASE_ACAD_LEVEL, IS_LATEST_VERSION, PGM_STATUS, PGM_ID);

CREATE INDEX TESTTAB_STATUSEND_NDX ON TESTTAB
(PGM_STATUS, PGM_END_TERM, PUBLISH_END_DT, PUBLISH_BEGIN_DT, PGM_ID);

CREATE INDEX TESTTAB_STATUSENDUWTYPE_NDX ON TESTTAB
(PGM_STATUS, PGM_END_TERM, PUBLISH_END_DT, PUBLISH_BEGIN_DT, UW_CD, 
PGM_ID);

CREATE INDEX TESTTAB_LVERUWPID_NDX ON TESTTAB
(UW_CD, IS_LATEST_VERSION, PGM_ID);

CREATE INDEX TESTTAB_LASTVERSTAT_UW_PID_NDX ON TESTTAB
(BASE_ACAD_LEVEL, IS_LATEST_VERSION, PGM_STATUS, UW_CD, PGM_ID);

CREATE INDEX TESTTAB_LASTVERSTAT_PTYPE_NDX ON TESTTAB
(BASE_ACAD_LEVEL, IS_LATEST_VERSION, PGM_STATUS, PGM_TYPE);

CREATE INDEX TESTTAB_STATUS_PGMID_NDX ON TESTTAB
(BASE_ACAD_LEVEL, PGM_STATUS, PGM_ID);

CREATE INDEX TESTTAB_STATUS_PGMTYPE_NDX ON TESTTAB
(BASE_ACAD_LEVEL, PGM_STATUS, PGM_TYPE, PGM_ID);

CREATE INDEX TESTTAB_LASTVER_PGMID_NDX ON TESTTAB
(BASE_ACAD_LEVEL, IS_LATEST_VERSION, PGM_ID);

CREATE INDEX TESTTAB_LASTVER_UW_PGMID_NDX ON TESTTAB
(BASE_ACAD_LEVEL, IS_LATEST_VERSION, UW_CD, PGM_ID);

CREATE INDEX TESTTAB_LASTVER_PGMTYPE_NDX ON TESTTAB
(BASE_ACAD_LEVEL, IS_LATEST_VERSION, PGM_TYPE);

CREATE INDEX TESTTAB_STATUS_UW_PGMTYPE_NDX ON TESTTAB
(BASE_ACAD_LEVEL, PGM_STATUS, UW_CD, PGM_TYPE, PGM_ID);

CREATE INDEX TESTTAB_LASTVER_UW_PGMTYPE_NDX ON TESTTAB
(BASE_ACAD_LEVEL, IS_LATEST_VERSION, UW_CD, PGM_TYPE);

CREATE INDEX TESTTAB_STATUSUWPID_NDX ON TESTTAB
(PGM_STATUS, UW_CD, PGM_ID);

CREATE UNIQUE INDEX TESTTAB_PK ON TESTTAB
(PGM_ID);

CREATE INDEX TESTTAB_LVEREND_NDX ON TESTTAB
(IS_LATEST_VERSION, PGM_END_TERM, PGM_ID);

CREATE INDEX TESTTAB_LVERENDUW_NDX ON TESTTAB
(IS_LATEST_VERSION, PGM_END_TERM, UW_CD, PGM_ID);

CREATE INDEX TESTTAB_STATUSENDPID_NDX ON TESTTAB
(PGM_STATUS, PGM_END_TERM, PGM_ID);

CREATE INDEX TESTTAB_STATUSENDUWTYPEPID_NDX ON TESTTAB
(PGM_STATUS, PGM_END_TERM, UW_CD, PGM_TYPE, PGM_ID);

CREATE INDEX TESTTAB_LVERUWTYPE_NDX ON TESTTAB
(IS_LATEST_VERSION, PGM_END_TERM, UW_CD, PGM_TYPE);

CREATE INDEX TESTTAB_LVERSTATUSUWPID_NDX ON TESTTAB
(IS_LATEST_VERSION, PGM_STATUS, UW_CD, PGM_ID);

CREATE INDEX TESTTAB_LVERENDTYPE_NDX ON TESTTAB
(IS_LATEST_VERSION, PGM_END_TERM, PGM_TYPE);

CREATE INDEX TESTTAB_LVERSTATUSTYPEPID_NDX ON TESTTAB
(IS_LATEST_VERSION, PGM_STATUS, PGM_TYPE, PGM_ID);

CREATE INDEX TESTTAB_STATUSPID_NDX ON TESTTAB
(PGM_STATUS, PGM_ID);

ALTER TABLE TESTTAB ADD (
 PRIMARY KEY
 (PGM_ID));


Thanks, and great site!

and Tom said...

.... Isn't it better to just create individual indexes on these often-used columns instead of creating a concatenated index on each possible set of WHERE-clauses? ...


In general:

NO

Specifically - if this were a data warehouse, and the data in question is read only/read mostly, and the indexes in question were BITMAP indexes - then maybe

YES


so, it depends. Let's say you were a transactional system. And you had a table that resembled all_objects. Your main queries are:

a) show me all of user X's stuff (where owner = ?)
b) show me all of user X's things of this type (where owner = ? and object_type = ?)
c) show me the details on user X's thing of this type and having this name (where owner = ? and object_type = ? and object_name = ?)


The only sensible index would be on

all_objects(owner,object_type,object_name)


it would be useful for ALL THREE queries. Any other ordering of columns, any other set of indexes (plural) would not work as nicely as that one index.


That index flies in the face of conventional wisdom - it has the most selective item (object_name) dead last and the least selective first. That is OK because conventional wisdom is dead wrong (we never order columns in an index based on selectivity of the columns - we only order based on HOW THE INDEX IS TO BE USED)

So, in this case, a single concatenated index would work well for all three. It would do the minimal work.

If you created three indexes on

i1(owner)
i2(object_type)
i3(object_name)

then the first query would be OK - it would use i1.
the third query would LIKELY be ok (presuming that hundreds of people didn't own an EMP table - we'd have to scan through hundreds of emp tables!)

The second query however would be "hosed"

If the query used I1 - it would have to scan over all of user X's objects
If the query used I2 - it would have to scan over all TABLES (for example) in the database to find user X's tables.


Only an index on (owner,object_type) would be right for it.

But the right indexing scheme for this set of queries - the single concatenated index on (owner,object_type,object_name).



Now, let us play devils advocate - say the data was in a data warehouse, and you had NO IDEA what the ad hoc users would do. Now it is very hard to come up with an indexing scheme. Therein lie the power of the bitmap index. This is a short excerpt from Expert Oracle Database Architecture:


<quote>
When Should You Use a Bitmap Index?

Bitmap indexes are most appropriate on low distinct cardinality data (i.e., data with relatively few discrete values when compared to the cardinality of the entire set). It is not really possible to put a value on this-in other words, it is difficult to define what low distinct cardinality is truly. In a set of a couple thousand records, 2 would be low distinct cardinality, but 2 would not be low distinct cardinality in a two-row table. In a table of tens or hundreds of millions records, 100,000 could be low distinct cardinality. So, low distinct cardinality is relative to the size of the resultset. This is data where the number of distinct items in the set of rows divided by the number of rows is a small number (near zero). For example, a GENDER column might take on the values M, F, and NULL. If you have a table with 20,000 employee records in it, then you would find that 3/20000 = 0.00015. Likewise, 100,000 unique values out of 10,000,000 results in a ratio of 0.01-again, very small. These columns would be candidates for bitmap indexes. They probably would not be candidates for a having B*Tree indexes, as each of the values would tend to retrieve an extremely large percentage of the table. B*Tree indexes should be selective in general, as outlined earlier. Bitmap indexes should not be selective-on the contrary, they should be very 'unselective' in general.

Bitmap indexes are extremely useful in environments where you have lots of ad hoc queries, especially queries that reference many columns in an ad hoc fashion or produce aggregations such as COUNT. For example, suppose you have a large table with three columns: GENDER, LOCATION, and AGE_GROUP. In this table, GENDER has a value of M or F, LOCATION can take on the values 1 through 50, and AGE_GROUP is a code representing 18 and under, 19-25, 26-30, 31-40, and 41 and over. You have to support a large number of ad hoc queries that take the following form:

Select count(*)
  from T
 where gender = 'M'
   and location in ( 1, 10, 30 )
   and age_group = '41 and over';

select *
  from t
 where (   ( gender = 'M' and location = 20 )
        or ( gender = 'F' and location = 22 ))
   and age_group = '18 and under';

select count(*) from t where location in (11,20,30);

select count(*) from t where age_group = '41 and over' and gender = 'F';


You would find that a conventional B*Tree indexing scheme would fail you. If you wanted to use an index to get the answer, you would need at least three and up to six combinations of possible B*Tree indexes to access the data via the index. Since any of the three columns or any subset of the three columns may appear, you would need large concatenated B*Tree indexes on

* GENDER, LOCATION, AGE_GROUP: For queries that used all three, or GENDER with LOCATION, or GENDER alone
* LOCATION, AGE_GROUP: For queries that used LOCATION and AGE_GROUP or LOCATION alone
* AGE_GROUP, GENDER: For queries that used AGE_GROUP with GENDER or AGE_GROUP alone

To reduce the amount of data being searched, other permutations might be reasonable as well, to decrease the size of the index structure being scanned. This is ignoring the fact that a B*Tree index on such low cardinality data is not a good idea.

Here the bitmap index comes into play. With three small bitmap indexes, one on each of the individual columns, you will be able to satisfy all of the previous predicates efficiently. Oracle will simply use the functions AND, OR, and NOT, with the bitmaps of the three indexes together, to find the solution set for any predicate that references any set of these three columns. It will take the resulting merged bitmap, convert the 1s into rowids if necessary, and access the data (if you are just counting rows that match the criteria, Oracle will just count the 1 bits). Let¿s take a look at an example. First, we¿ll generate test data that matches our specified distinct cardinalities-index it and gather statistics. We¿ll make use of the DBMS_RANDOM package to generate random data fitting our distribution:

ops$tkyte@ORA10G> create table t
  2  ( gender not null,
  3    location not null,
  4    age_group not null,
  5    data
  6  )
  7  as
  8  select decode( ceil(dbms_random.value(1,2)),
  9                 1, 'M',
 10                 2, 'F' ) gender,
 11         ceil(dbms_random.value(1,50)) location,
 12         decode( ceil(dbms_random.value(1,5)),
 13                 1,'18 and under',
 14                 2,'19-25',
 15                 3,'26-30',
 16                 4,'31-40',
 17                 5,'41 and over'),
 18         rpad( '*', 20, '*')
 19    from big_table.big_table
 20   where rownum <= 100000;
Table created.
 
ops$tkyte@ORA10G> create bitmap index gender_idx on t(gender);
Index created.
 
ops$tkyte@ORA10G> create bitmap index location_idx on t(location);
Index created.
 
ops$tkyte@ORA10G> create bitmap index age_group_idx on t(age_group);
Index created.
 
ops$tkyte@ORA10G> exec dbms_stats.gather_table_stats( user, 'T', cascade=>true );
PL/SQL procedure successfully completed.


Now we¿ll take a look at the plans for our various ad hoc queries from earlier:
ops$tkyte@ORA10G> Select count(*)
  2    from T
  3   where gender = 'M'
  4     and location in ( 1, 10, 30 )
  5     and age_group = '41 and over';
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=5 Card=1 Bytes=13)
   1    0   SORT (AGGREGATE)
   2    1     BITMAP CONVERSION (COUNT) (Cost=5 Card=1 Bytes=13)
   3    2       BITMAP AND
   4    3         BITMAP INDEX (SINGLE VALUE) OF 'GENDER_IDX' (INDEX (BITMAP))
   5    3         BITMAP OR
   6    5           BITMAP INDEX (SINGLE VALUE) OF 'LOCATION_IDX' (INDEX (BITMAP))
   7    5           BITMAP INDEX (SINGLE VALUE) OF 'LOCATION_IDX' (INDEX (BITMAP))
   8    5           BITMAP INDEX (SINGLE VALUE) OF 'LOCATION_IDX' (INDEX (BITMAP))
   9    3         BITMAP INDEX (SINGLE VALUE) OF 'AGE_GROUP_IDX' (INDEX (BITMAP))


This example shows the power of the bitmap indexes. Oracle is able to see the location in (1,10,30) and knows to read the index on location for these three values and logically OR together the 'bits' in the bitmap. It then takes that resulting bitmap and logically ANDs that with the bitmaps for AGE_GROUP='41 AND OVER' and GENDER='M'. Then a simple count of 1s and the answer is ready.

ops$tkyte@ORA10G> select *
  2    from t
  3   where (   ( gender = 'M' and location = 20 )
  4          or ( gender = 'F' and location = 22 ))
  5     and age_group = '18 and under';
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=77 Card=507 Bytes=16731)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'T' (TABLE) (Cost=77 Card=507 ¿
   2    1     BITMAP CONVERSION (TO ROWIDS)
   3    2       BITMAP AND
   4    3         BITMAP INDEX (SINGLE VALUE) OF 'AGE_GROUP_IDX' (INDEX (BITMAP))
   5    3         BITMAP OR
   6    5           BITMAP AND
   7    6             BITMAP INDEX (SINGLE VALUE) OF 'LOCATION_IDX' (INDEX (BITMAP))
   8    6             BITMAP INDEX (SINGLE VALUE) OF 'GENDER_IDX' (INDEX (BITMAP))
   9    5           BITMAP AND
  10    9             BITMAP INDEX (SINGLE VALUE) OF 'GENDER_IDX' (INDEX (BITMAP))
  11    9             BITMAP INDEX (SINGLE VALUE) OF 'LOCATION_IDX' (INDEX (BITMAP))


This shows similar logic: the plan shows the OR¿d conditions are each evaluated by AND-ing together the appropriate bitmaps and then OR-ing together those results. Throw in another AND to satisfy the AGE_GROUP='18 AND UNDER' and we have it all. Since we asked for the actual rows this time, Oracle will convert each bitmap 1 and 0 into rowids to retrieve the source data.

In a data warehouse or a large reporting system supporting many ad hoc SQL queries, this ability to use as many indexes as make sense simultaneously comes in very handy indeed. Using conventional B*Tree indexes here would not be nearly as usual or usable, and as the number of columns that are to be searched by the ad hoc queries increases, the number of combinations of B*Tree indexes you would need increases as well.

However, there are times when bitmaps are not appropriate. They work well in a read-intensive environment, but they are extremely ill suited for a write-intensive environment. The reason is that a single bitmap index key entry points to many rows. If a session modifies the indexed data, then all of the rows that index entry points to are effectively locked in most cases. Oracle cannot lock an individual bit in a bitmap index entry; it locks the entire bitmap index entry. Any other modifications that need to update that same bitmap index entry will be locked out. This will seriously inhibit concurrency, as each update will appear to lock potentially hundreds of rows preventing their bitmap columns from being concurrently updated. It will not lock every row as you might think-just many of them. Bitmaps are stored in chunks, so using the earlier EMP example we might find that the index key ANALYST appears in the index many times, each time pointing to hundreds of rows. An update to a row that modifies the JOB column will need to get exclusive access to two of these index key entries: the index key entry for the old value and the index key entry for the new value. The hundreds of rows these two entries point to will be unavailable for modification by other sessions until that UPDATE commits.
</quote>




Now, back to looking at some of your indexes - what you need to do is look for indexes that appear to overlap, for example:

...
CREATE INDEX TESTTAB_LASTVER_PGMID_NDX ON TESTTAB
(BASE_ACAD_LEVEL, IS_LATEST_VERSION, PGM_ID);

CREATE INDEX TESTTAB_LASTVER_UW_PGMID_NDX ON TESTTAB
(BASE_ACAD_LEVEL, IS_LATEST_VERSION, UW_CD, PGM_ID);
....


It is possible, maybe even probable, that these two indexes could be replaced by:
CREATE INDEX I ON TESTTAB
(BASE_ACAD_LEVEL, IS_LATEST_VERSION, PGM_ID, UW_CD);


If the predicates used are all equality then, yes - they could.

Only if the predicates are of the form:

where base_acad_level = ? and is_lastest_version = ? and uw_cd = ? and pgm_id > ?

then it could be bad to replace the two with the one. (we would have to range scan possibly many pgm_id's - we'd like that LAST in the index)


To see what I mean, say you have:



ops$tkyte%ORA10GR2> /*
ops$tkyte%ORA10GR2> drop table t;
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create table t
ops$tkyte%ORA10GR2> as
ops$tkyte%ORA10GR2> select object_id id1, rownum + 1000 id2, all_objects.*
ops$tkyte%ORA10GR2>   from all_objects;
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create index t_idx_id2_id1 on t(id2,id1);
ops$tkyte%ORA10GR2> create index t_idx_id1_id2 on t(id1,id2);
ops$tkyte%ORA10GR2> exec dbms_stats.gather_table_stats( user, 'T' );
ops$tkyte%ORA10GR2> */
ops$tkyte%ORA10GR2> @at
ops$tkyte%ORA10GR2> column PLAN_TABLE_OUTPUT format a72 truncate
ops$tkyte%ORA10GR2> set autotrace traceonly explain
ops$tkyte%ORA10GR2> set autotrace traceonly
ops$tkyte%ORA10GR2> select /*+ index( t t_idx_id2_id1 ) */ * from t where id1 = 42 and id2 > 1000;


Execution Plan
----------------------------------------------------------
Plan hash value: 2604971482

------------------------------------------------------------------------
| Id  | Operation                   | Name          | Rows  | Bytes | Co
------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |               |     1 |   103 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T             |     1 |   103 |
|*  2 |   INDEX RANGE SCAN          | T_IDX_ID2_ID1 |     1 |       |
------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ID2">1000 AND "ID1"=42 AND "ID2" IS NOT NULL)
       filter("ID1"=42)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        148  consistent gets
          0  physical reads
          0  redo size
       1312  bytes sent via SQL*Net to client
        385  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

ops$tkyte%ORA10GR2> select /*+ index( t t_idx_id1_id2 ) */ * from t where id1 = 42 and id2 > 1000;


Execution Plan
----------------------------------------------------------
Plan hash value: 1605821398

------------------------------------------------------------------------
| Id  | Operation                   | Name          | Rows  | Bytes | Co
------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |               |     1 |   103 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T             |     1 |   103 |
|*  2 |   INDEX RANGE SCAN          | T_IDX_ID1_ID2 |     1 |       |
------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ID1"=42 AND "ID2">1000 AND "ID2" IS NOT NULL)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          4  consistent gets
          0  physical reads
          0  redo size
       1312  bytes sent via SQL*Net to client
        385  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

ops$tkyte%ORA10GR2> set autotrace off


In general, you want things you "equal on" first, range over "last" - to avoid having to inspect lots and lots of values....



So, the short answer:

a) concatenated indexes make more sense in general than lots and lots of single column indexes

b) a) is false in a warehouse with lots of ad-hoc queries

c) you need to inspect your set of existing indexes and see which could be possibly merged - eg, if you take my first example and saw that you had these indexes:

i1( owner )
i2( object_type, owner )
i3( object_name, object_type, owner )

it would not be immediately obvious - but after some study, you would be able to recognize that a single index:

i( owner, object_type, object_name )

would suffice to efficiently answer all three questions.


Rating

  (4 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

pattern ... too wide a brush

nameless, December 20, 2007 - 1:13 pm UTC

It does seem they have over-indexed here.

PGM_ID is the PK, seems surrogate (so that PGM_ID >0 shouldn't really exist in a query); table can accommodate maximum 2 million rows; if the semantic is educational program id then the volume here is likely substantially smaller than that.

Index on (PGM_STATUS, PGM_ID): they added the PK to make the index appealing for a query like:

select <some> from t where pgm_id = :pk and pgm_status=:s

Well, the PK index is there and hence access is already fast ... so this index only makes sense if the select list contained only pgm_id and pgm_status and one would want to save the trip to the table segment ... that is, the index segment is enough to answer the query. On balance, it seems a waste of space for a questionable absolute squeeze on select performance (if any).

All those indexes where they added pgm_id at the end (to make them appealing) are questionable.

Use multiple Indexes

Albert Nelson A., December 21, 2007 - 6:30 am UTC

Hi Tom,

Very good explanation.

Why is not possible to use multiple b* tree indexes similar to bitmap indexes?

One of the access method index_join infact hashes several indexes and returns the result. But it seems it is applicable only when all the table columns referenced in the query exist in those indexes.

Is it not possible to merge rowids resulting from several b* tree index range scans and then do a table look up?

This is more so, say when the columns are foreign keys. For example if there are three columns a, b & c and each is a foreign key. And also we query using all three columns. Accordingly an index on a+b+c is appropriate. But to index the foreign key, we need to create two separate indexes one for b and another for c. Is there any suggestion?


Regards,

Albert Nelson A.
Tom Kyte
December 21, 2007 - 3:05 pm UTC

... Why is not possible to use multiple b* tree indexes similar to bitmap indexes? ...

we can do an index join of two indexes - if all of the columns needed by the query are in the indexes themselves (we can join the two indexes by ROWID)

However, it is the fact that bitmaps have, well, BITMAPS that makes them "mergable" like that. You have a key, and a big range of zero's and one's - we can OR, AND whatever.

if you have a b*tree - what can you do with that? You'd have to join all of the indexes together by rowid - and then hit the table. We can do that with two indexes, but in general - one is the name of the game (it takes a lot of work to range scan and join - might as well just take the rowid we have and instead of joining that to another index - just get the row in the table and be done with it)


bitmap conversion + index join

A reader, October 03, 2008 - 12:11 pm UTC

Hi Tom,

I got an application having only single column indexes.

In the statement below all referenced columns have an single column index on it. You see what the database - oracle ee 9.2.0.6 - does with it.

It would be better, if it would use almpostion as driving table and start with an index range scan (while it would be probably best to use an composite index on both predicates of almposition).

My question is if there is any possibility to tell the optimizer to use the other table as driving table WITHOUT changing the statement OR re-creating indexes as composite indexes?

select SECTYPE_ID, SCENARIO_ID, RISKFACTOR_ID, VAL, TARGET
from
ALMPOSITION, RISKEXPOSURE where EXTRACT_ID=5313933 and
ALMPOSITION.ALMPOSITION_ID=RISKEXPOSURE.ALMPOSITION_ID and SCENARIO_ID=0


call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 2 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 30.77 35.24 339 391430 0 191
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 30.77 35.25 339 391432 0 191

Misses in library cache during parse: 1
Optimizer goal: FIRST_ROWS
Parsing user id: 87

Rows Row Source Operation
------- ---------------------------------------------------
191 TABLE ACCESS BY INDEX ROWID ALMPOSITION
72052 NESTED LOOPS
71860 TABLE ACCESS FULL RISKEXPOSURE
191 BITMAP CONVERSION TO ROWIDS
191 BITMAP AND
71860 BITMAP CONVERSION FROM ROWIDS
71860 INDEX RANGE SCAN ALMPOSITION_PK (object id 69814)
14777 BITMAP CONVERSION FROM ROWIDS
38804400 INDEX RANGE SCAN ALMEXTRACT_IDX (object id 69815)
14777 BITMAP CONVERSION FROM ROWIDS
33691560 INDEX RANGE SCAN ALMSCENARIO_IDX (object id 69816)




Tom Kyte
October 06, 2008 - 2:31 pm UTC

why would it be better - have you shown this to be absolutely true by testing with hints?

are the estimated row counts in an explain plan correct? or near correct? or way off?

bitmap conversion + index join

A reader, October 07, 2008 - 6:18 am UTC

hi tom,

thanks for your time.

<quote>
are the estimated row counts in an explain plan correct? or near correct? or way off? 
</quote>
yes. they are nearly correct as you see below. we did a compute statistics + cascade on the tables.

SQL> select table_name, num_rows, sample_size from dba_tables where owner = 'CALM_A' and table_name in ('ALMPOSITION', 'RISKEXPOSURE');

TABLE_NAME                       NUM_ROWS SAMPLE_SIZE
------------------------------ ---------- -----------
ALMPOSITION                         75020       75020
RISKEXPOSURE                        71860       71860

SQL> select count(*) from calm_a.almposition;

  COUNT(*)
----------
     80452

Elapsed: 00:00:00.00
SQL> select count(*) from calm_a.riskexposure;

  COUNT(*)
----------
     74476

Elapsed: 00:00:00.02


<quote>
why would it be better - have you shown this to be absolutely true by testing with hints?
</quote>
yes. i tried this with an ordered hint. the plan changed to a NL join using almposition as driving table as expected.

select /*+ ORDERED */ SECTYPE_ID, SCENARIO_ID, RISKFACTOR_ID, VAL, TARGET
from
 ALMPOSITION, RISKEXPOSURE where EXTRACT_ID=5313933 and
  ALMPOSITION.ALMPOSITION_ID=RISKEXPOSURE.ALMPOSITION_ID and SCENARIO_ID=0

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          2          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch       14      0.01       0.00          1         90          0         191
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       16      0.01       0.01          1         92          0         191

Misses in library cache during parse: 1
Optimizer goal: FIRST_ROWS
Parsing user id: 224

Rows     Row Source Operation
-------  ---------------------------------------------------
    191  TABLE ACCESS BY INDEX ROWID OBJ#(69965)
    212   NESTED LOOPS
     20    TABLE ACCESS BY INDEX ROWID OBJ#(69813)
    540     INDEX RANGE SCAN OBJ#(69815) (object id 69815)
    191    INDEX RANGE SCAN OBJ#(80875) (object id 80875)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                      14        0.00          0.00
  SQL*Net message from client                    14        2.70          3.38
  db file sequential read                         1        0.00          0.00


SQL> /

 OBJECT_ID OBJECT_NAME                    OBJECT_TYPE
---------- ------------------------------ ------------------------------
     69813 ALMPOSITION                    TABLE
     69815 ALMEXTRACT_IDX                 INDEX
     69965 RISKEXPOSURE                   TABLE
     80875 SL_ALMPOSITION_ID_IDX22        INDEX

Elapsed: 00:00:00.00



i added a composite index on all columns affected. the index has NOT been used at all while the plan changed to a HASH join?

11:48:55 SQL> /

INDEX_NAME                     COLUMN_NAME                    COLUMN_POSITION
------------------------------ ------------------------------ ---------------
SL_ALMPOSID_EXTID_SCENID_IDX   ALMPOSITION_ID                               1
SL_ALMPOSID_EXTID_SCENID_IDX   EXTRACT_ID                                   2
SL_ALMPOSID_EXTID_SCENID_IDX   SCENARIO_ID                                  3

Elapsed: 00:00:00.00


select SECTYPE_ID, SCENARIO_ID, RISKFACTOR_ID, VAL, TARGET
from
 ALMPOSITION, RISKEXPOSURE where EXTRACT_ID=5313933 and
  ALMPOSITION.ALMPOSITION_ID=RISKEXPOSURE.ALMPOSITION_ID and SCENARIO_ID=0

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          2          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch       14      0.05       0.06        353        381          0         191
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       16      0.05       0.07        353        383          0         191

Misses in library cache during parse: 1
Optimizer goal: FIRST_ROWS
Parsing user id: 224

Rows     Row Source Operation
-------  ---------------------------------------------------
    191  HASH JOIN
     20   TABLE ACCESS BY INDEX ROWID OBJ#(69813)
    540    INDEX RANGE SCAN OBJ#(69815) (object id 69815)
  74476   TABLE ACCESS FULL OBJ#(69965)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                      14        0.00          0.00
  db file scattered read                         22        0.00          0.00
  SQL*Net message from client                    14        1.68          2.39

Q1: any idea on this (hash join not using my composite index)?

Q2: how to eleminate this bitmap conversion + index join without changing the code or indexes? any change on this?


Tom Kyte
October 08, 2008 - 9:14 pm UTC

... they are nearly correct as you see below. ....

how can we see that? You don't have a single explain plan with the estimates - you only have row source operations which show what actually happened.

More to Explore

PL/SQL demos

Check out more PL/SQL tutorials on our LiveSQL tool.

PL/SQL docs

PL/SQL reference manual from the Oracle documentation library