This thread brought memories of a discussion we had regarding the implementation of a DAY dimension
</code>
http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:4632159445946#21994567492814 <code>
where I said:
<quote>My worry (justified or not
maybe you can comment on it) is
that, by choosing to use the FK DATE from the fact table, the action of the end-users may preclude the usage of some optimization techniques (the usage of the index, the applicability of the star transformation with bitmap indexes, etc.).</quote>
Essentially, I was (still am) apprehensive of DATE manipulations in the fact table especially when it comes to end-users or BI tools.
I left that discussion with
<quote>Ill try to see if I can materialize some technical (contra)arguments for <Tom>I don't see how querying directly against the attributes of the fact table would preclude anything</Tom></quote>
Having read Jonathans answer to this thread
<quote>Your iteration through dates can't work because Oracle cannot detect that your data is 'date-only'
</quote> (that is, DAY-only) I had my own moment of hand-smacking-forehead.
If this BIG_TABLE were a fact table with C3 the migrated FK from some DAY dimension table then the optimizer would have better insight into that C3-related predicate
hence it should/may use the nice BITMAP AND on all 3 bitmap indexes (without having to be nudged along with the hint and the [expert] query re-write).
Just expanding on the table/data presented here
flip@FLOP> create table day_dim
2 ( id number(9) not null
3 ,dy date not null
4 ,constraint daypk primary key (id)
5 ) organization index
6 ;
Table created.
Elapsed: 00:00:00.00
flip@FLOP> insert into day_dim
2 select rownum as id, c3 as dy
3 from (select distinct c3 from big_table)
4 ;
272 rows created.
Elapsed: 00:00:00.00
flip@FLOP> exec dbms_stats.gather_table_stats(ownname=> USER,tabname=>'DAY_DIM',method_opt=>'FOR ALL INDEXES FOR ALL COLUMNS SIZE 225' ,degree=>1,cascade=>FALSE);
PL/SQL procedure successfully completed.
Elapsed: 00:00:01.02
flip@FLOP> create table big_fact
2 (
3 c1 number not null,
4 c2 number not null,
5 c3 number not null references day_dim(id),
6 c4 number
7 );
Table created.
Elapsed: 00:00:00.00
flip@FLOP>
flip@FLOP> insert into big_fact
2 select t.c1, t.c2, d.id, t.c4
3 from big_table t, day_dim d
4 where t.c3 = d.dy
5 ;
135745 rows created.
Elapsed: 00:00:01.04
flip@FLOP> create bitmap index c1_bfidx on big_fact (c1) compute statistics;
Index created.
Elapsed: 00:00:00.01
flip@FLOP> create bitmap index c2_bfidx on big_fact (c2) compute statistics;
Index created.
Elapsed: 00:00:00.01
flip@FLOP> create bitmap index c3_bfidx on big_fact (c3) compute statistics;
Index created.
Elapsed: 00:00:00.01
flip@FLOP>
flip@FLOP> exec dbms_stats.gather_table_stats(ownname=> USER,tabname=>'BIG_FACT',method_opt=>'FOR ALL INDEXES FOR ALL COLUMNS SIZE 225' ,degree=>1,cascade=>FALSE);
PL/SQL procedure successfully completed.
Elapsed: 00:00:02.04
flip@FLOP> explain plan set statement_id = 'p3' for
2 select d.dy, count(*)
3 from big_fact b, day_dim d
4 where 1=1
5 and b.c3 = d.id
6 and b.c1 = 1
7 and b.c2 = 10
8 and d.dy > to_date('04-MAY-2005','DD-MON-YYYY')
9 and d.dy <= to_date('07-MAY-2005','DD-MON-YYYY')
10 group by d.dy;
Explained.
Elapsed: 00:00:00.00
flip@FLOP> select * from table(dbms_xplan.display('PLAN_TABLE','p3'))
2 ;
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------
----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 42 | 15 (20)|
| 1 | SORT GROUP BY | | 2 | 42 | 15 (20)|
| 2 | NESTED LOOPS | | 36 | 756 | 13 (24)|
|* 3 | INDEX FULL SCAN | DAYPK | 2 | 22 | 1 (0)|
| 4 | BITMAP CONVERSION TO ROWIDS| | | | |
| 5 | BITMAP AND | | | | |
|* 6 | BITMAP INDEX SINGLE VALUE| C3_BFIDX | | | |
|* 7 | BITMAP INDEX SINGLE VALUE| C1_BFIDX | | | |
|* 8 | BITMAP INDEX SINGLE VALUE| C2_BFIDX | | | |
----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("D"."DY">TO_DATE('2005-05-04 00:00:00', 'yyyy-mm-dd hh24:mi:ss')
AND "D"."DY"<=TO_DATE('2005-05-07 00:00:00', 'yyyy-mm-dd hh24:mi:
ss'))
6 - access("B"."C3"="D"."ID")
7 - access("B"."C1"=1)
8 - access("B"."C2"=10)
24 rows selected.
Elapsed: 00:00:00.00
And just to contrast, here is the plan I get for the original query against BIG_TABLE:
flip@FLOP> explain plan set statement_id = 'p1' for
2 select c3, count(*)
3 from big_table b
4 where 1=1
5 and c1 = 1
6 and c2 = 10
7 and c3 > to_date('04-MAY-2005','DD-MON-YYYY')
8 and c3 <= to_date('07-MAY-2005','DD-MON-YYYY')
9 GROUP BY c3;
Explained.
Elapsed: 00:00:00.00
flip@FLOP> select * from table(dbms_xplan.display('PLAN_TABLE','p1'))
2 ;
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------
----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 14 | 10 (30)|
| 1 | SORT GROUP BY | | 1 | 14 | 10 (30)|
| 2 | TABLE ACCESS BY INDEX ROWID | BIG_TABLE | 24 | 336 | 8 (38)|
| 3 | BITMAP CONVERSION TO ROWIDS| | | | |
| 4 | BITMAP AND | | | | |
|* 5 | BITMAP INDEX SINGLE VALUE| C1_BMIDX | | | |
|* 6 | BITMAP INDEX SINGLE VALUE| C2_BMIDX | | | |
| 7 | BITMAP MERGE | | | | |
|* 8 | BITMAP INDEX RANGE SCAN | C3_BMIDX | | | |
----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - access("B"."C1"=1)
6 - access("B"."C2"=10)
8 - access("B"."C3">TO_DATE('2005-05-04 00:00:00', 'yyyy-mm-dd hh24:mi:ss')
AND "B"."C3"<=TO_DATE('2005-05-07 00:00:00', 'yyyy-mm-dd hh24:mi:
ss'))
23 rows selected.
Elapsed: 00:00:00.00
Now, Im not suggesting one plan is better than the other
only that by always having a TIME (DAY in this case) dimension table seems to give the CBO better insight. Of course, the PK on DAY_DIM couldve been the DATE column
probably the same effect
the notable exception would be that the DATE PK would get migrated into the fact and one would again be tempted to use the DATE from there (what would be the point of joining to the time dimension?
would likely be the explanation).
Im curious if my hand-smacking-forehead has only produced a
sore forehead?
Cheers.