explicit Foreign Key constraints in DW

Question and Answer
Connor McDonald

Thanks for the question, Stephen.

Asked: February 12, 2013 - 11:39 pm UTC

Last updated: August 27, 2016 - 2:13 am UTC

Version: 11.2

Viewed 10K+ times

You Asked

Hi Tom,
In all three of the data warehouse projects I have worked on, we never explicitly created any foreign key constraints on the fact tables. When I ask for the reason, the standard answer I get is: "It will impact performance, and no one uses FK constraints in data warehouse projects."

But I had read in one of your articles that the more information we provide to the CBO, the better the performance will be.

So I would like to know what strategy you would use were you to design a data warehouse. Would you explicitly create FK constraints or not? It would also be great if you could elaborate on the reasons in detail.


Regards,
Stephen.

and Tom said...

constraints - all of them - are *vital* to GOOD database performance!

without primary key, foreign key, not null, and check constraints the optimizer is missing vital information, information it needs to be able to optimize your queries.

read:
http://www.oracle.com/technetwork/issue-archive/2009/09-mar/o29asktom-098473.html

"Wide Load Storage"

and observe the foreign keys in there (rely disable novalidate).

See also:

http://asktom.oracle.com/Misc/constraints-have-bad-name.html
http://asktom.oracle.com/Misc/stuck-in-rut.html


and get those bits of metadata in the database!
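The pattern from that article can be sketched roughly as follows (the table and constraint names here are illustrative, not taken from the article):

```sql
-- Dimension table: a normal, enforced primary key
ALTER TABLE dim_product
  ADD CONSTRAINT dim_product_pk PRIMARY KEY (product_id);

-- Fact table: the foreign key is declared but not enforced.
-- DISABLE NOVALIDATE means no locking, index maintenance, or
-- validation cost during the load; RELY tells the optimizer it
-- may trust the relationship (e.g. for join elimination and for
-- query rewrite when QUERY_REWRITE_INTEGRITY = TRUSTED).
ALTER TABLE fact_sales
  ADD CONSTRAINT fact_sales_product_fk
  FOREIGN KEY (product_id) REFERENCES dim_product (product_id)
  RELY DISABLE NOVALIDATE;
```

The metadata lands in the dictionary for the optimizer to use, while the ETL process remains responsible for guaranteeing the data really is clean.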


Comments

Thanks....

Stephen, February 13, 2013 - 9:28 am UTC

Thanks a lot Tom... Now I can always point to this post and now everyone will agree :)

Regards,
Stephen
Tom Kyte
February 13, 2013 - 9:29 am UTC

and now everyone will
agree :)


I wish :) On what planet will that happen??

need clarity

Mani, February 13, 2013 - 9:41 am UTC

While I completely agree with your comment about having constraints right in the database, large packaged ERPs like Oracle's own E-Business Suite have not implemented these constraints in their design, despite their architecture diagrams showing that primary and foreign key relationships do exist.

Any idea why this is so?
Tom Kyte
February 13, 2013 - 11:17 am UTC

unfortunately, when you get employed at Oracle - there is no magic cap they can put on your head to instantly download how to be a database engineer. they are programmers and developers just like you and me. And many of these apps were initially developed many years ago. Changing them in a big way is like turning an aircraft carrier around.

I can say that they have constraints defined as metadata in current releases of many of the applications these days.


Chuck Jolley, February 13, 2013 - 10:06 am UTC

Maybe you are looking in the wrong place.

reports@test>select count(*)
2 from all_constraints
3 where constraint_type in ('R', 'P');

COUNT(*)
----------
5311

Index on the primary key constraint column in a warehouse

Pedro, August 16, 2016 - 8:24 pm UTC

Hello,

In the article
http://www.oracle.com/technetwork/issue-archive/2009/09-mar/o29asktom-098473.html
the primary key on T2 is set up with:

alter table t2 add constraint t2_pk primary key(object_id) rely;

I am operating in a warehouse environment. I need to let the optimizer know that object_id is unique, and to optimize the performance of both queries and the data load. To me, that means creating an index on object_id which won't be dropped and recreated, but instead made unusable and rebuilt during the load. What would you say about this:

CREATE INDEX T2_pk ON T2 (object_id);
ALTER TABLE t2 ADD CONSTRAINT t2_pk PRIMARY KEY(object_id) RELY;
ALTER SESSION SET skip_unusable_indexes = TRUE;
INSERT INTO T2 VALUES (-1); /* this is the load */
ALTER INDEX T2_pk REBUILD;
ALTER SESSION SET skip_unusable_indexes = FALSE;

Thank you in advance,

Connor McDonald
August 17, 2016 - 1:33 am UTC

I think you might struggle here.

If you *don't* set the index to unusable, then the constraint is still enforced:

SQL> create table t  ( x int, y int );

Table created.

SQL>
SQL> CREATE INDEX T_pk ON T (x);

Index created.

SQL> alter table t add constraint t_pk primary key(x) rely;

Table altered.

SQL>
SQL> ALTER SESSION SET skip_unusable_indexes = TRUE;

Session altered.

SQL> INSERT INTO T VALUES (-1,-1);

1 row created.

SQL> INSERT INTO T VALUES (-1,-1);
INSERT INTO T VALUES (-1,-1)
*
ERROR at line 1:
ORA-00001: unique constraint (MCDONAC.T_PK) violated



If I *do* make the index unusable, then I'll get issues on insert

SQL> drop table t purge;

Table dropped.

SQL>
SQL> create table t  ( x int, y int );

Table created.

SQL>
SQL> CREATE INDEX T_pk ON T (x);

Index created.

SQL> alter table t add constraint t_pk primary key(x) rely;

Table altered.

SQL>
SQL> ALTER SESSION SET skip_unusable_indexes = TRUE;

Session altered.

SQL> alter index t_pk unusable;

Index altered.

SQL>
SQL> INSERT INTO T VALUES (-1,-1);
INSERT INTO T VALUES (-1,-1)
*
ERROR at line 1:
ORA-01502: index 'MCDONAC.T_PK' or partition of such index is in unusable state


SQL> INSERT INTO T VALUES (-1,-1);
INSERT INTO T VALUES (-1,-1)
*
ERROR at line 1:
ORA-01502: index 'MCDONAC.T_PK' or partition of such index is in unusable state


SQL>


You would need to disable the constraint.

Index on the primary key constraint column in a warehouse

Pedro, August 17, 2016 - 3:05 pm UTC

Appreciate the faster-than-light response.
You are indeed correct about disabling the primary key, since this:
DROP TABLE t PURGE;
CREATE TABLE t ( x int, y int );
CREATE INDEX t_pk ON t (x);
ALTER TABLE t ADD CONSTRAINT t_pk PRIMARY KEY(x) RELY DISABLE NOVALIDATE;
ALTER SESSION SET skip_unusable_indexes = TRUE;
ALTER INDEX t_pk UNUSABLE;
INSERT INTO T VALUES (-1,-1);
ALTER INDEX t_pk REBUILD;
works just fine.
My questions are:
1. Is the UNUSABLE/REBUILD approach the most efficient way of index maintenance in a warehouse environment, for both bitmap and b-tree indexes (as opposed to DROP/CREATE)?
2. Does the optimizer treat a primary key constraint that is RELY DISABLE NOVALIDATE exactly the same as one that is NORELY ENABLE VALIDATE?
3. If I understood https://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:7641143144618 correctly, there is no difference for the optimizer whether the primary key is supported by a unique or a non-unique index?
Thank you very much again.
Chris Saxon
August 19, 2016 - 2:59 am UTC

1) They are pretty much the same. I prefer "UNUSABLE" because it keeps the dictionary definition. For example, if your index was (say) pctfree 2, compress 1, then you might forget to redo that when you drop/recreate.

2) That is my understanding. But obviously, if you tell the database to rely on something... it had better be correct. There's a nice blog post on that here:

http://asktom.oracle.com/Misc/stuck-in-rut.html

3) There are some small differences that might come into play in high-frequency transaction environments, and the non-unique index will be a tiny bit bigger. If you are planning on marking it unusable and disabling the constraint, then you may as well have a unique one.

Having said all of that, ask yourself if your approach is practical in the long run. As your data gets larger and larger, that rebuild time will just keep growing. It might be worth looking at alternative options (partitioning, etc.)
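Point 1 above can be illustrated with a hypothetical index carrying non-default attributes:

```sql
CREATE INDEX t_ix ON t (x) PCTFREE 2 COMPRESS 1;

-- UNUSABLE/REBUILD keeps the dictionary definition:
-- the rebuilt index is still PCTFREE 2 COMPRESS 1
ALTER INDEX t_ix UNUSABLE;
-- ... bulk load here ...
ALTER INDEX t_ix REBUILD;

-- DROP/CREATE requires repeating the attributes by hand;
-- forget them and you silently get the defaults instead
DROP INDEX t_ix;
CREATE INDEX t_ix ON t (x);
```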


Index on the primary key constraint column in a warehouse

Pedro, August 19, 2016 - 8:22 pm UTC

Thank you, for the second time. Is there anything I can do to repay you?
I guess there is no running away from telling the whole story.
I have a star schema (fact/dimension) type warehouse on RAC. The load happens once a day, and as of right now all the indexes and constraints (foreign, primary and unique) are dropped and recreated (the DROP/CREATE approach).
The issue that started this whole fiasco is as follows: session 1 runs a SELECT statement; in session 2, foreign keys are created on a fact table used by session 1; session 2 then fails with "ORA-04020: deadlock detected while trying to lock object" on a dimension table, not on the fact table to which the DDL is applied.
So, a nifty solution to the ORA-04020 would be to replace the current DROP/CREATE approach with UNUSABLE/REBUILD.
Back to my questions:
1. I thought that REBUILD would use the existing UNUSABLE index and so be more efficient than a straight CREATE INDEX?
2. I will have to develop the code to check the data so the constraints can rely on it. No problem.
3. Believe me, I tried to make a unique index unusable; for reasons that I don't quite understand, there is no way to do so. Thus I am compelled to use a non-unique index with the primary/unique constraints, and have to bother you with this question.
4. "As your data gets larger and larger, that rebuild time will just keep growing. It might be worth looking at alternative options (partitioning etc)." Are you saying to get rid of the indexes and use table partitioning? The fact tables are already partitioned, and the indexes are mandatory for the star schema optimizer plan.

Connor McDonald
August 20, 2016 - 3:23 am UTC

"Is there anything I can do to repay you?"

Beer works well :-)

Re 4: Yes, rebuilding dimensions in their entirety is not uncommon (or a merge). I was worried you were rebuilding your *fact* table indexes from scratch

Index on the primary key constraint column in a warehouse

Pedro, August 20, 2016 - 2:58 am UTC

I apologize for a mistake I made above in #3; it should read:
3. Believe me, I tried to make a unique index unusable and INSERT a record (as demonstrated in https://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:8357645069092 ); for reasons that I don't quite understand, there is no way to do so. Thus I am compelled to use a non-unique index (which, after being made unusable, will allow an INSERT to follow) with the primary/unique constraints, and have to bother you with this non-unique index question.

Connor McDonald
August 20, 2016 - 3:23 am UTC

That's right ... an unusable unique index effectively renders the table unusable :-)

Index on the primary key constraint column in a warehouse

Pedro, August 24, 2016 - 2:48 pm UTC

Thank you for spending so much time on this.
How would you like the beer to be delivered?
I think I am getting there; I am just a bit confused.
3. Assuming the surrogate key on the dimension table is used by the foreign key from the fact table, which solution is better?
3.1. The primary key constraint on the dimension table does not use any index on the surrogate key.
There is a separate unique index on the surrogate key on the dimension table.
Since the index is unique, only the DROP INDEX/CREATE INDEX approach can be used.
3.2. The primary key constraint on the dimension table uses a non-unique index on the surrogate key.
Since the table of interest is a dimension table, there is little difference between the DROP INDEX/CREATE INDEX and UNUSABLE/REBUILD approaches; either can be used.
4. You were rightly worried: I am dropping and creating all indexes on our partitioned fact tables (local, bitmap and b-tree, both those used by constraints and those not).
Since the fact tables are large, should I be using the UNUSABLE/REBUILD approach for the fact table indexes?
Any other thoughts?

Chris Saxon
August 25, 2016 - 3:06 am UTC

The following is a common strategy I've seen:

dimensions:
- primary keys as per normal (enabled, validated)
- other unique keys as per normal (enabled, validated)

fact table
- foreign keys back to dimension (primary keys) defined as "rely" (ie, you are guaranteeing that the relationship is valid, even though the database itself has not done the work to validate the data)

data population (assuming full recreate of dimensions)

a- disable fact foreign keys
b- disable constraints, set indexes unusable (if needed for load performance) on dimension tables
c- load dimension data
d- rebuild indexes, re-enable constraints (ie validate) on dimension tables

e- load new fact data into staging tables
f- use staging tables to validate fk data against dimension constraints (ie running plain SQL)
g- load fact data (eg by partition exchange or other means)
h- re-enable fact foreign keys (as rely, novalidate, because from 'f' you have confidence the data is valid)

Hope this helps
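A minimal sketch of steps a-h, using hypothetical names (a fact_sales/dim_product pair with staging tables; the constraint and index names are illustrative):

```sql
-- a) disable the fact foreign keys
ALTER TABLE fact_sales MODIFY CONSTRAINT fact_sales_product_fk DISABLE;

-- b) disable constraints / mark indexes unusable on the dimension
ALTER TABLE dim_product MODIFY CONSTRAINT dim_product_pk DISABLE;
ALTER INDEX dim_product_pk_ix UNUSABLE;

-- c) load the dimension
INSERT /*+ APPEND */ INTO dim_product
SELECT product_id, product_name FROM dim_product_stage;

-- d) rebuild the index, then re-validate on the dimension
ALTER INDEX dim_product_pk_ix REBUILD;
ALTER TABLE dim_product MODIFY CONSTRAINT dim_product_pk ENABLE VALIDATE;

-- e/f) load the new fact rows into a staging table, then check the
--      FK column with plain SQL: this count should come back 0
SELECT COUNT(*)
FROM   fact_stage s
WHERE  NOT EXISTS (SELECT NULL FROM dim_product d
                   WHERE  d.product_id = s.product_id);

-- g) load the fact data (partition exchange, insert-append, ...)

-- h) re-enable the fact FK without re-checking the data
ALTER TABLE fact_sales MODIFY CONSTRAINT fact_sales_product_fk
  RELY ENABLE NOVALIDATE;
```

Because step f has already proven the staged data clean, RELY ENABLE NOVALIDATE in step h avoids a full re-validation scan while still giving the optimizer the metadata.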

Index on the primary key constraint column in a warehouse

Pedro, August 26, 2016 - 8:46 pm UTC

You hope this helps? Your last update is exactly what I have been looking for these last couple of weeks.
But the devil is in the details. I used your last update, http://www.oracle.com/technetwork/issue-archive/2009/09-mar/o29asktom-098473.html and https://community.oracle.com/thread/3552130 to devise the sample code below, covering all the aspects of my DW operations. What do you say?

--BEG setup--------------------------------------------------------------------------------------------
DROP TABLE zzz_fact;
DROP TABLE zzz_dimension;

CREATE TABLE zzz_fact (fact_id INTEGER NOT NULL, dimension_key INTEGER);

CREATE TABLE zzz_dimension (dimension_key INTEGER NOT NULL);

CREATE INDEX zzz_fact_i01 ON zzz_fact (dimension_key);

--based on https://community.oracle.com/thread/3552130 => Using non-unique indexes to support unique constraints changes the cost of queries and the volume of redo generated
CREATE UNIQUE INDEX zzz_dimension_pk ON zzz_dimension (dimension_key);
--based on http://www.oracle.com/technetwork/issue-archive/2009/09-mar/o29asktom-098473.html => this allows us to skip a) and f)
ALTER TABLE zzz_dimension ADD CONSTRAINT zzz_dimension_pk PRIMARY KEY (dimension_key) RELY ENABLE VALIDATE;
ALTER TABLE zzz_fact ADD CONSTRAINT zzz_fact_f01 FOREIGN KEY(dimension_key) REFERENCES zzz_dimension (dimension_key) RELY DISABLE NOVALIDATE;

--based on https://community.oracle.com/thread/3552130 => Using non-unique indexes to support unique constraints changes the cost of queries and the volume of redo generated
CREATE UNIQUE INDEX zzz_fact_pk ON zzz_fact (fact_id);
ALTER TABLE zzz_fact ADD CONSTRAINT zzz_fact_pk PRIMARY KEY (fact_id);
--END setup--------------------------------------------------------------------------------------------

--here the users are querying and everybody is happy including the optimizer----------------

--BEG data population----------------------------------------------------------------------------------
--based on https://community.oracle.com/thread/3552130 => Using non-unique indexes to support unique constraints changes the cost of queries and the volume of redo generated
--in conjunction with the fact that it is not possible to make a UNIQUE index UNUSABLE
ALTER TABLE zzz_dimension MODIFY CONSTRAINT zzz_dimension_pk DISABLE;
DROP INDEX zzz_dimension_pk;
ALTER TABLE zzz_fact      DROP PRIMARY KEY DROP INDEX;

ALTER INDEX zzz_fact_i01 UNUSABLE;
ALTER SESSION SET skip_unusable_indexes = TRUE;
INSERT INTO zzz_dimension VALUES (1);
INSERT INTO zzz_fact VALUES (1, 1);
ALTER SESSION SET skip_unusable_indexes = FALSE;
ALTER INDEX zzz_fact_i01 REBUILD;

CREATE UNIQUE INDEX zzz_fact_pk ON zzz_fact (fact_id);
ALTER TABLE zzz_fact ADD CONSTRAINT zzz_fact_pk PRIMARY KEY (fact_id);

CREATE UNIQUE INDEX zzz_dimension_pk ON zzz_dimension (dimension_key);
ALTER TABLE zzz_dimension MODIFY CONSTRAINT zzz_dimension_pk USING INDEX zzz_dimension_pk ENABLE VALIDATE;

--validate the foreign key zzz_fact_f01-----------------
SELECT COUNT(*)
FROM
   zzz_fact      F,
   zzz_dimension D
WHERE
       F.dimension_key = D.dimension_key(+)
   AND D.dimension_key IS NULL
;
--END data population----------------------------------------------------------------------------------

--here the users are querying and everybody is happy including the optimizer----------------


Thank you very much for your time. As I already mentioned, without your help I would have been lost like a little child.

Connor McDonald
August 27, 2016 - 2:13 am UTC

Glad we could help :-)
