Skip to Main Content

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, Jan.

Asked: January 10, 2005 - 9:02 pm UTC

Last updated: January 11, 2005 - 1:48 pm UTC

Version: 8.1.7

Viewed 10K+ times! This question is

You Asked

I have a table with 4 columns which would uniquely identify a row. For each row, two of these columns will always be populated. The other two columns will have one populated and the other will be null. Which of those two populated columns will vary for each row.

For instance:

I have a table:
create table t (a number NOT NULL, b varchar2(10) NOT NULL, c number, d number, e varchar2(100));

with this data in it:
insert into t values (1,'a',1,null,'hello');
insert into t values (2,'a',null,2,'world');
insert into t values (2,'a',2,null,'hi there');

Columns a and b are always populated. Columns c and d will have one null and one populated. Column e may or may not be populated.

I would like to have a primary key on my table. The logical columns to include would be columns a,b,c and d. However, one of those will always be null. What is a good approach to this dilema? Would the only solution be to have a sequence generated value included in the primary key and then build a unique index on the combination a,b,c,d?

Jan

and Tom said...

there really is a "good approach", since a primary key implies "not null" attributes.

It would seem the best you can achieve here is a UNIQUE constraint on a,b,c,d. UNIQUE constraints, very similar in many respects to a primary key, permit nullable attributes. They can be pointed to by foreign keys as well.


The only other alternatives I see would be

a) to choose a single constant value that means "nothing here" for C and D. eg, -1 for C, '*' for D. That way they are not null. Not a fan of that one.

b) to add a column "ID", id is the primary key, id is populated via a sequence and a,b,c,d have a UNIQUE constraint (not index, a unique constraint) on them. This is what I would do IF i needed a "true" primary key for some reason.

Rating

  (2 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Another Suggestion

Al Davis, January 11, 2005 - 7:47 am UTC


Since c and d are both number data type and one is null when the other is not, why not combine them and call the column c_d and add an indicator column called 'c_or_d' to the index and to the application SQL.

Tom Kyte
January 11, 2005 - 10:43 am UTC

I am personally not a fan of using a column for two things -- just my 2 cents. It'll cause massive confusion down stream.

Design flaw?

Kevin, January 11, 2005 - 12:14 pm UTC

This looks like a case where two different relationships are being crammed into a single table. R1 is a relation (a,b,c,e) and R2 is a relation (a,b,d,e). Consider using two tables instead (short hand):

CREATE TABLE T1 (a NOT NULL, b NOT NULL, c NOT NULL, e, PK ON (a,b,c));
CREATE TABLE T2 (a NOT NULL, b NOT NULL, d NOT NULL, e, PK ON (a,b,d));

in many cases, you may find this approach (using table structures which more closely match your data) actually simplifies your application design. You don't have to worry about the convoluted "primary keys with nulls" workarounds - you've got pure primary keys right there. The constraint "only one of c and d is populated" is implemented by definition.

And if you're worried about queries out there which need to be backwards compatible, just add a view:
DROP TABLE OLD_TABLE_NAME;
CREATE VIEW OLD_TABLE_NAME AS
SELECT A, B, C, NULL D, E FROM T1 UNION ALL
SELECT A, B, NULL C, D, E FROM T2;

("INSTEAD OF INSERT/UPDATE/DELETE" Triggers can be used for backwards compatibility, if OLD_TABLE_NAME was being modified by other processes).


The only major problem is if a foreign key needed to reference all four columns (a, b, c, d)... honestly, I doubt any such dependency SHOULD exist - much more likely that any such dependency should be on (a,b,c) OR on (a,b,d).

Tom Kyte
January 11, 2005 - 1:48 pm UTC

you are probably right -- and there is probably even a missing table :)

t (a,b,e,..... )

t1( a,b,c, m,n, foreign key(a,b) points to t)
t2( a,b,d, x,y, foreign key(a,b) points to t)