Skip to Main Content

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, Pramod.

Asked: January 25, 2001 - 11:04 am UTC

Last updated: April 15, 2010 - 2:12 pm UTC

Version: 8.1.5

Viewed 10K+ times! This question is

You Asked

Sometimes a column with "date" Datatype needs to be part of Primary Key or Unique Key. While retrieveing result sets in a SQL Query, we have to use TRUNC(date column) for date comparision or ( Selecting rows greater or less than of database field with TRUNC(date column)). I assume if I use TRUNC function, optimizer may not be able to use Indexes for retrieving result sets.

How do we handle this type of situation ? (Creating Function Based Indexes ??? or Set NLS_DATE_FORMAT ???)


and Tom said...



where date_field between to_date( '01-jan-2000' ) and
to_date( '01-jan-2000' ) + 1-1/24/60/60

is the same as:

where trunc(date_field) = to_date( '01-jan-2000' )

That would be my preferred way to get all records for a given DAY.

Rating

  (15 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Useful Info....More on this

Kishan, January 27, 2002 - 8:15 pm UTC

Hi Tom:

Is there any disadvantages using DATE datatype as part of Primary Key. I got this information from one of my colleagues, taken from the net, source of which I don't know:

<quote>
Are there any problems if I select a date column as a primary key for an Oracle 8i database?

There are many problems when having a date field as a primary key:
a) Exact fetch is practically impossible to do.
b) Referential integrity is very, very hard to accomplish.
c) In most cases you will have to do a convert function on the column and that will eliminate the use of an index.

What I suggest is to:

First, if you can, do not choose a date; choose number(8) or number(12) and when you update the field,
use to_number(sysdate,'yyyymmdd')

Second, to enforce a foreign key, when you update the father in a transaction, save the date/time entered in a temporary variable, and when updating the child, use the same variable.

If you must carry with you the info about date and time, use another column number(4) with the hh and mm value.

Hope that's a help.
</quote>

How far is this reasoning valid? Or is this one of the Great Myths of Database Design? If indeed its so, any suggestions as to how we can best design the database?

Thanks for your time.

Take care,
Kishan.

Tom Kyte
January 28, 2002 - 7:58 am UTC

Well, my followup to their comments are:

a) exact fetch is practically impossible to do.... How so, how is the number:

20020128075256

any harder to use then a number generated from a sequence? I mean, a date is just a value, a number is just a value, a value is a value -- it is not ANY harder using a date then a number


b) referential integrity is practically impossible to do... How so, how is it harder then using a sequence? Its just a VALUE -- its not magic, it doesn't morph into some other value, its not liquid -- it is a stable, constant, never changing value -- just like "sequence.nextval"


c) In most cases you will have to do a convert function ... How so? Only if you write really really bad code. I cannot imagine this being an issue at all



If the date is to be part of the primary key -- so be it, it is not any different from any other datatype. It is not hard to use, it is just like using a number. I view it as NO different whatsoever.

Now -- back to the original question "if i select a date column as a primary key".... I do have issues with a DATE being the primary key. Dates only go down to the second in Oracle8i and before. What happens when two people insert at about the same time? You'll get the same date. It is my opinion that a primary key consisting of a single DATE column is an incorrect design. You should use a sequence instead.



Thank You Very much....

Kishan, January 29, 2002 - 7:20 pm UTC

Thank you very much. I reckoned that if a date column is not supposed to be used as PK, then Oracle would certainly not allow it. But a word from an authority, it surely helps.. :-) Thanks..

why between ?

A reader, September 30, 2002 - 4:23 pm UTC

Hi tom,

why would you use between operator over
date_field >= to_date( '01-jan-2000' )
and date_field <= to_date( '01-jan-2000' ) + 1-1/24/60/60


Thanks,

Tom Kyte
October 01, 2002 - 9:36 am UTC

less keystrokes?

reads better?

means more semantically (it is one standalone predicate compared to your two disjoint ones)

will the performance be same ?

A reader, October 01, 2002 - 4:19 pm UTC

Hi tom,

thanks, for answring the question.
will the performance be same using between and
two disjoint stmts ?
also that is primary key attributes
and about 1 mil records ( will it use index ?)


Thanks,

Tom Kyte
October 02, 2002 - 9:56 am UTC

they will be about the same. they do the same thing.

A reader, March 29, 2004 - 9:18 am UTC

Hi Tom,

I think the point being made by Kishan's comment above is that if you needed to use just the date (and NOT the time component of the date) as part of a primary key, then you're hosed since every insert with a 1-second interval will make it into the table. To demonstrate what I mean:

create table t2 (x varchar2 (1), y int, z date, constraint
pk_t2 primary key (y, z) )
/

insert into t2 values ('a', 111, sysdate )
/
insert into t2 values ('a', 111, sysdate )
/
insert into t2 values ('a', 111, sysdate )
/
insert into t2 values ('a', 111, sysdate )
/

All of the above inserts, assuming a one-second gap, will make it into the table, whereas the requirement might have been to fail all but the first insert in my example above. I've run into this situation a few times, and some database designers and modelers have suggested excluding dates from the primary key for such a requirement. And then the only avenue is to use a formatted string that includes just the date component, such as "to_char (sysdate, 'MMDDYYYY')", as part of the primary key, instead of the date. I hope I've been able to explain myself... I think this is what was being attempted to be explained in the comments above. Any thoughts? Thanks.

Tom Kyte
March 29, 2004 - 10:40 am UTC

but that has nothing to do with the expressed concerns above.

IF you are using SYSDATE as a primary key value
AND you are inserting more than one record per second
BY DEFINITION you cannot use SYSDATE as the primary key value


Is there anything wrong with using a column of type DATE as a primary key? NO.

Can there be a logical problem with it in your application? YES. but the same is true of numbers and strings. not really datatype based -- but rather data VALUE based.

Potential Logical Issue with Date in PK...

Jason Cloninger, August 11, 2004 - 2:23 pm UTC

Firstly, thank you for your knowledge and your perspective on our issues. (patience as well...)

I am presently in debate with a colleague and want your opinion. Presently, we have a table:

LOCATION_ID VARCHAR2(10)
LOCATION_NAME VARCHAR2(55)
FISCAL_YEAR_ID VARCHAR2(4)
REPORT_ID VARCHAR2(5)

The customer wants to alter the FISCAL_YEAR_ID,(example '2004'), to MANIFEST_CREATE_DATE DATE. My colleague is supporting this change, but I disagree. Before, the report identifiers were character representations of a sequence, that would reset at the end of the fiscal year. The customer has not expressed an interest in deviating from this method of fiscal identifier generation. With altering the identifier to a date, you now have three logical breaks instead of one. (ie. DAY, MONTH, and YEAR) The date field is a value that has three parts (six if you use time information). This allows the customer to accidentally create duplicate report identifiers in the same fiscal year because it was entered on the following day. There is no way to guarantee the consistency of and quality of your data in the table any longer. Now, the report identifier can accidentally be re-generated daily and not violate the PK constraint. The argument of the customer was that it would give them more flexibility with their data down the line. Why can't this be accomplished with a creation date field below the line with an index built upon it? So, with this being said, isn't a date (DD-MON-YYYY) actually 3 values that are analyzed separately and then placed as one value in the field? Shouldn't it be said that a date datatype cannot be used as a single actual value in a table, but is actually a concatenation of different measurements that is only displayed as one value? Also, please explain the difference between having a four digit character representation of a fiscal year and having a date in this primary key. (Thank you in advance)

Tom Kyte
August 12, 2004 - 7:36 am UTC

sorry -- not following the issue here. i don't even see a primary key

a date field has but one part -- it is just 7 bytes. You can attribute "meaning" to different bytes (one means year, one hour and so on) but at the end of the day -- a date is a date, just like a number is a number.

But I really don't know how this "field is used".

Possible Logical Issue with Date in Primary Key...

Jason Cloninger, August 12, 2004 - 10:26 am UTC

I apologize Tom, I only listed the Primary Key attributes for my example to avoid confusion... The four attributes above make up the primary key for the table. Location Identifier is an abstract identifier given to a specific geophysical place, the Location Name is necessary because a specific geophysical location may be identified by different names (i.e. saint louis regional airport has a military and a civilian portion that are both uniquely identified by the FAA, even though it is the same geophysical location). Fiscal Year Identifier is self explanatory as a four digit representation of a fiscal year (i.e. '2004'). Report Identifier is a sequence that is generated for each new report generated in a fiscal period, and is set to zero each year on October 1st.
The seven byte makeup of the data contained in a date column is understood. As I understand, it is binary string whose value identifies a specific day, month, and year. What I am trying to resolve is this: 1) Oracle sees this binary value and interprets its uniqueness of these three measures: year, month, and day. If these three interpreted values are conceptually analyzed differently to give uniqueness to each record that contains the seven byte, binary string, date datatype, then isn't it conceptually three different values? Although is is stored as one string, doesn't Oracle interpret particular parts of this string to identify each of the three measures, and then analyzes and builds the index accordingly? 2) It would allow for a particular Location/Location Name to generate a Report Identifier that is the same value (i.e. '00001') as one generated the day before. The smallest aggregate of the date field is day, and the business measure is fiscal which is defined as annual. In this case, isn't the date field too aggregate for this Primary Key to contain? If you already have uniqueness in your primary key, couldn't a date field give too much aggregation? I perfectly understand implementing a date below the line from a pro-active standpoint because you project to reach that level of aggregation eventually, but if there is no present requirement why do it? I'm worried about possible compromising of uniqueness by giving a level of aggregation that is not presently a business requirement.

Tom Kyte
August 12, 2004 - 10:42 am UTC

1) Oracle sees this binary
value and interprets its uniqueness of these three measures: year, month, and
day.

that is wrong. Oracle sees 7 bytes and interprets its unqiueness based on the contents of 7 bytes

perhaps we are parsing semantics here -- but it is just 7 bytes to Oracle.



I would say -- that if the primary key is to be based on "year", what you have is correct. basing on "date" (unless you trunc the date to the year) would not be correct.

Possible Logical Issue with date in Primary Key...

Jason Cloninger, August 12, 2004 - 3:10 pm UTC

Thank you very much Tom. I understand you are an extremely busy man and appreciate you taking the time to answer my questions.

date storage

A reader, October 01, 2008 - 11:42 am UTC

In terms of bytes taken by a date date type, In all the documentation it states taht it takes 7 bytes. But, a light confusion when I compare the size of date between vsize and dump values, one shows 7 and the other 8. Am I naively comparing apples and toaster ovens?.

1 select sysdate as the_date,
2 vsize(sysdate) as vsize_storage,
3* dump(sysdate) as dump_value from dual
saqib@ORSBDS1R> /

THE_DATE VSIZE_STORAGE DUMP_VALUE
-------------------- ------------- ----------------------------------------
01-OCT-08 7 Typ=13 Len=8: 7,216,10,1,11,10,59,0

vsize_storage=7
dump_value (Len) = 8


Tom Kyte
October 01, 2008 - 1:13 pm UTC

ops$tkyte%ORA11GR1> create table t ( x date );

Table created.

ops$tkyte%ORA11GR1> insert into t values ( sysdate );

1 row created.

ops$tkyte%ORA11GR1>
ops$tkyte%ORA11GR1> select dump(sysdate), dump(x) from t;

DUMP(SYSDATE)
-------------------------------------------------------------------------------
DUMP(X)
-------------------------------------------------------------------------------
Typ=13 Len=8: 216,7,10,1,12,44,42,0
Typ=12 Len=7: 120,108,10,1,13,45,43


see the typ= SYSDATE is 'different' from a date - in representation and type.

sysdate different from date data type

A reader, October 01, 2008 - 12:22 pm UTC

Got my answer here. Should've searched the site more. :) http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:8769090385739

Thanks

Primary key large number of characters in it

Vinu, May 18, 2009 - 7:58 am UTC

Is there any issue having a string with large number of characters as primary key in the table. The referred table has a primary key is a combination of a date column (yyyymmddhh24miss) and other 2 columns (6 char long and 8 char long) which are already stored in the table.
Tom Kyte
May 23, 2009 - 11:09 am UTC

I can tell by your description of the primary key that it is wrong already.

Any primary key that has a date in it is almost certainly wrong, a bad idea, data modeling done wrong.


Having a three part composite key of 7+6+8 bytes is not in itself a bad idea, so the length - not a problem.

Having a three part composite key where the first part is a date (or the 2nd/3rd) is a red flag.

Why any primary key that has a date in it is almost certainly wrong?

Charlie MuJiang 木匠, April 08, 2010 - 1:28 pm UTC

Hi Tom,

Can you give a us a practical example to prove why it's bad?

E.g.
bad index selectivity,
more query logical reads,
more CPU used,
more latches,
hard to reference in child tables,
hard to maintain,
...

Question Authority. ^_^

Thanks,
Charlie
Tom Kyte
April 13, 2010 - 8:13 am UTC

because of the high probability of duplicates (second of granularity - not fine grained enough)

and because it is obviously a surrogate key - not a natural key. So, use a true surrogate.

A date column in primary key - DW table

Charlie MuJiang 木匠, April 08, 2010 - 4:18 pm UTC

Note: it is an Data Warehouse application table.

Such as a sales fact table:

Primary Key: (order_date_year_month, location_id, product_id)
Tom Kyte
April 13, 2010 - 8:30 am UTC

remember, I carefully wrote:

Any primary key that has a date in it is almost certainly wrong, a bad idea, data modeling done wrong.


In this case, you are not really using a date truly, you have a year and month (not a date with year, month, day, hour, minute, second). You have a single observation per month per location per product. That is a valid use.


If you look at the original poster to whom I responded - they were using yyyymmddhh24miss

A date column in primary key - OLTP table

Charlie 木匠, April 14, 2010 - 12:27 pm UTC

Thanks for the clarification.

I thought about the point of natural key on OLTP table, forget to jot it down.

A reader, April 15, 2010 - 2:11 pm UTC

Hello sir,

Could you please explain what is the risk of using date as primary key?


Thanks
Tom Kyte
April 15, 2010 - 2:12 pm UTC

hit page up, we just talked about this 3 or 4 reviews up

More to Explore

Performance

Get all the information about database performance in the Database Performance guide.