Skip to Main Content
  • Questions
  • DATE column along with Duration as part of Primary key.

Breadcrumb

Question and Answer

Connor McDonald

Thanks for the question, Viswa.

Asked: December 07, 2015 - 4:50 pm UTC

Last updated: December 10, 2015 - 2:12 pm UTC

Version: 11.2.0.3.0

Viewed 1000+ times

You Asked

Hi Tom - First of all, I would like to Thank you for all the time to invest to answer our questions. It is always insightful to read (and understand) your answers :)

Though I have asked this question in other forums, I have not satisfied with the responses, so I am asking you.
---------------

I have seen and used many times, "Date" column (some times, just the date, and some times, Date along with time component) as part of the primary key.

I want to maintain a primary key, that contains a set of columns along with a date range (duration) of time between two date columns, so there will be no duplicate records for the ENTIRE RANGE of Start and end dates.

Here is the process I followed, however I know there can be better ways to do this... Please share it if you can..
My question is - is there any easier/better/organized way to implement this??

Here is my way
---------------
Lets say I want to create a table that maintains the "name changes for a given person over a period of time".
To achieve that I have two tables listed below.

1. This maintain list of ALL person_IDs.
CREATE TABLE person_ids (person_id NUMBER CONSTRAINT person_id_pk PRIMARY KEY);


2. For each person listed in the above table, we want to maintain legal_name over different periods of time.
In this below table
CREATE TABLE person_m
(
 rec_id                        NUMBER CONSTRAINT person_m_pk PRIMARY KEY
,person_id                     NUMBER CONSTRAINT person_m_fk1 REFERENCES person_ids(person_id)
,legal_name                    VARCHAR2(32)
,eff_from_ge_dtm               DATE
,eff_thru_lt_dtm               DATE
,CONSTRAINT person_m_fk2 CHECK(eff_from_ge_dtm <= eff_thru_lt_dtm)
);


Just to make sure, I relay the meaning of the column names.
rec_id is a "surrogate key", no importance for it.
eff_from_ge_dtm - Effective from (greater than or equal to) this specific date&time.
eff_thru_lt_dtm - The legal name is effective until (but less than) this specific date&time.

I want to make sure the uniqueness of a person_id and duration between eff_from_ge_dtm, and eff_thru_lt_dtm.

INSERT INTO person_ids VALUES(1);

INSERT INTO person_m VALUES (1001, 1, 'LN1', TO_DATE('2001/01/01','yyyy/mm/dd'), TO_DATE('2010/01/01','yyyy/mm/dd'));
INSERT INTO person_m VALUES (1002, 1, 'LN2', TO_DATE('2010/01/01','yyyy/mm/dd'), TO_DATE('2012/01/01','yyyy/mm/dd'));

For the same person identified by "1", we have two names for the given two different date&time ranges.
We can not have overlapped date range records for a given person.

For that I have created a function that checks if the incoming record values cause any overlaps and returns a Yes OR No.
I call this in a trigger (or in a another procedure) where the actual insert is going to happen.


CREATE OR REPLACE FUNCTION does_cause_overlap_dt_range_yn(
                                          i_person_id                 IN NUMBER
                                         ,i_lower_date_val            IN DATE
                                         ,i_upper_date_val            IN DATE)
    RETURN VARCHAR2 IS
    l_yes_no                      VARCHAR2(1);
    l_cnt                         NUMBER;
  BEGIN
    BEGIN
      SELECT 1
      INTO  l_cnt
      FROM  person_m e1
      WHERE e1.person_id = i_person_id
        AND   NOT (   e1.eff_from_ge_dtm >= ALL (i_lower_date_val, i_upper_date_val)
                   OR e1.eff_thru_lt_dtm < ALL (i_lower_date_val, i_upper_date_val)
                  );
      -- if there is one record, that means it is going to cause overlap           
      l_yes_no                   := 'Y';
    EXCEPTION
      WHEN NO_DATA_FOUND THEN
      -- no overlaps
        l_yes_no                   := 'N';
      WHEN TOO_MANY_ROWS THEN
      -- overlaps
        l_yes_no                   := 'Y';
    END;
    RETURN l_yes_no;
  END;

and Connor said...

Your logic is sound, but you need to cater for a multiple user environment.

That is, if TWO people insert PERSON_M rows for the same person, then each will *not* see the others changes until they commit ... and ker splat, you have an overlap.

See this asktom question

https://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:474221407101

for a solution to this using dbms_lock.

Hope this helps.

Rating

  (3 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Thank you.

Viswa Peesapati, December 08, 2015 - 4:48 pm UTC

Thank you for reviewing the question, and proposed solution.

1. You are right, actually I have "locked_yn" column on the Person_IDs table (I did not show it thinking to simplify the problem). So, any process (from multi-user environment), first tries to update "locked_yn" to "Y" for that specific person's ID record in Person_IDs table. If there is an exception, it will throw an error message and aborts the further processing. Once, the update is completed, it will release the lock by "committing" the changes.

2. I was under the impression if I try to update with a "wait 1" clause that is enough, however as you mentioned I can use dbms_lock to get an exclusive lock on that specific record. Can you please suggest me which one is the better one.

3. I am thinking to keep this call in the trigger that way, it is as close to the table as possible. However for no specific reason, I do not like the triggers very much. I can keep the same validations in a procedure and call the procedure. The disadvantage of that is - we need to make sure all the clients, use the procedure and avoid direct inserts, which is beyond my reach right now.

4. Do you have any suggestion whether we should go with a trigger implementation or a procedure/function (inside a package) implementation and reason for that.

5. Finally, I expect any RDBMS (especially Oracle) to support a primary key with date ranges as a built in (like a Primary key/check constraint), as I feel that is somewhat basic need for any requirement that needs history preservation. Why is that not considered?

Thank you for your insight and help.
Viswa
Chris Saxon
December 09, 2015 - 8:59 am UTC

You will potentially need a combination of both.

If you are adding person_m rows for an existing person, then it would sufficient to lock the person row.

But if you adding a person row, then person_m rows for that new person, then there is no pre-existing person to lock (hence the need for dbms_lock).

Using PLSQL as a wrapper is a sound choice - its easy to stop people accessing the underlying table directly - simply do not grant them acces. Just give execute on the procedure and NOT to the tables.

You brought up a good point.

Viswa, December 10, 2015 - 12:55 am UTC

Connor - Thank you very much for your response, which gave me few other things to consider -
First when you say, if two persons insert two NEW records for the same individual, we need to ensure locking, I assume you thought person_id as the "natural key". In my opinion so far that is a unique number generated from a sequence. That means, in my opinion those two will be assigned two different numbers (I think that is the default behavior of a sequence), without the need of any lock.

However, as you mentioned if they "refer" to the same person, I assume we do need to make sure we assign the "same person_id", and I am confused if we can ever get that. I appreciate your thoughts/suggestions around that.

One more interesting thing that came to my mind is what is the "natural" key for a "Person". It is not a person name, not necessarily the combination of "last_nm", and "DOB". I am sure I thought about this in the past, however I was not able to come to a certain conclusion. I think at that time, if I remember well, I thought of keeping "last_nm", "first_nm" and "DOB", however in a "health industry" that may not be possible because a new born child will not be given a name until few days after the birth. Also, it is possible for a "twins" who may be born on the same day, with same gender (Lot of times, I observed "Baby girl/boy, with the DOB" on the hand tags (also they will have the same last name). So, what is the best natural key for any given person in order to ensure uniqueness.

I know that I may be diverting the topic, so if you are not interested to prolong the discussion, I can totally understand.

Thank you for your help.


Connor McDonald
December 10, 2015 - 2:12 pm UTC

re: "That means, in my opinion those two will be assigned two different numbers (I think that is the default behavior of a sequence), without the need of any lock."

Yes that is a valid assertion. My error.

Dan, December 11, 2015 - 7:16 pm UTC

If your requirement is more along the lines of what name is in effect at a specific date, you can consider a different approach where you just have one date, as part of a UK, and use LEAD in a view to get the effective date range.

create or replace view person_m_vw as(
select p.rec_id,
   p.person_id,
   nvl(lead(p.eff_dt) 
         over(partition by p.rec_id, p.person_id
           order by p.eff_dt), 
       to_date('31-dec-9999', 'DD-MON-YYYY'))
  -interval '0 00:00:01' day to second next_eff_dt,
  p.legal_name
from person_m p


This prevents some issues (like gaps in entered ranges - was there a few days where the person had no name???), but that may not be correct either (maybe the gap IS legitimate). Also by using a MERGE, you could avoid a PK violation in the multiuser environment (but the last one committed will stick)

As to what identifies a "person".. I usually resort to physical existence... only one "thing" can be in the same place at the same time (when that is violated in the physical world, a lawyer or insurance company usually gets involved)... you need day (and time to account for twins), and where (a given hospital COULD have 2 babies born at the same time in 2 different rooms, so "where" can get tricky). It depends on how much data you have, and what is the probability of duplicates, and who is trying to resolve the duplicates (a live person can ask the customer a question to resolve it; a computer program can't deal with it)



More to Explore

PL/SQL demos

Check out more PL/SQL tutorials on our LiveSQL tool.

PL/SQL docs

PL/SQL reference manual from the Oracle documentation library