Skip to Main Content
  • Questions
  • Parent/Child tables, and the use of surrogate keys.

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, Paul.

Asked: April 26, 2010 - 3:30 pm UTC

Last updated: April 29, 2010 - 12:57 pm UTC

Version: 10.2.0.4.0

Viewed 1000+ times

You Asked

We have a design requirement that involves a Parent table with a 1:n relationship with a child table. The data in these tables will be transmitted to external apps using the OSB.

We have a 20 character globally unique id, called guid, that is unique to each record on the Parent table.

There is a concern about performance if we used the guid as the PK for the Parent table, and the FK for the Child table. The rest of the team wants to use a sequence-generated number as the PK/FK for the Parent/Child, and just have the guid be a unique index on the Parent.

With that, they also want to have external apps provide the generated id when requesting records from the child table, as opposed to the guid. This means, at insert time, the external apps would have to hold onto the sequence generated id that is returned.

I suggested to join the Parent/Child table on the generated sequence, but use the guid to look up the Parent record, but I guess there is a concern about performance involving that as well.

The team wants to create the public function (psudeo-code)
get_child_records(id IN NUMBER) {
   Select *
     From child_table
    Where child_fk = id
}


Instead of

get_child_records(guid in varchar2) {
   Select *
     From parent_table,
          child_table
    Where parent_pk = child_fk
      and parent_guid = guid
}


I feel that any sequence generated surrogate keys should be kept private and used for backend joins only, and to have to guid used externally (shielding the rest of the world from the generated ids).

What are your thoughts on this?

and Tom said...

ask the "team" for the science.

Not hypothesis (hypothesis can be tested), but the numbers, the facts, the solid non-myth based reasoning.

they would have to measure the time it takes to add this unique sequence (in addition to the unique guid) to the parent, time to maintain the unique index on it.

My hypothesis is "this is a 'not smart' idea - to add yet another silly surrogate when a proper one already exists".


This sounds "silly" to me, I don't get their thought process. If the parent has a key (and it does) that is the foreign key in the child table, done - period, nuff said.


I don't like your teams overly procedural approach either, I would create a VIEW, not a procedure, but whatever...

... I feel that any sequence generated surrogate keys should be kept private and used for backend joins only, and to have to guid used externally (shielding the rest of the world from the generated ids)....


and I would say that the sequence should die here, it isn't useful, it only makes the data larger, harder to manage, increases the number of indexes, provides no compelling "performance" case - nothing.


Here is my quick and dirty "what are you TALKING about" example:

ops$tkyte%ORA11GR2> create table t
  2  ( guid  varchar2(32) unique not null,
  3    seq   number       unique not null,
  4    data  char(80)
  5  )
  6  /

Table created.

ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> insert into t (guid, seq,data)
  2  select rawtohex(sys_guid()), 123456+level, 'x'
  3    from dual
  4  connect by level <= 100000
  5  /

100000 rows created.

ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> declare
  2          type array is table of t.guid%type;
  3          l_guids array;
  4          l_rec1  t%rowtype;
  5          l_rec2  t%rowtype;
  6  begin
  7          select guid bulk collect into l_guids from t;
  8          dbms_monitor.session_trace_enable();
  9          for i in 1 .. l_guids.count
 10          loop
 11                  select * into l_rec1 from t where guid = l_guids(i);
 12                  select * into l_rec2 from t where seq  = l_rec1.seq;
 13          end loop;
 14  end;
 15  /

PL/SQL procedure successfully completed.




I'll let you run that - and you tell us if the guid or the sequence was "amazingly fast compared to the other" (it won't be)


This is premature optimization here - fixing a problem that probably doesn't exist. I hope they spend as much time on other true design issues as they do on this!

Rating

  (6 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

A reader, April 27, 2010 - 9:09 am UTC

I guess the main idea of the generated key was that there would be no need to do any joins; you could just look it up directly against the child table. But yes, there are perfectly good, immutable natural keys already available here. The main concern is for performance, and I don't know enough about the inner workings of Oracle to know one way or the other.

As far as procs vs views go, we're kind of stuck maintaining the status quo of the oracle 8 days where VIEW was an ugly 4 letter word. Also, don't views require that the user behaves oneself when selecting from it, and doesn't add any criteria to slow it down (like adding non indexed fields to the where clause when selecting from the view)?

Thanks for the tips. I was gonna share this w/ the team but since you pretty much pooped all over them I guess I'll just keep this to myself ;)
Tom Kyte
April 27, 2010 - 9:23 am UTC

... I guess the main idea of the generated key was that there would be no need to
do any joins; you could just look it up directly against the child table. ...

huh? the child table would have it's own primary key (which should include the parent key by all rights, that'll ensure the common key gets carried all of the way down the hierarchy)

You lost me totally with this thought here.


... and I don't know enough about the inner
workings of Oracle to know one way or the other.
..

It is not magic, it is not secret, it is all visible and testable. And doing premature optimization based on hypothetical hypothesis without knowing how Oracle works- well, that is a recipe for certain disaster - don't you think?


... we're kind of stuck maintaining the status quo of
the oracle 8 days where VIEW was an ugly 4 letter word. ...

myth alert, myth alert - warning to all - myth alert.

View was never an ugly four letter word (for those not familiar with "four letter words" - all of our really good curse words in english are four letters long). Views were always something that could be

a) used correctly
b) used incorrectly

just like *anything else*.



... Also, don't views
require that the user behaves oneself when selecting from it,...

We allow users to drink, to smoke, to do whatever they want when they select from it. They can be screaming out their window at people passing by - they do not need to behave themselves at all. I don't know what that means.


... and doesn't add
any criteria to slow it down (like adding non indexed fields to the where
clause when selecting from the view)?
..

again myth alert, myth alert - warning to all - myth alert.

I don't know what to say, honestly.

I would work on this:

and I don't know enough about the inner
workings of Oracle to know one way or the other.



get rid of the need to say that statement and most of these myths will fall by the wayside.

Thomas, April 28, 2010 - 1:34 am UTC

>.. Also, don't views
>require that the user behaves oneself when selecting from
>it,...

>We allow users to drink, to smoke, to do whatever they
>want when they select from it. They can be screaming out
>their window at people passing by - they do not need to
>behave themselves at all. I don't know what that means.

LOL! You made my day!

Front end apps

A reader, April 28, 2010 - 12:29 pm UTC

Another common reason for using a surrogate sequence-generated key for ALL tables (UK on the child table in this example would be parent.guid+some child column) is that it makes building front-end maintenance screens on the data easier. If I am not mistaken, Oracle's own tool Application Express recommends this.
Tom Kyte
April 28, 2010 - 12:56 pm UTC

I don't care about the tools, if the tools cannot deal with real databases - the tool should either be fixed or discontinued.

Even Hibernate can deal with true keys (and so can apex)

Front end apps

A reader, April 28, 2010 - 8:48 pm UTC

Of course Apex can "deal" with it. But is it easier and more maintenable to pass in via URL and pull up a edit form for a record using a single, numeric value or multiple string, number and date values? That was my point. As stated before, there are cases where it is overkill, for example, for many-to-many intersection tables which have the composite PK as 2 surrogate keys, there is no reason to create a UK on them and maintain another surrogate key.

But a surrogate key does have its place. As you have stated yourself so many times, if one technique fits all situations all the time, nothing else would exist.
Tom Kyte
April 29, 2010 - 7:28 am UTC

It isn't any easier or harder in APEX to build a crud screen for a table with a compound key. Software automates all of it.


I know surrogate keys have their place - when you have no true natural key, a surrogate key is called for.


I have a table called questions_and_answers. It has no natural key, none. So, it uses a surrogate.

I also have other tables that do have natural keys and they use those.

And I'm written using APEX.


No one is saying - not here, not there, not anywhere - do not use surrogates.

What I am saying here, and there and everywhere is - if you already have a key- why would you ADD YET ANOTHER KEY.


Original post:

....
We have a 20 character globally unique id, called guid, that is unique to each record on the Parent table.
...
The rest of the team wants to use a sequence-generated number as the PK/FK for the Parent/Child, and just have the guid be a unique index on the Parent.


Do you see the "not smartness" here - they state clearly:

a) we already have a primary key on the parent table - it is calls GUID

b) we want to ADD ANOTHER primary key to the parent table - call it ID and use a sequence, this is our foreign key we'll use in the child table.



We are not even really talking about surrogates (heck, they start with the premise that the parent table key is a SURROGATE - it is called GUID!) I'm ok with their surrogate

What I am definitely NOT ok with is them adding YET ANOTHER SURROGATE and having TWO primary keys in effect.

That would be *not smart*

A reader, April 29, 2010 - 10:19 am UTC

Hello Sir,

in our DWH we have natural key which we use to produce surrogate key

The process is we substr some portion of natural key depending on src_appl_cd field
and do lookup on keycutting_xref table which have
structure like (natural_key char(45), sk_id int) if we find mate we retreive sk OR
we call stored proc which assign new key and insert record in keycutting_xref for future key cuting process


My question

(1) since we are deriving sk from natural key is not a good idea to define it as
primary key and do not generate the sk as you said if you have natural column which is unique use it as primary key?

(2) Is there is any bebefit of producing such artificial key in DWH?


Many Thanks
Tom Kyte
April 29, 2010 - 10:26 am UTC

simple question for you first: why don't you just use the natural key?

A reader, April 29, 2010 - 12:42 pm UTC

Thanks for the reply

thats not my decision thats how they model the process,
may be thats how its work here since many years i have just join before a year

And if we argue with there data model they don't want to listen


Thanks



Tom Kyte
April 29, 2010 - 12:57 pm UTC

then why did you ask?

it would likely be the right way

More to Explore

PL/SQL demos

Check out more PL/SQL tutorials on our LiveSQL tool.

PL/SQL docs

PL/SQL reference manual from the Oracle documentation library