Skip to Main Content

Breadcrumb

Question and Answer

Chris Saxon

Thanks for the question, Geraldo.

Asked: November 07, 2018 - 4:42 pm UTC

Last updated: November 09, 2018 - 3:58 pm UTC

Version: 12.2.0.1

Viewed 1000+ times

You Asked

We're developing a new system in for the company with .NET and Oracle Database 12cR2 using RAC. It basically does the following: receives invoices in XML format, then validates it according to business rules.

A requirement is to give to the client a "tracking number" in the presentation layer so later the client can ask the app for the status of the invoice.

There is a six columns table called "TRACKING". The development team wants to use a GUID (varchar2 32bytes) generated by the code as the tracking number. For every request the clients will get a tracking number. This table is related (FK) with another one called "MESSAGES_TABLE" (has four columns) that stores the messages the client get while the system validates the invoice (could be "acepted", or a massive number of messages depending on the validations). Each tracking number can have one or more rows in the "MESSAGES_TABLE" and there is a query used to view the status of a tracking number associated with its message(s) (a join between both tables by the pk).

Note: the system would get 300MM per year.

The question is: would be better to:

1. use the GUID as pk and tracking number?
2. use an auto-generated sequence as pk and tracking number?
3. use an auto-generated sequence as pk and the GUID as the tracking number?
4. What else?

All These question are related to have the best performance. I know that organize a GUID column with an index is bad for performance and databases handle sequential number better. But I also know that taking a roundtrip to the database to get a sequence could be bad (maybe a high cache value can help).

I need some advice here.

Thanks in adavance.

Regards,


and Chris said...

It's not straightforward that sequences are faster than GUIDs. Tom has written several articles discussing this:

https://asktom.oracle.com/pls/apex/asktom.search?tag=sequences-or-guids
https://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:2570145300346198113
https://asktom.oracle.com/pls/apex/asktom.search?tag=guids-sequential-or-random

Ultimately if you care about this, test and validate on your data to see the impact.

And remember: performance is only one consideration. GUIDs will almost certainly use more storage. But they're also globally unique. Which may be useful if you're replicating data elsewhere.

But I also know that taking a roundtrip to the database to get a sequence could be bad

The roundtrip is unnecessary! With the returning clause of insert, you can get the value you inserted. So you can stick seq.nextval in the values clause and get its value back.

And, as you're on 12c, you can use an identity column too:

create table t (
  c1 int generated as identity
);

var id number;

insert into t values ( default )
returning c1 into :id;

print :id;

        ID
----------
         1

insert into t values ( default )
returning c1 into :id;

print :id;

        ID
----------
         2

Rating

  (4 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Response

Geraldo Peralta, November 08, 2018 - 1:14 pm UTC

But I also know that taking a roundtrip to the database to get a sequence could be bad.

1. With this I meant that the insert will always ask the db for a available sequence. But you just wrote than anyway is faster than inserting a GUID.

2. The tracking number given to clients cannot be simple like 1 or 2 or 3, because anyone could see another's one info. What approach can we have to solve this?

3. Yes, I will replicate data somewhere else. Is this a problem if I have sequences in my database?

Note: We have lots os sequence in our databases.

Chris Saxon
November 08, 2018 - 3:47 pm UTC

1. "Something" has to generate the sequence or GUID. The difference is you can generate GUIDs in the app without the DB.

You can use returning for DB generated GUIDs just like sequences:

create table t (
  c1 raw(16) default sys_guid()
);

var id varchar2(32);

insert into t values ( default )
returning c1 into :id;

print :id;

ID
--------------------------------------------------------------------------------
7A2A0B9F05A16600E0530100007FAE6E


2. You could pad the sequence # with a random number. This would make the values "non-guessable". Or have a separate column which stores customer references (generated however you like).

3. It's more of an issue if you're consolidating data from different databases. If you do you can clash on the sequence number.

Review

Geraldo Peralta, November 08, 2018 - 6:13 pm UTC

Ok.

1. What about the performance of the index as pk in the sys_guid colum? Because there's a view that will query this table joined with another one. Will this be OK use a sys_guid in the join query?


Connor McDonald
November 09, 2018 - 2:22 am UTC

A primary key is a primary key. Queries for a single primary key will pretty much be the same no matter what the datatype.

Insertion is perhaps a different matter. The monotonic nature of sequences result in more densely packed indexes. That *might* be a good thing (better cache efficiency) or it *might* be a bad thing (contention for common blocks).

The guids being essentially pseudo-random will place entries in random order into the index so typically over time the index will end up being 60-70% packed due to block splitting.

Review

Geraldo, November 09, 2018 - 12:35 pm UTC

Great, Chris!

Thanks for the help!

Please, take a look on this:
One guy told me that I could use a oracle sequence for the parent table as pk, create the GUID column with a unique index, give the traking number to clients and use the the column with the sequence to be joined with the other MESSAGES_TABLE. The query to view each tracking number with its associated message(s) would use the sequence in the join query and a where filtered by the tracking number.

Would this be a good approach?
Chris Saxon
November 09, 2018 - 3:58 pm UTC

There are many who argue you shouldn't expose surrogate keys (whether via sequence or guid). i.e. customers never see the primary key value. This helps protect you if:

- You ever need to change a customer reference
- You need to re-insert a deleted customer reference
- Customers insist on their references following certain rules (e.g. no ambiguous characters - zero vs. capital oh, references must be pronounceable, etc.)

If you're exposed the primary key handling requests like these can be somewhere between hard and effectively impossible.

So yes, having a sequence PK (that stays hidden) and a GUID UK (that customers see) can be a good option.

Review

Geraldo Peralta, November 11, 2018 - 11:26 pm UTC

Thanks for the quick response.


More to Explore

SQL

The Oracle documentation contains a complete SQL reference.