Skip to Main Content

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, Hariharan.

Asked: October 20, 2009 - 6:07 am UTC

Last updated: October 26, 2009 - 12:04 pm UTC

Version: 10.0.2

Viewed 1000+ times

You Asked

Hi Tom,

Good Day.

We have an application built almost 4 years ago by a group of DBAs and Architects. In all the tables, they have a column called ID__ which is of Number data type. For example, please consider the following example:

create table party(ID__, party_id, account_number, primary_email_address, ....)

Here, even though party_id is a unique key, they have introduced this ID__ as primary key, which is populated by a sequence.

Now, the issue is almost 3 or 4 tables, which are depending on this party table, is referencing to this ID__, even though they can use PARTY_ID as foreign key.

When inquired, they said that instead of developing the database design first, they created the application and class design and hence this type of design. Here, the column ID__ is available in all the 712 tables in the application, and they are generated by Hibernate.

My manager have asked me to re-vamp this and make sure that the ID__ is not used and also to create a fully normalized database. My questions are:

a) Is there anything wrong in the design outlined above?
b) I heard DBAs and Architects telling me that Primary Key should always be a number data type. Even I have designed database in my previous company, but never came across this statement. Is this true?
c) Is it feasible to create a composite foreign key? (i.e.) if table A has empname and emploc as primary key, is it possible to create a table B where I can create empname and emploc as foreign key referencing to the columns in table A?
d) Here, the DBA team has the habit of creating a join between two tables as heap tables and store in the database. I would prefer to create a view to represent a map between two tables, as most of the dynamic performance views are nothing but the joins of more than one base table in Oracle. I would like to know if heap table is better or a view which represents a join between two tables is better.

Thanks for helping us

Hariharan

and Tom said...

wait, back up...

you wrote: "We have an application built almost 4 years ago by a group of DBAs and Architects."

and then wrote: " they said that instead of developing the database design first, they created the application and class design and hence this type of design."


that is not possible, no DBA would be involved in such a 'design', they cannot be called DBA's if they did it that way. Unless the A stands for "Abuser" instead of the normal Administrator.

The 'Architects', I totally believe they would do this, but DBA's ? No, it isn't possible.



a) is there anything wrong? There is something seriously wrong when the application tool dictates a data model. Applications come, they go (mostly - they go - how many apps do you use today that you used 15 years ago? Now, how much of the data you use every day was in existence 15 years ago?)

You are paying a performance penalty every time you want the real key but have this silly, redundant surrogate. You need "party_id" for example - and now you have to walk back to the table to get it - instead of just already having it as the natural foreign key pointing to a natural key.

b) no. I will not go further. I will ask them to 'prove' why it is so.

No - primary keys should not be numbers as a rule, never, nope, not at all even remotely true.


c) of course, it is done all of the time.

d) you do not have a DBA team I'm sorry to say. The type of A you have is not 'administrator', it is the other kind of A.

Rating

  (8 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Very rare case

Oleksandr Alesinskyy, October 20, 2009 - 12:32 pm UTC

IT is a very rare case when I have do disagree with your answer almost completely:

a) "is there anything wrong? There is something seriously wrong when the application tool dictates a data model."
It is true that it is wrong when the application tool dictates data model - but it is not a case, Hibernate does not do it, it nicely supports natural keys, including composite ones. Another story is a choice between surrogate and natural keys. It is one of the most discussed area in the database design area (and it was discussed here more then once). It is not so simple decision as it looks at first glance - the main problem with natural keys is that they are mutable (exceptions from this rule exist but they do not constitute majority). Even the keys that looks and are suppose to be absolutely immutable mutate earlier or later (e.g. individual tax numbers recently in Germany). So ask yourself what it would cost you to update the whole database with a new keys (or not the whole DB but sizable chunk of it).So it depends - as usually.

b) Statement is untrue, but I guess it is quoted incompletely - if is formulated as "surrogate key should be numeric" it would have sense.

c) Answer depends on "how composite" this composite key is (how many columns, which types and sizes of those columns).

d) Once more - it depends, if it is OLTP system, that such habit is a pure nonsense, if it is DWH - answer might differ (especially, because it is not clear what "heap table" means in this context, it may as well be materialized view.
Tom Kyte
October 22, 2009 - 5:09 pm UTC

I think we've been down this path haven't we?


a) but the people using hibernate do NOT do that, not in general, not in my experience. They let the tool dictate the model and everything else. They do not understand what a database is, let alone how to program it.

The question in this case sort of backs that up - "When inquired, they said that instead of developing the database design first, they created the application and class design and hence this type of design."

hmmmm - I don't hate hibernate, I don't like users that use hibernate. But that is sort of like the slogan "guns don't kill people, people kill people" used by the anti-gun groups. I cannot fix the people that use hibernate - therefore I do not promote hibernate since - in the wrong hands (and most hands are wrong) it is a painful deadly process.

When you have a natural key, use it (more exist than you think, especially for child tables of a parent with a surrogate - their natural key is many times the parent key plus some bit of (immutable) data.


b) we agreed then

c) the question was "Is it feasible to create a composite foreign key? (i.e.) if table A has empname and emploc as primary key, is it possible to create a table B where I can create empname and emploc as foreign key referencing to the columns in table A? " and the answer is "yes, it is certainly feasible"

I don't see us disagreeing.


A reader, October 21, 2009 - 12:30 am UTC

Following are concerns on Db Model for Web Based OLTP Application
a)There will be aroud 1000 tables some of these are transcational and some are reference data. We need to keep detail logging of it. If a portion of record is changed by one user and other portion by other user then we should have change track by individual user. Please suggest which option we use for logging.
1. Put logging in text file.
2. Put loggin in proper history tables.

have a fear that storing this logging information in tables will reduce performance.

If you suggest loggin in history tables then should these be populated through triggers or stored procedures?


b)Where we store large objects including pictures/audio etc. In databae or file system.

c)We are planning to persist db tables through Hibernate in order to speed up development. What you suggest should we do it or use stored procedures?
Tom Kyte
October 23, 2009 - 11:14 am UTC

http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:2055088900346391584

why do you ask in multiple places? One is sufficient, two is annoying, three would be worse...

model

A reader, October 22, 2009 - 9:50 pm UTC

Tom:

wow, just curious have you seen a 1000 table schema before?

everyone talks about this java hibernate framework. Does this really do something differnt than something simple like apex or pl/sql procedures.

I think IT managers hear of Struts/Hibernate/JSF and say that is the latest cool thing and that is what we should use. No wonder 50% of IT projects end up with total failure

Java frameworks are just too complex and overkill.

do you agree?


Tom Kyte
October 23, 2009 - 2:05 pm UTC

.. wow, just curious have you seen a 1000 table schema before? ...

sure


sys%ORA10GR2> select count(*) from user_tables;

  COUNT(*)
----------
       712


that is getting pretty close itself :)


... Java frameworks are just too complex and overkill.
do you agree?
....

No, I think there are problems of varying degrees. And I think everything is just a tool. And you need a toolbox of various tools and the knowledge of when and how to use each one in order to call yourself a software engineer, a programmer, a coder, whatever.


Problem is, people learn one tool and then call themselves engineers, programmers, coders, whatever.

And that is the problem really.

Hiberate does not contribute to ease of development

Analyst, October 23, 2009 - 4:39 am UTC

"everyone talks about this java hibernate framework. Does this really do something differnt than
something simple like apex or pl/sql procedures."

Having developed in both imperative and functional languages, I have never used Hibernate, and have never had a reason to believe that I should consider using it. Similarly, I have never used LINQ, and believe it too, like Hibernate, offers no real enhancements. This view is based upon my experience as a database developer - using both primarily disk-based and in-memory databases - within various development teams in the financial services sector. The largest part of my role involves implementing in one of aforementioned types of databases, financial algorithms that draw upon several sets of data, and over which are applied a set of calculations.

Given the experience of my team and I in several distinct programming environments, including the use of SQL and procedural languages and frameworks (Java, .NET, Hibernate), we all strongly agreed at the outset that the development of the majority of the algorithms would be done in SQL. There are, of course, exceptions, and in such cases we do develop in an imperative language, such as Java.

What have we realised to be a key capability of SQL, which does not exist in imperative languages, in relation to our development requirements?

First, SQL offers a very powerful model of abstraction. Having once developed in assembly language, I certainly appreciate the level of abstraction offered by SQL. As an example, the ability to perform complex analytical functions over large volumes of data using only a few lines of SQL code is a fundamental capability and advantage of the SQL language. I would be very interested to see somebody write an equivalent function in an imperative language in the same of less amount of time.

Second, and perhaps most importantly, is the concept of abstraction with respect to system integration. This is itself an advantage that is difficult to match with imperative languages. Suppose we have a calculation procedure that involves, say, seven stored procedures, and which draws upon a set of no fewer than, say, 20 tables. Further, suppose the research team would like to implement a change to one or more of the calculations that are performed by this process. Using the configuration that I have just described, we would amend one or more of the stored procedures to incorporate the requested changes. There is no other aspect of the system that would be subject to a technical amendment. Our set of applications would remain unchanged.

"I think IT managers hear of Struts/Hibernate/JSF and say that is the latest cool thing and that is
what we should use. No wonder 50% of IT projects end up with total failure"

This is a view of which I am unfamiliar. When I first joined my firm, the manager of my team informed me that I would be asked to develop the solution using the approach that I considered most appropriate based on my experience in, and knowledge of database development and data analysis. The manager does not understand the intrinsic detail of the database, nor does he understand the mechanics of the solutions that we have developed in C# - to problems to which SQL would be considered wholly inappropriate. Of course, the manager could understand the internal workings of our system. But he chooses not to. And this is the right thing to do. Rather, he hires people who have proven experience and knowledge in development. It is owing to this approach that we can attribute our success as a team.

"Java frameworks are just too complex and overkill."

They are certainly not complex. I would say that five lines of SQL code can be considered to be more complex than several entire Java frameworks. Why? Because the former will often represent an innovate solution to a problem, and will often utilise the power of abstraction. It is worthwhile to remember that solving a problem using only a few lines of code is significantly more complex than to write a solution of a couple of pages.

db

A reader, October 23, 2009 - 10:16 am UTC

Analyst from Australia

I totally lost you in your last paragraph.

First you say you try to use SQL insted of procedural code like JAVA. Then at the end, you say SQL is too complex and wiriting 2 lines of code does not mean it is easier to read/maintain than writing 2 pages of code.

How could this be true?

To me the less code, the better the program especially if it runs faster and both give the same result.

I also thought that JAVA, .NET, etc all use SQL functions.
It is only after you get the data using SQL what do you use: PL/SQL, C#, java, etc..

It would be really stupid for someone to write a function in JAVA like a SELECT, MAX, MIN, etc in SQL. Why reinvent the wheel and create proprietary software. SQL is a STANDARD.

The main problem with application developers is that they do not know to write SQL very well. You are not going to find that many Tom Kytes out there.

To A Reader

Analyst, October 23, 2009 - 11:25 am UTC

I should apologise for the fact that my last paragraph is not as clear as it should have been.

"First you say you try to use SQL insted of procedural code like JAVA. Then at the end, you say SQL
is too complex and wiriting 2 lines of code does not mean it is easier to read/maintain than
writing 2 pages of code."

What I meant to say was that often to solve a problem with only a few lines of SQL will often be more complex to design and develop conceptually. However, once implemented, it will appear as very simple and easy to understand. That was the key point I was trying to make.

Indeed, it is very easy to write pages of procedural code to solve a complex problem. However, do solve it using only a few lines of SQL code, will be more complex, but will deliver an easier to understand solution, and which will, most likely, perform better than the procedural solution.

And while procedural code is not difficult to understand (remember: it is only loops, at the end of the day), I do find it more time consuming to analyse given the way in which solutions are often developed in procedural languages. On the other hand, solutions developed in SQL tend to reflect very closely the original requirement.

"To me the less code, the better the program especially if it runs faster and both give the same
result."

I completely agree.

"I also thought that JAVA, .NET, etc all use SQL functions.
It is only after you get the data using SQL what do you use: PL/SQL, C#, java, etc.."

There are no SQL functions in procedural languages (JAVA, .NET). Programs written in these languages may invoke SQL functions that exist within a database. For example, JDBC offers this capability in Java.

"It would be really stupid for someone to write a function in JAVA like a SELECT, MAX, MIN, etc in
SQL. Why reinvent the wheel and create proprietary software. SQL is a STANDARD."

I do not understand what you meant to say in this paragraph.

"The main problem with application developers is that they do not know to write SQL very well. You
are not going to find that many Tom Kytes out there."

There is an even bigger problem. That is there are many developers who believe that the data, and data processing, is secondary to the application. This is the single biggest misunderstanding. I am currently developing yet another application (a small web-based application) that will provide access to a small subset of the functions that exist in the database, and which had been developed earlier this year.
Tom Kyte
October 23, 2009 - 2:52 pm UTC

I would like to change this statement from :

The main problem with application developers is that they do not know to write
SQL very well.


to

The main problem with application developers is that they do not know to write
code very well.



I find that to be true quite often. And you know what - if someone it WRITING A DATABASE APPLICATION (and who isn't?) GUESS WHAT THEY SHOULD LEARN.

I'll leave the answer to that to the reader...



ORM Can be used correctly

Steve G, October 23, 2009 - 3:27 pm UTC

I have been involved in several projects where all the data access was done through JPA, specifically we utilized EclipseLink as the JPA Provider implementation. However the key difference is that we created the Database first, we spent quite a bit of time working on the ERD to make sure the Database was clean and as normalized as possible, with all foreign keys. Then we used tools to generate the Java classes based on the database tables. I do a lot in both SQL and Apps (people can do both if you put the effort in to learning the proper techniques in each environment). I have found that for most apps using and ORM's as data access layers against properly designed databases dramatically reduces the amount of code, can perform very very well, and makes maintenance easier. As with almost anything in software proper implementation and use of technology is what makes you successful.
Tom Kyte
October 26, 2009 - 12:04 pm UTC

I can generalize your statement:

ORM Can be used correctly

to

Anything can be used correctly, all things are tools.



However, this topic for me falls into the same area as triggers, autonomous transactions and when others.

They all *can* be used correctly.

They *seldom* are.

Therefore, the world would be a much safer place without them (especially - ESPECIALLY - when others, it is the worst of the worst when it comes to being abused/malused)




Keyword in your post "designed" as in

properly designed database (and application, and architecture, and ...)

database

A reader, October 25, 2009 - 4:23 pm UTC

Tom:

<<<No, I think there are problems of varying degrees. And I think everything is just a tool. And you need a toolbox of various tools and the knowledge of when and how to use each one in order to call yourself a software engineer, a programmer, a coder, whatever. >>>

Let us face it. How many coders or engineers know JAVA, PL/SQL, SQL, Pythoon, Perl, Ruby, .NET, C, C#, C++ etc.
to be able to know/determine what is the best to use.

Most people struggle with one language and do not even know what it offers.

Now, many tools accomplish one purpose. For example if you want to develop web application anything will work: PL/sql, Java, Coldfusion, Perl, ASP, JSP, PSP, PHP, etc....

It just a matter of what the prorammer knows and what tools exist already in place.

This IT industry needs to be regulated and many people be flushed out. Too many incompetent people in it from so called techies to managers. One manager once suggested to a customer to migrate from oracle enterprise to flat EXCEL files (because he can sort data in EXCEL easily). Imagine a 50 giga database with 500 tables and 1000s users running with Excel flat files these days!!