Skip to Main Content

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, Scott.

Asked: April 17, 2007 - 8:33 am UTC

Last updated: June 23, 2008 - 1:07 pm UTC

Version: 10g

Viewed 1000+ times

You Asked

Mr. Kyte, I enjoy reading the data on your web site. I have searched it extensively using the search words "hibernate, toplink, object relational mapping," and so on. I would like to append a question to the article http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:12083187196917, and wonder if doing so is an appropriate use of your web site.

The question I would like to ask is "Does anyone have any success stories of using O/R mapping tools to build high performance, scalable systems? Greater than 5M transactions per day against multiple TB of data?" I am working my way through the design of a data access layer as described by Lisa at the beginning of the referenced article. I noticed that a reader or two tried to pick a fight somewhere is middle of the article, but they did not describe any real successful experiences with their preferred way of building a system. I would like to hear how data-centric folks worked their way through this problem with OO-application-centric folks, and built performing systems. If it is not clear from my words thus far, I want to let Oracle do what it does well (query optimization, concurrency, transaction management) and let the application do what it does well (pure business logic).

Thanks for your time and your educational website.

and we said...

sure, go for it.

In fact, I'll just publish this one and you can track it...

Your stated goal:

I want to let Oracle do what it does well (query optimization, concurrency, transaction management) and let the application do what it does well (pure business logic).

is pretty much in line with mine :)

Rating

  (11 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Clarification of my question

Scott, April 19, 2007 - 9:59 am UTC

In asking the above question, I am not looking for someone to do my design for me. I am well into it.

My experiences with application interfaces to Oracle have mostly been through PL/SQL modules of one sort or another through one web server or another. Also a lot of Pro*C. In my first and only experience w/ the OO/Java folks a few years ago, I lost the battle and got to sit back and watch as the Java guys wrote all the SQL, made a horrible mess of it, and then [I] ducked the crash and burn phase of the project. Not my proudest moment, but the decision was not mine.

I was not ignorant of OO-design and Java at the time, but since then I have studied it harder. Also have talked to a lot of OO folks about what it is they really want when they talk about "database independence". Know thine enemy? I have also been building a presentation explaining the extent to which database independence can be achieved, and what the various costs are if you go too far. (Tom Kyte's books have helped a lot.) I have this silly notion that I can win this battle if I am prepared.

My co-workers experiences w/ O/R mapping tools such as Hibernate are that the tools work okay w/ object-to-single-table mappings, but that object-to-many-table mapping are just concatenations of natural joins built with no insight as to how Oracle will process it. OJB folks have walked me through their code/configuration files, which works well enough on low-volume systems. The biggest Hibernate user I know cannot be pinned down as to how many users or much data he tested with, although the prototype system "performed well".

I would really like to find some objective evidence of high-performance, scalable implementations using these tools. Or find out that there isn't any such evidence.

I am interested in how other data-centric folks approached this problem, the general nature of the solution, and how they got around the human issues involved.

Thanks.
Tom Kyte
April 19, 2007 - 12:21 pm UTC

just updating this to have it pop to the top again...

Giovanni, April 19, 2007 - 3:24 pm UTC

Scott,
if you are trying to develop data-centered applications with Java and an ORM tool try iBatis. It's free and gives you all what you need to exploit the full power of oracle sql database. IBatis, unlike other orm tools, works the other way around, you write the sql (not hql, or some other oo-sql just plain sql) and then map the results into java objects (Hibernate & co start from java and hql to create sql).
It does not have the hibernate full power but it's very easy to learn (1 day for the 70% of the features) and it can be higly customizable. if you use the last jdbc driver you can get a decent blob support for free.
Hope it helps,
Giovanni

A database-centric approach to J2EE application development

Zlatko Sirotic, April 20, 2007 - 5:25 am UTC

> I am interested in how other data-centric folks approached this problem, the general nature of the solution, and how they got around the human issues involved.

Excellent text:

A database-centric approach to J2EE application development (Where should code go: Java or PL/SQL?)
Toon Koppelaars (Centraal Boekhuis) - ODTUG 2004
http://web.inter.nl.net/users/T.Koppelaars/J2EE_DB_CENTRIC.doc


Regards,
Zlatko Sirotic

Hibernate

Mark, April 20, 2007 - 5:53 pm UTC

I have no evidence of a good system using an O/R tool, but I can tell you that our company is moving towards getting rid of Hibernate because it proved to be nothing but a pain. (Tom has listened to my Hibernate complaints before.) I asked my colleague who has worked with Hibernate a lot more than I've had to (lucky for me), and here are the big problems he's found that convinced him to throw it out:

- Set operators like union or intersect, are not supported.
- Hibernate's Criteria API doesn't support multiple self-joins.
- The Hibernate devleopers generally have the attitude of "why would you ever want to use that" in reference to Hibernate's unsupported features. They were even resistant to implementing outer joins for a long time - why would anyone need those - but this problem has eventually been "mostly" fixed.
- You can't specify query hints.
- This might be a "feature" and not a problem, but he found Hibernate would send updates to the database when he didn't want it to. I'm sure they would say he's using it wrong but (shrug) this guy is a very good programmer who I've never seen stumped by any problem for very long.

Not to mention all the other Oracle (or database x) features you can't take advantage of... Basically it can probably perform and scale well as long as you don't need any of those fancy shmancy database features, which are ironically what you usually need to make your database perform and scale well. If all you need is simple CRUD, it might work just dandy. Again I have no evidence of this, though.

Mark

JDBC & OR-Mapping Tools

RAM, October 01, 2007 - 4:19 am UTC

Hi Tom,

Thanks for the good explanations in this thread, was very exciting and more useful.

Recently we had a discussion with our Database Architect regarding the usage of JDBC VS OR-Mapping Tools, the architect being an oracle & database person had the comments as below.

"Personally, my view is that JDBC is essentially flawed. JDBC gives the java developer the ability to make 'direct' calls to the DB and to deal with the data at a low level. Something that most Java developers definitely don't want to do. My view is that we should have some sort of O/R tool that exists in the DB, and operates such that the Java tier can make a call to it as a Java class and return data asif it were in data land, but where the actual mapping of Relational to Object is at the DB level. Thus you could have a common standard interface between the DB and the java tier, but allow each vendor to fine tune the actual O/R mapping based on the strengths of their database product."

I am not able to produce a proven test case or arguments on the above any pointers on how to defend such statements in future.

Thanks
Tom Kyte
October 03, 2007 - 2:41 pm UTC

no, the only problem with jdbc is it permits the application to use insert, update, select and merge when the only thing it should be able to do is CALL() and invoke a stored procedures that a database professional that understands locking, concurrency controls, data integrity, multi-user access, SQL, scalability and performance has written.

I do not think your architect is really a database person - they might now a little sql, but if they are saying that, they really are not a database person.

In a small way, if you use the object relational views and jpublisher - you can get a tiny way towards what they speak of, but....

You know - data processing is data processing. Java is *just a language*.

David Horton, October 03, 2007 - 10:40 pm UTC

Personally, I've not used ORM tools to any great extent but I've worked in organisations alongside projects that do. I would consider using ORM for simple CRUD functionality but probably nothing more. It can help speed up development of basic functionality but introduces many complexities of its own.

Once the data access becomes anything more complex than a single DML statement that can be naturally encapsulated within a single transaction (a la CRUD) I think that stored procs are generally the way to go.

For example, an issue which I had to help resolve recently, is a fairly standard use case in many applications; finding the next piece of work to process, and returning it for processing. The app then updates the status to indicate it is in progress and merrily goes about doing whatever it needs to do with the data. We might use SELECT FOR UPDATE to serialize the access to the row requiring processing so we don't pick up the same row more than once.

Now, in a usual multitiered application, with connection pools and such like, there's no guarantee that the call to perform the update will use the same connection as the initial select. The lock won't magically move across connections, so even though the app developer thinks they're doing the right thing, there may actually be the possibility of two client processes grabbing the same row. I believe some application servers have mechanisms to identify things such as SELECT FOR UPDATE and act appropriately but that's then another piece of added complexity, and something not necessarily provided by all vendors.

So we may have a hard to track down error, that doesn't always occur, and which the developer thinks can't happen because they did the right thing...

In this case, we can fix the issue easily by wrapping the work in a stored proc that does the select for update, updating of status and returning the unit of work in a single logical unit which the DB will guarantee is integral.

This is just one very simple example it just makes sense to do it in the database. It's quick and easy to do and there is no need to worry about any of the what if's.

Incidentally, this also applies to straight JDBC calls, not just ORM. Anything more complex than basic CRUD has to be coded very carefully to ensure integrity. And as Tom often says, applications come and go, the data is what organisations rely on.

I think that middle tier (be it Java, C, php, xyz, abc, 123 etc etc) developers need to have a greater appreciation of database details and can't consider it to be a black box. Equally, database developers need to understand the needs of the app developers to figure out why they are trying to do something to suggest 'better' alternatives. Like in any organisation, a silo effect leads to inefficient communication at best and at worst.....

ORM and data set processing

Loïc, December 20, 2007 - 6:12 pm UTC

I currently work on a really big new project (back office application recoding in J2EE technologies)... big because the development started since 2001 (with up to 100 developers) and production full deployment is only for 2008 (well 5% of transactions managed since 2.5 years by the new project but *95%* still by the old one).

However, one thing I have remarked is that though self-proclaimed-scalable Java processes have been developed using Hibernate, the thing is rewriting them (one after the other) in pure SQL (mono threaded data set processing) *strangely* boosts performances from x7 to x50 (and above).

Although Hibernate was described as THE solution to scale up, it ended in poor performing SQL because of:
- updatint all the columns (but the PK) of one table when really updating only one column (although dirty checking is available),
- not understanding JDBC batch processing and so Hibernate batch processing,
- not configuring row prefetch size properly (in fact having no knowledge about a critical component: the network between Java and the database),
- running tons of selects to map rows to Java objects with respect to object integrity (that is to be select all columns, no light ojbect) with all relation (1..n, n..m) in order to update some fields and repersist them in the database *in parallel*.... instead of set processing (one simple update...),
- mapping hard to maintain relationships between tables to objects thus producing the well known "n+1 selects" or the famous-Hibernate-problem shoud I say (note: with very complex relationships, I sometime see the very same object -- or row -- being fetched up to 6 times while loading a parent of this object),
- thinking that scalability means parallelize at maximum a process in Java without even thinking a second of the impacts in the database... What is the relation between Java scalability and the database??? (ouch)
- putting everything in the object cache (known as second level cache) without thinking that RAM is not infinite (especially for 32bits servers) and thinking that the garbage collector is powerful, not subject to bugs and can't slow down the application during a major gc of a 2Gb heap
...

Clearly SQL is not easy to understand (CBO, statistics, contention, locking, I/O...) but adding a layer above all of that and deciding that it would be easier from now on is certainly a mistake. I would prefer to create a new layer over SQL once I know SQL and if I'm sure it adds something absolutely necessary.

As a final word I'm not saying that ORM is bad, it is a tool and I know no tool being bad. Only the person using a tool can produce disasters.

My two cents
Tom Kyte
December 21, 2007 - 2:38 pm UTC

I was with you until....

Clearly SQL is not easy to understand

:) I would say

Clearly Java is not easy to understand

myself....

Thanks for the Hibernate critique

Stew Ashton, December 22, 2007 - 4:25 am UTC


Tom, perhaps Loic meant that to use SQL and Oracle properly you have to be willing to learn; whereas some think Hibernate = never having to know anything about the database.

Some things are "hard to understand" because there is a lot to learn.

Others are "hard to understand" because they don't make much sense.

Loic's critique makes me think that Hibernate belongs in the second category. I plan to use his remarks at work, so thanks.

Crud

Vikas, June 23, 2008 - 8:37 am UTC

In my schema only database packages can select , create , update or delete the data. Is their any automated way to prepare the CRUD matrix without going in and searching in each of the database package ?
Tom Kyte
June 23, 2008 - 8:42 am UTC

huh? not sure at all what you are asking for there. I know what CRUD is - Create, Read, Update, Delete. But only Neo knows what the CRUD matrix is.


(how much do I hate CRUD packages.... what a waste of keystrokes. A transactional API - very nice. A table API - waste of keystrokes...)

http://asktom.oracle.com/pls/ask/search?p_string=transactional+table+api+xapi+tapi

CRUD

Vikas, June 23, 2008 - 10:52 am UTC

Thanks Tom .
I was just wondering if I can get the list of package/s.

1. That selects some data from the table a.
2. That inserts the data in table a.
3. That deletes the data in table a.
4. That updates the data in table a.

by using some dictionary view. That will save me going into the each package code to find out this information.

Tom Kyte
June 23, 2008 - 1:07 pm UTC

that is not recorded by us anywhere (and if it were, it would be woefully incomplete, we don't know what you insert/update/delete as a matter of fact). that should be in your design documentation, should be a snap to pull that up (he says somewhat knowing that there isn't any documentation, or notes...)


Here's a laugh...

Mark, March 17, 2011 - 3:29 pm UTC

I asked to review some OpenJPA-generated DDL (not for Oracle, but for a database with sequences that work just like Oracle's), as another developer has been using it for his web app...I knew I was going to see some things I didn't like before I even opened the file. I took a deep breath then look what was staring me in the face. It took all my strength to refrain from cursing really loud.

CREATE TABLE openjpa_sequence_table
(
id smallint NOT NULL,
sequence_value bigint,
CONSTRAINT openjpa_sequence_table_pkey PRIMARY KEY (id)
)