
Question and Answer

Tom Kyte

Thanks for the question, Ganesh.

Asked: January 05, 2004 - 11:06 am UTC

Last updated: April 17, 2014 - 5:27 pm UTC

Version: 9.2.0.3.0

Viewed 1000+ times

You Asked

Hi Tom

We have many production databases on numerous production servers, running well and catering to various business needs. Now there is a need to design and build a web-based application, using a Java/J2EE architecture, that needs information from these multiple databases. We are discussing many options for achieving this.

We are divided into two groups of techies -

1. Since the data sources are homogeneous (all Oracle), one group proposes using Oracle database links to get the data from the multiple databases, which would give better access control and concurrency, and is highly transparent to the application.

2. The other group opposes creating any database links and would rather perform the joins and capture the data at the Java connection-pool level, letting the application do the work. In this way, tight integration with vendor-based software is completely avoided.

Well, I would like to get your thoughts on this. I would like to know the best way to design and build a web-based application that uses multiple data sources (in this case, homogeneous Oracle data sources).

Your views on this are much appreciated.

thanks
Ganesh

and Tom said...

the people in group 2 want to write lots of code and have job security for the rest of their natural lives. They will never actually finish and requests for new functionality will take 10 times the amount of time it should. The end users will be wondering why this technology thing is so hard. They will make us all look bad.

the people in group 1 want to do it in the most cost effective, fast method. They'll also be able to support "new technology 532143" which will come out next year. Java/J2EE is "new technology 532140" and is just a language.

I quit my last job in 1993 because of people like group #2.

How the heck is pretending you have one Oracle database instead of N going to give you "tight integration with vendor based software" -- whatever that means. (well, I know what it means -- it means you are actually USING what you paid for, imagine that! it means you are not reinventing an optimizer, a join engine, a database. imagine that! it means you can actually use the advanced features of the product you paid lots of money for)

Would these people invent their own steering mechanism simply because they drive more than one car?

I'm sure this will generate a little bit of discussion.


Comments

Architecture Design Question

Ketan Bengali, January 06, 2004 - 11:40 am UTC

Case 2 has many problems. If you don't use database links and try to implement everything in J2EE, then performance will be a nightmare. All the query screens would need a message saying "Please come back tomorrow to see the result...".

Consider a case where Query1 from Database1 returns 300K records and Query2 from Database2 returns 10,000 records. The final output you want to return to the user is the join of both queries, which should return only 1,000 records in the end. If you use a DB link, then Oracle will do all the optimization and return those 1,000 records in the shortest possible time.

If someone in your group starts writing this code himself, he will fetch all 300K+ records into server memory and paging will kick in. This will kill the entire system's performance; all users will suffer. If your application is heavily data-centric, then implementing the joins at the connection level is not the right choice. Implementing them at the J2EE connection level may make sense only when no select returns more than about 5K records.
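To make the scenario concrete, here is a sketch of the kind of cross-database join in question (all table, column, and link names are hypothetical). With a database link, the join is expressed in one SQL statement and the optimizer decides where to execute it; the driving_site hint can push execution to the remote site when most of the data lives there:

select c.cust_name, o.order_total
  from customers c, orders@db2_link o
 where c.id = o.cust_id;

-- ask the optimizer to execute the join at the remote site:
select /*+ driving_site(o) */ c.cust_name, o.order_total
  from customers c, orders@db2_link o
 where c.id = o.cust_id;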

unfortunately case #2 usually wins

Ryan, January 06, 2004 - 11:55 am UTC

There tend to be a lot more software engineers than database people on projects, so they tend to win these arguments on pure numbers. Also, I've noticed that upper-level management tends to come out of the software engineering ranks a lot more than out of the database ranks... so they are biased.

I've also noticed that a lot of software engineers get stuck in cliches. They repeat the same tired cliches, but never test them. They assume that cliches have to start somewhere, and that they wouldn't be cliches if they weren't true. So they don't feel the need to test whether they're accurate.

Anyone else notice this?

Tom Kyte
January 06, 2004 - 1:07 pm UTC

warning -- opinion coming up

I beg to differ.

What you describe is "a lot more coders". I hate that term "engineer" -- where is their certificate?

You know, when I drive onto a bridge, I am fairly sure that I will get off the other end. Why? Because a certified, trained engineer designed and built the thing.

It is funny -- I was at a "panel conference" recently -- a "think tank" sort of thing. I proposed that software developers be trained and certified just like a civil engineer, and used the bridge analogy. The "artist" from another company (a pure think tank) took exception to this: "Software developers must be free to create, to be like artists". The punch line -- there was a note taker, using Word, on a projector so we could see the notes. Word GPF'ed at that point. Touché, as they like to say.

These are not engineers, they are coders. In the past, they were people who took specs from analysts and pumped out code. Somewhere along the line, we lost "analysts" and just have coders.


Now, don't get me wrong -- I started life as a coder, who got specs from an analyst and implemented them. I'm not really a DBA beating up on developers (far from it).

A reader, January 06, 2004 - 12:16 pm UTC

WOW!! Interesting ...

I'll drink to that!

Anonymous, January 06, 2004 - 12:19 pm UTC

I am working with developers who think the database is a data dumping ground. Something the company paid a lot of money for.

What we need is licensing...

Jim Nasby, January 06, 2004 - 12:28 pm UTC

I swear we need to enact a law whereby only a licensed database architect/developer/admin can touch a database in any way, shape, or form. Anyone without said licensing would have to follow a database professional's exact instructions regarding data access and architecture. It would save so many people so much time and money.

(In case you can't tell, I'm being sarcastic. Sort-of.)

Query, Security should be applied at database

Ketan Bengali, January 06, 2004 - 1:32 pm UTC

We were evaluating a J2EE product from a vendor who implemented all security, filtering, and sorting in the J2EE architecture. The vendor also came out with a unique data model whereby all objects were stored in just 4 tables as name/value pairs. He claimed there was no need for DBAs or database developers, and that everything was transparent to the development team.

The system was running great on the demo machine, where he had barely 5,000 records. Then I asked him to load 400K+ records and set up security in such a way that each user gets only 100 records out of the 400K+. That is when we could see the performance difference: the JDBC connection was fetching all 400K+ records into server memory and then returning 100 records per user request. The same happened with filtering and sorting. In the end, the vendor realised his mistake and moved all this functionality back into the database to improve performance.

The J2EE architecture works very well with smaller numbers of records, and when seamless integration with external applications is a must. It fails when we unnecessarily start putting everything in J2EE.

Fortunately, I was able to convince our in-house J2EE experts that our system doesn't need such integration, and we decided not to go with such products.

Now we have a fine-tuned application whereby JDBC fetches only what it needs (i.e. no more than 100 records per request). All the optimization, security, and tuning are applied at the database level, and response time is less than 2 seconds.
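The "no more than 100 records per request" pattern Ketan describes is typically implemented as a top-N/pagination query, so the database, not the middle tier, discards the unwanted rows. A minimal sketch with hypothetical table and column names, using the classic ROWNUM pagination idiom:

select *
  from ( select a.*, rownum rnum
           from ( select order_id, order_date
                    from orders
                   order by order_date desc ) a
          where rownum <= 100 )
 where rnum >= 1;

Only the first 100 rows of the ordered inner query ever leave the database; the next page is fetched by shifting the rnum/rownum window.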

Regards,

Ketan.

Truth is in the middle

Alberto Dell'Era, January 06, 2004 - 5:12 pm UTC

Let me share with Ganesh the experience we had on a big project that used your very same environment - J2EE on top of a lot of Oracle databases. I'm sure everything here would get the "Tom's Seal of Approval", since it's IMHO an example of the best practices he has always advocated (stress on IMHO).

We put all the sql statements in packages, something like

create or replace package interface_pkg as

    procedure get_orders (p_orders out sys_refcursor);

end interface_pkg;
/

create or replace package body interface_pkg as

    procedure get_orders (p_orders out sys_refcursor) is
    begin
        open p_orders for
            select ...
              from customers c, orders@db_link o
             where c.id = o.cust_id
               and c.status = 'ACTIVE';
    end get_orders;

end interface_pkg;
/

and we had all the J2EE developers calling the procedures to get the ref cursor (a ResultSet in JDBC terminology) and thus the rows. We did the same for inserts, updates, etc.

The beauty of this arrangement is that it's perfectly database-independent even though it uses PL/SQL; say you want to migrate to SQL Server or MySQL: all you have to do is modify the packages, which is a simple and mechanical task (I know because we did it, actually migrating from SQL Server to Oracle). You must examine the SQL anyway, since (as Tom explains beautifully in his books) different databases have different locking mechanisms, so even if the SQL is syntactically compatible, it is usually semantically incompatible - so it's much, much better to have all of the SQL in one place (the packages) than to have it spread across different .java files.
This should satisfy the "database independence" requirement.

But now that all the SQL is in the database, it's also close to the data, so you can take advantage of Oracle's unmatched performance, using the DATAbase to crunch the data. Try doing the same join above as a nested loop in Java - will you (1) first get the ACTIVE customers and then the orders, or (2) the opposite? It depends on how many ACTIVE customers and orders you have, of course... What if today (1) is optimal and tomorrow (2) is optimal - will you change the Java code? If the join is in SQL, Oracle will automatically switch plans for you. I know because it happened to us - and I'm sure that many other times it happened without our knowing.

Or, say you notice that the network is too slow - you create a local copy (snapshot) of orders and simply change 'orders@db_link o' to 'orders_local o', without touching a single line of Java. The same applies if you need to optimize or hint the SQL. I know, because it happened to us.

But the most interesting thing that happened, and one we only partially predicted, is that this 3-layered physical arrangement (Java using packages using the tables) fostered a division of labour between the Java developers, the database developers (who develop the packages), and the DBAs. In hindsight, I think that was the most important factor that drove our project to success, since everyone worked in his or her preferred technology, so:

a) the J2EE developers played with beans, XML parsers, Struts, etc., happy to leave the "boring SQL" to the database developers;
b) the database developers enjoyed and exploited the Oracle features;
c) the DBAs were more than happy to receive high-quality SQL code produced by layer (b), and to sleep at night instead of chasing the root cause of the latest inefficiency/data loss/deadlock.

Please note that I'm presenting the experience of a *real* and *successful* project, and that is the real value of my contribution - it worked, and worked perfectly. I know the dangers of "religion wars", which can be peacefully avoided by simply looking at facts - in this case, past experience.

I wish you peace ;-)
Alberto

Ganesh Sundaresan, January 07, 2004 - 7:43 am UTC

Folks

Thanks for the practical advice; it is very helpful. This was the kind of feedback I was waiting for.

Ganesh

as is

Alberto Dell'Era, January 24, 2004 - 12:16 pm UTC

> I proposed that software developers be trained and certified
> just like a civil engineer. Used the bridge analogy.

What about declaring void the "AS IS" disclaimer of software licenses - wouldn't this achieve the same goal? If the bridge collapses, the manufacturing company pays for the damage; hence they employ only certified civil engineers... or expert ones, at least!

Database independence is a myth - once again

Sergey, May 06, 2004 - 6:40 am UTC

1. I just read the respective section in Tom's 'Effective Oracle...' where he shows, with examples, that different DBs have different semantics, and that abstraction layers like ODBC therefore do not make you 'database independent'.
What I discovered is that these abstraction layers not only hide (read 'disguise') the differences between, say, SQL Server and Oracle; they quite often introduce different behavior against the very same DB.
I have worked with MS ADO against an Oracle DB for a few years and have found numerous examples of how one and the same piece of code behaves very differently with different providers.
The same [very basic] feature can
- work fine with the Oracle OLEDB provider,
- return a catastrophic failure with the ODBC provider,
- report 'not supported' with the MS provider for Oracle.
So instead of DB independence you are stuck with OLE DB provider dependence.

2. About certifications
It's a bit of fun to catch Tom in contradictions.
Here he angrily downgraded software engineers to 'coders' because they do not have certificates. In another thread he admitted that neither he nor his employees have certifications and that he does not care. What is really on your mind? (Well, I guess I know, but it would be interesting to hear some confessions :)

BW

Tom Kyte
May 06, 2004 - 7:55 am UTC

not certificates for "Oracle certified professional"

I'm saying (in the book) that if someone wants to call themselves a software ENGINEER, a system ARCHITECT, whatever -- should they not have been subjected to the same rigor, the same qualifications, the same industry-wide base set of standards as a civil ENGINEER? (When I get on a bridge, I fully expect to get to the other side. When I open my word processor, I'm not so sure -- maybe it'll run until I hit save, maybe not. Believe me, word processors scare me to death, written as a guy who, at 2 in the morning, lost lots of stuff while writing a certain book.) Or an ARCHITECT? (It is rare for a building to just implode all by itself -- it generally takes catastrophic causes. How many systems have you seen just "implode" under normal use?)


MCSE - bah, so what, you know some facts. Same with OCP.

did you learn how to construct a system with those?
did you learn how to architect something?
did they teach you about testing your systems?
how to scale them?
best practices for using them?
how to project manage?
etc.......


Currently -- I could go to college, get a degree in pottery and call myself a software engineer. Is that right?

The sad truth ...

Ditto, May 06, 2004 - 8:31 am UTC

If an architect built buildings like a programmer wrote programs ...


The first woodpecker to come along would destroy civilization as we know it ... :\


The Secret - Problem Solvers :-$

Long Time Reader, May 06, 2004 - 9:44 am UTC

Computer Science Engineers

Option #3...

J, May 06, 2004 - 10:25 am UTC

Tom, what do you think about using a messaging option (like AQ, etc.) with request/reply as a means to link differing databases together? At one client, they abolished dblinks (they had a boatload of instances - custom and COTS) primarily due to link management concerns, and moved to a messaging solution for database interconnectivity (of course the argument can be made that the management was simply transferred to the EAI team =o). Seemed to work pretty well.... Thoughts? Cheers!

Tom Kyte
May 06, 2004 - 1:15 pm UTC

I believe totally in loosely coupled systems -- for transactional processing. store and forward, don't do distributed.

but... in response to the question "which needs information from these multiple databases." -- messaging doesn't work so good unless you want to "replicate" and that's another whole ball of problems

and yes, they just moved the goalposts, they didn't make it any easier. we could centralize db links and manage them from one location too.

How about one database being an Integration hub.

Siva, June 28, 2004 - 3:58 pm UTC

In our organization we have DB-links going to various DBs. One of the solutions we tried is J2EE. We had some success with it.

The other aspect we are looking at is one database being the pass-through database for all tables that need to be viewed by multiple apps.

Example :

We have 10 databases which each publish data from 2 tables.

There are 20 DBs looking to DB-link to one or many of the publishers.

We want to evaluate the creation of an IntegrateDB. This will have DB links to the 10 databases. We would then have 20 views in the IntegrateDB.

The subscribers will link to the single IntegrateDB and, depending on security, will get access to any or some of these tables.

What is your opinion of such an architecture, and what sizing impact do we need to study for the IntegrateDB?
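A minimal sketch of the hub arrangement described above (link, account, and table names are all hypothetical): the IntegrateDB owns one link per publisher and one view per published table, and subscribers are granted only the views they are allowed to see:

-- in IntegrateDB: one link per publisher
create database link pub1_link
    connect to integrate_reader identified by integrate_pwd
    using 'PUB1';

-- one view per published table
create view pub1_orders as
    select * from orders@pub1_link;

-- security: subscribers see only what they are granted
grant select on pub1_orders to subscriber_app;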

Thanks
Siva

Tom Kyte
June 28, 2004 - 4:31 pm UTC

makes you wish distributed databases were never invented eh? (this is exactly why we are consolidating our own databases down into as few as humanly possible, we've whacked hundreds of databases internally - we now have more information)


If you are using the CBO (have stats) on all systems...
This might fly...

but it will not ever be called "fast" I would estimate.


Is it possible that a tiny bit of read only replication into this central database would be possible?

software engineers - off topic

bob, June 28, 2004 - 5:47 pm UTC

Tom,

you say software engineers should be as qualified/certified as bridge builders?

The Prof Eng. (PE) designation offered by www.nspe.org you refer to is just another test on some formulas commonly used in engineering. Ok, it's harder than the OCP, but still just facts. If Civ Eng can pass, it can't be all that hard. :)

And surprisingly, no other engineering discipline has the number of PEs that civil engineers do, and the other disciplines construct and build things that are just as important. It is just another marketing credential that the CE community adopted and requires, much like some employers look for the OCP. NSPE has the credential for chem/mech/elec, but they are nowhere near as widely accepted/required as for civil engineering.

It is definitely not proof that you can be the architect of a bridge. Civil engineers have been at bridge building much longer than the word-processor developers, and the CEs have just as many blunders under their belt as the MSFT software engineers.

How many bridges collapsed before they got the concept right?

http://www.civ.utoronto.ca/funstuff/disaster/default.htm

Software has an equivalent professional designation now also. Just not widely accepted and required like the PE for CE projects.

"Professional certification is now offered by the Institute of Electrical and Electronics Engineers (IEEE) Computer Society. To be classified as a Certified Software Development Professional, individuals need a bachelor’s degree and work experience that demonstrates that they have mastered a relevant body of knowledge, and must pass a written exam."

Is that rigorous? who knows?

Tom Kyte
June 28, 2004 - 7:17 pm UTC

right but.....

today, I can get a degree in philosophy and call myself a software architect and no one can argue with that.

try calling yourself a lawyer, doctor, CPA, or civil engineer without the paper to back it up and see what happens - there is a fairly high bar to be crossed.

we have no such bar. I remember being 25-28 and thinking I pretty much knew what I needed to know... some dozen years later, I'm always finding out how much I still *don't* know. I've become unafraid to ask really stupid questions ;)

consolidate several databases into one

CH, June 28, 2004 - 6:39 pm UTC

Tom,
I'm glad to see there is a data architecture design question. We have a database that came with a vendor application system, and several databases of our own for applications the vendor system cannot provide. None of these databases is big, and at a certain level they share data. We would like to consolidate these databases into one: all database objects that came from the vendor remain under their schemas, and our own objects will be put under new schemas. Do you think this is a good idea?

Thank you very much.

Tom Kyte
June 28, 2004 - 7:49 pm UTC

with 3rd party software you have to watch out for "version restrictions"

vendor A says "only 8174 and 9204"
vendor B says "only 9205 and 10.1.01"

so, you might not be able to have a single database - but fewer, yes, definitely.

One perspective from an Object-Oriented Sage

Duke Ganote, December 13, 2004 - 12:04 pm UTC

A citation from a well-respected text among object-oriented designers may assist in discussions with software developers.

Quick background: Our director encouraged us (both software developers and data architects) to take IBM's Object Oriented Analysis and Design (OOAD) Exam 486. I'm always game for broadening my perspective, and understanding the developer's viewpoint.

Our test prep included reading Craig Larman's "Applying UML and Patterns" (ISBN 0-13-092569-1). Naturally, I leapt to Chapter 34, "Designing a Persistence Framework with Patterns".

There Larman proposes "a single Pure Fabrication class...where all SQL operations... are consolidated." He asserts the following (familiar sounding) benefits:
(1) "Ease of maintenance and performance tuning by an expert. SQL optimization requires an SQL aficionado, rather than an object programmer. With all the SQL embedded in the one class, it is easy for the SQL expert to find and work on it."
(2) "Encapsulation of the access method and details. For example, hard-coded SQL could be replaced by a call to a stored procedure in the RDB...".

We (the data architects) are leveraging it. We're pushing for a "call to a stored procedure" as the standard for our directorate.

I should note, however, that Larman goes on to suggest an alternative: he proposes using a "more sophisticated metadata-based approach to generating the SQL....in which SQL is dynamically generated from a metadata schema description...". SQL generated from mere table-column-primaryKey metadata (as I interpret his sentence) is generic and untunable.

Tom Kyte
December 13, 2004 - 1:53 pm UTC

I agree with his first premise (segregate the code into specializations, let people do what they are good at) but not the last where you 'meta build' the SQL (on the fly probably). You lose all of the goodness mentioned previously ;)

how to go on?

A reader, December 17, 2004 - 12:21 am UTC

hi tom,

i have been asked by (and want) to support some J2EE-developers by providing the database needed for one of their projects.

my last chance to start development from scratch dates back to 7.3.x a few years ago. there we introduced some useful techniques i'd like to recommend again - such as hiding all tables behind views in order to maintain access restrictions (e.g. by including ... where <some_column> = <some_packaged_function_return_value>).

since we're now on 9i(R2), i wouldn't use that approach anymore to implement such features. i would go for contexts and FGAC/VPD instead to achieve the same. based on the experience we had, i would also establish appropriate PL/SQL interfaces (a non-TAPI approach) between the application server's routines and the database.
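for reference, a minimal VPD sketch of the kind alluded to here (schema, table, and context names are hypothetical): a policy function returns a predicate that the database silently appends to every query against the protected table:

-- the policy function: returns the predicate to append
create or replace function dept_predicate (
    p_schema in varchar2,
    p_object in varchar2 ) return varchar2
as
begin
    return 'dept_id = sys_context(''app_ctx'', ''dept_id'')';
end;
/

-- register the policy against the table
begin
    dbms_rls.add_policy(
        object_schema   => 'APP',
        object_name     => 'EMPLOYEES',
        policy_name     => 'emp_by_dept',
        function_schema => 'APP',
        policy_function => 'dept_predicate',
        statement_types => 'select,update,delete' );
end;
/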

i think we have to bridge a gap to be successful (e.g. how to design an efficient J2EE-PL/SQL interface): the J2EE developers asked for assistance, but i've no J2EE knowledge apart from reading some books and trying some samples.

therefore i would appreciate it very much, if you could point out any additional hints or comments, things you would focus on, you would do and - more important - you would never do, bugs, side effects to be aware of, most important sources of information and so on - as some kind of summarized overview or roadmap to follow.

thanks in advance.

Tom Kyte
December 17, 2004 - 7:50 am UTC

I'm not sure what roadblocks you anticipate. I cannot see how a <any language>-to-PL/SQL API that takes the database from state A to state B could be "non-efficient" -- as long as the PL/SQL is well written.

forget that it is j2ee for a minute, just pretend it was sqlplus. or VC++. or VB. or straight java/jdbc (which many j2ee apps devolve into, using just jsps or something). or C. or Perl. or <any language>

if you have a plsql api, there you go. transactions are us. efficient, clean, done.

DB Links vs RAC

Bob B, December 17, 2004 - 8:34 am UTC

I think this is along the same vein:

Let's pretend the original poster has 3 applications on 3 databases on 3 machines. Let's also assume all the machines and databases are homogeneous. Which do you think would be better, performance-wise (from a purely database perspective)?

1) DB-links to access the database on the different databases

2) Creating a RAC out of the three machines and running one database instance for each.

Tom Kyte
December 17, 2004 - 2:39 pm UTC



need more info.

do these databases share anything.


if not, then performance wise, you are done. manageability and cost wise -- you have maximized your mgmt cost and your software cost and your hardware cost -- but each application, totally walled off physically from the other will be "best", it owns the machine. so thumbs up for guestimating performance (only one factor to consider -- your application). thumbs down for ease of mgmt, for cost, and for overall availability of a single application (one machine down, one application goes with it)


if they do, then I'm a big fan of a single database. If you want to survive single machine failures -- clustering could make sense. If you want to decrease the cost of your hardware (making hardware technology insertion tons easier -- people will replace 30k of hardware in 3 years, they will not replace 300k of hardware in 3 years) -- clustering could make sense. Single database, single thing to manage to backup to do whatever. You have tools like resource mgr to slice up the resources, you can assign applications their own machine in the cluster for the performance side.


I'm not a fan of dblinks at all in an operational application. For data exchange every now and again -- perfection. For continuous all-day use, I wouldn't even consider it. Both machines must be up or neither is, and I explode the number of connections to my database (not only my local community but this outside community of dblink users)...



A reader, December 20, 2004 - 1:35 am UTC


Tom Kyte
December 20, 2004 - 8:17 am UTC

perfect, thanks for the intriguing thoughts and counter discussions. I really see your point. Well said

:)

A reader, April 25, 2005 - 9:29 am UTC

Database_2
---------
User_B------------------->User_A

____________________________________________

Database_1
----------
User_C------------------->User_D



Background:

The front-end team has a connection object pointing to Database_2/User_A. Very frequently we need to access the tables in User_D from User_A.

Since they are in different databases, we used DB links. But since we wanted some access control on the User_D objects, we came up with the above solution.

1.User_C (schema User_C) has synonyms for all the
tables in User_D.

2.User_B has views/synonyms for all the synonyms
in User_C using a Dblink.

3.User_A then accesses views/synonyms of User_B.



Problem:
------

Sometimes we have to copy data from Database_1 to Database_2.
For this I have created some SQL level objects and used them
in my procedure in User_D.

When I was trying to call this procedure, I was not able to access
those objects.

I had to create a local/package-level object in User_C and create a package which copies data from User_D to User_C, and then use the package in User_C from User_A via a DB link.

To me it looks like additional work. Is there a better approach for this?

Thanks in advance!!!

Tom Kyte
April 25, 2005 - 9:42 am UTC

well, how was the dblink created. Is it a proxy link (has a username/password coded in it) and if so, whose identity. If not, then the access control is done via the currently logged in user.


I'd need to see a concrete example, I'm not following you -- no error messages, I cannot imagine it happening on the "call" unless the procedure used dynamic sql and so on....


A reader, April 25, 2005 - 9:49 am UTC

Yes it is a proxy link.

I wanted to know if there is a better way of doing the same.
In such an environment with 2 databases and frequent need of data in one database to other, what approach do you suggest?



Tom Kyte
April 25, 2005 - 9:57 am UTC

with proxy links, I use the above approach.

A user is created.
They have the dblink (proxy) in their name.
They setup the views of the remote objects.
They grant on these views to the other local users.
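The four steps above might look like this in SQL (user, link, and object names are hypothetical):

-- 1) a dedicated local user is created
create user link_owner identified by some_password;
grant create session, create database link, create view to link_owner;

-- 2) connected as link_owner: the proxy link, credentials embedded
create database link remote_link
    connect to remote_user identified by remote_pwd
    using 'DB1';

-- 3) views of the remote objects
create view remote_orders as
    select * from orders@remote_link;

-- 4) grants on the views to the other local users
grant select on remote_orders to app_user;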

You haven't said "why it didn't work" yet though.

A reader, April 25, 2005 - 9:50 am UTC

User_B is accessing objects of User_C using a DBLink.
So identity is User_C.

Tom Kyte
April 25, 2005 - 9:57 am UTC

see above, what was the cause of "it didn't work"

A reader, April 25, 2005 - 10:52 am UTC

Thanks for your response!

I am getting following error:
PLS-00201: identifier 'obj1@LINK1' must be declared.

obj1 is in User_D and I am trying to use it from User_A.

Tom Kyte
April 25, 2005 - 1:26 pm UTC

user A is to use user_b's objects, not D's

DB Links: Performance

Nurlan, May 06, 2005 - 2:28 pm UTC

Hello Tom, thanks for your direction on system architecture. We have some issues with DB links and I would like to check with you. We recently built a data warehouse which is the backend for a vendor application; since we do not have much control over versions, and the DW has different recovery requirements from the applications and the application database that use its data, we split things into two databases. The applications need to use data from the DW (some not so big, <100,000-row lookup tables), so we created a DB link to it. As you see, we now have performance problems: the applications need quick response times, but this architecture does not give us the needed performance. What would you suggest?
We have considered some options, such as:
1. Replicate the data to the local DB (then we have to store the same data in two places);
2. Re-write the SQL to place driving_site or other hints;
3. Use stored outlines (which is basically the same as above).
What else could we use? Is there any Oracle instance-level parameter which sets a driving_site-like hint?
Additionally, you have a lot of experience with all kinds of systems. What is a valid architecture for systems which are distributed and have applications that need quick response times?

Tom Kyte
May 06, 2005 - 5:09 pm UTC

I use dblinks to copy data

I would not use them query by query -- in real time.

If these are lookups, sounds like a materialized view would be perfect.
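For a smallish lookup table, the materialized view suggested here could be as simple as the following sketch (table name, link name, and refresh interval are hypothetical):

create materialized view lookup_codes_mv
    refresh complete
    start with sysdate next sysdate + 1/24  -- full refresh every hour
as
    select * from lookup_codes@dw_link;

The applications then query lookup_codes_mv locally; the DB link is only exercised at refresh time, not once per query.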

Options for transferring data between 2 databases

A reader, May 24, 2005 - 2:01 pm UTC

Tom,
We are currently transferring data between 2 Oracle DBs through DB links and triggers. Triggers on the base tables in database DB1 perform the same DML operations on the base tables in DB2: if an insert is made into table1 in database 1, an after-insert trigger on that table does an insert into table1 in database 2 through a DB link. A similar procedure is followed for updates and deletes.
There is a plan to re-write this using Java messaging.
Source DB --> Topic publisher --> JMS bus (MQ, SeeBeyond...) --> Topic subscriber --> Destination DB
The topic publisher and subscriber are Java messaging objects.
The plan is to read data from database 1 into the Java messaging objects, publish the message to the JMS bus as a queue/topic, and have the topic subscriber pick up the message from the JMS bus and modify the data in the destination DB.

I would like your opinion on the best approach in general for transferring data between 2 databases.
1. When should db links be used?
2. When should a messaging system be used?
3. Is there a better approach than db links and java messaging?
4. Is there anything in oracle which provides functionality similar to JMS?

Would really appreciate your comments on this subject.


Thank you

Tom Kyte
May 24, 2005 - 3:55 pm UTC

I would have written this many lines of code:

<this space intentionally left blank>

why would you not just use replication? The database has both synchronous (what you have today) and asynchronous (what you are trying to get to) replication as a built-in feature??!?

(AQ has been in the database lots longer than JMS has been a specification; our JMS rides on top of AQ)

If you write this in code, you have done it wrong.
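(For illustration, a minimal AQ sketch using the built-in JMS text payload type; the queue names are hypothetical, the APIs are DBMS_AQADM and DBMS_AQ:)

```sql
-- One-time setup (hypothetical queue names).
begin
  dbms_aqadm.create_queue_table(
      queue_table        => 'emp_changes_qt',
      queue_payload_type => 'SYS.AQ$_JMS_TEXT_MESSAGE' );
  dbms_aqadm.create_queue( queue_name  => 'emp_changes_q',
                           queue_table => 'emp_changes_qt' );
  dbms_aqadm.start_queue( queue_name => 'emp_changes_q' );
end;
/

-- Enqueue a message; a JMS subscriber (or a PL/SQL dequeue) picks it up later.
declare
  l_msg   sys.aq$_jms_text_message;
  l_eopt  dbms_aq.enqueue_options_t;
  l_mprop dbms_aq.message_properties_t;
  l_msgid raw(16);
begin
  l_msg := sys.aq$_jms_text_message.construct;
  l_msg.set_text( '{"table":"emp","action":"insert","emp_id":2}' );
  dbms_aq.enqueue( queue_name         => 'emp_changes_q',
                   enqueue_options    => l_eopt,
                   message_properties => l_mprop,
                   payload            => l_msg,
                   msgid              => l_msgid );
  commit;
end;
/
```

Because the payload type is the JMS text message type, a Java subscriber can consume these messages through the standard JMS API with no extra bus in between.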

Options for transferring data between 2 databases

A reader, May 24, 2005 - 7:46 pm UTC

Tom,

1) If we go with the option of replication, would the hardware (space and CPU) usage be double the current usage?

2) Can you please list or point me to a URL that would give me top level picture of the different methods of data transfer within Oracle - like Streams, Advanced replication and other available methods.


Thank you

Tom Kyte
May 24, 2005 - 7:58 pm UTC

1) But what you have today is "do it yourself replication"; that is what would be expensive.

You have Streams (redo log mining) and Advanced Replication. Streams would be the way forward.


2)

https://docs.oracle.com
or
http://otn.oracle.com/pls/db10g/portal.portal_demo3?selected=3

streams guide
advanced replication

Architecture Design

A reader, May 31, 2006 - 6:26 pm UTC

Tom,

We want to build a static-data-centric application. We have fairly large data volumes. I don't know which type of application architecture we should select. We have about 20 users. It is an important app, but not critical, meaning it will not stop the business; still, the application must be very fast and cannot frustrate users with slowness. All data is in Oracle 10g R2 databases. Please share your ideas.

1.) Desktop application?
* (2-tier) client server?
Clients may be in the US, Europe or Asia; will it perform?

* (3-tier desktop) I have not seen many applications like this.

2.) Web application?
The only concern is that the application has a significant amount of data, and I am not sure the web browser (IE, as the users will use it) will be able to handle it.



Tom Kyte
June 01, 2006 - 9:47 am UTC

sounds perfect for APEX ( http://htmldb.oracle.com/
- the stuff I used to build this site)


I would not even give a moment's consideration to a client server application.

The volume of data is not relevant.  Amazon has a lot of data, Yahoo has a lot of data, Ebay has a lot of data, Google has a lot of data....

But they are reputed to be fairly successful.  Even using this web thing :)


You are limited mostly by the PEBKAC issue ( http://pebkac.net/ ) - they (the problem between the keyboard and chair) cannot deal with huge volumes of data; therefore, you will use good interface design techniques to make the "big bunch of data" meaningful to the poor human being.

meh "sw engineer"

Robert, June 01, 2006 - 10:31 am UTC

I also HATE the term "Software Engineer"... also, I think the term/title "Consultant" has been misused and "abused" for too long (since around Oracle 8.0).

Html db

A reader, June 01, 2006 - 2:45 pm UTC

Is HTML DB free? Or is it included with Oracle 10g R2?

Tom Kyte
June 01, 2006 - 4:04 pm UTC

it is a feature (comes with) of all editions of Oracle

Thanks,

A reader, June 02, 2006 - 2:19 pm UTC

I went to the HTML DB Oracle website; the samples and previews there are not production quality. Your site is totally different from those samples/previews.

So, how much expertise do we need to get real-world work done with this product?

Tom Kyte
June 02, 2006 - 3:08 pm UTC

why are they not "production quality"?

Did you know that HTMLDB is written entirely in... HTMLDB? That is sort of production quality, don't you think?

It takes a very short amount of time for an experienced developer to get up to speed (hours/days, not months/years)

btw

A reader, June 02, 2006 - 2:21 pm UTC

By the way, on your web site, on the upper-right side, what does the Home tab/label do? It is not clickable!!

Tom Kyte
June 02, 2006 - 3:09 pm UTC

sure it is - I just clicked it (it is grey when you are on a different tab set).

You see, I have two tab sets - home and admin :)

You don't get to see admin

Communication between oracle dbs/j2ee applications

A reader, June 28, 2006 - 6:51 pm UTC

Tom,

I don't seem to have a clear picture on the approach to take on the subject topic even after going through all the excellent discussions above.

We have about 5 different Oracle transactional DBs running different J2EE applications.

What is the optimal method of intercommunication between these 5 different databases/applications? Different groups are giving different suggestions. I have put in the options being considered. I would appreciate your detailed comments or rules of thumb (if any) on the approach to be taken.

1) Create dblinks, synonyms and triggers between the databases.
2) Use SeeBeyond to publish and subscribe messages.
3) Create web services


Thank you


Tom Kyte
June 28, 2006 - 6:56 pm UTC

All 3, depending on the circumstances.

If there were a universal "always the best", the others quite simply would not exist.

So, insufficient data to say.


But - for example - say you needed to have a read only reference copy of a table that is written to by one database in two of the other databases, then likely - dblink + materialized view.


For example, say a business transaction in database 1 needs to initiate a business transaction in database 2 and then database 3. Perhaps "seebeyond" (or just the already purchased AQ and JMS you own in the database) to build a loosely coupled system would be appropriate.


For example, say a web page generated from database 1 needed to branch to a transaction on database 2 or database 3. A web service might be an easy way to accomplish that.

Communication between Oracle DBs/J2EE apps

A reader, June 28, 2006 - 10:19 pm UTC

Tom,

>>purchased AQ and JMS you own in the database

Can you please provide additional details on the above statement. I know that there is AQ in the DB.

How and where is JMS present in the DB?

Thank you

Tom Kyte
June 29, 2006 - 7:19 am UTC

Our JMS implementation uses AQ to persist the messages.

Database Replication

Jose, June 29, 2006 - 8:02 am UTC

Our development team are planning to integrate the data on our management applications database A with another database B for Portal development.
They are planning to create materialized views on database B based on tables on database A. As they are going to implement triggers on their materialized views they need to create also snapshot logs on the tables in database A that they use for the materialized views. They say that this way the triggers will be fired properly when the views will be refreshed.
I personally don't like this very much because the snapshot logs on database A involve a lot of tables and will affect the performance of the write operations. As they are the development team they don't want to wait to do tests in production before apply this changes to see how much will affect.
The data model in database A is designed for a third party and can change.
Is there a better way to do this.
Thanks.



Tom Kyte
June 29, 2006 - 9:25 am UTC

I don't understand why triggers are going to be created on the materialized views?

And the existence or lackthereof of triggers on the materialized views would not affect the need for materialized view logs - the decision to do incremental changes only refreshes OR full refreshes would drive that decision.

And if they are putting triggers on fast (incrementally refreshed) refresh materialized views in hopes of being notified when updates happen and such - they shouldn't. They have no idea, and no control over how we maintain the materialized view - it would be a shaky proposition at best.
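(For reference, the moving parts under discussion, with hypothetical object and link names. The materialized view log on database A is what enables fast, incremental refresh on database B; it is needed regardless of any triggers:)

```sql
-- On database A (the master): record row-level changes for incremental refresh.
create materialized view log on emp with primary key;

-- On database B: a fast-refreshable copy pulled across a db link.
create materialized view emp_mv
  refresh fast on demand
as
select emp_id, name, job_id
  from emp@dba_link;
```

How Oracle applies the logged changes during a refresh is internal and unversioned, which is exactly why a trigger on emp_mv is a shaky place to detect "what changed".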



J2EE and oracle db access

jdam, April 14, 2007 - 3:29 pm UTC

Tom,

We are building a new J2EE application and I want to know the best way to access the database: 1) direct DML, 2) an object-oriented layer, or 3) an API that hides the access implementation - and which option has better performance?
Tom Kyte
April 16, 2007 - 10:39 am UTC

I like #3 - go with a transactional API.

easiest to implement
easiest to tune
easiest to manage

J2EE and oracle db access

Jdam, April 16, 2007 - 8:52 pm UTC

Tom,

Could you please give me some example code showing how you would implement the transactional API? For example, if the J2EE application needs the result of the following query: SELECT X,Y,Z FROM t1,t2 WHERE t1.a=t2.a .... What would the transactional API look like?

If we use a transactional API, could the PL/SQL engine impact the performance of the query?
Tom Kyte
April 17, 2007 - 9:51 am UTC

the "plsql engine" has no bearing on the runtime performance of a query - they are entirely separate entities here. PLSQL might open the cursor - but a cursor is a cursor is a cursor.

In my experience, the largest performance inhibitor in this sort of environment has been..... well..... poorly written java code that interacts improperly with the database because the coders writing java know lots about java and just about ZERO about databases....

You would use ref cursors to return a result set to java - you would use named formal parameters for all inputs and outputs - just like any other language.

create or replace package my_pkg
as

   procedure some_api( p_parm1 in <datatype>,
                       p_parm2 in <datatype>,
                       ....
                       p_resultSet in out sys_refcursor );
.....


http://asktom.oracle.com/tkyte/ResultSets/index.html

the eBay architecture

A reader, August 14, 2007 - 10:20 am UTC

Tom,
can you comment on their implementation, thanks:

move CPU-intensive work out of the database layer to the application layer: referential integrity, joins, sorting done in the application layer

http://www.addsimplicity.com/downloads/eBaySDForum2006-11-29.pdf
Tom Kyte
August 20, 2007 - 11:11 am UTC



I think you already know what I'd say.

first, you cannot do referential integrity or entity integrity in your application without SERIALIZING in the database (eg: using the LOCK TABLE command) - and that is *not* very concurrent, is it. If you do it in the application, you will either a) do it WRONG and 'sort of fast' or b) do it RIGHT and really slow. The only thing capable of getting it right and really fast: the database.

joins - DATABASES WERE BORN TO JOIN. You really want to nail the database? Make it really slow? Go ahead - do a "do it yourself nested loop join". You end up running MORE sql (which increases, not DECREASES, load); you end up asking the database to do MORE WORK (not less).

sorting - well, we are pretty good at that. Consider that in a transactional application - most of the time, well, we can actually *RETRIEVE DATA SORTED* via an index.



I think that if you do not know very much about databases, you might believe you are being "nicer" to the database by not asking it to do stuff - but you would be "incorrect" in your belief.
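(A concrete sketch of the contrast, with hypothetical tables: the "do it yourself nested loop join" issues one query per row, while the database does the same work in a single statement:)

```sql
-- Anti-pattern: N+1 queries, one round trip per order row.
begin
  for o in ( select order_id, cust_id from orders ) loop
    for c in ( select cust_name from customers
                where cust_id = o.cust_id ) loop
      dbms_output.put_line( o.order_id || ' ' || c.cust_name );
    end loop;
  end loop;
end;
/

-- What the database was born to do: one statement, one optimized join,
-- free to pick a hash join, merge join, or nested loops as appropriate.
select o.order_id, c.cust_name
  from orders o
  join customers c on c.cust_id = o.cust_id;
```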

the eBay architecture

A reader, August 15, 2007 - 10:34 am UTC

hi Tom, you may not be interested, but I'm curious how they could achieve this. Are they doing joins in the "nested loop" way?
Tom Kyte
August 20, 2007 - 7:16 pm UTC

probably, a horribly inefficient nested loops way

or

they read it all out, stuff it into arrays and do it without any sort of concurrency or data integrity (stale data)

Tuning SQL running on Remote

A reader, August 20, 2007 - 12:46 pm UTC

Hi Tom

How do you tune an SQL running on a Remote DB (via DB Link) ?

Explain Plan mentions only "Remote Execution", but doesn't say how the data was accessed on the remote DB.
Tom Kyte
August 22, 2007 - 10:38 am UTC

give an example and we'll talk through it.

the eBay architecture ... how they got away with it!

Gabe, August 20, 2007 - 2:08 pm UTC

To: A reader

Hard to say if that was an achievement.

Look, they started with a concoction of files and GDBM until they maxed out ... then they migrated "the items" to an Oracle database ... they did not re-develop their platform for a relational database. They never did ... they remained at GDBM functionality against Oracle! That was the opportunity ... it would've taken longer to re-write the code than to just insert "items" into tables and they took the quick route. After that, they got really big and doing it properly could never have been a viable option.

They were always in catch-up mode ... business came pouring in (apparently people like buying and selling ____) and the only way they could scale, without re-development, was to scale out: buy more h/w. The only way they could keep adding h/w was by functional decomposition of their application. And so on ... they always had to respond to the scalability issue ... and apparently they were able to respond successfully, meaning they managed not to collapse under sheer weight. That was the achievement.

Them presenting "Architectural Lessons", just because they got away with it, is ...
And how many didn't get away with shabby architectures? Do you think they're presenting at conferences?

"Striking a balance between site stability, feature velocity, performance, and cost."

How do they know what the costs and performance would've been had they re-developed for a RDBMS in 1998?

If someone raises a building and keeps adding props to it as new floors are added on top ... would you call that an architecture? That paper describes how they evolved, the challenges they encountered and a description of their particular system ... but otherwise it is hardly an architectural blueprint of how one should build against a relational database.

Not all real-estate developers, successful at expanding the suburbia, should be in charge of city planning. And as for them deserving a Pritzker prize ... oh well.


but some good ideas

A reader, August 25, 2007 - 5:29 pm UTC

At least they said:

"Extensive use of prepared statements and bind variables"

And in conclusion?

Richard Hall, October 01, 2007 - 11:24 am UTC

There has been a long running debate at work about how to 'join databases together'. We have a number of different databases (for historical reasons) each supporting their own application. However, they share certain information e.g. user account data. Currently the databases are 'joined' by the PHP application software. I have long argued for using db links as a way to 'join' the data together, arguing that the CBO will do it in a far more efficient manner than any application ever could. A quick search on the asktom website came up with the answer..i.e this article!

The discussions in this article have taken place over a number of years and I needed a quick summary of the conclusion for my work colleagues.

My summary of its conclusions goes as follows:
1) Should I use the application to 'join' different databases together?
Answer: definitely not, no way, nicht, never.

2) Should I use db links as part of my real-time application code?
Answer: not unless you absolutely cannot avoid it; by all means use db links to copy data via materialized views, or even PL/SQL that copies data from one db to another, but anything that relies on both databases being up at the same time as part of a real-time application should be avoided.

3) Should I use one of Oracle's replication technologies to copy and sync data between different databases, so that real-time applications depend on one and only one database?
Answer: yes, where possible.

4) Should I strive to keep everything on one database and use technologies like RAC to provide both processing power and resilience?
Answer: yes, yes and yes.

Comments?
Tom Kyte
October 03, 2007 - 3:18 pm UTC

nice summary, nail head has been hit by hammer.

Steve Landy, November 07, 2007 - 12:32 pm UTC

Hi Tom,

In regards to DB Links, we have a policy set in place that discourages the use of them. This was put into effect before my time and I have since been responsible for enforcing this "DB Link" policy against my better judgement.
We have a large number of shared database servers (245 to be exact) with dozens of applications residing on each, and we continually get requests to have DB links created between databases, either to read/write data (in real time) or to push/pull data (e.g. batch jobs) from one DB to the other. The DBs are housed locally and in remote locations (over the WAN).

What are some architecture best practices that I should recommend to my developer community for these different scenarios? Which technologies would you suggest be used in these circumstances (i.e. SeeBeyond (now JCAPS), BusinessObjects Data Integrator (ETL tools), DB links, AQ, Streams, etc.)?

1) Reading/writing data between databases locally (both real time & ad hoc)?
2) Reading/writing data between databases remotely (over the WAN) (both real time & ad hoc)?
3) Doing batch loads from local DBs to remote DBs?
Tom Kyte
November 07, 2007 - 6:14 pm UTC

using them "real time" - bad idea (makes everything less available, slower in real time)

using them in batch - in the background - great idea, under controlled circumstances.

1) I would question why you have two databases if they need to constantly use each other's data in real time. That is a mistake.

2) materialized views, streams for data sharing, once you've PROVED that these should be two different databases.

3) see #2

taylor otis, April 02, 2008 - 8:19 am UTC

do you know how long architectural drafters work during the week and on the weekends?
Tom Kyte
April 02, 2008 - 8:59 am UTC

not sure that I really care - it has nothing to do with anything here.

ROFL

Michel Cadot, April 02, 2008 - 10:16 am UTC

One pretty good laugh in this rainy day.
Thanks!

Regards
Michel

... further Architectural Design Question

Craig, April 08, 2008 - 4:33 pm UTC

Forgive this long question - I am working on a project to integrate several applications and would be grateful to receive any comments / suggestions on one of our design ideas;

We have a master Personnel/HR app (highly normalised data). We need a lot of data from this HR app integrated in a couple of other apps. This includes the core personnel record as well as master/lookup data. For example the master HR app has;

create table job_title (
 job_id    number
,job_title varchar2(25)
,constraint job_title_pk primary key (job_id)
)


create table emp (
 emp_id number
,name   varchar2(25)
,job_id number
,...    ...
,constraint emp_pk primary key (emp_id)
,constraint emp_job_id_fk foreign key (job_id) references job_title(job_id)
)


As new job titles are added, we wish to integrate the data into the target application. As emps are added or updated, they integrate across as well. There are lots of lookup tables (job titles, grades, depts, locations, ethnicity descriptions etc.)

Our target app exposes web services: load_job_title(), load_emp(), etc. The target app has a similar data model.


Our idea is to build Mapping tables in an integration server whereby a call to the web service load_job_title() returns the target application's primary key of the new job title.

So we build a mapping table

create table map_job_title (
 source_job_id number
,target_job_id number
,job_title     varchar2(25)
,constraint map_job_title_pk primary key (source_job_id)
)


When a new job title is captured on the source app we;

l_target_pk := call_web_service('Aspiring Software Engineer!');
insert into map_job_title (source_job_id, target_job_id, job_title)
values (l_source_job_id, l_target_pk, 'Aspiring Software Engineer!');


Then when we capture a new EMP record on the source;

{ emp_id:2, name:John, job_id:205, grade_id:450, deptno:10, location_id:101 }

for each target application we can construct a record using the target application's foreign key values;

select target_job_id
  into l_target_job_id
  from map_job_title
 where source_job_id = l_source_job_id;

select target_deptno into ...;

call_load_emp_webservice (emp_id, l_target_job_id, l_target_depno, ..., ...);



BENEFITS
- We don't have to get each third-party vendor to build the same sort of logic internally in their API.
- We build the lookup mapping tables once on creation and then the data is there for re-use.
Alternatively, the target app's API could receive strings ('Developer', 'IT Department', '...') and have to look up the relevant foreign key value to load into its main emp table.

NEGATIVES
- Must be many?

Thanks for reading, if you got this far. And many thanks for any comment / critique / suggestion.
Craig
Tom Kyte
April 09, 2008 - 2:43 pm UTC

are you proposing to turn just any string submitted into an acceptable, valid value??

Craig, April 09, 2008 - 4:29 pm UTC

The values all come from a source app which is trusted as the 'owner' of the data.

For the master/lookup data we capture newly inserted and updated source data and integrate into the target - in many cases the target app will have a unique constraint on the real data (i.e. in the Job Title table there will be a unique constraint on the job_title attribute).

But many apps are third party. The services will have their internal validation. In other cases maybe there will not be any unique constraints on the target apps so we will need to manage data integrity in the integration server.

Is this what you were getting at in your question?

many thanks
Tom Kyte
April 10, 2008 - 10:16 am UTC

... The values all come from a source app which is trusted as the 'owner' of the
data. ...

you will end up with utter garbage; it'll take about one week or less, would be my guess. If the source app is trusted, it would already have the set of valid values and you wouldn't need them to be created "on the fly".

why even bother with a normalized "lookup" then - you have no data integrity here (not when you "make stuff up on the fly"). The lookup would be useful for when a job title is corrected or modified - which isn't going to work here at all.

Don't bother with the lookups in this case; you just have strings, strings that mean nothing really.

Oracle Database 11gR2 Architecture diagram

Prasad Mynumpati, April 16, 2014 - 7:01 am UTC

Hi Tom,

The official Oracle Database 11gR2 architecture diagram has a pointer running from DBWn to the redo log buffer. As far as the Oracle Concepts and architecture manuals are concerned, I have never come across a relationship between DBWn and the redo log buffer. What do you say? Does this need to be addressed in the diagram, given that it is the official Oracle database architecture diagram? Your comments are appreciated.

Diagram Link: http://www.oracle.com/webapps/dialogue/ns/dlgwelcome.jsp?p_ext=Y&p_dlg_id=9575302&src=7027600&Act=54&msgid=3-3138049431

Thanks,
Prasad Mynumpati.
Tom Kyte
April 16, 2014 - 4:31 pm UTC

I do not see a line from dbwn to redo in that download. please clarify.

i only see a line from dbwn to datafiles.

Oracle Database 11gR2 Architecture diagram

Prasad Mynumpati, April 17, 2014 - 6:20 am UTC

Here is the link shows oracle architecture diagram and my question about DBWR having pointer to REDO LOG Buffers.

https://www.youtube.com/watch?v=fmOLYrDkQ_Q
Tom Kyte
April 17, 2014 - 5:27 pm UTC

dbwr reads a ton of stuff in the SGA to figure out the state of database blocks. There are probably tons of MISSING arrows here.


in current releases, dbwr writes to the redo log buffer to indicate which blocks it has written out.


here is a discussion on this very topic

https://community.oracle.com/thread/862221?tstart=0

