de-coupling?
Duke Ganote, March 20, 2007 - 12:13 pm UTC
"99.9% of the time [database links are] a bad idea". I'd be curious what 0.1% of the time he thinks it's a good idea.
But let's separate the implementation issue of database or messaging technology from the concept of tightly-coupled or de-coupled applications. You can have tightly-coupled applications that use database links; if one of the several required databases is unavailable, the application is in trouble.
Or you can de-couple the application by picking one database, and using Oracle's messaging technology (AQ) or replication to copy the essential data to that database from the others. Then if one of them is unavailable, the application can still function.
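To make the distinction concrete, here is a minimal sketch, assuming a local ORDERS table and a CUSTOMERS table that lives in a remote CRM database reachable over a db link called CRM_LINK (all names hypothetical):

-- tightly coupled: every execution depends on the remote database being up
select o.order_id, c.customer_name
  from orders o, customers@crm_link c
 where c.customer_id = o.customer_id;

-- de-coupled: query a local copy maintained by replication or AQ;
-- the query keeps working even if the CRM database is down
select o.order_id, c.customer_name
  from orders o, customers_copy c
 where c.customer_id = o.customer_id;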
repeating this over and over doesn't give it meaning
Mark Brady, March 20, 2007 - 7:24 pm UTC
I continuously hear this from bus/messaging/queueing/SOA developers/salesmen/evangelists:
"You can have tightly-coupled applications that use database links; if one of the several required databases is unavailable, the application is in trouble."
If one of the databases the data is required to be in isn't available, your application is in trouble regardless of the "coupling".
"Or you can de-couple the application by picking one database, and using Oracle's messaging technology (AQ) or replication to copy the essential data to that database from the others. Then if one of them is unavailable, the application can still function. "
If I'm moving data from Database A to Database B, then saying that having A put the data in system C first, and letting C put it in B, makes my system "decoupled" is silly at best, dangerous at worst.
Not if the one that is unavailable is the one hosting the queue.
I think people see the data's journey as a physical one, as if the data has to walk from database A to database B. If B is unavailable then the walk hasn't started and it still has 100% to go or 0% complete. But if I contort the journey from A to B via C and B is down then at least the data can make it to C and that's 50% complete and only 50% left to trek.
C is just as vulnerable as B is, whatever you can do to C to make it "more available" can be done to B. Whatever C has to do to retry getting the data to B can be done in A.
Honestly, I think queues or message buses are great, in the right circumstance. But practitioners of these technologies try to force them into inappropriate circumstances with magical terms like Decoupling or Built-in Retry Logic or Certified Messaging or Complex Transformations.
I've seen a million records moved, in what is essentially a batch, between two Oracle databases by having C# code read one record from the first database and create one XML file on the filesystem; a bus system watching that directory reads the file, queues it, inserts it into the second database, and commits. All of this because someone was chasing the myth of "decoupling". So when the morning comes and 999,999 records made it and I'm asked what happened to the one that failed, I honestly answer that it will take poring through the 5 or 6 possible stops the record made on its way from point A to point B.
March 20, 2007 - 8:05 pm UTC
thank you..........
dblinks...
Russ, March 20, 2007 - 8:51 pm UTC
My office is extremely afraid of dblinks because once, a long time ago, the application froze because one of the databases it was connecting to using a dblink crashed (or simply became unavailable) in the middle of the transaction. Memories are hazy on the exact details but they remember a dblink was involved so it was used as the scapegoat. And, yes, they spent considerable time and effort to write a perl program to do the job of the dblink. :-(
March 20, 2007 - 9:09 pm UTC
you cannot write a perl script to do the job of a dblink, it is not possible.
dblink was for select
Russ, March 21, 2007 - 7:22 am UTC
Well, the dblink was only used in a select statement, so the Perl script copies the data from one database to the other every five minutes. I did not agree that writing the Perl code was necessary, but you can't win them all...
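For comparison, the db link version of that five-minute copy is a couple of statements; a sketch, assuming a link named REMOTE_DB and a local staging table LOCAL_COPY with matching columns (names hypothetical):

-- refresh the local copy from the remote table over the db link
delete from local_copy;

insert into local_copy (id, col1, col2)
select id, col1, col2
  from source_table@remote_db;

commit;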
The first time I've ever...
Duke Ganote, March 21, 2007 - 10:12 am UTC
I think that's the first time I've ever been lumped in with
"bus/messaging/queueing/SOA developers/salesmen/evangalists" and other snake-oil sales reps.
We're probably in violent agreement that
"queues or message buses are great, in the right circumstance". We just need to be clear on those circumstances.
"If one of the databases the data is required to be in isn't available, your application is in trouble regardless
of the 'coupling'."Depends on the data, so perhaps I should have qualified my original statement. Replication is one mechanism to provide de-coupling
http://en.wikipedia.org/wiki/Decoupling#electronics For example, we are migrating from a plethora of legacy reporting databases to one. As a step in the migration, we have conformed dimensions used in all of the databases. These are slowly changing dimensions. Rather than dblink several databases to one with the "master" conformed dimensions, we replicate the conformed dimensions. Thus, if database 'A' with the master copy becomes unavailable, the other databases have their own local copy, and processing continues.
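One straightforward way to implement that kind of replication is a read-only materialized view refreshed over a db link; a minimal sketch, assuming a link named MASTER_DB and a conformed dimension table DIM_CUSTOMER (both hypothetical):

-- local copy of the master conformed dimension, fully refreshed every hour
create materialized view dim_customer
  refresh complete
  start with sysdate next sysdate + 1/24
as
select * from dim_customer@master_db;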
There's more to it...
Michael Friedman, March 25, 2007 - 10:52 am UTC
AQ, for example, is just a messaging platform layered on top of a DBLink.
I'm not sure if your people who don't like DBLinks would consider that good or bad, and I'm not sure if the people saying that DBLinks are the best solution would consider them a subset of DBLinks or not, but I certainly consider AQ to be superior to a DBLink for many applications.
For example, I can send a Production Order from a merchandising team in Bangkok to a factory in Vietnam (real example) as an XML message. I know that I have guaranteed one time delivery, that the appropriate procedural logic will get called on the far end when that XML message arrives, and that as long as both applications stick to the same interface contract - that they follow our agreed on XML schema - it doesn't matter what other changes get made under the covers on either side.
I also have a much higher level administration interface for addressing, managing routing and message handling, and checking if messages have or have not been sent.
The last part is particularly important, since we're now having a discussion about giving suppliers a free application running on Oracle XE (perhaps even bundled on a cheap PC) to handle receiving POs and sending back packing lists, ASNs, and invoices.
So how about "DB links are fine, especially for point-to-point bulk transfers of table data, but if you want to move complex objects between several heterogeneous systems, in an environment in which versions (and even applications) will change and may not stay in sync, then a messaging layer will probably work out better in the long run."
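A rough sketch of the enqueue side of such a flow, assuming a queue named PO_OUT_QUEUE has already been created with DBMS_AQADM using an XMLType payload, and that AQ propagation is configured to deliver to the factory-side queue (queue name and document content are hypothetical):

declare
  l_enqueue_options    dbms_aq.enqueue_options_t;
  l_message_properties dbms_aq.message_properties_t;
  l_msgid              raw(16);
  l_payload            xmltype;
begin
  -- build the production order agreed on in the interface contract
  l_payload := xmltype('<ProductionOrder><OrderNo>12345</OrderNo></ProductionOrder>');

  -- enqueue it; propagation carries it to the subscribing queue on the far side
  dbms_aq.enqueue(
    queue_name         => 'po_out_queue',
    enqueue_options    => l_enqueue_options,
    message_properties => l_message_properties,
    payload            => l_payload,
    msgid              => l_msgid);

  commit;
end;
/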
Oh oh, it's maaagic, you know
Mark Brady, March 26, 2007 - 10:57 am UTC
'I can send a Production Order from a merchandising team to a factory as an XML message. I have guaranteed one time delivery, that the appropriate procedural logic will get called on the far end when that XML message arrives'
Wow, that is rather complex. I guess that's much better than calling a procedure on the Vietnamese database with a set of parameters. I know that procedure calling suffers dramatically from multiple executions and inappropriate logic.
'and that as long as both applications stick to the same interface contract - that they follow our agreed on XML schema - it doesn't matter what other changes get made under the covers on either side.'
You mean like an API, right? Good thing XML came along cause we'd have never figured that one out.
'I have a much higher level administration interface for addressing, managing routing and message handling, and checking if messages have or have not been sent.'
Really? You do reporting? That's amazing. In my lowly, non-xml, dblink riddled databases, I never conceived that I could one day, report to my end-users what rows were in what tables. I gotta start working with XML. <swoon> XML, take me away </swoon>
You realize the more qualifiers you add to your 'mission statement', the fewer places it applies? What if I want to move simple objects between heterogeneous dbs? How about complex objects between homogeneous dbs?
My apologies for the sarcasm but it's there to make a point. There's nothing magic about an API. The contract between applications can be the XML you agree to pass. It can just as easily be the table you insert the same data into, or the parameters a procedure accepts. Maybe this appears as magic to young coders who never encountered good app design; I dunno.
There's certainly nothing magic about having the right procedural logic run for a given event. Or controlling the number of times that event fires. Maybe that's mysterious to web developers ('Don't press submit twice') but it's not to me.
Yes, to an earlier post, we do agree that there are times when a messaging system makes the most sense for distributing data. But you can't justify the use with something that's not unique. You can't justify the purchase of a particular car by claiming it has a steering wheel. The messaging systems ADD complexity and that's OK, but there has to be a net benefit somewhere. APIs, reporting, single execution, and correct logic are not justifications for adding the complexity; they're just examples of good design.
not just magic, but FM...
Duke Ganote, March 28, 2007 - 11:35 am UTC
ROTFLOL!
As for the bottom line:
The messaging systems ADD complexity and that's OK, but there has to be a net benefit somewhere.
The biggest benefit I've seen for "publish and subscribe" messaging is de-coupling your local transaction from immediate ("synchronous") data propagation/replication or remote procedure calls.
Thank you for the response
Alastair Green, April 03, 2007 - 6:45 am UTC
Thanks Tom,
That has answered my question very well. To add some further context in response to comments made by the reviewers:
1) We are an internal support department; none of our applications needs to communicate with remote sites.
2) I think that the .1% of cases where my colleague would accept a dblink is a hypothetical .1%. Even he couldn't quite say never, ever use them.
3) His justification was a little vague, but was basically "DBLinks = tight coupling".
April 04, 2007 - 9:42 am UTC
if he doesn't want tight coupling, then dblinks are still absolutely the right answer, given #1:
you have asynchronous replication (using dblinks)
you have AQ (queues) (using dblinks)
it is still the right answer, in light of #1.
how to dynamically switch accessing a table from one db to another
Giridhar, April 05, 2007 - 2:27 pm UTC
Hi tom,
I raised this query in a few forums, but did not get any clear answer. People said no to my approach, but I would like to know the reason if this is not the correct approach.
----------------------------------------------------
We have two databases in our production system.
These two databases get loaded from mainframe data every day; the data file is the same for both.
Sometimes data loading fails on one database, and we switch traffic to the other database even if only one table's load fails. This is for a DWH team, where we load a lot of data.
I would like to know if there is any way to address this:
If one table's load fails on DB1, I want to configure things in such a way that any SQL statement hitting that table on DB1 is automatically routed to DB2. Other statements querying other tables should still go to DB1 itself.
Is this approach possible:
1) The table's load failed on DB1 and succeeded on DB2.
2) All our tables are accessed through public synonyms. The tables are owned by certain schemas, and the people selecting the data use other schemas.
3) We will recreate the public synonym on DB1 to point to DB2 using a DB link (a sketch of this synonym switch appears below). DB2 will grant select privileges to DB1.
Many users may be accessing our systems, and they might be selecting large amounts of data.
Once the loading problem is fixed, we will recreate the public synonym to point back to DB1.
To fix the loading issue itself, we can either use a transportable tablespace or use a DB link to copy the data from DB2 to DB1. Can you suggest which is the better approach?
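For reference, a minimal sketch of the synonym switch described in step 3, assuming a table SALES owned by schema DWH_OWNER on both databases and a db link named DB2_LINK from DB1 to DB2 (all names hypothetical):

-- normal state: the public synonym on DB1 points at the local table
create or replace public synonym sales for dwh_owner.sales;

-- after a failed load of that table on DB1: repoint it at DB2 over the link
create or replace public synonym sales for dwh_owner.sales@db2_link;

-- once the load on DB1 is fixed: point it back at the local table
create or replace public synonym sales for dwh_owner.sales;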
Please advise.
Thanks
Giridhar
April 05, 2007 - 3:33 pm UTC
I would hazard a guess that the other people that said "do not do this" would say that for the same reason I will.
performance = not there
idea = bad
Let us work on fixing the failed load. Why not make it so the load cannot fail? Why would a load fail? What causes do we need to address?
Thanks
Giridhar, April 06, 2007 - 3:02 am UTC
Thanks tom for your reply.
Yes, I agree. We will be working on fixing the problem. I just wanted to know if it is at all possible to address this temporarily before we fix the main issue.
Thanks
Giridhar
The .0001% DB Links are bad
Do, April 11, 2007 - 10:02 pm UTC
Sadly, the example I can relate to of the few times dblinks are bad is when they point to a database which you don't administer but have outsourced to a complete bunch of incompetent, finger-pointing, blame-game-playing buffoons. When they meddle (without change controls) with the VPN, the user account, the server, the firewall, etc., they clearly impact our systems, which require the link.
To be fair, though, this has more to do with inadequate contract management, over-reliance on connectivity, a lack of exception coding, and the total absence of any configuration management describing our dependencies than it does with DB links. They're only the symptom, not the cause!
@Do -- it wasn't the dblink's fault you have bad contractors
Mark Brady, May 10, 2007 - 4:20 pm UTC
Alexander, October 30, 2008 - 11:21 am UTC
Hi Tom,
I thought for sure I read on this site many times you cannot make dml changes to a remote db over a link and you have stated that would be "very virus like". I had a meeting with some developers and they corrected me and said it was possible. So I tested it out, and it worked.
Am I nuts, has this always been true? Might I be confusing ddl statements?
November 02, 2008 - 3:14 pm UTC
umm.....
No, since version six of Oracle (1980 something) we've done distributed transactions. Since version SIX.
When I say "very virus like" it has ALWAYS been in answer to "utl_file" or "dbms_lob.fileopen" or "external tables" trying to read or write the (arbitrary, non-mounted, non-mapped) CLIENT FILE SYSTEM from the DATABASE SERVER.
Here, you can plug "very virus like" into my search and see for yourself
http://asktom.oracle.com/pls/ask/search?p_string=%22virus+like%22

Distributed transactions take setup - they take purposely installing software on machine A and machine B - not like a virus.
having the database write to an arbitrary PC in the network - very virus like.
the two concepts are very different.
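For what it's worth, the distributed transaction described above needs nothing more than ordinary DML; a minimal illustration, assuming a db link named REMOTE_DB and an EMP table on both sides (names hypothetical):

-- one transaction spanning two databases; Oracle coordinates the
-- two-phase commit over the db link automatically
update emp            set sal = sal * 1.1 where deptno = 10;
update emp@remote_db  set sal = sal * 1.1 where deptno = 10;
commit;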
Dark cloud of dblink
Nelson, December 21, 2009 - 2:51 pm UTC
Hi Tom:
I am a big fan of you.
In our company, there are a lot of discussions about the evil of using Oracle db links. I personally am a big fan of using dblinks since they are secure, simple, and can perform well if coded properly. I totally agree with the decoupling arguments. Decoupling is a good concept. However, the concept is, many times, used incorrectly.
I understand that when used inappropriately, a dblink can have performance issues, but that is all in the hands of the coder. Bad code = bad performance, simple as that.
Recently, I have received a comment from our Architect:
"My primary concern is the use of dblink. My understanding is that we should not be creating any additional dblink access to our system, and in fact, are trying to replace most/all of the current dblink as part of the system upgrade. dbLink has been flagged as a potential performance inhibitor."
I was told by my other Architects that our current evaluation of the system upgrade (by Oracle) has told us that we should avoid using db links.
Our production application DBA also claims that it poses a security risk because a DBA has access to the dblink password in the system table (sys.link$), which is correct in 8i if you have the DBA role.
I did ask our production DBA for facts in relation to performance, but I am unable to get any answers since I am not involved in the upgrade evaluation process. At this point, it is very difficult to argue, since these are the words of our database vendor, Oracle.
On the other hand, I am also very open-minded. If there are indications that dblinks are faulty or carry a high security risk, then we will just have to follow what Oracle has recommended and replace them with another technology.
It is a bit of a pain since I know what they are trying to do is really not in the best interest of the company.
I just want to get your opinion on this.
Thanks
Nelson
December 21, 2009 - 4:15 pm UTC
.. dbLink has been flagged as a potential performance inhibitor."
...
a dblink using transaction is slower than a 100% local transaction. This is true (and should be, well, rather OBVIOUS)
However, a dblink will be faster than code you write yourself to do stuff, many times faster.
... Our production application DBA also claims that it poses a security risk because a DBA has access to the dblink password in the system table (sys.link$)...
you mean that same DBA that has access to 100% of all data? That DBA? Interesting.
And if you are running unsupported software (such as 8i) and have utmost concerns about *security*, well, you might think about using software written *this century*.
dblinks
A reader, December 21, 2009 - 7:56 pm UTC
dblink - when not to use?
Nelson, December 22, 2009 - 8:57 am UTC
Hi Tom:
Well said. You always have an interesting way to put your point across.
I use the same argument about DBA access to data. To me, DBA is a trusted role. If we allow a DBA the privilege to drop a schema, access to a single ID is really the least we have to worry about if we do not trust that DBA.
Our site has many versions of Oracle (8i, 9i and 10g) across multiple systems. There are always stories behind why older versions of things stay old. The upgrade we are talking about is one of many efforts for us to move from 8i to 10g, or even version 11 if possible. That is at work. So, for that app, we are trying to move it into this century.
Back to the topic of dblink.
At our site, we use dblinks a lot for database-to-database data movement. Architecturally, there are a lot of arguments about the use of dblinks. Obviously, decoupling, ETL middle tier etc are part of the discussion points.
Outside of those architectural arguments, can you provide us with some guidance on when we should NOT be using dblinks?
Also, one of Informatica's technical sales reps claims that their products can load data for data transformation faster than the use of dblinks and PL/SQL. It is hard for me to understand how an additional middle tier would be faster than direct db-to-db communication. Is that even possible? Or is he just behaving like a sales rep?
December 31, 2009 - 7:54 am UTC
... Obviously, decoupling, ETL middle tier etc are part of the discussion points. ..
I don't see why any of those would be an argument AGAINST database links. Database links are simply a "transport", an easy way of doing things. Tell me, are your coders capable of doing the whole two phase commit thing in the middle tier when moving data around? And even if they are - do you want to pay them to do something you've already paid for? Over and Over and Over and over again and again and again?
... can you provide us with some guidance on when we should NOT be using dblinks?
...
sure, that is easy:
do not use a database link to link two databases together when the right answer is to consolidate both databases into a single database. That is, use the database link whenever database A has to talk to database B. If you can make A and B a single database - that'll be best - less resources, less license cost, less maintenance, less code development, less of everything.
.... Also, one of Informatica's technical sales reps claims that their products can load data for data transformation faster than the use of dblinks and PL/SQL. ....
Ask them for the definitive numbers, published by an outside research firm, that show they are faster. I would, and then I would ask them to prove it for me, in my situation. I spent most of my early career in Oracle doing customer benchmarks - it is something they should be willing to do.
Basic Salesmanship, that is, BS
Duke Ganote, December 24, 2009 - 6:01 am UTC
RE: "one of Informatica technical sales reps claims that their products can load data for data transformation faster than the used of dblink and PL/SQL."
Make the rep demonstrate the difference.
A few years back, in a former life, one of the EAI team claimed that Oracle was slowing down his integrations. I'll name names: webMethods was the EAI tool. We were only getting 2.5 records per second using webMethods as the link between 2 databases, the "source" and the "target". Both were on the same LAN in a midwestern site.
So I set up a simple test using a third remote Oracle database as my "middleware". The remote database was on the east coast, accessible through the WAN.
I used SQL*Plus to do INSERTs into the "source" table. The table had triggers to insert the data into the "middleware" (New Jersey) database. The NJ database had similar triggers to push the data on to the midwest "target" database table.
In a nutshell, I used database links from Indiana to NJ and back to Indiana. And it was several times faster than the EAI tool. Same data, same transformations.
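The relay described above is nothing more exotic than row-level triggers inserting over db links; roughly, and with hypothetical names (a link NJ_LINK from the source database, a staging table RELAY_TABLE in the middleware database):

-- on the "source" database: push each new row to the middleware database
create or replace trigger trg_push_to_middleware
  after insert on source_table
  for each row
begin
  insert into relay_table@nj_link (id, payload)
  values (:new.id, :new.payload);
end;
/

The middleware database carried a similar trigger pushing each row on to the target over its own db link.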
Async in Oracle
Tim M, December 28, 2009 - 9:08 am UTC
One of the above points raised for where a messaging system is "required" is when asynchronous operations are wanted. It is possible to use Oracle jobs to make code execute asynchronously. I think this is actually a common use of jobs.
So, this is not to say that everything should be 100% Oracle, but there is a fairly simple way to get async functionality while still staying in Oracle. If this results in a robust yet simple system, then perhaps it may be the best solution.
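A minimal sketch of that approach using DBMS_JOB, assuming a procedure called SLOW_PROCEDURE that we want to run in the background (the procedure name is hypothetical):

declare
  l_job binary_integer;
begin
  -- queue the call; a job queue process executes it asynchronously
  -- once this transaction commits
  dbms_job.submit(job  => l_job,
                  what => 'slow_procedure(42);');
  commit;
end;
/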
January 04, 2010 - 7:31 am UTC
the job queue facility is a messaging system, you queue a message ( a job ) as a producer of messages and a consumer of messages (the job queue coordinator) dequeues the message and processes it.
It is a highly specialized queue as it supports a scheduling system - repetition - something not typically found (nor useful) in message oriented middleware stuff.
Queues are PART of the database as well - AQ (Advanced queues) have been in there since 8.0 (the last nine releases of the database) as well.
Oracle and DB links as a wrapper?
Charlie B., April 13, 2011 - 11:54 am UTC
Hello Tom!
I work in a mixed-RDBMS shop. One of our applications runs primarily on a data warehouse appliance (a "competitor" to Exadata). As would be expected, they have a lot of data; they're currently approaching 2PB.
The DW appliance (let's call it "N") simply receives queries and returns data to the users. Management has recently mandated that we provide dashboard functions to all applications. Among other things, the dashboards would provide aggregate results. The problem our development staff is encountering has to do with aggregation of data.
The dashboard, and the ad-hoc query tool involved, need to receive aggregated results on the data. So a query from the user, of the form
select A1, A2, A3, A4 from A
needs to be changed on-the-fly to
select A1, A2, sum(A3), sum(A4) from A group by A1, A2
before it's presented to "N". We have been told that having the users specify the aggregation while assembling the ad hoc query is "too confusing" to them. The "N" appliance has no facility for intercepting and modifying the query prior to execution.
The solution suggested by Development is that we create an Oracle database link to the (heterogeneous) "N" database, create a pipelined function or a refcursor to modify the statement and pull aggregated data across the db link, and then return the data to the user.
On the Oracle side, we anticipate that this would require a separate db configured for access to a heterogeneous db (we want to isolate it from our standard config); a single user schema with nothing but the pipelined function or refcursor; and a db link to the "N" database.
This is the second time in a month when one of our internal applications wanted to pull data from a non-Oracle database, and asked about creating essentially an Oracle wrapper to provide the data. (The first time was more straightforward, and Development's currently addressing this via code changes to support the other RDBMS.)
I'd appreciate hearing your thoughts regarding the use of Oracle as a wrapper for challenges like this.
April 13, 2011 - 1:52 pm UTC
time to get the resume out I think.
how in the world are you supposed to "know" that selecting a1,a2,a3,a4 really means to sum a3 and a4 and group by a1, a2? Are you going to write a SQL parser and have some compiler you write that takes a bunch of rewrite rules to do this? In general - queries are going to be a lot more complex than that.
A 'pipelined' function isn't going to do anything for you. Ask development for the whiteboard talk that would have them sketch out for you how it would work (it will be a short meeting).
I don't know what 'or a refcursor' would do for you either?
There is a tool - The database firewall - that can do query rewrites, but I'm telling you - this is the weirdest sort of "requirement" I've heard of - I cannot imagine this working in real life.
http://www.oracle.com/us/products/database/database-firewall-160528.html
It does some non-Oracle databases as well: Oracle Database 8i, 9i, 10 and 11, Microsoft SQL Server 2000, 2005, 2008, Sybase ASE 12.5 and 15, SQLAnywhere 10+, IBM DB2 on LUW v9.x. Probably not your "N" database.
But, I'm not really recommending this, the "holy grail" of doing query rewrites to trick out an application is a 'bad idea' in my opinion.
The only thing I can think of
Galen Boyer, April 13, 2011 - 2:15 pm UTC
The only way I could see this "rewrite" happening is if you have metadata on the attributes of the tables that tells you whether they are a "fact" or a "dimension", because that is what these confused business users are asking for.
I would just question how a business user could be confused by a simple select x, y, sum(a) ... group by x, y. They "think" like this in Excel all the time!
Confirmed our suspicions
Charlie B., April 14, 2011 - 1:37 pm UTC
Tom -
Once again, thanks for taking the time on this one. You da man!
I'm simply going to say that the two of us production DBAs here thought that the solution would be 'suboptimal'...in the extreme...from numerous standpoints.
But I didn't want to color your answer (as if I could) by expressing our thoughts in advance.
Public link to Private
A reader, April 19, 2011 - 1:46 am UTC
Hi Tom,
I have a situation where, in my project, I have around 100 public dblinks. Now, for security purposes, management wants to convert all of these links into private db links.
In that case, while converting these links to private, what things do I need to take care of so that the application behaves as it does today? In other words, there should be no impact on the application.
This is a very urgent requirement. Please help.
April 19, 2011 - 6:37 pm UTC
there will potentially be a HUGE impact on the application. I cannot address to what degree it will be impacted - but only because I didn't write your application, I don't know how it works, I know nothing about it and how it operates. That puts me at a distinct disadvantage.
You'd need to be a bit more clear in things here, like in "details"
public versus private
Sam, May 25, 2011 - 10:54 am UTC
Tom:
Isn't the only difference between a public and a private link that a public link can be used by all USERS, while a private link can only be used by the CREATOR, i.e. the user who created the link?
May 25, 2011 - 11:39 am UTC
public is visible to all
private is only visible to the creator
yes.
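In DDL terms the only difference is the PUBLIC keyword; a sketch, assuming a remote database reachable through the TNS alias REMOTE_TNS and an account APP_USER (all names hypothetical):

-- private: usable only by the schema that creates it
create database link remote_db
  connect to app_user identified by app_password
  using 'REMOTE_TNS';

-- public: referenceable by every user in the database
create public database link remote_db_pub
  connect to app_user identified by app_password
  using 'REMOTE_TNS';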
Backing up Idea not to blindly use DB links
Abhishek, November 16, 2023 - 10:16 pm UTC
I would say, if you are transferring data from one application to another, use messages via middleware.
1. A much cleaner approach.
2. Tracking of replication is possible and easy.
3. Audit is easy.
4. If processing or validation exists in the target application, you do not need to rewrite it in the first application. Just make sure the replication messages processed in the target application go through that same validation logic and processing.
November 17, 2023 - 10:58 am UTC
There are pros and cons to both approaches. I think the key point is not to blindly use either.