Question and Answer

Tom Kyte

Thanks for the question, Albert.

Asked: April 28, 2011 - 11:41 am UTC

Last updated: July 18, 2012 - 3:26 pm UTC

Version: 11.2.0.2

Viewed 10K+ times

You Asked

Hi Tom,

We are currently running a 10.2.0.3 production database replicating to a reports database (10g) using Streams (local capture, remote apply configuration).

We are planning to upgrade both databases to 11g R2. At the same time we are also going to set up DR databases for both the production and reports databases. We were evaluating various options for this, including Data Guard, Streams and GoldenGate.

As for setting up the DR we are inclined to use DataGuard.

For replication, though we know that Oracle's strategic direction is GoldenGate, we see that many improvements have been made in Streams in 11g compared to 10g. So instead of reconfiguring our entire replication to use GoldenGate, we are inclined to stick with Streams. We feel that though GoldenGate is an excellent tool for replicating in a heterogeneous environment, when it comes to Oracle to Oracle, Streams has an edge over GoldenGate, at least in 11g. Maybe in the future GoldenGate will catch up with Streams even for Oracle to Oracle.


Can we know your take on this? Do you think we are at risk because we are going to continue with Streams?


Regards,
Albert Nelson A.

and Tom said...

As for setting up the DR we are inclined to use DataGuard.


perfect, because data guard does disaster recovery (DR) out of the box and the other two are technologies that could be used to BUILD a DR.

to me, it is sort of like deciding between:

a) Oracle Human Resources, the application
b) Oracle 11g R2, the database

(a) is like data guard, it is a DR (HR) application - it does DR (HR).
(b) is like streams/golden gate, you could use them to BUILD DR (HR).




As for your current replication, I personally would not be inclined to change it. It is working, it is supported, it is supportable, and it will be maintained (although it may not be enhanced as much going forward).


Comments

Alex Adamowicz, April 29, 2011 - 10:11 am UTC

It sounds like Albert is thinking about having 4 databases (production/reporting, plus DR production/reporting)? When you could just have 2: production, with an Active Data Guard standby that kills two birds with one stone (DR and reporting).
Tom Kyte
April 29, 2011 - 10:37 am UTC

yes, assuming they are not using any features of streams to perform 'ETL' on the data as it goes into the reporting instance.

Thanks.

A reader, May 01, 2011 - 11:52 am UTC


Thanks

Albert Nelson A, May 01, 2011 - 12:01 pm UTC

Thanks Tom for sharing your thoughts.

Alex Adamowicz,

Thanks for pointing out that a standby database can be used for reports.

But our reporting database has a different data retention policy than the production database. In addition, the reporting database contains additional schemas replicated from another database. Hence our production standby database cannot be used in place of the reports database.

Regards,

Albert Nelson A.




Replicating a few tables across heterogeneous systems

A Reader, May 18, 2011 - 8:52 pm UTC

Tom,

The requirement is to replicate a few tables from a source Oracle database to a non-Oracle (MS SQL) database on a regular interval, say T.

We have 200 tables t1, t2, ..., t200, but would like to replicate the changes from tables t1, t2, t3 and t4 only.

Your thoughts:
a) would it be OK to use GoldenGate here?
b) if we use a customized solution to achieve this, how efficient would it be? i.e.
b1) capture the changes using triggers in the source database into separate stage tables
b2) once the copy to the target database is done (these changes would be pulled by the remote database), purge those entries from the stage tables.
c) any other method/approach?
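A minimal sketch of what option (b) could look like, assuming hypothetical names (a table t1 with a numeric primary key id, and a stage table t1_stage); this is illustrative only, not a recommendation over CDC or Streams:

```sql
-- Illustrative sketch: t1, t1_stage and the id column are invented names.
CREATE SEQUENCE t1_stage_seq;

CREATE TABLE t1_stage (
  stage_id  NUMBER PRIMARY KEY,
  op        VARCHAR2(1),                       -- 'I', 'U' or 'D'
  pk_val    NUMBER,                            -- primary key of the changed t1 row
  changed   TIMESTAMP DEFAULT SYSTIMESTAMP
);

-- Step b1: capture each change into the stage table.
CREATE OR REPLACE TRIGGER t1_capture
AFTER INSERT OR UPDATE OR DELETE ON t1
FOR EACH ROW
BEGIN
  INSERT INTO t1_stage (stage_id, op, pk_val)
  VALUES (t1_stage_seq.NEXTVAL,
          CASE WHEN INSERTING THEN 'I'
               WHEN UPDATING  THEN 'U'
               ELSE 'D' END,
          CASE WHEN DELETING THEN :OLD.id ELSE :NEW.id END);
END;
/

-- Step b2: after the remote pull completes, purge what was copied
-- (:last_pulled_id would be tracked by the pulling process).
DELETE FROM t1_stage WHERE stage_id <= :last_pulled_id;
```

Note the efficiency cost Tom alludes to: every DML on t1 now pays for an extra single-row insert, which is part of why a log-based approach (CDC, Streams, GoldenGate) is usually preferred.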


regards

Tom Kyte
May 19, 2011 - 12:51 am UTC

a) yes

b) i would not recommend that, use CDC (change data capture) or streams if you want to "do it yourself"

c) I would look for an approach that did not involve replication myself :) replication complicates everything

...replicating a few tables across heterogeneous systems

A Reader, May 19, 2011 - 1:36 am UTC

Many Thanks Tom.

Similar challenge with a different character set

Siere, October 17, 2011 - 10:00 pm UTC

Hi Tom,

We have a similar challenge: our goal is to move an almost 10GB database to a different database. The target database has a different character set and a different CPU (the source is Intel and the target is SPARC, both Solaris though).

We have two options, Streams and GG, to migrate these schemas. From a "how stuff works" perspective, will GG work better than Streams?

Thank you


Tom Kyte
October 17, 2011 - 10:38 pm UTC

Streams would likely suffice given you just want to do this as a straight conversion - one time only - with no 'transformations'. If you just want a database on the other platform that looks like the original, and you'll cut over to it and then turn off streams, that would be my first approach (you already have everything you need).

if you wanted to do complex transformations or do this replication long term - golden gate would become more likely

Oleksandr Alesinskyy, October 18, 2011 - 9:10 am UTC

If it is one-time move then why not datapump?
Tom Kyte
October 18, 2011 - 4:54 pm UTC

I guess I made the assumption they wanted to do this move with little to no downtime involved. I made that assumption because of the two tools they mentioned - both of which can accomplish that.

If they can have downtime, then yes, datapump would be the likely tool of choice.


since it is only 10gb (teeny tiny - probably less than 5gb of real data, indexes don't have to move, they get recreated), datapump would be able to do it in minutes - so that would be reasonable.

Bi-directional Replication

João Pinela, February 24, 2012 - 11:34 am UTC

Hello Tom,
Hope my follow-up is a relevant one:

I have a specific question regarding Streams vs GG. After my latest investigation of GG (which seemed like an enhanced heterogeneous services) I believe things have evolved.
And my go-to response on replication (Streams) is already heading for the grave (be it now, or in 1 or 2 years, but soon).

So specifically (and I believe differently from the above questions), my client needs a DR environment with HA and both instances active, meaning he wants two geographically separated sets of database files and instances that perform like one system, where both points are available and open for DML (always, or in case one might fail).

This seemed a straightforward "Oracle streams is the product for you". Reasons:
- DG (at least 10g) does not allow both instances to be open for DML (continuously)
- GG seems to have many more features than this situation requires
- GG seems better suited to "partial ETL and replication" (like a schema, or a table).

What do you suggest for this purpose?
Thank you.
Best Regards,
Pinela.
Tom Kyte
February 25, 2012 - 5:22 am UTC

and both active instances,

that is only reasonable if you make it so that a given row of data can be updated in exactly ONE location at any point in time.

Meaning - you run application A at site 1 and have it 'dr' to site 2. You run application B at site 2 and have it 'dr' to site 1. A's tables are updated at site 1 ONLY unless it fails over to site 2. B's tables are updated at site 2 ONLY unless it fails over to site 1.

OR - half of table T1 is updateable at site 1 ONLY (unless site 1 has failed over) and the OTHER HALF is updateable at site 2 ONLY (unless....)
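One way this ownership rule could be sketched in the schema (the owner_site column and the guard trigger are my own invention; a real setup would also have to let the replication apply user, and a failover procedure, bypass the guard):

```sql
-- Hypothetical sketch: each row records which site owns it, and a trigger
-- rejects DML issued anywhere else. Column/trigger names are invented.
ALTER TABLE t1 ADD (owner_site VARCHAR2(30) NOT NULL);

CREATE OR REPLACE TRIGGER t1_site_guard
BEFORE INSERT OR UPDATE OR DELETE ON t1
FOR EACH ROW
DECLARE
  -- identify "this" site by database name (one possible convention)
  l_this_site CONSTANT VARCHAR2(30) := SYS_CONTEXT('USERENV', 'DB_NAME');
  l_owner     VARCHAR2(30) := COALESCE(:OLD.owner_site, :NEW.owner_site);
BEGIN
  IF l_owner <> l_this_site THEN
    RAISE_APPLICATION_ERROR(-20001,
      'Row owned by site ' || l_owner ||
      '; modify it there (or after a failover reassigns ownership).');
  END IF;
END;
/
```

With a rule like this enforced, update conflicts cannot arise, which is exactly the property that makes the two-site configuration manageable.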


the reason I say this is because I do not know of any developers (self included) that can build an update anywhere system for any piece of non-trivial software that accounts for all update conflicts and can resolve them. It is just way way too hard, far too expensive, and extremely fragile.


It would be much better to just have data guard in place - you can fail over in *seconds* - literally *seconds*. If you do the "run app a at 1 and b at 2" and have 1 and 2 be failovers for B and A - you have what you want - honest. it will work, it will be easy, it will be safe, it will be fast, it won't require better developers than exist in the world to build software, you can use it with off the shelf software - every thing is good.

If you try to do update anywhere replication - you need developers that don't exist, you cannot use off the shelf software, you will be administering and fixing this system 24 hours a day.

João Pinela, February 27, 2012 - 4:00 am UTC

Tom, Thank you very much for your feedback.

I can see that I may have missed some information.

Be it that Germany is recruiting in Portugal, maybe we should be recruiting in Mars for such developers. :)

- you mention "application A and B". Well, this specific case would be just application X (= application server A or B), which would connect to database node (streams node) 1 or 2.

Since an appserver connects (and runs dml on any given session without disconnecting) on just 1 node, isn't it possible for the application to update what it wants without blocking or damaging other data? Aren't deadlocks (for example) treated equally?

- the problem with DR is that the client now has 2 DB servers with RAC and wants both the safety of 2 remote sites, with the availability of 2 remote sites and the use of all the servers so that there aren't DBservers or APPservers "sleeping" while others handle all the workload.

- So in your opinion, there is no point in using Streams or GG (even given my latest info)? Because in that case, what are Streams and GG good for, if the application layer is so heavily impacted by this configuration and, like you say, we can't use off-the-shelf software?

With RAC if you initially have 1 standalone node, 1 application, 1 app server, and you then go to RAC, 2 nodes, 1 application, 2 app servers, after configuring load balancing I believe this is simple and straightforward.

Am I misunderstanding?

Thank you for your precious (and limited) time.

BR,
Pinela.


Tom Kyte
February 28, 2012 - 6:38 am UTC

Since an appserver connects (and runs dml on any given session without
disconnecting) on just 1 node, isn't it possible for the application to update
what it wants without blocking or damaging other data? Aren't deadlocks (for
example) treated equally?


I don't know what you were trying to say there - it is not active active over a long distance in your example there.


the problem with DR is that the client now has 2 DB servers with RAC and
wants both the safety of 2 remote sites, with the availability of 2 remote
sites and the use of all the servers so that there aren't DBservers or
APPservers "sleeping" while others handle all the workload.


Ahh, the fallacy of active/active. The thought that it will somehow require less resources, or use resources 'better' in some fashion. when in fact - the opposite is true.


Let us say for a minute that you run on one site and just use simple data guard to have a DR site. The one site needs to be able to

a) run its DML work load and generate redo
b) ship that redo to the remote site <<<<==== (a) is 'obvious', this is the extra work

If you go with two sites 'active active', then

a) each site must run its DML work load *and the DML work load of the remote site* - only less efficiently (replication via golden gate/streams/etc turns a single sql statement that updates 5 rows into 5 sql statements to update 1 row apiece).

b) ship their redo to the remote site

c) each site must process the incoming redo and turn it from its binary form into a logical record to do the update (new extra work)

d) each site must apply this converted information all of the while looking for update conflicts (extra work)

e) and then invoke the update conflict code (which is REALLY hard to code/design/implement)

f) and then there is the extra management work of maintaining this rather complex flow of data
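Item (a)'s row-by-row expansion can be pictured concretely (table, column and key values are invented for illustration):

```sql
-- At the origin site, the application issues ONE statement:
UPDATE t SET status = 'CLOSED' WHERE region = 'EAST';  -- touches 5 rows

-- The replication apply at the other site replays it as five
-- single-row statements, keyed by primary key, roughly:
UPDATE t SET status = 'CLOSED' WHERE pk = 101;
UPDATE t SET status = 'CLOSED' WHERE pk = 102;
UPDATE t SET status = 'CLOSED' WHERE pk = 103;
UPDATE t SET status = 'CLOSED' WHERE pk = 104;
UPDATE t SET status = 'CLOSED' WHERE pk = 105;
```

Five statements, five index lookups, five round trips through the apply machinery, where the origin did one set-based operation - that is the efficiency loss being described.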


You actually need two larger servers than you would if you just ran "active and DR".


So in your opinion, there is no point in using streams or GG (even given my
latest info)? Because in that case, what are streams and GG good for, if the
applicational layer is so heavily impacted by this configuration, and like you
say we can't use of of the box?


I am saying "do NOT go for update anywhere, partition your data so that it is only updateable on ONE SITE at a time, avoid any chance of update conflicts".

The easiest way to do that is to run "module1" of application A on site 1 and "module2" of application A on site 2 (or run A on 1, B on 2).

Or, starting to introduce a lot more complexity, Run deptno=10 on site1 and deptno=20 on site2, and so on - partition the data so that it is updateable on one site only.


I'm not sure what you were trying to say with RAC. With RAC you have at least two nodes from day one - or you don't have RAC, you have a single instance.

And RAC is NOT DR, RAC is high availability in a room. Data Guard is DR.


Alexander, April 30, 2012 - 11:02 am UTC

Hi Tom,

I would like your advice on a situation that comes up more and more these days as our company acquires others.

We have our primary data center in NH. We have a DR site in KC. Because of latency issues, KC has also become active, or a primary site for certain applications that are used by customers in the mid-west/western part of the country.

Because of this, we are sort of being pushed into active/active site designs to satisfy everyone's SLAs for response times. Active/active basically leaves one option, the one that those of us who have supported it can barely bring ourselves to say.

I feel like something is off here, because although I get that information can only travel so fast over a pipe a couple thousand miles long, why is it that I can hit sites in Europe with no problem at all? I know zero about networking.
Tom Kyte
May 01, 2012 - 3:22 pm UTC

You shouldn't need more than one data center with a lights-out failover if you wanted - you just need the network infrastructure.

I access the single production data center at Oracle (which has a failover). It is in Texas. I live in Virginia. Many people live - well - not in Texas - in fact probably 99% of the company doesn't live in Texas (some in Australia, some in the UK, Japan, Russia, etc). We all use the same apps running in the same data center running on the same database.

Active/Active - you won't be doing that with any 3rd party applications (show me a major product that is available that does that - out of the box, supported, no customization).

Active/Active - you'll need to hire *way* better programmers. I myself wouldn't want to touch it.

Streams example

João Pinela, May 14, 2012 - 12:10 pm UTC

Hello Again Tom,

Well, after your precious advice, I managed to get a machine to test a 2-node Streams environment.

Every committed record (or DDL command) worked great.

The problem came up once I tried to update, on site 1, an uncommitted record deleted on site 2.

This obviously crashed the Streams replication, and made the data inconsistent across sites.

As you said, the application needs to be "streams aware".

Thank you for your help.
BR,
João Pinela.

Tom Kyte
May 14, 2012 - 12:51 pm UTC

are you trying to ask something or just make the statement that "an application has to be built to replicate in order to replicate, you cannot just replicate any old application and think it will work"?

Which way to go?

Galen Boyer, July 18, 2012 - 12:58 pm UTC

Hi Tom,

We have Golden Gate replicating a table. Each modified row in that replicated table means
an action needs to happen (i.e., a Java process needs to be executed). We have eliminated
triggers on the table that would kick off the action, for multiple reasons. One being that
triggers can be fired more than once, but also, more stringently, we cannot have the
replication process affected. What we believe is the best solution is to write to a queue
and then let the queue handle the Java process being kicked off. Is there a way to read
the redo logs and write to an Oracle queue? Golden Gate documentation,
http://www.oracle.com/us/products/middleware/data-integration/goldengate11g-ds-168062.pdf
pg 4, section "Delivery" says "To enhance IT flexibility, captured data can also be
delivered to a Java Message Service destination or as a flat file using Oracle GoldenGate
Application Adapters". So, I believe Golden Gate could handle this. Is that true?

We are also trying to understand how we could use streams directly and not have to involve
Golden Gate. It seems that the solution would start with a process performing CDC for
reading but the Capture part of CDC would be a write to a queue, yet I don't believe that
is allowed? It seems the Capture of CDC can only be another table, not a write to a
queue? Am I correct in that?
Tom Kyte
July 18, 2012 - 3:26 pm UTC

to use streams, you would implement a custom apply process, your apply process would be invoked with a logical change record (LCR) and you can do whatever you want with it.

CDC would not be involved at all.
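A rough sketch of such a custom DML handler: the procedure, queue and table names below are invented, but DBMS_APPLY_ADM.SET_DML_HANDLER, DBMS_AQ.ENQUEUE and the SYS.LCR$_ROW_RECORD type are the documented Streams interfaces; the queue is assumed to be an ANYDATA queue:

```sql
-- Hypothetical handler: receives each logical change record (LCR)
-- and enqueues it for the Java consumer instead of applying it.
CREATE OR REPLACE PROCEDURE my_lcr_handler (in_any IN ANYDATA)
IS
  lcr       SYS.LCR$_ROW_RECORD;
  rc        PLS_INTEGER;
  enq_opts  DBMS_AQ.ENQUEUE_OPTIONS_T;
  msg_props DBMS_AQ.MESSAGE_PROPERTIES_T;
  msg_id    RAW(16);
BEGIN
  rc := in_any.GETOBJECT(lcr);          -- unwrap the LCR if you need its fields
  DBMS_AQ.ENQUEUE(queue_name         => 'strmadmin.action_q',   -- invented queue
                  enqueue_options    => enq_opts,
                  message_properties => msg_props,
                  payload            => in_any,
                  msgid              => msg_id);
END;
/

-- Register the handler for DML on the replicated table
-- (repeat for 'UPDATE' and 'DELETE' as needed):
BEGIN
  DBMS_APPLY_ADM.SET_DML_HANDLER(
    object_name    => 'scott.t_replicated',      -- invented table name
    object_type    => 'TABLE',
    operation_name => 'INSERT',
    user_procedure => 'strmadmin.my_lcr_handler');
END;
/
```

From the queue, a standard AQ (or AQ/JMS) consumer can drive the Java process, keeping it fully decoupled from the replication stream.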




this is the documentation for the GoldenGate -> JMS interface:

http://docs.oracle.com/cd/E18101_01/doc.1111/e17814.pdf

Oh well, right there

Galen Boyer, July 18, 2012 - 2:14 pm UTC

Hi Tom,

I was sure I had read that streams could be used for exactly what I was saying,
but then I started down many other paths and got a little lost. Then I started
keeping track of where I had gone and, voilà, there it is, right in the main
documentation: it most certainly can do what I'm describing, and this is exactly
what I was thinking when I was sure we could do what I was describing.
http://docs.oracle.com/cd/E11882_01/server.112/e17069/strms_over.htm#CHDJDCFC

Sorry for the noise.