Alex Adamowicz, April 29, 2011 - 10:11 am UTC
It sounds like Albert is thinking about having 4 databases (production/reporting, DR production/reporting)? When you could just have 2: production with an Active Data Guard standby that kills two birds with one stone (DR and reporting).
April 29, 2011 - 10:37 am UTC
yes, assuming they are not using any features of streams to perform 'ETL' on the data as it goes into the reporting instance.
Thanks.
A reader, May 01, 2011 - 11:52 am UTC
Thanks
Albert Nelson A, May 01, 2011 - 12:01 pm UTC
Thanks Tom for sharing your thoughts.
Alex Adamowicz,
Thanks for pointing that standby database can be used for reports.
But our reporting database has a different data retention policy than the production database. In addition, the reporting database contains additional schemas replicated from another database. Hence our production standby database cannot be used in place of the reports database.
Regards,
Albert Nelson A.
Replicating a few tables across a heterogeneous system
A Reader, May 18, 2011 - 8:52 pm UTC
Tom,
The requirement is to replicate a few tables from a source Oracle database to a non-Oracle (MS SQL Server) database
on a regular interval, say T.
we have 200 tables t1, t2, .... t200
but we would like to replicate the changes from tables t1, t2, t3, t4 only.
Your thoughts:
a) would it be ok to use Golden Gate here?
b) if we use a customized solution to achieve this, how efficient would it be?
i.e.
b1) capture the changes, via triggers in the source database, into separate staging tables
b2) once the copy to the target database is done (these changes would be pulled by the remote database), purge those entries from the staging tables.
c) any other method/approach?
regards
May 19, 2011 - 12:51 am UTC
a) yes
b) I would not recommend that; use CDC (Change Data Capture) or Streams if you want to "do it yourself"
c) I myself would look for an approach that did not involve replication at all :) replication complicates everything
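For option (a), limiting capture to just those four tables is exactly what the GoldenGate Extract parameter file is for; a minimal sketch (the group name EXT1, the schema name SRC, the credentials, and the trail path are all assumptions):

```
EXTRACT ext1
USERID ggadmin, PASSWORD ********
EXTTRAIL ./dirdat/aa
-- capture only the four tables of interest; t5..t200 are simply not listed
TABLE src.t1;
TABLE src.t2;
TABLE src.t3;
TABLE src.t4;
```

A corresponding Replicat on the SQL Server side would then map each captured table to its target with MAP/TARGET clauses.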
...replicating a few tables across a heterogeneous system
A Reader, May 19, 2011 - 1:36 am UTC
Many Thanks Tom.
Similar challenge with a different character set
Siere, October 17, 2011 - 10:00 pm UTC
Hi Tom,
We have a similar challenge: our goal is to move an almost 10GB database to a different database, and this target database has a different character set and a different CPU (the source is Intel and the target is SPARC, both Solaris though).
We have two options, Streams and GG, to migrate these schemas.
From a "how stuff works" perspective, will GG work better than Streams?
Thank you
October 17, 2011 - 10:38 pm UTC
Streams would likely suffice given you just want to do this as a straight conversion - one time only - with no 'transformations'. If you just want a database on the other platform that looks like the original, and you'll cut over to it and then turn off streams, that would be my first approach (you already have everything you need)
if you wanted to do complex transformations or do this replication long term - golden gate would become more likely
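For the one-time, no-transformation case, Streams can generate the whole configuration from a single call; a sketch only - the schema, directory objects, and database names below are placeholders (see the DBMS_STREAMS_ADM documentation for the full parameter list):

```sql
-- Sketch: have Streams instantiate the schema at the destination and
-- configure capture, propagation, and apply in one call.  All of the
-- names below are assumed placeholders.
BEGIN
  DBMS_STREAMS_ADM.MAINTAIN_SCHEMAS(
    schema_names                 => 'app_schema',
    source_directory_object      => 'SRC_DIR',
    destination_directory_object => 'DST_DIR',
    source_database              => 'intel_db',
    destination_database         => 'sparc_db');
END;
/
```

Once the destination has caught up and you cut over, you simply remove the Streams configuration.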
Oleksandr Alesinskyy, October 18, 2011 - 9:10 am UTC
If it is one-time move then why not datapump?
October 18, 2011 - 4:54 pm UTC
I guess I made the assumption they wanted to do this move with little to no downtime involved. I made that assumption because of the two tools they mentioned - both of which can accomplish that.
If they can have downtime, then yes, datapump would be the likely tool of choice.
since it is only 10gb (teeny tiny - probably less than 5gb of real data, indexes don't have to move, they get recreated), datapump would be able to do it in minutes - so that would be reasonable.
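For the with-downtime path, the whole move is a pair of Data Pump commands plus a file copy; a sketch (the schema, directory object, and file names are assumptions):

```
# on the source host
expdp system schemas=app_schema directory=DP_DIR dumpfile=app.dmp logfile=app_exp.log

# copy app.dmp to the target host's DP_DIR, then on the target:
impdp system schemas=app_schema directory=DP_DIR dumpfile=app.dmp logfile=app_imp.log
```

Data Pump performs a logical export, so the Intel/SPARC endianness difference is a non-issue, and the data is converted to the target database's character set automatically during the import.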
Bi-directional Replication
João Pinela, February 24, 2012 - 11:34 am UTC
Hello Tom,
Hope my follow-up is a relevant one:
I have a specific question regarding Streams vs. GG. After my latest investigation of GG (which seemed like enhanced heterogeneous services), I believe things have evolved.
And my go-to response on replication (Streams) is already heading for the grave (be it now, or in 1 or 2 years, but soon).
So specifically (and I believe differently from the above questions): my client needs a DR environment with HA and both instances active, meaning he wants two geographically separated sets of database files and instances that perform like one system, where both points are available and open for DML (always, or in case one might fail).
This seemed a straightforward "Oracle streams is the product for you". Reasons:
- DG (at least in 10g) does not allow both instances to be open for DML (continuously)
- GG seems to have many more features than this situation requires
- GG seems better suited to "partial ETL and replication" (like a schema, or a table).
What do you suggest for this purpose?
Thank you.
Best Regards,
Pinela.
February 25, 2012 - 5:22 am UTC
and both active instances,
that is only reasonable if you make it so that a given row of data can be updated in exactly ONE location at any point in time.
Meaning - you run application A at site 1 and have it 'dr' to site 2. You run application B at site 2 and have it 'dr' to site 1. A's tables are updated at site 1 ONLY unless it fails over to site 2. B's tables are updated at site 2 ONLY unless it fails over to site 1.
OR - half of table T1 is updateable at site 1 ONLY (unless site 1 has failed over) and the OTHER HALF is updateable at site 2 ONLY (unless....)
the reason I say this is because I do not know of any developers (self included) that can build an update anywhere system for any piece of non-trivial software that accounts for all update conflicts and can resolve them. It is just way way too hard, far too expensive, and extremely fragile.
It would be much better to just have Data Guard in place - you can fail over in *seconds* - literally *seconds*. If you do the "run app A at 1 and B at 2" and have 1 and 2 be failovers for B and A - you have what you want - honest. It will work, it will be easy, it will be safe, it will be fast, it won't require better developers than exist in the world to build software, you can use it with off the shelf software - everything is good.
If you try to do update anywhere replication - you need developers that don't exist, you cannot use off the shelf software, you will be administering and fixing this system 24 hours a day.
João Pinela, February 27, 2012 - 4:00 am UTC
Tom, Thank you very much for your feedback.
I can see that I may have missed some information.
Be it that Germany is recruiting in Portugal, maybe we should be recruiting in Mars for such developers. :)
- you mention "application A and B". Well, this specific case would be just application X (= application server A or B), which would connect to database node (Streams node) 1 or 2.
Since an app server connects (and runs DML on any given session without disconnecting) to just 1 node, isn't it possible for the application to update what it wants without blocking or damaging other data? Aren't deadlocks (for example) treated equally?
- the problem with DR is that the client now has 2 DB servers with RAC and wants both the safety of 2 remote sites, the availability of 2 remote sites, and the use of all the servers, so that there aren't DB servers or app servers "sleeping" while others handle all the workload.
- So in your opinion, there is no point in using Streams or GG (even given my latest info)? Because in that case, what are Streams and GG good for, if the application layer is so heavily impacted by this configuration and, like you say, we can't use them out of the box?
With RAC, if you initially have 1 standalone node, 1 application, and 1 app server, and you then go to RAC with 2 nodes, 1 application, and 2 app servers, after configuring load balancing I believe this is simple and straightforward.
Am I misunderstanding?
Thank you for your precious (and limited) time.
BR,
Pinela.
February 28, 2012 - 6:38 am UTC
Since an app server connects (and runs DML on any given session without
disconnecting) to just 1 node, isn't it possible for the application to update
what it wants without blocking or damaging other data? Aren't deadlocks (for
example) treated equally?
I don't know what you were trying to say there - it is not active active over a long distance in your example there.
the problem with DR is that the client now has 2 DB servers with RAC and
wants both the safety of 2 remote sites, the availability of 2 remote
sites, and the use of all the servers, so that there aren't DB servers or
app servers "sleeping" while others handle all the workload.
Ahh, the fallacy of active/active - the thought that it will somehow require fewer resources, or use resources 'better' in some fashion, when in fact the opposite is true.
Let us say for a minute that you run on one site and just use simple Data Guard to have a DR site. The one site needs to be able to
a) run its DML workload and generate redo
b) ship redo to the remote site <<<<==== (a) is 'obvious', this is the extra work
If you go with two sites 'active active', then
a) each site must run its DML workload *and the DML workload of the remote site* - only less efficiently (replication via Golden Gate/Streams/etc. turns a single SQL statement that updates 5 rows into 5 SQL statements that update 1 row apiece).
b) ship their redo to the remote site
c) each site must process redo and turn it from its binary form into a logical record to do the update (new extra work)
d) each site must apply this converted information all of the while looking for update conflicts (extra work)
e) and then invoke the update conflict code (which is REALLY hard to code/design/implement)
f) and then there is the extra management work of maintaining this rather complex flow of data
You actually need two servers, each larger than the one you would need if you just ran "active and DR".
So in your opinion, there is no point in using Streams or GG (even given my
latest info)? Because in that case, what are Streams and GG good for, if the
application layer is so heavily impacted by this configuration and, like you
say, we can't use them out of the box?
I am saying "do NOT go for update anywhere, partition your data so that it is only updateable on ONE SITE at a time, avoid any chance of update conflicts".
The easiest way to do that is run "module1" of application A on site 1 and "module2" of application A on site 2 (or run A on 1, B on 2)
Or, starting to introduce a lot more complexity, Run deptno=10 on site1 and deptno=20 on site2, and so on - partition the data so that it is updateable on one site only.
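One way to enforce that kind of ownership split at the database level is a trigger that rejects local DML against rows the site does not own; a sketch only, assuming the classic emp/deptno example and a Streams apply user named STRMADMIN (both assumptions):

```sql
-- Site 1 owns deptno = 10.  Local sessions may modify only rows they
-- own; changes replicated from site 2 are exempt.  (With the default
-- "fire once" trigger property, Streams apply would not fire this
-- trigger anyway, so the USER check is belt and braces.)
CREATE OR REPLACE TRIGGER emp_site_owner_trg
  BEFORE INSERT OR UPDATE OR DELETE ON emp
  FOR EACH ROW
BEGIN
  IF USER <> 'STRMADMIN'
     AND NVL(:NEW.deptno, :OLD.deptno) <> 10
  THEN
    RAISE_APPLICATION_ERROR(-20001,
      'this row is owned by the other site - update it there');
  END IF;
END;
/
```

Site 2 would carry the mirror-image trigger checking deptno = 20.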
I'm not sure what you were trying to say with RAC. With RAC you have at least two nodes from day one - or you don't have RAC, you have a single instance.
And RAC is NOT DR, RAC is high availability in a room. Data Guard is DR.
Alexander, April 30, 2012 - 11:02 am UTC
Hi Tom,
I would like your advice on a situation that comes up more and more these days as our company acquires others.
We have our primary data center in NH. We have a DR site in KC. Because of latency issues, KC has also become active, or a primary site for certain applications that are used by customers in the mid-west/western part of the country.
Because of this, we are sort of being pushed into active/active site designs to meet everyone's SLAs for response times. Active/active basically leaves one option, the one that those of us who have supported it can barely bring ourselves to say.
I feel like something is off here because, although I get that information can only travel so fast over a pipe a couple thousand miles long, why is it I can hit sites in Europe with no problem at all? I know zero about networking.
May 01, 2012 - 3:22 pm UTC
You shouldn't need to do any more than one data center with a lights-out failover if you wanted - you just need the network infrastructure.
I access the single production data center at Oracle (which has a failover). It is in Texas. I live in Virginia. Many people live - well - not in Texas - in fact probably 99% of the company doesn't live in Texas (some in Australia, some in the UK, Japan, Russia, etc). We all use the same apps running in the same data center running on the same database.
Active/Active - you won't be doing that with any 3rd party applications (show me a major product that is available that does that - out of the box, supported, no customization).
Active/Active - you'll need to hire *way* better programmers. I myself wouldn't want to touch it.
Streams example
João Pinela, May 14, 2012 - 12:10 pm UTC
Hello Again Tom,
Well, after your precious advice, I managed to get a machine to test a 2-node Streams environment.
Every committed record (or DDL command) worked great.
The problem came up once I tried to update, on site 1, an uncommitted record deleted on site 2.
This obviously crashed the Streams replication and made the data inconsistent across sites.
As you said, the application needs to be "streams aware".
Thank you for your help.
BR,
João Pinela.
May 14, 2012 - 12:51 pm UTC
are you trying to ask something or just make the statement that "an application has to be built to replicate in order to replicate, you cannot just replicate any old application and think it will work"?
Which way to go?
Galen Boyer, July 18, 2012 - 12:58 pm UTC
Hi Tom,
We have Golden Gate replicating a table. Each modified row in that replicated table means
an action needs to happen (i.e., a Java process needs to be executed). We have eliminated
triggers on the table that would kick off the action, for multiple reasons. One being that
triggers can be fired more than once, but also, more stringently, we cannot have the
replication process affected. What we believe is the best solution is to write to a queue
and then let the queue handle kicking off the Java process. Is there a way to read
the redo logs and write to an Oracle queue? The Golden Gate documentation,
http://www.oracle.com/us/products/middleware/data-integration/goldengate11g-ds-168062.pdf pg 4, section "Delivery", says "To enhance IT flexibility, captured data can also be
delivered to a Java Message Service destination or as a flat file using Oracle GoldenGate
Application Adapters". So, I believe Golden Gate could handle this. Is that true?
We are also trying to understand how we could use Streams directly and not have to involve
Golden Gate. It seems the solution would start with a process performing CDC for
reading, but the Capture part of CDC would be a write to a queue, yet I don't believe that
is allowed? It seems the Capture side of CDC can only write to another table, not to a
queue? Am I correct in that?
July 18, 2012 - 3:26 pm UTC
to use streams, you would implement a custom apply process, your apply process would be invoked with a logical change record (LCR) and you can do whatever you want with it.
CDC would not be involved at all.
this is the documentation for the GoldenGate -> JMS interface:
http://docs.oracle.com/cd/E18101_01/doc.1111/e17814.pdf
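The custom apply approach could be sketched roughly as below: a procedure registered via DBMS_APPLY_ADM.SET_DML_HANDLER receives each row change as an ANYDATA-wrapped LCR and is free to enqueue a notification for the Java process to consume (the procedure and queue names here are assumptions):

```sql
-- Sketch of a Streams custom DML handler.  Registered with
-- DBMS_APPLY_ADM.SET_DML_HANDLER, it is invoked once per row LCR.
CREATE OR REPLACE PROCEDURE notify_handler (in_any IN ANYDATA)
IS
  lcr SYS.LCR$_ROW_RECORD;
  rc  PLS_INTEGER;
BEGIN
  rc := in_any.GETOBJECT(lcr);   -- unwrap the row LCR

  -- do whatever the action requires; e.g. enqueue a message that a
  -- separate consumer uses to kick off the Java process
  -- (a DBMS_AQ.ENQUEUE call to an assumed queue NOTIFY_Q would go here)

  lcr.EXECUTE(TRUE);             -- still apply the change itself
END;
/
```

Because the handler runs inside the apply process, keep it short (enqueue and return); a long-running action in the handler would slow apply itself, which is exactly the "replication must not be affected" concern you raised about triggers.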
Oh well, right there
Galen Boyer, July 18, 2012 - 2:14 pm UTC
Hi Tom,
I was sure I had read that Streams could be used for exactly what I was describing,
but then I started down many other paths and got a little lost. Then I started
keeping track of where I had gone and, voilà, there it is, right in the main
documentation: it most certainly can do what I'm describing, and this is exactly
what I was thinking when I was sure we could do it.
http://docs.oracle.com/cd/E11882_01/server.112/e17069/strms_over.htm#CHDJDCFC Sorry for the noise.