Thanks for the question.

Asked: March 13, 2013 - 10:40 am UTC

Last updated: March 25, 2013 - 3:20 pm UTC

Version: 11

Viewed 1000+ times

You Asked

1. Based on CDC document what I understood is

a. Asynchronous Hotlog uses Redolog files
a. Asynchronous Autolog uses Archivelog files

Is the above statements correct?

If the Table is set to nologging and Database is Archivelog Mode. (Assuming force logging is not applied). I think Asynchronous Hotlog/Autolog does not work?

2. The delay in the CDC Autolog Mode is proportional to frequencly of log switch.

3. Assume my Datawarehouse Database is in Archivelogmode (with no force logging applied)
a. 1000 Tables
b. 10 different Tablespace
c. Total Size of 2 TB
d. 500 Tables are loaded in nologging mode. (To reducde redo log generation
(These tables can be easily reconstructed in case of loss)
I need to capture change for 10 Tables apply to Datamart Database
a. All these 10 Tables are in 1 Tablespace
b. size of these 10 Tables is around 50GB (1/20 of actual size of Database )
c. These 10 Tables are having loggind mode.

Now my question is
1. Does CDC Transfers every log file?
2. Or can we define to look at only redo log files pertaining to Tablespace TB1. Is it possible?
3. Is CDC good in this case or you suggest to go for Mlog/Mviews to capture change.

Thanks

and Tom said...

If the table is nologging and you perform a non-logged operation, any log based replication (streams, cdc, goldengate, etc) will fail.

You would not be able to use any log based replication on this data set.

we would not recommend CDC in any case, it is deprecated, supported for current uses but not recommended going forward:

http://docs.oracle.com/cd/E18283_01/server.112/e17222/changes.htm#CJAECCIJ

Rating

(6 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

A reader, March 13, 2013 - 11:15 am UTC

"Assume Datawarehouse Database is in Archivelogmode
a. 1000 Tables
b. 10 different Tablespace
c. Total Size of 2 TB
d. 500 Tables are loaded in nologging mode. (To reducde redo log generation
(These tables can be easily reconstructed in case of loss)
I need to capture change for 10 Tables apply to Datamart Database
a. All these 10 Tables are in 1 Tablespace
b. size of these 10 Tables is around 50GB (1/20 of actual size of Database )
c. These 10 Tables are having loggind mode. "

Tom

What do you suggest apart from Goldengate. We dont have money to buy Goldengate. :) How we can replicate the data with minimum impact on Source database and minimum delay in Target Database.

Thanks

March 13, 2013 - 12:05 pm UTC

I would not replicate such a database - period. why would you want to do that?

see https://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:6338100300346766796 for why.

if you do nologging operations - you have very very very very very few choices. (like one, materialized views with mv logs).

I would take the choices down to zero- don't replicate. It isn't necessary.

A reader, March 13, 2013 - 12:30 pm UTC

Tom

1. All the tables that I need to replicate are logged. Source Database is in Archivelog mode. There are other tables which are not of interest to me are not logged. So, I dont care about them.
2. Why I need to replicate? Because users on my source Database is 600 hundered. Expected users on the Target database would be 25,000. I dont want to burdern my Source Database with 25k users. They want this database to be highly available.

Thanks
Koshal

March 13, 2013 - 12:47 pm UTC

get a bigger machine, seriously - I'm not kidding. A single big machine (or cluster of machines) would be infinitely preferable to multi-database and replication.

And it would be much more highly available.

You do understand that replication is a burden on the source machine as well - right...

Anyway, when you phrased the question above,you should have just said:

I have 10 tables in a single tablespace that are definitely logged and want to replicate them.

it would make it a lot easier to understand....

everything else you wrote just confused the issue. 1,000 tables - not relevant. 10 different tablespaces, not necessary to know. 2tb - irrelevant. 500 tables (of which the 10 in question are not part of) are done in no-logging mode - well, that doesn't mean anything to us...

You have 10 tables of 50gb in size that you want to replicate. That is all we know - no transaction volumes/rate of change against them. No information on the amount of redo generated by the database (any log based replication is going to have to look at the entire redo log stream, every bit of it).

The link above points you to a paper that describes our future directions with respect to replication. goldengate is the suggested path, materialized views are there as well. CDC is not a path to take at all. streams could be used, but is on a similar trajectory as CDC in the future.

A reader, March 13, 2013 - 1:33 pm UTC

Thanks a lot Tom for your quick response!

So what I understod from your answer is that even if I want to capture change for only 5% of the Tables that generate redo I need to look at each and every log file. This does not seems to me as optimal way of replicating. So, we dont have any sort of Index over here which says this redo log file "RD1" contains data of interest to Target Database "TD1" and "RD2" contains data of interst to "TD2". Why do I need to read 1000 redo log files when the data that i am interested is available in only 1 file?
I know adding an index is overhead. But, in this case, benefit defenitely outweighs the cost.

Reason why I mentioned about 1000 Tables and 2TB is to give you size of Database. 500 Tables to tell you that the database generates lot of redo and only 50% of tables genrate redo. 10 Tables needs to be replicate to tell you that only 1/20 of the information in redo I am interested. Assuming Oracle is smart enough to know which Redo file to read.:)

Thanks a lot for all your help!

March 13, 2013 - 1:42 pm UTC

... Why do I need to read 1000 redo log files when the data that i am
interested is available in only 1 file? ...

redo is a continuous stream of information - what you need is *not* in a single redo log file or even a single set of redo logs - it is spread all over the place and intermingled with all of the other redo being generated. it is not organized, clustered together, it just lands where it falls.

... But, in this case, benefit defenitely
outweighs the cost....

no, no it wouldn't. it would be a huge cost to something that has to be fast - and of very very very limited use in the real world.

... Reason why I mentioned about 1000 Tables and 2TB is to give you size of
Database. ....

but I didn't need to know that, what we'd need to understand would be volume of transactions against your 10 tables and total redo volume for the database.

... 0 Tables needs to be replicate to tell you
that only 1/20 of the information in redo I am interested. Assuming Oracle is
smart enough to know which Redo file to read.:) ...

it is - it *knows* it has to read *all* of the redo since it could be in there all over the place. redo isn't a neat and tidy bit of information - it is a continuous stream of data.

reading the redo isn't going to be a problem, it isn't going to be your bottleneck or overhead. That part doesn't phase me - if you can write it, we can read and process it. don't worry about that.

what you need to worry about is the pain in the butt that is this thing called replication. It will consume your time like you won't believe.

A reader, March 13, 2013 - 2:20 pm UTC

I agree with you that we dont know which file to read if I am simulatneusly updating tables that needs to be replicated and Tables that does not to be replicated

For Example take this Scenario

1st Business day of month - There is going to be huge volume of data. It updated Table T1... T100 and generated redo files RD1..RD1000 Size of this redo log file is 100TGB
2nd Business day till second last business data It updates Table T101...T200 and gerated redo log files RD1001 TO RD1100 and generates 10TB
On the last business day it updates Table T201 to T300 and generates redo log files RD1101 TO RD1200 WITH 10TB. These are tables that needs to be replicated in Target Database. I feel RD1101 should not in any way be dependent on any of the files from RD1..RD1000. I feel defintely there is going to be some cost in reading that 100TB of redo log files which is not required. Because they are old and no way related. I know what I am doing on 1st buiness day and I don't want to read files RD1..RD1000 either on the First business day or on the last business day.

Thanks a lot again for your time!

March 14, 2013 - 8:24 am UTC

but you see that you are a very very very special case, different from pretty much everyone else and that such a micro optimization - which would penalize everyone else (having to "index" redo logs) - wouldn't make sense in general.

the cost of having to read those files is going to be less than maintaining a database of precisely what is recorded in each redo log file over time.

any and all log based replication mechanisms - from Oracle, from 3rd parties, will have to process the redo stream as a continuous thing - that is the only way to read redo and make sense of it. redo is like a story of the database.

tell you what - why don't you just take the inputs and the process you run on database 1 and run *the same exact thing* on database 2? Why bother with replication at all?

Or, since the data is read only most of the time, do your bulk load, make the tablespace read only and transport it.

eg: don't do replication (again, I come back to this!)

Combination of Rman/Transportable Tablespace

A reader, March 14, 2013 - 3:54 pm UTC

I can't beleive the amount of passion you have for Oracle Database.

1. We can't do Transportable Tablespace as Data needs to be refreshed every 1/2 hour
2. Don't want to have mlogs on Source DB. It is going to burden the Source Database
3. Need to have all the changes that happened in Source DB to Target DB.
4. I am looking for solution which is sleek. Something like this

Initial Setup
Using Transportable Table Tablespace I will have exact copy of Tablespace TS1 on both DB1(Source DB) and DB2 (Target DB).
Scedule RMAN every 1/2 hour whatever block changes that happened on TS1 Datafiles of DB1 are applied on Database DB2.
Then export/import the datadictionary for TS1 from for DB1 to DB2
Is it possible. :)

Advantage in this approach is
1. I am not transferring whole datafile I am tranferring only the blocks that got changed
2. Faster/Quicker
3. No network issues.

What am I missing here? Why this does not work?

Thanks

March 25, 2013 - 9:21 am UTC

I will again state: do not replicate, sounds like a single system to me.

I don't know who you are or what your specific needs are - you don't tell us very much... but...

You can either;

a) use active data guard, that will allow you to open the failover database read only for queries. You can freely query the data for reports. assuming that was your (unstated) goal.

b) use a logical standby, you will be able to query the data copied from production and add auxillarily structures like indexes/materialized views - and create read write tables that do not exist at production

c) use golden gate, that would allow you to do log based replication fully.

If you use rman, you would take an outage every half hour to apply an half hour of block changes to the system and then recover them - which in itself could take the half hour - and then you'd be able to open READ ONLY. It wouldn't be practical and it would be hugely labor intensive.

3rd part replication option

Wagner Ramos, March 19, 2013 - 10:12 am UTC

Why do not use some 3rd part replication software that can be configured to collect data changes only of these 10 tables ?

I believe that you can find 3 or more sw that can collect data changes with triggers and solve your needs without changing log modes, etc.

The impact in source database will be only in insert/update/delete operations for these 10 tables.

If you have some test environment and would like to try I can help, my solution is not so expensive like a GG.

March 25, 2013 - 3:20 pm UTC

you mean like golden gate. If will only collect data changes for the 10 tables *from the redo*.

hah, triggers - you want to see something run *slower than slow*, go for triggers!!! (and by the way, advanced replication which you already own does that but violates the "do not impact my production site please".)

anything trigger based will definitely, massively and measurably and in so many bad ways affect your production site.

Live SQL

Share and learn SQL and PL/SQL; free access to the latest version of Oracle Database!

Dev Gym

Classes, workouts and quizzes on Oracle Database technologies. Expertise through exercise!

Breadcrumb

Question and Answer

You Asked

and Tom said...

Rating

Comments

Combination of Rman/Transportable Tablespace

3rd part replication option

More to Explore

Live SQL

Dev Gym

CDC vs Replication using Materialized views

Breadcrumb

Question and Answer

You Asked

and Tom said...

Rating

Comments

Combination of Rman/Transportable Tablespace

3rd part replication option

More to Explore

Live SQL

Dev Gym