A similar experience
Stew Ashton, June      19, 2017 - 7:30 am UTC
 
 
I'll jump in here because I've done something somewhat similar.
First, Connor provided a link to another asktom thread where I mentioned a tool I wrote to synchronize tables. I won't repeat the link here, but I suggest you read the thread to the end.
I have one big question: are you going to apply deltas to the "new" tables, and simultaneously modify them in the new system? How can that possibly work? At the very least, each row needs to be "owned" and modified by only one system, either legacy or new.
I would try to separate "legacy" rows into separate tables or partitions, so the delta apply mechanism would not override changes made by the new system.
Second question: as Connor mentioned, do the deltas need to be "SCN consistent"? That would be because of foreign keys, in which case they need to be deferrable.
Concerning your point 1.
- last_update columns are dangerous because the time in the column is the time of the change, not the time of the commit. If you query the changes since noon, any changes made before noon but committed after noon will not show up.
- ORA_ROWSCN tells you the SCN of the latest change to the block, not to the row. You will get false positives but is that a big problem? You will still have to scan the whole table because ORA_ROWSCN cannot be indexed. I have never done "replication" using ORA_ROWSCN so I won't say any more.
If you have to scan both tables to compare, it would be better to have them in the same database! Comparing large tables over a dblink is a huge pain. You could then apply the delta to the local target and to the remote target on the new system.
If you do need SCN consistency, my tool should be compatible with that.
Personally, I can think of a third party tool that might help you considerably, but I'm not sure I should mention it here.
Best regards, Stew Ashton 
June      20, 2017 - 5:32 am UTC 
 
Feel free - we're all friends here :-)  or I'll just delete the review :-) 
 
 
Jess, June      26, 2017 - 1:37 am UTC
 
 
Hi Connor,
Thanks for your reply.
"Data Guard replicating deltas" was meant to indicate that we intend to pick up any changes to the source system by having them replicated to the staging server.  As you say though, it's picking out the exact bits of what happened that's the interesting part.
Thanks for the link the other thread... You're right about high risk stuff, but we are where we are.  At the moment, we're thinking that the way forward is to let Data Guard apply the changes, but also grab archive logs and do something with LogMiner to get the DML out.  That should give us an idea of what's changing.  Though as source/target have TDE enabled and all tablespaces are encrypted, so are the logs, so we're a bit at a loss...
It's the "comparison scripts" notion that makes me uneasy.  To compare, say, table by table, you need a copy of said table before and after.  If all we have is the target to which Data Guard replicates (and, obviously, the new production to which we're migrating the data), then there is no comparison as such--once DG applies the changes, there is no more 'before'.  We can keep some shadow tables (in the new prod, temporarily) with some PKs and dates, but that won't handle updates in any sort of graceful way, which is why we thought of ROWSCN, as false positives shouldn't matter much--it'd be worse if we missed something.
Looked into GoldenGate.  It appears one could get a fixed-term license (either a year or 6 month, the latter costing 70-80% of the annual cost).  Alternatively, there is a subscription model if you're running it through Oracle Cloud, which I'm not sure is possible in our case.
Stew Ashton,
Thanks, I did see your 'COMPARE_SYNC' post off Connor's link. I'll take some time this week to set up a test environment and give it a go.  As I understand though, it would throw another server into the mix (since we can't connect to prod, the staging server would get changes via Data Guard, and we'd need a copy of staging server before the changes to do the comparison).  I'm also curious how it performs on 80-odd million row unpartitioned tables...  We are definitely in for the fun times.
Re your question: "are you going to apply deltas to the "new" tables, and simultaneously modify them in the new system? How can that possibly work? At the very least, each row needs to be "owned" and modified by only one system, either legacy or new."
It is a staged migration.  Say the system lets users create all sorts of support tickets.  The first thing being migrated are tickets of type A.  This means that, starting with a cutover date, no new tickets of type A can be opened in the old system--only the new system will let you create them.  However, type A tickets already open in the old system won't be migrated until they are closed.  Similarly, new users can be added only in the legacy system.  This means they need to be migrated to the new system.  Then, if they want to open type A tickets, they'll do it in the new, else in the old system.  Over time, more ticket types will get migrated to the new system until there is nothing left in the legacy.  So no, "exact" activities are not happening in 2 systems at once, nor is there data flow from the new system to the legacy.
The tables aren't in the same database--they can't be.  The staging server (to which DG is replicating) is a physical standby (well, we'll convert to a snapshot standby).  It can't be logical because of unsupportable datatypes.  We're looking at excluding, but that may not be possible.  For us to keep a copy of the "as is" data to compare with changes after DG logs have been applied, we need another copy of the dataset.  The database is 30-odd TB.  It's not trivial to just get another server for comparison.
Re: 3rd party tool, I have looked at quite a few, and I am not certain I saw anything that could genuinely help in terms of providing replication and spitting out exactly what's been changed...  It's possible I missed something in my research, so any suggestions would be appreciated.
 
June      28, 2017 - 1:25 am UTC 
 
Some more thoughts for you.
1) 
"We do not have control over the source system (although their DBAs are willing to accommodate small requests we might have)"
rings alarm bells for me.  You're undertaking a significant application restructure of a 30TB system, and DBA's are going to be quibbling over what can and cannot be done on the source.  Man, I hate it when DBA's are like that - they are meant to be enablers not obstacles.  I'd be raising that issue first and foremost.  For a migration of this size and complexity, everyone needs to be rowing the boat in the same direction.
2) 
"The first thing being migrated are tickets of type A. This means that, starting with a cutover date, no new tickets of type A can be opened in the old system--only the new system will let you create them. However, type A tickets already open in the old system won't be migrated until they are closed"
seems to be worth a rehink to me.  Perhaps a staged migration where type A tickets move in their entirity to the new system is a less risky option ?  You obviously still have data sync challenges but at least you dont have to face things like:
- reporting on type A must cross both systems
- enquiries on a single type A must have intelligence to know which system to talk to etc.
3) 
(This is related to 1 above).
Since you are going to running both systems during the migration process, it's self-evident (and should be to your DBA's) that the source system is going to need changes (some non-trivial) to accommodate this.  For example, you might consider AQ on several structures (eg user addition like you provided) to capture those events and propagate them to the new system.
IMHO, you're going to need that flexibility on the *source* system if you are doing to get this to work. 
 
 
Jess, June      28, 2017 - 5:39 am UTC
 
 
Thanks Connor!
Re: (3), first and foremost...
Now I am concerned... You are anticipating non-trivial changes to source like AQ, etc. to propagate changes. What are we missing, as we weren't expecting that? The plan was to replicate via Data Guard, which should give us all the changes. We're working on being able to exclude unsupported columns so we can have a logical standby (suspect we'll need supplemental logging turned on on source too). This is why we're trying to derive from the logs what's changed (to migrate, DG will apply as it does). And that's why I am asking for ideas on deriving DML from archive logs off a TDE database...
Setting up AQ/CDC... We haven't even raised that knowing the response. We're looking at 25+ tables just to get "customer"... Plan A is definitely minimal changes to source...
=======
Re: Control over source system, agree with you fully, but when a system has been supported by a vendor for a decade, poorly, and the company is finally bringing it in-house, the vendor going out the door ain't going to go above and beyond. They'll do small bits, and anything more will have cost/time unacceptable to the company. Is what it is. The key here is that source DBAs aren't going to volunteer ideas/help. If we give them a list of specific tasks, they'll coast it. Whatcha gonna do....
Re: Staged migration, we've raised numerous risks (including things you pointed out) to no avail. The business have formally accepted the risks, and so it's up to us to make the technology support their choices. The workflow between 2 systems is radically different, so migrating open tickets would be non-trivial (e.g., current state might not exist), and tickets can be open for months. Moreover, there is talk of private beta, whereby both systems will accept type A tickets... There are, thankfully, no reconciliation requirements or reqs to push data from new system to the old...
Thanks again!
 
June      29, 2017 - 2:22 am UTC 
 
"The plan was to replicate via Data Guard, which should give us all the changes'
Yes, but the issue (if I'm interpreting your descriptions correctly) is not *capturing* all of the changes, is it *identifying* and *applying* those changes to your new system.
In an given (say) hour, 1000 changes will get applied via redo logs to your data guard system.  Out of those 1000 changes (as I understand it) you need to:
- pick out the 647 of those that are relevant to you
- with each one, take whatever operations it was (eg "add a row to table X") that occurred in the source, and potentially apply a transformation to the logic, because in your new system "add a row to table X" is now "update table Y, delete from Z, add a row to table W"
And as you say, you're starting an old vendor system that isnt a pretty sight :-)  A trivial (fictional) example could be user maintenance.
From the application perspective, we are updating a user details.  What if the vendor system implements that with a delete old record, insert new record.
When you're intercepting that in the redo log stream - you dont see "Oh, this is user maintenance", you see "User has been deleted" followed by "User has been added", etc.  Whereas if you have access to make modifications to the source system, you could (perhaps) record that the "following operations are user maintenance" in a table etc.
I'm just letting you know my thoughts here - I'd hate you to get halfway down the track and hit insurmountable problems.  Have a good long think before you get started.