Automatic replication between 2 databases

Chris Saxon

Thanks for the question, Ingimundur.

Asked: June 10, 2020 - 8:26 am UTC

Last updated: June 16, 2020 - 3:09 am UTC

Version: 11.2.0.4


You Asked

Hi Chris, Connor and Maria.

I have a task that started out as a simple one but has along the way become more complex. At my work, the statistics and analytics team has been working directly with the production data. Until now this hasn't been a big issue, as we didn't have very much load on the database. But recently their work has started putting more load on the DB, and our developers have asked that statistics get their own DB instance to play with.

This in itself would not be a big issue, but statistics have come up with some specifications that I am not sure how to meet. They want the data in raw format, just the way it is in the production database. On top of that, they want their own schemas to work on the data in, and they want to be able to pull in external data that doesn't necessarily come from an Oracle database.

I have seen mentions of solutions, but I am not sure which one, if any, is suitable for my problem: basic or advanced replication, Data Guard (we already use it for active/passive failover), GoldenGate, etc. I need advice on which one might solve the issue; I can then dig deeper into it.

We are currently on version 11.2.0.4 of the database, but we intend to upgrade later this year. At the moment we can only upgrade to 18.7 (due to the ODA setup we have). If there is a better way to do this after the upgrade, then that will probably be our choice. Maybe multitenancy will help us; I am not sure.

I hope my question is clear and that you can help me in this situation.

Regards,
Ingimundur K. Gudmundsson

and Connor said...

A lot depends on the business requirements here.

For example:

1) if they want *real time* access to production data, then you're really looking at a GoldenGate implementation, because nothing else really lets you make major changes (new schemas etc) to a real-time copy of production data

2) if they want a "snapshot" of production data, then snapshot standby is an option. A standby becomes a read/write copy of production, but at some point you need to rewind/resync with the primary, and any local changes are lost. e.g., the snapshot opens at 5am, people do what they like until 6pm, then you rewind it back to 5am and roll it forward with production. (See the sketch after this list.)

So if people wanted persistent schemas of their own, you'd need mechanisms to preserve them across rewinds (e.g. Data Pump export/import).

3) at 18c you can go multitenant, but that's a non-trivial license fee ... if you can get to 19c, you get 3 pluggable databases for free. Then you could consider pluggable cloning (potentially the same persistence issues as option 2).
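To make option 2 concrete, here's a minimal sketch of a snapshot standby cycle, run as SYSDBA on the standby. It assumes a physical standby with a fast recovery area configured (the conversion creates a guaranteed restore point behind the scenes) and that you're issuing the conversions manually rather than via Data Guard Broker:

-- stop redo apply, then open the standby read/write
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
ALTER DATABASE CONVERT TO SNAPSHOT STANDBY;
ALTER DATABASE OPEN;

-- ... users work in their own schemas during the day ...

-- rewind: discard all local changes and resume life as a standby
SHUTDOWN IMMEDIATE
STARTUP MOUNT
ALTER DATABASE CONVERT TO PHYSICAL STANDBY;
SHUTDOWN IMMEDIATE
STARTUP MOUNT
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;

Everything created while the database was open read/write, including those separate schemas, is discarded by the conversion back, which is why you need an export/import step to keep anything.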

Hope this helps.


Comments

Thanks for the answer - small follow-up.

Ingimundur K. Gudmundsson, June 15, 2020 - 11:28 am UTC

Thanks a lot for the answer, Connor.

It seems to me, then, that the only real option for me is GoldenGate, which is a licensed option if I am not mistaken.

I don't think statistics would be very happy if their own tables and views disappeared when we get them fresh data from production. Bear in mind that their own objects would exist in schemas that do not exist on the primary, and they will never manipulate the data that comes from there (only work with their own views in their own schemas).

When we are up on 19c and able to work with PDBs, how is the communication between PDBs? Is it the same as a DB link, or are they "closer" to each other, so that one PDB can reference an object in another PDB and know how to optimize that connection? If that is the case, I could see a solution where the prod data comes from a clone, while statistics have their own PDB for doing their job: 1 PDB for the PROD clone, 1 PDB as a working area for statistics, and 1 PDB to import data from other sources.

Thanks in advance for your answers.
Regards,
Ingimundur
Connor McDonald
June 16, 2020 - 3:09 am UTC

With option 2 I mentioned, a lot depends on timeliness.

I worked with a customer that did exactly what you were after (within the limits outlined below):

5am => convert the standby to a read/write snapshot
5am-10pm => business users can do whatever they want, but only in separate schemas
10pm => we Data Pump export all of the users' schemas (sketched below)
10:30pm => we revert the snapshot to a normal standby

Thus at 10:30pm, the standby goes into normal mode, rolling forward all the production changes of the day and bringing itself up to date. Then...

5am => convert the standby to a read/write snapshot
5am => Data Pump import the business users' data
5am-10pm => business users can do whatever they want, but only in separate schemas
10pm => we Data Pump export all of the users' schemas
10:30pm => we revert the snapshot to a normal standby

which we then repeat every day.
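As a sketch of that 10pm export step, you could drive it from PL/SQL with DBMS_DATAPUMP; the schema names and the DATA_PUMP_DIR directory here are just placeholders, and the equivalent expdp command line works just as well:

DECLARE
  h     NUMBER;
  state VARCHAR2(30);
BEGIN
  -- schema-mode export of the business users' own schemas
  h := DBMS_DATAPUMP.open(operation => 'EXPORT', job_mode => 'SCHEMA');
  DBMS_DATAPUMP.add_file(h, 'stats_schemas.dmp', 'DATA_PUMP_DIR');
  DBMS_DATAPUMP.add_file(h, 'stats_schemas.log', 'DATA_PUMP_DIR',
      filetype => DBMS_DATAPUMP.ku$_file_type_log_file);
  -- only the schemas that exist nowhere on the primary
  DBMS_DATAPUMP.metadata_filter(h, 'SCHEMA_EXPR',
      q'[IN ('STATS_WORK','STATS_STAGE')]');
  DBMS_DATAPUMP.start_job(h);
  DBMS_DATAPUMP.wait_for_job(h, state);
END;
/

The 5am import is the mirror image with operation => 'IMPORT', run once the standby has been converted back to a snapshot.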

There is potential in 19c for snapshot clones etc., but all of these things ultimately introduce the same collision between keeping business users' data and refreshing production data.
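On your PDB question: PDBs in the same CDB are still separate databases as far as your SQL is concerned, so one PDB references another's objects over an ordinary database link (a CONTAINERS() query from the root being the main cross-PDB alternative). The optimizer gets no special short-cut just because they share an instance, although the round trips are cheap. A hypothetical sketch of the layout you describe, with made-up PDB, user and service names:

-- from CDB$ROOT: clone production for the statistics team
-- (a hot clone needs local undo and archivelog mode on the source)
CREATE PLUGGABLE DATABASE stats_clone FROM prod_pdb;
ALTER PLUGGABLE DATABASE stats_clone OPEN;

-- from their working PDB, reach the clone via a normal DB link
ALTER SESSION SET CONTAINER = stats_work;
CREATE DATABASE LINK prod_clone
  CONNECT TO stats_reader IDENTIFIED BY "ChangeMe"
  USING 'stats_clone';

SELECT COUNT(*) FROM sales@prod_clone;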
