How about using expdp/impdp?
Daniel Blondowski, August 05, 2010 - 2:22 pm UTC
August 05, 2010 - 3:01 pm UTC
that would be "dump and reload"
I would not want to:
o read all of the data out
o write all of the data back
o reindex it all
Just move the changes or just copy the datafiles and apply some redo to them (transport from backup)
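For reference, "transport from backup" can be driven entirely from RMAN in 10g and up - a sketch only; the tablespace name and paths here are made up:

RMAN> transport tablespace txn_data
        tablespace destination '/u01/tts'
        auxiliary destination '/u01/aux'
        until time 'sysdate - 1';

RMAN restores the tablespace from backup into a temporary auxiliary instance, recovers it to the requested point in time, and leaves you the datafiles plus an export dump ready to plug into the reporting database - all without touching the live datafiles.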
Using CDC
Rajeshwaran, Jeyabal, August 05, 2010 - 2:45 pm UTC
Tom:
I was thinking of using CDC for this data refresh from the transaction database (or schema) to the reporting database (or schema). My question about how to measure the latency remains unanswered.
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:8784259916366#2588011900346985554
Do you have any answer to this?
August 05, 2010 - 3:02 pm UTC
I don't have anything to add.
Alexander, August 05, 2010 - 3:02 pm UTC
Rajeshwaran,
A silly question, but are you aware that Oracle has non-blocking reads (unlike Sybase and SQL Server)? You can actually run reports against the online system, unless they are really nasty Hyperion-generated queries or something like that.
I have seen Tom recommend transportable tablespaces using RMAN a few times, but I have tested them on every platform and patch level imaginable and hit bugs every time. So unless I see a cut-and-paste of it working in 10g, I don't believe it actually works.
How big is this database? I would not even consider using expdp/impdp for anything over 50 GB; it will take all night.
I like the Data Guard option Tom suggested. And if you can get a server in a different data center, you can kill two birds with one stone.
To Alexander
Rajeshwaran, Jeyabal, August 05, 2010 - 3:23 pm UTC
But you look too silly than me. Look at my question: the transaction tables are hash partitioned (8 partitions, about 400 million rows). But I don't want my reporting tables to be hash partitioned; instead, I need them range partitioned by Created_dt and subpartitioned by hash (on the primary key). As Tom suggested, I benchmarked the queries against the existing hash partitioning and this new range/hash partitioning, and things look good with the range/hash partitioning. Hence I need to do a data refresh from the existing transaction tables to the reporting tables.
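For clarity, the reporting layout I am describing would look roughly like this (table and column names are only for illustration):

create table report_txn
( txn_id      number primary key,
  created_dt  date not null,
  payload     varchar2(100)
)
partition by range (created_dt)
subpartition by hash (txn_id) subpartitions 8
( partition p_2010q1 values less than (to_date('01-apr-2010','dd-mon-yyyy')),
  partition p_2010q2 values less than (to_date('01-jul-2010','dd-mon-yyyy')),
  partition p_max    values less than (maxvalue)
);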
August 05, 2010 - 4:42 pm UTC
Rajeshwaran -
"it might be a silly question" is an idiom. Nothing harsh was meant by it - not at all.
Alexander, August 05, 2010 - 3:53 pm UTC
"But you look too silly than me. "
I'm sorry? Do you want to try that again in a language you can speak?
"look at my question..."
Yes I see that. See Tom's response?
"I would not consider your path at all."
Mark, August 05, 2010 - 4:41 pm UTC
I am somewhat familiar with Rajesh's reporting database situation. We recommended a physical/logical standby to our technical team and were rejected (they consider it a DR solution only), which really ties our hands. They are giving us a RAC node as the reporting DB solution. This is a multi-TB 10g database with some intense I/O and batch processing requirements.
I am having a difficult time finding any real-life case studies using CDC. I also see that it is Tom's last choice. Is CDC viable for a high-volume batch processing/IO load?
August 05, 2010 - 4:45 pm UTC
I personally would prefer data guard (easier way to implement the logical equivalent of streams)
and then streams
and probably would not use CDC for something as simple as "create a copy"
your technical team rejected data guard because they think it is just disaster recovery? Not very technical of them, is it...
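For the record, turning an existing physical standby into that "logical equivalent of streams" takes only a handful of statements - this is a sketch, assuming the physical standby is already in place and report_db is a made-up name:

-- on the primary: put the LogMiner dictionary into the redo stream
exec dbms_logstdby.build

-- on the physical standby: convert it, then start SQL Apply
alter database recover to logical standby report_db;
-- (shutdown and startup mount here)
alter database open resetlogs;
alter database start logical standby apply immediate;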
Try GoldenGate
VicC, August 06, 2010 - 2:48 am UTC
Hi there,
Have you thought of trying GoldenGate? I have been testing it and it seems to work well. You can filter data using the equivalent of WHERE clauses, etc. The data is mined from the log files and sent over the network to your target database - which can be SQL Server or Sybase, etc.
Have a look on OTN for more info.
There are some good examples of how to set it up, and see it in action here:
http://gavinsoorma.com/oracle-goldengate-veridata-web/
I am open to correction, but I believe it's going to take over from Streams as the main replication tool.
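To give a rough idea, a classic Extract parameter file looks something like this (the user, trail path, and table are made up):

EXTRACT ext1
USERID ggadmin, PASSWORD ggadmin
EXTTRAIL ./dirdat/lt
TABLE scott.emp, WHERE (deptno = 10);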
Regards,
VicC
August 06, 2010 - 8:31 am UTC
they have no really sophisticated needs here. They want to copy an Oracle database to another Oracle database - nothing fancy. They already own more than enough software to accomplish that in a myriad of ways.
Mark, August 08, 2010 - 9:07 pm UTC
This is correct. In this case we are looking strictly for a simple reporting DB. We have suggested GoldenGate as a potential solution for other applications that require more than a simple reporting DB. Tx for the suggestion.
Slight twist on original question.
Aaron, September 14, 2010 - 5:25 pm UTC
Tom,
We are considering the very options you have proposed for this problem, with a slight twist. We are indeed looking to replicate the data for reporting purposes, but the reporting database would reside outside of our network, in a partner's data center. I don't believe this necessarily prohibits any of the suggestions you've made thus far, but I was wondering if you believe it warrants a reordering of preferences from your prior feedback of:
"I personally would prefer data guard (easier way to implement the logical equivalent of streams)
and then streams
and probably would not use CDC for something as simple as "create a copy" ..."
Thanks for the help.
September 15, 2010 - 7:51 am UTC
I don't know what your refresh requirements are, what the network between you and the partner looks like, the volume of data, how much of the source database needs to be available on the remote site, the transaction volumes... so, I cannot prioritize.
The list was not ordered in any sort of priority or preference.
Delivering data for reporting
A reader, June 22, 2011 - 11:06 am UTC
We have a situation similar to what's been discussed in this thread, with a slight twist. We are a software vendor that provides packaged software (running on Oracle) to customers in various industries. Every customer has a different reporting need that suits their particular profile and industry. As such, instead of building an extensive reporting capability into our main application/database that addresses every customer requirement, we are thinking about providing a set of APIs that allow customers to extract some data out of the main database and import it into a target of their choice, which could be any RDBMS, or even spreadsheets or XML if they so desired. The goal is to get them the data, and they would do whatever they like with it. Most customer needs can be met with a subset of a few large tables in the database per data extraction. The size of the data to be exported could range anywhere from 100 MB to a few GB at a time.
What's the best approach to implement this type of utility?
June 22, 2011 - 12:13 pm UTC
Insufficient data to answer.
firstly - why would they need to take data out to report on it? The SQL database is perhaps the most powerful ad-hoc tool I know of.
But assuming the data needs to come out - what are the real requirements? Just dumping an ad-hoc query to disk? Getting all of the changes to data since X? Getting sets of read consistent data from multiple queries to dump to disk?
At the end of the day, you'll be writing custom code to do this - a client application basically - and using various database features to accomplish it (like read only transactions, XML queries, bulk fetching, etc)...
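A minimal sketch of that last bit - pulling a read-consistent set of data using a read-only transaction and bulk fetching (the orders table and its columns are hypothetical):

set transaction read only;

declare
    type order_tab is table of orders%rowtype;
    l_rows  order_tab;
    cursor c is select * from orders where created_dt >= sysdate - 1;
begin
    open c;
    loop
        fetch c bulk collect into l_rows limit 1000;  -- bulk fetch
        exit when l_rows.count = 0;
        -- format and write l_rows out (flat file, XML, ...) here
    end loop;
    close c;
end;
/

commit;  -- ends the read-only transaction

Every query inside the read-only transaction sees the data as of the moment the transaction began, so multiple extracts stay consistent with one another.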
Same instance, Two Schemas
Cyril, April 01, 2015 - 7:45 am UTC
Hi Tom
We are discussing creating a second schema for reporting (a star schema) to simplify access to the data and to do some "complex" aggregations.
Nowadays, what would you recommend for ETL-ing ("real-time") data from one schema to another on the same instance? (Streams, CDC, GoldenGate, something else?)
We do not want to use triggers because they might lower performance on the OLTP schema.
Thanks in advance !
Cyril