workspace manager
vivek nema, September 18, 2008 - 5:54 pm UTC
Ok tom,
First, Thanks for the feedback
So, you are saying that:
=> use "flashback data archive" to fulfil the requirement for legal compliance
=> use Workspace Manager, BUT test bulk loads and monitor performance.
Understood! Thanks for it.
Still, whichever solution I choose, one problem stays open:
"How do I migrate old history data to the new approach?"
...If somebody has brilliant idea let me know.
September 18, 2008 - 10:00 pm UTC
... "How do I migrate old History data to new approach?" ...
nothing exists for that in any of the solutions.
Workspace Manager
Kurt Stockinger, October 30, 2008 - 2:05 pm UTC
Hi Tom,
a) Assume that I have a system that takes care of table versioning and I would like to add "valid time support". Is it possible to use valid time support without the overhead for maintaining the additional columns that handle "createtime" and "retiretime"? In other words, is it possible to switch off the operations for maintaining the columns "createtime" and "retiretime".
b) The second question concerns super transactions. Assume a long-running staging process fills version-enabled tables with valid time support. Also assume that during the staging run the affected tables are committed several times (say 10 times). However, I only want the changes to the tables to be visible after the whole staging process has finished, i.e. after the 10th commit. What's the recommended way to achieve this with Workspace Manager?
c) Would you recommend using Workspace Manager with valid time support for tables with millions to billions of rows? To be concrete, could you point me to large-scale experience reports?
Thanks a lot,
Kurt
November 02, 2008 - 3:30 pm UTC
a) you either use Workspace Manager or you don't - it adds and maintains those columns, they are not optional
b) are you doing this in a "version", a "branch"? or in live?
c) give me an example of what you mean - the number of total rows is not scary (in the table itself), the number of valid time rows would be.
Have you given thought to benchmarking this - with data that looks like your data? and transactions that mimic yours?
that is what I would suggest - really...
workspace manager
Kurt, November 04, 2008 - 7:58 am UTC
Hi Tom,
Thanks for your answers. Find below some more details on your comments:
> b) are you doing this in a "version", a "branch"? or in live?
In our settings it would be in "live" with version-enabled tables and valid time support. We do not have branches of data. It is really the classical bi-temporal approach where current data is loaded into the warehouse on a daily basis.
The main challenge in this setting is to make the changes only visible to the end-user after the whole staging run has finished. In other words, the applications that access the warehouse should have a consistent state before and after the staging processes. If the data changes are visible during the staging process, then the data would be inconsistent.
Any suggestion?
> c) give me an example of what you mean - the number of
> total rows is not scary (in the table itself), the
> number of valid time rows would be.
I just wanted to get a feeling for how well this feature is supported and recommended by Oracle at large scale. We will certainly need to perform a major benchmark before we can come up with a good answer. However, I was hoping that there are some reference implementations/customers that have used this feature at large scale.
Thanks,
Kurt
November 11, 2008 - 12:05 pm UTC
in live, changes become visible to everyone else as soon as you commit - that is the way the database works.
the purpose of a branch and "merge" is to stage changes and then publish them later.
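A minimal sketch of that branch-and-merge pattern with Workspace Manager's DBMS_WM package (the table name ORDERS and workspace name STAGING_WS are hypothetical - substitute your own):

```sql
-- version-enable the table once, up front
EXECUTE DBMS_WM.EnableVersioning('ORDERS');

-- create a private branch and switch the session into it
EXECUTE DBMS_WM.CreateWorkspace('STAGING_WS');
EXECUTE DBMS_WM.GotoWorkspace('STAGING_WS');

-- ... run the staging load here, committing as often as you like;
-- sessions in LIVE do not see these changes yet ...

-- publish everything to LIVE in one step when staging is done
EXECUTE DBMS_WM.MergeWorkspace('STAGING_WS');
```

The intermediate commits inside the workspace are durable but invisible to LIVE; only the final MergeWorkspace makes the staged data visible, which is the "consistent before and after" behavior the question asks for.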
What about query performance of flashback?
Xenofon, May 02, 2010 - 10:20 pm UTC
Hi Tom,
very interesting thread. I understand you have talked about generating history until now. Flashback seems to be the method of creating a history of your data that causes the least performance overhead.
But what about querying the history, as is the case in data warehouse environments? Would you suggest the same approach in such systems? Or is it better to somehow materialize flashback history for such purposes?
May 06, 2010 - 11:31 am UTC
If I could use flashback query (with the data archive) - that is, if my data warehouse load was done in a way that supported data archive - I would use it.
Most of the queries would be typically against current data. Those queries would be 100% unaffected by the data archive.
Queries that needed to query data as of some point in the past would be trivial to implement (as of in the from clause - how easy is that).
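For example, a point-in-time query with the data archive is just a normal query plus an AS OF clause in the FROM list (the table name ORDERS here is a placeholder):

```sql
-- current data: completely unaffected by the archive
SELECT * FROM orders WHERE customer_id = :id;

-- the same query as of 24 hours ago
SELECT *
  FROM orders AS OF TIMESTAMP (SYSTIMESTAMP - INTERVAL '1' DAY)
 WHERE customer_id = :id;
```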
If you do it yourself, you would either use a single table with all of the history or two tables (like we do).
If you use a single table, every query must be an "as of query", every query coded must know how and implement the where clause to pick off the "right data". Every query would be impacted by this history.
if you use two tables - the above is true for history queries (but current data is easily queried) - but the query you need to write is a somewhat complex union all. And your load process has to be taught how to move data from current to history.
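A sketch of what that do-it-yourself two-table history query tends to look like (table and column names are hypothetical - a current table ORDERS plus an ORDERS_HIST table with VALID_FROM/VALID_TO columns your load process maintains):

```sql
-- "as of" query across a hand-rolled current + history pair
SELECT order_id, status
  FROM orders            -- current rows, valid "now"
 WHERE :as_of >= last_modified
UNION ALL
SELECT order_id, status
  FROM orders_hist       -- superseded rows, moved here by the load
 WHERE valid_from <= :as_of
   AND :as_of < valid_to;
```

Every history-aware query must get that predicate logic right, which is exactly the implementation burden the data archive removes.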
For ease of use and implementation, I would give the data archive serious consideration and benchmark it (realistic benchmark - meaning - don't compare "no data archive to data archive", but rather "data archive to do it yourself data archive" and test all relevant components - load, index, retrieval, ease of implementation, etc)
One more question...
Xenofon, May 02, 2010 - 10:23 pm UTC
Sorry, one more question I have on this:
of course flashback history is read-only - you cannot change it, right? If your business case also requires the ability to change historical data, then flashback is not the way to go, right?
May 06, 2010 - 11:33 am UTC
If your business requires the ability to change history, I would question your business practices. In all but excessively, exceedingly rare conditions - not usually seen in business - you cannot change history.
You change history by "updating": the bad data stays (it existed after all - it was seen, it was used), but the bad data is superseded by the good. You fix history by updating.
Licensing?
A reader, August 22, 2012 - 6:18 pm UTC
We were looking at Workspace Manager to do our auditing and my coworker found this post (I was going off of an older post where WM had been recommended over Flashback) and we have gotten the business sold on the idea of using Flashback for our audits. I then did a bit of research on the licensing and found that the following document states that Flashback Table, Database, Transaction, and Transaction Query are all standard, but Flashback Data Archive (Total Recall) requires an additional license (it says it requires Oracle Advanced Compression option, which is later listed as an Extra Cost option).
http://docs.oracle.com/cd/B28359_01/license.111/b28287/editions.htm My question is, when do you cross the line into needing a license to use this as your auditing mechanism? Is it as soon as you extend the retention period?
August 29, 2012 - 11:35 am UTC
the ability to use the command "create flashback..." requires the license.
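For reference, the licensed feature is the flashback data archive itself - creating one and attaching tables to it looks roughly like this (the archive, tablespace, and table names are placeholders):

```sql
-- create the archive with a retention period (this is the licensed step)
CREATE FLASHBACK ARCHIVE fba_audit
  TABLESPACE users
  RETENTION 7 YEAR;

-- opt a table into the archive
ALTER TABLE orders FLASHBACK ARCHIVE fba_audit;
```

Ordinary flashback query against undo (no archive, retention bounded by undo_retention) does not require this command.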
A couple of more questions....
A reader, August 31, 2012 - 12:04 pm UTC
First, thank you for the response, Tom.
We have two more questions related to using Flashback Query, but I think it would be helpful to explain the use case a bit more.
We have an old system on Informix that we are converting to Oracle. The old system has some custom auditing functionality that allows the users to look at the history of some fields and see who changed what and when. We were looking at doing this same functionality in the Oracle version of this product.
We are building this as a web service with a custom front end (we are treating them as two separate entities so any client can use the service), and our implementation of security is that the service will log in as a single user in the database, regardless of who logs in. The service does the user validation. There are already a lot of services coded to this and there are multiple different databases that these services connect to, including third-party applications, so changing this model would be very difficult, so I doubt that is an option the business would sign off on.
Question 1: How would you recommend trapping the user information? We started down the path of adding a field for User ID in all of the tables we want to track, and the service passed that ID in through the stored procedure. That works great for inserts and updates, but we run into the problem with deletes. We tried doing an update before the delete as a test (obviously we would rather not have to execute two statements, but we were just seeing if it would work), but Flashback only cares about the last committed value. We really don't want to add commits to the code, and want the service to handle the transaction, and the multiple statement thing is clunky at best. Is there a more elegant solution for us to get user information with our model (we also have 2 other fields that we capture including an Application ID and a field that the user passes to us to indicate who authorized the change, so those would have to be captured as well).
Question 2: The next question is on retrieving the data. My first thought on data retrieval is to just create a stored procedure that lets the service pass in either a from date and a to date or an interval, then I will pass back the data for that time frame. However, if the interval goes too far back, Flashback returns the "invalid lower limit" exception. We will be adding functionality for some time as this project is going to be coming out in phases, so we will have some tables with a year of history and other tables with a few weeks or months. We don't want users to have to try and guess how old a table is and keep changing their ranges to find data. Is there a way to tell the query to just use the earliest information if the lower limit timestamp is beyond the range of the flashback? Is there a way to get the earliest valid timestamp for a particular table?
September 10, 2012 - 6:44 pm UTC
question 1: flag the delete - as in - don't actually delete. Update instead (and use partitioning to move the updated data out of the original segment).
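A minimal sketch of that soft-delete approach, assuming hypothetical audit columns (USER_ID, APP_ID, DELETED_FLAG) that the service populates on every statement:

```sql
-- instead of: DELETE FROM orders WHERE order_id = :id;
-- flag the row, stamping the same audit columns used for inserts/updates
UPDATE orders
   SET deleted_flag = 'Y',
       user_id      = :svc_user_id,   -- the end user, passed in by the service
       app_id       = :svc_app_id
 WHERE order_id = :id;
```

Because the "delete" is now an update, the who/when/what lands in the flashback history as the last committed version of the row, with no extra statement and no mid-transaction commit.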
question 2: you are talking about versions queries. You have two choices - explicit time ranges, in which case we HAVE to tell you "your range cannot be satisfied" - or you can use MAXVALUE/MINVALUE and we'll give you what we have.
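The MINVALUE/MAXVALUE form of a versions query, which returns whatever history is available without the caller having to guess the lower bound (table name is a placeholder):

```sql
-- all available row versions, oldest we have through now
SELECT versions_starttime,
       versions_endtime,
       versions_operation,   -- I, U, or D
       o.*
  FROM orders
       VERSIONS BETWEEN TIMESTAMP MINVALUE AND MAXVALUE o
 WHERE o.order_id = :id
 ORDER BY versions_starttime;
```

The pseudocolumns (versions_starttime and friends) tell the caller how far back the returned history actually reaches.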
but if you ask us for everything between now and 10 hours ago - but we can only go back 5 hours - we *have* to tell you. anything else would be a bug...