Good to know.
Shannon Severance, September 15, 2010 - 4:38 pm UTC
select for update
Franck Pachot, September 20, 2010 - 3:48 am UTC
Hi Tom, can we say that optimistic locking with ora_rowscn is safe if we add a
'select for update' just before the update?
September 20, 2010 - 2:30 pm UTC
for update has the same issue as update does. so, no, it would not work in all cases.
A reader, September 20, 2010 - 3:22 pm UTC
Hello Sir,
I have the below questions with respect to your above example of a lost update to the OP:
(1) In your example above, when we commit in the first session, doesn't ora_rowscn change so that the commit
in the second session fails?
(2) I see from the example that ora_rowscn does not appear to change, and hence there is a lost update - so when did ora_rowscn change?
Thanks
September 21, 2010 - 3:28 pm UTC
1) why would a COMMIT fail?
2) the ora_rowscn did change (run the example, query the results) - HOWEVER, the fact that it changed did not trigger a restart of the update. If "restart of an update" means nothing to you - please read the links first.
In those links, you'll learn that there are two phases to the update - the "find me a row" phase - and then the "update this row" phase.
the "find me a row" phase found the rows in both sessions (changes are not visible until you commit; we use consistent reads with the undo information to read around locks).
then the "update this row" phase went ahead and did the update without restarting - which, again, you'll want to read and play around with those two articles to understand why that is relevant.
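To make the two phases concrete, here is a minimal two-session sketch of the lost update being discussed (the table, columns, and SCN values are hypothetical, for illustration only):

-- Both sessions read the row and remember its ora_rowscn:
select ora_rowscn, val from t where id = 1;   -- both see, say, scn 1000
-- Session 2 updates and holds the lock (no commit yet):
update t set val = 'B' where id = 1 and ora_rowscn = 1000;   -- 1 row updated
-- Session 1 issues the same style of update. The "find me a row" phase uses a
-- consistent read (reading around session 2's lock via undo), so it still sees
-- scn 1000 and finds the row - then it blocks waiting for the row lock:
update t set val = 'A' where id = 1 and ora_rowscn = 1000;
-- Session 2:
commit;
-- Session 1 now unblocks and reports "1 row updated": the changed ora_rowscn
-- did not trigger a restart, so session 2's change is silently overwritten.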
select for update
Franck Pachot, September 30, 2010 - 2:52 pm UTC
>> for update has the same issue as update does. so, no, it would not work in all cases.
I understand that the 'select for update' has the same issue because it handles write consistency in the same way as the update.
But then, the following update should be safe: row has been locked and ora_rowscn is now evaluated.
It is like manually doing an update restart each time, just in case. Or am I missing something?
September 30, 2010 - 3:09 pm UTC
I don't see what you mean by:
... But then, the following update should be safe: row has been locked and
ora_rowscn is now evaluated. ...
are you saying
a) select for update the row using ora_rowscn (or not)
b) update the row using ora_rowscn
If so, yes, that would work - but it would be a performance hit I would not be willing to take. It also means you deviate from a standard approach - you have a "workaround", and I'd rather not use a workaround when viable alternatives (that do not require one) exist.
Thanks
Franck Pachot, October 03, 2010 - 3:51 am UTC
Yes, this is what I mean. Thanks for your opinion on that.
Regards,
Franck.
Wouter, February 11, 2011 - 4:50 am UTC
Hi Tom,
Our data is not only accessed/modified through the web (e.g. APEX), but also through client-server applications (e.g. Forms or SQL*Plus). All clients use the same transactional API to insert/update/delete data.
I was under the impression that a combination of pessimistic locking (select .. for update) and optimistic locking (e.g. through ora_rowscn) is a good solution for this kind of application.
Example:
drop table t purge;

create table t rowdependencies
as
select object_id, object_name from user_objects;

alter table t add constraint t_pk primary key (object_id);

create or replace procedure update_t
( p_object_id   number,
  p_object_name varchar2,
  p_scn         number )
is
    l_scn number;
begin
    select ora_rowscn
      into l_scn
      from t
     where object_id = p_object_id
       for update nowait;

    if l_scn = p_scn then
        update t set object_name = p_object_name where object_id = p_object_id;
    else
        raise_application_error(-20000, 'Record is changed');
    end if;
end update_t;
/
Suppose procedure update_t is the transactional API used by both the web (stateless) and client-server (stateful) applications.
Can you please explain the "performance hit you are not willing to take", or other issues you think might happen with this approach?
Regards,
Wouter
February 14, 2011 - 7:22 am UTC
I don't see the words "performance hit you" on this page before you posted it - can you be more clear about what part of this page you are referring to?
With the NOWAIT - your routine is "safe", without the NOWAIT - if you were to block - it would not be.
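A sketch of why the NOWAIT matters here, based on the discussion above (the session numbering and values are illustrative, not from a real run):

-- With NOWAIT: if another session holds an uncommitted change to the row,
-- 'select ... for update nowait' fails immediately with ORA-00054, so we
-- never capture an ora_rowscn that is about to be invalidated by a commit.
--
-- Without NOWAIT, the same problem as a plain update can occur:
--   Session 2: update t set ... where object_id = 1;   -- locks the row, no commit
--   Session 1: select ora_rowscn into l_scn from t
--               where object_id = 1 for update;        -- blocks; the consistent
--                                                      -- read saw the OLD scn
--   Session 2: commit;                                 -- ora_rowscn advances
--   Session 1: resumes holding the lock, but l_scn is the old value, so the
--              'l_scn = p_scn' check can pass and overwrite session 2's change.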
Wouter, February 15, 2011 - 2:16 am UTC
Hi Tom,
Thanks for confirming that this approach is "safe".
I read something about the performance hit in:
Followup September 30, 2010 - 3pm Central time zone:
I don't see what you mean by:
... But then, the following update should be safe: row has been locked and
ora_rowscn is now evaluated. ...
are you saying
a) select for update the row using ora_rowscn (or not)
b) update the row using ora_rowscn
If so, yes, that would work - but it would be a performance hit I would not be willing to take. It also means you deviate from a standard approach - you have a "workaround", and I'd rather not use a workaround when viable alternatives (that do not require one) exist.
February 15, 2011 - 8:34 am UTC
I believe I was saying that if I didn't have to use ora_rowscn - I wouldn't. If I had my own "version counter", I could do it in a single update:
update t
   set ...., version_counter = version_counter + 1
 where primary_key = p_PRIMARY_KEY
   and version_counter = p_VERSION_COUNTER_AS_READ_OUT_INITIALLY
I wouldn't have to worry about the write consistency issue with the select for update, nor about the extra work performed by the select for update that didn't need to be performed.
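A minimal, self-contained sketch of that single-update pattern (the table name t2 and its columns are hypothetical, introduced only for this example):

create table t2
( id              number primary key,
  object_name     varchar2(128),
  version_counter number default 0 not null
);

create or replace procedure update_t2
( p_id      number,
  p_name    varchar2,
  p_version number )
is
begin
    -- one statement: the version check and the update are atomic, and the
    -- write-consistency restart re-evaluates version_counter for us
    update t2
       set object_name = p_name,
           version_counter = version_counter + 1
     where id = p_id
       and version_counter = p_version;

    if sql%rowcount = 0 then
        -- someone changed (or deleted) the row after we read it
        raise_application_error(-20000, 'Record is changed');
    end if;
end update_t2;
/

One round trip and no select for update; the caller simply reads id, object_name, and version_counter up front and passes the version back in.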
why does the trigger fire just once? Suppose a restart
jiang huang zheng, February 17, 2011 - 4:20 am UTC
Hi Tom,
I read your link about restart and I did a test:
SQL> create or replace trigger t_row_trigger before update on t_row for each row
2 begin
3 dbms_output.put_line('fired');
4 end;
5 /
SQL> select * from t_row;
X Y
---------- ----------
10 88
2 3
3 4
Then in session 1:
SQL> update t_row set y=89 where x=10;
fired
1 row updated.
In session 2: SQL> update t_row set x=22 where y=88;
and it is normally blocked by session 1. I assume there should be an update restart, since y was modified by session 1. However, when I commit in session 1, I expected to see 'fired' twice in session 2, but saw only one 'fired':
SQL> update t_row set x=22 where y=88;
fired
0 rows updated.
is it expected?
Thanks,
February 17, 2011 - 8:41 am UTC
because it is a for each row trigger - think about it.
The first time - we "found" a row to update - the consistent read saw y as still being 88. We started the update, fired your row level trigger - went to get the block in current mode and blocked.
Then you committed in session 1 - that released the lock and we got the block in current mode. When we looked at the block in current mode - we discovered that "hey, y WAS 88, but now y IS 89 - we cannot update this row!" and we rolled back.
When we restarted - we found 0 rows - therefore we do not fire any "for each row" triggers - there are no rows to cause it to be fired.
So the strange thing you should take away from your example is the fact that the trigger fired at all!!!!!
If you want to see your trigger fire twice, make it a BEFORE trigger, not BEFORE FOR EACH ROW.
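A minimal sketch of that, reusing the t_row table from the test above (the trigger name here is invented):

create or replace trigger t_row_stmt_trigger
before update on t_row
begin
    dbms_output.put_line('statement trigger fired');
end;
/

Re-run the two-session test: the blocked update in session 2 should print 'statement trigger fired' twice - once for the initial attempt and once when the statement rolls back and restarts after session 1 commits.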
very good explanation, thanks
jiang huang zheng, February 18, 2011 - 5:20 am UTC
if there were no restart of the update
jiang huang zheng, February 19, 2011 - 9:12 am UTC
Hi Tom,
I was reading your blog:
---
If we just skipped this record at this point and ignored it, then we would have a nondeterministic update. It would be throwing data consistency and integrity out the window. The outcome of the update (how many and which rows were modified) would depend on the order in which rows got hit in the table and what other activity just happened to be going on. You could take the same exact set of rows and in two different databases, each one running the transactions in exactly the same mix, you could observe different results, just because the rows were in different places on the disk.
---
Can you elaborate more on the nondeterministic outcome if Oracle does not restart?
Thanks,
February 20, 2011 - 12:29 pm UTC
if we just skipped the record because something else updated it - we would end up with a different result if the ordering was just a little bit different.
You cannot just skip a record because it doesn't meet your criteria anymore - you have to restart and get a CONSISTENT set of records. Otherwise - it isn't "consistent" - it isn't deterministic - if the other query had sorted things differently or ran just a bit faster or slower - the same statements executed in the same fashion would come to a different result.
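To make that concrete, here is a hypothetical illustration (the accounts table and the transfer are invented for this example):

-- Session 1 flags all overdrawn accounts:
update accounts set flagged = 'Y' where balance < 0;
-- While it runs, session 2 commits a transfer that moves the overdraft:
update accounts set balance = balance + 150 where acct_id = 1;  -- -100 -> +50
update accounts set balance = balance - 150 where acct_id = 2;  -- +50 -> -100
commit;
-- Session 1's consistent read saw acct 1 at -100 (a candidate) and acct 2 at
-- +50 (not a candidate). If, on reaching acct 1 in current mode, it merely
-- SKIPPED the row because the balance no longer matches, the statement would
-- flag NO account at all - even though at every point in time exactly one
-- account was overdrawn. Which rows get flagged would depend on physical row
-- order and timing. The restart instead re-runs the statement as of a single
-- consistent point in time, so it flags acct 1 (pre-transfer state) or acct 2
-- (post-transfer state) - never an inconsistent mix.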
Little help in deciding
Prashant, November 01, 2012 - 10:37 pm UTC
Hi Tom,
Looking for your kind advise in the following scenario
1. Application A writes (inserts/updates) data into table T1 and maintains an update_date timestamp (SYSDATE). This application is goldengate, and there are 5 parallel threads running continuously, processing records.
2. Application B (ETL program for real-time active warehouse) reads data from table (T1) for the incremental changes and carries out transformations.
The application suffers from the standard problem of missing records: they kept the max update_date of the records read in (2) above, and on each run read only records after that timestamp. This meant any transactions that had started before they began reading, but had not yet committed, were missed.
Then they decided to re-process by adding a condition (-30 mins), but they ran into another problem because of the way the application was written, so they had to avoid duplicate processing.
They may occasionally get a peak load of 1,000,000 records in a 5 minute interval. The ETL program is scheduled every 5 mins.
What is the best option, considering it's a warehouse? I was a little confused about the various options and which one fits where:
1. Ruled out v$transaction usage since it can also re-process records , or
2. ora_rowscn, or
3. version_control, or
4. trigger
Thanks
Prashant
November 01, 2012 - 10:55 pm UTC
remove the unnecessary program that reads and ETLs; do that in the tool you are already using (goldengate)
Even if you do not use v$transaction, you can re-process records, that is always a fact of life. ALWAYS.
for example, assume it is 12:00:00.0000 right now. Assume that the last time you refreshed was 11:55:00.
So, you need to pull all changes since 11:55:00 right....
And you need to remember that your refresh started at 12:00:00.0000 (for the next refresh). Assume that it takes 30 seconds to refresh...
Now at 12:00:00.001 - golden gate inserts and commits a new record.
At 12:00:00.002 - you start your data pull. It will see the 12:00:00.001 record.
You finish your refresh.
At 12:05:00.000 you refresh again: you will pull all records since 12:00:00.000 (the time the last refresh started), so you'll see that record again.
You'll have to deal with duplicates anyway in that case.
or switch to a queue (which is what goldengate is in effect working from) and mark records as processed as you process them, or remove them from the source table as you process them and so on. In other words, re-invent what goldengate already does...
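A minimal sketch of the "mark records as processed" variant (the key column id and the processed flag are assumptions - neither is in the original post):

declare
    cursor c is
        select id
          from t1
         where processed = 'N'
           for update skip locked;   -- don't block on rows still being loaded
begin
    for r in c loop
        -- ... transform / load row r into the warehouse ...
        update t1
           set processed = 'Y'
         where current of c;
    end loop;
    commit;   -- after the loop; committing inside a FOR UPDATE loop raises ORA-01002
end;
/

As noted above, this is re-inventing a queue - which goldengate (or AQ) already gives you.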
Thanks
Prashant, November 01, 2012 - 11:43 pm UTC
Hi Tom
Thanks for your prompt response.
If I understand correctly the best option in the order of performance
1) Let goldengate handle the transformations, remove the ETL
2) Use v$transaction
3) Mark the records or delete them as they are processed (this avoids duplicate processing but can introduce locking if the same record is also updated by goldengate, right?)
Please correct my understanding.
Thanks
Prashant
November 05, 2012 - 8:12 am UTC
I believe you meant "preference" not performance?
If you don't use a queue of some sort (like goldengate, marking records as processed, AQ, whatever you design) - you have to be prepared to process duplicates (and the start time of the oldest transaction in v$transaction would give you the timestamp you want to use for refreshing).
it is up to you.
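For completeness, a sketch of that v$transaction watermark (requires select access to v$transaction; start_time is a VARCHAR2, and the format mask here is an assumption based on its documented MM/DD/YY layout):

-- Take the watermark just before each refresh. Every transaction still open
-- at this moment started at or after this time, so none of its rows can carry
-- an update_date earlier than it.
select nvl( min(to_date(start_time, 'mm/dd/rr hh24:mi:ss')), sysdate ) as safe_watermark
  from v$transaction;

-- Next refresh: pull rows with update_date >= the PREVIOUS safe_watermark.
-- You may re-read some rows - which is exactly why you must be prepared to
-- process duplicates.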