SELECT FOR UPDATE - reducing serialisation time
Tom Kyte

Thanks for the question, David.

Asked: November 20, 2007 - 9:01 pm UTC

Last updated: April 13, 2009 - 4:10 pm UTC

Version: 10.2.0

Viewed 10K+ times

You Asked

Hi Tom,

Thanks for all the great info! I have a question regarding SELECT FOR UPDATE. This is a slightly contrived example to specifically look at this situation but hopefully it makes sense.

Here's the example setup:

create table t (
  id         number      primary key,
  processed  varchar2(1) not null
);

insert into t values (1, 'N');
insert into t values (2, 'N');
insert into t values (3, 'Y');
insert into t values (4, 'Y');
insert into t values (5, 'N');
commit;

Statuses:
N --> Not Processed
P --> Processing
C --> Complete

Workers need to get the next unprocessed piece of work and should take the 'oldest' (i.e. the one with the lowest ID). There are no strict rules on processing order, only that workers should attempt to get the lowest ID first, to avoid the case where picking rows at random leaves an unprocessed row waiting for a long period.

So, when accessing the above table, the first client should get the row with ID 1, the second --> 2 and the third --> 5.

When a session gets a row it immediately updates it to indicate that it is processing it and commits the new status before carrying out any 'real' work.

Using this SQL (or even just a straight update to status 'P'), I can ensure that clients get the next row to process correctly (the next unprocessed row by ID).

select id
from t
where id =
(
  select id
  from (
    select id
    from t
    where processed = 'N'
    order by id
  )
  where rownum < 2
)
for update
/

The problem is that other sessions will block until the first session commits the status update. Although this locking time is small it exists and the goal is to reduce this time blocked to be as small as possible.

I've recently come across the undocumented SKIP LOCKED clause, so I thought I'd give it a go to see whether it could help. Trying something like this...

select id
from t
where id =
(
  select id
  from (
    select id
    from t
    where processed = 'N'
    order by id
  )
  where rownum < 2
)
for update skip locked

...the first session receives the row with ID 1, but the second session returns no rows. This makes sense: the inner select with ORDER BY and ROWNUM still selects the same row, so we find it locked and skip it. So this doesn't achieve the goal.

Changing this to be

select id
from (
  select id
  from t
  where processed = 'N'
  order by id
)
for update skip locked
/

will allow each session to find all rows that are available when it attempts the select, but it will lock all of those rows, not just the first. The workers need to get one row at a time. Putting the ROWNUM back in leads to the expected ORA-02014 (cannot select FOR UPDATE from a view with DISTINCT, GROUP BY, etc.).
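One way to get a single row out of this form relies on the fact that, with SKIP LOCKED, rows are locked as they are fetched rather than when the cursor is opened: open the ordered cursor and fetch just one row, so only that row ends up locked. A sketch against the table above, marking the fetched row 'P' as described (SKIP LOCKED is undocumented in this release; it becomes documented in 11g):

```sql
declare
    cursor c is
        select id
          from t
         where processed = 'N'
         order by id
           for update skip locked;
    l_id  t.id%type;
begin
    open c;
    fetch c into l_id;  -- locks only the first candidate row not already locked
    if c%found then
        update t set processed = 'P' where current of c;
    end if;
    close c;
end;
/
```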


So back to the drawing board... Another option (found at http://www.ixora.com.au/q%2Ba/0010/17094208.htm ) uses two loops in PL/SQL, with pseudo code like:

  cursor1 --> select rowids of rows with status 'N', ordered by ID
  loop
    cursor2 --> select the row
       from t
       where rowid = cursor1.record.rowid
       -- Need to check that the row is still unprocessed.
       and status = 'N'
       for update nowait

    If a row is found then we have it and it is now locked;
    return and allow the client to process the work.
    If no row is found (or locking it fails), proceed to the next row in the outer cursor.
  end loop

This seems to do the job: other sessions will not block, but will try all rows in status 'N', in the correct order, until they either find one they can lock or run out of rows.

This comes at the expense of potentially higher complexity, and potentially more resource usage: rather than blocking, many sessions might churn through the same rows at the head of the list until each finds one that is available.


Is this PL/SQL approach a valid approach? Is there a simpler approach? I've looked over the PL/SQL version and couldn't think of any issues but may well be missing something.

Currently I'm thinking that I'd stick to the straight SQL approach and live with the small serialisation rather than the potential increased resource usage and complexity of the second.

Many thanks in advance. Hope this wasn't too wordy - I'm always having issues with the @concise@ key on my keyboard...

Dave

and Tom said...

well, first thing I'd say is....

do not do this yourself - the database has included this functionality for the last 10 releases - it is called Advanced Queuing (AQ). It solves exactly this "get the next bit of work to do" problem in a highly concurrent fashion.


...
When a session gets a row it immediately updates it to indicate that it is processing it and commits the new status before carrying out any 'real' work.
.....

that is horrible, you'll be spending the rest of your life "resetting locked rows that aren't locked by any active session"


so, rather than comment on the procedural approaches, I'll point you to the solution:

http://docs.oracle.com/cd/B19306_01/server.102/b14257/aq_intro.htm#sthref12
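A minimal sketch of that approach (the names work_t, work_qt and work_q are made up for illustration; error handling omitted). The point is that the dequeue and the processing share one transaction - a crash before the commit rolls the dequeue back:

```sql
-- payload type, queue table and queue (hypothetical names)
create or replace type work_t as object ( id number );
/
begin
    dbms_aqadm.create_queue_table( queue_table        => 'work_qt',
                                   queue_payload_type => 'work_t' );
    dbms_aqadm.create_queue( queue_name => 'work_q', queue_table => 'work_qt' );
    dbms_aqadm.start_queue( queue_name => 'work_q' );
end;
/

-- producer: enqueue a piece of work
declare
    l_opts  dbms_aq.enqueue_options_t;
    l_props dbms_aq.message_properties_t;
    l_msgid raw(16);
begin
    dbms_aq.enqueue( queue_name         => 'work_q',
                     enqueue_options    => l_opts,
                     message_properties => l_props,
                     payload            => work_t(1),
                     msgid              => l_msgid );
    commit;
end;
/

-- worker: dequeue the next message, do the work, then commit
declare
    l_opts  dbms_aq.dequeue_options_t;
    l_props dbms_aq.message_properties_t;
    l_msgid raw(16);
    l_msg   work_t;
begin
    l_opts.wait := dbms_aq.no_wait;  -- don't block if the queue is empty
    dbms_aq.dequeue( queue_name         => 'work_q',
                     dequeue_options    => l_opts,
                     message_properties => l_props,
                     payload            => l_msg,
                     msgid              => l_msgid );
    -- ... do the real work here ...
    commit;  -- only now is the message gone; a crash before this rolls it back
end;
/
```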



Comments

Missing the wood for the trees.

David, November 21, 2007 - 4:11 pm UTC

Thanks Tom,

Nothing like missing the obvious. I've come across AQ before but nearly always for queueing within the database (automated background jobs etc). I hadn't thought of applying it to this client situation. It's always good to be reminded of existing solutions.

Agreed on the recovery needed for my original solution; the commit of the status update was a workaround to reduce the serialisation impact, but it comes at the cost of separate recovery processes that check for orphaned rows and deal with them. As you say, not ideal.

AQ would avoid all that by allowing the dequeue and the subsequent processing to occur in the same transaction. Simple, elegant, reliable, and most likely much more performant - which was the goal in the first place. :)

Alternative Solution Maybe

Scott, April 09, 2009 - 4:44 pm UTC

How about something like this?

create table queue (job_id number, status number);

create or replace procedure get_next_job( arg_job_id out integer )
is
begin
    arg_job_id := -1;

    update queue
       set status = 1
     where job_id = ( select min(job_id)
                        from queue
                       where status = 0 )
    returning job_id into arg_job_id;

    commit;
end get_next_job;
/

Tom Kyte
April 13, 2009 - 4:05 pm UTC

and when you fail in the processing of this next job - and leave the status at one - what have you got then?

Besides a mess, of course.



also, there are certain "read consistency issues" here.

The select min(job_id) where status = 0, that runs.... It returns the number 100 (for example).

We update that row - but haven't committed yet.

Someone else calls this, their select min(job_id) gets.... you got it, 100. But they block on the update.

Now you do commit and you have 100. They now get unblocked and - well - update the SAME row....


That is, your logic fails to work: you would sometimes assign the same record to N processes.

So, in addition to leaving a mess when you crash (you have to now build a job to watch the things that do jobs and when they fail - discover what jobs they were doing and update the status back)....

You have a logic error in concurrency controls here...



Nevermind

Scott, April 09, 2009 - 5:04 pm UTC

Ok I think I was dumb. If 2 threads execute the procedure at the same time, they will both return the same job_id, correct?
Tom Kyte
April 13, 2009 - 4:06 pm UTC

correct :)

drop table t;

create table t
as
select 1 job_id, 0 status
  from dual
/

update t
      set status = 1
    where job_id =
       (select min(job_id)
          from t
         where status = 0)
/



do that in one session. In another session do this:

ops$tkyte%ORA10GR2> update t
  2        set status = 1
  3      where job_id =
  4         (select min(job_id)
  5            from t
  6           where status = 0)
  7  /


it'll block... now commit the first session and you'll have the second session update the same row...

Another Option, Potentially Less Dumb

Scott, April 09, 2009 - 5:30 pm UTC

Yes, now that I actually think about it, my update idea won't work because the second thread will simply queue up behind the same record.

How about this:

select job_id
from adsonarkernel.queue
where job_id = (select min(job_id) from adsonarkernel.queue
where type = 0
and status = 0
and is_deleted = 0)
for update;

And of course job_id is sequence-generated so that the smallest job_id indicates the oldest unprocessed job.
Tom Kyte
April 13, 2009 - 4:10 pm UTC

Same problem as before - the subquery is read consistent.


you'd have to COPY the where clause

where type = 0
and status = 0
and is_deleted = 0


to the outer query as well and handle the case where you get zero rows.
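In other words, something along these lines (a sketch; the repeated predicates are evaluated again, in current mode, once the lock is finally obtained, so a second session comes back with zero rows instead of the same row, and the caller must retry):

```sql
select job_id
  from adsonarkernel.queue
 where job_id = ( select min(job_id)
                    from adsonarkernel.queue
                   where type = 0
                     and status = 0
                     and is_deleted = 0 )
   and type = 0
   and status = 0
   and is_deleted = 0
   for update;
```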

usually, if you want to do this "do it yourself queue" (bad bad horrendously bad idea by the way) you would use a loop like this:

-- inside a function returning q%rowtype, with l_rec q%rowtype and
--   resource_busy exception;
--   pragma exception_init( resource_busy, -54 );   -- ORA-00054: resource busy
-- declared above
for x in ( select rowid rid from q where <whatever defines "unprocessed"> )
loop
   begin
      select * into l_rec from q where rowid = x.rid FOR UPDATE NOWAIT;
      return l_rec;
   exception
      when resource_busy
      then
          null;  -- row locked by another worker, skip to the next one
   end;
end loop;
return null;



that finds the first "unlocked record" in the table and locks it. You return that record to be processed.

Notice the LACK OF A COMMIT. You do NOT WANT A COMMIT. If the invoker of this routine dies, they roll back, their record unlocks, and someone else will pick it up later. It is self cleaning.


And it does not have the concurrency race condition your example did...
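A caller of that routine might then look like this sketch (get_next_record is a hypothetical function wrapping the loop above):

```sql
declare
    l_rec  q%rowtype;
begin
    l_rec := get_next_record();   -- the row comes back locked by this session
    -- ... do the real work, mark the row processed ...
    commit;                       -- releases the lock only after success
exception
    when others then
        rollback;                 -- lock released; the row gets picked up later
        raise;
end;
/
```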