Thanks for the question, Charles.

Asked: April 27, 2010 - 10:21 am UTC

Last updated: April 29, 2010 - 2:14 pm UTC

Version: 10.1.0.4

Viewed 1000+ times

You Asked

We've got some code converted from our WANG system (and COBOL) into PL/SQL. Effectively, our business rules are coded into our table triggers. We have a global variable which we use to short-circuit these business rules from executing. We haven't noticed any problems (4+ years) until recently. We were trying to remove 53K records from a table, skipping all business rules from executing, when the DELETE failed because of business rule violations. So, I debugged & debugged until I found that the TRIGGER was firing twice for the same record, where the second trip through the the TRIGGER occurred with that global variable set to NULL, thus NOT bypassing the business rules and triggering the RAISE_APPLICATION_ERROR(). I've recreated the scenario with the code below on 10.1.0.4, and someone else did the same in version 11.2.0.1. Here is the code for maintaining the 'global variable':

create
PACKAGE                      cfr4521.pacepack as
 
operation_name      varchar2(30) := null;
procedure set_operation_name ( oper_name varchar2 );
function get_operation_name return varchar2;
 
end pacepack;
 
create 
PACKAGE BODY        cfr4521.PACEPACK as
 
  FUNCTION get_operation_name return varchar2 is
    begin
      return nvl(pacepack.operation_name,' ');
    end;
 
  PROCEDURE set_operation_name ( oper_name varchar2 ) as
 
  begin
     pacepack.operation_name := oper_name;
  end set_operation_name;
 
end pacepack;

Here are the 53K records:

create table cfr4521.delete_data
as
select owner, object_name
from dba_objects
where rownum < 53001;

Here's the table trigger

create or replace trigger cfr4521.delete_data_bt
before delete on cfr4521.delete_data
for each row
begin
 
 if nvl(cfr4521.pacepack.get_operation_name,'NADA') != 'NO UP' then
   raise_application_error(-20500, 'Not NO UP!');
 end if;
 cfr4521.pacepack.set_operation_name(null);
 
exception when others then
 cfr4521.pacepack.set_operation_name(null);
 raise;
end;

and here's the PL/SQL which DELETEs the records

declare
 
CURSOR L_DELETE_CSR IS
  SELECT * FROM cfr4521.delete_data                                                  
  for update;      
 
BEGIN
 
 for l_delete_row in l_delete_csr
 loop
 
  cfr4521.pacepack.set_operation_name('NO UP');
  DELETE FROM cfr4521.delete_data
      where current of L_delete_CSR;                                     
 
 end loop;
 
END;

Any idea why the custom -20500 error pops up after several thousands of records are deleted? It's very strange to me at this point ... Thank you for your help, ---=Chuck

and Tom said...

... Effectively, our business rules are coded into our table triggers ...

frequent readers of this site can already see where this is going:

https://www.oracle.com/technetwork/issue-archive/2008/08-sep/o58asktom-101055.html

triggers - ugh, you just have to hate them. Especially when someone says "our business rules are coded into them". cringe....

... We have a global variable which we use to short-circuit these business rules from executing. ....

and now I really cringe :( Magic on Magic...

Now, before we discuss this - you must be made aware that not only do we NOT promise to fire the trigger once - we actually promise to fire it more than once in certain cases:

http://asktom.oracle.com/Misc/something-different-part-i-of-iii.html
http://asktom.oracle.com/Misc/part-ii-seeing-restart.html
http://asktom.oracle.com/Misc/part-iii-why-is-restart-important-to.html

this has *always* been true - complex logic in triggers is a scary scary thing (I don't are what database you are using - it is scary).

You are doing something non-transactional in a trigger - we are doing a statement level rollback silently on you (we do those, we have always done those, we will always do those). When we rollback - your code doesn't "rollback".

This is why doing anything non-transactional in a trigger is a really really bad idea - modifying globals in a trigger - you have to be really really really careful there.

Here is a simplified example, you'll see similar results in 9i, 10g, 11g - that is all I tested on:

ops$tkyte%ORA11GR2> drop table delete_data;

Table dropped.

ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> create or replace PACKAGE pacepack
  2  as
  3      type array is table of number index by varchar2(40);
  4      g_data array;
  5      g_cnt  number;
  6  end pacepack;
  7  /

Package created.

ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> create table delete_data
  2  as
  3  select owner, object_name
  4  from dba_objects
  5  where rownum < 53001;

Table created.

ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> create or replace trigger delete_data_bt
  2  before delete or update on delete_data
  3  for each row
  4  begin
  5      if ( pacepack.g_data.exists(rowidtochar(:old.rowid)) )
  6      then
  7          dbms_output.put_line( 'doing "' || :old.rowid || '" again was called ' || pacepack.g_cnt );
  8      else
  9          pacepack.g_data(rowidtochar(:old.rowid)) := 1;
 10      end if;
 11      pacepack.g_cnt := pacepack.g_cnt + 1;
 12  end;
 13  /

Trigger created.

ops$tkyte%ORA11GR2> declare
  2      CURSOR L_DELETE_CSR IS
  3      SELECT *
  4        FROM delete_data
  5         for update;
  6      l_cnt number := 0;
  7  BEGIN
  8       pacepack.g_data.delete;
  9       pacepack.g_cnt := 0;
 10      for l_delete_row in l_delete_csr
 11      loop
 12            update delete_data
 13             set owner = lower(owner)
 14           where current of L_delete_CSR;
 15          l_cnt := l_cnt + 1;
 16      end loop;
 17      dbms_output.put_line( 'trigger count = ' || pacepack.g_cnt || ' local count = ' || l_cnt );
 18  END;
 19  /
doing "AAAYMZAAEAABFk7AAQ" again was called 24634
doing "AAAYMZAAEAABFm2ABd" again was called 35899
doing "AAAYMZAAEAABFnxABh" again was called 47164
trigger count = 53003 local count = 53000

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> declare
  2      CURSOR L_DELETE_CSR IS
  3      SELECT *
  4        FROM delete_data
  5         for update;
  6      l_cnt number := 0;
  7  BEGIN
  8       pacepack.g_data.delete;
  9       pacepack.g_cnt := 0;
 10      for l_delete_row in l_delete_csr
 11      loop
 12            DELETE FROM delete_data
 13           where current of L_delete_CSR;
 14          l_cnt := l_cnt + 1;
 15      end loop;
 16      dbms_output.put_line( 'trigger count = ' || pacepack.g_cnt || ' local count = ' || l_cnt );
 17  END;
 18  /
doing "AAAYMZAAEAABFfSACs" again was called 3878
doing "AAAYMZAAEAABFf0ACR" again was called 11795
doing "AAAYMZAAEAABFkcABZ" again was called 19235
doing "AAAYMZAAEAABFmHAAE" again was called 26641
doing "AAAYMZAAEAABFmtAAM" again was called 34092
doing "AAAYMZAAEAABFnTAC8" again was called 41464
doing "AAAYMZAAEAABFn6ACQ" again was called 48844
trigger count = 53007 local count = 53000

PL/SQL procedure successfully completed.

You cannot do things in a trigger that cannot be rolled back - unless you are willing to have them NOT BE rolled back.

Rating

(9 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Charles Forbes, April 27, 2010 - 4:09 pm UTC

So, we're not getting out of this situation anytime soon. Again, there's quite a bit of code set up this way, having been done so by a contractor.

That being the case, you mentioned that, "You cannot do things in a trigger that cannot be rolled back." So, if I change the code for PACEPACK.SET_OPERATION_NAME to insert/update to a Global Temporary Table, then that should solve the problem?

April 28, 2010 - 7:42 am UTC

that would be a Rube Goldberg solution to the problem - yes.

It would make it slower than it already is.

Why wouldn't you just TURN IT OFF in the block, and TURN IT ON in the block - why is the trigger doing anything besides validation here???

Why does the trigger set some flag? That is a bug.

Why is the trigger a BEFORE DELETE - if it were AFTER DELETE and you always delete singleton records - it would work "by accident". The restarts happen AFTER the BEFORE FOR EACH ROW trigger and BEFORE the AFTER FOR EACH ROW trigger. The restarts always happen at that point. Therefore, if you have a single row update - and you KNOW it will ALWAYS be a single row update - the AFTER trigger will not fire twice. The before and before each row triggers will (in this case, in general before, before for each row, after for each row triggers can fire twice in a multi-row update).

ops$tkyte%ORA11GR2> create or replace PACKAGE pacepack
  2  as
  3      type array is table of number index by varchar2(40);
  4      g_data  array;
  5      g_data2 array;
  6      g_cnt   number;
  7      g_cnt2  number;
  8  end pacepack;
  9  /

Package created.

ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> create table delete_data
  2  as
  3  select owner, object_name
  4  from dba_objects
  5  where rownum < 53001;

Table created.

ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> create or replace trigger delete_data_BDFER
  2  before delete on delete_data
  3  for each row
  4  begin
  5      if ( pacepack.g_data.exists(rowidtochar(:old.rowid)) )
  6      then
  7          dbms_output.put_line( 'BEFORE doing "' || :old.rowid || '" again was called ' || pacepack.g_cnt );
  8      else
  9          pacepack.g_data(rowidtochar(:old.rowid)) := 1;
 10      end if;
 11      pacepack.g_cnt := pacepack.g_cnt + 1;
 12  end;
 13  /

Trigger created.

ops$tkyte%ORA11GR2> create or replace trigger delete_data_ADFER
  2  AFTER delete on delete_data
  3  for each row
  4  begin
  5      if ( pacepack.g_data2.exists(rowidtochar(:old.rowid)) )
  6      then
  7          dbms_output.put_line( 'AFTER doing "' || :old.rowid || '" again was called ' || pacepack.g_cnt );
  8      else
  9          pacepack.g_data2(rowidtochar(:old.rowid)) := 1;
 10      end if;
 11      pacepack.g_cnt2 := pacepack.g_cnt2 + 1;
 12  end;
 13  /

Trigger created.

ops$tkyte%ORA11GR2> declare
  2      CURSOR L_DELETE_CSR IS
  3      SELECT *
  4        FROM delete_data
  5         for update ;
  6      l_cnt number := 0;
  7  BEGIN
  8       pacepack.g_data.delete;
  9       pacepack.g_data2.delete;
 10       pacepack.g_cnt := 0;
 11       pacepack.g_cnt2 := 0;
 12      for l_delete_row in l_delete_csr
 13      loop
 14            DELETE FROM delete_data
 15           where current of L_delete_CSR;
 16          l_cnt := l_cnt + 1;
 17      end loop;
 18      dbms_output.put_line( 'before trigger count = ' || pacepack.g_cnt || ' after trigger count = ' || pacepack.g_cnt2 || ' local count = ' || l_cnt );
 19  END;
 20  /
BEFORE doing "AAAYOBAAEAABBRPACz" again was called 3381
BEFORE doing "AAAYOBAAEAABBRyACm" again was called 11293
BEFORE doing "AAAYOBAAEAABDdhAC9" again was called 18767
BEFORE doing "AAAYOBAAEAAAlaEAB9" again was called 26170
BEFORE doing "AAAYOBAAEAAAlaqABl" again was called 33619
BEFORE doing "AAAYOBAAEAAAlbRABb" again was called 40991
BEFORE doing "AAAYOBAAEAAAlb4AAc" again was called 48379
before trigger count = 53007 after trigger count = 53000 local count = 53000

PL/SQL procedure successfully completed.

But it takes an understanding of this restart phenomena - how it happens, when it happens, where it happens to understand that for a single row update - it'll work by accident.

And now you know in no unambiguous terms to never utilize that contractor again - or any other 'trigger happy' contractor...

A reader, April 27, 2010 - 4:13 pm UTC

Dear Sir,

Still don't unerstand why trigger is fire second times??

Thanks

April 28, 2010 - 7:43 am UTC

read the three links, it is a restart - they happen all of the time, and remember that it doesn't really matter WHY it fired, it only matters that it CAN and WILL fire that way.

Charles Forbes, April 27, 2010 - 4:28 pm UTC

As a second to the "I don't understand" post, I understand why the trigger could fire twice in a multi-user environment, but not in a single-user environment. Your 3-part series makes sense, but not in this context.

Not a criticism, as I obviously need to code in a different manner, but my curious side is wondering why you'd see rollbacks for a single record, in a single-user transaction.

April 28, 2010 - 7:52 am UTC

You should just be thanking your lucky stars it did. Imagine how hard this would be to figure out if you had to have X aligned with Z on a tuesday that isn't raining to have your "Not NO UP!" message appear *sometimes*

It is likely due to the large volume of micro transactions taking place - overwriting some undo information that the delete (the individual tiny deletes) felt they needed - sort of like delayed block cleanout, sort of like why you can get a 1555 on a read only tablespace:

http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:895410916429

it is all about read consistency.

A thought

Galen Boyer, April 27, 2010 - 9:29 pm UTC

I'm with Tom, as I am almost always. But, you do sound stuck with
your codebase and are looking for some kind of minimally intrusive
solution?

You could probably keep the jist of your logic with adding the use of
a sequence. The set operation_name could set a different global
variable, say oper_id, to seq.nextval if the operation_name passed to
it was null or it could set it to -1 (some number anyways). The
trigger could check if the currval = get_oper_id. (You'd have to check
for the error that the currval can cause if nextval hasn't been
called.) If currval = get_oper_id, then bypass, if not, then do the
business logic.

This would get the trigger out of the business of setting a global
variable, which on multiple trigger firings, is the cause of your
issue. The trigger would just check a condition.

You could probably implement that logic in your codebase without
gutting the dependency on the triggers in your codebase. You'd still
have to touch all triggers, but maybe those are generated? And each
package would need to have its set_operation_name modified. But maybe
those all call a common method somewhere?

A reader, April 28, 2010 - 10:39 am UTC

(Bear with me, as some of these changes are easier than others to implement system-wide.)

Moving the setting & unsetting of this PACKAGE SPEC variable to the code block executing the DELETE, INSERT or UPDATE would take a large amount of effort. It's simply in a ton of locations. Redefining PACEPACK to use a GLOBAL TEMPORARY TABLE ... easy. But, redefining these BEFORE ROW triggers as AFTER ROW triggers is something I could see as being somewhere in-between, if it makes better sense for our system as a whole.

Would your same suggestion of moving everything to the AFTER ROW trigger apply if you knew that these aren't always singleton updates? They're only singleton updates if we need to set this OPERATION_NAME before the DML. I've taken for granted the use of the BEFORE ROW trigger usage by this vendor, as we were all new to Oracle as these systems were delivered (remember, this was a WANG to Oracle conversion for us). But maybe the AFTER ROW should have been where these TRIGGERS should've resided all along.

April 28, 2010 - 12:51 pm UTC

... Would your same suggestion of moving everything to the AFTER ROW trigger apply
if you knew that these aren't always singleton updates?...

Not at all - if you update more than one row and we restart, you'll have double firing of your after triggers too.

But wait - if you update more than one row - your trigger logic flips the switch off - only the first randomly processed record has no validation - the rest - all do??? that doesn't make sense (the logic of the trigger doesn't make sense to me frankly, to automatically turn on validation after it doesn't validate one record...)

If the triggers DO NOT modify the :new record - they should be after triggers, the only reason you use before for each row triggers is to MODIFY the row values.

A reader, April 28, 2010 - 1:28 pm UTC

so you mean to say that :OLd.row_id change thats the reason the trigger fire twice?

Thanks

April 28, 2010 - 2:06 pm UTC

no, not at all. I'm not sure what you mean by that or what led you down that thought...

Please expand on that.

A reader, April 28, 2010 - 1:37 pm UTC

The reason we're using the cursor and WHERE CURRENT OF to perform the DELETE in this case, is solely to set the OPERATION_NAME variable prior to each row's deletion. It's done that way, since OPERATION_NAME is set to NULL at the end of the trigger.

We wouldn't perform a SET-level delete, and assign a value to the OPERATION_NAME, for the reason you stated. The OPERATION_NAME would only retain it's value for the first trip through the TRIGGER.

April 28, 2010 - 2:08 pm UTC

I know, you are forced to do something really bad here because of the twisted, poorly formed, bad trigger you have in place. You cannot use the database as a database, you have to pretend it is a WANG system from the 80's or something.

A reader, April 28, 2010 - 4:45 pm UTC

As per your first part regarding write consistency quoted as below

>SELECT FOR UPDATE mode (which has the same read-consistent and read
>current block gets going on as an update does), a row that was Y=5
>when you started the SELECT FOR UPDATE is found to be Y=11
>when you go to get the current version of it? That
>SELECT FOR UDPDATE will restart and the cycle begins again.

So in your first example to OP the restart is because of select for update ?

Thanks

April 28, 2010 - 5:46 pm UTC

it is not relevant why - it is only relevant that

a) it will happen

(very short relevancy list)

the delete where current of is restarting, the select for update has already selected and locked every row it will retrieve before the first delete took place.

A reader, April 29, 2010 - 2:06 pm UTC

in your demo(3 links you provide) you are talking about two concurrent session, where one is blocked by other and upon commit
by one session other session see restart

Here in your example there is only one session...

ops$tkyte%ORA11GR2> declare
2 CURSOR L_DELETE_CSR IS
3 SELECT *
4 FROM delete_data
5 for update;
6 l_cnt number := 0;
7 BEGIN
8 pacepack.g_data.delete;
9 pacepack.g_cnt := 0;
10 for l_delete_row in l_delete_csr
11 loop
12 update delete_data
13 set owner = lower(owner)
14 where current of L_delete_CSR;
15 l_cnt := l_cnt + 1;
16 end loop;
17 dbms_output.put_line( 'trigger count = ' || pacepack.g_cnt || ' local count = '
|| l_cnt );
18 END;
19 /

(1) Is it because select for update lock the row and later in the loop update also need to lock the same row and that is causing the restart?

(2) Also why only 3 rows see update during first run and 7 rows in the next run?

Please explain

millions thanks

April 29, 2010 - 2:14 pm UTC

Those links prove the existence of a restart - that is all that is necessary to complete this page.

I don't really care *WHY* for this one, I don't want to get the conversation distracted right now - I want to stay focused on this one important fact:

they happen, they can happen, they will happen, they do happen - be prepared for it, code for it, avoid triggers that have any side effects - heck, avoid triggers

that is my goal for this post - I'm keeping it on target.

Breadcrumb

Question and Answer

You Asked

and Tom said...

Rating

Comments

A thought

More to Explore

PL/SQL demos

PL/SQL docs

Same row is getting deleted twice

Breadcrumb

Question and Answer

You Asked

and Tom said...

Rating

Comments

A thought

More to Explore

PL/SQL demos

PL/SQL docs