Skip to Main Content

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, Sokrates.

Asked: January 05, 2009 - 4:22 pm UTC

Last updated: October 07, 2009 - 8:36 am UTC

Version: 10.2.0.2

Viewed 1000+ times

You Asked

Hallo Tom,

during christmas break I found time to read
Expert Oracle Database Architecture
in total.
That was great reading, I learnt a lot, thank you !

I have 2 questions on Chapter 9., "Redo and Undo".

The first one is on redo for undo for DML on temporary tables:

a. it is clear that no redo must be generated for that, since temp tables never will be recovered.
b. it is clear that undo must be generated for that, since it might be rollbacked.
c. it is absolutely not clear, why redo for this undo is generated. You just say "redo is written for every undo" hence also for undo for DML on temp tables.

But am I correct, that this redo will never be used, and that it is just a "not yet implemented optimization", that no redo is generated for that at all ?
I mean, probably sometimes in the future there will be implemented a flag for undo "no need to write redo for that", thus allowing not to write redo for undo for DML on temp tables ?


The second one is on undo, on what I call "undo for trivial updates":
I always falsely thought of undo as physical stuff, always had the concept of "before image" in my head. As you point out, the blocks we modify may also be simultaneously modified by other users, so undo cannot be something "physical", instead of, it has to be something "logical":
The logic of a DML must be able to be undone.
Hence, it is clear, that an insert generates a minimum of undo: Oracle just needs to delete the row in case of rollback, very few undo information is needed for that.

Question is: why does a trivial update ( a column is updated with its value) generate so much undo ?

10gR2 > drop table t;

Table dropped.

10gR2 > create table t as
  2  select object_name from all_objects;

Table created.

10gR2 > select used_ublk
  2  from v$transaction
  3  where addr =
  4    (select taddr
  5    from v$session
  6     where sid = (select sid from v$mystat where rownum=1)
  7    );

no rows selected

10gR2 > update t set object_name=lower(object_name);

68791 rows updated.

10gR2 > select used_ublk
  2  from v$transaction
  3  where addr =
  4    (select taddr
  5     from v$session
  6     where sid = (select sid from v$mystat where rownum=1)
  7    );

 USED_UBLK
----------
      1521

10gR2 > rollback;

Rollback complete.

10gR2 > update t set object_name=object_name;

68791 rows updated.

10gR2 > select used_ublk
  2  from v$transaction
  3  where addr =
  4    (select taddr
  5     from v$session
  6     where sid = (select sid from v$mystat where rownum=1)
  7    );

 USED_UBLK
----------
      1521
   



The same amount of undo is generated.
In the first case, Oracle needs to write the before-update-values into the undo.
In the second case, Oracle could choose just to write "no need to undo something" into the undo, instead of that, it seems, it does just the same.

So my suspect is, this also is an optimization "not yet implemented" and probably "might be seen sometimes in the future".
Am I correct ?

Thank you in advance

and Tom said...

... c. it is absolutely not clear, why redo for this undo is generated. You just say "redo is written for every undo" hence also for undo for DML on temp tables. ...

it is because undo is protected by redo, we do not differentiate between undo for temporary stuff and undo for permanent stuff. undo is undo.


... But am I correct, that this redo will never be used, and that it is just a "not yet implemented optimization", that no redo is generated for that at all ? ...

it would be used during instance crash recovery and media failure. We need to put undo back the way it was in order to use it. Imagine a crash and we restart and go through the roll forward (recovery) phase - and all of a sudden there are these big blank holes in the undo segments. undo is a complex structure - and the undo for the temporary tables is intermingled with the undo for permanent tables - it is not 'separate'.




As for the last bit about the "set column=column" - yes, that is an optimization we don't have(however, if there is an index on the column... there is an optimization there :) we don't mess with the index unless the value changes.

Rating

  (7 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

any chance ?

Sokrates, January 08, 2009 - 5:58 am UTC

Thanks.

Do you see any chance to see this optimization Oracle hasn't implemented so far in the - nearer - future ?
Tom Kyte
January 08, 2009 - 11:29 am UTC

it is possible, sure - but it could be tricky.

We need to update the row - to lock it - and doing the extra compare for every updated column could/would be really expensive to bulk updates. The gain of not modifying an index key outweighs the cost of the column compare - but perhaps not so for a simple block update.

maybe ...

Sokrates, January 08, 2009 - 11:48 am UTC

I would find it an absolute great improvement.
I often have to update large amount of data (say: Oracle table) against source data (say: external table)

merge into <table> t
using (
select * from <externaltable>
minus
select * from <table>
) diff
on (t.<pk_columns> = diff.<pk_columns>)
....


is not too bad, but of course, if only 1 of 44 column-values
is different, then "minus" gives a whole row !

the cost of the column compare: maybe this behaviour could be settible via parameter ...

Using MERGE to minimize undo and redo

Stew Ashton, January 10, 2009 - 5:06 am UTC


Like Sokrates, I often use MERGE to "refresh" tables. I think there is a sweet spot for this technique when most of the data in the target table doesn't change and materialized view fast refresh is not an option. It doesn't commit in the middle like TRUNCATE + INSERT and it doesn't require as much index maintenance. I have adapted your "compare tables" technique to "refresh" a table; in 10G, it's one SQL statement. Does this look OK to you?
create table t_target (key1 number primary key, value1 varchar2(256));
insert into t_target values (1, 'DO NOTHING - same in source and target');
create table t_source as select * from t_target;
insert into t_source values (2, 'INSERT - in source, not in target');
insert into t_target values (3, '(update then) DELETE - not in source, in target');
insert into t_target values (4, 'DO NOTHING - update this from source');
insert into t_source values (4, 'UPDATE - put this in target');
select * from t_target order by 1
/
--KEY1                   VALUE1
------------------------ ------------------------------------------------
--1                      DO NOTHING - same in source and target
--3                      (update then) DELETE - not in source, in target
--4                      DO NOTHING - update this from source
--
--3 rows selected

select * from t_source order by 1
/
--KEY1                   VALUE1
------------------------ -------------------------------------
--1                      DO NOTHING - same in source and target
--2                      INSERT - in source, not in target
--4                      UPDATE - put this in target
--
--3 rows selected

merge into T_TARGET o
using (
  select a.*, sum(new_cnt - old_cnt) over(partition by KEY1) mod_type from (
    select KEY1, VALUE1, sum(old_cnt) old_cnt, sum(new_cnt) new_cnt from (
      select o.*, 1 old_cnt, 0 new_cnt from T_TARGET o
      union all
      select n.*, 0 old_cnt, 1 new_cnt from T_SOURCE n
    ) group by KEY1, VALUE1 having sum(old_cnt) <> sum(new_cnt)
  ) a
) n
on (o.KEY1 = n.KEY1)
when matched then update set VALUE1 = n.VALUE1
  where mod_type in (-1, old_cnt)
delete where mod_type = -1
when not matched then insert (KEY1, VALUE1) values (n.KEY1, n.VALUE1)
/
--3 rows merged
select * from t_target order by 1
/
--KEY1                   VALUE1
------------------------ -------------------------------------
--1                      DO NOTHING - same in source and target
--2                      INSERT - in source, not in target
--4                      UPDATE - put this in target
--
--3 rows selected

Tom Kyte
January 12, 2009 - 8:29 pm UTC

that is good as long as the cost of running the query to find "modified" stuff isn't larger than the cost of just updating the rows (eg: MOST/MANY of the rows would be excluded...)

undo/redo generated for index on temp table

adam tarshis, September 28, 2009 - 12:37 pm UTC

Tom,

Thanks for a great explanation about undo/redo issues on global temp tables. I have a followup on the application of the undo with regards to an index on the global temp table in this example.

Assuming no index, the redo generated from the undo for an insert of 100k records into an empty global temp table would be minimal. Rolling back clearly takes us back to an empty table which is easy to log. My follow up is: why does the presence of the index change this for an insert (I understand the issues with regards to undo/redo on updates and deletes)?

With the index we get a ton of new undo/redo for the same insert into an empty global temp table. You make reference to not having "big blank holes" in the undo space - but i'm still not clear why the presence or lack of an index on the temp table would change the presence of holes in the undo data - if we're not logging the data on the insert - why do we have to log the index?

thanks

adam
Tom Kyte
September 30, 2009 - 7:25 am UTC

... My follow up is: why
does the presence of the index change this for an insert ...

because when you insert a row into a table with an index, we have to maintain the index, the index is a complex structure and generates lots of undo to manage itself.



we have to generate undo for the index because we have to support multi-versioning of the data.


say you

a) insert some data into global temporary table GTT. You also issue a savepoint
b) you open a cursor against that - it uses that index.
c) you delete data from GTT, you update data in GTT, you insert data in GTT
d) now you FETCH from cursor you opened in (b) - we need the undo to rollback the changes to undo what (c) did.
e) now you rollback to savepoint - we need to undo (c)

the gtt to you - to your session is just like a 'real table' - no different. We need undo to roll back to - undo is ALWAYS protected by redo - if there were gaps in the undo segment during instance recovery - that would be an intolerable situation for us.



if we're
not logging the data on the insert - why do we have to log the index?


but we ARE logging data on the insert. You know that - you wrote as much:

Assuming no index, the redo generated from the undo for an insert of 100k
records into an empty global temp table would be minimal.


minimal is not "not logging"

A reader, September 30, 2009 - 8:48 am UTC

Respected Sir,

You reply to original poster as below:

<qoute/>
As for the last bit about the "set column=column" - yes, that is an optimization we don't have(however, if there is an index on the column... there is an optimization there :) we don't mess with the index unless the value changes.
<qoute/>

Could you please flash more light on this?


Thanks
Tom Kyte
October 07, 2009 - 8:36 am UTC

what more can I say?

if you update set c=c then

a) yes, we update the row on the block for the table, we have to - in order to lock the row

b) no, we don't update the index entry for that row if c is indexed - we recognize that c did not change and we do not go to the index and touch it in that case.

A reader, October 07, 2009 - 11:27 am UTC

Thanks SIR that clear my understanding


A reader, May 27, 2014 - 8:51 am UTC