Home>Question Details



prakash -- Thanks for the question regarding "NVL in update columns", version 9.1.2

Submitted on 23-Feb-2009 1:46 Central time zone
Last updated 5-Mar-2009 20:20

You Asked

Hi Tom,

Just a thought,

I have two columns in a table(T) say columns A and B.

I want to update a value to any one of these columns which is not null.
definitely one of these columns is NULL and the other is NOT NULL

So I have to write,

UPDATE T
  SET A = 2
WHERE A IS NOT NULL
/
UPDATE T
  SET B = 2
WHERE B IS NOT NULL
/

to write in single UPDATE,

UPDATE T
  SET A = decode(A,NULL,NULL,2),
    B = decode(B,NULL,NULL,2)
/

Is there a way to write in single update with out case and decode ?

Just a thought, Oracle can be designed in such a way to
handle NVL in UPDATE SET Columns like,
UPDATE SET NVL(A,B) = 2 -- Just a thought

regards,
PK


and we said...

you need an LVALUE (lvalue being a computer science sort of term, a value that can be modified). Since SQL has no concept of a pointer - a "function" will never be an lvalue, only an rvalue (something that can be used to assign from, but not to).

Your update would be actually:
update t 
set a = case when a is null then 2 else a end, 
b = case when b is null then 2 else b end
WHERE A is null OR B is null;




Reviews    
5 stars   March 3, 2009 - 4am Central time zone
Reviewer: A reader 
Tom,

I think prakash stated

"Is there a way to write in single update with out case and decode".

Your solution definitely uses 'case'.

If I understand prakash's question correctly he is looking for something like

update t
set    a = nvl(a, 2)
,      b = nvl(b, 2)
where  a is null
or     b is null      


Followup   March 3, 2009 - 4pm Central time zone:

actually, we both got it wrong.


His working query is:

UPDATE T
SET A = decode(A,NULL,NULL,2),
  B = decode(B,NULL,NULL,2)
/


if A is NULL, let it be - else 2. We both did it backwards.

NVL cannot work here.
1 stars   March 4, 2009 - 1am Central time zone
Reviewer: prakash from Lake Charles, LA, USA
Tom,

Thanks for your inputs.

This works for me.

UPDATE T
   SET A = NVL2(A,2,A),
       B = NVL2(B,2,B)
/

regards,
Prakash


Followup   March 4, 2009 - 1pm Central time zone:

you might want to add

where a is not null or b is not null



but one wonders why you cannot use case (ansi) versus NVL2 (not)


it certainly would be a lot more readable.
1 stars   March 4, 2009 - 2pm Central time zone
Reviewer: prakash from lakecharles,la,usa
Tom,
 
 Thanks for your suggestion.

 Even without A is NOT NULL or B is NOT NULL
 this NVL2 will work.

 Ofcourse, we can use CASE instead NVL2.
 Already I used DECODE, it is working fine.

 Since you said NVL can't work for this, out of curious I thought and got this NVL2.

One more thing, CASE have more SYNTAX than DECODE and NVL2, better to use NVL2 ( hope performance 
wise there will not be much difference among CASE / DECODE / NVL2 ).

If CASE is more optimized command than DECODE/NVL2 then I will change to CASE.


Regards,
Prakash


Followup   March 4, 2009 - 2pm Central time zone:

it will work, but it will update (and hence lock) every row in the table - it does not do what your original two updates did.

you should switch to case just because it makes the update infinitely more understandable. Take pity on those that will follow you in the future.
3 stars   March 4, 2009 - 2pm Central time zone
Reviewer: A reader 
Tom,

 Actually in my real application scenario, I have a table with many columns. It has primary key 
column and other two columns( say A, B) come under this NOT NULL problem.

I will put primary key column in WHERE clause. But I don't know A / B is NULL or NOT NULL. but 
definitely one of these columns will be NOT NULL and the other will be NULL.

If NULL, I will leave as it is. If NOT NULL I have to UPDATE that column value with COST.

Any way thanks especially, I have to thank you for I have learnt a lot, still learning, exploring 
Oracle.. from you. ( One on One / ASK TOM.. etc )

Regards,
PRakash


3 stars Skip redundant update   March 4, 2009 - 7pm Central time zone
Reviewer: Charlie MuJiang from Victoria, BC Canada
Oracle 11.1.0.7 on Linux 64bit.

Scenario: our book seller upload their books data into our database,
there are many columns in book table, but book sellers do not tell us which column to update, they 
supply everything.

My question is how to skip the column that is not changed? then we can dramatically reduce the 
undo/redo.
I heard SQL*Server can ignore this kind of update "intelligently". :)


Here is an example test case:

-- drop table book;

create table book
(
book_id number,
author varchar2(100),
title varchar2(100),
price number(10,2),
qty number(5),
constraint book_pk primary key(book_id)
);


Here is heap table to simulate the source external table,

-- drop table book_ext;

create table book_ext
(
book_id number,
author varchar2(100),
title varchar2(100),
price number(10,2),
qty number(5)
);

insert into book(book_id, author, title, price, qty) values( 1, 'Bob',   'tube in winter', 10.52, 
3);
insert into book(book_id, author, title, price, qty) values( 2, 'Joe',   'tube in summer', 10.52, 
4);
insert into book(book_id, author, title, price, qty) values( 3, 'Jerry', 'tube in spring', 10.52, 
5);

insert into book_ext(book_id, author, title, price, qty) values( 1, 'Bob',   'tube in winter', 21, 
3);
insert into book_ext(book_id, author, title, price, qty) values( 2, 'Joe',   'tube in summer', 22, 
4);
insert into book_ext(book_id, author, title, price, qty) values( 3, 'Jerry', 'tube in spring', 23, 
5);

commit;


Here is merge SQL

MERGE INTO book d
using (select book_id, author, title, price, qty from book_ext ) s
on (d.book_id = s.book_id)
WHEN MATCHED THEN 
update set
 d.author = s.author,
 d.title = s.title,
 d.price = s.price,
 d.qty = s.qty
;

You can see only price changed, and the SQL I really want to run is:

MERGE INTO book d
using (select book_id, author, title, price, qty from book_ext ) s
on (d.book_id = s.book_id)
WHEN MATCHED THEN 
update set
 d.price = s.price
;

Any suggestion will be appreciated. hope I did not post in the wrong place and get deleted.  ^_^

Thanks,
Charlie


Followup   March 4, 2009 - 9pm Central time zone:

... My question is how to skip the column that is not changed? then we can
dramatically reduce the undo/redo.
I heard SQL*Server can ignore this kind of update "intelligently". :)
....

a) if the column is indexed, we intelligently do not touch the index if the column was unchanged

b) no, you cannot skip a column row by row, you either 1) update it, or 2) do not update it - for every row in the update

c) define the sqlserver feature, no baiting, technically and accurately describe what it does please.
ops$tkyte%ORA10GR2> create table t
  2  as
  3  select rpad('*',40,'*') a, rpad('*',40,'*') b , rpad('*',40,'*') c
  4    from all_objects
  5  /

Table created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> @mystat "redo size"
ops$tkyte%ORA10GR2> set echo off

NAME                                               VALUE
--------------------------------------------- ----------
redo size                                       50913500

ops$tkyte%ORA10GR2> update t set a = a, b = b, c = c;

49999 rows updated.

ops$tkyte%ORA10GR2> @mystat2
ops$tkyte%ORA10GR2> set echo off

NAME                                               VALUE DIFF
--------------------------------------------- ---------- ------------------
redo size                                       79983900       29,070,400

ops$tkyte%ORA10GR2> select used_ublk from v$transaction;

 USED_UBLK
----------
      1771

ops$tkyte%ORA10GR2> commit;

Commit complete.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> @mystat "redo size"
ops$tkyte%ORA10GR2> set echo off

NAME                                               VALUE
--------------------------------------------- ----------
redo size                                       79984024

ops$tkyte%ORA10GR2> update t set a = a, b = b;

49999 rows updated.

ops$tkyte%ORA10GR2> @mystat2
ops$tkyte%ORA10GR2> set echo off

NAME                                               VALUE DIFF
--------------------------------------------- ---------- ------------------
redo size                                      104823452       24,839,428

ops$tkyte%ORA10GR2> select used_ublk from v$transaction;

 USED_UBLK
----------
      1325

ops$tkyte%ORA10GR2> commit;

Commit complete.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> @mystat "redo size"
ops$tkyte%ORA10GR2> set echo off

NAME                                               VALUE
--------------------------------------------- ----------
redo size                                      104823620

ops$tkyte%ORA10GR2> update t set a = a;

49999 rows updated.

ops$tkyte%ORA10GR2> @mystat2
ops$tkyte%ORA10GR2> set echo off

NAME                                               VALUE DIFF
--------------------------------------------- ---------- ------------------
redo size                                      123809580       18,985,960

ops$tkyte%ORA10GR2> select used_ublk from v$transaction;

 USED_UBLK
----------
       969

ops$tkyte%ORA10GR2> commit;

Commit complete.





results: undo is more, but not 2x, 3x more.
results: redo is more, but not 2x, 3x more.

there is some 'intelligence' there.


further:

ops$tkyte%ORA10GR2> create index t_idx on t(a,b,c);

Index created.

ops$tkyte%ORA10GR2> @mystat "redo size"
ops$tkyte%ORA10GR2> set echo off

NAME                                               VALUE
--------------------------------------------- ----------
redo size                                      202805140

ops$tkyte%ORA10GR2> update t set a = a, b = b, c = c;

49999 rows updated.

ops$tkyte%ORA10GR2> @mystat2
ops$tkyte%ORA10GR2> set echo off

NAME                                               VALUE DIFF
--------------------------------------------- ---------- ------------------
redo size                                      227344072       24,538,932

ops$tkyte%ORA10GR2> select used_ublk from v$transaction;

 USED_UBLK
----------
      1317

ops$tkyte%ORA10GR2> commit;

Commit complete.


4 stars Charlie, please at least skip unchanged rows !   March 5, 2009 - 4am Central time zone
Reviewer: Sokrates 
Charlie, please don't
MERGE INTO book d
using (select book_id, author, title, price, qty from book_ext ) s
....


but

MERGE INTO book d
using (select book_id, author, title, price, qty from book_ext minus select book_id, author, title, 
price, qty from book) s
...

which skips all unchanged rows !


3 stars Skip redundant update   March 5, 2009 - 1pm Central time zone
Reviewer: Charlie MuJiang from Victoria, BC Canada
Hi Tom,

1. Please remove the sentence about SQL Server. I know this site is serious and profession. I'll 
ask my SQL Server DBA friend to confirm later.
2. I feel lucky the previous comments did not get deleted. :), so it's related.

OK, let me give you a big picture.

*) Here is our real data distribution and statistics,
Average row length is about 1000 bytes, author/title/description/subject/key_words these synopsis 
columns used 90%+ spaces, and less updated; price/qty these columns used less than 10% space, 
freuently updated.

I use RunStats did same undo/redo benchmark before, in our case, redundant updates exhaust too much 
resources. 

*) The update columns combination patterns arrive randomly,
e.g.
 author + price, 20 rows
 price  + qty,   900 rows
 title  + ISBN,  50 rows

For example, 2 rows update price, 2 rows update author and qty:
insert into book(book_id, author, title, price, qty) values( 1, 'Bob',   'tube in winter', 10.52, 
4);
insert into book(book_id, author, title, price, qty) values( 2, 'Joe',   'tube in summer', 10.52, 
5);
insert into book(book_id, author, title, price, qty) values( 3, 'Jerry', 'tube in spring', 10.52, 
6);
insert into book(book_id, author, title, price, qty) values( 4, 'David', 'tube in fall',   10.52, 
7);
commit;

insert into book_ext(book_id, author, title, price, qty) values( 1, 'Bob',   'tube in winter', 
55.11, 4);
insert into book_ext(book_id, author, title, price, qty) values( 2, 'Joe',   'tube in summer', 
72.11, 5);

insert into book_ext(book_id, author, title, price, qty) values( 3, 'BBB',   'tube in spring', 
10.52, 9);
insert into book_ext(book_id, author, title, price, qty) values( 4, 'DDD',   'tube in fall',   
10.52, 9);
commit;

*) I want to group same update pattern rows together to do bulk update to reduce the PL/SQL -> SQL 
engine context switch.

*) To make it more complex, 80% book sellers do not supply book_id, we have to match existing rows 
to know if it is a new book or an existing book. we call product catalog rollup, Amazon called item 
aggregation.
Our current solution is to generate hash_code based on these synopsis columns, do a 2 level lazy 
match, 1st level on hash_code, 2nd level on synopsis columns.

It is kind of a common real business scenario, happened in many B2C/B2B eCommerce applications, 
include Amazon, they have same issue.

I just want to try many approaches, to make a better efficient design. (acturally it's a rewrite, 
or refactoring).

Thanks,
Charlie


Followup   March 5, 2009 - 2pm Central time zone:

1) no, it is ok, it is just that we'd need information about "what that means" in order to address it.


... redundant
updates exhaust too much resources.
....

define that for us, what are you seeing - give numbers.
3 stars Skip redundant update   March 5, 2009 - 3pm Central time zone
Reviewer: Charlie MuJiang from Victoria, BC Canada
We do not index these synopsis columns. Synopsis data will be sent to our book Search Engine, to 
support front end UI keyword search. There is only one primary key index on book table.
(some B2C sites just do not update database, and send the synopsis data to Search Engine directly 
if the effected data in a row is less than 80%.)

STAT...redo size: 
2,589,196  (update price and qty 2 columns)
23,906,048 (update all columns)

Here is the benchmark:

-- drop table book;

create table book
(
book_id number,
vendor_id number,
price number(10,2),
qty number(5),
hash_code varchar2(32 byte) not null,
author varchar2(750),
title varchar2(750),
vendor_short_isbn number(13,0),
book_condition varchar2(30 byte),
book_editor varchar2(255 byte),
subject varchar2(2000 byte),
publisher_name varchar2(750 byte),
description varchar2(4000 byte),
constraint book_pk primary key(book_id)
) nologging;

insert /*+ append */ into book 
(book_id, vendor_id, price, qty, hash_code, author, title, vendor_short_isbn, book_condition, 
book_editor, subject, publisher_name, description)
select
 rownum book_id, 
 mod(rownum,50) vendor_id,
 10.55 price,
 30 qty,
 rpad('x',31,'x') hash_code,
 rpad('x',100,'x') author,
 rpad('x',100,'x') title,
 999999999012 vendor_short_isbn,
 rpad('x',20,'x') book_condition,
 rpad('x',30,'x') book_editor,
 rpad('x',30,'x') subject,
 rpad('x',30,'x') publisher_name,
 rpad('x',700,'x') description
from dual
connect by level <= 10000;

commit;

-- drop table book2;
create table book2 nologging
as
select * from book;

-- drop table book_ext;
create table book_ext nologging
as
select * from book;

set serveroutput on

begin
runStats_pkg.rs_start;

MERGE INTO book d
using (select book_id, price, qty from book_ext ) s
on (d.book_id = s.book_id)
WHEN MATCHED THEN 
update set
 d.price = s.price,
 d.qty = s.qty
;
commit;

runStats_pkg.rs_middle;
 
MERGE INTO book2 d
using (select * from book_ext ) s
on (d.book_id = s.book_id)
WHEN MATCHED THEN 
update set
 d.vendor_id         = s.vendor_id,
 d.price         = s.price,
 d.qty             = s.qty,
 d.hash_code         = s.hash_code,
 d.author         = s.author,
 d.title         = s.title,
 d.vendor_short_isbn = s.vendor_short_isbn,
 d.book_condition    = s.book_condition   ,
 d.book_editor         = s.book_editor      ,
 d.subject         = s.subject,
 d.publisher_name    = s.publisher_name,
 d.description         = s.description
;
commit;

runStats_pkg.rs_stop(0);

end;
/

Run1 ran in 15 hsecs
Run2 ran in 82 hsecs
run 1 ran in 18.29% of the time

Name                                            Run1        Run2        Diff
STAT...HSC Heap Segment Block Changes         10,005      10,006           1
STAT...consistent changes                         15          19           4
STAT...CPU used by this session                   15          23           8
STAT...redo buffer allocation retries              0          10          10
LATCH.Consistent RBA                               4          17          13
LATCH.lgwr LWN SCN                                 4          17          13
LATCH.mostly latch-free SCN                        4          17          13
LATCH.In memory undo latch                         5          27          22
LATCH.redo writing                                12          51          39
STAT...change write time                           2          42          40
LATCH.enqueues                                     4          48          44
STAT...redo entries                           10,006      10,050          44
LATCH.redo allocation                             15          64          49
STAT...Elapsed Time                               15          82          67
STAT...dirty buffers inspected                     0         103         103
LATCH.undo global data                             8         163         155
STAT...redo subscn max counts                     40         357         317
STAT...redo ordering marks                        39         357         318
STAT...db block changes                       20,061      20,428         367
STAT...free buffer inspected                     111         479         368
STAT...hot buffers moved to head of LRU           39         542         503
STAT...db block gets                          10,097      10,866         769
STAT...session logical reads                  10,851      11,668         817
STAT...IMU undo allocation size                   72       1,080       1,008
STAT...IMU Redo allocation size                    0      15,308      15,308
STAT...physical read bytes                    32,768           0     -32,768
STAT...undo change vector size               928,332  11,571,404  10,643,072
STAT...redo size                           2,589,196  23,906,048  21,316,852

Run1 latches total versus runs -- difference and pct
Run1        Run2        Diff       Pct
55,492      59,657       4,165     93.02%

PL/SQL procedure successfully completed.


Followup   March 5, 2009 - 4pm Central time zone:

23mb is tiny. Why are you worried about that???? It is nothing.
3 stars Skip redundant update   March 5, 2009 - 5pm Central time zone
Reviewer: Charlie MuJiang from Victoria, BC Canada
Do you mean 23MB redo per 10,000 rows update is tiny?

We update 30~50 millions rows a day, about 70~120GB redo, not much. our Linux PC server is not 
powerful, HP DL380, 2CPU, 8GB memory, we will upgrade it to 32GB soon.

Back to your track,

1. it is slow(Elapsed Time), we need to reduce the latency and improve the throughput.
2. it takes 700% more memory to do the in memory hash join.
 (For big book seller, from time to time we need to join source data more than 1 million rows, 12M 
* 100 = 1.2G PGA, not big)
3. Latches is OK, about same, scalability should be OK.

FYI, here is the memory used during execution.

-- Oracle 11.1.0.7
set serveroutput off
set linesize 120
set pagesize 0

MERGE /*+ gather_plan_statistics */ INTO book d
using (select book_id, price, qty from book_ext ) s
on (d.book_id = s.book_id)
WHEN MATCHED THEN 
update set
 d.price = s.price,
 d.qty = s.qty
;

SELECT * FROM table(dbms_xplan.display_cursor(NULL,NULL, 'iostats memstats last'));

commit;

----------------------------------------------------------------------------------------------------
------------------
| Id  | Operation            | Name     | Starts | E-Rows | A-Rows |   A-Time  | Buffers |  OMem |  
1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------
------------------
|   0 | MERGE STATEMENT      |          |      1 |        |      1 |00:00:00.15 |   10832 |       | 
      |          |
|   1 |  MERGE               | BOOK     |      1 |        |      1 |00:00:00.15 |   10832 |       | 
      |          |
|   2 |   VIEW               |          |      1 |        |  10000 |00:00:00.01 |     752 |       | 
      |          |
|*  3 |    HASH JOIN         |          |      1 |  10000 |  10000 |00:00:00.01 |     752 |   865K| 
  865K| 1409K (0)|
|   4 |     TABLE ACCESS FULL| BOOK_EXT |      1 |  10000 |  10000 |00:00:00.01 |     376 |       | 
      |          |
|   5 |     TABLE ACCESS FULL| BOOK     |      1 |  10000 |  10000 |00:00:00.01 |     376 |       | 
      |          |
----------------------------------------------------------------------------------------------------
------------------

MERGE /*+ gather_plan_statistics */ INTO book2 d
using (select * from book_ext ) s
on (d.book_id = s.book_id)
WHEN MATCHED THEN 
update set
 d.vendor_id         = s.vendor_id,
 d.price         = s.price,
 d.qty             = s.qty,
 d.hash_code         = s.hash_code,
 d.author         = s.author,
 d.title         = s.title,
 d.vendor_short_isbn = s.vendor_short_isbn,
 d.book_condition    = s.book_condition   ,
 d.book_editor         = s.book_editor      ,
 d.subject         = s.subject,
 d.publisher_name    = s.publisher_name,
 d.description         = s.description
;

SELECT * FROM table(dbms_xplan.display_cursor(NULL,NULL, 'iostats memstats last'));

commit;

----------------------------------------------------------------------------------------------------
------------------
| Id  | Operation            | Name     | Starts | E-Rows | A-Rows |   A-Time  | Buffers |  OMem |  
1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------
------------------
|   0 | MERGE STATEMENT      |          |      1 |        |      1 |00:00:00.75 |   11580 |       | 
      |          |
|   1 |  MERGE               | BOOK2    |      1 |        |      1 |00:00:00.75 |   11580 |       | 
      |          |
|   2 |   VIEW               |          |      1 |        |  10000 |00:00:00.02 |     752 |       | 
      |          |
|*  3 |    HASH JOIN         |          |      1 |  10000 |  10000 |00:00:00.02 |     752 |    11M| 
 2085K|   12M (0)|
|   4 |     TABLE ACCESS FULL| BOOK_EXT |      1 |  10000 |  10000 |00:00:00.01 |     376 |       | 
      |          |
|   5 |     TABLE ACCESS FULL| BOOK2    |      1 |  10000 |  10000 |00:00:00.01 |     376 |       | 
      |          |
----------------------------------------------------------------------------------------------------
------------------


Do you think try to skip the redundant update will not help much compare to the code complexity to 
handle it?


Followup   March 5, 2009 - 6pm Central time zone:

...
We update 30~50 millions rows a day, about 70~120GB redo, not much. our Linux
PC server is not powerful, HP DL380, 2CPU, 8GB memory, we will upgrade it to
32GB soon.
.....


A person can only comment on information provided. I asked "what are your numbers".

You gave me a for example with 23mb.

From that, you will get a comment "23mb, small"

You did not tell me what you were really experiencing.



... 2. it takes 700% more memory to do the in memory hash join. ...

well, sure it does - but I don't see how you can avoid that - even if you could selectively "not update column X on a row by row basis" - you would *STILL* need column X.

and if it puts a memory strain on you, we can page to disk.





Your choices are:

a) make multiple passes - update the columns most expected to change, and then update the other columns that are not expected to change.

ops$tkyte%ORA11GR1> create table t1 ( x int, freq_change int, infreq_change int );

Table created.

ops$tkyte%ORA11GR1> create table t2 ( x int, freq_change int, infreq_change int );

Table created.

ops$tkyte%ORA11GR1>
ops$tkyte%ORA11GR1> insert into t1 select rownum, user_id, 42 from all_users;

49 rows created.

ops$tkyte%ORA11GR1> insert into t2 select rownum, case when mod(rownum,2) = 0 then user_id/2 else 
user_id end, case when mod(rownum,10) = 0 then 55 else 42 end from all_users;

49 rows created.

ops$tkyte%ORA11GR1>
ops$tkyte%ORA11GR1>
ops$tkyte%ORA11GR1> merge into t1
  2  using (select x, freq_change from t2 minus select x, freq_change from t1 ) t2
  3  on (t1.x = t2.x)
  4  when matched then update set t1.freq_change = t2.freq_change
  5  /

24 rows merged.

ops$tkyte%ORA11GR1>
ops$tkyte%ORA11GR1> merge into t1
  2  using (select x, infreq_change from t2 minus select x, infreq_change from t1 ) t2
  3  on (t1.x = t2.x)
  4  when matched then update set t1.infreq_change = t2.infreq_change
  5  /

4 rows merged.



b) do it "slow by slow", procedurally, using an update that is "perfect" for each update (hard on the shared pool, but in reality you'll probably only generate 10-20 unique update statements over time)

c) do it in bulk as you are




Question for you: is every row in the target table represented in this batch file? I mean, could we just replace it?

4 stars Skip redundant update   March 5, 2009 - 6pm Central time zone
Reviewer: Charlie MuJiang from Victoria, BC Canada
Thanks Tom for keep looking into this question.

You are reading my mind and our new system design.

Here is my new design,

Our book table have 50 columns, the table is partitioned by Vendor_ID (Book seller ID),
Beside index on primary key, index the hash_code column to handle the small chunk of source data. 
Both are local partition index.
hash_code has something like the bloom filter, False negatives are not possible, False positives 
are possible.

We have 5 types of flat file to be uploaded into inventory database, each vendor upload different 
file:

INSERT.
DELETE.
UPDATE (few, from UI, through async messaging system, dump to interface flat file, the read as 
external table).
MERGE/Unknown(insert or update).
PURGE = the Replacement, replace whole inventory for the seller, it will take advantage of exchange 
partition here.

As for UPDATE and skip unchanged columns, in fact, it is the 2nd step of MERGE(not SQL MERGE).

1) Before generate staging data for update, we need to match source rows and target rows to find 
out which column(s) need to be updated.
E.g. 10000 rows in a file, 8000 rows update column qty, 2000 rows update column author and ISBN, 
all the rest columns data are not changed.

2) And then group same update pattern rows together to do bulk update to reduce the PL/SQL -> SQL 
engine context switch.

It is like your suggestion a) plus b), but not slow by slow. (now all our IT and PM department knew 
"slow by slow". :D )
Yes, the popular update columns combination patterns should be less than 20, and to reduce the 
overhead on Library Cache.
For less popular update columns combination patterns, we simply update all columns.

We will have a counting log table to help to decide which pattern is popular. at same time, store 
the UPDATE/MERGE sql text in the table, do not need to generate the dynamic SQL every time.

Anyway, this makes the code complex, I wonder if it is worth doing...


Followup   March 5, 2009 - 8pm Central time zone:

"I wonder if it is worth doing..."

if your number 1 goal is "reduce to a bare minimum the redo generated", then the answer is probably "yes"
5 stars Suggested approach based on Tom's table compare routine   March 6, 2009 - 10am Central time zone
Reviewer: Stew Ashton from Paris, France
create table t_source(key1 constraint t_source_pk primary key, val1, val2, val3)
as select object_id, object_name, owner, status from all_objects where rownum <= 100
/
-- create table succeeded.

create table t_target as select * from t_source
/
-- create table succeeded.

delete from t_source where key1 = (select min(key1) from t_source)
/
-- 1 rows deleted

update t_source
set val1 = case when mod(rownum,2) = 1 then '1' else val1 end
,val2 = case when mod(floor((rownum)/2),2) = 1 then '2' else val2 end
,val3 = case when mod(floor((rownum)/4),2) = 1 then '3' else val3 end
where rownum <= 7
/
-- 7 rows updated

insert into t_source values((select max(key1)+1 from t_source), '11', '22', '33')
/
-- 1 rows inserted

create global temporary table t_temp as
select t_target.*, 0 old_cnt, 0 new_cnt, 0 mod_type, 0 column_update_flag from T_TARGET where 0=1
/
-- create global succeeded.

insert into t_temp 
select a.*
, sum(new_cnt - old_cnt) over(partition by KEY1) mod_type
, case when new_cnt <> 1 or count(*) over(partition by KEY1) = 1 then 0 else (
  case when lag(val1) over(partition by KEY1 order by new_cnt) = val1 then 0 else 1 end
  + case when lag(val2) over(partition by KEY1 order by new_cnt) = val2 then 0 else 2 end
  + case when lag(val3) over(partition by KEY1 order by new_cnt) = val3 then 0 else 4 end
  ) end column_update_flag
from (
  select KEY1, VAL1, VAL2, VAL3, sum(old_cnt) old_cnt, sum(new_cnt) new_cnt from (
    select o.*, 1 old_cnt, 0 new_cnt from T_TARGET o
    union all
    select n.*, 0 old_cnt, 1 new_cnt from T_SOURCE n
  ) group by KEY1, VAL1, VAL2, VAL3 having sum(old_cnt) <> sum(new_cnt)
) a
/
-- 16 rows inserted

-- >> for mod-type, -1 means delete, 1 means insert, 0 means update
-- >> column_update_flag indicates the appropriate update statement to use depending on column mix
-- >> unchanged records have been eliminated

select * from t_temp
/
-- KEY1  VAL1     VAL2   VAL3    OLD_CNT  NEW_CNT  MOD_TYPE  COLUMN_UPDATE_FLAG 
-- ----- -------- ------ ------- -------- -------- --------- -------------------
-- 2     C_OBJ#   SYS    VALID   1        0        -1        0                  
-- 3     I_OBJ#   SYS    VALID   1        0        0         0                  
-- 3     1        SYS    3       0        1        0         5                  
-- 6     C_TS#    SYS    VALID   1        0        0         0                  
-- 6     C_TS#    SYS    3       0        1        0         4                  
-- 11    I_USER#  SYS    VALID   1        0        0         0                  
-- 11    1        2      VALID   0        1        0         3                  
-- 19    IND$     SYS    VALID   1        0        0         0                  
-- 19    1        2      3       0        1        0         7                  
-- 28    CON$     SYS    VALID   1        0        0         0                  
-- 28    1        SYS    VALID   0        1        0         1                  
-- 46    I_COL2   SYS    VALID   1        0        0         0                  
-- 46    I_COL2   2      VALID   0        1        0         2                  
-- 49    I_CON2   SYS    VALID   1        0        0         0                  
-- 49    I_CON2   2      3       0        1        0         6                  
-- 102   11       22     33      0        1        1         0                  
-- 
-- 16 rows selected

-- >> inserts new records, deletes old records, updates records when all columns have changed

merge into t_target t
using t_temp s
on (s.key1 = t.key1)
when matched then update set val1 = s.val1, val2 = s.val2, val3 = s.val3
  where mod_type = -1 or column_update_flag = 7
delete where mod_type = -1
when not matched then insert (KEY1, VAL1, VAL2, VAL3) values (s.KEY1, s.VAL1, s.VAL2, s.VAL3)
/
-- 3 rows merged

-- >> updates records with val1 changed

merge into t_target t
using (select key1, val1 from t_temp where column_update_flag = 1) s
on (s.key1 = t.key1)
when matched then update set val1 = s.val1
/
-- 1 rows merged

-- >> updates records with val1 and val2 changed (you can see where this is going)

merge into t_target t
using (select key1, val1, val2 from t_temp where column_update_flag = 3) s
on (s.key1 = t.key1)
when matched then update set val1 = s.val1, val2 = s.val2
/
-- 1 rows merged


2 stars To Stew Ashton suggested approach   March 6, 2009 - 3pm Central time zone
Reviewer: Charlie MuJiang from Victoria, BC Canada
Hi Stew Ashton,

You approach is for replacement. I'd rather load source data into an interim table and then 
exchange the partition.

If target table has 100,000 rows, I only want to upload(MERGE, update/insert) part of the change 
data, such as 3000 rows, no delete; average row length is 1000 bytes.

Can you do a benchmark for this case?
to prove that your approach it more efficient(use less resources/latches) than simple update/merge 
all columns?

All the interim processes are not inexpensive, not to say we have 50+ columns in book table to 
GROUP BY, such as

==
insert into t_temp 
select a.*
from (
...
union all
...
  ) group by KEY1, VAL1, VAL2, VAL3 having sum(old_cnt) <> sum(new_cnt)
) a
==

Thanks,
Charlie


3 stars Skip redundant update   March 6, 2009 - 4pm Central time zone
Reviewer: Charlie MuJiang from Victoria, BC Canada
Hi Tom,

My plan A is MERGE/update all columns, and make plan B that develop complex PL/SQL code to update 
and skip the redundant column as a backup plan.

I'll start to build prototype for both. And then balance the performance and complexity (easy to 
maintain) after experiment several representative production data set.

Meanwhile, keep looking for other simple and efficient approaches.

Thanks,
Charlie


2 stars @Charlie re:suggested approach   March 7, 2009 - 2am Central time zone
Reviewer: Stew Ashton from Paris, France

Charlie, you are better equipped to test alternatives than I, because you know the data and you have the data.

I am sure my "suggested approach" is *worse* than others in every respect except one :
it achieves your stated goal (update only changed columns) in a set-based manner.

Personally, when I want to reduce redo and undo I avoid updates to unchanged ROWS, just as Sokrates said above. I never worry about unchanged COLUMNS. I am glad you have chosen another Plan A.

I posted this approach because it was so easy to tweak Tom's table compare method in order to pinpoint changed columns along with changed rows. That bit might conceivably be of interest to someone.

2 stars Correction to column compare code   March 9, 2009 - 8am Central time zone
Reviewer: Stew Ashton from Paris, France

There is a bug in my big insert statement about three screens up:
case when lag(val1) over(partition by KEY1 order by new_cnt) = val1 then 0 else 1 end
will evaluate to 1 when the old and new values are both NULL.One solution would be NVL:
case when nvl(lag(val1) over(partition by KEY1 order by new_cnt),chr(0))
= nvl(val1,chr(0)) then 0 else 1 end
Sorry about that :(


Write a Review
 


All information and materials provided here are provided "as-is"; Oracle disclaims all express and implied warranties, including, the implied warranties of merchantability or fitness for a particular use. Oracle shall not be liable for any damages, including, direct, indirect, incidental, special or consequential damages for loss of profits, revenue, data or data use, incurred by you or any third party in connection with the use of this information or these materials.

About Oracle | Legal Notices and Terms of Use | Privacy Statement