PL/SQL - Bulk Collect and Update (Better Approach)


Question and Answer

Chris Saxon

Thanks for the question, ashswet.

Asked: September 22, 2016 - 7:55 pm UTC

Last updated: September 29, 2016 - 12:55 am UTC

Version: 11g Release 2

Viewed 1000+ times

You Asked

Hi Tom,

I am looking for a better coding approach than what I have in my current system. I have two tables: dog_owner (16 million records) and dog_owner_stage (8 million records). In the current process I usually insert from stage to prod based on a common owner_account_id. However, if for some reason there is no owner_account_id on the stage row, I match it against prod using either 1) the email address, or 2) the combination of first name, last name and phone number, and then update the stage owner_account_id (since it is used elsewhere in the process).

This process currently takes close to 40 minutes to finish, but sometimes, for unknown reasons, it takes close to 2 hours. I am trying to optimize this process and would love to hear from you on the best approach.



--Finding the existing dog owners by matching on dog owner email,
--or dog owner first name, last name and phone number

msg(0,'BULK COLLECT START: '||TO_CHAR(SYSDATE, 'DD-MON-YYYY HH:MI:SS AM'));

SELECT nvl(OWNER_ACCOUNT_ID,'0')||'$'||nvl(EMAIL_ADDR,' ')||'$'||
       nvl(OWNER_FIRST_NM,' ')||'$'||nvl(OWNER_LAST_NM,' ')||'$'||nvl(TELEPHONE_1_NBR,' ')
  BULK COLLECT INTO OTAB
  FROM DOG_OWNER A
 WHERE EXISTS (SELECT 'Y'
                 FROM STAGING.DOG_OWNER B
                WHERE B.PROCESS_CD = 'N'
                  AND (   (    A.EMAIL_ADDR = B.EMAIL_ADDR
                           AND INSTR(A.EMAIL_ADDR,'@') > 1)
                       OR (    B.OWNER_FIRST_NM  = A.OWNER_FIRST_NM
                           AND B.OWNER_LAST_NM   = A.OWNER_LAST_NM
                           AND B.TELEPHONE_1_NBR = A.TELEPHONE_1_NBR)));

IF OTAB.COUNT <> 0 THEN

  FORALL i IN OTAB.FIRST..OTAB.LAST
    UPDATE STAGING.DOG_OWNER A
       SET A.OWNER_ACCOUNT_ID = SUBSTR(OTAB(i), 1, INSTR(OTAB(i), '$', 1, 1)-1)
     WHERE (   (    A.EMAIL_ADDR = SUBSTR(OTAB(i), INSTR(OTAB(i), '$', 1, 1)+1,
                                          INSTR(OTAB(i), '$', 1, 2) - (INSTR(OTAB(i), '$', 1, 1)+1))
               AND INSTR(A.EMAIL_ADDR,'@') > 1)
            OR (    A.OWNER_FIRST_NM  = SUBSTR(OTAB(i), INSTR(OTAB(i), '$', 1, 2)+1,
                                               INSTR(OTAB(i), '$', 1, 3) - (INSTR(OTAB(i), '$', 1, 2)+1))
               AND A.OWNER_LAST_NM   = SUBSTR(OTAB(i), INSTR(OTAB(i), '$', 1, 3)+1,
                                              INSTR(OTAB(i), '$', 1, 4) - (INSTR(OTAB(i), '$', 1, 3)+1))
               AND A.TELEPHONE_1_NBR = SUBSTR(OTAB(i), INSTR(OTAB(i), '$', 1, 4)+1)))
       AND A.OWNER_ACCOUNT_ID <> SUBSTR(OTAB(i), 1, INSTR(OTAB(i), '$', 1, 1)-1)
       AND A.PROCESS_CD = 'N'
    RETURNING TO_CHAR(A.OWNER_ACCOUNT_ID)||'$'||SUBSTR(OTAB(i), 1, INSTR(OTAB(i), '$', 1, 1)-1)
         BULK COLLECT INTO ENROLTAB;

END IF;

and Connor said...

The process seems to be:

select rows into collection where [complex criteria]
update table using collection and [complex criteria]

I'm curious why you could not just use a merge statement and avoid having to "shift" all those rows around, ie

merge into STAGING.DOG_OWNER
using
( [your big select statement] )
on
( [join conditions ]

etc
etc



Comments

plsql - Query Optimization

ashswet maganti, September 28, 2016 - 5:36 pm UTC

Hi, I tried using the merge technique; however, it looks like merge has limitations. I keep getting the following error whenever I try it:

ORA-30926: unable to get a stable set of rows in the source tables

I checked for duplicate records but there were none. Also, as I have close to 16 million records in dog_owner and 8 million records in dog_owner_stage, won't the merge take a long time to execute?


MERGE INTO STAGE.DOG_OWNER B
USING (SELECT DISTINCT OWNER_ACCOUNT_ID, OWNER_FIRST_NM, OWNER_LAST_NM, TELEPHONE_1_NBR, EMAIL_ADDR
       FROM   DOG_OWNER) A
ON (B.PROCESS_STATUS_CD = 'B'
    AND (   (A.EMAIL_ADDR = B.EMAIL_ADDR AND INSTR(A.EMAIL_ADDR,'@') > 1)
         OR (    B.OWNER_FIRST_NM  = A.OWNER_FIRST_NM
             AND B.OWNER_LAST_NM   = A.OWNER_LAST_NM
             AND B.TELEPHONE_1_NBR = A.TELEPHONE_1_NBR)))
WHEN MATCHED THEN UPDATE
  SET B.OWNER_ACCOUNT_ID = A.OWNER_ACCOUNT_ID;
Chris Saxon
September 29, 2016 - 12:55 am UTC

It's not about duplicates. It's about whether (based on your queries) the merge may have to update the same row more than once, because that would make the result non-deterministic.

There's a nice blog post on that here:

https://orastory.wordpress.com/2011/10/13/merge-oddity/
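
For what it's worth, a quick way to see which staging rows are matched by more than one source row is something like the following (a sketch only, using the table and column names from your merge):

-- sketch: staging rows with more than one candidate match cause ORA-30926
select b.rowid, count(*)
from   stage.dog_owner b, dog_owner a
where  b.process_status_cd = 'B'
and    (   (a.email_addr = b.email_addr and instr(a.email_addr,'@') > 1)
        or (    b.owner_first_nm  = a.owner_first_nm
            and b.owner_last_nm   = a.owner_last_nm
            and b.telephone_1_nbr = a.telephone_1_nbr))
group by b.rowid
having count(*) > 1;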

MERGE runs fine on large data sizes.

Perhaps try two merges, one for each of the conditions, rather than just one, to try to work around the stable-set-of-rows issue.
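
As a rough sketch of that (table and column names taken from the merge above), splitting the criteria into two statements might look like:

-- sketch only: first merge matches on email address
merge into stage.dog_owner b
using (select distinct owner_account_id, email_addr
       from   dog_owner
       where  instr(email_addr,'@') > 1) a
on (b.process_status_cd = 'B' and b.email_addr = a.email_addr)
when matched then update
  set b.owner_account_id = a.owner_account_id;

-- sketch only: second merge matches on first name, last name and phone number
merge into stage.dog_owner b
using (select distinct owner_account_id, owner_first_nm, owner_last_nm, telephone_1_nbr
       from   dog_owner) a
on (    b.process_status_cd = 'B'
    and b.owner_first_nm  = a.owner_first_nm
    and b.owner_last_nm   = a.owner_last_nm
    and b.telephone_1_nbr = a.telephone_1_nbr)
when matched then update
  set b.owner_account_id = a.owner_account_id;

If either one still raises ORA-30926, that tells you which criterion can match more than one DOG_OWNER row for a single staging row.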