Hi,
I was trying to optimize one of the batch update queries (given below) in my application
UPDATE schema1.TEST_RECORDS PARTITION (PARTDEF) gr SET gr.purge_status_cd = 'PURGE_PENDING', gr.purge_date = SYSDATE WHERE EXISTS ( SELECT 1 FROM SCHEMA2.TEST_ACTIONS PARTITION (PART1) ca WHERE gr.emcp = ca.action_id )
The problems with the current approaches are
1. Taking more than 5 hours to complete as it is updating several millions of records.
2. It is a single transaction
I went through several tutorials and finally decided to go with Parallel PL/SQL as stated in this link
http://www.orafaq.com/node/2450 I tested both approaches with 1M records and I do not see a big difference with respect to performance
CREATE OR REPLACE TYPE array_t AS TABLE OF NUMBER(38);
/
create or replace FUNCTION PARALLEL_PURGE( IN_CURSOR IN SYS_REFCURSOR)
RETURN array_t
PARALLEL_ENABLE (PARTITION IN_CURSOR BY ANY)
PIPELINED
IS
PRAGMA AUTONOMOUS_TRANSACTION;
TYPE NUM_TYPE IS TABLE OF NUMBER(38);
ACTION_ID NUM_TYPE;
cnt INTEGER := 0;
BEGIN
LOOP
FETCH IN_CURSOR BULK COLLECT INTO ACTION_ID LIMIT 50000;
EXIT WHEN ACTION_ID.COUNT() = 0;
FORALL i IN ACTION_ID.FIRST .. ACTION_ID.LAST
UPDATE schema1.TEST_RECORDS PARTITION (PARTDEF) SET purge_status_cd = 'PURGE_PENDING' , purge_date = SYSDATE WHERE emcp = ACTION_ID(i);
commit;
cnt := cnt + ACTION_ID.COUNT;
END LOOP;
CLOSE IN_CURSOR;
COMMIT;
PIPE ROW(cnt);
RETURN;
EXCEPTION
WHEN OTHERS THEN
ROLLBACK;
CLOSE IN_CURSOR;
PIPE ROW(cnt);
RAISE;
END;
/
I call my new function like this
SELECT sum(column_value) FROM TABLE(PARALLEL_PURGE(CURSOR( SELECT ca.action_id FROM SCHEMA2.TEST_ACTIONS partition(part1) ca INNER JOIN schema1.TEST_RECORDS PARTITION(PARTDEF) grxml ON (ca.action_id = grxml.emcp))));
I have few queries as stated below
1. Do you suggest any other approach which will give me a better performance.
2. Currently I am using PARTITION IN_CURSOR BY ANY. I have seen that we have an option to use RANGE/HASH in place of ANY. Since I am passing a Weak REF CURSOR, I am getting a compilation problem
3. Will i get any performance improvement if is use HASH/RANGE? What is the recomended approach to pass a strong ref cursor
Appreciate your help
Yes. Parallel enabling functions is designed so that can be used in a parallel query, not really for doing parallel operations *within* the function.
What you probably want here is DBMS_PARALLE_EXECUTE. The basic premise is
- you have a workload that can carved up in a logical way into smaller chunks
- define a routine that will work on one of those chunks
- it uses to job scheduler to submit multiple jobs, each working on a chunk, and they run in parallel
In your example, you might carve up the workload by logical chunks of action_id's
Some examples here
https://asktom.oracle.com/pls/apex/asktom.search?tag=how-to-use-dbms-parallel-execute-to-chunk-over-db-link