Skip to Main Content

Breadcrumb

Question and Answer

Connor McDonald

Thanks for the question, John.

Asked: April 12, 2018 - 3:25 pm UTC

Last updated: April 17, 2018 - 1:52 am UTC

Version: Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production

Viewed 1000+ times

You Asked

My stored PROCEDURE searches for potential duplicate PeopleSoft Vouchers (PS_VOUCHER, PS_VOUCHER_LINE, PS_DISTRIB_LINE, PS_VENDOR, PS_PYMNT_VCHR_XREF).
The first step is to get the set of unique concatenated/delimited values for records that are currently being processed, but not yet fully processed (I am only considering a limited set of the total records in the tables).
Then I parse out the individual values w/ the delimiter.
Then my FOR/LOOP evaluates the values against ALL records (NOT just records-in-process). I have another FOR/LOOP, within the main FOR/LOOP, that checks to determine if there 2 or more (within the entire tables).

PROBLEM: the evaluation to determine if there is 2 or more is a “resource hog”.

QUESTION #1: If I explicitly OPEN/CLOSE the cursor (as opposed to a FOR/LOOP), will that reduce run-time because the cursor would be in cache?
QUESTION #2: If I change to a PACKAGE, would that reduce run-time, because it would be cache during run-time?

and Connor said...

The most benefit you'll probably see if to refactor the code into bulk processing SQL's.

For example, consider this:

SQL> create table t1 as select substr(owner,1,1) x, substr(object_name,1,2) y, d.* from dba_objects d where rownum < 1000;

Table created.

SQL> create table t2 as select substr(object_name,1,2) x, substr(owner,1,1) y, d.* from dba_objects d;

Table created.


and I want to see occurrence of when x||y from T1 clashes with x||y from T2.

I could write PL/SQL like this:

SQL> set timing on
SQL> set serverout on
SQL> declare
  2    c int;
  3  begin
  4  for i in ( select * from t1 ) loop
  5    select count(*)
  6    into   c
  7    from t2
  8    where  x||y = i.x||i.y;
  9
 10    if c > 0 then
 11      dbms_output.put_line('Clash with x/y for '||i.x||i.y);
 12    end if;
 13  end loop;
 14  end;
 15  /

...
...
Clash with x/y for SAP
Clash with x/y for SAP
Clash with x/y for SAP
Clash with x/y for SAP

PL/SQL procedure successfully completed.

Elapsed: 00:00:07.81


which is a row by row examination.

Or I could put the logic into a single SQL...

SQL> begin
  2  for i in (
  3    select t1.x, t1.y
  4    from   t1
  5    where  x||y in ( select x||y from t2 )
  6    )
  7  loop
  8    dbms_output.put_line('Clash with x/y for '||i.x||i.y);
  9  end loop;
 10  end;
 11  /
...
...
Clash with x/y for SAP
Clash with x/y for SAP
Clash with x/y for SAP
Clash with x/y for SAP

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.12


which is hundreds of times faster.

For your other questions:

Q1: No
Q2: No

Rating

  (4 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Another example

Stew Ashton, April 13, 2018 - 9:00 am UTC

The processed and in-process rows appear to be in the same table (or tables?). In that case you can use aggregate functions (if you don't need all the individual rows) or analytic functions (if you do).
-- Test table
create table t(val1, val2, processed_flag) as
select mod(level, 17), 
  mod(level, 19),
  case when mod(level, 64) = 1 then 'N' else 'Y' end
from dual connect by level <= 500;

-- Aggregate

select val1 || '/' || val2 concat,
  count(*) total_cnt,
  count(nullif(processed_flag, 'Y')) not_processed_cnt
from t
group by val1 || '/' || val2
having count(*) > 1
and count(nullif(processed_flag, 'Y')) > 0;

CONCAT   TOTAL_CNT NOT_PROCESSED_CNT
------- ---------- -----------------
10/15            2                 1
1/1              2                 1
11/5             2                 1
14/8             2                 1
7/12             2                 1

--Analytic

select * from (
  select t.*,
    count(*) over(partition by val1 || '/' || val2) total_cnt,
    count(nullif(processed_flag, 'Y'))
      over(partition by val1 || '/' || val2) not_processed_cnt
  from t
)
where total_cnt > 1 and not_processed_cnt > 0;

      VAL1       VAL2 P  TOTAL_CNT NOT_PROCESSED_CNT
---------- ---------- - ---------- -----------------
         1          1 N          2                 1
         1          1 Y          2                 1
         7         12 N          2                 1
         7         12 Y          2                 1
        10         15 N          2                 1
        10         15 Y          2                 1
        11          5 N          2                 1
        11          5 Y          2                 1
        14          8 N          2                 1
        14          8 Y          2                 1

Best regards, Stew Ashton

Can I ask a follow up question?

John Heeb, April 13, 2018 - 4:00 pm UTC

Connor,
You ROCK, thanks. I work in a PeopleSoft production instance, and an auditor would frown on me creating new tables. Are there any alternatives?

This is how I get the records to evaluate.
CURSOR get_unique_combination_recs IS -- get the unique set of records that must be evaluated
select distinct(V_LN.MERCHANDISE_AMT ||' PGH-STLRS '||DST.DEPTID||' PGH-STLRS '||DST.PERFORM_START_DT||' PGH-STLRS '||VCHR.VENDOR_ID) comb_data
from PS_VOUCHER VCHR, PS_VOUCHER_LINE V_LN, PS_DISTRIB_LINE DST, PS_VENDOR VND, PS_PYMNT_VCHR_XREF REF
where VCHR.BUSINESS_UNIT = V_LN.BUSINESS_UNIT
and V_LN.BUSINESS_UNIT = DST.BUSINESS_UNIT
and VCHR.VOUCHER_ID = V_LN.VOUCHER_ID
and VCHR.VOUCHER_ID = REF.VOUCHER_ID
and V_LN.VOUCHER_ID = DST.VOUCHER_ID
and V_LN.VOUCHER_LINE_NUM = DST.VOUCHER_LINE_NUM
and VCHR.VENDOR_ID = VND.VENDOR_ID
and REF.PYMNT_SELCT_STATUS = 'N' -- NOTICE only the unpaid records are evaluated
and (REF.PYMNT_HOLD = 'N' AND REF.VNDR_PYMNT_HLD_FLG = 'N')
and (VCHR.ORIGIN in ('EVC', 'ONL') AND V_LN.PO_ID = ' ')
and VCHR.POST_STATUS_AP = 'P'
and VCHR.CLOSE_STATUS = 'O'
and VCHR.VOUCHER_STYLE = 'REG'
and DST.DISTRIB_LINE_NUM = 1;

Then I parse out the values.

This is how I check for more than 1.
CURSOR get_count_ptnl_dup_recs (vendor_v VARCHAR2, org_v VARCHAR2, amount_v NUMBER, serv_begin_dt_v DATE) IS -- get the count of potential duplicates
select count(*)
from PS_VOUCHER VCHR, PS_VOUCHER_LINE V_LN, PS_DISTRIB_LINE DST, PS_VENDOR VND, PS_PYMNT_VCHR_XREF REF
where VCHR.BUSINESS_UNIT = V_LN.BUSINESS_UNIT
and V_LN.BUSINESS_UNIT = DST.BUSINESS_UNIT
and VCHR.VOUCHER_ID = V_LN.VOUCHER_ID
and VCHR.VOUCHER_ID = REF.VOUCHER_ID
and V_LN.VOUCHER_ID = DST.VOUCHER_ID
and V_LN.VOUCHER_LINE_NUM = DST.VOUCHER_LINE_NUM
and VCHR.VENDOR_ID = VND.VENDOR_ID
and REF.PYMNT_SELCT_STATUS != 'X' -- NOTICE, when checking for dups, even check the records that have been paid (but NOT cancelled)
and REF.PYMNT_TYPE != 'W'
and (REF.PYMNT_HOLD = 'N' AND REF.VNDR_PYMNT_HLD_FLG = 'N')
and (VCHR.ORIGIN in ('EVC', 'ONL') AND V_LN.PO_ID = ' ')
and VCHR.POST_STATUS_AP = 'P'
and VCHR.CLOSE_STATUS = 'O'
and VCHR.VOUCHER_STYLE = 'REG'
and DST.DISTRIB_LINE_NUM = 1
and VCHR.VENDOR_ID = vendor_v
and DST.DEPTID = org_v
and V_LN.MERCHANDISE_AMT = amount_v
and DST.PERFORM_START_DT = serv_begin_dt_v;
Connor McDonald
April 14, 2018 - 3:51 am UTC

select VCHR.VENDOR_ID, DST.DEPTID, V_LN.MERCHANDISE_AMT, DST.PERFORM_START_DT, count(*) 
from PS_VOUCHER VCHR, PS_VOUCHER_LINE V_LN, PS_DISTRIB_LINE DST, PS_VENDOR VND, PS_PYMNT_VCHR_XREF REF 
where VCHR.BUSINESS_UNIT = V_LN.BUSINESS_UNIT 
and V_LN.BUSINESS_UNIT = DST.BUSINESS_UNIT 
and VCHR.VOUCHER_ID = V_LN.VOUCHER_ID 
and VCHR.VOUCHER_ID = REF.VOUCHER_ID 
and V_LN.VOUCHER_ID = DST.VOUCHER_ID 
and V_LN.VOUCHER_LINE_NUM = DST.VOUCHER_LINE_NUM 
and VCHR.VENDOR_ID = VND.VENDOR_ID 
and REF.PYMNT_SELCT_STATUS != 'X' -- NOTICE, when checking for dups, even check the records that have been paid (but NOT cancelled) 
and REF.PYMNT_TYPE != 'W' 
and (REF.PYMNT_HOLD = 'N' AND REF.VNDR_PYMNT_HLD_FLG = 'N') 
and (VCHR.ORIGIN in ('EVC', 'ONL') AND V_LN.PO_ID = ' ') 
and VCHR.POST_STATUS_AP = 'P' 
and VCHR.CLOSE_STATUS = 'O' 
and VCHR.VOUCHER_STYLE = 'REG' 
and DST.DISTRIB_LINE_NUM = 1 
where (VCHR.VENDOR_ID,DST.DEPTID,V_LN.MERCHANDISE_AMT,DST.PERFORM_START_DT) in 
(
  select distinct VCHR.VENDOR_ID, DST.DEPTID, V_LN.MERCHANDISE_AMT , DST.PERFORM_START_DT
  from PS_VOUCHER VCHR, PS_VOUCHER_LINE V_LN, PS_DISTRIB_LINE DST, PS_VENDOR VND, PS_PYMNT_VCHR_XREF REF 
  where VCHR.BUSINESS_UNIT = V_LN.BUSINESS_UNIT 
  and V_LN.BUSINESS_UNIT = DST.BUSINESS_UNIT 
  and VCHR.VOUCHER_ID = V_LN.VOUCHER_ID 
  and VCHR.VOUCHER_ID = REF.VOUCHER_ID 
  and V_LN.VOUCHER_ID = DST.VOUCHER_ID 
  and V_LN.VOUCHER_LINE_NUM = DST.VOUCHER_LINE_NUM 
  and VCHR.VENDOR_ID = VND.VENDOR_ID 
  and REF.PYMNT_SELCT_STATUS = 'N' -- NOTICE only the unpaid records are evaluated 
  and (REF.PYMNT_HOLD = 'N' AND REF.VNDR_PYMNT_HLD_FLG = 'N') 
  and (VCHR.ORIGIN in ('EVC', 'ONL') AND V_LN.PO_ID = ' ') 
  and VCHR.POST_STATUS_AP = 'P' 
  and VCHR.CLOSE_STATUS = 'O' 
  and VCHR.VOUCHER_STYLE = 'REG' 
  and DST.DISTRIB_LINE_NUM = 1
)
having count(*) > 1


There are more efficient ways potentially, but this should get you started.

Hum?

Gh, April 15, 2018 - 5:27 am UTC

I cant really understand how could suggest a reliable solution without knowing how are tables and indexes defined and without at least an explain plan...etc

Is this table is partitioned on say start date?
What indexes are on?
Anyway stew approach would be the way to go especially if implemented via MV agg. On the 2nd cursor parameters.
But as said requester must provide all ddl.
Connor McDonald
April 16, 2018 - 1:50 am UTC

The key point here is not really *what* SQL is used - there are many alternatives. The key point is moving from a non-set approach (procedural row-at-a-time) to a set-based approach

This is a great resource, I will use it again.

A reader, April 16, 2018 - 12:43 pm UTC

Hello,
Thanks to Connor, Stew, and Gh --- this has been very helpful. Back in the late 90s I worked with Oracle financials, but now I work with PeopleSoft financial applications. I have typically leveraged my PL/SQL skills to load relatively low volume interface tables. I am not a skilled PL/SQL Developer, but I will include more detail when I ask for help in the future.
I am most comfortable with Connor’s “April 14, 2018 - 3:51 am UTC” suggestion. It would have been my preference to code it that way, but I assumed that would be less efficient because it was “nested”.
Stew, I hope all is well for you in Paris. The last time I got help from someone from Paris it was from a gentleman at Gilles Berthoud who helped me purchase a great set of bicycle fenders.
Thanks,
John
Connor McDonald
April 17, 2018 - 1:52 am UTC

thanks for the feedback

More to Explore

Performance

Get all the information about database performance in the Database Performance guide.