Skip to Main Content
  • Questions
  • Detect out of order records in a single select statement

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, Brian.

Asked: November 06, 2012 - 7:02 am UTC

Last updated: November 06, 2012 - 8:14 am UTC

Version: 11.2.0

Viewed 1000+ times

You Asked

Hi Tom
I am generating files by exporting from database views. These files are being sent to a third party software company. The application receiving the files has a dependancy on the order in which the files are sent to it. The developers have decided not to perform their own explicit ordering based on business data but rely on the order of the data in the file. Each file has a record_id that shows its record position in the file , followed by some policy data. The policy data has a policy number field. The receiving program breaks when a policy number has a record id that is greater than its last policy record id + 1 (Excluding the first policy in the file , and when a policy number changes as the file is being read)

Example
RECORD_ID   POLICY
1           111111  
2           111111
3           222222
4           111111



In the example above the receiving program breaks when it hits record 4.
I have fixed the original ordering issue and wrote a unit test to ensure we don't send the files out of order. However the code I've used to check this uses 2 cursors to perform the test. Is there a way to perform the check I require in a single select statement ,without the need for the procedural processing, so it brings back only those policy numbers where the record id for a policy are 'out of order' ?
I have 2 variants of the same function, CHECK_FOR_OUT_OF_ORDER_RECORDS & AM_I_FASTER, but they both still require looping through cursors.
Thank you.


In the sample below the function should detect a problem with policy number 1111111



create table TEST_ORDER
(
  RECORD_ID     NUMBER,
  POLICY_NUMBER VARCHAR2(4000)
)
;

insert into TEST_ORDER (RECORD_ID, POLICY_NUMBER)
values (4, '8888888');
insert into TEST_ORDER (RECORD_ID, POLICY_NUMBER)
values (5, '8888888');
insert into TEST_ORDER (RECORD_ID, POLICY_NUMBER)
values (6, '8888888');
insert into TEST_ORDER (RECORD_ID, POLICY_NUMBER)
values (1, '9999999');
insert into TEST_ORDER (RECORD_ID, POLICY_NUMBER)
values (3, '8888888');
insert into TEST_ORDER (RECORD_ID, POLICY_NUMBER)
values (10, '1111111');
insert into TEST_ORDER (RECORD_ID, POLICY_NUMBER)
values (12, '1111111');
commit;

create or replace package PKG_MIGRATION_UTILS is

  FUNCTION CHECK_FOR_OUT_OF_ORDER_RECORDS(p_table_or_view_name IN VARCHAR2) RETURN VARCHAR2;


  FUNCTION UNIT_TEST_ORDER_OF_EXTRACTS RETURN VARCHAR2;
  
  
  FUNCTION AM_I_FASTER(p_table_or_view_name IN VARCHAR2) RETURN VARCHAR2;


end PKG_MIGRATION_UTILS;
/
create or replace package body PKG_MIGRATION_UTILS is


  FUNCTION UNIT_TEST_ORDER_OF_EXTRACTS RETURN VARCHAR2
  IS
  l_success VARCHAR2(4000);

  BEGIN
   RETURN 'Yes All tables ordered successfully';
  EXCEPTION WHEN OTHERS THEN
            RAISE;

  END;

  FUNCTION CHECK_FOR_OUT_OF_ORDER_RECORDS(p_table_or_view_name IN VARCHAR2)   RETURN VARCHAR2
  IS

  TYPE cursor_typ IS REF CURSOR;
  c cursor_typ;
  d cursor_typ;



  l_policy_number VARCHAR2(4000);
  l_out_of_order_policy VARCHAR2(4000);
  l_opt_sql VARCHAR2(4000);
  l_sql VARCHAR2(4000);
  l_sql_2 VARCHAR2(4000);
  l_previous NUMBER;
  l_record_id NUMBER;
  l_out_of_seq_issue VARCHAR2(4000);
  l_skip_table NUMBER;

  BEGIN

  l_opt_sql:= 'SELECT count(*) FROM (SELECT policy_number FROM '|| p_table_or_view_name || ' group by policy_number having count(*) > 1)';

  EXECUTE IMMEDIATE l_opt_sql INTO l_skip_table;

  -- Small optimization step to only check tables that have more than one row per order key
  IF (l_skip_table < 1) THEN
       RETURN 'SUCCESS. Table ' || p_table_or_view_name || ' ordered correctly';
  END IF;
  -- Another optimzation putting a distinct on the policy number
  l_sql_2 := 'SELECT DISTINCT policy_number FROM '|| p_table_or_view_name;
  l_sql:=
  'select policy_number,
       previous,
       record_id,
       decode(record_id - nvl(previous,record_id),1,NULL,0,NULL,''Y'') as out_of_seq_issue
   from
   (
       select policy_number,
       record_id,
       lag(record_id) over (order by record_id) as previous
       from
       (
         select policy_number,record_id from ' || p_table_or_view_name || '
         WHERE policy_number= :1
       ) x
   ) where decode(record_id - nvl(previous,record_id),1,NULL,0,NULL,''Y'') IS NOT NULL';

   OPEN c FOR l_sql_2;
    LOOP
        FETCH c INTO l_policy_number;
        EXIT WHEN c%NOTFOUND;

        OPEN d FOR l_sql USING l_policy_number;
        LOOP
             FETCH d INTO l_out_of_order_policy,l_previous,l_record_id,l_out_of_seq_issue;
             EXIT WHEN d%NOTFOUND;
             raise_application_error(-20001,'FAILURE. Table ' || p_table_or_view_name || ' not ordered correctly. Check policy ' || l_out_of_order_policy);
        END LOOP;
        CLOSE d;

    END LOOP;
   CLOSE c;



   RETURN 'SUCCESS. Table ' || p_table_or_view_name || ' ordered correctly';

  EXCEPTION WHEN OTHERS THEN
     IF (d%ISOPEN) THEN
       CLOSE c;
     END IF;
     IF (c%ISOPEN) THEN
       CLOSE c;
     END IF;
     RAISE;
  END;


  FUNCTION AM_I_FASTER(p_table_or_view_name IN VARCHAR2) RETURN VARCHAR2
  IS
    TYPE cursor_typ IS REF CURSOR;
    l_opt_sql VARCHAR2(4000);
    l_sql VARCHAR2(4000);
    l_skip_table NUMBER;
    l_cursor SYS_REFCURSOR;
    l_policy_number VARCHAR2(4000);
    l_out_of_sequence VARCHAR2(4000);

    c cursor_typ;
  BEGIN
  
    l_opt_sql:= 'SELECT count(*) FROM (SELECT policy_number FROM '|| p_table_or_view_name || ' group by policy_number having count(*) > 1)';
    EXECUTE IMMEDIATE l_opt_sql INTO l_skip_table;

  -- Small optimization step to only check tables that have more than one row per order key
    IF (l_skip_table < 1) THEN
       RETURN 'SUCCESS. Table ' || p_table_or_view_name || ' ordered correctly';
    END IF;
  
    l_sql:= 'select policy_number, cursor(select  decode(record_id - nvl(lag(record_id) over (order by record_id),record_id),1,NULL,0,NULL,''Y'') as out_of_seq_issue
                                          from ' || p_table_or_view_name || ' t2
                                          where t1.policy_number = t2.policy_number
                                          )
             from (select distinct policy_number from  ' ||  p_table_or_view_name || ') t1';
             
             
    OPEN c FOR l_sql;
     LOOP
        FETCH c INTO l_policy_number,l_cursor;
        EXIT WHEN c%NOTFOUND;
        LOOP 
          FETCH l_cursor INTO l_out_of_sequence;
          EXIT WHEN l_cursor%NOTFOUND;
          
          IF (l_out_of_sequence IS NOT NULL) THEN
             raise_application_error(-20001,'FAILURE. Table ' || p_table_or_view_name || ' not ordered correctly. Check policy ' || l_policy_number);
          END IF;
          
        END LOOP;
        CLOSE l_cursor;
    END LOOP;
    CLOSE c;  
    
    RETURN 'SUCCESS. Table ' || p_table_or_view_name || ' ordered correctly';       
             
   EXCEPTION WHEN OTHERS THEN
     IF (c%ISOPEN) THEN
       CLOSE c;
     END IF;
     RAISE;
  END;


end PKG_MIGRATION_UTILS;
/


and Tom said...

it seems that the code boils down to this:

ops$tkyte%ORA11GR2> select count(*)
  2    from dual
  3   where exists ( select policy_number, record_id, prev_record_id
  4                    from (
  5                  select policy_number,
  6                         record_id,
  7                         nvl(lag(record_id) over (partition by policy_number order by record_id),record_id-1) prev_record_id
  8                    from test_order
  9                         )
 10                   where record_id <> prev_record_id+1
 11                 )
 12  /

  COUNT(*)
----------
         1

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> insert into TEST_ORDER (RECORD_ID, POLICY_NUMBER)
  2  values (11, '1111111');

1 row created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> select count(*)
  2    from dual
  3   where exists ( select policy_number, record_id, prev_record_id
  4                    from (
  5                  select policy_number,
  6                         record_id,
  7                         nvl(lag(record_id) over (partition by policy_number order by record_id),record_id-1) prev_record_id
  8                    from test_order
  9                         )
 10                   where record_id <> prev_record_id+1
 11                 )
 12  /

  COUNT(*)
----------
         0

ops$tkyte%ORA11GR2> 


if the query returns 0, all rows are "ok", they are sequential.

if the query returns 1, then there is an error.

ps: I hate this:


  FUNCTION UNIT_TEST_ORDER_OF_EXTRACTS RETURN VARCHAR2
  IS
  l_success VARCHAR2(4000);

  BEGIN
   RETURN 'Yes All tables ordered successfully';
  EXCEPTION WHEN OTHERS THEN
            RAISE;

  END;


why is that exception block in existence? what is its purpose - besides to hide the origin of the real exception? the only thing - repeat - the only thing that does is obscure the line number of the original exception!

actually, I don't get the purpose of that function at all.

You should have used bulk processing in this code - but fortunately, you don't need any procedural code at all.


Your optimization is actually a performance impediment, you have:

 l_opt_sql:= 'SELECT count(*) FROM (SELECT policy_number FROM '|| 
p_table_or_view_name || ' group by policy_number having count(*) > 1)';
    EXECUTE IMMEDIATE l_opt_sql INTO l_skip_table;

  -- Small optimization step to only check tables that have more than one row 
per order key
    IF (l_skip_table < 1) THEN
       RETURN 'SUCCESS. Table ' || p_table_or_view_name || ' ordered 
correctly';
    END IF;



Now, if the table is small, if we are to skip it, then this 'optimization' runs this very fast little query and returns - but all it did was cause us to skip executing a really tiny query!!!

If the table is large, this query takes a considerable amount of resources!!! And then it doesn't skip processing anything!!!!

think about it - that is an anti-optimization!


also you have a bug here:

EXCEPTION WHEN OTHERS THEN
     IF (d%ISOPEN) THEN
       CLOSE c;
     END IF;
     IF (c%ISOPEN) THEN
       CLOSE c;
     END IF;
     RAISE;
  END;



if d is open, then close c....

actually, you need not close them - locally scoped cursors are closed automagically in plsql when they go out of scope (unlike every other programming language)

ops$tkyte%ORA11GR2> show parameter open_cursors

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
open_cursors                         integer     30
ops$tkyte%ORA11GR2> @trace
ops$tkyte%ORA11GR2> alter session set events '10046 trace name context forever, level 12';

Session altered.

ops$tkyte%ORA11GR2> declare
  2      procedure p
  3      is
  4          l_cursor sys_refcursor;
  5          l_rec    all_objects%rowtype;
  6      begin
  7          open l_cursor for 'select * from all_objects';
  8          fetch l_cursor into l_rec;
  9      end;
 10  begin
 11      for i in 1 .. 1000
 12      loop
 13          p;
 14      end loop;
 15  end;
 16  /

PL/SQL procedure successfully completed.


select *
from
 all_objects


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse     1000      0.01       0.01          0          0          0           0
Execute   1000      0.04       0.03          0          0          0           0
Fetch     1000      1.06       1.10      15000      12000          0        1000
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     3000      1.12       1.15      15000      12000          0        1000



if the cursor was not implicitly closed upon going out of scope - this would have raised an ora-1000 max open cursors exceeded.

Rating

  (1 rating)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Brian Hennessy, November 06, 2012 - 9:21 am UTC

Thanks Tom, brilliant. I see your point on the 'optimization'. I have tables that only have 1 row per policy key so I was trying to exclude these from executing my procedural code but as a side effect I'm adversly affecting other calls. On the 'ps I hate this' I removed the code from this I should have removed the function completely before posting, sorry for the confusion.

I generally avoid explicit cursors, and my code below shows that .nice to know an explicit close isn't needed on them in pl/sql.

More to Explore

Analytics

Analytic SQL got you confused? Check out Connor McDonald's complete video course.