Comparing 2 Nested Table Collections Which Have 180 Fields Defined in Each Collection by Passing Field Names Dynamically


Question and Answer

Chris Saxon

Thanks for the question, Pawan.

Asked: August 05, 2020 - 6:11 am UTC

Last updated: August 07, 2020 - 8:15 am UTC

Version: Oracle 12c

Viewed 1000+ times

You Asked

I have 2 tables of the same structure, with around 180 columns each and one column as the PK. Each table has around 200k records. I need to compare the two tables column by column, and if a difference is found in any of the remaining 179 columns for a record, I need to track that column name along with the old value and the new value from the two tables.

This can be achieved with a SQL statement using UNION, GROUP BY and the LEAD function, but because 179 fields have to be compared, the SQL code becomes very long. So I thought of using nested table collections instead: bulk collect both tables into 2 nested tables and iterate over them, with the first loop driven by the collection count and the second loop driven by USER_TAB_COLS to cover each column in the tables. Is it possible to pass the field name to the inner loop dynamically? Below is the sample code for that.

SET SERVEROUTPUT ON;

DECLARE
    TYPE TEST1_TYPE IS TABLE OF TEST1%ROWTYPE;
    TEST1_TAB TEST1_TYPE;
    TEST2_TAB TEST1_TYPE;
    lcCol1 VARCHAR2(3000);
    lcCol2 VARCHAR2(3000);
    lQuery VARCHAR2(3000);

    CURSOR CUR_TAB_COL IS
        SELECT COLUMN_NAME, DATA_TYPE
        FROM   USER_TAB_COLS
        WHERE  TABLE_NAME = 'TEST1'
        ORDER BY COLUMN_ID;

    TYPE COL_TYPE IS TABLE OF CUR_TAB_COL%ROWTYPE;
    COL_TAB COL_TYPE;
BEGIN
    SELECT *
    BULK COLLECT INTO TEST1_TAB
    FROM TEST1
    ORDER BY ID;

    SELECT *
    BULK COLLECT INTO TEST2_TAB
    FROM TEST2
    ORDER BY ID;

    OPEN CUR_TAB_COL;
    FETCH CUR_TAB_COL BULK COLLECT INTO COL_TAB;
    CLOSE CUR_TAB_COL;

    FOR i IN 1 .. TEST2_TAB.COUNT LOOP
        FOR j IN COL_TAB.FIRST .. COL_TAB.LAST LOOP
            -- This is the part in question: a local PL/SQL collection is not
            -- visible inside dynamic SQL, so referencing TEST1_TAB(i).<column>
            -- by a name built at run time does not work as written.
            lQuery := 'SELECT TEST1_TAB(' || i || ').' || COL_TAB(j).COLUMN_NAME || ' FROM DUAL';
            EXECUTE IMMEDIATE lQuery INTO lcCol1;

            lQuery := 'SELECT TEST2_TAB(' || i || ').' || COL_TAB(j).COLUMN_NAME || ' FROM DUAL';
            EXECUTE IMMEDIATE lQuery INTO lcCol2;
        END LOOP;
    END LOOP;
END;
/
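
For reference, the kind of pure-SQL comparison mentioned above looks roughly like the sketch below (using the TEST1/TEST2 names from the code and hypothetical columns C1 and C2); every compared column has to be repeated in the select list and the GROUP BY, which is why the real statement gets so long with 179 columns:

select id, c1, c2, /* ...repeat for every compared column... */
       sum(old_ct) old_ct, sum(new_ct) new_ct
from (
  select t.*, 1 old_ct, 0 new_ct from test1 t
  union all
  select t.*, 0 old_ct, 1 new_ct from test2 t
)
group by id, c1, c2 /* ...repeat for every compared column... */
having sum(old_ct) != sum(new_ct);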

and Chris said...

I would stick to a pure SQL solution. Nested loops through the arrays will get slow as the data volumes grow. Plus you may run out of memory trying to bulk collect the whole table into an array.

This question covers various ways of doing this:

https://asktom.oracle.com/pls/apex/asktom.search?tag=how-to-compare-two-tables-of-data

If you just want to save yourself some typing, Stew Ashton has built a package to generate the SQL for you:

https://stewashton.wordpress.com/2015/01/21/stew_compare_sync-introducing-the-package/

To compare column-by-column, unpivot the tables first:

create table tnew ( pk primary key, c1, c2 ) as 
  select level pk, mod ( level, 2 ) c1, level c2 from dual
  connect by level <= 10;
create table told ( pk primary key, c1, c2 ) as 
  select level pk, mod ( level, 4 ) c1, level c2 from dual
  connect by level <= 10;

select pk, col, val,
       sum(old_ct) old_ct, sum(new_ct) new_ct
from (
  select t.*, 1 old_ct, 0 new_ct  
  from   told
  unpivot (
    val for col in ( c1, c2 ) -- list of columns to compare
  ) t
  union all
  select t.*, 0 old_ct, 1 new_ct  
  from   tnew 
  unpivot (
    val for col in ( c1, c2 )
  ) t
)
group by pk, col, val
having sum(old_ct) != sum(new_ct)
order by 1, new_ct;

PK    COL    VAL    OLD_CT    NEW_CT   
    2 C1          2         1         0 
    2 C1          0         0         1 
    3 C1          3         1         0 
    3 C1          1         0         1 
    6 C1          2         1         0 
    6 C1          0         0         1 
    7 C1          3         1         0 
    7 C1          1         0         1 
   10 C1          2         1         0 
   10 C1          0         0         1


Again, if you want to save yourself some typing/make this dynamic, you can use listagg to generate the column list to put in the unpivot clause for each table:

select listagg ( column_name, ',' )
         within group ( order by column_id ) cols
from   user_tab_cols
where  table_name = 'TNEW'
and    column_name not in ( -- assumes the table has a PK
  select ucc.column_name from user_constraints uc
  join   user_cons_columns ucc
  using  ( constraint_name )
  where  uc.table_name = 'TNEW'
  and    uc.constraint_type = 'P'
);

COLS    
C1,C2   
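
One way to wire the generated list into the comparison query is a SQL*Plus/SQLcl substitution variable; a sketch, assuming all compared columns share a datatype (otherwise add casts to the generated list) and, for brevity, excluding the PK column by name rather than via the constraint lookup:

column cols new_value col_list

select listagg ( column_name, ',' )
         within group ( order by column_id ) cols
from   user_tab_cols
where  table_name = 'TNEW'
and    column_name <> 'PK';

select pk, col, val,
       sum(old_ct) old_ct, sum(new_ct) new_ct
from (
  select t.*, 1 old_ct, 0 new_ct
  from   told
  unpivot ( val for col in ( &col_list ) ) t
  union all
  select t.*, 0 old_ct, 1 new_ct
  from   tnew
  unpivot ( val for col in ( &col_list ) ) t
)
group by pk, col, val
having sum(old_ct) != sum(new_ct)
order by 1, new_ct;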


Comments

Pawan, August 05, 2020 - 2:28 pm UTC

Thanks Chris, as you mentioned I also found that pure SQL is better in execution time. But is it possible to pass a dynamic field name to a nested table? Just wanted to know whether this is possible in Oracle PL/SQL.
Chris Saxon
August 06, 2020 - 10:05 am UTC

I'm not aware of an easy way to do this. If you insist on going down the nested table route, it's likely to be easier to use the multiset operators.

But I repeat - using pure SQL is the far superior solution here.
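
For completeness, a minimal sketch of the multiset operators on collections of scalars (collections of records such as TEST1%ROWTYPE are not directly comparable, so per-column handling would still be needed, which is the original problem):

declare
    type val_list is table of varchar2(100);
    old_vals val_list := val_list('A', 'B', 'C');
    new_vals val_list := val_list('A', 'B', 'D');
    diff     val_list;
begin
    -- elements present in old_vals but not in new_vals
    diff := old_vals multiset except new_vals;
    for i in 1 .. diff.count loop
        dbms_output.put_line('only in old: ' || diff(i));
    end loop;
end;
/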

Not in PL/SQL, but...

Stew Ashton, August 06, 2020 - 4:09 pm UTC

If you want to avoid typing or generating all the column names, you may be able to "unpivot" your rows using XML:
select empno, col, val from xmltable(
  '/ROWSET/ROW'
  passing xmltype(cursor(select * from emp))
  columns
    empno varchar2(128) path 'EMPNO',
    vals xmltype path '.'
), xmltable(
  '/ROW/*[not(name()="EMPNO")]'
  passing vals
  columns
    col varchar2(128) path 'name()',
    val varchar2(128) path '.'
);

EMPNO    COL         VAL                   
7839     ENAME       KING                   
7839     JOB         PRESIDENT              
7839     HIREDATE    1981-11-17 00:00:00    
7839     SAL         5000                   
7839     DEPTNO      10                     
7698     ENAME       BLAKE                  
7698     JOB         MANAGER                
7698     MGR         7839                   
7698     HIREDATE    1981-05-01 00:00:00    
7698     SAL         2850                   
7698     DEPTNO      30                     
...
Notice that you not only get the column names as data, you also get the values as strings, which you would have to manage yourself with UNPIVOT as well. Now you can JOIN on EMPNO and COL and compare the values. Also note that columns with NULL values do not appear in the XML, so you have to do a FULL JOIN to cover the situation where one table has a value and the other doesn't.

Please note that this format may not be the best if the same PK is not present in both tables. For that kind of difference I would use my standard COMP_SYNC package.

Best regards,
Stew
Chris Saxon
August 06, 2020 - 4:14 pm UTC

Neat idea, thanks Stew.

Pawan, August 07, 2020 - 1:39 am UTC

For the query you mentioned using UNPIVOT, can it be changed further to give output like below, with 4 columns?

PK_COL  COL_NAME  OLD_VAL  NEW_VAL
2       c1        2        0

Chris Saxon
August 07, 2020 - 8:14 am UTC

See Stew's example below.

You might also want to check out this:

https://blog.jooq.org/2020/08/05/use-natural-full-join-to-compare-two-tables-in-sql
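
Another option, staying with the UNPIVOT approach from above, is to pivot the old/new rows back together with conditional aggregates; a sketch against the TOLD/TNEW tables (DECODE treats NULL = NULL as equal, so a value that is NULL on only one side still shows up as a difference):

select pk, col,
       max(case when old_ct = 1 then val end) old_val,
       max(case when new_ct = 1 then val end) new_val
from (
  select t.*, 1 old_ct, 0 new_ct
  from   told
  unpivot include nulls ( val for col in ( c1, c2 ) ) t
  union all
  select t.*, 0 old_ct, 1 new_ct
  from   tnew
  unpivot include nulls ( val for col in ( c1, c2 ) ) t
)
group by pk, col
having decode( max(case when old_ct = 1 then val end),
               max(case when new_ct = 1 then val end), 0, 1 ) = 1
order by pk, col;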

Fuller XML demo

Stew Ashton, August 07, 2020 - 6:50 am UTC

It's easier to do the comparison if the columns with NULL values appear in the generated XML. For that, another XML generation function can be used.
create table emp2 as select * from emp where deptno = 10;
update emp2 set comm = trunc(empno/10);
update emp2 set hiredate = hiredate+1 where rownum = 1;

with function xmlgen(p_select in varchar2) return clob is
  ctx dbms_xmlgen.ctxhandle;
begin
  ctx := dbms_xmlgen.newcontext(p_select);
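  -- 2 = dbms_xmlgen.empty_tag: NULL columns appear as empty elements instead of being dropped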
  dbms_xmlgen.SETNULLHANDLING (ctx, 2);
  return dbms_xmlgen.getxml(ctx);
end;
empcols as (
  select empno, col, val from xmltable(
    '/ROWSET/ROW'
    passing xmltype(xmlgen('select * from emp where deptno = 10'), null, 1, 1)
    columns
      empno varchar2(128) path 'EMPNO',
      vals xmltype path '.'
  ), xmltable(
    '/ROW/*[not(name()="EMPNO")]'
    passing vals
    columns
      col varchar2(128) path 'name()',
      val varchar2(128) path '.'
  )
), emp2cols as (
  select empno, col, val from xmltable(
    '/ROWSET/ROW'
    passing xmltype(xmlgen('select * from emp2'), null, 1, 1)
    columns
      empno varchar2(128) path 'EMPNO',
      vals xmltype path '.'
  ), xmltable(
    '/ROW/*[not(name()="EMPNO")]'
    passing vals
    columns
      col varchar2(128) path 'name()',
      val varchar2(128) path '.'
  )
)
select empno, col, e.val eval, e2.val e2val 
from empcols e join emp2cols e2 using(empno, col)
where decode(e.val, e2.val, 0, 1) = 1
order by empno, col
/

EMPNO   COL        EVAL                  E2VAL                 
7782    COMM       -                     778                   
7839    COMM       -                     783                   
7839    HIREDATE   1981-11-17 00:00:00   1981-11-18 00:00:00   
7934    COMM       -                     793 

This will only produce output for EMPNOs that are in both tables. I think this is a good thing. Rows that are missing from one table or the other should be shown one row at a time. If you really want output in those cases, say "full join" instead of just "join".
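
Concretely, that change would look like this in the final SELECT (the rest of the statement stays the same):

select empno, col, e.val eval, e2.val e2val
from empcols e full join emp2cols e2 using (empno, col)
where decode(e.val, e2.val, 0, 1) = 1
order by empno, col
/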

Regards, Stew
Chris Saxon
August 07, 2020 - 8:15 am UTC

Great work Stew, thanks for sharing
