Thanks for the question, Pinaki.

Asked: August 10, 2016 - 3:37 am UTC

Last updated: August 22, 2016 - 4:20 pm UTC

Version: 11g

Viewed 1000+ times

You Asked

Hi Tom ,
I am writing a procedure for field to field comparison of tables.

Create or Replace procedure data_check
AS
type all_tabs is table of table of all_tables.table_name%type;
v_tab_stg all_tabs := all_tabs();
v_tab_pstg all_tabs := all_tabs();
v_qry1 varchar2(500);
type rc is re cursor ;
l_cusror_stg rc ;
l_cursor_pstg rc;
type col_stg is table of all_tab_cols.column_name%type;
type col_pstg is table of all_tab_cols.column_name%type;
v_col_stg col_stg := col_stg();
v_col_pstg col_pstg := col_pstg();

begin

select table_name BULK COLLECT INTO v_tab_stg from all_tables where table_name like 'S?_%' escape '?' order by table_name desc;
select table_name BULK COLLECT INTO v_tab_pstg from all_tables where table_name like 'P?_%' escape '?' order by table_name desc;

for i in 1..v_tab_stg.COUNT LOOP
tab1 := v_tab_stg(i);
tab2 := v_tab_pstg(i);
v_qry1 := 'select * from '||tab1;
open l_cursor_stg for v_qry1 ;
v_qry1 := 'select * form '||tab2;
open l_cursor_pstg for v_qry1 ;
select column_name bulk collect into v_col_stg from all_tab_cols where table_name=tab1 order by column_name desc ;
select column_name bulk collect into v_col_pstg from all_tab_cols where table_name=tab2 order by column_name desc ;
for i in 1..5 LOOP /*comparing only for first 5 rows as if now */
fetch l_cursor_stg into v_tab1;
fetch l_cursor_pstg into v_tab2;
for j in v_col_stg.COUNT loop
if ( v_tab1.v_col_stg(j) <> v_tab2.v_col_pstg(j))
/* I will print or log some error here */
end if;
end loop;
end loop;

close l_cursor_stg ;
close l_cursor_pstg ;
end loop;

end;

Here can we declare these record type variables v_tab1 and v_tab2 dynamically
as the value of tab1 and tab2 will change for each iteration and I cannot use strong ref cursor as well due to the same reason.
Could you please suggest a way how i can declare these variables dynamically with proper row type of proper table ?

Adding more about this question : I want to compare all the tables stating with 's_'
to the equivalent all the 'p_'tables.
for example if I have tables (s_t1 , s_t2 and p_t1 , p_t2) , then i have to do a field level comparison
all the data of s_t1 with p_t1 and s_t2 with p_t2.
The requirement of this procedure is for automation off testing i.e, whether all the data of staging area moved properly to pstaging area.for example if i have value 5 in 5th column of 5th row of table s_t1 , the same should be present in 5th column of 5th row of p_t1 table.I know a minus query would be help but i have log the error as well like which table which columns and what values are causing the difference. Hence i have to take this approach.
I have updated the code to show you how i am thinking to do this.
Thanks for your help in advance. :)

and Chris said...

You can't have a dynamic record type. Even if you could, how would you know what the field names were?

But what you can do is have a dynamic query to find the differences of the form:

select col1, col2, col3, sum(s), sum(p) from (
  select t.*, 1 s, 0 p from s_t t
  union all 
  select t.*, 0 s, 1 p from p_t t
)
group by col1, col2, col3
having sum(s) <> sum(p);

Then unpivot the results to convert the columns to rows! This then gives you a query with a fixed number of columns. E.g. table name, column name and value:

select decode(ssum, 1, 'S_T', 'P_T' ), col, val
from (
select col1, col2, col3, sum(s) ssum, sum(p) psum from (
  select t.*, 1 s, 0 p from s_T t
  union all 
  select t.*, 0 s, 1 p from p_T t
)
group by col1, col2, col3
having sum(s) <> sum(p)
)
unpivot (val for col in ( col1, col2, col3))

You can dynamically generate the column list with listagg():

select listagg(column_name,',') within group (order by column_id) 
from   user_tab_cols
where  table_name = :tab;

So put it together and you can do something like:

create table s_t (
  x int, y int, z int
);

create table p_t (
  x int, y int, z int
);

insert into s_t values (1, 1, 1);
insert into s_t values (1, 2, 3);
insert into p_t values (1, 1, 1);
insert into p_t values (3, 2, 1);

declare
  sql_stmt varchar2(4000);
  tab      varchar2(30) := 'T';
  cols     varchar2(1000);
  
  differences_cur sys_refcursor;
  
  tab_name  varchar2(30);
  col_name  varchar2(30);
  col_value varchar2(4000);
begin
   
  select listagg(column_name,',') within group (order by column_id) 
  into   cols
  from   user_tab_cols
  where  table_name = 'S_' || tab;
  
  sql_stmt := 'select decode(ssum, 1, ''S_' || tab || ''', ''P_' || tab || ''' ), col, val
from (
select ' || cols || ', sum(s) ssum, sum(p) psum from (
  select t.*, 1 s, 0 p from s_' || tab || ' t
  union all 
  select t.*, 0 s, 1 p from p_' || tab || ' t
)
group by ' || cols || '
having sum(s) <> sum(p)
)
unpivot (val for col in (x, y, z))';

  open differences_cur for sql_stmt;
  loop
    fetch differences_cur into tab_name, col_name, col_value;
    exit when differences_cur%notfound;
    dbms_output.put_line (
      'Table ' || tab_name || ' is different for ' ||
      col_name || ' with ' || col_value
    );
  end loop;
  close differences_cur;
end;
/

Table P_T is different for X with 3
Table P_T is different for Y with 2
Table P_T is different for Z with 1
Table S_T is different for X with 1
Table S_T is different for Y with 2
Table S_T is different for Z with 3

If you want this to only show the fields that are different, you could modify your query to do something like:

- lag the values partitioned by column
- Only return those where the val and the lag value are different

e.g.:

select * from (
 select decode(ssum, 1, 'S_T', 'P_T' ), col, val,
         lag(val) over (partition by col order by ssum) lgval
  from (
  select X,Y,Z, sum(s) ssum, sum(p) psum from (
    select t.*, 1 s, 0 p from s_T t
    union all 
    select t.*, 0 s, 1 p from p_T t
  )
  group by X,Y,Z
  having sum(s) <> sum(p)
  )
  unpivot (val for col in (x, y, z))
)
where lgval <> val;

DECODE(SSUM,1,'S_T','P_T')  COL  VAL  LGVAL  
S_T                         X    1    3      
S_T                         Z    3    1

Rating

(3 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Pinaki Das, August 21, 2016 - 6:47 pm UTC

Thanks Chris or the awesome solution you provided as always.Now I am getting the table name , column name , the actual value and the changed value. Is there a way that i can also find out the row number where the mismatch happened ?

Many Thanks

August 22, 2016 - 10:36 am UTC

What do you mean by "row number"? You can add rowids to the query which you can use later to find the relevant rows:

select * from (
 select decode(ssum, 1, 'S_T', 'P_T' ), col, val, rid, 
         lag(val) over (partition by col order by ssum) lgval
  from (
  select X,Y,Z, min(rid) rid, sum(s) ssum, sum(p) psum from (
    select t.rowid rid, t.*, 1 s, 0 p from s_T t
    union all 
    select t.rowid rid, t.*, 0 s, 1 p from p_T t
  )
  group by X,Y,Z
  having sum(s) <> sum(p)
  )
  unpivot (val for col in (x, y, z))
)
where lgval <> val;

DECODE(SSUM,1,'S_T','P_T')  COL  VAL  RID                 LGVAL  
S_T                         X    1    AAASiiAABAAAPz3AAB  3      
S_T                         Z    3    AAASiiAABAAAPz3AAB  1

select * from s_t
where  rowid = 'AAASiiAABAAAPz3AAB';

X  Y  Z  
1  2  3

A reader, August 22, 2016 - 12:48 pm UTC

what's a benefit to write a query using "type"?
Is it really impact the performance?

August 22, 2016 - 1:50 pm UTC

what's a benefit to write a query using "type"?

I'm not sure what you mean. Could you clarify please?

Back to "field to field comparison of tables"

Stew Ashton, August 22, 2016 - 2:24 pm UTC

I don't see how one can compare fields "of tables".

Chris demonstrates the "group by" method of comparing rows in two tables. The result contains rows that are in each table but not in the other.

To compare "fields", you need to know which two rows to compare. That means having a primary or unique key to identify rows for the comparison.

If the tables have a key, then there are 3 possibilities:

1) Each table has a row with that key and all the "fields" are the same. The "group by" method will eliminate those rows.

2) Each table has a row with that key and some "fields" are different. Comparison is possible here.

3) A row with that key is present in one table but not the other. In that case, there are no "fields" to compare.

Pinaki, you mention the "5th row". What do you mean by that? Your queries have no ORDER BY in them so similar rows could be returned in different orders.

Imagine S_T1 and P_T1 are exactly the same except that S_T1 has one extra row. If that row comes first in your query, then all the field comparisons will be wrong.

Are there primary / unique keys or not?

August 22, 2016 - 4:20 pm UTC

Good points Stew, thanks.

More to Explore

Analytics

Analytic SQL got you confused? Check out Connor McDonald's complete video course.

Breadcrumb

Question and Answer

You Asked

and Chris said...

Rating

Comments

Back to "field to field comparison of tables"

More to Explore

Analytics

Can I declare the record variables dynamically

Breadcrumb

Question and Answer

You Asked

and Chris said...

Rating

Comments

Back to "field to field comparison of tables"

More to Explore

Analytics