i need the change in each column to be written as a separate row
sachin, December  06, 2001 - 3:15 pm UTC
 
 
thanks for the answer.
but i need to log the change to each column as a separate row in that other table.
for ex:
select '1' a,'2' b,'3' c from dual
minus
 select '1' a,'4' b,'6' c from dual
i get the following result
A B C
- - -
1 2 3
i need that other table to contain the following rows.
column_name old_val new_val
b           2       4
c           3       6 
 
December  07, 2001 - 8:56 am UTC 
 
 
You would need a full outer join in order to do that (since the row could be in T1 and not in T2 or in T2 and not in T1).
Is trans_id the primary key? 
 
 
 
yes trans_id is a primary key
sachin, December  07, 2001 - 9:52 am UTC
 
 
can you give a sample query with that outer join?
thanks 
 
December  07, 2001 - 1:43 pm UTC 
 
 
Say you have the tables setup like this:
tkyte@TKYTE9I.US.ORACLE.COM> create table trans (trans_id int primary key,
  2                      c1       date,
  3                      c2       number,
  4                      c3       varchar2(10)
  5  )
  6  /
Table created.
Elapsed: 00:00:00.00
tkyte@TKYTE9I.US.ORACLE.COM>
tkyte@TKYTE9I.US.ORACLE.COM> create table trans_tmp (trans_id int primary key,
  2                      c1       date,
  3                      c2       number,
  4                      c3       varchar2(10)
  5  )
  6  /
Table created.
Elapsed: 00:00:00.03
tkyte@TKYTE9I.US.ORACLE.COM>
tkyte@TKYTE9I.US.ORACLE.COM> create table changes
  2  ( trans_id  int, cname varchar2(10), val1 varchar2(4000), val2 varchar2(4000) );
Table created.
This would capture the changes:
tkyte@TKYTE9I.US.ORACLE.COM> select *
  2    from (
  3  select t1.trans_id,
  4    decode( r, 1, 'C1', 2, 'C2', 3, 'C3' ) cname,
  5    decode( r, 1, to_char(t1.c1,'dd-mon-yyyy hh24:mi:ss'),
  6               2, to_char(t1.c2),
  7               3, t1.c3) v1,
  8    decode( r, 1, to_char(t2.c1,'dd-mon-yyyy hh24:mi:ss'),
  9               2, to_char(t2.c2),
 10               3, t2.c3) v2
 11  from trans t1,
 12       trans_tmp t2,
 13       (select rownum r from all_objects where rownum <= 3 )
 14  where t1.trans_id = :bv
 15    and t1.trans_id = t2.trans_id (+)
 16    )
 17  where v1 <> v2
 18     or (v1 is null and v2 is not null)
 19     or (v1 is not null and v2 is null)
 20  /
Where 3 in this query represents the number of columns we need to compare.  The decodes convert everything into character strings for comparison...
 
 
 
 
outer join query
sachin, December  07, 2001 - 11:45 am UTC
 
 
tom,
can you post the outer join query that you talked about in your answer please.
trans_id is a primary key. 
 
 
outer join on the other table
sachin, December  07, 2001 - 2:48 pm UTC
 
 
tom,
i think your following query does the outer join only on one table, not on both. am i right?
what about the rows that are in t2 (trans_tmp), but not in t1 (trans)?
 
 
December  07, 2001 - 3:18 pm UTC 
 
 
Ok, try this one on for size then:
select *
  from (
select t1.trans_id,
  decode( r, 1, 'C1', 2, 'C2', 3, 'C3' ) cname,
  decode( r, 1, to_char(t1.c1,'dd-mon-yyyy hh24:mi:ss'),
             2, to_char(t1.c2),
             3, t1.c3) v1,
  decode( r, 1, to_char(t2.c1,'dd-mon-yyyy hh24:mi:ss'),
             2, to_char(t2.c2),
             3, t2.c3) v2
from
     (select rownum rr, trans.* from trans where trans_id = :bv ) t1,
     (select rownum rr, trans_tmp.* from trans_tmp where trans_id = :bv) t2,
     (select rownum r from all_objects where rownum <= 3 ) t3,
     (select rownum rr from dual) t4
where t1.rr(+) = t4.rr
  and t2.rr(+) = t4.rr
  )
where v1 <> v2
   or (v1 is null and v2 is not null)
   or (v1 is not null and v2 is null)
/
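For reference, on releases that support ANSI join syntax (9i and up), the same effect can be had more directly with a FULL OUTER JOIN; a sketch against the same tables (just an equivalent of the query above, not the syntax used in it):
select *
  from (
select nvl(t1.trans_id, t2.trans_id) trans_id,
  decode( r, 1, 'C1', 2, 'C2', 3, 'C3' ) cname,
  decode( r, 1, to_char(t1.c1,'dd-mon-yyyy hh24:mi:ss'),
             2, to_char(t1.c2),
             3, t1.c3) v1,
  decode( r, 1, to_char(t2.c1,'dd-mon-yyyy hh24:mi:ss'),
             2, to_char(t2.c2),
             3, t2.c3) v2
from
     (select * from trans     where trans_id = :bv) t1
     full outer join
     (select * from trans_tmp where trans_id = :bv) t2
       on (t1.trans_id = t2.trans_id)
     cross join
     (select rownum r from all_objects where rownum <= 3 ) t3
  )
where v1 <> v2
   or (v1 is null and v2 is not null)
   or (v1 is not null and v2 is null)
/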
 
 
 
 
What about when you just have raw data?
Duncan, January   08, 2004 - 7:04 am UTC
 
 
Tom
I have a procedure which has many input parameters representing elements of table data.  Once these parameters are read into the procedure I validate each element.  Once the data line has been validated I then need to check that the record doesn't already exist in my table.
I do this by checking a few fields of demographic data, eg name, address.  If the record doesn't exist then i insert it.
However if the record does exist I need to do an update of all the other fields of the row with the new data fields read into the procedure.  I also need to keep track of which fields have actually changed from their original table data.  Obviously I can do this the long way around by checking individual fields and comparing their table data to the new data, but I wanted to know if you have a quicker, more efficient method since my table is very wide, and checking individual elements is tedious, especially because I have to account for NULL values.
I was wondering about loading the data line into maybe a RECORD type and comparing to the table data, but I understand that you cannot directly compare RECORD types.  
Would another type work, eg ORACLE OBJECT?  If so how could i get it to log which fields have been changed?
There must be a quick less painful way to do this as I can't believe it is that uncommon?
Many thanks for your help
(db version 8.1.7.4) 
 
January   08, 2004 - 1:48 pm UTC 
 
 
to check field by field you have one and only one option:
to check field by field.
(it is rather "uncommon", I've not run into it as a requirement very often).
the "easiest" way I can think to do this is:
ops$tkyte@ORA920PC> create table t ( a int, b date, c varchar2(25) );
Table created.
<b>that represents your table.  A is your "key", b, c the data you update</b>
 
ops$tkyte@ORA920PC> insert into t values ( 1, trunc(sysdate), 'hello world' );
ops$tkyte@ORA920PC> insert into t values ( 2, null, 'hello world' );
ops$tkyte@ORA920PC> insert into t values ( 3, trunc(sysdate), null );
ops$tkyte@ORA920PC> create or replace view v
  2  as
  3  select b, 0 b_ind, c, 0 c_ind
  4    from t
  5   where 1=0;
 
View created.
<b>I'm using this view as a template.  Basically, we'll have column/indicator, column/indicator in this view.
The view is used to pick up the datatypes from the data dictionary easily and make it so that change to the base table -- changes the view straight away (and so we don't have to define the record in plsql)...
Instead of a view, this could be a cursor in the package
Here is your routine.  note how we use DECODE to do the compares (null safe)</b>
 
ops$tkyte@ORA920PC> create or replace procedure do_iu( p_a in int, p_b in date, p_c in varchar2 )
  2  as
  3      l_rec v%rowtype;
  4  begin
  5      select b, decode( b, p_b, 0, 1 ),
  6             c, decode( c, p_c, 0, 1 )
  7        into l_rec
  8        from t
  9       where a = p_a;
 10
 11      if ( l_rec.b_ind = 1 )
 12      then
 13          dbms_output.put_line( 'B will be changed from "' || l_rec.b || '" to ' || p_b );
 14      end if;
 15      if ( l_rec.c_ind = 1 )
 16      then
 17          dbms_output.put_line( 'C will be changed from "' || l_rec.c || '" to ' || p_c );
 18      end if;
 19
 20      update t
 21         set b = p_b, c = p_c
 22       where a = p_a;
 23      dbms_output.put_line( 'Updated ' || sql%rowcount );
 24  exception
 25      when no_data_found then
 26          dbms_output.put_line( 'Inserted' );
 27          insert into t (a,b,c) values ( p_a, p_b, p_c );
 28  end;
 29  /
 
Procedure created.
 
<b>and then to test...</b>
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC> exec do_iu( 1, sysdate-1, 'hello world' );
B will be changed from "08-JAN-04" to 07-JAN-04
Updated 1
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA920PC> exec do_iu( 1, trunc(sysdate), 'goodbye' );
B will be changed from "07-JAN-04" to 08-JAN-04
C will be changed from "hello world" to goodbye
Updated 1
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA920PC> exec do_iu( 1, sysdate, 'goodbye' );
B will be changed from "08-JAN-04" to 08-JAN-04
Updated 1
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC> exec do_iu( 2, sysdate, 'hello world' );
B will be changed from "" to 08-JAN-04
Updated 1
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA920PC> exec do_iu( 3, trunc(sysdate), 'hello world' );
C will be changed from "" to hello world
Updated 1
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC> exec do_iu( 4, sysdate, 'hello world' );
Inserted
 
PL/SQL procedure successfully completed.
 
 
 
 
 
Read the tables only once!
Marco Stefanetti, February  03, 2004 - 2:03 pm UTC
 
 
I have found this simple solution, the benefit is that i read the two tables only once:
  1   select field,
  2          value,
  3          sum(source) source
  4   from (
  5          (
  6             select 'a' field,1 source, 20 value from dual
  7             union
  8             select 'b' field,1 source, 15 value from dual
  9             union
 10             select 'c' field,1 source, 30 value from dual
 11          )
 12        union all
 13         (
 14             select 'b' field,2 source, 15 value from dual
 15             union
 16             select 'c' field,2 source, 24 value from dual
 17             union
 18             select 'd' field,2 source, 13 value from dual
 19          )
 20       )
 21  group by field, value
 22* having sum(source) < 3
F      VALUE     SOURCE
- ---------- ----------
a         20          1
c         24          2
c         30          1
d         13          2
bye
 
 
February  03, 2004 - 2:13 pm UTC 
 
 
huh?
you do realize that you read dual 6 times and not only that but you did a sort/distinct on it....
And you'd read your TABLE 2 times for each column in it in general?  
how'd you get "that i read the two tables only once:"?
And not only that, but it gets "the wrong answer" to boot.  What happens with two tables:
T1
1         2
2         1
T2
1         1
2         2
Your technique would say "they are the same", but they are not. 
Not sure if you meant what you typed in as the example query -- why don't you work with real tables? 
 
 
 
you are too fast on respond
Marco Stefanetti, February  03, 2004 - 2:31 pm UTC
 
 
try this :
create table a (field char,source number,value number);
insert into a values('a',1,10);
insert into a values('b',1,10);
insert into a values('c',1,10); 
insert into a values('1',1, 2); 
insert into a values('2',1, 1); 
select * from a;
create table b (field char,source number,value number);
insert into b values('b',2,10);
insert into b values('c',2,12); 
insert into b values('d',2,10);
insert into b values('1',2, 1); 
insert into b values('2',2, 2); 
select * from b;
select field,
       value,
       sum(source) source
     from (
           ( select * from a )
            union all
           ( select * from b )
          )
   group by field, value
  having sum(source) < 3;
---------------------------------------
results:
select * from a;
F     SOURCE      VALUE
- ---------- ----------
a          1         10
b          1         10
c          1         10
1          1          2
2          1          1
 select * from b;
F     SOURCE      VALUE
- ---------- ----------
b          2         10
c          2         12
d          2         10
1          2          1
2          2          2
select field,
       value,
       sum(source) source
     from (
           ( select * from a )
            union all
           ( select * from b )
          )
   group by field, value
  having sum(source) < 3;
F      VALUE     SOURCE
- ---------- ----------
1          1          2
1          2          1
2          1          1
2          2          2
a         10          1
c         10          1
c         12          2
d         10          2
Record "b 10" doesn't compare!!!!!
I read table a once
I read table b once
see you 
 
February  03, 2004 - 6:15 pm UTC 
 
 
that is not at all what you posted -- not at all -- see what you posted, it would appear in your above example that you were PIVOTING fields named "a", "b", and "c"
that is why I said "do a real example, for yours does not make sense as presented"
what was the purpose of the unions of duals -- they only confused the issue.  I thought you were PIVOTING a table with 3 columns A, B and C -- that would not work. 
Now I see your general idea and say "looks good".  I would however do it like this:
                                                                                           
select id, count(src1), count(src2)
  from ( select a.*, 1 src1, to_number(null) src2 from big_table a
         union all
         select b.*, to_number(null) src1, 2 src2 from big_table b
       )
group by     ID, OWNER, OBJECT_NAME, SUBOBJECT_NAME, OBJECT_ID,
DATA_OBJECT_ID, OBJECT_TYPE, CREATED, LAST_DDL_TIME, TIMESTAMP,
STATUS, TEMPORARY, GENERATED, SECONDARY
having count(src1) <> count(src2)
That does effectively the same thing -- BUT works even if table A has a row three times and table B does not have it at all (you would get sum(source) = 3 in that case)
This'll show how many rows are in each table when they are not equal, will tolerate duplicate rows and show the mismatches.
Thanks for the interesting new twist -- it does do less work in most cases -- the group by could be killer on really wide rows.
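A small runnable illustration of that duplicate case (both "tables" built inline from DUAL): the value 'x' appears three times in the first source and not at all in the second, so the earlier sum(source) test would see 1+1+1 = 3 and miss it, while the count(src1)/count(src2) test still flags it (and also flags 'y', present only in the second source):
select val, count(src1) cnt1, count(src2) cnt2
  from (
        -- "table A": the value 'x' appears three times
        select 'x' val, 1 src1, to_number(null) src2 from dual union all
        select 'x'    , 1     , to_number(null)      from dual union all
        select 'x'    , 1     , to_number(null)      from dual union all
        -- "table B": has 'y' but no 'x' at all
        select 'y'    , to_number(null), 2           from dual
       )
 group by val
having count(src1) <> count(src2)
/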
 
 
 
 
ps
Marco Stefanetti, February  03, 2004 - 2:44 pm UTC
 
 
select field,
       sum( decode(source,1,value,null) ) table_a,
       sum( decode(source,2,value,null) ) table_b
from
( select field,
       value,
       sum(source) source
     from (
           ( select * from a )
            union all
           ( select * from b )
          )
   group by field, value
  having sum(source) < 3
)
group by field;
F    TABLE_A    TABLE_B
- ---------- ----------
1          2          1
2          1          2
a         10
c         10         12
d                    10
bye
 
 
 
just another example
Marco Stefanetti, February  03, 2004 - 4:19 pm UTC
 
 
You can compare 2 fields of 3 tables
------------------------------------------------------------------------------------------------------------
create table a ( key varchar2(10), value1 number, value2 varchar2(10) );
insert into a values('A',10,'aaa');
insert into a values('B',20,'bbb');
insert into a values('C',30,'ccc');
insert into a values('D',40,'ddd');
insert into a values('E',50,'eee');
create table b ( key varchar2(10), value1 number, value2 varchar2(10) );
insert into b values('B',20,'bbb');
insert into b values('C',30,'ccc');
insert into b values('D',41,'ddd');
insert into b values('E',50,'eee');
create table c ( key varchar2(10), value1 number, value2 varchar2(10) );
insert into c values('C',30,'ccc');
insert into c values('D',40,'ddd');
insert into c values('E',50,'eeX');
------------------------------------------------------------------------------------------------------------
select key,
       max( decode( source_code_tot, 1, value1, 3, value1, 5, value1, 7, value1, null ) ) a_v1,
       max( decode( source_code_tot, 1, value2, 3, value2, 5, value2, 7, value2, null ) ) a_v2,
       max( decode( source_code_tot, 2, value1, 3, value1, 6, value1, 7, value1, null ) ) b_v1,
       max( decode( source_code_tot, 2, value2, 3, value2, 6, value2, 7, value2, null ) ) b_v2,
       max( decode( source_code_tot, 4, value1, 5, value1, 6, value1, 7, value1, null ) ) c_v1,
       max( decode( source_code_tot, 4, value2, 5, value2, 6, value2, 7, value2, null ) ) c_v2,
       decode( source_code_tot, 7, 'OK', 'DIFF') descr
from
(
select key,
       value1,
       value2,
       sum(source_code) source_code_tot
  from (
         ( select a.*, 1 source_code from a)
         union all
         ( select b.*, 2 source_code from b)
         union all
         ( select c.*, 4 source_code from c)
       )
group by key, value1, value2
)
group by key, decode( source_code_tot, 7, 'OK', 'DIFF')
order by key;
KEY              A_V1 A_V2             B_V1 B_V2             C_V1 C_V2       DESCR
---------- ---------- ---------- ---------- ---------- ---------- ---------- --------------------------------
A                  10 aaa                                                    DIFF
B                  20 bbb                20 bbb                              DIFF
C                  30 ccc                30 ccc                30 ccc        OK
D                  40 ddd                41 ddd                40 ddd        DIFF
E                  50 eee                50 eee                50 eeX        DIFF
------------------------------------------------------------------------------------------------------------
Is it strange to sum the labels 1,2,4? 
not so strange, table d would have source_code=8
Useful?
yes, if one of the tables is a complex view on a remote db, and you can read it only once
fast?
I compare 10000 rows of a remote table in 2 seconds.
let me know ... 
 
 
Multiple records
Marco Stefanetti, February  04, 2004 - 12:35 pm UTC
 
 
YOU:
that is not at all what you posted -- not at all -- see what you posted, it 
would appear in your above example that you were PIVOTING fields named "a", "b", 
and "c"
ME:
Sorry, the union select in the first of my posts was simply a table example,
but it was confusing and, sure, it is better with real tables
YOU:
select id, count(src1), count(src2)
  from ( select a.*, 1 src1, to_number(null) src2 from big_table a
         union all
         select b.*, to_number(null) src1, 2 src2 from big_table b
       )
group by     ID, OWNER, OBJECT_NAME, SUBOBJECT_NAME, OBJECT_ID,
DATA_OBJECT_ID, OBJECT_TYPE, CREATED, LAST_DDL_TIME, TIMESTAMP,
STATUS, TEMPORARY, GENERATED, SECONDARY
having count(src1) <> count(src2)
That does effectively the same thing -- BUT works even if table A has a row 
three times and table B does not have it at all (you would get sum(source) = 3 
in that case)
ME:
I'm not sure, you are not considering the content of the fields, only keys.
YOU:
the group by could be killer on really wide rows.
ME:
Yes, but I think that something "killer" would be necessary in any case
YOU: BUT works even if table A has a row three times
ME: I have got another idea, hope i'm right
we can use the product of primes instead of the sum of binary numbers
PROBLEM:
find differences and missing records between similar tables
TABLES:
create table a ( key varchar2(10), value1 varchar2(10) );
insert into a values('A','aaa');
insert into a values('A','aaa');
insert into a values('B','bbb');
insert into a values('B','bbb');
insert into a values('C','ccc');
insert into a values('D','ddd');
insert into a values('E','eee');
create table b ( key varchar2(10), value1 varchar2(10) );
insert into b values('B','bbb');
insert into b values('C','ccc');
insert into b values('C','ccc');
insert into b values('D','ddd');
insert into b values('E','eee');
create table c ( key varchar2(10), value1 varchar2(10) );
insert into c values('C','ccc');
insert into c values('C','ccc');
insert into c values('D','ddd');
insert into c values('E','eeX');
SELECT:
select 
  key,
  decode( mod(source_code_tot,2), 0, value1, null)  a,
  decode( mod(source_code_tot,3), 0, value1, null)  b,
  decode( mod(source_code_tot,5), 0, value1, null)  c,
  source_code_tot p,
  mod(source_code_tot,30) r,
  decode(  mod(source_code_tot,30), 0, 'OK', 'DIFF')  res
from
(
  select key,
         value1,
         trunc( exp( sum( ln( source_code) ) ) ) source_code_tot
    from (
           ( select a.*, 2 source_code from a)
           union all
           ( select b.*, 3 source_code from b)
           union all
           ( select c.*, 5 source_code from c)
         )
  group by key, value1
)
;
RESULTS:
KEY   A     B     C         P     R RES
----- ----- ----- ----- ----- ----- -----
A     aaa                   4     4 DIFF 
B     bbb   bbb            12    12 DIFF 
C     ccc   ccc   ccc     450     0 OK 
D     ddd   ddd   ddd      30     0 OK 
E                 eeX       5     5 DIFF 
E     eee   eee             6     6 DIFF 
It seems to work, but may be inaccurate due to exp(sum(ln(x))) because of approximations
STEPS:
1) union all : we consider all records of our tables, labelling them with a prime number
2) group by key, value1 : we consider records with different values as different (difference between records);
   at the same time we calculate the product of the labels (exp sum ln), a single number that memorizes the
   provenance of the records in the group
3) decode(mod,30) : we can have a total that is a multiple of 30 (2*3*5) if, and only if, the record was present in the three sources.
I'm sorry for the long posts, but it seems to me quite interesting.
 
 
February  04, 2004 - 5:42 pm UTC 
 
 
watch out - that exp(sum(ln(x))) would be a killer cpu wise!!
I like your approach with the minor modification I have (which permits duplicates -- I'm not assuming "keys" of any sort here at all).
The group by approach (given that you don't get sort key too long) is elegant and efficient. 
 
 
 
Sorry
Marco Stefanetti, February  04, 2004 - 1:29 pm UTC
 
 
Sorry, you are right
considering the above tables it's enough:
select key,
       decode( ta, 0, null, value1)  a,
       decode( tb, 0, null, value1)  b,
       decode( tc, 0, null, value1)  c,
       decode( ta, 0, 'DIFF', 
               decode( tb, 0, 'DIFF',
                 decode( tc, 0, 'DIFF', 'OK' ) ) ) res  
from
(
  select key,
         value1,
         sum(ta) ta,
         sum(tb) tb,
         sum(tc) tc
    from (
           ( select a.*, 1 ta, 0 tb, 0 tc from a)
           union all
           ( select b.*, 0 ta, 1 tb, 0 tc from b)
           union all
           ( select c.*, 0 ta, 0 tb, 1 tc from c)
         )
  group by key, value1
)
RESULTS:
KEY        A          B          C          RES
---------- ---------- ---------- ---------- ---------------
A          aaa                              DIFF
B          bbb        bbb                   DIFF
C          ccc        ccc        ccc        OK
D          ddd        ddd        ddd        OK
E                                eeX        DIFF
E          eee        eee                   DIFF
OK, stop here, I think we have found a good solution. 
 
 
Comparing data between two tables with similar data types
Bonnie, November  22, 2004 - 7:51 pm UTC
 
 
Hi Tom,
I was also having problems comparing data from two similar tables, and tried your solution using the decode statement.  However, since I'm comparing 202 columns between the two tables, I'm getting an error of 'ORA-00939: too many arguments for function'.  Any ideas on how I can resolve this?
Your help is much appreciated. 
 
November  22, 2004 - 8:11 pm UTC 
 
 
decode stops at 255.
so, you'd have to do at least 2 decodes
  3  select t1.trans_id,
  4    decode( r, 1, 'C1', 2, 'C2', 3, 'C3' ) || decode( r, 4, c4, 5, c5 ) cname,
stopping the first decode with 200 or so inputs and starting the next. 
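A tiny runnable illustration of the concatenation idea (splitting after the third "column" here purely for brevity; in practice the split point is governed by the 255 limit above).  For any given r only one of the two DECODEs returns a non-null value, so the concatenation yields just the matching name:
select r,
       decode( r, 1, 'C1', 2, 'C2', 3, 'C3' ) ||
       decode( r, 4, 'C4', 5, 'C5' )          cname
  from (select rownum r from all_objects where rownum <= 5)
/
The same splitting applies to the v1 and v2 DECODEs.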
 
 
 
Comparing data between two tables with similar data types
Bonnie, December  01, 2004 - 1:03 pm UTC
 
 
Thank you so much for that.  It took a while to sort through all the different variables on my two tables as they were not always consistent.  Anyhow, I was wondering if you know how I can resolve this dynamic sql:
--this first one returns the output shown below
 s_PlSqlBlock :=
   'UPDATE  TRANS
    SET '||C1||' =  to_date('||c_cursor.V2||',''dd-mon-yyyy hh24:mi:ss'')'||
   ' WHERE trans_id = '||c_cursor.trans_id;
    
the output from a select statement is:
UPDATE TRANS
SET C1= to_date(27-apr-2004 10:45:00,'dd-mon-yyyy hh24:mi:ss') 
WHERE trans_id= 3;
--notice there's no quotes around the '27-apr-2004 10:45:00' which caused the update statement to fail, with 'ORA-00904: invalid column name' error.              
The following also doesn't work.
s_PlSqlBlock :=
'UPDATE  TRANS
SET '||C1||' =  to_date('||'''c_cursor.V2'''||',''dd-mon-yyyy hh24:mi:ss'')'||
' WHERE trans_id = '||c_cursor.trans_id;
Any ideas how I can make this work?
Thank you so much!!
    
      
 
December  01, 2004 - 1:49 pm UTC 
 
 
USE BIND VARIABLES!!!!!!!
don't put literals in SQL like that. 
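A minimal sketch of what that might look like with binds, reusing the variable names from the post above (the column name still has to be concatenated in; only the values get bound):
s_PlSqlBlock :=
  'UPDATE TRANS
      SET '||C1||' = to_date( :new_val, ''dd-mon-yyyy hh24:mi:ss'' )
    WHERE trans_id = :id';
execute immediate s_PlSqlBlock using c_cursor.V2, c_cursor.trans_id;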
 
 
 
comparing two tables
biju, December  08, 2004 - 11:50 am UTC
 
 
i have two tables (tb1, tb2) with the same fields (id (pk), name, age, age_ret)
i want a query which will give me a table (tmp) containing (id, name, age in tb1, age in tb2, age_ret in tb1, age_ret in tb2)
if there is a change in age or age_ret in table tb2 compared to tb1
can you help me out 
 
December  09, 2004 - 12:38 pm UTC 
 
 
did you read the above examples?  you are 99.9% of the way there.
not too much left to do on your part. 
 
 
 
Table/Column Compare
Ivan, January   28, 2005 - 5:49 pm UTC
 
 
Regarding Marco's  February 04, 2004 result 
1) I should think that the 3 tables' sets of relevant column(s) could be considered equal 
only if they have an equal count of (any) duplicate values, along with having equal values.
In his example - with a key of 'C' - Table A has only 1 row, the other tables have 2 rows.
It could be "DIFF", not "OK"?
2) The DECODE can be replaced with a CASE to eliminate DECODE's above stated limitation.
So another query could be:
select key,
       decode( ta, 0, null, value1)  a,
       decode( tb, 0, null, value1)  b,
       decode( tc, 0, null, value1)  c,
       --decode( ta, 0, 'DIFF', 
       --        decode( tb, 0, 'DIFF',
       --          decode( tc, 0, 'DIFF', 'OK' ) ) ) res        
   (CASE WHEN ta = tb AND ta = tc THEN 'OK' ELSE 'DIFF' END)  res
from
(
  select key,
         value1,
         sum(ta) ta,
         sum(tb) tb,
         sum(tc) tc
    from (
           ( select a.*, 1 ta, 0 tb, 0 tc from a)
           union all
           ( select b.*, 0 ta, 1 tb, 0 tc from b)
           union all
           ( select c.*, 0 ta, 0 tb, 1 tc from c)
         )
  group by key, value1
);
KEY        A          B          C          RES
---------- ---------- ---------- ---------- ----
A          aaa                              DIFF
B          bbb        bbb                   DIFF
C          ccc        ccc        ccc        DIFF
D          ddd        ddd        ddd        OK
E                                eeX        DIFF
E          eee        eee                   DIFF
F                                           DIFF
 
 
 
A reader, February  11, 2005 - 3:23 am UTC
 
 
Hi Tom,
  We want to compare the data contents of table A and B. We know that each record is unique in table A and B, and the number of records in table A and B is the same. If A minus B and B minus A returns no rows, then can we conclude that the data in A and B are the same?
Thanks 
 
 
A reader, February  11, 2005 - 8:13 pm UTC
 
 
Tom,
Is there some package in Streams to compare tables like the one in replication.
Thanks. 
 
 
Your link is very useful
A reader, February  12, 2005 - 4:22 am UTC
 
 
 
Mihail Bratu, March     30, 2005 - 10:02 am UTC
 
 
 
 
Diff's columns for big tables
GP, April     28, 2005 - 9:05 am UTC
 
 
Hello Tom,
I have a similar requirement 
Table A (180 columns), Table B(180 columns)
Both the tables having approximately 100,000 rows
Need to find all the differences and log them
I am querying it like below
tableA minus tableB 
union all
tableB minus tableA
I get certain rows but it's hard to determine which columns have changed
Hence, Referring to your followup earlier on this Posting.
 
tkyte@TKYTE9I.US.ORACLE.COM> select *
  2    from (
  3  select t1.trans_id,
  4    decode( r, 1, 'C1', 2, 'C2', 3, 'C3' ) cname,
  5    decode( r, 1, to_char(t1.c1,'dd-mon-yyyy hh24:mi:ss'),
  6               2, to_char(t1.c2),
  7               3, t1.c3) v1,
  8    decode( r, 1, to_char(t2.c1,'dd-mon-yyyy hh24:mi:ss'),
  9               2, to_char(t2.c2),
 10               3, t2.c3) v2
 11  from trans t1,
 12       trans_tmp t2,
 13       (select rownum r from all_objects where rownum <= 3 )
 14  where t1.trans_id = :bv
 15    and t1.trans_id = t2.trans_id (+)
 16    )
 17  where v1 <> v2
 18     or (v1 is null and v2 is not null)
 19     or (v1 is not null and v2 is null)
 20  /
Following is the query I am currently using for diffs
(lk is the primary key)
(select 180 columns from A@itenv where lk in (select lk from lklnr@itenv)
 minus
 select 180 columns from A@cbenv where lk in (select lk from lklnr@cbenv))
union all
(select 180 columns from A@cbenv where lk in (select lk from lklnr@cbenv)
 minus
 select 180 columns from A@itenv where lk in (select lk from lklnr@itenv))
How can I rewrite the above query for my use 
to get the differences and columns with both values.
Thankyou. 
 
April     28, 2005 - 9:34 am UTC 
 
 
building on the comparing two tables article here:
  https://www.oracle.com/technetwork/issue-archive/2005/05-jan/o15asktom-084959.html  
which demonstrates what I've found to be the fastest approach (see the second article on that page)...
You sort of need to have some idea of the "primary key" here (in order to compare what is different -- we need to be able to "join" the two tables row by row, right)...
So, you use the query there:
SQL> select c1, c2, c3, 
  2         count(src1) CNT1, 
  3          count(src2) CNT2
  4     from 
  5   ( select a.*, 
  6          1 src1, 
  7          to_number(null) src2 
  8       from a
  9      union all
  10    select b.*, 
  11         to_number(null) src1, 
  12         2 src2 
  13      from b
  14   )
  15  group by c1,c2,c3
  16 having count(src1) <> count(src2)
  17 /
 
  C1   C2    C3   CNT1    CNT2
 ---   --    --   ----    ----
   2    x     y      0       1
   2   xx     y      1       0
   3    x     y      1       0
   3    x    yy      0       1
 
<b>and order by "primary key" (say it was C1 in this example)</b>
then the matching rows appear one under the other.
or you can further use analytics and
count(distinct nvl(c2,'something c2 cannot be')) over (partition by c1) cnt_c2,
and look at the columns where cnt_NN > 1
 
 
 
 
 
Lk is the primary key in table A and table B
A reader, April     28, 2005 - 12:24 pm UTC
 
 
 
April     28, 2005 - 2:03 pm UTC 
 
 
great -- use it where I used c1.. 
 
 
 
I like analytics
Vinayak, April     28, 2005 - 2:07 pm UTC
 
 
you mean something like this?:
SQL> select * from t1 ;
        ID COL1       COL2
---------- ---------- ----------
         1 x          XXX
         1 x          X
         1 y          YY
SQL> select * from t2 ;
        ID COL1       COL2
---------- ---------- ----------
         1 x          XXX
         1 y          Y
SQL> select * from
  2  (select tname,id,col1,col2,count(*) over (partition by id,col1,col2) cnt
  3  from (select 'T1' tname,t1.* from t1 union all select 'T2' tname,t2.* from t2)
  4  ) where cnt=1 ;
TNAME                                    ID COL1       COL2              CNT
-------------------------------- ---------- ---------- ---------- ----------
T1                                        1 x          X                   1
T2                                        1 y          Y                   1
T1                                        1 y          YY                  1
 
 
 
April     28, 2005 - 2:10 pm UTC 
 
 
yes 
 
 
Tom, I need further help
GP, April     28, 2005 - 2:26 pm UTC
 
 
I went through the article and it was wonderful.
I tried rewriting my query,
but I am missing something -- please guide me further.
LK is the Primary key in both the tables
select c1, c2, c3, count(src1) CNT1, count(src2) CNT2
from 
   (
   select a.*, 1 src1, to_number(null) src2 
      from A@itenv a
      where lk in  (select lk from lklnr@itenv)
      union all
    select b.*, to_number(null) src1, 2 src2 
     from B b
     where lk in  (select lk from lklnr@itenv)
     )
where a.lk = b.lk     
group by c1,c2,c3
having count(src1) <> count(src2)
I need the lk, columns with both the values that are different.
Please Guide.
 
 
April     28, 2005 - 2:34 pm UTC 
 
 
change c1, c2, c3.... to YOUR COLUMNS
there is not any join.
select YOUR_COLUMNS, count(src1) CNT1, count(src2) CNT2
from 
   (
   select a.*, 1 src1, to_number(null) src2 
      from A@itenv a
      where lk in  (select lk from lklnr@itenv)
      union all
    select b.*, to_number(null) src1, 2 src2 
     from B b
     where lk in  (select lk from lklnr@itenv)
     )
group by YOUR_COLUMNS
having count(src1) <> count(src2) 
 
 
 
I did the column select
GP, April     28, 2005 - 3:32 pm UTC
 
 
I am sorry, I guess I did not put my question clearly.
Yes, I am putting all my columns in the select and group by,
but that would give me all the columns in my result set.
I want to have lk, just the column that changed, val1, val2
(val1 and val2 are the values of the columns that are different in the two rows).
Please help me get through further.
Thank you very much. 
 
April     28, 2005 - 4:22 pm UTC 
 
 
here is where that comes in -- after you get this:
select YOUR_COLUMNS, count(src1) CNT1, count(src2) CNT2
from 
   (
   select a.*, 1 src1, to_number(null) src2 
      from A@itenv a
      where lk in  (select lk from lklnr@itenv)
      union all
    select b.*, to_number(null) src1, 2 src2 
     from B b
     where lk in  (select lk from lklnr@itenv)
     )
group by YOUR_COLUMNS
having count(src1) <> count(src2)
(call that Q) you'll:
select q.*, count(distinct c1) over (partition by lk) cnt_c1, ....
 from (q)
those counts can be used as "flags" to say "look at me"
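Spelled out a bit more against hypothetical columns c1 and c2 (lk being the key), that would look something like:
select q.*,
       count( distinct nvl(c1,'something c1 cannot be') ) over (partition by lk) cnt_c1,
       count( distinct nvl(c2,'something c2 cannot be') ) over (partition by lk) cnt_c2
  from (
        select lk, c1, c2, count(src1) cnt1, count(src2) cnt2
          from ( select a.lk, a.c1, a.c2, 1 src1, to_number(null) src2
                   from A@itenv a
                  where lk in (select lk from lklnr@itenv)
                 union all
                 select b.lk, b.c1, b.c2, to_number(null) src1, 2 src2
                   from B b
                  where lk in (select lk from lklnr@itenv) )
         group by lk, c1, c2
        having count(src1) <> count(src2)
       ) q
 order by lk
/
Each lk that differs comes back once per source (so the two versions of the row sit one under the other), and a cnt_cN greater than 1 is the "look at me" flag for column cN.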
 
 
 
 
Last followup on this table comparison thread is confusing to me
Rick, May       09, 2005 - 7:37 pm UTC
 
 
This has been an awesome and very useful thread. However, is it possible for you to provide a more complete example of how to return only the primary key column and the column(s) whose values have changed, please? 
Did you mean that the results of the compare-rows script should be inserted into a table called q and then select from that table to get the counts?  You say that the counts can be used as "flags" to say "look at me".  Would you please expound on that?  Please excuse my obtuseness 
 
May       09, 2005 - 8:02 pm UTC 
 
 
you have to look column by column -- no magic there, the flags were just a way to say "look at me", nothing more. 
 
 
 
Maybe this script will help! ;)
LOU, July      14, 2005 - 5:28 pm UTC
 
 
undefine TABLE1
undefine TABLE2
define g_table1 = '&&TABLE1'
define g_table2 = '&&TABLE2'
set verify off
set feedback off
set serveroutput on size 1000000
spo temp_file.sql
declare
v_owntab1 varchar2(255) := '&&g_table1';
v_owntab2 varchar2(255) := '&&g_table2';
v_own1 varchar2(255);
v_own2 varchar2(255);
v_tab1 varchar2(255);
v_tab2 varchar2(255);
v_dot1 number := 0;
v_dot2 number := 0;
type t_cols is table of varchar2(255) index by binary_integer;
v_cols1 t_cols;
v_cols2 t_cols;
v_out1 varchar2(255);
v_out2 varchar2(255);
kq CONSTANT varchar2(1) := '''';
v_ind number := 0;
v_str varchar2(2000) := null;
v_ind_found boolean := FALSE;
v_ind_colno number := 0;
  procedure print_cols (p_cols in t_cols) is
  begin
  for i in 1..p_cols.count
  loop
    dbms_output.put_line(','||p_cols(i));
  end loop;
  end print_cols;
begin
  v_dot1 := instr(v_owntab1, '.');
  v_dot2 := instr(v_owntab2, '.');
  if v_dot1 > 0 then
    v_own1 := upper(substr(v_owntab1, 1, v_dot1-1));
    v_tab1 := upper(substr(v_owntab1, v_dot1+1));
  else
    v_own1 := null;
    v_tab1 := upper(v_owntab1);
  end if;
  if v_dot2 > 0 then
    v_own2 := upper(substr(v_owntab2, 1, v_dot2-1));
    v_tab2 := upper(substr(v_owntab2, v_dot2+1));
  else
    v_own2 := null;
    v_tab2 := upper(v_owntab2);
  end if;
  select column_name
  bulk collect into v_cols1
  from all_tab_columns
  where table_name = v_tab1
  and owner = nvl(v_own1, user)
  order by column_id;
  select column_name
  bulk collect into v_cols2
  from all_tab_columns
  where table_name = v_tab2
  and owner = nvl(v_own2, user)
  order by column_id;
  if v_cols1.count = 0 or v_cols2.count = 0 then
    dbms_output.put_line('Either or Both the tables are invalid');
    return;
  end if;
  dbms_output.put_line('(');
  dbms_output.put_line('select '||kq||'TAB1'||kq);
  print_cols(v_cols1);
  dbms_output.put_line(' from '||nvl(v_own1, user)||'.'||v_tab1);
  dbms_output.put_line('MINUS');
  dbms_output.put_line('select '||kq||'TAB1'||kq);
  print_cols(v_cols1);
  dbms_output.put_line(' from '||nvl(v_own2, user)||'.'||v_tab2);
  dbms_output.put_line(')');
  dbms_output.put_line('UNION');
  dbms_output.put_line('(');
  dbms_output.put_line('select '||kq||'TAB2'||kq);
  print_cols(v_cols1);
  dbms_output.put_line(' from '||nvl(v_own2, user)||'.'||v_tab2);
  dbms_output.put_line('MINUS');
  dbms_output.put_line('select '||kq||'TAB2'||kq);
  print_cols(v_cols1);
  dbms_output.put_line(' from '||nvl(v_own1, user)||'.'||v_tab1);
  dbms_output.put_line(')');
  dbms_output.put_line('order by ');
  for c1 in (
  select b.column_name
  from all_indexes a, all_ind_columns b
  where a.owner=b.index_owner
  and a.index_name=b.index_name
  and a.uniqueness = 'UNIQUE'
  and a.table_owner = nvl(v_own1, user)
  and a.table_name = v_tab1
  order by b.index_name, b.column_position
  )
  loop
    v_ind_found := TRUE;
    v_ind_colno := v_ind_colno + 1;
    if v_ind_colno = 1 then
      dbms_output.put_line(c1.column_name);
    else
      dbms_output.put_line(','||c1.column_name);
    end if;
  end loop;
  if not v_ind_found then
    dbms_output.put_line('2 ');
  end if;
  dbms_output.put_line(';');
end;
/
spo off
set feedback on 
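A hypothetical usage sketch, assuming the script above is saved as compare_gen.sql: it prompts for the two table names (schema.table works too), spools the generated MINUS/UNION statement into temp_file.sql, and that file is then run to do the actual compare:
SQL> @compare_gen
SQL> @temp_file.sql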
 
 
Humaira Liaqat, August    15, 2005 - 4:32 am UTC
 
 
Hello Tom,
 I want to write a procedure which would insert,update and delete records from 
10 tables using join query.
The structure of these tables is: they have 3 columns namely OLD,NEW AND VALUE.
The query is like
SELECT A.RED AS OLD,B.RED AS NEW,B.NAME AS VALUE FROM COLOR_S1 A ,COLOR_S B 
WHERE A.NAME = B.NAME (+) UNION 
SELECT A.RED AS OLD,B.RED AS NEW,A.NAME AS VALUE FROM COLOR_S1 A,COLOR_S B WHERE 
B.NAME = A.NAME (+) ;
Here, for example, COLOR_S1 and COLOR_S2 are similar parent tables where the value of 
RED may differ between the two tables, but they are joined using NAME, which is 
common to both tables.
Whenever there is an insert, update or delete on any of these parent 
tables (like COLOR_S1 and COLOR_S2), the procedure should fire and the respective 
child table (the 3-column table) should get updated.
The number of parent tables is 20 (all the _S1 and _S2 tables). The 
number of child tables is 10 (the _S tables).
Please guide me how to write this procedure so that the query gets fired 
every time any change occurs in any of the 20 parent tables.
Thanks
Humaira. 
 
 
August    15, 2005 - 7:20 am UTC 
 
 
did not follow that at all.   
 
 
 
Humaira Liaqat, August    16, 2005 - 1:15 am UTC
 
 
Hi Tom,
   I am sorry I could not explain it to you properly.
Anyway, can you just help me to write a procedure to update a table if any insert, update or delete happens in the parent table? If a child table has 2 parents which are tables similar to each other except for their primary key, can a join query be used in the procedure to populate the child table?
Thanks
Humaira. 
 
August    16, 2005 - 10:50 am UTC 
 
 
and what might this "update" do -- you'll need to explain it before I can demonstrate it.
What is the goal, why do you need to update "a table" (assuming the child table(s)) when you update the parent?  what are you trying to accomplish. 
 
 
 
comparing two table data and logging the changes in MS Access
Shailesh, March     16, 2006 - 3:55 am UTC
 
 
Now I want to compare two tables having same Structure in MS Access  
 
March     16, 2006 - 11:41 am UTC 
 
 
go for it? 
 
 
 
Now you have the differences, process them
Drew DeGraff, July      14, 2006 - 12:22 pm UTC
 
 
I built upon what you guys started.  
This will compare a “base” table with a “refresh” and put the differences in a “diff” table
We will need to create a diff table from the base table and add three columns: base_tab number, refresh_tab number, and action varchar2(30).  These columns will indicate which table the differences come from.
We first populate the diff table.  We will then update the diff table with the action (update, delete, insert, old) that the rows represent.  The query brings back all records that are different; "old" represents the records in the base table that were changed in some way by the "update" records.  The "old" records can be ignored or deleted.
The basic query for getting the differences is
select pk, column1, column2, column3,
          count(src1) base_tab, count(src2) refresh_tab
      from
    ( select a.*,
           1 src1,
           to_number(null) src2  
        from base_tab a
       union all
     select b.*,
          to_number(null) src1,
          2 src2
       from refresh_tab b
    )
   group by pk, column1, column2, column3
  having count(src1) <> count(src2)
This query only returns rows that are different or do not exist in the base and/or the refresh table.
The base_tab and refresh_tab column fields should only contain a 1 or 0 unless the primary key has been violated and there are duplicates.  1 or more means it exists in that respective table, 0 means it doesn't exist.  If there are duplicates then we may need to tweak the code further.
Create table base_tab (pk number, column1 number, column2 varchar2(30), column3 date);
Create table refresh_tab (pk number, column1 number, column2 varchar2(30), column3 date);
Create table diff (pk number, column1 number, column2 varchar2(30), column3 date, base_tab number, refresh_tab number, action varchar2(30));
Insert into base_tab VALUES (1, 1.21,'SNACK CAKES','31-aug-2007');
Insert into base_tab VALUES (2, .96,'DORITOS','30-sep-2007');
Insert into base_tab VALUES (3, .31,'SODA','31-aug-2007');
Insert into base_tab VALUES (4, 2.67,'PIZZA','30-nov-2007');
Insert into refresh_tab VALUES (1, 1.21,'SNACK CAKES','31-aug-2007');
Insert into refresh_tab VALUES (2, .96,'DORITOS','31-OCT-2007');
Insert into refresh_tab VALUES (3, .31,'SODA POP','31-aug-2007');
Insert into refresh_tab VALUES (5, .21,'SNICKERS','30-nov-2007');
Insert into refresh_tab VALUES (6, .45,'BEER','30-nov-2007');
--clean out the differences table
truncate table diff;
--
-- build the difference data
--
insert into diff select pk,column1,column2, column3,
          count(src1) base_tab, count(src2) refresh_tab, null action
      from
    ( select a.*,
           1 src1,
           to_number(null) src2  
        from base_tab a
       union all
     select b.*,
          to_number(null) src1,
          2 src2
       from refresh_tab b
    )
   group by pk,column1,column2, column3
  having count(src1) <> count(src2);
--
-- update old and update records.  It gets all records where the pk 
-- exists more than once, one record would be from the base, the other 
-- is the changed record in the refresh table
-- if the base_tab column contains a 1, then it's an Old, else it's an
-- Update
--
    update diff set action=decode(base_tab,1,'Old','Update') where 
       (pk) IN (
       select pk from (
 select pk, count(*) from diff group by pk
 having count(*) >1) );
--
-- update the delete and insert records.  It gets all records where the
-- pk exists only once.
-- if the base_tab column contains a 1, then it's a Delete, else it's an
-- Insert
--    
   update diff set action=decode(base_tab,1,'Delete','Insert')  where 
       (pk) IN (
       select pk from (
 select pk, count(*) from diff 
 group by pk having count(*) =1) );
SELECT * FROM DIFF;
 PK      col1 col2                 col3        bt rt  action
  2     .96 DORITOS              30-SEP-07   1   0  Old
  2     .96 DORITOS              31-OCT-07   0   1  Update
  3     .31 SODA                 31-AUG-07   1   0  Old
  3     .31 SODA POP             31-AUG-07   0   1  Update
  4    2.67 PIZZA                30-NOV-07   1   0  Delete
  5     .21 SNICKERS             30-NOV-07   0   1  Insert
  6     .45 BEER                 30-NOV-07   0   1  Insert
Now you can process the orders any way you wish.
  
 
 
Please tell me if I am missing something
Srinivas Narashimalu, July      21, 2006 - 11:27 am UTC
 
 
Hi Tom,
I had previously taken your advice on tuning this query. I had tuned it and have also come up with a different approach of getting this report. 
This query essentially displays records for periods (let's say) 2005.01.01 to 2005.07.21 (sysdate-12months) and 2006.01.01 to 2006.07.21 (sysdate). This is mentioned in the "pfact.invoice_date_key" condition in the query (query is given below).
Previously the resultset of this query would be inserted into a table every day (truncating it before inserting). 
Now I want to insert into the table only one day's new records each day, instead of truncating and populating it every day with 'all' the data. 
I created the FULL_REP table (to hold all the records, i.e. up to sysdate (the condition on "pfact.invoice_date_key" in the query)) with a CTAS using the query given below.
I created the PART_REP table (to hold partial records, i.e. up to sysdate-1 (the condition on "pfact.invoice_date_key" in the query)) with a CTAS using the query given below.
But this is what I get when I tested it -
SQL> select count(*) from FULL_REP; (this table has data for periods between 01.01.2005 and 07.19.2005 and 01.01.2006 and 07.19.2006)
 
  COUNT(*)
----------
    251544
 
SQL> select count(*) from PART_REP; (this table has data for periods between 01.01.2005 and 07.18.2005 and 01.01.2006 and 07.18.2006)
 
  COUNT(*)
----------
    250926
 
SQL> select 251544 - 250926 from dual;
 
251544-250926
-------------
          618
 
SQL> create table STAGE_REP as (select * from FULL_REP minus select * from PART_REP);        
 
Table created.
 
SQL> select count(*) from STAGE_REP;
 
  COUNT(*)
----------
      8909
WHY IS IT CREATED WITH 8909 ROWS? IT MUST HAVE BEEN 618 RIGHT?
And when I insert the STAGE_REP values into PART_REP it's resulting in 
select count(*) from PART_REP;
  COUNT(*)
----------
    259835
Please tell me where I am going wrong. Below is the sql used to create tables FULL_REP,PART_REP and STAGE_REP.
SELECT pfact.invoice_year,
                dlr.dealer_key,
                dlr.dealer_number,
                dlr.name,
                dlr.city||', '||dlr.state AS location,
                dlr.country,
                dlr.classification,
                dlr.start_date_active,
                dlr.end_date_active,
                dlr.old_dealer_number,
                regdist.region_code,
                regdist.region_desc,
                regdist.regional_manager,
                regdist.district_code,
                regdist.district_desc,
                regdist.district_manager,
                dlr.facing_dealer,
                dlr.parts_parent_dealer,
--              dlr.parent_location,    -- to be added to dealer_dim
                '',
                dlr.financial_report_dealer,
                dlr.consolidation_dealer,
                dlr.parts_management_group,
                org.organization_code,
                org.division_code,
                supp.supplier_code,
                supp.name,
                pgrp.product_group_desc,
--              pvdr.parent_code,
--              pvdr.part_classification,
--              pvdr.proprietary_ind,
--              dship.direct_ship_program,
--              ordr.order_type,
--              pri.priority_code,
                NVL(DECODE(invoice_year, TO_CHAR(SYSDATE - 1,'YYYY'),
                        DECODE(ordr.order_type,'PDC',
                                DECODE(pri.priority_code,'S',
                                        SUM(pfact.local_ext_sales_price), 0) ,0) ,0), 0) AS pdc_stock,
                NVL(DECODE(invoice_year, TO_CHAR(SYSDATE - 1,'YYYY'),
                        DECODE(ordr.order_type,'PDC',
                                DECODE(pri.priority_code,'E',
                                        SUM(pfact.local_ext_sales_price), 0), 0), 0), 0) AS pdc_vor,
                NVL(DECODE(invoice_year, TO_CHAR(SYSDATE - 1,'YYYY'),
                        DECODE(ordr.order_type,'DSP',
                                SUM(pfact.local_ext_sales_price), 0), 0), 0) AS vdsp,
                NVL(DECODE(invoice_year, TO_CHAR(SYSDATE - 1,'YYYY'),
                        DECODE(ordr.order_type,'SD',
                                SUM(pfact.local_ext_sales_price), 0), 0), 0) AS ship_direct,
                NVL(DECODE(invoice_year, TO_CHAR(SYSDATE - 1,'YYYY'),
                        SUM(pfact.local_ext_sales_price), 0), 0) AS gross_sales,
                NVL(DECODE(invoice_year, TO_CHAR(SYSDATE - 1,'YYYY'),
                        SUM(pfact.local_gross_profit), 0), 0) AS gross_profit,
                NVL(DECODE(invoice_year, TO_CHAR(SYSDATE - 1,'YYYY') - 1,
                        DECODE(ordr.order_type,'PDC',
                                DECODE(pri.priority_code,'S',
                                        SUM(pfact.local_ext_sales_price), 0), 0), 0), 0) AS pdc_stock_prev,
                NVL(DECODE(invoice_year, TO_CHAR(SYSDATE - 1,'YYYY') - 1,
                        DECODE(ordr.order_type,'PDC',
                                DECODE(pri.priority_code,'E',
                                        SUM(pfact.local_ext_sales_price), 0), 0), 0), 0) AS pdc_vor_prev,
                NVL(DECODE(invoice_year, TO_CHAR(SYSDATE - 1,'YYYY') - 1,
                        DECODE(ordr.order_type,'DSP',
                                SUM(pfact.local_ext_sales_price), 0), 0), 0) AS vdsp_prev,
                NVL(DECODE(invoice_year, TO_CHAR(SYSDATE - 1,'YYYY') - 1,
                        DECODE(ordr.order_type,'SD',
                                SUM(pfact.local_ext_sales_price), 0), 0), 0) AS ship_direct_prev,
                NVL(DECODE(invoice_year, TO_CHAR(SYSDATE - 1,'YYYY') - 1,
                        SUM(pfact.local_ext_sales_price), 0), 0) AS gross_sales_prev,
                NVL(DECODE(invoice_year, TO_CHAR(SYSDATE - 1,'YYYY') - 1,
                        SUM(pfact.local_gross_profit), 0), 0) AS gross_profit_prev
        FROM    priority_dim             pri,
                product_group_dim              pgrp,
                order_type_dim           ordr,
                organization_dim         org,
                dealer_dim               dlr,
                region_district_dim      regdist,
                price_vendor_dim         pvdr,
               part_dim                 pdim,
                supplier_dim             supp,
                part_order_line_fact     pfact
        WHERE regdist.type(+) = 'PARTS'
--        AND dlr.dealer_number = 'A300'
          AND regdist.region_district_key IN
                                (SELECT MIN(region_district_key)
                                 FROM region_district_dim
                                 WHERE type = 'PARTS'
                                 AND dealer_key = dlr.dealer_key)
          AND (pfact.invoice_date_key BETWEEN TO_NUMBER(TO_CHAR(SYSDATE-1, 'yyyy')- 1||'0101')
                                          AND TO_NUMBER(TO_CHAR(SYSDATE-1, 'yyyymmdd')))
          AND (pfact.invoice_date_key NOT BETWEEN TO_NUMBER(TO_CHAR(ADD_MONTHS(SYSDATE-1, -12), 'yyyymmdd'))
                                              AND TO_CHAR(TO_CHAR(SYSDATE-1, 'YYYY')- 1)||'1231')
--w/ Canada dealers
--        AND (SUBSTR(dlr.dealer_number, 1, 1) NOT IN ('X','Y'))
--w/o Canada dealers
          AND (SUBSTR(dlr.dealer_number, 1, 1) NOT IN ('X','Y','N'))
          AND NVL(gl_code, 'MISC') NOT LIKE 'DDSS%'
          AND NVL(gl_code, 'MISC') NOT LIKE 'SSMX%'
          AND NVL(gl_code, 'MISC') NOT LIKE 'SSMU%'
          AND pdim.part_ind = 'Y'
          AND   dlr.dealer_key = regdist.dealer_key(+)
          AND dlr.dealer_key = pfact.dealer_key
--        AND regdist.region_district_key (+) = pfact.region_district_key       
          AND org.organization_key = pfact.organization_key
        AND pgrp.product_group = pvdr.product_group
--        AND cr.credit_reason_key = pfact.credit_reason_key                    
--        AND dship.direct_ship_program_key(+) = pfact.direct_ship_program_key
          AND ordr.order_type_key = pfact.order_type_key
          AND pri.priority_key = pfact.priority_key
          AND pdim.part_key = pfact.shipped_part_key
          AND pdim.organization_key = pfact.organization_key
          AND pvdr.price_vendor_key = pfact.price_vendor_key
          AND supp.supplier_key = pfact.supplier_key
        GROUP BY pfact.invoice_year,
                dlr.dealer_key,
                dlr.dealer_number,
                dlr.name,
                dlr.city||', '||dlr.state,
                dlr.country,
                dlr.classification,
                dlr.start_date_active,
                dlr.end_date_active,
                dlr.old_dealer_number,
                regdist.region_code,
                regdist.region_desc,
               regdist.regional_manager,
                regdist.district_code,
                regdist.district_desc,
                regdist.district_manager,
                dlr.facing_dealer,
                dlr.parts_parent_dealer,
--              dlr.parent_location,
                dlr.financial_report_dealer,
                dlr.consolidation_dealer,
                dlr.parts_management_group,
                org.organization_code,
                org.division_code,
                supp.supplier_code,
                supp.name,
                pgrp.product_group_desc,
--              pvdr.parent_code,
--              pvdr.part_classification,
--              pvdr.proprietary_ind,
--              dship.direct_ship_program,
                ordr.order_type,
                pri.priority_code;
Sorry to bother you with such a huge query!
Thanks,
Srinivas 
 
 
July      23, 2006 - 7:41 am UTC 
 
 
I'd be looking at my data to figure out what I did wrong.  I mean, you have the 8909 rows - debug it?
Not knowing your data as you do....
Not knowing how you built part and full rep (are they "correct" in the first place)
One cannot really say (and that is a rather large query, I'm not going to reverse engineer it here) 
 
 
 
Appreciate it!
Srinivas Narashimalu, July      23, 2006 - 10:42 pm UTC
 
 
Thanks for the response Tom!
I know it's a large query..I was sort of not comfortable posting that..but somehow wanted to know where I was going wrong as I have been troubleshooting for quite some time.
Anyhow thanks again for your response and greatly appreciate your politeness.
Thanks,
Srinivas 
 
 
Bringing 1 table in sync with the other
A reader, July      28, 2006 - 4:33 pm UTC
 
 
Given 2 tables with similar structure (same PK and N optional attributes)
How can the techniques discussed on this thread be used to bring Table A "in sync" with Table B? i.e. 
a. if the PK is the same, update the N optional attributes in Table A to be the same as those in Table B. 
b. If B doesn't have the row, delete it from A.
c. If A doesn't have the row, add it to A
i.e. B is the master table, I want to make A look like it.
Additionally, I would like to "log" each INSERT/UPDATE/DELETE I am doing to a audit trail, so I guess  a set-based approach is out?
Any ideas appreciated. Thanks 
 
July      28, 2006 - 9:01 pm UTC 
 
 
simple.
delete from a;
insert into a select * from b;
not sure of the point of the audit trail since a row in b could have been changed 500 times, or once? 
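If the wholesale delete/insert is too coarse, a hedged set-based alternative (not suggested above; assuming 9i+ MERGE, a key column pk and attributes c1, c2 standing in for the N optional columns) would be:
merge into a
using b
   on (a.pk = b.pk)
 when matched then
   update set a.c1 = b.c1, a.c2 = b.c2
 when not matched then
   insert (pk, c1, c2) values (b.pk, b.c1, b.c2);

delete from a where pk not in ( select pk from b );
The MERGE covers cases (a) and (c) from the question and the delete covers case (b); a per-row audit trail would still mean processing the rows one at a time (or via triggers) rather than in a single set operation.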
 
 
 
Checksum Based Approach
Thiyag, September 13, 2006 - 2:26 am UTC
 
 
Hi Tom:
Can you suggest a checksum-based approach for comparing table contents?  For example, say there is some algorithm which reads the contents of a table and gives a checksum value that could be used to compare against the content of another table in some other database.  This way the two databases need not be connected to perform the content check. 
 
September 13, 2006 - 7:25 am UTC 
 
 
select sum( checksum( col1||'/'||col2||'/'||....||'/'||colN) )
  from t;
perhaps. 
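A concrete sketch of that idea, assuming a hypothetical table T(col1, col2, col3) and 10g, where ORA_HASH is available (DBMS_UTILITY.GET_HASH_VALUE is an older alternative); the '/' separator keeps ('ab','c') from hashing the same as ('a','bc'):
select sum( ora_hash( col1 || '/' || col2 || '/' || col3 ) ) table_checksum
  from t
/
Run the same statement against the table in each database and compare the two numbers; equal checksums strongly suggest (but do not absolutely prove) equal contents.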
 
 
 
What is checksum?
Thiyag, September 13, 2006 - 9:44 am UTC
 
 
I am interested in knowing your opinion on how this checksum would need to be written.  Any suggestions?  Are there any already implemented packages available in Oracle which use a checksum, perhaps for a different purpose, that could be harnessed?  
 
September 13, 2006 - 2:49 pm UTC 
 
 
dbms_obfuscation_toolkit can create an md5 hash
dbms_crypto has more than just md5
they are checksum-like examples. 
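For example, a minimal per-row MD5 sketch (assuming 10g+, EXECUTE on DBMS_CRYPTO granted, and a hypothetical table T(col1, col2)):
declare
  l_hash raw(16);
begin
  for r in ( select col1 || '/' || col2 row_image from t )
  loop
    l_hash := dbms_crypto.hash( utl_raw.cast_to_raw( r.row_image ),
                                dbms_crypto.hash_md5 );
    dbms_output.put_line( rawtohex( l_hash ) );
  end loop;
end;
/
The per-row hashes (or a hash of the concatenated hashes) can then be compared offline between the two databases.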
 
 
 
multiple tables with same structure
satish, November  16, 2006 - 9:52 am UTC
 
 
TOM,
I have two tables for each day, like (t20060101, o20060101 ... etc., say 730 tables per year) with the same structure.  I need to query data from these tables, given column values and a start date and end date. 
Is this type of question answered somewhere? If so, please give me the link; if not, can you please give me the code?
satish 
 
November  16, 2006 - 3:20 pm UTC 
 
 
I would say you've done a bad thing.
You meant to partition
but you didn't
why not? 
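For reference, a minimal sketch of the range-partitioned alternative (hypothetical table and column names): one table, one partition per day, and a single query whose date predicate prunes to just the partitions in range:
create table daily_data (
  day_key  date,
  col1     number,
  col2     varchar2(30)
)
partition by range (day_key)
( partition p20060101 values less than (to_date('02-jan-2006','dd-mon-yyyy')),
  partition p20060102 values less than (to_date('03-jan-2006','dd-mon-yyyy'))
  -- ... one partition per day
)
/

select *
  from daily_data
 where day_key >= to_date('01-jan-2006','dd-mon-yyyy')
   and day_key <  to_date('16-jan-2006','dd-mon-yyyy')
/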
 
 
 
A reader, November  17, 2006 - 9:38 am UTC
 
 
 
 
A similar table comparison problem
sd, March     29, 2007 - 6:22 am UTC
 
 
Hello Tom,
i've got a similar situation to the original poster. I want to compare two 
master-detail tables efficiently in a PL/SQL procedure.
These are the simplified example tables. Each entry in the master table can
have an arbitrary number of entries in the detail table. Primary/foreign keys and indexes
are omitted here, obviously.
CREATE TABLE app_queue (
  id          NUMBER        NOT NULL,
  description VARCHAR2(100) NOT NULL
);
CREATE TABLE app_queue_detail (
  queue_id   NUMBER       NOT NULL,
  fieldname  VARCHAR2(40) NOT NULL,
  fieldvalue VARCHAR2(40) NOT NULL
);
CREATE TABLE app_notification (
  id          NUMBER        NOT NULL,
  description VARCHAR2(100) NOT NULL
);
CREATE TABLE app_notification_detail (
  notification_id NUMBER       NOT NULL,
  fieldname       VARCHAR2(40) NOT NULL,
  fieldvalue      VARCHAR2(40) NOT NULL
);
I want to return all app_notification and app_queue ids where all fieldnames
and fieldvalues of the notification match a queue's details. I've tried some
approaches with nested cursors but i wasn't really satisfied with them.
Example data follows.
Match, return app_queue.id and app_notification.id:
app_queue_detail   app_notification_detail
NAME joseph        NAME joseph
BDAY 1990
No match:
app_queue_detail   app_notification_detail
NAME joseph        NAME john
BDAY 1990          BDAY 19%
No match:
app_queue_detail   app_notification_detail
NAME joseph        FAVC gr%
BDAY 1990          
ZIPC 31337
Is there a possibility to turn this problem into an efficient query?
Thanks in advance. 
March     30, 2007 - 12:47 pm UTC 
 
 
need some inserts there
and an explanation of why the first one is a match.
 
 
 
data from single table
Rasika, May       24, 2007 - 1:11 pm UTC
 
 
Hi Tom,
I have a similar question, but this is with a single table. Let's say we audit changes to the records of a table and we transfer each modified record into a new audit table.
ex: if table T1 has columns (PK, C1, C2, C3) and our audit table is AUD_T1 (PK, C1, C2, C3, OPERATION_TYPE, TIMESTAMP, USER)
(In the audit table, PK is not actually a primary key column; it just keeps the values of the primary key column PK of T1.)
So when any change happens to T1, that new (or old) record is copied to AUD_T1 along with the operation type (insert, delete, update), the timestamp and the user.
so it could be as follows in AUD_T1
PK   C1   C2  C3   OPERATION_TYPE  TIMESTAMP  USER
1    a    b   c       UPDATE        t1         u1
1    a1   b   c1      UPDATE        t2         u1
1    a1   b1  c1      UPDATE        t3         u2
2    c    d   e       UPDATE        t4         u1
2    c1   d   e       UPDATE        t5         u2
So what I need to generate is a table as follows, using the above data in AUD_T1.
To make it easy, consider just the records with PK = 1:
TIMESTAMP  COLUMN   OLD_VALUE  NEW_VALUE
t2         C1       a          a1
t2         C3       c          c1
t3         C2       b          b1
And actually, if I can get it in the following format, that would be even better:
At t2 user u1 changed
-- 
 C1  a   a1
 C3  c   c1
At t3 user u2 changed
--
 C2  b   b1
This is basically to present a meaningful audit report to the user. We already keep records as 
in AUD_T1, but the problem is presenting them as above.
If you could provide a few SQL queries or PL/SQL stored procedures which could do such a thing, or point to any
existing question & answer link, that would be great.
We're using Oracle 10g
Thanks Tom     
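One possible way to produce that report from AUD_T1 is LAG plus a three-row cartesian join, along the lines of the techniques earlier in this thread. The following is only a sketch under assumptions: the TIMESTAMP and USER columns are renamed ts and usr to avoid clashing with Oracle keywords, and C1..C3 are assumed to be character columns (otherwise to_char them first):
with w as (
  select pk, ts, usr,
         row_number() over (partition by pk order by ts) rn,
         c1, lag(c1) over (partition by pk order by ts) old_c1,
         c2, lag(c2) over (partition by pk order by ts) old_c2,
         c3, lag(c3) over (partition by pk order by ts) old_c3
    from aud_t1
)
select ts, usr,
       decode( r, 1, 'C1', 2, 'C2', 3, 'C3' )       column_name,
       decode( r, 1, old_c1, 2, old_c2, 3, old_c3 ) old_value,
       decode( r, 1, c1,     2, c2,     3, c3 )     new_value
  from w, ( select level r from dual connect by level <= 3 )
 where rn > 1   -- skip the first audit row per PK, it has nothing to compare against
   and decode( decode( r, 1, c1,     2, c2,     3, c3 ),
               decode( r, 1, old_c1, 2, old_c2, 3, old_c3 ), 1, 0 ) = 0
 order by pk, ts, column_name;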
 
 
tom, please declare...
Ralph Buetikofer, January   14, 2009 - 5:33 am UTC
 
 
hi tom
thank you so far for lots of good tips! 
i have to compare several huge tables with millions of records in them. so i read this thread and found some good solutions, incl. the one from stefano and you. now i have to find the fastest way to compare our data.
therefore i have a table with only two cols, no pk and 5 million records. it's interesting for me to find that solution 1 is faster than your solution 2. did i forget something or is there something wrong in my example? i'd like you to let me know.
here are the examples:
1) A where A not in B union all B where B not in A
SELECT a.col1, a.col2
  FROM tab_a a
 WHERE (a.col1, a.col2) NOT IN
       (SELECT b.col1, b.col2
          FROM tab_b b)
UNION ALL
SELECT b.col1, b.col2
  FROM tab_b b
 WHERE (b.col1, b.col2) NOT IN
       (SELECT a.col1, a.col2
          FROM tab_a a);          
result: 0 records selected (compare was fine), 17 seconds
2) your solution described above
SELECT col1,
       col2,
       COUNT(src1) CNT1,
       COUNT(src2) CNT2
  FROM (SELECT a.col1, a.col2,
               1 src1,
               TO_NUMBER(NULL) src2
          FROM tab_a a
        UNION ALL
        SELECT b.col1, b.col2,
               TO_NUMBER(NULL) src1,
               2 src2
          FROM tab_b b)
 GROUP BY col1,
          col2
HAVING COUNT(src1) <> COUNT(src2);
result: 0 records selected (compare was fine), 25 seconds... (?)
btw: we're using oracle 10g r2
any ideas tom?
thank you for your reply. 
January   16, 2009 - 4:23 pm UTC 
 
 
tkprof it, post those results.
wall clock times are not very useful - probably they are "the same"
Your code does not handle nulls at all, hopefully col1 and col2 are NOT NULL..  That is, your query is not useful if any of the attributes are NULLABLE (so it would not matter how fast or slow it is :) )
ops$tkyte%ORA11GR1> create table tab_a ( col1 number, col2 number );
Table created.
ops$tkyte%ORA11GR1> create table tab_b ( col1 number, col2 number );
Table created.
ops$tkyte%ORA11GR1>
ops$tkyte%ORA11GR1> insert into tab_a values ( 1, 1 );
1 row created.
ops$tkyte%ORA11GR1> insert into tab_a values ( 1, null );
1 row created.
ops$tkyte%ORA11GR1> insert into tab_b values ( 1, 1 );
1 row created.
ops$tkyte%ORA11GR1> insert into tab_b values ( null, 1 );
1 row created.
ops$tkyte%ORA11GR1>
ops$tkyte%ORA11GR1> SELECT a.col1, a.col2
  2    FROM tab_a a
  3   WHERE (a.col1, a.col2) NOT IN
  4         (SELECT b.col1, b.col2
  5            FROM tab_b b)
  6  UNION ALL
  7  SELECT b.col1, b.col2
  8    FROM tab_b b
  9   WHERE (b.col1, b.col2) NOT IN
 10         (SELECT a.col1, a.col2
 11            FROM tab_a a);
no rows selected
ops$tkyte%ORA11GR1>
ops$tkyte%ORA11GR1> SELECT col1,
  2         col2,
  3         COUNT(src1) CNT1,
  4         COUNT(src2) CNT2
  5    FROM (SELECT a.col1, a.col2,
  6                 1 src1,
  7                 TO_NUMBER(NULL) src2
  8            FROM tab_a a
  9          UNION ALL
 10          SELECT b.col1, b.col2,
 11                 TO_NUMBER(NULL) src1,
 12                 2 src2
 13            FROM tab_b b)
 14   GROUP BY col1,
 15            col2
 16  HAVING COUNT(src1) <> COUNT(src2);
      COL1       COL2       CNT1       CNT2
---------- ---------- ---------- ----------
                    1          0          1
         1                     1          0
 
 
 
 
Matching from Single Table based on user input
Shravani, July      08, 2011 - 4:47 am UTC
 
 
Hi Tom,
Thanks so much for all your help and mentoring.
I have a requirement to compare records from the same table. The output should be based on the user's input. 
Script :
----------------------------------------------------
drop table dispatch_records;
create table dispatch_records (cust_no varchar2(10), dispatch_Date date, dispatch_zip varchar2(10));
insert into dispatch_records(cust_no, dispatch_date, dispatch_zip) values ('Cust_1', '01-Jan-2001', '07001');
insert into dispatch_records(cust_no, dispatch_date, dispatch_zip) values ('Cust_1', '11-Jan-2001', '07771');
insert into dispatch_records(cust_no, dispatch_date, dispatch_zip) values ('Cust_1', '22-Feb-2001', '08731');
insert into dispatch_records(cust_no, dispatch_date, dispatch_zip) values ('Cust_2', '22-Feb-2001', '07771');
insert into dispatch_records(cust_no, dispatch_date, dispatch_zip) values ('Cust_2', '22-Feb-2001', '08731');
insert into dispatch_records(cust_no, dispatch_date, dispatch_zip) values ('Cust_3', '27-Mar-2003', '07001');
insert into dispatch_records(cust_no, dispatch_date, dispatch_zip) values ('Cust_4', '27-Mar-2003', '07001');
insert into dispatch_records(cust_no, dispatch_date, dispatch_zip) values ('Cust_5', '01-Jan-2001', '08731');
insert into dispatch_records(cust_no, dispatch_date, dispatch_zip) values ('Cust_5', '01-Jan-2001', '07001');
insert into dispatch_records(cust_no, dispatch_date, dispatch_zip) values ('Cust_5', '01-Jan-2001', '07771');
insert into dispatch_records(cust_no, dispatch_date, dispatch_zip) values ('Cust_6', '22-Feb-2001', '07771');
insert into dispatch_records(cust_no, dispatch_date, dispatch_zip) values ('Cust_6', '22-Feb-2001', '08731');
insert into dispatch_records(cust_no, dispatch_date, dispatch_zip) values ('Cust_7', '22-Feb-2001', '10011');
insert into dispatch_records(cust_no, dispatch_date, dispatch_zip) values ('Cust_7', '22-Feb-2001', '10017');
commit;
----------------------------------------------------
The requirement is
If user selects option "All" then interface should display the customers where all the zipcode are matching
Output For "All ZipCode(s) Matching"
---------------------------------------------
Dispatch Location matching for : Cust_1, Cust_5
Dispatch Location matching for : Cust_2, Cust_6
Dispatch Location matching for : Cust_3, Cust_4
---------------------------------------------
If user selects option "Any 1" then it means display the customers for which any 1 zipcode is matching.
Output For "Any 1 ZipCode(s) Matching"
-----------------------------------------------
Dispatch Location matching for : Cust_1, Cust_2
Dispatch Location matching for : Cust_1, Cust_3
Dispatch Location matching for : Cust_1, Cust_4
Dispatch Location matching for : Cust_1, Cust_5
Dispatch Location matching for : Cust_1, Cust_6
Dispatch Location matching for : Cust_2, Cust_5
Dispatch Location matching for : Cust_2, Cust_6
Dispatch Location matching for : Cust_3, Cust_4
Dispatch Location matching for : Cust_3, Cust_5
Dispatch Location matching for : Cust_4, Cust_5
Dispatch Location matching for : Cust_5, Cust_6
---------------------------------------------
If user selects option "Any 2" then it means display the customers for which any 2 zip codes are matching.
Output For "Any 2 ZipCode(s) Matching"
-----------------------------------------------
Dispatch Location matching for : Cust_1, Cust_2
Dispatch Location matching for : Cust_1, Cust_5
Dispatch Location matching for : Cust_1, Cust_6
Dispatch Location matching for : Cust_2, Cust_5
Dispatch Location matching for : Cust_2, Cust_6
Dispatch Location matching for : Cust_5, Cust_6
---------------------------------------------
Can you please suggest an effective way in SQL to achieve this?
Thanks & Regards
Shravani 
July      08, 2011 - 2:59 pm UTC 
 
 
ops$tkyte%ORA11GR2> with t
  2  as
  3  ( select distinct cust_no, dispatch_zip zip,
  4           count(distinct dispatch_zip) over (partition by cust_no) cnt
  5      from dispatch_records
  6  )
  7  select t1.cust_no, t2.cust_no
  8    from t t1, t t2
  9   where t1.cust_no < t2.cust_no
 10     and t1.zip = t2.zip
 11   group by t1.cust_no, t2.cust_no
 12  having max(t1.cnt) = max(t2.cnt)
 13     and max(t1.cnt) = count(*)
 14   order by 1
 15  /
CUST_NO    CUST_NO
---------- ----------
Cust_1     Cust_5
Cust_2     Cust_6
Cust_3     Cust_4
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> with t
  2  as
  3  ( select distinct cust_no, dispatch_zip zip
  4      from dispatch_records
  5  )
  6  select distinct t1.cust_no, t2.cust_no
  7    from t t1, t t2
  8   where t1.cust_no < t2.cust_no
  9     and t1.zip = t2.zip
 10   order by 1
 11  /
CUST_NO    CUST_NO
---------- ----------
Cust_1     Cust_2
Cust_1     Cust_3
Cust_1     Cust_4
Cust_1     Cust_5
Cust_1     Cust_6
Cust_2     Cust_5
Cust_2     Cust_6
Cust_3     Cust_4
Cust_3     Cust_5
Cust_4     Cust_5
Cust_5     Cust_6
11 rows selected.
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> with t
  2  as
  3  ( select distinct cust_no, dispatch_zip zip
  4      from dispatch_records
  5  )
  6  select t1.cust_no, t2.cust_no
  7    from t t1, t t2
  8   where t1.cust_no < t2.cust_no
  9     and t1.zip = t2.zip
 10   group by t1.cust_no, t2.cust_no
 11  having count(*) >= 2
 12   order by 1
 13  /
CUST_NO    CUST_NO
---------- ----------
Cust_1     Cust_2
Cust_1     Cust_5
Cust_1     Cust_6
Cust_2     Cust_5
Cust_2     Cust_6
Cust_5     Cust_6
6 rows selected.
 
 
 
 
Thanks so much for the simple solution.
Shravani, July      11, 2011 - 3:12 am UTC
 
 
 
 
Result Set is Huge
Shravani, July      27, 2011 - 6:19 am UTC
 
 
Hi Tom,
This is regarding the question "Matching from Single Table based on user input" asked in the same thread.
Actually I want to use the query logic you gave to pass input to another program.
The queries are working fine for small volumes; however, for higher volumes they return huge amounts of data. For example, for 100K records in the "dispatch_records" table the query returns 12M records, and in production the volume of this table is more than 1M rows.
Is it possible to get the output only at customer level and
also is it possible to get the output using "dispatch_records" table just once?
Regards
Shravani 
July      28, 2011 - 7:01 pm UTC 
 
 
state the question entirely here - do not make me read up and down on the page.  give YOUR example, explained in your words, with your data.
I did not understand this:
Is it possible to get the output only at customer level and
also is it possible to get the output using "dispatch_records" table just once?
 
 
 
 
Result Set is Huge 
Shravani, July      31, 2011 - 10:40 am UTC
 
 
Sorry Tom... I will phrase my question properly.
For further processing, I want to get the customers for whom the dispatch zip codes are common. Basically I want to group the output based on what is matching.
Here is the table structure and data.
drop table dispatch_records;
create table dispatch_records (cust_no varchar2(10), dispatch_Date date, dispatch_zip 
varchar2(10));
------------------------------------------------------
insert into dispatch_records(cust_no, dispatch_zip) values ('Cust_1', '07001');
insert into dispatch_records(cust_no, dispatch_zip) values ('Cust_1', '07771');
insert into dispatch_records(cust_no, dispatch_zip) values ('Cust_1', '08731');
insert into dispatch_records(cust_no, dispatch_zip) values ('Cust_2', '07771');
insert into dispatch_records(cust_no, dispatch_zip) values ('Cust_2', '08731');
insert into dispatch_records(cust_no, dispatch_zip) values ('Cust_3', '07001');
insert into dispatch_records(cust_no, dispatch_zip) values ('Cust_4', '07001');
insert into dispatch_records(cust_no, dispatch_zip) values ('Cust_5', '08731');
insert into dispatch_records(cust_no, dispatch_zip) values ('Cust_5', '07001');
insert into dispatch_records(cust_no, dispatch_zip) values ('Cust_5', '07771');
insert into dispatch_records(cust_no, dispatch_zip) values ('Cust_6', '07771');
insert into dispatch_records(cust_no, dispatch_zip) values ('Cust_6', '08731');
insert into dispatch_records(cust_no, dispatch_zip) values ('Cust_7', '10011');
insert into dispatch_records(cust_no, dispatch_zip) values ('Cust_7', '10017');
commit;
------------------------------------------------------
The query you gave in previous thread for this is 
with t
as
( select distinct cust_no, dispatch_zip zip
  from dispatch_records
)
select distinct t1.cust_no, t2.cust_no
from t t1, t t2
where t1.cust_no < t2.cust_no
and t1.zip = t2.zip
order by 1
/
And the output is 
CUST_NO    CUST_NO
---------- ----------
Cust_1     Cust_2
Cust_1     Cust_3
Cust_1     Cust_4
Cust_1     Cust_5
Cust_1     Cust_6
Cust_2     Cust_5
Cust_2     Cust_6
Cust_3     Cust_4
Cust_3     Cust_5
Cust_4     Cust_5
Cust_5     Cust_6
--------------------
The query does give me the required output; however, since it returns each combination, the result set becomes very large. 
Is it possible to get something like the output below? This would enable me to pass fewer records to the next program. Basically, once I have the below output, I will apply DISTINCT after removing the ZIP.
----------------------------------------
Cust_Number Zip Rank
-----------------------------------------
Cust_1  07771 1
Cust_1  08731 1
Cust_2  07771 1
Cust_2  08731 1
Cust_5  07771 1
Cust_5  08731 1
Cust_6  07771 1 
Cust_6  08731 1
 
Cust_1  07001 2
Cust_1  07771 2
Cust_1  08731 2
Cust_5  07001 2 
Cust_5  07771 2
Cust_5  08731 2
Cust_1  07001 3
Cust_3  07001 3
Cust_4  07001 3
Cust_5  07001 3
------------------------------------
Regards
Shravani
 
August    01, 2011 - 11:48 am UTC 
 
 
explain your output.  It is not immediately obvious to me what it represents.
and if you are trying to return less data to the client, why is your desired output LARGER than my original output? 
 
 
Shravani, August    02, 2011 - 6:48 am UTC
 
 
Hi Tom,
Thanks so much for the response. Basically I want to send the customers based on the groups. In the below output I am getting the customers in pairs. All the customers with a RANK column value of 1 are the ones who have the 07771 and 08731 zip codes in common. The ones with a RANK column value of 2 have 3 zip codes in common. 
Cust_Number    Zip    Rank
-----------------------------------------
Cust_1        07771    1
Cust_1        08731    1
Cust_2        07771    1
Cust_2        08731    1
Cust_5        07771    1
Cust_5        08731    1
Cust_6        07771    1    
Cust_6        08731    1
    
Cust_1        07001    2
Cust_1        07771    2
Cust_1        08731    2
Cust_5        07001    2    
Cust_5        07771    2
Cust_5        08731    2
Cust_1        07001    3
Cust_3        07001    3
Cust_4        07001    3
Cust_5        07001    3
------------------------------------
Basically my final goal is to send the distinct customers based on the common zip codes they represent.
My final goal is to get the output as:
 
Cust_Number  Rank
-------------------------
Cust_1  1
Cust_2  1
Cust_5  1
Cust_6  1
Cust_1  2
Cust_2  2
Cust_1  3
Cust_3  3
Cust_4  3
Cust_5  3
The above output clearly gives the distinct groupings of customers based on the matching zip codes.
Regards
Shravani 
August    02, 2011 - 7:12 am UTC 
 
 
how does cust_2 get a rank of 2 in your final output?????
What appears to be "clear" to you is "muddy" to me.
 
 
 
Sorry for the typo 
Shravani, August    02, 2011 - 8:00 am UTC
 
 
Sorry once again Tom for my stupid mistakes. The final output should be 
Cust_Number     Rank
-------------------------
Cust_1        1
Cust_2        1
Cust_5        1
Cust_6        1
Cust_1        2
Cust_5        2
Cust_1        3
Cust_3        3
Cust_4        3
Cust_5        3
Regards
Shravani 
August    02, 2011 - 8:38 am UTC 
 
 
why isn't cust_3 in rank 1?
clear as mud I'd say?
Here is what I think you want - again, the answer is LARGER - but whatever.  Your output does not compute for me at all
ops$tkyte%ORA11GR2> with t
  2  as
  3  ( select distinct cust_no, dispatch_zip zip
  4    from dispatch_records
  5    )
  6    select distinct t1.cust_no t1_cust_no, t2.cust_no t2_cust_no
  7    from t t1, t t2
  8    where t1.cust_no < t2.cust_no
  9    and t1.zip = t2.zip
 10    order by 1
 11  /
T1_CUST_NO T2_CUST_NO
---------- ----------
Cust_1     Cust_2
Cust_1     Cust_3
Cust_1     Cust_4
Cust_1     Cust_5
Cust_1     Cust_6
Cust_2     Cust_5
Cust_2     Cust_6
Cust_3     Cust_4
Cust_3     Cust_5
Cust_4     Cust_5
Cust_5     Cust_6
11 rows selected.
ops$tkyte%ORA11GR2> with t
  2  as
  3  ( select distinct cust_no, dispatch_zip zip
  4    from dispatch_records
  5    )
  6    select distinct new_cust_no, dense_rank() over (order by t1_cust_no)
  7      from (
  8    select t1_cust_no, nvl(t2_cust_no,t1_cust_no) new_cust_no
  9      from (
 10    select distinct t1.cust_no t1_cust_no, t2.cust_no t2_cust_no
 11    from t t1, t t2
 12    where t1.cust_no < t2.cust_no
 13    and t1.zip = t2.zip
 14          )
 15   group by grouping sets((t1_cust_no),(t1_cust_no,t2_cust_no))
 16         )
 17    order by 2
 18  /
NEW_CUST_N DENSE_RANK()OVER(ORDERBYT1_CUST_NO)
---------- -----------------------------------
Cust_1                                       1
Cust_2                                       1
Cust_3                                       1
Cust_4                                       1
Cust_5                                       1
Cust_6                                       1
Cust_2                                       2
Cust_5                                       2
Cust_6                                       2
Cust_3                                       3
Cust_4                                       3
Cust_5                                       3
Cust_4                                       4
Cust_5                                       4
Cust_5                                       5
Cust_6                                       5
16 rows selected.
 
 
 
 
Sorry Again Tom.....
Shravani, August    03, 2011 - 5:40 am UTC
 
 
Hi Tom,
Thanks so much for giving your valuable time on my question.
And I am extremely sorry, Tom, that I am not explaining the requirements properly. 
Cust_3 is not in Rank 1 because it matches on different zip codes.
This time I will try my best to explain the requirement with the expected result set.
I customized your query to pull the customers and make the groups based on the zip codes and the number of matching zip codes.
select 
    c1, c2, rnk, zip
from
(
    select distinct c1, c2, zip,
           dense_rank() over (order by sum_zip) Rnk
    from   
    (
    with t
        as
        ( select distinct cust_no, dispatch_zip zip
            from dispatch_records order by 1
        )
        select t1.cust_no c1, t2.cust_no c2, t1.zip,  
               sum(t1.zip) over(partition by t1.cust_no, t2.cust_no) sum_zip
          from t t1, t t2
         where t1.cust_no < t2.cust_no
           and t1.zip = t2.zip
    )
)
order by 3, 1, 2, 4
Output 
===============================
C1      C2   RNK      ZIP
===============================
Cust_1 Cust_3 1 07001  --------
Cust_1 Cust_4 1 07001  |
Cust_3 Cust_4 1 07001  Customers in this set have one zip code
Cust_3 Cust_5 1 07001  07001 common and hence are grouped.
Cust_4 Cust_5 1 07001 --------|
Cust_1 Cust_2 2 07771 -------- 
Cust_1 Cust_2 2 08731  |
Cust_1 Cust_6 2 07771  |
Cust_1 Cust_6 2 08731  |
Cust_2 Cust_5 2 07771  Customers in this set have 08731 and 07771
Cust_2 Cust_5 2 08731  zip codes in common and hence are grouped.
Cust_2 Cust_6 2 07771  |
Cust_2 Cust_6 2 08731  |
Cust_5 Cust_6 2 07771  |
Cust_5 Cust_6 2 08731 --------
Cust_1 Cust_5 3 07001 --------
Cust_1 Cust_5 3 07771  Customers in this set have 3 zip codes in common 
Cust_1 Cust_5 3 08731 --------
===============================
I am not able to roll up the above output further to get the final output as 
======================
Cust_No  Rank
======================
Cust_1  1 -------- 
Cust_3  1  Customers with Only zip code (07001)
Cust_4  1  matching
Cust_5  1 --------
Cust_1  2 -------- 
Cust_2  2  Customers with 2 zip codes (08731 and 
Cust_5  2  07771) matching. 
Cust_6  2 --------
Cust_1  3 ------- Customers with 3 zip codes (07001, 07771 and 08731)
Cust_5  3 --------matching
=====================
I sincerely request you to spend a few minutes on this requirement one last time.
Regards
Shravani
 
August    03, 2011 - 7:54 am UTC 
 
 
this has nothing to do with the original query.  And you are right - you didn't really explain this before.
Assuming your query is correct, just use the same grouping sets technique I used above:
ops$tkyte%ORA11GR2> select distinct nvl(c2, c1), rnk
  2    from (
  3  select c1, c2, rnk, zip
  4  from
  5  (
  6      select distinct c1, c2, zip,
  7             dense_rank() over (order by sum_zip) Rnk
  8      from
  9      (
 10      with t
 11          as
 12          ( select distinct cust_no, dispatch_zip zip
 13              from dispatch_records order by 1
 14          )
 15          select t1.cust_no c1, t2.cust_no c2, t1.zip,
 16                 sum(t1.zip) over(partition by t1.cust_no, t2.cust_no) sum_zip
 17            from t t1, t t2
 18           where t1.cust_no < t2.cust_no
 19             and t1.zip = t2.zip
 20      )
 21  )
 22      )
 23  group by grouping sets( (c1,rnk),(c1,c2,rnk) )
 24  order by 2, 1
 25  /
NVL(C2,C1)        RNK
---------- ----------
Cust_1              1
Cust_3              1
Cust_4              1
Cust_5              1
Cust_1              2
Cust_2              2
Cust_5              2
Cust_6              2
Cust_1              3
Cust_5              3
10 rows selected.
 
 
 
 
Thanks so much 
Shravani, August    03, 2011 - 9:20 am UTC
 
 
Thanks so much Tom. This is exactly what I wanted.
I just have one question. In the above query, since the zip codes were all numeric, it was possible to use SUM(zip) and then apply DENSE_RANK. But suppose we have non-numeric data; how can we group it then?
Here are the scripts for your ready reference.
drop table dispatch_records;
create table dispatch_records (cust_no varchar2(10), dispatch_Date date, dispatch_zip 
varchar2(20));
------------------------------------------------------
insert into dispatch_records(cust_no, dispatch_zip) values ('Cust_1', 'Edison');
insert into dispatch_records(cust_no, dispatch_zip) values ('Cust_1', 'Bridgwater');
insert into dispatch_records(cust_no, dispatch_zip) values ('Cust_1', 'Forked River');
insert into dispatch_records(cust_no, dispatch_zip) values ('Cust_2', 'Bridgwater');
insert into dispatch_records(cust_no, dispatch_zip) values ('Cust_2', 'Forked River');
insert into dispatch_records(cust_no, dispatch_zip) values ('Cust_3', 'Edison');
insert into dispatch_records(cust_no, dispatch_zip) values ('Cust_4', 'Edison');
insert into dispatch_records(cust_no, dispatch_zip) values ('Cust_5', 'Forked River');
insert into dispatch_records(cust_no, dispatch_zip) values ('Cust_5', 'Edison');
insert into dispatch_records(cust_no, dispatch_zip) values ('Cust_5', 'Bridgwater');
insert into dispatch_records(cust_no, dispatch_zip) values ('Cust_6', 'Bridgwater');
insert into dispatch_records(cust_no, dispatch_zip) values ('Cust_6', 'Forked River');
insert into dispatch_records(cust_no, dispatch_zip) values ('Cust_7', 'Avenel');
insert into dispatch_records(cust_no, dispatch_zip) values ('Cust_7', 'South Jersey');
commit;
------------------------------------------------------
With this data, how can we get the output as
======================
Cust_No  Rank
======================
Cust_1  1 -------- 
Cust_3  1  Customers with Only zip code (Edison)
Cust_4  1  matching
Cust_5  1 --------
Cust_1  2 -------- 
Cust_2  2  Customers with 2 zip codes (Bridgwater and 
Cust_5  2  Forked River) matching. 
Cust_6  2 --------
Cust_1  3 ------- Customers with 3 zip codes (Edison, Bridgwater and Forked River)
Cust_5  3 --------matching
=====================
Regards
Shravani 
August    03, 2011 - 9:54 am UTC 
 
 
question for you, what if the zip codes are:
10000
20000
16000
14000
for two groups - wouldn't that mess up your query as well.  Like I said, I'm assuming your query works - maybe it doesn't.  I didn't really analyze it.
you'd need to spec out the inner query for us, in detail - very very clearly.
I simply took your work and made it return what you wanted.  If your work doesn't work... then you'll need to work on that bit. 
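To make the collision concern concrete, one way around it is to group each customer pair by the actual ordered list of matching zips rather than their SUM, so two different zip sets can never fold into the same value. This is only a hedged sketch, assuming 11gR2 (for LISTAGG) and the dispatch_records table from above:
with t as (
  select distinct cust_no, dispatch_zip zip
    from dispatch_records
),
pairs as (
  select t1.cust_no c1, t2.cust_no c2, t1.zip
    from t t1, t t2
   where t1.cust_no < t2.cust_no
     and t1.zip = t2.zip
),
grouped as (
  select c1, c2,
         listagg( zip, ',' ) within group ( order by zip ) zip_set
    from pairs
   group by c1, c2
)
select c1, c2, zip_set,
       dense_rank() over ( order by zip_set ) rnk
  from grouped
 order by rnk, c1, c2;
The rank here is simply a dense_rank over the distinct zip-set strings, so its ordering is arbitrary, but the grouping itself cannot collide the way a SUM of zips can, and it works for non-numeric values too.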
 
 
Shravani, August    03, 2011 - 10:51 am UTC
 
 
Sorry Again Tom.
Believe me, this is the first time in the AskTom forum that I am not able to explain properly what I want.
Actually my question is more about applying analytic functions across more than one record, but I am not doing a good job of phrasing it.
For the below data, the grouping should be based on the matching zip codes. I used a very dirty method of doing a SUM(ZIP) OVER (PARTITION BY T1.CUST_NO, T2.CUST_NO).
The thinking here was to set up a base value on which the analytic functions would work.
My question is: without using SUM(), can we group the data? Basically, how do we tell Oracle to apply the rank based on the matching rows?
Output 
===============================
C1         C2             ZIP
===============================
Cust_1    Cust_3        07001    
Cust_1    Cust_4        07001     
Cust_3    Cust_4        07001     
Cust_3    Cust_5        07001     
Cust_4    Cust_5        07001    
Cust_1    Cust_2        07771    
Cust_1    Cust_2        08731    
Cust_1    Cust_6        07771    
Cust_1    Cust_6        08731    
Cust_2    Cust_5        07771    
Cust_2    Cust_5        08731    
Cust_2    Cust_6        07771    
Cust_2    Cust_6        08731    
Cust_5    Cust_6        07771    
Cust_5    Cust_6        08731    
Cust_1    Cust_5        07001    
Cust_1    Cust_5        07771    
Cust_1    Cust_5        08731   
===============================
Regards
Shravani 
August    03, 2011 - 12:39 pm UTC 
 
 
what happens when you have customers that match twice but with different zips?  
eg: 
c1   X
c2   X
c1   Y
c2   Y
c3   A
c4   B
c3   A
c4   B
c1 and c2 match twice, c3 and c4 match twice - but they match on different "zips"
You need to be really really precise.  It has nothing to do with asking for the first time in a forum - we are all software developers.  Our *MOST IMPORTANT* skill - the skill you need more than any - is the skill of putting on paper what actually needs to be done.
You know, write a specification.  From start to finish, in sufficient detail that a programmer can generate code from it.
Otherwise - you get garbage.
So, please spec it out - clearly, in detail, precisely.
Yes - it is hard, yes it is a pain, no - you cannot get out of doing it.  It is what we do after all. 
 
 
Sorry once again Tom
rupalimay21@gmail.com, August    04, 2011 - 11:39 am UTC
 
 
Hi Tom,
Hats off to you for the below line
"we are all software developers. Our *MOST IMPORTANT* skill - the skill you need more than any - is the skill of 
putting on paper what actually needs to be done. "
Very true Tom.
I will give you a little background on the business requirement. The client I am working for acquired a few small companies and the data got integrated successfully.
Now, in a few of the ledgers, we are finding that there are a lot of duplicate customer records. Our functional experts came up with an algorithm to identify the duplicates.
They gave a set of rules like:
Rule 1. Customers with the same registration date
Rule 2. Customers with the same zip codes (this rule has to be dynamic, meaning in the program we should have the flexibility to increase the number of matching zip codes)
Rule 3. Customers who have consumed goods worth more than 5M in the last 2 years
and so on. 
Note: all the rules are combined with AND, and the rules are arranged by granularity, meaning Rule 1 is at a higher level of granularity than Rule 2. 
As per the functional team, the rules can change any time, hence they are looking for some kind of framework which will identify duplicates based on the defined rules.
We have created a stage table which holds all the customers. Using PL/SQL we process each rule and then flag the customer codes based on the group.
For example, for the below scenario after the Rule_1 process:
   Cust_1, Cust_2, Cust_3 share the same registration date. 
   Cust_4, Cust_5         share the same registration date. 
DUPS_BASE
---------------------------------------------------------------------
CUST_NO  Rule_1  Rule_2  Rule_3
----------------------------------------------------------------------
Cust_1  1
Cust_2  1
Cust_3  1
Cust_4  2
Cust_5  2
----------------------------------------------------------------------
Once Rule_1 is processed, the process takes the second rule and Cust_1, Cust_2, Cust_3 get evaluated against Rule_2.
I asked a question specific to Rule_2. Basically I want to bundle customers into a group, so that the next process will pick them up and apply the next rule.
Now to answer your question, if 
c1  X 
c2  X 
c1  Y 
c2  Y 
then C1 and C2 will be grouped under the same ranking. Basically, all the customers are grouped based on the matching zip codes, so that the next rule will be applied on that set of customers.
Tom, I am not sure whether I explained the requirement properly. I will not request you to spend time on this again, because I know I have wasted a lot of your time already.
Thanks so much for all your advices Tom.
Regards
Shravani 
August    04, 2011 - 11:48 am UTC 
 
 
I asked about this:
eg: 
c1  X 
c2  X 
c1  Y 
c2  Y 
c3  A 
c4  B 
c3  A 
c4  B 
c1 and c2 match twice, c3 and c4 match twice - but they match on different "zips" 
not just c1, c2 - what is the output for c1,c2,c3,c4
Also, cust_1 might appear in MANY groups then right? - did you/do you consider that?
cust_1 might have 5 zip codes, of which 3 match cust_2, 3 match cust_3, 3 match cust_4 - but cust_2,3,4 do not match each other three times.  What then?  What does the output look like at that point?
 
 
 
Differences in data content
Eric, November  12, 2011 - 9:57 pm UTC
 
 
To help me compare the contents of two tables for a project I am working on, I looked at ideas for a year and ended up writing the following. I hope it is not a really old idea. It populates a table with the primary key, the column name and the data from both tables. In short, the primary key is reported as a tilde-delimited value, the two tables are joined by their primary keys and the differences are reported. I have not used this for tables with more than 10 million rows. On smaller tables it will compare about 20 million values a minute on a slow server. Not sure how it handles big tables. One downside is that NVL forces a full table scan, but we are looking at the entire table anyway.
--
create table data_differences ( pkey_string varchar2(200) ,
  column_name  varchar2(30) ,  table1_name  varchar2(30)  ,
  table2_name   varchar2(30),  table1_value varchar2(240) ,
  table2_value varchar2(240),  job_date date )
 
create table data_job_status ( column_name varchar2(30) ,
   transaction_date date )
 
create table table_one ( id number(4) , source varchar2(12) ,
name varchar2(20) , transaction_date date , userid varchar2(20) )
 
create table table_two  as select * from table_one
 
alter table table_one add ( constraint table_one_pk primary key ( id , source ) )
alter table table_two add ( constraint table_two_pk primary key ( id , source ) )
 
insert into table_one values ( 1 , 'CORP' , 'SMITH'   ,
  TO_DATE ( '11082011' , 'MMDDYYYY' ) , 'USER10' )
insert into table_one values ( 2 , 'CORP' , 'SMITH'   ,
  TO_DATE ( '11081999' , 'MMDDYYYY' ) , 'USER11' )
insert into table_one values ( 3 , 'CORP' , 'SMITH'   ,
  TO_DATE ( '11081999' , 'MMDDYYYY' ) , 'USER15' )
 
insert into table_two values ( 1 , 'CORP' ,'SMITH'   ,
  TO_DATE ( '11082011' , 'MMDDYYYY' ) , 'USER1' )
insert into table_two values ( 2 , 'CORP' , 'JOHNSON' ,
  TO_DATE ( '11081999' , 'MMDDYYYY' ) , 'USER1' )
insert into table_two values ( 3 , 'CORP' , 'JOHNSON' ,
  TO_DATE ( '11082011' , 'MMDDYYYY' ) , 'USER1' )
--
-- 2 of 3 rows returned
PKEY_STRING     COLUMN_NAME       TABLE1_NAME     TABLE2_NAME     TABLE1_VALUE     TABLE2_VALUE 
--------------  ----------------  --------------  --------------  ---------------  ------------
2~CORP          NAME              TABLE_ONE       TABLE_TWO       SMITH            JOHNSON
3~CORP          TRANSACTION_DATE  TABLE_ONE       TABLE_TWO       08-NOV-99        08-NOV-11
--
-- BEGIN
--   compare_table_content ( 'GUEST' , 'GUEST' , 'TABLE_ONE' , 'TABLE_TWO' );
-- END;
--
CREATE OR REPLACE PROCEDURE compare_table_content
( in_owner1 in varchar , in_owner2 in varchar ,
   in_table1 in varchar , in_table2 in varchar ) is
  n       number(8);
  x       number(8);
  c_col   varchar2(30);    /* column name */
  c_nam   varchar2(30);    /* primary key name */
  d_typ   varchar2(12);    /* data type */
  q_col   varchar2(30);    /* column for primary key */
  char_col varchar2(32);   /* column name in quotes */
  ch_table1 varchar2(32);
  ch_table2 varchar2(32);
  q_str   varchar2(500);   /* query string */
  x_str   varchar2(1000);  /* execute string */
  ins_str varchar2(2000);  /* insert string */
  key_str varchar2(500);   /* delimited primary key */
  whr_str varchar2(2000);  /* where string */
  nls_dte varchar2(20) := '''DD-MON-YYYY'''; /* date fmt */
--
-- default values NVL function
--
  nvl_chr varchar2(20) := '''QqQqQq''';
  nvl_dte varchar2(40) := 'sysdate + 100';
--
-- Get primary key
--
  cursor c1 ( in_owner in varchar , in_table in varchar  ) is
   select a.constraint_name , a.column_name , a.position
   from all_cons_columns a , all_constraints b
    where a.table_name = upper ( in_table )
      and b.table_name = a.table_name
      and b.owner      = upper ( in_owner )
      and a.owner      =  b.owner
      and a.constraint_name = b.constraint_name
      and b.constraint_type = 'P'
     order by a.position;
--
-- Get column names not in the primary key and not excluded.
--  In this example USERID has been excluded.
--
  cursor c2 ( in_owner in varchar , in_table in varchar ,
              pkey_name in varchar ) is
    select a.column_name , a.data_type
     from all_tab_columns a
       where a.table_name = upper ( in_table )
         and a.owner = upper ( in_owner )
         and a.data_type in ( 'VARCHAR2' , 'NUMBER' , 'DATE' )
         and not exists ( select 1 from all_cons_columns b
         where a.owner       = b.owner
           and a.table_name  = b.table_name
           and b.constraint_name = upper ( pkey_name )
           and a.column_name = b.column_name )
       and a.column_name != 'USERID'
    order by a.column_id;
 
begin
 
  execute immediate 'truncate table data_job_status reuse storage';
  execute immediate 'truncate table data_differences reuse storage';
  ch_table1 := chr(39)||upper ( in_table1 )||chr(39);
  ch_table2 := chr(39)||upper ( in_table2 )||chr(39);
 
  open c1 ( in_owner1 , in_table1 );
   loop
    fetch c1 into c_nam , q_col , n;
     exit when c1%notfound;
 
    if ( n = 1 ) then
 
     ins_str := 'insert into data_differences select ';
     whr_str := ' where a.'||q_col||' = b.'||q_col;
     key_str := ' a.'||q_col;
 
    else
 
     whr_str := whr_str||' and a.'||q_col||' = b.'||q_col;
     key_str := key_str||'||'||chr(39)||'~'||chr(39)||'|| a.'||q_col;
 
    end if;
 
   end loop;
  close c1;
 
  open c2 (  in_owner1 , in_table1 , c_nam );
   loop
    fetch c2 into c_col , d_typ;
     exit when c2%notfound;
 
     char_col := chr(39)||c_col||chr(39);
 
     if ( d_typ = 'NUMBER' ) then
 
      q_str := ' AND NVL ( A.'||c_col||' , -99.9191 ) '||
       '          != NVL ( B.'||c_col||' , -99.9191 )';
 
      x_str := ins_str||' '||key_str||' , '||char_col||
       ' , '||ch_table1||' , '||ch_table2||' , '||
       ' to_char ( a.'||c_col||' ) , to_char ( b.'||c_col||' ) '||
       ' , sysdate from '||
       ' '||in_owner1||'.'||in_table1||' a , '||in_owner2||
       '.'||in_table2||' b '||
        whr_str||'  '||q_str;
 
     elsif ( d_typ = 'DATE' ) then
 
      q_str := ' AND NVL ( A.'||c_col||' , '||nvl_dte||' ) '||
        '         != NVL ( B.'||c_col||' , '||nvl_dte||' ) ';
 
      x_str := ins_str||' '||key_str||' , '||char_col||
       ' , '||ch_table1||' , '||ch_table2||' , '||
       ' to_char ( a.'||c_col||' , '||nls_dte||' ) , '||
       ' to_char ( b.'||c_col||' , '||nls_dte||' ) , sysdate '||
       ' FROM '||in_owner1||'.'||in_table1||' a , '||in_owner2||
       '.'||in_table2||' b '||whr_str||'  '||q_str;
 
     else
 
      q_str := ' AND NVL ( A.'||c_col||' , '||nvl_chr||'  )'||
       '          != NVL ( B.'||c_col||' , '||nvl_chr||'  )';
 
      x_str := ins_str||' '||key_str||' , '||char_col||
       ' , '||ch_table1||' , '||ch_table2||' , '||
       ' a.'||c_col||' , b.'||c_col||' , sysdate '||
       ' FROM '||in_owner1||'.'||in_table1||' a , '||in_owner2||
       '.'||in_table2||' b '||whr_str||'  '||q_str;
 
     end if;
 
     insert into data_job_status values ( c_col , sysdate );
     commit;
 
     execute immediate ( x_str );
     commit;
 
    end loop;
   close c2;
 
   commit;
 
end compare_table_content;
/
 
 
Comparing versioned records
Parthiban Nagarajan, March     03, 2012 - 2:39 am UTC
 
 
drop table my_cmp purge;
-- This is the main table
-- VERSION_NO is incremented for each UPDATE
-- The record before the change is copied to another table (my_cmp_audit - given below)
create table my_cmp
(
    key_col1x varchar2(10),
    key_col2x varchar2(10),
    info_col1x date,
    info_col2x number,
    oth_col1x varchar2(10),
    oth_col2x date,
    oth_col3x number,
    version_no number,
    constraint my_cmp_pk primary key (key_col1x, key_col2x)
);
begin
    insert into my_cmp values ('k1', '1', to_date('16feb1985', 'dd-mon-yyyy'), 27, 'Hello', sysdate, 10, 5);
    insert into my_cmp values ('k1', '2', to_date('03mar1970', 'dd-mon-yyyy'), 42, 'World', sysdate, 10, 2);
end;
/
drop table my_cmp_audit purge;
-- This table contains the history of records from my_cmp
create table my_cmp_audit
(
    key_col1x varchar2(10),
    key_col2x varchar2(10),
    info_col1x date,
    info_col2x number,
    oth_col1x varchar2(10),
    oth_col2x date,
    oth_col3x number,
    version_no number,
    constraint my_cmp_audit_pk primary key (key_col1x, key_col2x, version_no)
);
-- Sample data
begin
    insert into my_cmp_audit values ('k1', '1', to_date('16feb1985', 'dd-mon-yyyy'), 27, 'Hello', sysdate       , 10, 4);
    insert into my_cmp_audit values ('k1', '1', to_date('16feb1985', 'dd-mon-yyyy'), 27, 'Hell' , trunc(sysdate), 10, 3);
    insert into my_cmp_audit values ('k1', '1', to_date('16feb1985', 'dd-mon-yyyy'), 27, 'Help' , sysdate       , 10, 2);
    insert into my_cmp_audit values ('k1', '1', to_date('16feb1985', 'dd-mon-yyyy'), 1 , 'xxxxx', sysdate-3333  ,  5, 1);
    --
    insert into my_cmp_audit values ('k1', '2', to_date('03mar1970', 'dd-mon-yyyy'), 42, 'World', sysdate - 50, 10, 1);
end;
/
-- This view shows us the record level comparison side-by-side (using self-join)
-- INFO_COL is not compared; OTH_COLs are compared
-- OTH_COL is given $ suffix if it is from the AUDIT table
-- But this view information is just half the way
create or replace view MY_CMP_VU1 as 
  with all_recs as ( 
select KEY_COL1X, KEY_COL2X, INFO_COL1X, OTH_COL1X, OTH_COL2X, OTH_COL3X, VERSION_NO
  from MY_CMP
 union all 
 select KEY_COL1X, KEY_COL2X, INFO_COL1X, OTH_COL1X as OTH_COL1$, OTH_COL2X as OTH_COL2$, OTH_COL3X as OTH_COL3$, VERSION_NO
   from MY_CMP_AUDIT
 ) 
 select KEY_COL1X, KEY_COL2X, a.INFO_COL1X, a.OTH_COL1X, a.OTH_COL2X, a.OTH_COL3X, b.OTH_COL1X as OTH_COL1$, b.OTH_COL2X as OTH_COL2$, b.OTH_COL3X as OTH_COL3$, a.VERSION_NO
   from (select * from all_recs) a
   join (select * from all_recs) b
  using (KEY_COL1X, KEY_COL2X)
  where b.VERSION_NO = a.VERSION_NO - 1;
select * from my_cmp_vu1 order by key_col1x, key_col2x, version_no;
KEY_COL1X  KEY_COL2X  INFO_COL1X       OTH_COL1X  OTH_COL2X         OTH_COL3X OTH_COL1$  OTH_COL2$         OTH_COL3$ VERSION_NO
---------- ---------- ---------------- ---------- ---------------- ---------- ---------- ---------------- ---------- ----------
k1         1          16/02/1985 00:00 Help       03/03/2012 14:03         10 xxxxx      17/01/2003 14:03          5          2
k1         1          16/02/1985 00:00 Hell       03/03/2012 00:00         10 Help       03/03/2012 14:03         10          3
k1         1          16/02/1985 00:00 Hello      03/03/2012 14:03         10 Hell       03/03/2012 00:00         10          4
k1         1          16/02/1985 00:00 Hello      03/03/2012 14:03         10 Hello      03/03/2012 14:03         10          5
k1         2          03/03/1970 00:00 World      03/03/2012 14:03         10 World      13/01/2012 14:03         10          2
-- This view gives the complete and expected result
create or replace view my_cmp_vu2 as
select key_col1x, key_col2x, info_col1x,
       field_name, old_value, new_value
  from
(
select key_col1x, key_col2x, info_col1x, version_no,
       'OTH_COL1X' field_name, to_char(oth_col1x) new_value, to_char(oth_col1$) old_value
  from my_cmp_vu1
 where oth_col1x <> oth_col1$
 union all
select key_col1x, key_col2x, info_col1x, version_no,
       'OTH_COL2X', to_char(oth_col2x), to_char(oth_col2$)
  from my_cmp_vu1
 where oth_col2x <> oth_col2$
 union all
select key_col1x, key_col2x, info_col1x, version_no,
       'OTH_COL3X', to_char(oth_col3x), to_char(oth_col3$)
  from my_cmp_vu1
 where oth_col3x <> oth_col3$
)
 order by 1, 2, version_no;
select * from my_cmp_vu2 order by key_col1x, key_col2x;
KEY_COL1X  KEY_COL2X  INFO_COL1X       FIELD_NAM OLD_VALUE                                NEW_VALUE
---------- ---------- ---------------- --------- ---------------------------------------- ----------------------------------------
k1         1          16/02/1985 00:00 OTH_COL1X xxxxx                                    Help
k1         1          16/02/1985 00:00 OTH_COL2X 17/01/2003 14:03                         03/03/2012 14:03
k1         1          16/02/1985 00:00 OTH_COL3X 5                                        10
k1         1          16/02/1985 00:00 OTH_COL2X 03/03/2012 14:03                         03/03/2012 00:00
k1         1          16/02/1985 00:00 OTH_COL1X Help                                     Hell
k1         1          16/02/1985 00:00 OTH_COL1X Hell                                     Hello
k1         1          16/02/1985 00:00 OTH_COL2X 03/03/2012 00:00                         03/03/2012 14:03
k1         1          16/02/1985 00:00 OTH_COL2X 03/03/2012 14:03                         03/03/2012 14:03
k1         2          03/03/1970 00:00 OTH_COL2X 13/01/2012 14:03                         03/03/2012 14:03
But when I think about the performance, it's hard for me to accept this as the optimal solution,
because in the production environment the data volume would be large and there are also more than 200 columns to be compared.
Please give me some ideas. 
Thanks and regards 
March     03, 2012 - 8:54 am UTC 
 
 
what do you need to optimize for?  to retrieve a single record, a hundred records via some predicate, all records?  what is large to you? 
 
 
Re: Comparing versioned records
Parthiban Nagarajan, March     04, 2012 - 2:08 am UTC
 
 
Hi Tom
I want to scan all the records of a monthly partition.
This is to report how the records are getting mutated.
So, every month, a new partition is added and at the end of the month, the report is generated for that partition.
Also, there are a couple of tables which are list partitioned on a region flag.  This needs to be scanned based on the LAST_UPDATE_DATETIME field.
Thanks and regards 
March     04, 2012 - 9:17 am UTC 
 
 
who would read such a lengthy report?  Is this *really* a need?  Since it can be generated anytime, for any set of records of interest - why would you generate a report no human being on the planet could ever read/understand/make use of?
Do you really want to do this?  truly?  It seems like a big waste?
You can either do the self join as you are doing, or union all this data together and use lag or lead to marry the current row with the previous or next row to do the comparison.
but - if you ask me - I wouldn't do it, no one could ever look at it. 
 
 
Re: Comparing versioned records
Parthiban Nagarajan, March     04, 2012 - 10:33 am UTC
 
 
Hi Tom
Believe it or not, we are already doing this.
But with a package, object types and their corresponding table types, with row-by-row processing and dynamic SQL to collect the column names, etc.
I just want to make it simple so that the maintenance of the code would be easy.
I am not sure how the report is used.  They might check only a few keys (or records) based on some other report.
Also, this report is to list only the records that are modified (i.e. version_no > 1, or the records which have at least one matching record in the audit table).
Please give me some ideas to tune the 2nd view, or if it cannot be tuned, please give me your comments on the 2nd view from a performance angle.
My opinion is that this should be better compared to its existing form (package, dynamic SQL, etc.)
Thanks and regards 
March     05, 2012 - 6:53 am UTC 
 
 
ops$tkyte%ORA11GR2> with data as (select level l from dual connect by level <= 3)
  2  select *
  3    from (
  4  select key_col1x, key_col2x, info_col1x, version_no,
  5         'OTH_COL'||l||'X' field_name,
  6         decode( l, 1, to_char(oth_col1x), 2, to_char(oth_col2x), 3, to_char(oth_col3x) ) new_value,
  7         decode( l, 1, to_char(oth_col1$), 2, to_char(oth_col2$), 3, to_char(oth_col3$) ) old_value
  8    from my_cmp_vu1, data
  9         )
 10   where decode( new_value, old_value, 1, 0 ) = 0
 11  /
KEY_COL1X  KEY_COL2X  INFO_COL1 VERSION_NO FIELD_NAME NEW_VALUE  OLD_VALUE
---------- ---------- --------- ---------- ---------- ---------- ----------
k1         1          16-FEB-85          4 OTH_COL1X  Hello      Hell
k1         1          16-FEB-85          3 OTH_COL1X  Hell       Help
k1         1          16-FEB-85          2 OTH_COL1X  Help       xxxxx
k1         1          16-FEB-85          2 OTH_COL2X  05-MAR-12  19-JAN-03
k1         1          16-FEB-85          2 OTH_COL3X  10         5
k1         2          03-MAR-70          2 OTH_COL2X  05-MAR-12  15-JAN-12
6 rows selected.
try that one on (or look up pivot if you are in 11g) 
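For the 11g route, a minimal sketch of the same idea with the UNPIVOT operator against my_cmp_vu1 (the to_char calls and the date format are assumptions, added only so the three column pairs share one datatype):
select key_col1x, key_col2x, info_col1x, version_no,
       field_name, old_value, new_value
  from (
        select key_col1x, key_col2x, info_col1x, version_no,
               to_char(oth_col1x) c1_new,
               to_char(oth_col1$) c1_old,
               to_char(oth_col2x,'dd/mm/yyyy hh24:mi:ss') c2_new,
               to_char(oth_col2$,'dd/mm/yyyy hh24:mi:ss') c2_old,
               to_char(oth_col3x) c3_new,
               to_char(oth_col3$) c3_old
          from my_cmp_vu1
       )
unpivot ( (new_value, old_value)
          for field_name in ( (c1_new, c1_old) as 'OTH_COL1X',
                              (c2_new, c2_old) as 'OTH_COL2X',
                              (c3_new, c3_old) as 'OTH_COL3X' ) )
 where decode( new_value, old_value, 1, 0 ) = 0;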
 
 
 
Re: Comparing versioned records
Parthiban Nagarajan, March     06, 2012 - 10:37 am UTC
 
 
Hi Tom
Yes.  As you mentioned, I need to UNPIVOT.  I did it with the CROSS JOIN and DECODE, as of now.  I will try the UNPIVOT operator also.
Thanks for your invaluable guidance on this.
Thanks and regards 
 
Compare Tables & Aggregate Results
Ian Wallace, January   02, 2013 - 5:23 am UTC
 
 
Hi Tom,
This has been a great thread. 
If I run the following query as suggested above:
SELECT source_code,
       lcn_id, 
       ssm_id, 
       source_client, 
       tct_type,
       tct_subtype,
       value_date,
       interest_accrued,
       interest_earned,
       ccy_id,
       interest_rate,
       incremental_earned_home,
       incremental_earned,
       COUNT(src1) base_tab, 
       COUNT(src2) refresh_tab
      from
    ( SELECT a.*,
             1 src1,
             TO_NUMBER(NULL) src2  
      FROM tr_owner2.daily_interest_active2 a
      UNION ALL
      SELECT b.*,
             TO_NUMBER(NULL) src1,
             2 src2
      FROM tr_owner.daily_interest_active2 b
    )
GROUP BY source_code,
         lcn_id, 
         ssm_id, 
         source_client, 
         tct_type,
         tct_subtype,
         value_date,
         interest_accrued,
         interest_earned,
         ccy_id,
         interest_rate,
         incremental_earned_home,
         incremental_earned
HAVING COUNT(src1) <> count(src2);
I get:
source_code  lcn_id ssm_id source_client  tct_type tct_subtype value_date  interest_accrued  interest_earned   ccy_id interest_rate incremental_earned_home incremental_earned   base_tab  refresh_tab                            
-----------  ------ ------ -------------  -------- ----------- ----------  ----------------  ---------------   ------ ------------- ----------------------- ------------------   --------  -----------
111111       0IRE   AC     999999         INT                  01/04/2006  526.08            -19.13            GBP    4.6                                                        1          0                                      
111111       0IRE   AC     999999         INT                  01/04/2006  526.08            -19.13            GBP    4.9                                                        0          1                                      
The PK for both tables is:
source_code,lcn_id,ssm_id,source_client,tct_type,tct_subtype,value_date
What I would like to do is combine both rows into one row based on PK as follows:
source_code  lcn_id ssm_id source_client  tct_type tct_subtype value_date  interest_accrued_old  interest_accrued_new   interest_earned_old  interest_earned_new     interest_rate_old    interest_rate_new           
-----------  ------ ------ -------------  -------- ----------- ----------  --------------------  --------------------   -------------------  ----------------------- ------------------   -----------------  
111111       0IRE   AC     999999         INT                  01/04/2006  526.08                526.08                 -19.13               -19.13                  4.6                  4.9 
I'm stuck, however, on just how to go about doing that.
Apologies if this has already been answered.
Thanks,
Ian.
 
January   04, 2013 - 2:45 pm UTC 
 
 
what makes one old or new???  I'll assume base_tab is "old" and refresh_tab is "new"
anyway - you have your query Q above, it returns a "table".  
that table includes:
source_code,lcn_id,ssm_id,source_client,tct_type,tct_subtype,value_date
interest_accrued,
       interest_earned,
       ccy_id,
       interest_rate,
       incremental_earned_home,
       incremental_earned,
and the base_tab, refresh_tab counts  
You want to 'pivot' the interest_accrued, interest_earned and interest_rates.  You can use pivot in 11g - or in all releases "max(decode)", I'll do max(decode) since you didn't say...
select source_code,lcn_id,ssm_id,source_client,tct_type,tct_subtype,value_date,
       max(decode( base_tab, 1, interest_accrued )) ia_old, 
       max(decode( refresh_tab, 1, interest_accrued )) ia_new,
       ... same for the other two columns ...
  from (Q)
 group by source_code,lcn_id,ssm_id,source_client,tct_type,tct_subtype,value_date
had I been given a create table, some inserts (a test case in short), I could have written the query syntactically correctly - but I wasn't, so this is just pseudo code... search this site for pivot for full examples.... 
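A tiny self-contained demo of that max(decode) pivot pattern, using made-up table and column names rather than Ian's actual data:
create table diff_rows ( pk number, val number, base_tab number, refresh_tab number );
insert into diff_rows values ( 1, 4.6, 1, 0 );   -- the row as it looks in the "old" (base) table
insert into diff_rows values ( 1, 4.9, 0, 1 );   -- the row as it looks in the "new" (refresh) table

select pk,
       max( decode( base_tab,    1, val ) ) val_old,
       max( decode( refresh_tab, 1, val ) ) val_new
  from diff_rows
 group by pk;

-- PK    VAL_OLD    VAL_NEW
--  1        4.6        4.9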
 
 
 
I don't know how the query works....confused
Texas Cowboy, June      10, 2013 - 4:03 pm UTC
 
 
Tom,
Your responses to several discussions have been awesome. I just have a question about the query (shown below) that you have given in several posts.
I just don't know why you chose 1 for src1 and 2 for src2 in the UNION ALL subquery. I am having trouble understanding how this query compares the data and shows only the differences. What effect do 1 and 2 have when you count the records in the outer query?
select c1, c2, c3, 
  2         count(src1) CNT1, 
  3          count(src2) CNT2
  4     from 
  5   ( select a.*, 
  6          1 src1, 
  7          to_number(null) src2 
  8       from a
  9      union all
  10    select b.*, 
  11         to_number(null) src1, 
  12         2 src2 
  13      from b
  14   )
  15  group by c1,c2,c3
  16 having count(src1) <> count(src2)
  17 / 
June      18, 2013 - 2:42 pm UTC 
 
 
they could have been 0, 0 - or anything.  Just needed something (not null) to count.  1 and 2 seemed "appropriate" given their name source 1 and source 2.
If you have a key (c1,c2,c3) that exists only in A, then "count(src1) cnt1" would be 1 and "count(src2) cnt2" would be 0 - we would see the row exists in A but not B.
and vice versa.
if the key (c1,c2,c3) existed in both tables - then count(src1/src2) would be 1 and we would not display that row. 
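A small self-contained demo of that counting behavior, using hypothetical one-column tables A and B:
create table a ( c1 number );
create table b ( c1 number );
insert into a values ( 1 );   -- only in A
insert into b values ( 2 );   -- only in B
insert into a values ( 3 );   -- in both
insert into b values ( 3 );

select c1, count(src1) cnt1, count(src2) cnt2
  from ( select a.c1, 1 src1, to_number(null) src2 from a
         union all
         select b.c1, to_number(null), 2 from b )
 group by c1
having count(src1) <> count(src2);

-- c1 = 1 comes back with cnt1 = 1, cnt2 = 0 (in A, not B)
-- c1 = 2 comes back with cnt1 = 0, cnt2 = 1 (in B, not A)
-- c1 = 3 is filtered out because cnt1 = cnt2 = 1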
 
 
Yes now I understand....
Texas Cowboy, June      18, 2013 - 4:04 pm UTC
 
 
I am such a dolt for not reading it carefully the first time. I was thinking you were doing SUM() instead of COUNT(), so naturally I thought: why would 1 ever equal 2? Alas, it was COUNT(), and your explanation totally made sense. I took this query, modified it to fit my tables, and it worked very well. 
A count of 2 to 1 or 1 to 2 means there is a duplicate in one table. 
A count of 1 to 0 or 0 to 1 means there is no row with that PK in one table.
A count of 1 and 1 will not be displayed because the row is in both tables and it's the same.
 
 
oracle
A reader, December  29, 2015 - 12:13 pm UTC
 
 
nice