Hi Tom,
I seek your help on how to compare two rows in the table and if they are same merge the rows.
create table test(id number, start_date date, end_date date, col1 varchar2(10), col2 varchar2(10), col3 varchar2(10));
insert into test values(1, to_date('01/01/1900', 'mm/dd/yyyy'), to_date('05/01/2006', 'mm/dd/yyyy'), 'a', 'b', 'a');
insert into test values(1, to_date('05/02/2006', 'mm/dd/yyyy'), to_date('06/01/2006', 'mm/dd/yyyy'), 'x', 'y', 'z');
insert into test values(1, to_date('07/01/2006', 'mm/dd/yyyy'), to_date('07/01/2016', 'mm/dd/yyyy'), 'x', 'y', 'z');
insert into test values(1, to_date('08/01/2016', 'mm/dd/yyyy'), to_date('09/01/2016', 'mm/dd/yyyy'), 'x', 'b', 'a');
insert into test values(1, to_date('10/01/2016', 'mm/dd/yyyy'), to_date('12/31/2020', 'mm/dd/yyyy'), 'a', 'b', 'a');The dataset would be
id start date end date col1 col2 col3
1 01/01/1900 05/01/2006 a b c
1 05/02/2006 06/01/2006 x y z ------ record 2
1 07/01/2006 07/01/2016 x y z ------ record 3
1 08/01/2016 09/01/2016 x b a
1 10/01/2016 12/31/2016 a b c
As you can see record 2 and record 3 column 1, 2, 3 values are same. These records need to be merged into 1 record with min start date from record 2 and max end date from record 3.
expected result set is as follows
id start date end date col1 col2 col3
1 01/01/1900 05/01/2006 a b c
1 05/02/2006* 07/01/2016* x y z ------ 2 and 3 merged with min and max dates
1 08/01/2016 09/01/2016 x b a
1 10/01/2016 12/31/2016 a b c
How can we achieve this in sql statement.
Your help is highly appreciated.
Thanks in advance.
I'm not clear on how exactly you define the X, Y, Z rows to be grouped, but not those for A, B, C. Is it when there's a month between the previous end and the next start?
You need to be explicit on how this matching works to get a complete solution.
But in any case:
Pattern matching (match_recognize) to the rescue!
Use this to define a pattern variable which gives the criteria for merging rows. I've worked on the assumption you combine them if there's exactly one month between the end & start dates. Adjust this as necessary for your data.
Then it's just a matter of searching for any row which has this variable after it. Again, I've assumed there are many consecutive rows you want to merge them all. If this is not the case, change the regular expression quantifier from match any number of rows (*/asterisk) to zero to N. e.g. if you want to merge at most two rows, use the following pattern:
strt consecutive{0,1}All together this gives:
select *
from test match_recognize (
partition by id, col1, col2, col3
order by start_date
measures
first ( start_date ) as st,
last ( end_date ) as en
pattern ( strt consecutive* )
define
consecutive as
start_date = add_months (
prev ( end_date ), 1
)
)
order by st, en;
ID COL1 COL2 COL3 ST EN
1 a b a 01-JAN-1900 00:00:00 01-MAY-2006 00:00:00
1 x y z 02-MAY-2006 00:00:00 01-JUL-2016 00:00:00
1 x b a 01-AUG-2016 00:00:00 01-SEP-2016 00:00:00
1 a b a 01-OCT-2016 00:00:00 31-DEC-2020 00:00:00 Stew Ashton discusses the challenge of merging continuous date ranges in more detail at:
https://stewashton.wordpress.com/2014/03/16/merging-contiguous-date-ranges/ If you're looking for more background on pattern matching, you can find more examples in my answer at:
https://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:9533477800346658909 Or my slides on this topic at:
https://www.slideshare.net/ChrisSaxon1/how-to-find-patterns-in-your-data-with-sql