Skip to Main Content
  • Questions
  • Count groups when next value = current value plus one

Breadcrumb

Question and Answer

Connor McDonald

Thanks for the question, Francisco.

Asked: July 29, 2016 - 7:39 pm UTC

Last updated: August 02, 2016 - 12:43 pm UTC

Version: 11.2.0.3.0

Viewed 1000+ times

You Asked

I have this table:
create table test (myname varchar2(1), dts_id number);

With this data:
insert into test (myname, dts_id) values ('A',1);
insert into test (myname, dts_id) values ('A',2);
insert into test (myname, dts_id) values ('A',3);
insert into test (myname, dts_id) values ('A',6);
insert into test (myname, dts_id) values ('A',9);
insert into test (myname, dts_id) values ('A'.10);
insert into test (myname, dts_id) values ('B',1);
insert into test (myname, dts_id) values ('B',2);

I need to count, per name, how many "groups" of consecutive dts_ids I have (consecutive = previous dts_is + 1) . So, my expected result is:
MYNAME, count
A, 3
B, 1

For Name A - 3 groups
1st group - DTS_ID: 1,2,3
2nd group - DTS_ID: 6
3rd group - DTS_ID: 9, 10
For Name B - 1 group
1st group - DTS_ID: 1,2

I can not add objects to this database - no functions or anything else.

Do you have any suggestions how to write a "simple"/single select statement for this?

Thank you!

and Chris said...

You can do this with the Tabibitosan method.

This splits the ranges into groups by:

- Assigning a row_number to each row
- Subtracting this from the dts_id from this

Consecutive numbers will map to the same value. This is because both the row number and dts_id increase by 1, so the difference is the same.

You can then group by this to get the number of rows in each group. Group again to find the total number of groups:

create table test (myname varchar2(1), dts_id number);

insert into test (myname, dts_id) values ('A',1);
insert into test (myname, dts_id) values ('A',2);
insert into test (myname, dts_id) values ('A',3);
insert into test (myname, dts_id) values ('A',6);
insert into test (myname, dts_id) values ('A',9);
insert into test (myname, dts_id) values ('A',10);
insert into test (myname, dts_id) values ('B',1);
insert into test (myname, dts_id) values ('B',2);

select myname, count(c) from (
select myname, grp, count(*) c from (
select myname, 
       dts_id - row_number() over (partition by myname order by dts_id) grp
from   test
)
group  by myname, grp
)
group by myname;

MYNAME  COUNT(C)  
A       3         
B       1

For more discussion of this method see:

https://www.youtube.com/watch?v=yvimYixXo2Q&feature=youtu.be&t=40s

Rating

  (3 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Query not working with my test data

Rajneesh Pandey, August 02, 2016 - 10:28 am UTC

Hi Chris,

I ran the same query on the below data set but not getting the correct result. Can you please let me know if I am doing anything wrong.

with t as (
select 'A' name, 10 id from dual
union
select 'A' name, 11 id from dual
union
select 'A' name, 12 id from dual
union
select 'A' name, 13 id from dual
union
select 'A' name, 41 id from dual
union
select 'A' name, 42 id from dual
union
select 'B' name, 10 id from dual
union
select 'B' name, 11 id from dual
union
select 'B' name, 12 id from dual
)
select name, count(c) from (
select name, grp, count(*) c from (select name, id - row_number() over (partition by name order by id) grp from t)
group by name, grp)
group by name

Regards,
Rajneesh
Chris Saxon
August 02, 2016 - 10:56 am UTC

What do you get and what are you expecting?

Based on what I can see there's two groups for A (10-13 & 41-42) and one for B (10-12). That's the output I get:

SQL> with t as (
  2  select 'A' name, 10 id from dual
  3  union
  4  select 'A' name, 11 id from dual
  5  union
  6  select 'A' name, 12 id from dual
  7  union
  8  select 'A' name, 13 id from dual
  9  union
 10  select 'A' name, 41 id from dual
 11  union
 12  select 'A' name, 42 id from dual
 13  union
 14  select 'B' name, 10 id from dual
 15  union
 16  select 'B' name, 11 id from dual
 17  union
 18  select 'B' name, 12 id from dual
 19  )
 20  select name, count(c) from (
 21  select name, grp, count(*) c
 22  from (
 23    select name, id - row_number() over (partition by name order by id) grp from t
 24  )
 25  group by name, grp)
 26  group by name ;

N   COUNT(C)
- ----------
A          2
B          1


And it's what you asked for, no?

Rajneesh Pandey, August 02, 2016 - 11:28 am UTC

Hi Chris,

I think I had misunderstood the requirement. I thought that the expected result should give a count of pairs for each group. like in my example for A, it should be 4 (10,11), (11,12), (12,13), (41,42), and for B it should be (10,11), (11,12), i.e. 2.

If I modify your query like below, it is giving me the above expected result.

with t as (
select 'A' name, 10 id from dual
union
select 'A' name, 11 id from dual
union
select 'A' name, 12 id from dual
union
select 'A' name, 13 id from dual
union
select 'A' name, 41 id from dual
union
select 'A' name, 42 id from dual
union
select 'B' name, 10 id from dual
union
select 'B' name, 11 id from dual
union
select 'B' name, 12 id from dual
)
select name, sum(c) from (
select name, grp, (count(*) -1) c from (
select name, id - row_number() over (partition by name order by id) grp from t)
group by name, grp)
group by name

Regards,
Rajneesh
Chris Saxon
August 02, 2016 - 12:43 pm UTC

Yes, the number of consecutive pairs will be the number of rows in the group - 1.

So simple, so obvious...

Francisco, August 02, 2016 - 1:06 pm UTC

I run this on my 4.8M records table (about 300K distinct names where only 10K belong to more than a group ranging from 2 to 5K groups) and performance was pretty good - no need to tweak anything. I liked the KISS YouTube video. My dts_id reflects the date_time_stamp and I have a table that links the actual date and time to the id. I will see how performance behaves if I follow the YouTube video to find these data clusters based on the date and time - if I can get it to perform well, then there is one less table to keep in my schema! Thanks a lot for the help

More to Explore

Analytics

Analytic SQL got you confused? Check out Connor McDonald's complete video course.