Lets start with this data set in place.
truncate table t;
insert into t values(1,to_date('22-NOV-2016 09:00:00','dd-mon-yyyy hh24:mi:ss'));
insert into t values(1,to_date('22-NOV-2016 09:05:00','dd-mon-yyyy hh24:mi:ss'));
insert into t values(1,to_date('22-NOV-2016 09:25:00','dd-mon-yyyy hh24:mi:ss'));
insert into t values(1,to_date('22-NOV-2016 09:35:00','dd-mon-yyyy hh24:mi:ss'));
insert into t values(1,to_date('22-NOV-2016 10:00:00','dd-mon-yyyy hh24:mi:ss'));
insert into t values(1,to_date('22-NOV-2016 10:35:00','dd-mon-yyyy hh24:mi:ss'));
insert into t values(2,to_date('22-NOV-2016 09:15:00','dd-mon-yyyy hh24:mi:ss'));
insert into t values(2,to_date('22-NOV-2016 09:45:00','dd-mon-yyyy hh24:mi:ss'));
insert into t values(3,to_date('22-NOV-2016 09:20:00','dd-mon-yyyy hh24:mi:ss'));
insert into t values(4,to_date('22-NOV-2016 09:25:00','dd-mon-yyyy hh24:mi:ss'));
insert into t values(4,to_date('22-NOV-2016 10:15:00','dd-mon-yyyy hh24:mi:ss'));
insert into t values(4,to_date('22-NOV-2016 10:35:00','dd-mon-yyyy hh24:mi:ss'));
insert into t values(5,to_date('22-NOV-2016 10:35:00','dd-mon-yyyy hh24:mi:ss'));
commit;
Given this
....
If you need the first case (split into 30 minute groups) this is easiest with pattern matching.
Again, you want to partition by user and order by date. Once you've done this you want patterns
where the current date is within 30 minutes of the first date in the match.
So your pattern variable looks like:
thirty as login_date - first(login_date) <= 30/1440
....with pattern matching (works from 12c and above), was able to get this.
demo@ORA12C> select user_id,login_date,fd,
2 dense_rank() over(partition by user_id
3 order by fd) rnk
4 from (
5 select *
6 from t
7 match_recognize(
8 partition by user_id
9 order by login_date
10 measures
11 match_number() mno,
12 classifier() cls,
13 strt.login_date as fd
14 all rows per match
15 pattern( strt down*)
16 define
17 down as login_date - strt.login_date <= (30/24/60) )
18 )
19 /
USER_ID LOGIN_DATE FD RNK
---------- -------------------- -------------------- ----------
1 22-NOV-2016 09:00:00 22-NOV-2016 09:00:00 1
1 22-NOV-2016 09:05:00 22-NOV-2016 09:00:00 1
1 22-NOV-2016 09:25:00 22-NOV-2016 09:00:00 1
1 22-NOV-2016 09:35:00 22-NOV-2016 09:35:00 2
1 22-NOV-2016 10:00:00 22-NOV-2016 09:35:00 2
1 22-NOV-2016 10:35:00 22-NOV-2016 10:35:00 3
2 22-NOV-2016 09:15:00 22-NOV-2016 09:15:00 1
2 22-NOV-2016 09:45:00 22-NOV-2016 09:15:00 1
3 22-NOV-2016 09:20:00 22-NOV-2016 09:20:00 1
4 22-NOV-2016 09:25:00 22-NOV-2016 09:25:00 1
4 22-NOV-2016 10:15:00 22-NOV-2016 10:15:00 2
4 22-NOV-2016 10:35:00 22-NOV-2016 10:15:00 2
5 22-NOV-2016 10:35:00 22-NOV-2016 10:35:00 1
13 rows selected.
Now thinking about to handle this in 11g database, so with MODEL clause got this.
demo@ORA12C> select user_id,login_date,x as fd,y as rnk
2 from t
3 model
4 partition by (user_id)
5 dimension by ( row_number() over(partition by user_id order by login_date) r )
6 measures ( login_date , cast(null as date) x, 0 y )
7 rules (
8 x[any] order by r = case when cv(r) = 1 then login_date[cv()]
9 when login_date[cv()] - x[cv()-1] <= 30/24/60 then x[cv()-1]
10 else login_date[cv()] end ,
11 y[any] order by r = case when cv(r)=1 then 1
12 when x[cv()] = x[cv()-1] then y[cv()-1]
13 else y[cv()-1]+1 end
14 )
15 order by user_id,login_date;
USER_ID LOGIN_DATE FD RNK
---------- -------------------- -------------------- ----------
1 22-NOV-2016 09:00:00 22-NOV-2016 09:00:00 1
1 22-NOV-2016 09:05:00 22-NOV-2016 09:00:00 1
1 22-NOV-2016 09:25:00 22-NOV-2016 09:00:00 1
1 22-NOV-2016 09:35:00 22-NOV-2016 09:35:00 2
1 22-NOV-2016 10:00:00 22-NOV-2016 09:35:00 2
1 22-NOV-2016 10:35:00 22-NOV-2016 10:35:00 3
2 22-NOV-2016 09:15:00 22-NOV-2016 09:15:00 1
2 22-NOV-2016 09:45:00 22-NOV-2016 09:15:00 1
3 22-NOV-2016 09:20:00 22-NOV-2016 09:20:00 1
4 22-NOV-2016 09:25:00 22-NOV-2016 09:25:00 1
4 22-NOV-2016 10:15:00 22-NOV-2016 10:15:00 2
4 22-NOV-2016 10:35:00 22-NOV-2016 10:15:00 2
5 22-NOV-2016 10:35:00 22-NOV-2016 10:35:00 1
13 rows selected.
demo@ORA12C>
with the above recursive with clause, don't get the correct result set.
demo@ORA12C> with rws (user_id, login_date, min_date, grp) as (
2 select user_id, login_date, login_date min_date, 1 grp from t t1
3 where login_date in (
4 select min(login_date) from t t2
5 where t2.user_id = t2.user_id
6 )
7 union all
8 select t.user_id, t.login_date,
9 case
10 when t.login_date - rws.min_date > 30/1440 then
11 t.login_date
12 else
13 rws.min_date
14 end min_date,
15 case
16 when t.login_date - rws.min_date > 30/1440 then
17 grp + 1
18 else
19 grp
20 end grp
21 from rws
22 join t
23 on t.user_id = rws.user_id
24 and t.login_date > rws.login_date
25 and t.login_date = (
26 select min(login_date) from t t2
27 where t2.user_id = t2.user_id
28 and t2.login_date > rws.login_date
29 )
30 )
31 select * from rws;
USER_ID LOGIN_DATE MIN_DATE GRP
---------- -------------------- -------------------- ----------
1 22-NOV-2016 09:00:00 22-NOV-2016 09:00:00 1
1 22-NOV-2016 09:05:00 22-NOV-2016 09:00:00 1
2 rows selected.
demo@ORA12C>
looks like few join were in-correct in the Recursive with clauses.
....
4 select min(login_date) from t t2
5 where t2.user_id = t2.user_id
27 where t2.user_id = t2.user_id
28 and t2.login_date > rws.login_date
....t2.user_id joins again with t2.user_id , hence the problem is
The algorithm for the recursive with clause goes like this.
a) for each user start with the least login_date as grp=1
b) for each user, get the next login_date greater than the above least login date and find the distance between the current login_date and the least login date
c) if the distance is more than 30 mins, then increment the grp by one else have it as such.
d) then proceed the same way for others rows
Now the updated recursive with clause goes like this.
demo@ORA12C> with rws ( user_id, login_date, min_dt, grp ) as (
2 select t1.user_id, t1.login_date , t1.login_date min_dt, 1 grp
3 from t t1
4 where login_date = ( select min(t2.login_date)
5 from t t2
6 where t2.user_id = t1.user_id )
7 union all
8 select t3.user_id, t3.login_date,
9 case when t3.login_date - r1.min_dt > 30/24/60 then
10 t3.login_date else r1.min_dt end ,
11 case when t3.login_date - r1.min_dt > 30/24/60 then
12 r1.grp +1 else r1.grp end
13 from t t3, rws r1
14 where t3.user_id = r1.user_id and
15 t3.login_date > r1.login_date and
16 t3.login_date = ( select min(t4.login_date)
17 from t t4
18 where t4.user_id = r1.user_id and
19 t4.login_date > r1.login_date )
20 )
21 select user_id, login_date, min_dt fd, grp rnk
22 from rws
23 order by user_id, login_date;
USER_ID LOGIN_DATE FD RNK
---------- -------------------- -------------------- ----------
1 22-NOV-2016 09:00:00 22-NOV-2016 09:00:00 1
1 22-NOV-2016 09:05:00 22-NOV-2016 09:00:00 1
1 22-NOV-2016 09:25:00 22-NOV-2016 09:00:00 1
1 22-NOV-2016 09:35:00 22-NOV-2016 09:35:00 2
1 22-NOV-2016 10:00:00 22-NOV-2016 09:35:00 2
1 22-NOV-2016 10:35:00 22-NOV-2016 10:35:00 3
2 22-NOV-2016 09:15:00 22-NOV-2016 09:15:00 1
2 22-NOV-2016 09:45:00 22-NOV-2016 09:15:00 1
3 22-NOV-2016 09:20:00 22-NOV-2016 09:20:00 1
4 22-NOV-2016 09:25:00 22-NOV-2016 09:25:00 1
4 22-NOV-2016 10:15:00 22-NOV-2016 10:15:00 2
4 22-NOV-2016 10:35:00 22-NOV-2016 10:15:00 2
5 22-NOV-2016 10:35:00 22-NOV-2016 10:35:00 1
13 rows selected.
demo@ORA12C>
But do we have any way to get this using Analytics ?
Tried using Analytics, but got only this. (which is incorrect!)
demo@ORA12C> select user_id,login_date, grp fd,
2 dense_rank() over(partition by user_id
3 order by grp ) rnk
4 from (
5 select user_id,login_date,
6 min(login_date) over(partition by user_id) min_dt ,
7 case when login_date - min(login_date) over(partition by user_id) > 30/24/60
8 then login_date
9 else min(login_date) over(partition by user_id)
10 end grp
11 from t
12 )
13 order by user_id,login_date
14 /
USER_ID LOGIN_DATE FD RNK
---------- -------------------- -------------------- ----------
1 22-NOV-2016 09:00:00 22-NOV-2016 09:00:00 1
1 22-NOV-2016 09:05:00 22-NOV-2016 09:00:00 1
1 22-NOV-2016 09:25:00 22-NOV-2016 09:00:00 1
1 22-NOV-2016 09:35:00 22-NOV-2016 09:35:00 2
1 22-NOV-2016 10:00:00 22-NOV-2016 10:00:00 3
1 22-NOV-2016 10:35:00 22-NOV-2016 10:35:00 4
2 22-NOV-2016 09:15:00 22-NOV-2016 09:15:00 1
2 22-NOV-2016 09:45:00 22-NOV-2016 09:15:00 1
3 22-NOV-2016 09:20:00 22-NOV-2016 09:20:00 1
4 22-NOV-2016 09:25:00 22-NOV-2016 09:25:00 1
4 22-NOV-2016 10:15:00 22-NOV-2016 10:15:00 2
4 22-NOV-2016 10:35:00 22-NOV-2016 10:35:00 3
5 22-NOV-2016 10:35:00 22-NOV-2016 10:35:00 1
13 rows selected.
demo@ORA12C>
in this above example, the user_id=4 has three dates ( 22-NOV-2016 09:25:00, 22-NOV-2016 10:15:00 and 22-NOV-2016 10:35:00 ) - order by login_date asc
a) take the first date(09:25) and compare that with the next date (10:15)
b) if the difference is less than 30 min, then same group else new group, then proceed the comparison using (10:15) as a base
c) next the distance between 10:15 and 10:35 is less the 30 min, so lies in the same group
so user_id=4 has totally two group - group=1 has only one element (22-NOV-2016 09:25:00) and group=2 has two elements ( 22-NOV-2016 10:15:00 and 22-NOV-2016 10:35:00 )
These kind of inter-row calculations/results are easy to handle using MODEL clause then Analytics.
kindly let us know if you have any options to get this using Analytics.