Hi Tom,
drop table user_points;
create table user_points
(
id_user int ,
point_date date ,
points int
)
;
alter table user_points add constraint PK_user_points primary key (id_user, point_date);
insert into user_points values (1, to_date('01/02/2005', 'dd/mm/yyyy'), 10);
insert into user_points values (1, to_date('01/02/2004', 'dd/mm/yyyy'), 8);
insert into user_points values (1, to_date('01/02/2003', 'dd/mm/yyyy'), 6);
insert into user_points values (1, to_date('01/02/2000', 'dd/mm/yyyy'), 4);
insert into user_points values (2, to_date('01/02/2005', 'dd/mm/yyyy'), 10);
insert into user_points values (2, to_date('01/02/2004', 'dd/mm/yyyy'), 8);
insert into user_points values (2, to_date('01/02/2000', 'dd/mm/yyyy'), 4);
insert into user_points values (3, to_date('01/02/2008', 'dd/mm/yyyy'), 12);
insert into user_points values (3, to_date('01/02/2005', 'dd/mm/yyyy'), 10);
insert into user_points values (3, to_date('01/02/2004', 'dd/mm/yyyy'), 8);
insert into user_points values (3, to_date('01/02/1999', 'dd/mm/yyyy'), 4);
insert into user_points values (4, to_date('01/02/2005', 'dd/mm/yyyy'), 10);
insert into user_points values (4, to_date('01/02/2004', 'dd/mm/yyyy'), 8);
insert into user_points values (4, to_date('01/02/2001', 'dd/mm/yyyy'), 4);
insert into user_points values (5, to_date('01/02/2005', 'dd/mm/yyyy'), 10);
insert into user_points values (5, to_date('01/02/2004', 'dd/mm/yyyy'), 8);
insert into user_points values (5, to_date('01/02/2001', 'dd/mm/yyyy'), 4);
We want to compare users to each other and decide through a "rule" who is the biggest.
user "A" is bigger than user "B" means that
- the latest point of "A" is bigger than the latest point of "B"
- if the latest points are equal, then we look to the latest dates
- if the latest date of "A" is before the latest date of "B" then "A" is the biggest
- if the latest dates are equal then we look to the preceding points and we re-apply the rule.
if users are totally equal then there is no winner.
Examples :
--------
- user 1 is bigger than user 2 because : 10 = 10 and 01/02/1999 is before 01/02/2000.
- user 3 is bigger than user 2 because 12 > 10
- user 2 is bigger than user 4 because
- 10 = 10 and 01/02/2005 = 01/02/2005
- 8 = 8 and 01/02/2004 = 01/02/2004
- 4 = 4 and 01/02/2000 (of user 2) is before 01/02/2001 (of user 4), then 2 is bigger than 4
- users 4 and 5 are equal.
The expected result is :
user_A user_B biggest
------ ------ -------
1 2 1
1 3 3
1 4 1
1 5 1
2 3 ...
2 4 ...
...
Thanks in advance,
Amine
Thanks for providing a complete script Amine, it makes it much easier for us to help you!
I've also added a new oldest row for user 5 and created user 6 as a copy of 4. This is to show what to do when one user is a copy of another, but with one (or more) extra earlier rows:
insert into user_points values (5, to_date('01/02/2000', 'dd/mm/yyyy'), 2);
insert into user_points values (6, to_date('01/02/2005', 'dd/mm/yyyy'), 10);
insert into user_points values (6, to_date('01/02/2004', 'dd/mm/yyyy'), 8);
insert into user_points values (6, to_date('01/02/2001', 'dd/mm/yyyy'), 4);
Based on my understanding of the logic, this means that:
5 is now bigger than 4 (because there's one earlier row in 5)
4 and 6 are equal
5 is also bigger than 6 (same reason as for 4)
If this understanding is wrong you'll have to update as appropriate.
The outline of the SQL is:
1. Rank each row by date descending (so the most recent row = 1, the next 2 and so on)
2. Also find the minimum date for each user
3. Join every user row to every other user where this calculated rank is the same
4. For each row returned by this join:
- check the users' points. If A > B then return "A wins" and vice versa
- If the points are equal check the dates. If A DT < B DT then return "A wins" and vice versa
- Otherwise they are equal and return null
5. Find the first non-null value returned by the check above for each user pair. Return the winner appropriately
6. If they are equal, check the minimum dates. If A MIN DT < B MIN DT than A wins and vice-versa
7. Otherwise they are equal
The first step is to assign the ranks and get the minimum dates. I've done this with the following analytics:
select up.*,
rank() over (partition by id_user order by points desc) rk,
min(point_date) over (partition by id_user) min_dt
from user_points up
Note this assumes that dates are unique for each user. If a user can have two points recorded on the same date you'll need to modify this.
We need to join the result of this back to itself where the ranks are the same and A ID < B ID (so each pair is only listed once). I've placed the query above in a with clause to make it easier to follow:
with ranks as (
select up.*,
rank() over (partition by id_user order by points desc) rk,
min(point_date) over (partition by id_user) min_dt
from user_points up
)
select *
from ranks a
join ranks b
on a.id_user < b.id_user
and a.rk = b.rk
To implement the logic at step 4 to find the winner of each result, I've used the following case statement:
case
when a.points < b.points then 'b_wins'
when a.points > b.points then 'a_wins'
else
case
when a.point_date > b.point_date then 'b_wins'
when a.point_date < b.point_date then 'a_wins'
else null --equal
end
end comp
This returns the winner for each row, or null if they are equal. To get the winner for each A ID, B ID pair, you need to find the first non-null value returned by this ordered by date descending. The analytic function first_value() enables you to do this. Ensure you ignore nulls to skip over the equal rows:
with ranks as (
select up.*,
rank() over (partition by id_user order by points desc) rk,
min(point_date) over (partition by id_user) min_dt
from user_points up
)
select aid, bid, comp, amin, bmin,
first_value(comp) ignore nulls over (partition by aid, bid order by adt desc) fv
from (
select a.id_user aid, a.point_date adt, a.points ap, a.rk ark, a.min_dt amin,
b.id_user bid, b.point_date bdt, b.points bp, b.rk brk, b.min_dt bmin,
case
when a.points < b.points then 'b_wins'
when a.points > b.points then 'a_wins'
else
case
when a.point_date > b.point_date then 'b_wins'
when a.point_date < b.point_date then 'a_wins'
else null --equal
end
end comp
from ranks a
join ranks b
on a.id_user < b.id_user
and a.rk = b.rk
) s;
To return the winner, just group by A ID and B ID to find the max value returned by the first_value function. If they are still equal at this point (FV is always null), check whether either A or B has an earlier start date to see if one has more rows than the other does. If these are the same the users are indeed equal. Use the MIN DT calculated in the with clause to do this:
case
when max(fv) = 'a_wins' then to_char(aid)
when max(fv) = 'b_wins' then to_char(bid)
else
case
when amin < bmin then to_char(aid)
when bmin < amin then to_char(bid)
else 'equal'
end
end w
Put it all together and you get:
with ranks as (
select up.*,
rank() over (partition by id_user order by points desc) rk,
min(point_date) over (partition by id_user) min_dt
from user_points up
)
select aid, bid,
case
when max(fv) = 'a_wins' then to_char(aid)
when max(fv) = 'b_wins' then to_char(bid)
else
case
when amin < bmin then to_char(aid)
when bmin < amin then to_char(bid)
else 'equal'
end
end w
from (
select aid, bid, comp, amin, bmin,
first_value(comp) ignore nulls over (partition by aid, bid order by adt desc) fv
from (
select a.id_user aid, a.point_date adt, a.points ap, a.rk ark, a.min_dt amin,
b.id_user bid, b.point_date bdt, b.points bp, b.rk brk, b.min_dt bmin,
case
when a.points < b.points then 'b_wins'
when a.points > b.points then 'a_wins'
else
case
when a.point_date > b.point_date then 'b_wins'
when a.point_date < b.point_date then 'a_wins'
else null --equal
end
end comp
from ranks a
join ranks b
on a.id_user < b.id_user
and a.rk = b.rk
) s
)
group by aid, bid, amin, bmin
order by aid, bid;
AID BID W
---------- ---------- ----------------------------------------
1 2 1
1 3 3
1 4 1
1 5 1
1 6 1
2 3 3
2 4 2
2 5 2
2 6 2
3 4 3
3 5 3
3 6 3
4 5 5
4 6 equal
5 6 5