Hi,
I have an ordered table of products and customers as follows:
rn product customer
1 859274 A
2 859267 A
3 859250 A
4 863592 B
5 862250 B
6 862700 B
7 862694 B
8 862120 B
9 863592 C
10 862250 C
11 862120 D
12 862694 C
13 863592 E
14 862113 D
15 863592 F
16 862250 F
17 862700 E
18 862694 E
19 863592 G
I need an SQL sentence that will return the top 10 product/customer pairs HOWEVER both product and customer cannot repeat itself more than 2 times.
to make it clearer row 3 is out because customer A was included twice before.
row 13 is out (although first time this customer appears) because product 863592 was already included 2 times in vetted records.
the answer for the above example should be:
1 859274 A
2 859267 A
4 863592 B
5 862250 B
9 863592 C
10 862250 C
11 862120 D
14 862113 D
17 862700 E
18 862694 E
I tried several options like analytic functions including cumulative counts, lags, etc... was not able to solve this one :-)
Thanks
You can use row_number to assign an incrementing counter for each customer or product. Just pass these in the partition by clause. e.g.:
row_number () over ( partition by product order by rn )
You can then filter out those with a rn > 2
CREATE TABLE t (
rn int, product int, customer varchar2(1)
) ;
INSERT INTO t (rn, product, customer) VALUES (1, 859274, 'A');
INSERT INTO t (rn, product, customer) VALUES (2, 859267, 'A');
INSERT INTO t (rn, product, customer) VALUES (3, 859250, 'A');
INSERT INTO t (rn, product, customer) VALUES (4, 863592, 'B');
INSERT INTO t (rn, product, customer) VALUES (5, 862250, 'B');
INSERT INTO t (rn, product, customer) VALUES (6, 862700, 'B');
INSERT INTO t (rn, product, customer) VALUES (7, 862694, 'B');
INSERT INTO t (rn, product, customer) VALUES (8, 862120, 'B');
INSERT INTO t (rn, product, customer) VALUES (9, 863592, 'C');
INSERT INTO t (rn, product, customer) VALUES (10, 862250, 'C');
INSERT INTO t (rn, product, customer) VALUES (11, 862120, 'D');
INSERT INTO t (rn, product, customer) VALUES (12, 862694, 'C');
INSERT INTO t (rn, product, customer) VALUES (13, 863592, 'E');
INSERT INTO t (rn, product, customer) VALUES (14, 862113, 'D');
INSERT INTO t (rn, product, customer) VALUES (15, 863592, 'F');
INSERT INTO t (rn, product, customer) VALUES (16, 862250, 'F');
INSERT INTO t (rn, product, customer) VALUES (17, 862700, 'E');
INSERT INTO t (rn, product, customer) VALUES (18, 862694, 'E');
INSERT INTO t (rn, product, customer) VALUES (19, 863592, 'G');
with ranks as (
select t.*,
row_number () over ( partition by product order by rn ) prod_rn ,
row_number () over ( partition by customer order by rn ) cust_rn
from t
)
select * from ranks
where prod_rn <= 2
and cust_rn <= 2
order by rn;
RN PRODUCT CUSTOMER PROD_RN CUST_RN
1 859274 A 1 1
2 859267 A 1 2
4 863592 B 1 1
5 862250 B 1 2
9 863592 C 2 1
10 862250 C 2 2
11 862120 D 2 1
14 862113 D 1 2
17 862700 E 2 2