Skip to Main Content
  • Questions
  • Update a column after multiple criteria

Breadcrumb

May 4th

Question and Answer

Chris Saxon

Thanks for the question, Marian.

Asked: February 12, 2019 - 11:28 am UTC

Last updated: February 13, 2019 - 11:16 am UTC

Version: 1

Viewed 1000+ times

You Asked

Hello,

I have the following data:

Id AMAOUNT NO_PRATITION ID_STRATEGY
1 100 99 XXX
2 200 99 XXX
3 0 99 YYY
4 100 99 YYY
5 200 99 YYY
6 0 99 ZZZ
7 100 99 ZZZ
8 200 99 ZZZ
9 300 99 ZZZ
10 0 99 WWW
11 0 99 WWW
12 100 99 WWW
13 100 99 WWW
14 0 99 RRR
15 100 99 RRR
16 100 99 RRR
17 200 99 RRR

create table test(id number,amount number, no_partition number,id_strategy varchar2(20));


insert into TEST (id, amount, no_partition, id_strategy)
values (1, 100, 99, 'XXX');
insert into TEST (id, amount, no_partition, id_strategy)
values (2, 200, 99, 'XXX');
insert into TEST (id, amount, no_partition, id_strategy)
values (3, 0, 99, 'YYY');
insert into TEST (id, amount, no_partition, id_strategy)
values (4, 100, 99, 'YYY');
insert into TEST (id, amount, no_partition, id_strategy)
values (5, 200, 99, 'YYY');
insert into TEST (id, amount, no_partition, id_strategy)
values (6, 0, 99, 'ZZZ');
insert into TEST (id, amount, no_partition, id_strategy)
values (7, 100, 99, 'ZZZ');
insert into TEST (id, amount, no_partition, id_strategy)
values (8, 200, 99, 'ZZZ');
insert into TEST (id, amount, no_partition, id_strategy)
values (9, 300, 99, 'ZZZ');
insert into TEST (id, amount, no_partition, id_strategy)
values (10, 0, 99, 'WWW');
insert into TEST (id, amount, no_partition, id_strategy)
values (11, 0, 99, 'WWW');
insert into TEST (id, amount, no_partition, id_strategy)
values (12, 100, 99, 'WWW');
insert into TEST (id, amount, no_partition, id_strategy)
values (13, 100, 99, 'WWW');
insert into TEST (id, amount, no_partition, id_strategy)
values (14, 0, 99, 'RRR');
insert into TEST (id, amount, no_partition, id_strategy)
values (15, 100, 99, 'RRR');
insert into TEST (id, amount, no_partition, id_strategy)
values (16, 100, 99, 'RRR');
insert into TEST (id, amount, no_partition, id_strategy)
values (17, 200, 99, 'RRR');
insert into TEST (id, amount, no_partition, id_strategy)
values (18, 100, 99, 'QQQ');
insert into TEST (id, amount, no_partition, id_strategy)
values (19, 100, 99, 'QQQ');
commit;




The request is to update amount column, according with the following algorithm:

1. The rows will be grouping by no_partition,id_strategy.
2. We RE LOOKING FOR THE SECOND MAX VALUE
Ex: for ID_strategy = 'XXX' we keep amount=100 (id=1)
for ID_strategy = 'YYY' we keep amount=100 (id=4)
for ID_strategy = 'ZZZ' we keep amount=200 (id=8)
for ID_strategy = 'RRR' we keep amount=100 (id=15 and id=16)

3. If the second max value does not exist, we will keep the max value
Ex: for ID_strategy = 'WWW' we keep amount=100 (id=12 and id=13)
for ID_strategy = 'QQQ' we keep amount=100 (id=18 and id=19)

4. If we have multiple second max values, we will keep the min ID
Ex: for ID_strategy = 'RRR' we keep amount=100 (id=15)

5. If max amount or second max amount to be take in consideration, the amount must be positive(>0)

At the end the data must be updated like that:

Id AMAOUNT(before) AMAOUNT(after) NO_PRATITION ID_STRATEGY
1 100 100 99 XXX
2 200 0 99 XXX
3 0 0 99 YYY
4 100 100 99 YYY
5 200 0 99 YYY
6 0 0 99 ZZZ
7 100 100 99 ZZZ
8 200 0 99 ZZZ
9 300 0 99 ZZZ
10 0 0 99 WWW
11 0 0 99 WWW
12 100 100 99 WWW
13 100 0 99 WWW
14 0 0 99 RRR
15 100 100 99 RRR
16 100 0 99 RRR
17 200 0 99 RRR

I tried to create only one select for all the cases, but I don’t succeded.
I identified 3 cases:
1.Update only the lines where we can find 2 rows for each id_strategy(Ex: id_strategy = ‘QQQ’)

update test tt
              set tt.amount = 0
            where tt.id in
                  (select k.id              
                      from (select min(mm.id) over(partition by mm.id_strategy, mm.rk ) minu,
                                    mm.cnt_total,
                                    mm.cnt_total_rk,
                                    mm.rk,
                                    mm.id,
                                    mm.amount,
                                    mm.id_strategy
                               from (
                    select count(*) over(partition by yy.id_strategy ) cnt_total,
                           count(*) over(partition by yy.id_strategy,yy.amount order by yy.amount desc ) cnt_total_rk,
                           dense_rank() over(partition by yy.id_strategy order by yy.amount desc) rk,
                           amount,
                           max(amount) over(partition by yy.id_strategy) mx,
                           yy.id,
                           yy.id_strategy
                      from test yy
                     where yy.no_partition = 99
                     )
                     mm
                    ) k
                     where k.cnt_total = 2 
                           and k.cnt_total_rk=2
                           and k.amount>0
                           and k.minu != k.id)
              and tt.no_partition = 99;  


2. Update only the lines where I can find second max = 0 for each id_strategy(Ex: id_strategy = ‘WWW’)

update test tt
            set tt.amount = 0
          where tt.id in
                (select k.id
                   from (select dense_rank() over(partition by mm.id_strategy, mm.amount order by mm.id) rk2,
                                mm.amount vv,
                                mm.*
                           from (select max(amount) over(partition by yy.id_strategy) mx,
                                        yy.*
                                   from test yy
                                  where yy.no_partition = 99) mm) k
                  where k.rk2 != 1
                    and k.mx = k.vv
                    and k.id_strategy in
                        (select distinct id_strategy
                           from (select min(mm.amount) over(partition by mm.id_strategy, mm.rk ) as min_amount_lv2,
                                        mm.rk,
                                        mm.id_strategy
                                   from (select dense_rank() over(partition by yy.id_strategy order by yy.amount desc) rk,
                                                amount,
                                                max(amount) over(partition by yy.id_strategy) mx,
                                                yy.id,
                                                yy.id_strategy
                                           from test yy
                                          where yy.no_partition = 99
                                         ) mm) k
                          where k.rk = 2
                            and k.min_amount_lv2 = 0
                           
                         ))
            and tt.no_partition = 99;


3. Update all rows that have second max > 0 (Ex: id_strategy in XXX,YYY,ZZZ,RRR)
update test tt
           set tt.amount = 0
         where tt.id not in
               (with res as
                (select k.id
                   from (select min(mm.id) over(partition by mm.id_strategy, mm.rk ) minu,
                                 mm.*
                            from (select dense_rank() over(partition by yy.id_strategy order by yy.amount desc) rk,
                                          amount,
                                          max(amount) over(partition by yy.id_strategy ) mx,
                                          yy.id,
                                          yy.id_strategy
                                     from test yy
                                    where yy.no_partition = 99
                                 ) mm) k
                  where (k.minu = k.id and k.rk = 2 and k.amount > 0))
                 select res.id from res
                )
           and tt.id_strategy in
               (with res as
                (select k.id_strategy
                   from (select min(mm.id) over(partition by mm.id_strategy, mm.rk ) minu,
                                 mm.*
                            from (select dense_rank() over(partition by yy.id_strategy order by yy.amount desc) rk,
                                          amount,
                                          max(amount) over(partition by yy.id_strategy) mx,
                                          yy.id,
                                          yy.id_strategy
                                     from test yy
                                    where  yy.no_partition = 99
                                 ) mm) k
                  where (k.minu = k.id and k.rk = 2 and k.amount > 0))
                 select res.id_strategy from res
                )
           and tt.no_partition = 99;



My question is:
It is possible to update all this data in a single UPDATE clause?
If it is possible , please give me an advise.


Thanks a lot in advance,
Marian

and Chris said...

I'm not sure I understand exactly what you're looking for...

But if you want to set the amount to zero for all but the second row, sorted by amount desc, id for each no_partition & id_strategy you can use nth_value.

Use this to return the ID of the second row in the set. And update all those not in this:

update test
set    amount = 0
where  id not in ( 
  select * from (
    select nth_value ( id, 2 ) over ( 
             partition by no_partition, id_strategy
             order by amount desc, id
           ) second_max_id
    from   test t
  )
  where  second_max_id is not null
);

select * from test
order  by id;

ID   AMOUNT   NO_PARTITION   ID_STRATEGY   
   1      100             99 XXX           
   2        0             99 XXX           
   3        0             99 YYY           
   4      100             99 YYY           
   5        0             99 YYY           
   6        0             99 ZZZ           
   7        0             99 ZZZ           
   8      200             99 ZZZ           
   9        0             99 ZZZ           
  10        0             99 WWW           
  11        0             99 WWW           
  12        0             99 WWW           
  13      100             99 WWW           
  14        0             99 RRR           
  15      100             99 RRR           
  16        0             99 RRR           
  17        0             99 RRR           
  18        0             99 QQQ           
  19      100             99 QQQ 


I'm not sure this is exactly what you're looking for though; I'm not clear what's supposed to happen in these cases:

- Does it matter we've updated row 12 for WWW, not 13?
- If RRR has another row with amount = 200, are the rows with amount = 100 (15 & 16) still "second"?

Depending on the answers to these you may need to use some combination of rank/dense_rank/row_number instead. But the principle is the same - update all the rows that aren't "second" in this set.

Rating

  (1 rating)

Is this answer out of date? If it is, please let us know via a Comment

Comments

nth_values resolved my problem

Marian Sultanoiu, February 13, 2019 - 8:53 am UTC

Hello,

First of all, I want to thank you for the answer.
The function nth_values resolved my problem.
Regarding your questions:
- Does it matter we've updated row 12 for WWW, not 13?
If we have multiple amount values with the same max value, we must the use an additional condition, only the min id must remain as it is and the others must be set to 0), so for WWW min(id) = 12 and the line with Id=13 must be updated.
- If RRR has another row with amount = 200, are the rows with amount = 100 (15 & 16) still "second"?
Yes, we must take the values distinct, doesn’t matter how many rows we have with the same values.
The rows with amount 100 are the second, and from 2 rows only the min(id) must remains, in our case id=15 remains and id = 16 must be updated.

This is the query that returns rows that must remain, the others must be updated to 0:

select *
  from (select min(mm.id) over(partition by mm.id_strategy, mm.rk) minu,
               mm.*
          from (select nth_value(yy.id, 2) over(partition by yy.id_strategy, yy.amount) as vv,
                       dense_rank() over(partition by yy.id_strategy order by yy.amount desc) rk,
                       yy.id,
                       yy.amount,
                       yy.id_strategy,
                       count(*) over(partition by yy.id_strategy) cnt_total,
                       count(*) over(partition by yy.id_strategy, yy.amount order by yy.amount desc) cnt_total_rk --,
                  from test yy
                 where yy.no_partition = 99
                   and yy.amount > 0) mm) kyk

 where kyk.amount > 0
   and ((kyk.vv is null and kyk.cnt_total_rk > 1) 
       or
       (kyk.vv is not null and kyk.minu = kyk.id and kyk.cnt_total = 2) 
       or
       (kyk.vv is not null and kyk.minu = kyk.id and kyk.cnt_total > 2 and
       kyk.rk > 1) 
       or
       (kyk.vv is not null and kyk.minu = kyk.id and kyk.cnt_total > 2 and
       kyk.rk = 1 and kyk.cnt_total_rk > 1) 
       or
       (kyk.vv is null and kyk.rk = 2))
       
MINU VV  RK ID AMOUNT ID_STRATEGY CNT_TOTAL CNT_TOTAL_RK      
18   18  1  18  100   QQQ         2         2
15   15  2  15  100   RRR         3         2
12   13  1  12  100   WWW         2         2
1    2   1      100   XXX         2         1
4    2   4      100   YYY         2         1
8    2   8      200   ZZZ         3         1


Best regards,
Marian
Chris Saxon
February 13, 2019 - 11:16 am UTC

Great, glad this helped. Thanks for getting back to us.

More to Explore

Analytics

Analytic SQL got you confused? Check out Connor McDonald's complete video course.