Skip to Main Content

Breadcrumb

Question and Answer

Chris Saxon

Thanks for the question, Amine.

Asked: June 12, 2021 - 4:18 pm UTC

Last updated: June 22, 2021 - 1:13 pm UTC

Version: 11.2.0.3

Viewed 1000+ times

You Asked

Hi Gurus,

The schema is provided in the LiveSQL Link.

Employees of each department have to make two (02) propositions (prop1 and prop2) among four (04).
So prop1 and prop2 go from 1 to 4.
prop2 is optional.
The decided column is the decision of the HR department.
When the decision is set, we don t care about prop1 and prop2.

So, we have three (03) possible scenarios :
1- Every employee gets his first proposition (prop1) satisfied (the best scenario) ;
2- Some employees get one of those propositions (prop1 or prop2) satisfied ;
3- Some employees get none of their propositions satisfied, meaning that the HR department
decides to assign them a choice they didn't make just to make the repartition fair (the worst scenario).

Let's take an example for id_dpt = 1329.

SQL> select * from t where id_dpt = 1329 order by prop1, prop2;

   ID_DPT    ID_EMP     PROP1     PROP2   DECIDED
--------- --------- --------- --------- ---------
     1329      4107         1         2
     1329      4140         1         2
     1329      4087         1         2
     1329      4042         1         2
  
     1329      4154         2         1
     1329      4145         2         1
     1329      4083         2         3
     1329      4065         2         3
     1329      4057         2         3
     1329      4037         2         3
     1329      4006         2         3
     1329      4017         2         3
     1329      4150         2         3
     1329      4149         2         3
     1329      4135         2         3
     1329      4118         2         3
     1329      4111         2         3
  
     1329      4137         3         2
     1329      4092         3         2
     1329      4104         3         4
     1329      4094         3         4
     1329      4020         3         4
     1329      4127         3         4
     1329      4124         3         4
     1329      4152         3         4
  
     1329      4146         4         3
     1329      4102         4         3
     1329      4129         4         3
     1329      4143         4         3

29 ligne(s) sélectionnée(s).

SQL> 



As you can see, there is 29 employees in the department, and the repartition is not fair
-1 : 13.79% (4/29)
-2 : 44.8% (13/29)
-3 : 27.58% (8/29)
-4 : 13.79% (4/29)

Ideally we should have (29/4 = 7.25) employees for each choice.

So a possible scenario, could be :

     1329      4107         1         2
     1329      4140         1         2
     1329      4087         1         2
     1329      4042         1         2
     1329      4154         2         1
     1329      4145         2         1
  
     1329      4083         2         3
     1329      4065         2         3
     1329      4057         2         3
     1329      4037         2         3
     1329      4006         2         3
     1329      4017         2         3
     1329      4150         2         3
     1329      4149         2         3
  
     1329      4135         2         3
     1329      4118         2         3
     1329      4111         2         3
     1329      4137         3         2
     1329      4092         3         2
     1329      4104         3         4
     1329      4094         3         4
     1329      4020         3         4
  
     1329      4127         3         4
     1329      4124         3         4
     1329      4152         3         4
     1329      4146         4         3
     1329      4102         4         3
     1329      4129         4         3
     1329      4143         4         3


getting this repartition :
-1 : 20.68% (6/29)
-2 : 27.58% (8/29)
-3 : 27.58% (8/29)
-4 : 24.13% (7/29)

As you can see, we just applied the scenario 2 (Some employees get one of those propositions satisfied).

We can also go further, with scenario 3, by giving, for example, the employee 4083 the choice 1, just to get the repartition more fair :

-1 : 24.13% (7/29)
-2 : 24.13% (7/29)
-3 : 27.58% (8/29)
-4 : 24.13% (7/29)

But this is not the best scenario, because employee 4083 would be unsatisfied.

Back to the whole problem, here is the repartition according prop1.

(id_dpt in rows and prop1 in column)

        1      2       3      4      TOTAL
1329    4      13      8      4      29
---------------------------------
1330            1      3             4
---------------------------------
1361    1      21      18     4      44
---------------------------------
1362    1      2       8      1      12
---------------------------------
1363    1      5       7      2      15
---------------------------------
1364           1       2             3
---------------------------------
1365    2     10       3      2      17
---------------------------------
1407                   2      1      3
---------------------------------
1408           4       5             9
---------------------------------
1409    2      8       6      2      18
---------------------------------
1414                   1             1
---------------------------------
TOTAL   11     65      63     16     155
         7%    42%     41%    10%


As you can see, the repartition is not fair (7% 42% 41% 10%) according to prop1.

Now we want to make a repartition (at best 25% 25% 25% 25%) according to prop1, prop2 and the decided columns.
At worst, we can affect them a proposition they didn't make.

In the best case, the final result should looks like this :

        1      2       3      4      TOTAL
1329    ..     ..      ..     ..     29
---------------------------------
1330    ..     ..      ..     ..     4
---------------------------------
1361    ..     ..      ..     ..     44
---------------------------------
1362    ..     ..      ..     ..     12
---------------------------------
1363    ..     ..      ..     ..     15
---------------------------------
1364    ..     ..      ..     ..     3
---------------------------------
1365    ..     ..      ..     ..     17
---------------------------------
1407    ..     ..      ..     ..     3
---------------------------------
1408    ..     ..      ..     ..     9
---------------------------------
1409    ..     ..      ..     ..     18
---------------------------------
1414    ..     ..      ..     ..     1
---------------------------------
TOTAL   ..     ..      ..     ..     155
        25%    25%     25%    25%


Thanks in advance.

Amine




with LiveSQL Test Case:

and Chris said...

You need to be really explicit about how you define fair.

For example, is it better for two people to lose their second choice or one person to lose their first choice?

Next, you need to be careful about local vs global averages. As your example shows, any departments where the number of employees is not a multiple of four will not get an exact 25% split between the choices.

If you allocate all the extra choices to prop1, then 2, then 3, the overall distribution will be skewed to prop1. Even though each department is as close as it can get to 25%. e.g.:

        1      2       3      4      TOTAL
1329    2      1       1      1      5
---------------------------------
1330    2      1       1      1      5
---------------------------------
1361    2      1       1      1      5
---------------------------------
1362    2      1       1      1      5


Choice 1 has 40% of the employees, even though each individual department is as close to a 25% split as is possible.

But if you optimize the global split, it's possible for some departments to have their employees choices heavily skewed. e.g. this allocates 100% of the employees in each department to one choice, but the overall split is still 25%:

        1      2       3      4      TOTAL
1329    4      0       0      0      4
---------------------------------
1330    0      4       0      0      4
---------------------------------
1361    0      0       4      0      4
---------------------------------
1362    0      0       0      4      4


One solution to these problems is to get both the employees to rank their choices and HR to rank the employees for each choice.

You can then use an algorithm such as Gale–Shapley to match the employees to their choices.

https://en.wikipedia.org/wiki/Gale%E2%80%93Shapley_algorithm

This is an algorithmic method, meaning you (may) need many passes over the data to get the result. Doing this in SQL is... challenging.

Is this answer out of date? If it is, please let us know via a Comment

More to Explore

SQL

The Oracle documentation contains a complete SQL reference.