Hi Gurus,
The schema is provided in the LiveSQL Link.
Employees of each department have to make two (02) propositions (prop1 and prop2) among four (04).
So prop1 and prop2 go from 1 to 4.
prop2 is optional.
The decided column is the decision of the HR department.
When the decision is set, we don t care about prop1 and prop2.
So, we have three (03) possible scenarios :
1- Every employee gets his first proposition (prop1) satisfied (the best scenario) ;
2- Some employees get one of those propositions (prop1 or prop2) satisfied ;
3- Some employees get none of their propositions satisfied, meaning that the HR department
decides to assign them a choice they didn't make just to make the repartition fair (the worst scenario).
Let's take an example for id_dpt = 1329.
SQL> select * from t where id_dpt = 1329 order by prop1, prop2;
ID_DPT ID_EMP PROP1 PROP2 DECIDED
--------- --------- --------- --------- ---------
1329 4107 1 2
1329 4140 1 2
1329 4087 1 2
1329 4042 1 2
1329 4154 2 1
1329 4145 2 1
1329 4083 2 3
1329 4065 2 3
1329 4057 2 3
1329 4037 2 3
1329 4006 2 3
1329 4017 2 3
1329 4150 2 3
1329 4149 2 3
1329 4135 2 3
1329 4118 2 3
1329 4111 2 3
1329 4137 3 2
1329 4092 3 2
1329 4104 3 4
1329 4094 3 4
1329 4020 3 4
1329 4127 3 4
1329 4124 3 4
1329 4152 3 4
1329 4146 4 3
1329 4102 4 3
1329 4129 4 3
1329 4143 4 3
29 ligne(s) sélectionnée(s).
SQL>
As you can see, there is 29 employees in the department, and the repartition is not fair
-1 : 13.79% (4/29)
-2 : 44.8% (13/29)
-3 : 27.58% (8/29)
-4 : 13.79% (4/29)
Ideally we should have (29/4 = 7.25) employees for each choice.
So a possible scenario, could be :
1329 4107 1 2
1329 4140 1 2
1329 4087 1 2
1329 4042 1 2
1329 4154 2 1
1329 4145 2 1
1329 4083 2 3
1329 4065 2 3
1329 4057 2 3
1329 4037 2 3
1329 4006 2 3
1329 4017 2 3
1329 4150 2 3
1329 4149 2 3
1329 4135 2 3
1329 4118 2 3
1329 4111 2 3
1329 4137 3 2
1329 4092 3 2
1329 4104 3 4
1329 4094 3 4
1329 4020 3 4
1329 4127 3 4
1329 4124 3 4
1329 4152 3 4
1329 4146 4 3
1329 4102 4 3
1329 4129 4 3
1329 4143 4 3
getting this repartition :
-1 : 20.68% (6/29)
-2 : 27.58% (8/29)
-3 : 27.58% (8/29)
-4 : 24.13% (7/29)
As you can see, we just applied the scenario 2 (Some employees get one of those propositions satisfied).
We can also go further, with scenario 3, by giving, for example, the employee 4083 the choice 1, just to get the repartition more fair :
-1 : 24.13% (7/29)
-2 : 24.13% (7/29)
-3 : 27.58% (8/29)
-4 : 24.13% (7/29)
But this is not the best scenario, because employee 4083 would be unsatisfied.
Back to the whole problem, here is the repartition according prop1.
(id_dpt in rows and prop1 in column)
1 2 3 4 TOTAL
1329 4 13 8 4 29
---------------------------------
1330 1 3 4
---------------------------------
1361 1 21 18 4 44
---------------------------------
1362 1 2 8 1 12
---------------------------------
1363 1 5 7 2 15
---------------------------------
1364 1 2 3
---------------------------------
1365 2 10 3 2 17
---------------------------------
1407 2 1 3
---------------------------------
1408 4 5 9
---------------------------------
1409 2 8 6 2 18
---------------------------------
1414 1 1
---------------------------------
TOTAL 11 65 63 16 155
7% 42% 41% 10%
As you can see, the repartition is not fair (7% 42% 41% 10%) according to prop1.
Now we want to make a repartition (at best 25% 25% 25% 25%) according to prop1, prop2 and the decided columns.
At worst, we can affect them a proposition they didn't make.
In the best case, the final result should looks like this :
1 2 3 4 TOTAL
1329 .. .. .. .. 29
---------------------------------
1330 .. .. .. .. 4
---------------------------------
1361 .. .. .. .. 44
---------------------------------
1362 .. .. .. .. 12
---------------------------------
1363 .. .. .. .. 15
---------------------------------
1364 .. .. .. .. 3
---------------------------------
1365 .. .. .. .. 17
---------------------------------
1407 .. .. .. .. 3
---------------------------------
1408 .. .. .. .. 9
---------------------------------
1409 .. .. .. .. 18
---------------------------------
1414 .. .. .. .. 1
---------------------------------
TOTAL .. .. .. .. 155
25% 25% 25% 25%
Thanks in advance.
Amine
You need to be really
explicit about how you define
fair.
For example, is it better for two people to lose their second choice or one person to lose their first choice?
Next, you need to be careful about local vs global averages. As your example shows, any departments where the number of employees is not a multiple of four will not get an exact 25% split between the choices.
If you allocate all the extra choices to prop1, then 2, then 3, the overall distribution will be skewed to prop1. Even though each department is as close as it can get to 25%. e.g.:
1 2 3 4 TOTAL
1329 2 1 1 1 5
---------------------------------
1330 2 1 1 1 5
---------------------------------
1361 2 1 1 1 5
---------------------------------
1362 2 1 1 1 5
Choice 1 has 40% of the employees, even though each individual department is as close to a 25% split as is possible.
But if you optimize the global split, it's possible for some departments to have their employees choices heavily skewed. e.g. this allocates 100% of the employees in each department to one choice, but the overall split is still 25%:
1 2 3 4 TOTAL
1329 4 0 0 0 4
---------------------------------
1330 0 4 0 0 4
---------------------------------
1361 0 0 4 0 4
---------------------------------
1362 0 0 0 4 4
One solution to these problems is to get both the employees to rank their choices and HR to rank the employees for each choice.
You can then use an algorithm such as Gale–Shapley to match the employees to their choices.
https://en.wikipedia.org/wiki/Gale%E2%80%93Shapley_algorithm This is an algorithmic method, meaning you (may) need many passes over the data to get the result. Doing this in SQL is... challenging.