Hello,
I've got a tricky one and can't solve it. Hopefully you can point me in the right direction.
we have a table in which processes with their start and end time are listed. Here is a simplified example:
PROCESS | START_DATE | END_DATE
--------+------------+-----------
1 | 01.01.2018 | 05.01.2018
2 | 01.01.2018 | 03.01.2018
3 | 03.01.2018 | 10.01.2018
4 | 06.01.2018 | 10.01.2018
5 | 06.01.2018 | 08.01.2018
6 | 09.01.2018 | 14.01.2018
7 | 11.01.2018 | 12.01.2018
Here is the create script:
CREATE TABLE TEST_PROJ(
PROCESS_ID NUMBER
,START_DATE DATE
,END_DATE DATE
);
INSERT INTO TEST_PROJ(PROCESS_ID, START_DATE, END_DATE)
VALUES (1, TO_DATE('01.01.2018', 'DD.MM.YYYY'), TO_DATE('05.01.2018', 'DD.MM.YYYY'));
INSERT INTO TEST_PROJ(PROCESS_ID, START_DATE, END_DATE)
VALUES (2, TO_DATE('01.01.2018', 'DD.MM.YYYY'), TO_DATE('03.01.2018', 'DD.MM.YYYY'));
INSERT INTO TEST_PROJ(PROCESS_ID, START_DATE, END_DATE)
VALUES (3, TO_DATE('03.01.2018', 'DD.MM.YYYY'), TO_DATE('10.01.2018', 'DD.MM.YYYY'));
INSERT INTO TEST_PROJ(PROCESS_ID, START_DATE, END_DATE)
VALUES (4, TO_DATE('06.01.2018', 'DD.MM.YYYY'), TO_DATE('10.01.2018', 'DD.MM.YYYY'));
INSERT INTO TEST_PROJ(PROCESS_ID, START_DATE, END_DATE)
VALUES (5, TO_DATE('06.01.2018', 'DD.MM.YYYY'), TO_DATE('08.01.2018', 'DD.MM.YYYY'));
INSERT INTO TEST_PROJ(PROCESS_ID, START_DATE, END_DATE)
VALUES (6, TO_DATE('09.01.2018', 'DD.MM.YYYY'), TO_DATE('14.01.2018', 'DD.MM.YYYY'));
INSERT INTO TEST_PROJ(PROCESS_ID, START_DATE, END_DATE)
VALUES (7, TO_DATE('11.01.2018', 'DD.MM.YYYY'), TO_DATE('12.01.2018', 'DD.MM.YYYY'));
COMMIT;
I'd like to create a SQL in which the processes are listed beneath each other on a daily basis. The order of the positions is determined by its START_DATE, its duration DESCENDING, and at last by its PROCESS_ID. When a process has chosen its position, then it must keep it until it is finished even if a position slot in front of it becomes free. This free slot can then be occupied by an other process. Here is an example of the desired output:
DAY |Position_1 |Position_2 |Position_3 |Position_4 |Position_5
-----------+-----------+-----------+-----------+-----------+-----------
01.01.2018 | 1| 2| | |
02.01.2018 | 1| 2| | |
03.01.2018 | 1| 3| | |
04.01.2018 | 1| 3| | |
05.01.2018 | | 3| | |
06.01.2018 | 4| 3| 5| |
07.01.2018 | 4| 3| 5| |
08.01.2018 | 4| 3| | |
09.01.2018 | 4| 3| 6| |
10.01.2018 | | | 6| |
11.01.2018 | 7| | 6| |
12.01.2018 | | | 6| |
13.01.2018 | | | 6| |
The END_DATE of the process is not part of the productive time. If the END_DATE is '05.01.2018' the process is finished on '04.01.2018 23:59:59'.
What I've come up with is the following statement. The result seems to be fine at the first glance but if you look at day '06.01.2018' you'll notice that process 4 overwrites process 3. I understand why my SQL behaves in that way, but I have no idea how to get the desired output. I'll be glad for every hint how to tackle this problem.
WITH
DATES
AS
(SELECT DAY
FROM (SELECT TO_DATE('01.01.1901', 'DD.MM.YYYY') + ROWNUM - 1 AS DAY
FROM DUAL
CONNECT BY LEVEL <= TO_DATE('31.12.2999', 'DD.MM.YYYY') - TO_DATE('01.01.1901', 'DD.MM.YYYY') + 1)
WHERE DAY BETWEEN TO_DATE('01.01.2018', 'DD.MM.YYYY') AND TO_DATE('20.01.2018', 'DD.MM.YYYY')),
PROCESSES
AS
(SELECT DISTINCT PROCESS_ID
,END_DATE - START_DATE AS DURATION
,DAY
,START_DATE
,END_DATE
FROM TEST_PROJ, DATES
WHERE DATES.DAY BETWEEN TEST_PROJ.START_DATE AND TEST_PROJ.END_DATE),
PROC_RANKING
AS
(SELECT PROCESS_ID
,DURATION
,DAY
,START_DATE
,END_DATE
,RANK() OVER(PARTITION BY DAY ORDER BY START_DATE, DURATION DESC, PROCESS_ID) AS PROCESS_RANK
FROM PROCESSES
WHERE DAY <> END_DATE),
PROC_ORDER
AS
(SELECT DISTINCT PROC_RANKING.PROCESS_ID, MAX(PROC_RANKING.PROCESS_RANK) OVER (PARTITION BY PROC_RANKING.PROCESS_ID) AS PROCES_POSITION
FROM PROC_RANKING),
TOGETHER
AS
(SELECT PROCESSES.PROCESS_ID, PROCESSES.DAY, PROC_ORDER.PROCES_POSITION
FROM PROCESSES, PROC_ORDER
WHERE PROCESSES.PROCESS_ID = PROC_ORDER.PROCESS_ID AND PROCESSES.DAY <> PROCESSES.END_DATE)
(SELECT *
FROM TOGETHER
PIVOT
(MAX(TOGETHER.PROCESS_ID)
FOR PROCES_POSITION
IN (1 AS "Position_1"
,2 AS "Position_2"
,3 AS "Position_3"
,4 AS "Position_4"
,5 AS "Position_5")))
ORDER BY DAY;
The basic problem here is that the current position of a process is taken from its position on the previous day. If it existed.
The PROC_RANKING step only considers relative positions on the current day.
What you need is a recursive operation. After the first day, this assigns processes to slots by:
- First filling the slots for the processes that existed yesterday & today
- Then filling the empty slots based on the "ideal" rank for that day
If there is a way to do this in pure SQL, I can't find it!
But the following PL/SQL procedure will do it. Pass it comma-separated strings of the previous assignment and the "ideal" current assignment.
The function splits these into comma-separated arrays. One of these is indexed by the processes for the current day. You can then loop through the previous day's array to see if that process exists on the current day.
If it does, assign it to the previous day's slot and remove it from the list of current processes to add.
Once you've complete this loop, you can loop through the assigned slots. For any that are empty, take the first process from the current slots (if there is one). Then remove it from the current list.
Finally convert the actual assignments back to a comma separated list and return that.
Which gives:
create or replace function f ( prev_seq varchar2, curr_seq varchar2 )
return varchar2 as
prev_vals_tab dbms_sql.number_table;
curr_vals_tab dbms_sql.number_table;
curr_inds_tab dbms_sql.number_table;
final_vals_tab dbms_sql.number_table;
ith_curr_value pls_integer;
ith_prev_value pls_integer;
final_seq varchar2(30);
begin
-- parse strings into tables
for i in 1 .. 5 loop
if i = 1 then
ith_curr_value := substr ( curr_seq, 1, instr(curr_seq, ',') - 1 );
ith_prev_value := substr ( prev_seq, 1, instr(prev_seq, ',') - 1 );
elsif i = 5 then
ith_curr_value := substr ( curr_seq, instr(curr_seq, ',', 1, 4) + 1);
ith_prev_value := substr ( prev_seq, instr(prev_seq, ',', 1, 4) + 1);
else
ith_curr_value := substr ( curr_seq, instr(curr_seq, ',', 1, i-1) + 1, instr(curr_seq, ',', 1, i) - instr(curr_seq, ',', 1, i-1) - 1 );
ith_prev_value := substr ( prev_seq, instr(prev_seq, ',', 1, i-1) + 1, instr(prev_seq, ',', 1, i) - instr(prev_seq, ',', 1, i-1) - 1 );
end if;
prev_vals_tab(i) := ith_prev_value;
curr_vals_tab(i) := ith_curr_value;
if ith_curr_value is not null then
curr_inds_tab( ith_curr_value ) := i ;
end if;
end loop;
/* If a process is in the current and previous days
- assign it to the same slot it had in the previous day
& remove from the current list
*/
for i in 1 .. 5 loop
if curr_inds_tab.exists(prev_vals_tab(i)) then
final_vals_tab(i) := prev_vals_tab(i);
curr_vals_tab.delete(curr_inds_tab(prev_vals_tab(i)));
end if;
end loop;
/* For the unassigned slots:
- Take the first value from the current list (and remove it)
*/
for i in 1 .. 5 loop
if not final_vals_tab.exists(i) then
final_vals_tab(i) := curr_vals_tab(curr_vals_tab.first);
curr_vals_tab.delete(curr_vals_tab.first);
end if;
end loop;
-- convert array to comma separated list
for i in 1 .. 5 loop
if final_vals_tab.exists(i) then
final_seq := final_seq || final_vals_tab(i);
end if;
if i < 5 then
final_seq := final_seq || ',' ;
end if;
end loop;
return final_seq ;
end f;
/
With this in place, it's time for the SQL!
Generate all the desired days and slots. Assign processes to their "ideal" slot (not considering previous assignment).
Then collapse this down to a single row for each day with listagg to get a comma separated list of processes each day.
You can now start recursing through this. For each day, pass the previous day's process assignments and the current "ideal" arrangement.
The function will spit out the correct order. Keep going until you've generated all the days you need:
WITH DATES AS (
SELECT TO_DATE('01.01.2018', 'DD.MM.YYYY') + ROWNUM - 1 AS DAY,
TO_DATE('01.01.2018', 'DD.MM.YYYY') first_day
FROM DUAL
CONNECT BY LEVEL <= TO_DATE('20.01.2018', 'DD.MM.YYYY') - TO_DATE('01.01.2018', 'DD.MM.YYYY') + 1
), slots as (
select level slot_no
from dual connect by level <= 5
), date_slots as (
select * from dates
cross join slots
), proc_days as (
select dt.day, tp.process_id,
row_number() over (partition by dt.day order by tp.process_id) rn
from dates dt
left join test_proj tp
on dt.day >= tp.start_date
and dt.day < tp.end_date
), proc_slots as (
select ds.day, ds.first_day,
replace(listagg(coalesce(to_char(pd.process_id), '##'), ',') within group (order by ds.slot_no), '##') slots
from date_slots ds
left join proc_days pd
on ds.day = pd.day
and ds.slot_no = pd.rn
group by ds.day, ds.first_day
), day_orders (dy, curr_slots, prev_slots) as (
select ps.day dy, ps.slots, ',,,,'
from proc_slots ps
where ps.day = ps.first_day
union all
select ps.day,
f(do.curr_slots, ps.slots ), do.curr_slots
from day_orders do
join proc_slots ps
on do.dy + 1 = ps.day
)
select dy, curr_slots
from day_orders
order by dy;
DY CURR_SLOTS
01-JAN-2018 00:00:00 1,2,,,
02-JAN-2018 00:00:00 1,2,,,
03-JAN-2018 00:00:00 1,3,,,
04-JAN-2018 00:00:00 1,3,,,
05-JAN-2018 00:00:00 ,3,,,
06-JAN-2018 00:00:00 4,3,5,,
07-JAN-2018 00:00:00 4,3,5,,
08-JAN-2018 00:00:00 4,3,,,
09-JAN-2018 00:00:00 4,3,6,,
10-JAN-2018 00:00:00 ,,6,,
11-JAN-2018 00:00:00 7,,6,,
12-JAN-2018 00:00:00 ,,6,,
13-JAN-2018 00:00:00 ,,6,,
14-JAN-2018 00:00:00 ,,,,
15-JAN-2018 00:00:00 ,,,,
16-JAN-2018 00:00:00 ,,,,
17-JAN-2018 00:00:00 ,,,,
18-JAN-2018 00:00:00 ,,,,
19-JAN-2018 00:00:00 ,,,,
20-JAN-2018 00:00:00 ,,,,
Phew! That was tough! ;)
All you have to do now is split the comma-separated strings out into columns :)