Skip to Main Content
  • Questions
  • Recursive CTE getting VALUES from another table

Breadcrumb

Question and Answer

Chris Saxon

Thanks for the question, Paul .

Asked: August 20, 2021 - 9:26 am UTC

Last updated: August 25, 2021 - 9:49 am UTC

Version: 19c

Viewed 1000+ times

You Asked

I have a recursive CTE, which generates start and END dates for multiple dates, which is working as expected.

It will stop creating rows before the count(*) of the locations table is exceeded.

I'm looking for a way to assign a location_id to each row in the recursive CTE without having a duplicate location_id.


There is no commonality between the start/end dates and the locations so a JOIN didn't seem possible to me.

In the sample code below I tried a cross apply, which gave me the same location_id for each group generated by the recursive code so that isn't working either and that's how I left the test case.

I'm using the PIPELINED function to simulate multiple dates but I'm not necessarily tied to that idea. If the solution requires processing one DATE at a time.

My goal is to populate the schedule table below by passing in a schedule_id, range of dates and a number, which represents the time the schedule should begin.
In my test CASE that is all hard coded as I wanted to keep the test CASE as simple as possible. You can assume the schedule_id =1 or any number you wish.


Once I get this logic working I want to use a MERGE statement to INSERT rows so there will not be duplicate rows for a schedule if the process is run more than once with the same values.

The PRIMARY KEY being schedule_id, location_id, base_date

My test CASE is below, which I can't seem to get working. Thanks in advance for your expertise and patience.


Current Result 

LOCATION_ID    BASE_DATE    START_DATE    END_DATE
1    08212021 00:00:00    08212021 23:16:00    08212021 23:21:00
1    08212021 00:00:00    08212021 23:26:00    08212021 23:31:00
1    08212021 00:00:00    08212021 23:36:00    08212021 23:41:00
…
…

Expected Result

LOCATION_ID    BASE_DATE    START_DATE    END_DATE
1    08212021 00:00:00    08212021 23:16:00    08212021 23:21:00
2    08212021 00:00:00    08212021 23:26:00    08212021 23:31:00 
3    08212021 00:00:00    08212021 23:36:00    08212021 23:41:00
…
...


ALTER SESSION SET NLS_DATE_FORMAT = 'MMDDYYYY HH24:MI:SS';

CREATE OR REPLACE TYPE nt_date IS TABLE OF DATE;
/

    CREATE OR REPLACE FUNCTION generate_dates_pipelined(
      p_from IN DATE,
      p_to   IN DATE
    )
    RETURN nt_date  PIPELINED   DETERMINISTIC
    IS
      v_start DATE := TRUNC(LEAST(p_from, p_to));
      v_end   DATE := TRUNC(GREATEST(p_from, p_to));
    BEGIN
     LOOP
        PIPE ROW (v_start);
        EXIT WHEN v_start >= v_end;
        v_start := v_start + INTERVAL '1' DAY;
      END LOOP;
      RETURN;
    END       generate_dates_pipelined;
/

CREATE TABLE locations AS
    SELECT level AS location_id,
       'Door ' || level AS location_name,

    CASE round(dbms_random.value(1,3)) 
            WHEN 1 THEN 'A' 
            WHEN 2 THEN 'T' 
            WHEN 3 THEN 'T' 
         END AS location_type

    FROM   dual
    CONNECT BY level <= 20;

create table schedule(
       schedule_id NUMBER(4),
       location_id number(4),
       base_date DATE,
       start_date DATE,
       end_date DATE,
         CONSTRAINT start_min check (start_date=trunc(start_date,'MI')),   
       CONSTRAINT end_min check (end_date=trunc(end_date,'MI')),
 CONSTRAINT end_gt_start CHECK (end_date >= start_date),
CONSTRAINT same_day CHECK (TRUNC(end_date) = TRUNC(start_date))
      );


WITH input  (base_date,start_time) AS (
SELECT 
COLUMN_VALUE,
COLUMN_VALUE+
NUMTODSINTERVAL(83760, 'SECOND')
  FROM   TABLE(generate_dates_pipelined(DATE '2021-08-21',DATE '2021-08-30')) 
)
SELECT
     location_id,
     base_date, start_time + (LEVEL-1) * INTERVAL '10' MINUTE
         AS start_date,
       start_time + (LEVEL-1) * INTERVAL '10' MINUTE + INTERVAL '5' MINUTE
         AS end_date 
FROM   input i
CROSS APPLY  (SELECT location_id from locations)
CONNECT BY (LEVEL-1) * INTERVAL '10' MINUTE < INTERVAL '1' DAY
AND    LEVEL <= (SELECT COUNT(*) FROM locations)
AND    start_time + (LEVEL-1) * INTERVAL '10' MINUTE < TRUNC(start_time) + INTERVAL '1' DAY;


and Chris said...

I'm looking for a way to assign a location_id to each row in the recursive CTE without having a duplicate location_id


What exactly do you mean here?

Each location can appear only once in the results? Or every other location must appear before you reuse the same location?

Either way, you've created the locations table with values from 1..N with no gaps. So it looks to me like you can assign row numbers to your date generator and join these to the locations.

If you want to cycle through the locations, take the modulus of the date row numbers first. Use however many rows there are in the locations table for the modulus.

As you're on 19c, here's an example using a SQL macro to generate the dates and cycle through the locations:

create or replace function generate_days (   
  start_date date, end_date date,  
  day_increment integer default 1  
)   
  return clob sql_macro as  
  stmt clob;  
begin  
    
  stmt := 'select start_date  
           + ( level - 1 ) * day_increment as dt  
  from   dual  
  connect by level <= (  
    ( ( end_date - start_date ) + day_increment 
    ) / day_increment  
  )';  
  
  return stmt;  
    
end generate_days;  
/

select * from (
  select dt, 
         mod ( row_number () over ( order by dt ), 20 ) + 1 rn
  from   generate_days ( date '2021-08-21',date '2021-08-30', 10/1440 )
) join locations
on     rn = location_id
order  by dt
fetch first 23 rows only;

DT                         RN    LOCATION_ID LOCATION_NAME    LOCATION_TYPE   
21-AUG-2021 00:00:00        2              2 Door 2           T                
21-AUG-2021 00:10:00        3              3 Door 3           T                
... <snip> ...
21-AUG-2021 02:50:00       19             19 Door 19          T                
21-AUG-2021 03:00:00       20             20 Door 20          T                
21-AUG-2021 03:10:00        1              1 Door 1           T                
21-AUG-2021 03:20:00        2              2 Door 2           T                
21-AUG-2021 03:30:00        3              3 Door 3           T                
21-AUG-2021 03:40:00        4              4 Door 4           T


Read more about date generation & macros at https://blogs.oracle.com/sql/how-to-generate-days-weeks-or-months-between-two-dates-in-oracle-database

PS - the example query is just a plain CTE, it's not recursive. To be recursive there must be a union all branch, joining the CTE to the source tables.

Rating

  (1 rating)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Paul, August 24, 2021 - 5:19 pm UTC

While I know about SQL macros I'm unsure why you changed my function. Was there some reason why that wouldn't work in this situation? If do, can you please elaborate as I'm not understanding your reason.

My objective is to assign a unique location_id to each range of start and END times once I run out of location_id for the start and END time the same ranges will be used for the next day. I am trying to create a schedule for N number of days in one shot. Note how my example starts at some offset, has a time range then a gap to start the next range.

My response has code that does this for one day only. I want many days.

CREATE OR REPLACE TYPE nt_date IS TABLE OF DATE;
/

CREATE OR REPLACE FUNCTION generate_dates_pipelined(
  p_from IN DATE,
  p_to   IN DATE
)
  RETURN nt_date PIPELINED DETERMINISTIC
IS
  v_start DATE := TRUNC(LEAST(p_from, p_to));
  v_end   DATE := TRUNC(GREATEST(p_from, p_to));
BEGIN
  LOOP
    PIPE ROW (v_start);
    EXIT WHEN v_start >= v_end;
    v_start := v_start + INTERVAL '1' DAY;
  END LOOP;
  RETURN;
END generate_dates_pipelined;
/

CREATE OR REPLACE FUNCTION CONVERT_TO_SECONDS( 
  i_date_string IN VARCHAR2 
)
RETURN INTEGER DETERMINISTIC
AS
BEGIN
  RETURN ( TO_DATE(i_date_string, 'HH24:MI:SS')
         - TO_DATE('00:00:00', 'HH24:MI:SS')
         ) * 86400;
END;
/  

CREATE TABLE locations AS
    SELECT level AS location_id,
       'Door ' || level AS location_name,

    CASE round(dbms_random.value(1,3)) 
            WHEN 1 THEN 'G' 
            WHEN 2 THEN 'G' 
            WHEN 3 THEN 'G' 
         END AS location_type

    FROM   dual
    CONNECT BY level <= 5;

     ALTER TABLE locations 
         ADD ( CONSTRAINT locations_pk
       PRIMARY KEY (location_id));

create table schedule(
      seq_num NUMBER  GENERATED BY DEFAULT AS IDENTITY (START WITH 1) NOT NULL,
       schedule_id NUMBER(4),
       location_id number(4),
       base_date DATE,
       start_date DATE,
       end_date DATE,
          constraint schedule_pk primary key (schedule_id, location_id, base_date),
         CONSTRAINT start_min check (start_date=trunc(start_date,'MI')),   
       CONSTRAINT end_min check (end_date=trunc(end_date,'MI')),
 CONSTRAINT end_gt_start CHECK (end_date >= start_date)
      );


CREATE OR REPLACE PROCEDURE CREATE_SCHEDULE(
  i_schedule_id IN PLS_INTEGER,
  i_base_date   IN DATE,
  i_offset      IN PLS_INTEGER DEFAULT 0, 
  i_incr        IN PLS_INTEGER DEFAULT 10,
  i_duration    IN PLS_INTEGER DEFAULT 5
)
AS 
  l_offset   interval day to second;
  l_incr     interval day to second;
  l_duration interval day to second;
BEGIN
  l_offset   := NUMTODSINTERVAL(i_offset, 'SECOND') ;
  l_incr     := NUMTODSINTERVAL(i_incr, 'MINUTE') ;
  l_duration := NUMTODSINTERVAL(i_duration, 'MINUTE') ;

  MERGE INTO schedule dst
  USING (
    SELECT   i_schedule_id AS schedule_id,
             l.location_id,
             i_base_date AS base_date,
             i_base_date + l_offset + (l_incr * (ROWNUM - 1))
               AS start_date,
             i_base_date + l_offset + (l_incr * (ROWNUM - 1)) + l_duration
               AS end_date
    FROM     locations l
    where  location_id in ( 
      select location_id
      from   locations
      where  location_type = 'G'
    ) 

  ) src
  ON (   src.schedule_id = dst.schedule_id
     AND src.location_id = dst.location_id
     AND src.base_date   = dst.base_date
  )
  WHEN NOT MATCHED THEN
    INSERT (
      schedule_id,
      location_id,
      base_date,
      start_date,
      end_date
    ) VALUES (
      src.schedule_id,
      src.location_id,
      src.base_date,
      src.start_date,
      src.end_date
    );
END;
/


EXEC CREATE_SCHEDULE(1,TRUNC(SYSDATE),CONVERT_TO_SECONDS('16:00:00'));
/


Chris Saxon
August 25, 2021 - 9:49 am UTC

Pipelined table functions do work - I just think SQL macros are better for this task. The database resolves the macro at parse time, so at runtime you're executing pure SQL. This will usually be faster.

If you want to use each location once for each day, you could change the row_number expression to start at one again each day like this:

row_number () over ( 
  partition by trunc ( dt ) 
  order by dt 
)

More to Explore

Analytics

Analytic SQL got you confused? Check out Connor McDonald's complete video course.