Question and Answer

Tom Kyte

Thanks for the question, Alexander.

Asked: June 06, 2005 - 5:13 pm UTC

Last updated: June 13, 2005 - 11:54 am UTC

Version: 9.1

Viewed 10K+ times

You Asked

Tom,

given a table structure

create table time_on_task(begin_dt date not null primary key, end_dt date not null);

how would you enforce a condition that intervals do not overlap?
The way I'm doing it now doesn't seem very simple:

create table time_on_task(
  begin_dt     date not null primary key,
  end_dt       date not null,
  num_overlaps number(1) default(0) check(num_overlaps = 0)
);

create or replace trigger time_on_task_ins
before insert on time_on_task
for each row
begin
update time_on_task set num_overlaps = num_overlaps + 1
where (
(:new.begin_dt between begin_dt and end_dt)
or
(:new.end_dt between begin_dt and end_dt)
or
(begin_dt between :new.begin_dt and :new.end_dt)
);
end;
/

set linesize 120
insert into time_on_task(begin_dt, end_dt) values('12-dec-2003', '15-dec-2003')
1 row inserted


insert into time_on_task(begin_dt, end_dt) values('20-dec-2003', '21-dec-2003')
1 row inserted


insert into time_on_task(begin_dt, end_dt) values('5-dec-2003', '7-dec-2003')
1 row inserted


--should fail
insert into time_on_task(begin_dt, end_dt) values('1-dec-2003', '5-dec-2003')
ORA-02290: check constraint (STARSDIRECT.SYS_C00644968) violated
ORA-06512: at "STARSDIRECT.TIME_ON_TASK_INS", line 2
ORA-04088: error during execution of trigger 'STARSDIRECT.TIME_ON_TASK_INS'
insert into time_on_task(begin_dt, end_dt) values('1-dec-2003', '6-dec-2003')
ORA-02290: check constraint (STARSDIRECT.SYS_C00644968) violated
ORA-06512: at "STARSDIRECT.TIME_ON_TASK_INS", line 2
ORA-04088: error during execution of trigger 'STARSDIRECT.TIME_ON_TASK_INS'
insert into time_on_task(begin_dt, end_dt) values('1-dec-2003', '7-dec-2003')
ORA-02290: check constraint (STARSDIRECT.SYS_C00644968) violated
ORA-06512: at "STARSDIRECT.TIME_ON_TASK_INS", line 2
ORA-04088: error during execution of trigger 'STARSDIRECT.TIME_ON_TASK_INS'
insert into time_on_task(begin_dt, end_dt) values('1-dec-2003', '8-dec-2003')
ORA-02290: check constraint (STARSDIRECT.SYS_C00644968) violated
ORA-06512: at "STARSDIRECT.TIME_ON_TASK_INS", line 2
ORA-04088: error during execution of trigger 'STARSDIRECT.TIME_ON_TASK_INS'
insert into time_on_task(begin_dt, end_dt) values('5-dec-2003', '7-dec-2003')
ORA-02290: check constraint (STARSDIRECT.SYS_C00644968) violated
ORA-06512: at "STARSDIRECT.TIME_ON_TASK_INS", line 2
ORA-04088: error during execution of trigger 'STARSDIRECT.TIME_ON_TASK_INS'
insert into time_on_task(begin_dt, end_dt) values('20-dec-2003', '21-dec-2003')
ORA-02290: check constraint (STARSDIRECT.SYS_C00644968) violated
ORA-06512: at "STARSDIRECT.TIME_ON_TASK_INS", line 2
ORA-04088: error during execution of trigger 'STARSDIRECT.TIME_ON_TASK_INS'
insert into time_on_task(begin_dt, end_dt) values('20-dec-2003', '22-dec-2003')
ORA-02290: check constraint (STARSDIRECT.SYS_C00644968) violated
ORA-06512: at "STARSDIRECT.TIME_ON_TASK_INS", line 2
ORA-04088: error during execution of trigger 'STARSDIRECT.TIME_ON_TASK_INS'
insert into time_on_task(begin_dt, end_dt) values('21-dec-2003', '22-dec-2003')
ORA-02290: check constraint (STARSDIRECT.SYS_C00644968) violated
ORA-06512: at "STARSDIRECT.TIME_ON_TASK_INS", line 2
ORA-04088: error during execution of trigger 'STARSDIRECT.TIME_ON_TASK_INS'

I'd rather not use a trigger, since it can be disabled/dropped, and it does not guarantee the integrity of existing data.

Creating a materialized view just to enforce this constraint seems like overkill.

How to do it better?




and we said...

see
http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:42170695313022#42220587905650

*any* constraint can be disabled.
*any* constraint can be violated.
*any* one.

(Correctly coded, multi-user) triggers are as "safe" as any constraint for protecting data.

If you are worried about triggers being disabled, you must use methods to prevent that from happening.

If you are worried about constraints being disabled, you must use the same methods to prevent that from happening.

Neither is harder to get around than the other.


But your trigger suffers from the fact that in a multi-user environment, it quite simply will "not work".

I insert A and B
You insert A and B

the triggers fire in our respective sessions, but you cannot see me, and I cannot see you.

Therefore, neither of us believes the other has overlapping data.

And data integrity is gone.

Here is an excerpt from the 2nd edition of Expert One-on-One Oracle that I'm working on. It describes the multi-user issue and shows the technique *I* would use: a stored procedure, no modifications directly to the table itself, don't GRANT on the table. The logic could be placed into a trigger if you want, but I wouldn't.

<quote>

One of the side-effects of Oracle's 'non-blocking' approach is that if you actually want to ensure that no more than one user has access to a row at once, then you, the developer, need to do a little work yourself. Consider the following example. A developer was demonstrating to me a resource-scheduling program (for conference rooms, projectors, etc.) that he had just developed and was in the process of deploying. The application implemented a business rule to prevent the allocation of a resource to more than one person, for any given period of time. That is, the application contained code that specifically checked that no other user had previously allocated the time slot (at least, the developer thought it did). This code queried the schedules table and, if no rows existed that overlapped that time slot, inserted the new row. So, the developer was basically concerned with two tables:

create table resources ( resource_name varchar2(25) primary key, ... );

create table schedules
( resource_name references resources,
  start_time    date not null,
  end_time      date not null,
  check (start_time < end_time),
  primary key (resource_name, start_time)
);

And, before making, say, a room reservation, the application would query:

select count(*)
  from schedules
 where resource_name = :room_name
   and (start_time <= :new_end_time)
   and (end_time >= :new_start_time)

It looked simple and bullet-proof (to the developer anyway); if the count came back zero, the room was yours. If it came back non-zero, you could not reserve it for that period. Once I knew what his logic was, I set up a very simple test to show him the error that would occur when the application went live. An error that would be incredibly hard to track down and diagnose after the fact - one would be convinced it must be a database bug.
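The count(*) query above relies on the standard pairwise interval-overlap test: two closed intervals overlap exactly when each one starts no later than the other ends. A minimal Python sketch (the helper names are mine, not from the book) confirming the predicate against a brute-force check over small integer intervals:

```python
def overlaps(s1, e1, s2, e2):
    # The WHERE clause's test: intervals [s1,e1] and [s2,e2] touch or
    # cross iff each starts before (or when) the other ends.
    return s1 <= e2 and s2 <= e1

def overlaps_brute(s1, e1, s2, e2):
    # Brute force: do the inclusive integer ranges share any point?
    return bool(set(range(s1, e1 + 1)) & set(range(s2, e2 + 1)))

# Exhaustively agree on all small inclusive intervals.
for s1 in range(5):
    for e1 in range(s1, 5):
        for s2 in range(5):
            for e2 in range(s2, 5):
                assert overlaps(s1, e1, s2, e2) == overlaps_brute(s1, e1, s2, e2)
```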

All I did was get someone else to use the terminal next to him. They both navigated to the same screen and, on the count of three, each hit the Go button and tried to reserve the same room for about the same time - one from 3pm to 4pm and the other from 3:30pm to 4:00pm. Both people got the reservation - the logic, which worked perfectly in isolation, failed in a multi-user environment. The problem in this case was caused in part by Oracle's non-blocking reads. Neither session ever blocked the other session. Both sessions simply ran the above query and then performed the logic to schedule the room. They could both run the query to look for a reservation, even if the other session had already started to modify the schedules table (the change wouldn't be visible to the other session until commit, by which time it was too late). Since they were never attempting to modify the same row in the schedules table, they would never block each other and, thus, the business rule could not enforce what it was intended to enforce.
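The failure mode is easy to reproduce outside the database. This sketch (plain Python, not Oracle, with hypothetical names) mimics two sessions that each read a private snapshot of the schedules table at transaction start, find no conflict, and insert. Both bookings succeed:

```python
def overlaps(a, b):
    # Intervals as (start, end) pairs, hours as decimals.
    return a[0] <= b[1] and b[0] <= a[1]

committed = []  # the shared SCHEDULES table

def session_books(snapshot, slot):
    # Check-then-insert against the session's stale snapshot: the other
    # session's uncommitted insert is invisible here.
    if not any(overlaps(slot, row) for row in snapshot):
        committed.append(slot)  # both sessions reach this line
        return True
    return False

snap1 = list(committed)                 # session 1 begins, takes its snapshot
snap2 = list(committed)                 # session 2 begins, takes its snapshot
ok1 = session_books(snap1, (15.0, 16.0))   # 3pm to 4pm
ok2 = session_books(snap2, (15.5, 16.0))   # 3:30pm to 4pm

# Both "reservations" succeed; the business rule is silently violated.
assert ok1 and ok2 and len(committed) == 2
```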

The developer needed a method of enforcing the business rule in a multi-user environment, a way to ensure that exactly one person at a time made a reservation on a given resource. In this case, the solution was to impose a little serialization of his own. What we will do is lock the parent row in the RESOURCES table prior to making a modification in the SCHEDULES table. That way, all modifications to the SCHEDULES table for a given RESOURCE_NAME value will be done one at a time. That is, to reserve a block of time for the resource named "X", we will lock the single row in the RESOURCES table for "X" and then modify the SCHEDULES table. So, in addition to performing the count(*) above, the developer must first:

select * from resources where resource_name = :room_name FOR UPDATE;

What we did here was to lock the resource (the room) to be scheduled immediately before scheduling it, in other words before we query the Schedules table for that resource. By locking the resource we are trying to schedule, we have ensured that no one else is modifying the schedule for this resource simultaneously. They must wait until we commit our transaction - at which point, they would be able to see our schedule. The chance of overlapping schedules is removed. The developer must understand that, in the multi-user environment, they must at times employ techniques similar to those used in multi-threaded programming. The FOR UPDATE clause is working like a semaphore in this case. It serializes access to the RESOURCES table for that particular row - ensuring no two people can schedule it simultaneously. My suggestion was to make this logic a "transaction API", that is, bundle all of the logic into a stored procedure and only permit applications to modify the data via this API. The code could look like this:

create or replace procedure schedule_resource
( p_resource_name in varchar2,
  p_start_time    in date,
  p_end_time      in date
)
as
  l_resource_name resources.resource_name%type;
  l_cnt           number;
begin

We start by locking the single row in the resource table for the resource you want to schedule. If anyone else has this row locked, we will block and wait for it:

  select resource_name into l_resource_name
    from resources
   where resource_name = p_resource_name
     FOR UPDATE;

Now that we are the only ones inserting into the SCHEDULES table for this resource name, it is safe to query it:

  select count(*)
    into l_cnt
    from schedules
   where resource_name = p_resource_name
     and (start_time <= p_end_time)
     and (end_time >= p_start_time);

  if ( l_cnt <> 0 )
  then
      raise_application_error
      (-20001, 'Room is already booked!' );
  end if;

Now if we get here in the code without raising an error, we can safely insert rows for our resource into the schedule table without any overlaps:

  insert into schedules
  ( resource_name, start_time, end_time )
  values
  ( p_resource_name, p_start_time, p_end_time );
end schedule_resource;

This is still highly concurrent as there are potentially thousands of resources to be reserved - what we have done is ensure that only one person modifies a resource at any time. This is a rare case where the manual locking of data you are not going to actually update is called for. You need to be able to recognize where you need to do this and, perhaps as importantly, where not to (I have an example of when not to below). Additionally, this does not lock the resource from other people reading the data as it might in other databases, hence this will scale very well.
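The FOR UPDATE technique maps onto a familiar concurrency idiom: a per-resource mutex taken before the check-then-insert. A minimal sketch of the same idea in Python (a threading.Lock standing in for the locked parent row; names are illustrative, not from the book):

```python
import threading
from collections import defaultdict

resource_locks = defaultdict(threading.Lock)  # one "parent row" lock per resource
schedules = []                                # rows of (resource, start, end)

def schedule_resource(resource, start, end):
    # Equivalent of SELECT ... FOR UPDATE on the parent row: schedulers
    # of this one resource are serialized; other resources proceed
    # concurrently, just as in the stored procedure.
    with resource_locks[resource]:
        clash = any(r == resource and s <= end and start <= e
                    for r, s, e in schedules)
        if clash:
            raise ValueError('Room is already booked!')
        schedules.append((resource, start, end))

schedule_resource('room-a', 15.0, 16.0)
try:
    schedule_resource('room-a', 15.5, 16.0)  # overlaps: rejected
except ValueError:
    pass
schedule_resource('room-b', 15.5, 16.0)      # different resource: fine
```

Under the lock, the check and the insert are one atomic step, which is exactly what the snapshot-read version of the code lacked.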

Issues such as the above have massive implications when attempting to port an application from database to database (I return to this theme a little later in the chapter), and this trips people up time and time again. For example, if you are experienced in other databases, where writers block readers and vice versa then you may have grown reliant on that fact to protect you from data integrity issues. The lack of concurrency is one way to protect yourself from this - that is how it works in many non-Oracle databases. In Oracle, concurrency rules supreme and you must be aware that, as a result, things will happen differently (or suffer the consequences).

</quote>

Now, if you have "no master table", you'll need to use the LOCK TABLE command in order to serialize inserts/modifications into this table.


Comments

Most helpful, thanks

Alexander, June 07, 2005 - 2:08 pm UTC

Tom,
this was most useful, excellent explanations, thank you.

One thing I wanted to discuss more. You have said that

*any* constraint can be disabled.
*any* constraint can be violated.
*any* one.

I think I can easily detect if any constraint has been violated:

select * from user_constraints
where (validated <> 'VALIDATED' or status = 'DISABLED')
and owner <> 'SYSTEM'

If I go for stored procedures/triggers to enforce my business rules, I might need to monitor data integrity on case-by-case basis, running at least 1 select per business rule, like this one, for the rule about no overlaps:

select * from time_on_task t1 join time_on_task t2
on t1.begin_dt <> t2.begin_dt  -- a row always "overlaps" itself; exclude that
and (
    (t1.begin_dt between t2.begin_dt and t2.end_dt)
    or
    (t1.end_dt between t2.begin_dt and t2.end_dt)
    or
    (t2.begin_dt between t1.begin_dt and t1.end_dt)
)
-- expected result: no rows

So using a constraint seems a little preferable.
What do you think?


Tom Kyte
June 07, 2005 - 2:21 pm UTC

That just shows constraints that may or may not have been violated. My point was: if you consider bypassing a trigger to be a fatal "do not use a trigger" reason, then the same possibility exists for a constraint.


When you CAN use a constraint, by all means DO use a constraint. NEVER write code where declarative integrity can work.

HOWEVER, when you have a rule that cannot be enforced by a constraint, such as overlap detection, use the tools available.


Definitely, given the choice between "enforce rule via constraint or via trigger", it would always be constraint. However, if constraint cannot "work" -- then we have to look at other options.

I prefer "do not permit direct access to table, use procedure, use API" myself in that case.

why does ...

A reader, June 07, 2005 - 4:27 pm UTC

nice thread.

however, I don't understand why Alexander's
original approach does not raise a
mutating table issue.

Could you please explain (if not too off-topic) ?

Tom Kyte
June 07, 2005 - 6:02 pm UTC

well, it will under the right circumstances.

the server knows with "insert into t (...) VALUES (....)" that there is precisely one row, and hence the mutating issue (where a trigger would see an operation part way through) won't exist.

ops$tkyte@ORA9IR2> insert into time_on_task(begin_dt, end_dt) values('12-dec-2003', '15-dec-2003');
 
1 row created.
 
ops$tkyte@ORA9IR2> rollback;
 
Rollback complete.
 
ops$tkyte@ORA9IR2> insert into time_on_task(begin_dt, end_dt) SELECT '12-dec-2003', '15-dec-2003' from dual;
insert into time_on_task(begin_dt, end_dt) SELECT '12-dec-2003', '15-dec-2003' from dual
               *
ERROR at line 1:
ORA-04091: table OPS$TKYTE.TIME_ON_TASK is mutating, trigger/function may not
see it
ORA-06512: at "OPS$TKYTE.TIME_ON_TASK_INS", line 2
ORA-04088: error during execution of trigger 'OPS$TKYTE.TIME_ON_TASK_INS'
 


So, the first time, it knew "only one row, no problem", the second time it sees "select ...." and says -- hey, could be 0, 1 or more rows -- table is mutating, no luck. 

great

Matthias Rogel, June 08, 2005 - 2:19 am UTC

thanks for explanation

Helena Marková, June 08, 2005 - 8:30 am UTC


great explanations

Alexander, June 08, 2005 - 10:38 am UTC

Tom, thanks again for great explanations.

let us consider another approach, using the table from my first post:

create materialized view log on time_on_task with sequence, rowid, 
primary key (end_dt) including new values;

create materialized view detect_overlap 
build immediate refresh complete on commit
as
select a.begin_dt, a.end_dt, count(*) num_overlaps
from time_on_task a, time_on_task b where
((a.begin_dt between b.begin_dt and b.end_dt)
 or  
 (a.end_dt between b.begin_dt and b.end_dt)
 or
 (b.begin_dt between a.begin_dt and a.end_dt)
)
group by a.begin_dt, a.end_dt

alter table detect_overlap add constraint no_overlaps check(num_overlaps< 2);

this seems to work in multi-user environment just fine:

connection 1:
insert into time_on_task(begin_dt, end_dt) values('30-nov-2003', '2-dec-2003')
1 row inserted

connection 2:
SQL> insert into time_on_task(begin_dt, end_dt) values('1-dec-2003', '3-dec-2003');

1 row created.

connection 2:
SQL> commit;

Commit complete.

connection 1:
commit
ORA-12008: error in materialized view refresh path
ORA-02290: check constraint (STARSDIRECT.NO_OVERLAPS) violated

What do you think?
I'm wondering if (and if yes, then how) this constraint could be bypassed.

 

Tom Kyte
June 08, 2005 - 11:27 am UTC

You really want to do a non-equi join of a big table with itself every time a commit is done??????


That will not scale beyond a handful of rows. It is why I went with the trigger.
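To see why the refresh cost explodes: with no equality predicate, the MV's defining query must compare every row against every other row on each refresh, while the serialized trigger/procedure approach checks only the one new interval against its index neighbours. A minimal in-memory model (my own illustration, not Oracle internals) of that asymmetry:

```python
from bisect import bisect_left, insort

# Existing, non-overlapping intervals kept sorted by start (the PK index).
intervals = []  # list of (start, end)

def insert_interval(start, end):
    # Indexed probe: since existing intervals are pairwise disjoint and
    # sorted, only the neighbours of the insertion point can overlap.
    i = bisect_left(intervals, (start, end))
    for j in (i - 1, i):
        if 0 <= j < len(intervals):
            s, e = intervals[j]
            if s <= end and start <= e:
                raise ValueError('overlap')
    insort(intervals, (start, end))

def mv_refresh_check():
    # What the ON COMMIT materialized view effectively does: a full
    # non-equi self-join, n*n comparisons every time.
    return sum(1 for a in intervals for b in intervals
               if a[0] <= b[1] and b[0] <= a[1])

for k in range(100):
    insert_interval(3 * k, 3 * k + 1)  # 100 cheap indexed inserts
# Each row overlaps only itself, but the self-join made 100*100 comparisons.
assert mv_refresh_check() == 100
```

Per insert, the indexed check touches a couple of rows; the self-join revisits all n-squared pairs, which is the "handful of rows" scaling wall.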


incremental refresh

Mikito Harakiri, June 08, 2005 - 12:52 pm UTC

The trick is to make the MV incrementally refreshable. Then the update penalty is small.

Tom Kyte
June 08, 2005 - 1:10 pm UTC

do it....

not everything can be done, we (you and I) just had this conversation on another thread. You couldn't do it there either.

I considered and rejected the materialized view for this as well.

What is the big deal?

Mikito Harakiri, June 08, 2005 - 3:03 pm UTC

create table intervals (
    tail integer unique,
    head integer
);

create materialized view log on intervals 
WITH ROWID INCLUDING NEW VALUES;

CREATE MATERIALIZED VIEW tail_mv 
REFRESH FAST ON COMMIT as
select i1.rowid rid1,  
       i2.rowid rid2 
from intervals i1, intervals i2
where i1.tail > i2.tail and i1.tail <= i2.head
;

ALTER TABLE tail_mv
ADD CONSTRAINT ck_tail_mv CHECK(rid1 is null);

Add symmetric MV for the i1.head.

SQL> insert into intervals values (1,2);

1 row created.

SQL> insert into intervals values (3,10);

1 row created.

SQL> commit;

Commit complete.

SQL> insert into intervals values (5,15);

1 row created.

SQL> commit;
commit
*
ERROR at line 1:
ORA-12008: error in materialized view refresh path
ORA-02290: check constraint (SCOTT.CK_TAIL_MV) violated 

Tom Kyte
June 08, 2005 - 5:23 pm UTC

now, make it work in 9i, for that is the original poster's version.

you have to work with the software they have, not the software you have. When presented with a question, there are lots of variables -- this won't work in 9i.

so, "the trick" as you say -- is to develop a solution that works in 9i.

oh, and scale it up too, that is relevant no? what happens with 10,000 rows in there (did you try it out?)

take it all of the way, all of the steps.




why are you saying it does not scale well?

Alexander, June 08, 2005 - 5:05 pm UTC

Tom,

I did some benchmarking:
-------- setup half million rows --------------
create table resources as select rownum resource_id from all_objects where rownum<1001;

create unique index i_resources on resources(resource_id);

create table schedules(resource_id number not null,
start_time date not null, end_time date not null, primary key(resource_id, start_time)
);
insert /*+ append */ into schedules
select r1.resource_id as resource_id,
trunc(sysdate) + r2.resource_id*2 as start_time,
trunc(sysdate) + r2.resource_id*2 + 1 as end_time
from resources r1 cross join resources r2
where r2.resource_id < 501;

--------- 500K rows in schedules table -----

create unique index i_schedules on schedules(resource_id, start_time, end_time);
/
------- I used your stored procedure ----------
create or replace procedure schedule_resource
( p_resource_id in number,
p_start_time in date,
p_end_time in date
)
as
l_resource_id resources.resource_id%type;
l_cnt number;
begin

select resource_id into l_resource_id
from resources
where resource_id = p_resource_id
FOR UPDATE;

select count(*)
into l_cnt
from schedules
where resource_id = p_resource_id
and (start_time <= p_end_time)
and (end_time >= p_start_time);
if ( l_cnt <> 0 )
then
raise_application_error
(-20001, 'Room is already booked!' );
end if;

insert into schedules
( resource_id, start_time, end_time )
values
( p_resource_id, p_start_time, p_end_time );
end schedule_resource;
/
---------- this sp inserts 5K rows ---------
create or replace procedure schedule_1k_rows
as
begin
for r in(select * from resources)
loop
schedule_resource(r.resource_id, '12-dec-1993', '15-dec-1993');
schedule_resource(r.resource_id, '12-dec-1994', '15-dec-1994');
schedule_resource(r.resource_id, '12-dec-1995', '15-dec-1995');
schedule_resource(r.resource_id, '12-dec-1996', '15-dec-1996');
schedule_resource(r.resource_id, '12-dec-1997', '15-dec-1997');
end loop;
end schedule_1k_rows;
/
select sysdate from dual;
call schedule_1k_rows();
select sysdate from dual;
rollback;

select sysdate from dual
SYSDATE
---------------------
6/8/2005 1:54:18 PM
1 row selected


call schedule_1k_rows()
Method called

select sysdate from dual
SYSDATE
---------------------
6/8/2005 1:54:30 PM
1 row selected
-------------- it inserted 5K rows in 12 seconds -

---------- setting up a materialized view ---------

create materialized view log on schedules with sequence, rowid,
primary key (end_time) including new values;

create materialized view detect_overlap
build immediate refresh fast on commit
as
select a.resource_id, a.start_time, a.end_time, count(*) num_overlaps
from schedules a, schedules b
where (a.resource_id = b.resource_id)
and (a.start_time <= b.end_time)
and (a.end_time >= b.start_time)
group by a.resource_id, a.start_time, a.end_time;

alter table detect_overlap add constraint no_overlaps check(num_overlaps< 2);
create unique index ui_detect_overlap on detect_overlap(resource_id, start_time, end_time);

---------- this sp inserts 5K rows ---------

create or replace procedure schedule_more_rows
as
begin
for r in(select * from resources)
loop
insert into schedules values(r.resource_id, '12-dec-1993', '15-dec-1993');
insert into schedules values(r.resource_id, '12-dec-1994', '15-dec-1994');
insert into schedules values(r.resource_id, '12-dec-1995', '15-dec-1995');
insert into schedules values(r.resource_id, '12-dec-1996', '15-dec-1996');
insert into schedules values(r.resource_id, '12-dec-1997', '15-dec-1997');
end loop;
end schedule_more_rows;
/
select sysdate from dual;
call schedule_more_rows();
select sysdate from dual;
rollback;

select sysdate from dual
SYSDATE
---------------------
6/8/2005 2:16:09 PM
1 row selected


call schedule_more_rows()
Method called

select sysdate from dual
SYSDATE
---------------------
6/8/2005 2:16:15 PM
1 row selected


commit
Commit complete

select sysdate from dual
SYSDATE
---------------------
6/8/2005 2:16:24 PM
1 row selected

-------------- it inserted 5K rows in 6 seconds -
---------- then verified at commit in 10 seconds
----------- 16 seconds total
----------- the stored procedure approach took 12 seconds

so I tested both approaches against a table with 500K rows, and I don't see either the stored procedure approach or the materialized view one being dramatically faster.

What am I missing?
What do you think?

Tom Kyte
June 08, 2005 - 10:14 pm UTC

That does look OK. Confirmed that it works in 9i, and it runs pretty well.

something _is_ missing

Gabe, June 09, 2005 - 4:05 pm UTC

Was the point of that MV-based solution to show:
1. that it is possible?
or/and
2. that it is better? … and why?

Even with the test shown it is clear it is hardly better: it took longer, there is at least twice as much disk space required (the MV has as many rows as the ‘schedules’ table + the indexes + the MV Log) … so the I/O will definitely be higher.

In any case, regarding <quote>What am I missing?</quote> …

Major point (I believe) … for both implementations the transactions have to be kept short for concurrency reasons. Hence committing as soon as possible is desired (actually required) … as they are, the tests don’t [even] resemble a real life scenario. Once we start committing we start seeing some differences:

Now, I kept the general constructs of the original tests ... except:
* I have two different (but otherwise identical) sets of objects for each test
* I added COMMIT in the procedures right after inserting the schedule
* I reduced the tests from 5000 to 100 inserts (I wasn’t prepared to wait that long!)

The bottom line:

--
-- FOR UPDATE based solution (commit 100 inserts one by one):
--
flip@FLOP> exec schedule_100_rows;

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.05

--
-- MV-based solution (commit 100 inserts one by one):
--
flip@FLOP> exec schedule_100_rows2;

PL/SQL procedure successfully completed.

Elapsed: 00:08:26.02

--
-- MV-based solution (just commit once for all 100 inserts):
--
flip@FLOP> exec schedule_100_rows2

PL/SQL procedure successfully completed.

Elapsed: 00:00:08.00

So, is the MV-based solution possible? Yes … recommended? No.

------------------------------------------------------------------------------
The complete sequence of steps:

flip@FLOP> alter session set nls_date_format='dd-mon-yyyy';

Session altered.

Elapsed: 00:00:00.00
flip@FLOP> create table resources as
2 select rownum resource_id from all_objects where rownum<1001;

Table created.

Elapsed: 00:00:00.05
flip@FLOP>
flip@FLOP> create unique index i_resources on resources(resource_id);

Index created.

Elapsed: 00:00:00.00
flip@FLOP>
flip@FLOP> create table schedules
2 ( resource_id number not null
3 ,start_time date not null
4 ,end_time date not null
5 ,primary key(resource_id, start_time)
6 );

Table created.

Elapsed: 00:00:00.00
flip@FLOP> insert /*+ append */ into schedules
2 select r1.resource_id as resource_id,
3 trunc(sysdate) + r2.resource_id*2 as start_time,
4 trunc(sysdate) + r2.resource_id*2 + 1 as end_time
5 from resources r1 cross join resources r2
6 where r2.resource_id < 501;

500000 rows created.

Elapsed: 00:00:14.07
flip@FLOP>
flip@FLOP> create unique index i_schedules on schedules(resource_id, start_time, end_time);

Index created.

Elapsed: 00:00:08.08
flip@FLOP>
flip@FLOP> analyze table resources compute statistics for table for all indexes for all columns;

Table analyzed.

Elapsed: 00:00:00.02
flip@FLOP> analyze table schedules compute statistics for table for all indexes for all columns;

Table analyzed.

Elapsed: 00:00:22.05
flip@FLOP>
flip@FLOP> create or replace procedure schedule_resource
2 ( p_resource_id in number,
3 p_start_time in date,
4 p_end_time in date
5 )
6 as
7 l_resource_id resources.resource_id%type;
8 l_cnt number;
9 begin
10
11 select resource_id into l_resource_id
12 from resources
13 where resource_id = p_resource_id
14 FOR UPDATE;
15
16 select count(*)
17 into l_cnt
18 from schedules
19 where resource_id = p_resource_id
20 and (start_time <= p_end_time)
21 and (end_time >= p_start_time);
22
23 if ( l_cnt <> 0 )
24 then
25 raise_application_error
26 (-20001, 'Room is already booked!' );
27 end if;
28
29 insert into schedules
30 ( resource_id, start_time, end_time )
31 values
32 ( p_resource_id, p_start_time, p_end_time );
33
34 commit;
35 end schedule_resource;
36 /

Procedure created.

Elapsed: 00:00:00.01
flip@FLOP>
flip@FLOP> create or replace procedure schedule_100_rows
2 as
3 begin
4 for r in(select * from resources where resource_id < 21)
5 loop
6 schedule_resource(r.resource_id, '12-dec-1993', '15-dec-1993');
7 schedule_resource(r.resource_id, '12-dec-1994', '15-dec-1994');
8 schedule_resource(r.resource_id, '12-dec-1995', '15-dec-1995');
9 schedule_resource(r.resource_id, '12-dec-1996', '15-dec-1996');
10 schedule_resource(r.resource_id, '12-dec-1997', '15-dec-1997');
11 end loop;
12 end;
13 /

Procedure created.

Elapsed: 00:00:00.00
flip@FLOP> create table resources2 as
2 select rownum resource_id from all_objects where rownum<1001;

Table created.

Elapsed: 00:00:00.05
flip@FLOP>
flip@FLOP> create unique index i_resources2 on resources2(resource_id);

Index created.

Elapsed: 00:00:00.00
flip@FLOP>
flip@FLOP> create table schedules2
2 ( resource_id number not null
3 ,start_time date not null
4 ,end_time date not null
5 ,primary key(resource_id, start_time)
6 );

Table created.

Elapsed: 00:00:00.00
flip@FLOP>
flip@FLOP> insert /*+ append */ into schedules2
2 select r1.resource_id as resource_id,
3 trunc(sysdate) + r2.resource_id*2 as start_time,
4 trunc(sysdate) + r2.resource_id*2 + 1 as end_time
5 from resources2 r1 cross join resources2 r2
6 where r2.resource_id < 501;

500000 rows created.

Elapsed: 00:00:14.05
flip@FLOP>
flip@FLOP> create unique index i_schedules2 on schedules2(resource_id, start_time, end_time);

Index created.

Elapsed: 00:00:11.00
flip@FLOP>
flip@FLOP> analyze table resources2 compute statistics for table for all indexes for all columns;

Table analyzed.

Elapsed: 00:00:00.01
flip@FLOP> analyze table schedules2 compute statistics for table for all indexes for all columns;

Table analyzed.

Elapsed: 00:00:28.02
flip@FLOP>
flip@FLOP> create materialized view log on schedules2 with sequence, rowid,
2 primary key (end_time) including new values;

Materialized view log created.

Elapsed: 00:00:00.06
flip@FLOP>
flip@FLOP> create materialized view detect_overlap
2 build immediate refresh fast on commit
3 as
4 select a.resource_id, a.start_time, a.end_time, count(*) num_overlaps
5 from schedules2 a, schedules2 b
6 where (a.resource_id = b.resource_id)
7 and (a.start_time <= b.end_time)
8 and (a.end_time >= b.start_time)
9 group by a.resource_id, a.start_time, a.end_time;

Materialized view created.

Elapsed: 00:02:52.01
flip@FLOP>
flip@FLOP> alter table detect_overlap add constraint no_overlaps check(num_overlaps< 2);

Table altered.

Elapsed: 00:00:05.03
flip@FLOP>
flip@FLOP> create unique index ui_detect_overlap on detect_overlap(resource_id, start_time, end_time);

Index created.

Elapsed: 00:00:15.03
flip@FLOP>
flip@FLOP> analyze table detect_overlap compute statistics for table for all indexes for all columns;

Table analyzed.

Elapsed: 00:00:49.07
flip@FLOP>
flip@FLOP> create or replace procedure schedule_resource2
2 ( p_resource_id in number,
3 p_start_time in date,
4 p_end_time in date
5 )
6 as
7 begin
8 insert into schedules2
9 ( resource_id, start_time, end_time )
10 values
11 ( p_resource_id, p_start_time, p_end_time );
12
13 commit;
14 end;
15 /

Procedure created.

Elapsed: 00:00:00.01
flip@FLOP>
flip@FLOP> create or replace procedure schedule_100_rows2
2 as
3 begin
4 for r in(select * from resources2 where resource_id < 21)
5 loop
6 schedule_resource2 (r.resource_id, '12-dec-1993', '15-dec-1993');
7 schedule_resource2 (r.resource_id, '12-dec-1994', '15-dec-1994');
8 schedule_resource2 (r.resource_id, '12-dec-1995', '15-dec-1995');
9 schedule_resource2 (r.resource_id, '12-dec-1996', '15-dec-1996');
10 schedule_resource2 (r.resource_id, '12-dec-1997', '15-dec-1997');
11 end loop;
12 end;
13 /

Procedure created.

Elapsed: 00:00:00.00
flip@FLOP> exec schedule_100_rows;

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.05
flip@FLOP> exec schedule_100_rows2;

PL/SQL procedure successfully completed.

Elapsed: 00:08:26.02

------------------------------------------------------------------------------
Just to show what happens when all 100 rows are processed within one transaction (just one commit to trigger the MV refresh), I’ll redo the MV-based test after taking the commit out of ‘schedule_resource2’ and putting it after the loop in ‘schedule_100_rows2’.
-----------------------------------------------------------------------------

flip@FLOP>
flip@FLOP> drop materialized view detect_overlap;

Materialized view dropped.

Elapsed: 00:00:00.05
flip@FLOP> drop table schedules2;

Table dropped.

Elapsed: 00:00:00.03
flip@FLOP> drop table resources2;

Table dropped.

Elapsed: 00:00:00.00
flip@FLOP> create table resources2 as
2 select rownum resource_id from all_objects where rownum<1001;

Table created.

Elapsed: 00:00:00.05
flip@FLOP>
flip@FLOP> create unique index i_resources2 on resources2(resource_id);

Index created.

Elapsed: 00:00:00.00
flip@FLOP>
flip@FLOP> create table schedules2
2 ( resource_id number not null
3 ,start_time date not null
4 ,end_time date not null
5 ,primary key(resource_id, start_time)
6 );

Table created.

Elapsed: 00:00:00.00
flip@FLOP>
flip@FLOP> insert /*+ append */ into schedules2
2 select r1.resource_id as resource_id,
3 trunc(sysdate) + r2.resource_id*2 as start_time,
4 trunc(sysdate) + r2.resource_id*2 + 1 as end_time
5 from resources2 r1 cross join resources2 r2
6 where r2.resource_id < 501;

500000 rows created.

Elapsed: 00:00:15.05
flip@FLOP>
flip@FLOP> create unique index i_schedules2 on schedules2(resource_id, start_time, end_time);

Index created.

Elapsed: 00:00:11.05
flip@FLOP>
flip@FLOP> analyze table resources2 compute statistics for table for all indexes for all columns;

Table analyzed.

Elapsed: 00:00:00.01
flip@FLOP> analyze table schedules2 compute statistics for table for all indexes for all columns;

Table analyzed.

Elapsed: 00:00:28.09
flip@FLOP>
flip@FLOP> create materialized view log on schedules2 with sequence, rowid,
2 primary key (end_time) including new values;

Materialized view log created.

Elapsed: 00:00:00.01
flip@FLOP>
flip@FLOP> create materialized view detect_overlap
2 build immediate refresh fast on commit
3 as
4 select a.resource_id, a.start_time, a.end_time, count(*) num_overlaps
5 from schedules2 a, schedules2 b
6 where (a.resource_id = b.resource_id)
7 and (a.start_time <= b.end_time)
8 and (a.end_time >= b.start_time)
9 group by a.resource_id, a.start_time, a.end_time;

Materialized view created.

Elapsed: 00:02:54.05
flip@FLOP>
flip@FLOP> alter table detect_overlap add constraint no_overlaps check(num_overlaps< 2);

Table altered.

Elapsed: 00:00:05.03
flip@FLOP>
flip@FLOP> create unique index ui_detect_overlap on detect_overlap(resource_id, start_time, end_time);

Index created.

Elapsed: 00:00:15.00
flip@FLOP>
flip@FLOP> analyze table detect_overlap compute statistics for table for all indexes for all columns;

Table analyzed.

Elapsed: 00:00:50.06
flip@FLOP>
flip@FLOP> create or replace procedure schedule_resource2
2 ( p_resource_id in number,
3 p_start_time in date,
4 p_end_time in date
5 )
6 as
7 begin
8 insert into schedules2
9 ( resource_id, start_time, end_time )
10 values
11 ( p_resource_id, p_start_time, p_end_time );
12
13 -- commit;
14 end;
15 /

Procedure created.

Elapsed: 00:00:00.01
flip@FLOP>
flip@FLOP> create or replace procedure schedule_100_rows2
2 as
3 begin
4 for r in(select * from resources2 where resource_id < 21)
5 loop
6 schedule_resource2 (r.resource_id, '12-dec-1993', '15-dec-1993');
7 schedule_resource2 (r.resource_id, '12-dec-1994', '15-dec-1994');
8 schedule_resource2 (r.resource_id, '12-dec-1995', '15-dec-1995');
9 schedule_resource2 (r.resource_id, '12-dec-1996', '15-dec-1996');
10 schedule_resource2 (r.resource_id, '12-dec-1997', '15-dec-1997');
11 end loop;
12 commit;
13 end;
14 /

Procedure created.

Elapsed: 00:00:00.00
flip@FLOP> exec schedule_100_rows2

PL/SQL procedure successfully completed.

Elapsed: 00:00:08.00
flip@FLOP>


yes, committing after every insert makes a big difference

Alexander, June 10, 2005 - 11:26 am UTC

I repeated the benchmarks committing after every insert, and yes there is quite a difference:
5k rows were inserted in 15 seconds via the stored procedure
5k rows were inserted in 200 seconds using the MV approach

That is quite a difference, roughly 13 times slower, though a little less pronounced than in your benchmarks

thank you for a very good point

One question for Tom at the end ...

Gabe, June 10, 2005 - 3:47 pm UTC

Alexander:

Thank you for following up. Yes … the 8.5 minutes I got for 100 inserts did appear monstrously slow … since you confirmed 200 seconds for 5k rows, I took it upon myself to investigate it a bit further.

The only [relevant] thing I did over your test case was to run those ANALYZE commands on resources2, schedules2 and detect_overlap … and it turns out analyzing the detect_overlap MV makes all the difference.

Oracle uses two MERGE statements to refresh the MV (not clear why two, but anyway) and one of them (the one actually merging the 100 rows) had a hint added
/*+ CARDINALITY(50000) */ which throws DETECT_OVERLAP into a full table scan (in our case one FT for each insert/merge). I took the hint out and explained the MERGE statement … not surprisingly the FT was out and the INDEX UNIQUE SCAN was in.

I re-ran the entire test case without the ANALYZEs and tkprof confirmed the FTs were out … the hint was still there, but at /*+ CARDINALITY(1000) */ it was small enough not to throw DETECT_OVERLAP into FTs. Overall it took 12 seconds for 100 rows … the kind of numbers you observed.

Hence, the conclusion remains the same (an MV-based implementation is _heavier_). There still is the issue of why Oracle comes up with those hints (50K cardinality out of 500K rows?) … this seems to be a case where having statistics makes things worse … unless?????

Tom … would you have any explanation regarding the generated MERGE statements for MV refreshes? My tests were on 9.2.0.1.0.

Cheers.


Tom Kyte
June 10, 2005 - 4:06 pm UTC

No, I don't -- 10% would seem a high estimate of the number of rows modified I would think.

Don't understand

Mikito Harakiri, June 10, 2005 - 5:06 pm UTC

Why does the MV take storage space in this example? Isn't the MV supposed to be empty? ("All the intervals that overlap", and normally there are none.)

Tom Kyte
June 10, 2005 - 5:15 pm UTC

create materialized view detect_overlap
build immediate refresh fast on commit
as
select a.resource_id, a.start_time, a.end_time, count(*) num_overlaps
from schedules a, schedules b
where (a.resource_id = b.resource_id)
and (a.start_time <= b.end_time)
and (a.end_time >= b.start_time)
group by a.resource_id, a.start_time, a.end_time;

No, it isn't. num_overlaps must be < 2. count(*) will be 1 normally.
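The point is easy to miss: in the MV's self-join, every row matches itself. A minimal Python sketch of the same counting logic (hypothetical data, not the Oracle refresh machinery) shows why a clean schedule yields num_overlaps = 1 and an overlapping one yields 2 or more:

```python
# Sketch of the MV's self-join counting; illustrative only.
def num_overlaps(schedules):
    """For each (resource_id, start, end) row, count rows of the same
    resource satisfying a.start <= b.end and a.end >= b.start.
    Every row matches itself, so a non-overlapping schedule yields 1."""
    return {
        a: sum(1 for b in schedules
               if a[0] == b[0] and a[1] <= b[2] and a[2] >= b[1])
        for a in schedules
    }

clean = [(1, 10, 11), (1, 20, 21)]   # disjoint intervals
dirty = [(1, 10, 15), (1, 12, 20)]   # overlapping intervals

assert all(c == 1 for c in num_overlaps(clean).values())
assert any(c >= 2 for c in num_overlaps(dirty).values())
```

Hence the check constraint must allow exactly one match (num_overlaps < 2), and the MV is never empty.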

What about

Mikito Harakiri, June 10, 2005 - 5:45 pm UTC

create materialized view detect_overlap
build immediate refresh fast on commit
as
select a.resource_id
from schedules a, schedules b
where (a.resource_id = b.resource_id)
and (a.start_time <= b.end_time)
and (a.end_time >= b.start_time)

Couldn't we then check that this MV stays empty with a check constraint on the resource_id column?



Tom Kyte
June 10, 2005 - 6:48 pm UTC

I'm not getting through.

Please -- you want to answer?  To Help?  

test it
measure it
prove it
show it
do it
implement it
try it out
show us


it is very OK to be devil's advocate, but just show it...


ops$tkyte@ORA9IR2> create materialized view detect_overlap
  2  build immediate refresh fast on commit
  3  as
  4  select a.resource_id
  5  from schedules2 a, schedules2 b
  6  where (a.resource_id = b.resource_id)
  7  and (a.start_time <= b.end_time)
  8  and (a.end_time >= b.start_time);
from schedules2 a, schedules2 b
     *
ERROR at line 5:
ORA-12054: cannot set the ON COMMIT refresh attribute for the materialized
view


ops$tkyte@ORA10G> create materialized view detect_overlap
  2  build immediate refresh fast on commit
  3  as
  4  select a.resource_id
  5  from schedules2 a, schedules2 b
  6  where (a.resource_id = b.resource_id)
  7  and (a.start_time <= b.end_time)
  8  and (a.end_time >= b.start_time);
from schedules2 a, schedules2 b
     *
ERROR at line 5:
ORA-12052: cannot fast refresh materialized view OPS$TKYTE.DETECT_OVERLAP
 
 

count(*) < 2 or 1

Anders, June 10, 2005 - 5:50 pm UTC

Couldn't we add an "and not a.start_time=b.start_time" (since the primary key is (resource_id, start_time) and a.resource_id=b.resource_id) to bring the row count down?

it's better to have the MV empty

Alexander, June 10, 2005 - 5:57 pm UTC

Gabe,
thanks for the feedback, it was most interesting.
BTW, what version of Oracle are you using?
If 9i, could gathering statistics with the package, not ANALYZE command, make any difference?

Mikito,
it did help to have the MV empty, thanks. I added

and (a.start_time <> b.start_time)

to the MV's definition, and changed the check constraint to be (num_overlaps< 1). That done, the time to insert 5K rows went down from 200 sec to 70 sec.

Tom Kyte
June 10, 2005 - 6:49 pm UTC

but that is called "a bug"

i'll reserve the room from 9am to 10am
you'll reserve it from 9am to 9:30am



I will most respectfully disagree

Alexander, June 10, 2005 - 7:23 pm UTC

Tom,
that's what I have a PK for:
primary key(resource_id, start_time) will prevent that from happening


Tom Kyte
June 10, 2005 - 8:52 pm UTC

that would be true ;)
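Alexander's point can be checked mechanically. In this hypothetical Python model (plain tuples standing in for rows), the primary key on (resource_id, start_time) guarantees distinct starts per resource, so the "a.start_time <> b.start_time" filter excludes exactly the self-pairs and the MV can legitimately demand emptiness:

```python
# With (resource_id, start_time) unique, "start differs" is the same as
# "a is a different row than b", so only genuine overlaps survive.
def overlapping_pairs(schedules):
    return [(a, b) for a in schedules for b in schedules
            if a[0] == b[0] and a[1] != b[1]      # PK makes this "a is not b"
            and a[1] <= b[2] and a[2] >= b[1]]

ok  = [(1, 10, 11), (1, 20, 21)]   # distinct starts, no overlap -> empty
bad = [(1, 10, 15), (1, 12, 20)]   # overlap -> non-empty

assert overlapping_pairs(ok) == []
assert overlapping_pairs(bad) != []
```

Without that PK, Tom's 9am-to-10am / 9am-to-9:30am example (two rows with the same start) would slip through the filter.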

Materialized view

A reader, June 10, 2005 - 7:38 pm UTC

Does the view need to be materialized?

Tom Kyte
June 10, 2005 - 8:52 pm UTC

only if you want constraints to be validated.

Interval overlapping condition

Mikito Harakiri, June 10, 2005 - 7:56 pm UTC

Intervals [x1,y1] and [x2,y2] overlap whenever
x1 is between [x2,y2]
OR
y1 is between [x2,y2]
Depending on whether you allow touching intervals or not you may adjust equality into inequality. But you need *2* empty materialized views (or one with union all).

If intervals don't overlap, then the join is empty.

Tom Kyte
June 10, 2005 - 8:53 pm UTC

no, that is not true.

[x1,y1] and [x2,y2] overlap when x2 < x1 and y2 > y1 as well.

Why MV should be empty.

Mikito Harakiri, June 10, 2005 - 8:20 pm UTC

It is easy to argue that the view should be empty without writing a line of code. The view is a join of pairs of intervals that overlap. If no intervals overlap, then the join is empty! If that appears not to be the case, check the view definition for errors ;-)

Tom Kyte
June 10, 2005 - 8:57 pm UTC


Does the view need to be materialized?

Mikito Harakiri, June 10, 2005 - 8:26 pm UTC

One can write a view trigger that raises an error whenever a tuple is inserted into the view. One may argue that this solution is cleaner because the trigger code is simpler.

Tom Kyte
June 10, 2005 - 8:58 pm UTC


Never mind

Mikito Harakiri, June 10, 2005 - 8:30 pm UTC

The [non-materialized] view evaluation should be efficient. I can't think of an effective access path, however.

Tom Kyte
June 10, 2005 - 9:15 pm UTC


disagree

Mikito Harakiri, June 10, 2005 - 9:34 pm UTC

<quote>[x1,y1] and [x2,y2] overlap when x2 < x1 and y2 > y1 as well.</quote>

1. x2 < x1 (reversed your condition)
2. x1 < y1 (by interval definition)
3. y1 < y2 (reversed your other condition)
4. x1 < y2 (1 and 2, transitivity)

therefore x2 < x1 < y2
or x between x2 and y2

Precision

Mikito Harakiri, June 10, 2005 - 9:40 pm UTC

Yes, I made a typo -- missed rowids in the MV definition. Does it invalidate the idea?

BTW, two more typos:
"1 and 2, transitivity"
should be
"2 and 3, transitivity"

and
"or x between x2 and y2"
should read
"or x1 between x2 and y2"
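With the typos corrected, the containment case Tom raised (x2 < x1 and y1 < y2) can be checked by brute force. A small Python sketch (hypothetical integer endpoints) confirms that condition 1, "x1 between x2 and y2", fires for every strictly contained interval:

```python
def caught_by_condition_1(x1, y1, x2, y2):
    """The 'between' test from the proof: x1 between x2 and y2."""
    return x2 <= x1 <= y2

# Enumerate every [x1, y1] strictly inside [x2, y2] over small integers.
cases = [(x1, y1, x2, y2)
         for x2 in range(10) for x1 in range(x2 + 1, 10)
         for y1 in range(x1, 10) for y2 in range(y1 + 1, 10)]

assert cases and all(caught_by_condition_1(*c) for c in cases)
```

This is just the transitivity argument made concrete: x2 < x1 and x1 <= y1 < y2 force x1 into [x2, y2].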




Tom Kyte
June 10, 2005 - 10:06 pm UTC

if you want to postulate,

please demonstrate

that is all I ever ask.... words are words. nothing more.

show us that what you speak of, do not hypothesize, do not say "well maybe this'll work"

show us.

nothing more, nothing less. show us.... the "proof is in the pudding" as they say. Just give us working, tested, proven, examples.

Love to be shown other ways.

Hate to be told, hypothetically -- maybe there are other ways so maybe hypothetically what you said isn't the best.

in search of the best, we need examples, evidence.

no arguments, this is not philosophy, this is much much more concrete.


As an individual, you like to say "i don't like this" (e.g. analytics, partitioning, and so on). If you don't like it, show us why -- I want to understand the TECHNICAL (not "i don't like it") reasons.

So, if you want to say "no, there are better ways", show it. that is all I ever ask.

Show it.
Show us.
Prove it.
test it.

Alberto Dell'Era, June 12, 2005 - 5:04 pm UTC

> alter table detect_overlap add constraint no_overlaps check (num_overlaps< 1);

> create unique index ui_detect_overlap on detect_overlap
> (resource_id, start_time, end_time);

Shouldn't they be (since they're defined on an MV):

alter table detect_overlap add constraint no_overlaps check (num_overlaps< 1) DEFERRABLE;

alter table detect_overlap add constraint ui_detect_overlap
unique (resource_id, start_time, end_time) DEFERRABLE;

?

Tom Kyte
June 12, 2005 - 11:10 pm UTC

indeed, they should be deferrable as the MV refresh is a series of singleton statements, not a single SQL statement.




DEFERRABLE on MVs

Gabe, June 13, 2005 - 10:54 am UTC

<quote>... as the MV refresh is a series of singleton statements, not a single SQL statement.</quote>

Can you elaborate on this? ... or point me to some relevant info?

Excluding an FK constraint between MVs ... can one show an example of DEFERRABLE making a difference on a REFRESH ON COMMIT MV?


Tom Kyte
June 13, 2005 - 11:54 am UTC

the refresh of a materialized view is a process, not a single "merge" or anything. it is a big process. constraints are supported against MV's but one of the rules is they should be deferrable since they can be violated during the refresh process.

Think about a unique constraint on a column Y. The mv has:


X Y
--- ---
1 200

to start with. Now, the base table has these operations performed on it:

insert into base values ( 2, 200 );
delete from base where x = 1;


Which modification gets applied to the MV first? the insert or the delete? We don't know, we cannot control, therefore we must allow Y to become "non unique" for a while -- knowing that when the transaction is truly committed, it'll be OK.
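The ordering problem can be sketched outside the database. This hypothetical Python model of a table with a unique column Y applies the two changes from the example in each order, and shows one order passes through a duplicate state while the other never does, which is exactly why the constraint must be deferrable (checked only at commit):

```python
# Model rows as (X, Y) tuples and apply the insert/delete in both orders.
def apply_op(rows, op):
    kind, row = op
    if kind == "insert":
        return rows + [row]
    return [r for r in rows if r[0] != row[0]]   # delete by X

def has_duplicate_y(rows):
    ys = [y for _, y in rows]
    return len(ys) != len(set(ys))

start = [(1, 200)]
insert_op = ("insert", (2, 200))
delete_op = ("delete", (1, None))

mid1 = apply_op(start, insert_op)   # insert first: Y=200 duplicated
mid2 = apply_op(start, delete_op)   # delete first: no violation

assert has_duplicate_y(mid1)        # transient violation mid-refresh
assert not has_duplicate_y(mid2)
# both orders reach the same final state, which satisfies the constraint
assert sorted(apply_op(mid1, delete_op)) == sorted(apply_op(mid2, insert_op)) == [(2, 200)]
```

Since the refresh process chooses the apply order, not us, the constraint has to tolerate the transient state and validate only at commit.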

The stored procedure is much simpler than the MV

Alexander, June 13, 2005 - 12:57 pm UTC

Tom, Gabe, and Mikito,
thank you for your suggestions.

We shall go for the stored procedure that Tom suggested. The stored procedure has a very big advantage:
it is way simpler than the MV.

Great thread

Yuan, April 18, 2006 - 4:25 pm UTC

I have the same challenge, but there is no master table to select for update, so I'm stuck with the MV solution.

My question is why is this unique index necessary on the MV?

> create unique index ui_detect_overlap on detect_overlap
> (resource_id, start_time, end_time);

Doesn't the definition of the view prohibit duplicates?

create materialized view detect_overlap
build immediate refresh fast on commit
as
select a.resource_id, a.start_time, a.end_time, count(*) num_overlaps
from schedules a, schedules b
where (a.resource_id = b.resource_id)
and (a.start_time <= b.end_time)
and (a.end_time >= b.start_time)
and a.rowid != b.rowid
group by a.resource_id, a.start_time, a.end_time;

Also, I substituted a.rowid != b.rowid for the proposed a.start_time <> b.start_time. I believe this makes it a little clearer what it's trying to do.

Interval Condition

Yuan, June 28, 2006 - 4:03 pm UTC

Mikito Harakiri was right about the original interval condition being insufficient. It would not catch the overlap where start1 < start2 < end1 < end2.

Mikito also provided the solution, but I missed it the first time around, so I figured I'd post the entire query with his conditions:

create materialized view detect_overlap
build immediate refresh fast on commit
as
select a.resource_id, a.start_time, a.end_time, count(*) num_overlaps
from schedules a
join schedules b on (a.resource_id = b.resource_id
and a.rowid != b.rowid)
where a.start_time between b.start_time and b.end_time
or a.end_time between b.start_time and b.end_time
group by a.resource_id, a.start_time, a.end_time;
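As a sanity check of this formulation (a brute-force Python sketch over hypothetical integer intervals, not the MV itself): because the self-join produces both orderings of every pair, testing only a's endpoints against b's interval is equivalent to the standard overlap predicate:

```python
from itertools import product

def std_overlap(a, b):
    # standard test: intervals [a0, a1] and [b0, b1] share at least one point
    return a[0] <= b[1] and b[0] <= a[1]

def endpoints_in(a, b):
    # the WHERE clause above: a's start or end falls inside b
    return b[0] <= a[0] <= b[1] or b[0] <= a[1] <= b[1]

# exhaust all small integer intervals; the join sees both (a, b) and (b, a)
intervals = [(s, e) for s, e in product(range(6), repeat=2) if s <= e]
for a, b in product(intervals, repeat=2):
    assert std_overlap(a, b) == (endpoints_in(a, b) or endpoints_in(b, a))
```

The containment case (a inside b) is caught by endpoints_in(a, b); the reverse containment (b inside a) is caught only by the (b, a) ordering, which is why the symmetric self-join is essential to this version.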

oh my god!

Jay, July 26, 2007 - 2:41 pm UTC

I wonder what it would be like to have Mikito and Tom work for the same company (and the same project????)!!

Wow... now, wouldn't that be fun?