Is materialized view synchronization cheap?

Tom Kyte

Thanks for the question, Mikito.

Asked: September 10, 2001 - 3:44 pm UTC

Last updated: July 26, 2005 - 4:44 pm UTC

Version: 9i

Viewed 1000+ times

You Asked

Oracle's manuals emphasize the analogy between materialized views and indexes. I wonder how cheap materialized view updates really are. Oracle has a patented algorithm for materialized view fast refresh, but is it advisable to use materialized views outside the scope of data warehousing?

and Tom said...

I would say that there are three cases (a SQL sketch of each follows below):

1) MV's that refresh on commit. These are NOT candidates for OLTP systems. Why? The predominant reason is concurrency. An MV is usually a rollup. Say you have that "sum(sal) from emp" MV. Upon any update of the EMP table, every transaction will SERIALIZE on the refresh of the MV. It would be a big choke point.


2) MV's that refresh FULLY on a timed basis for reporting purposes. These ARE candidates for use in an OLTP system. Why? Because this adds no overhead to the OLTP system (no MV logs to maintain) and reduces the number of big queries against the OLTP data, making the OLTP system even faster.

3) MV's that refresh FAST on a timed basis. These PROBABLY are not candidates for use in an OLTP system. Why? Because the MV logs add overhead, and OLTP is characterized by high-volume, high-speed, short transactions. With this one, however, it depends on your needs.

When we say they are like indexes -- that is just an analogy. Just as a unique index on t(x) can speed up "select * from t where x = 5", an MV can speed up "select count(*) from t" (assume t is a 5 billion row table).

Now indexes can speed up OLTP but, improperly used, they can bring it to a halt as well (too much maintenance).
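
For illustration, here is a minimal SQL sketch of the three styles against the classic EMP table (the MV names and refresh intervals are illustrative, not from the original answer):

-- case 1: refresh fast on commit -- needs an MV log; every committing
-- transaction maintains the MV, which is where the serialization comes from
create materialized view log on emp
  with rowid (deptno, sal) including new values;

create materialized view emp_sal_oncommit
  refresh fast on commit
  as
  select deptno, sum(sal) sum_sal, count(sal) cnt_sal, count(*) cnt
    from emp
   group by deptno;

-- case 2: complete refresh on a timed basis -- no MV log, no OLTP overhead
create materialized view emp_sal_nightly
  refresh complete
  start with sysdate next trunc(sysdate) + 1 + 2/24   -- rebuild at 2am daily
  as
  select deptno, sum(sal) sum_sal
    from emp
   group by deptno;

-- case 3: fast refresh on a timed basis -- the MV log on EMP is maintained
-- by every transaction, which is the overhead this case warns about
create materialized view emp_sal_hourly
  refresh fast
  start with sysdate next sysdate + 1/24
  as
  select deptno, sum(sal) sum_sal, count(sal) cnt_sal, count(*) cnt
    from emp
   group by deptno;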




Comments

indexes and MVs

D@B.A, September 11, 2001 - 12:13 pm UTC

Wouldn't users see phantoms in case 2? As a consequence, if their transactions rely upon data returned by an MV that is outside the scope of database transaction isolation, it could damage their semantics. Allowing MVs into OLTP means that the transaction scope must include the MVs as well (exactly the same as for indexes).

IMHO, materialised views and indexes are indistinguishable in case 1. Both speed up read access for a single user. Both reduce total system throughput, because of the additional locking imposed upon the indexing structures.

Summarizing, materialized views are just relational indexing structures.

Tom Kyte
September 11, 2001 - 10:45 pm UTC

I was referring to the use of MV's in the scope of a reporting system that might not need "the most current information", so a time-based refresh would be applicable.

You would not get phantoms - you would get "stale" data "as of a point in time".

Of course if you need the most current information -- you need to use the "real tables" or a refresh on commit MV.

Please Provide abbrevation

mtk, September 12, 2001 - 7:08 am UTC

Please provide expansions for all of the abbreviations you use. I would be able to understand this question and answer if I knew what OLTP stands for.


Sonali kelkar, September 12, 2001 - 9:45 am UTC

OLTP means Online Transaction Processing; you can read about it in the Oracle documentation.
MV means Materialized View.

MVs versus User-created Aggregate Tables.

Richard, July 08, 2003 - 9:02 am UTC

Hi,

We have a Decision Support System (DSS) that uses our Oracle tables; we are on version 8.1.5.

For the DSS, we repopulate our fact tables from various data repositories, and then build aggregate tables, indexes etc. The data in the DSS-related tables (fact and otherwise) then remain static throughout the working day (barring any emergency reloads) until the following night's data load process.

The rebuild of the aggregate tables is the most time-consuming part of our nightly processing, and I wonder whether we are missing a trick here by not employing MVs.

Is this an ideal case for MVs? Are there any quirks we should be aware of? I do have a copy of "Expert One on One" and have read the chapter on MVs, but would still be very grateful for any feedback.

Thanks in advance for any help!

Tom Kyte
July 08, 2003 - 9:46 am UTC

if you are creating the aggregates via "create table as select ..." or populating them via

truncate
disable indexes
insert /*+ APPEND */
rebuild indexes

they'll be "as fast" as an MV, since you are in effect rebuild the database.

If you have the horsepower, employ parallel processing -- either in the form of parallel query, OR in the form of "we'll do as much concurrent work as we can; we'll build 5 of these aggregates at the same time" (a sketch of the rebuild pattern follows below).
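
A sketch of that rebuild pattern with parallelism added (AGG, AGG_IDX, and the EMP rollup are illustrative names, not from the thread):

alter session enable parallel dml;

truncate table agg;
alter index agg_idx unusable;
alter session set skip_unusable_indexes = true;

insert /*+ APPEND PARALLEL(agg, 4) */ into agg
select deptno, sum(sal) sum_sal
  from emp
 group by deptno;
commit;

alter index agg_idx rebuild parallel 4 nologging;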

Complete refresh of MV slower than table

Rob, July 08, 2003 - 10:07 am UTC

I have found

truncate
disable indexes
insert /*+ APPEND */
rebuild indexes

on a large (300+ million row) table to be faster than a complete MV refresh, because the first thing DBMS_MVIEW.REFRESH does is truncate the underlying table. The truncate causes any index that you have made unusable to become usable again. This means you end up maintaining every index on the MV on the fly, as opposed to doing a parallel nologging rebuild at the end of the load.

aggregate tables vs. MVs

A reader, July 08, 2003 - 10:58 am UTC

Rebuilding the aggregate tables alone does not enable query rewrite; you have to have MVs for query rewrite. You can rebuild the aggregate table using either a CTAS or SQL*Loader, and then create an MV on top of it with the "ON PREBUILT TABLE" option. This option takes almost no time, as it only marks the table as an MV in the database catalog. However, you will not be able to do DML operations on that table after the MV is created on it; we have to drop the MV, update the table, and then recreate the MV using the prebuilt option. Tom, is there any other shortcut we can use for this?
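
For illustration, a minimal sketch of the ON PREBUILT TABLE approach the comment describes (EMP_AGG is an illustrative name; note the MV must have the same name as the prebuilt table):

-- build (or rebuild) the aggregate yourself, e.g. via CTAS
create table emp_agg as
select deptno, sum(sal) sum_sal
  from emp
 group by deptno;

-- then register it as an MV -- this only updates the dictionary
create materialized view emp_agg
  on prebuilt table
  enable query rewrite
  as
  select deptno, sum(sal) sum_sal
    from emp
   group by deptno;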

Tom Kyte
July 08, 2003 - 1:06 pm UTC

umm, why would you do DML on it?

MVs versus Aggregate Tables.

Richard, July 08, 2003 - 11:41 am UTC

Thanks for the feedback Tom.

We do use parallel processors and rebuild the aggregates as you say.

I don't think we can do much more to speed things up, with the resources that we have.

limits between the distinct kinds of MV

juancarlosreyesp@yahoo.com, August 13, 2003 - 11:43 am UTC

Hi Tom,
As it is very difficult to determine whether a table is OLTP or DSS, could you please give a limit, in transactions per minute, at which you would suggest using each of the three kinds of MV you described?

For example: MVs on commit are suggested if you have fewer than X transactions per minute.

:) Thank you

Tom Kyte
August 13, 2003 - 11:55 am UTC

It is not possible to ROT (rule of thumb) something that is so dependent on:

o size/spread of data in the MV (refresh on commit of a SELECT COUNT(*) FROM T -- where everyone serializes on commit to update the count -- is very, very different from SELECT DEPTNO, SUM(SAL) FROM EMP GROUP BY DEPTNO, where transactions serialize only at the DEPTNO level)

o size/speed of the machine

o complexity of the MV

and so on. It is all about BENCHMARKING your situation (you know, *simulation*).

MV's that refresh on commit

michael, August 13, 2003 - 12:14 pm UTC

Hello Tom,

You stated:
1) MV's that refresh on commit. These are NOT candidates for OLTP systems. Why? The predominant reason is concurrency. An MV is usually a rollup. Say you have that "sum(sal) from emp" MV. Upon any update of the EMP table, every transaction will SERIALIZE on the refresh of the MV. It would be a big choke point.

My question:
How do you obtain real-time data? For example, the total number of items as items are being added? Other than refresh on commit, what method would you suggest to get accurate data?

Thanks.

Tom Kyte
August 13, 2003 - 12:23 pm UTC

a query?


that is as real time as you get.

Thank you :)

A reader, August 13, 2003 - 12:19 pm UTC


So Why Use MVs?

Richard, October 07, 2003 - 9:04 am UTC

Hi,

A while back, I asked a question in this thread (MVs versus User-created Aggregate Tables).

Assuming the scenario I described in that question, is it definitely the case that aggregate tables will perform as well as MVs with a similar purpose? I only ask because I have read elsewhere that MVs will out-perform aggregate tables.

As ever, thank you for your help.

Tom Kyte
October 07, 2003 - 9:26 am UTC

At the end of the day -- bit for bit, byte for byte -- they would be exactly the same on disk.

Where MV's are BETTER than DIY's (do-it-yourselfs) is in the transparency.


Pretend for a moment that after creating an index you had to go out and change application code to take advantage of it. You had to train the developers to use it directly in future code. You had to train end users to be aware of it in ad-hoc queries. Pretend further that next week you created an infinitely BETTER index, and now you have to go recode, retrain, reimplement.

It would not be workable, no one would stand for it. Indexes should be transparent -- the best one used at the right times.


Well, MV's are the indexes of your data warehouse/reporting system. They should be transparent to the application. The application queries the "details"; the system automagically rewrites queries to use the summaries. Add a new summary and voilà -- applications start taking advantage of it *transparently*, just like an index. Additionally, just like an index, you have dependencies set up in the data dictionary -- you can see what the MV relies on and is dependent on. You can see if it is STALE or FRESH. The system maintains it.

The advantages of MV's over DIY's are numerous -- from the transparent performance benefit (no recoding, no training/retraining), to maintainability, to documentation...
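
A small sketch of that transparency (SALES and SALES_SUMMARY are illustrative names):

-- create the summary once, with rewrite enabled
create materialized view sales_summary
  enable query rewrite
  as
  select prod_id, sum(amount) total_amount
    from sales
   group by prod_id;

alter session set query_rewrite_enabled = true;

-- the application still queries the DETAIL table; the optimizer
-- can transparently rewrite this against SALES_SUMMARY
select prod_id, sum(amount)
  from sales
 group by prod_id;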

MVs on PREBUILT TABLE with FAST REFRESH question

Adam Kahn, September 14, 2004 - 12:07 pm UTC

For EACH fact that I am aggregating, do I need a separate COUNT([fact name]) column in the prebuilt table? This is what DBMS_MVIEW.EXPLAIN_MVIEW advises, and when I do this, fast refresh does indeed work -- but it seems like a waste of resources. I'm using Oracle 9.2.0.5.

For example, to ensure that FAST REFRESH works, does my prebuilt table have to look like:

COLUMN           COMMENT
State            Dimension 1
City             Dimension 2
Order_Tot        SUM(Order_amount)
Order_Cnt        COUNT(Order_amount)
Line_Items_Qty   SUM(Quantity)
Line_Items_Cnt   COUNT(Quantity)
Records          COUNT(*)

Furthermore, the COUNT([fact name]) results are always the same for non-null facts, which also seems like a waste of resources.

Please confirm or tell me what I am doing incorrectly.

TIA


Tom Kyte
September 14, 2004 - 1:08 pm UTC

It is required by the fast refresh process; it is just a number, not very large. It is simply the way they programmed it.
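
For illustration, this is roughly what the MV looks like (the ON PREBUILT TABLE clause is omitted for brevity; names follow the example above). Fast refresh keeps COUNT(*) plus a COUNT() beside each SUM() so it can maintain the sums incrementally when rows are deleted or set to null:

create materialized view log on orders
  with rowid (state, city, order_amount, quantity)
  including new values;

create materialized view order_summary
  refresh fast on demand
  as
  select state, city,
         sum(order_amount)   order_tot,
         count(order_amount) order_cnt,       -- non-null count backing ORDER_TOT
         sum(quantity)       line_items_qty,
         count(quantity)     line_items_cnt,  -- non-null count backing LINE_ITEMS_QTY
         count(*)            records
    from orders
   group by state, city;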

I can use a MV

robert, December 08, 2004 - 11:01 am UTC

hello, Tom,
>> 2) MV's that refresh FULLY on a timed basis for reporting purposes.


8.1.7.4/OLTP system
--------------------------------------------
I have a process that does things like this:
UPDATE my_gtt
SET c20 = calc_pkg.func20(tranx_id),
c21 = calc_pkg.func21(tranx_id),
c22 = calc_pkg.func22(tranx_id),
c23 = calc_pkg.func23(tranx_id),
c24 = calc_pkg.func24(tranx_id),
c25 = calc_pkg.func25(tranx_id),
c26 = calc_pkg.func26(tranx_id),
............

where those SQL functions look up several other tables.
It is SLOW, but the boss insists the calc functions be maintained.

Can I create an MV like this:

1)
create materialized view mv_pre_calc
as
select
tranx_id,
calc_pkg.func20(tranx_id) l_c21,
calc_pkg.func21(tranx_id) l_c22,
calc_pkg.func22(tranx_id) l_c23,
calc_pkg.func23(tranx_id) l_c24,
calc_pkg.func24(tranx_id) l_c25,
calc_pkg.func25(tranx_id) l_c26,
calc_pkg.func26(tranx_id) l_c27
.....
from TBL_TRANSACTIONS --<-- has tranx_id as PK

2) create a PK/unique index on mv_pre_calc.tranx_id
3) use this MV DIRECTLY to do the correlated update in my process above?

Does this look good to you?

thanks





Tom Kyte
December 09, 2004 - 12:32 pm UTC

it'll make the update go faster -- but the entire process (the MV refresh still has to run all of those functions) would take longer.

if your goal was just to make the update go faster -- that'll help.
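
A sketch of step 3, the correlated update driven off the MV (column pairings follow the aliases in the MV as posted; extend the lists for the remaining columns):

update my_gtt g
   set (c20, c21, c22) =
       ( select m.l_c21, m.l_c22, m.l_c23
           from mv_pre_calc m
          where m.tranx_id = g.tranx_id )
 -- the EXISTS guard keeps rows with no MV match from being set to null
 where exists ( select null
                  from mv_pre_calc m
                 where m.tranx_id = g.tranx_id );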

Your Help is needed!

Lou, July 26, 2005 - 4:31 pm UTC

Tom,

I have this issue. We have a package that gets called every time a truck is loaded and needs locations. It works well when it gets called once... but sometimes we get 50 trucks waiting for locations and the package gets called 50 times. This is very taxing on the system. Any ideas how we might approach this issue, maybe through serialization of some sort? If you want, I can include the package.

Tom Kyte
July 26, 2005 - 4:44 pm UTC

so, basically, you need to make sure that only "N" of your database connections are active simultaneously and the other M-N connections are put on hold right?

Shared Server can be used to do that.

You can set up a pool of shared servers (say 10). At most 10 sessions can be active (and the other 40 would "wait")


OR

You can use AQ (Advanced Queues) and control the number of background processes that are available to process requests. Your procedure would place a request in a queue; one of the "queue processing" processes you've started would pick up the request and run the stored procedure, placing the output in a response queue (or just a table), which the original process can pick up.

OR

you can use the resource manager and limit the number of concurrent active sessions...

OR

you can use dbms_job to run the procedure (just like with AQ), and the job queue processes would limit the number of procedures running...
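
For the resource manager option, a sketch using an active session pool (the plan and group names are illustrative; ACTIVE_SESS_POOL_P1 is available from 9i):

begin
  dbms_resource_manager.create_pending_area;

  dbms_resource_manager.create_plan(
      plan    => 'LIMIT_TRUCKS',
      comment => 'cap concurrently active truck-location sessions');

  dbms_resource_manager.create_consumer_group(
      consumer_group => 'TRUCKS',
      comment        => 'sessions running the truck-location package');

  -- at most 10 sessions in this group run at once; the rest queue
  dbms_resource_manager.create_plan_directive(
      plan                => 'LIMIT_TRUCKS',
      group_or_subplan    => 'TRUCKS',
      comment             => 'active session pool of 10',
      active_sess_pool_p1 => 10);

  -- every plan must also have a directive for OTHER_GROUPS
  dbms_resource_manager.create_plan_directive(
      plan             => 'LIMIT_TRUCKS',
      group_or_subplan => 'OTHER_GROUPS',
      comment          => 'everything else');

  dbms_resource_manager.validate_pending_area;
  dbms_resource_manager.submit_pending_area;
end;
/

-- then set the plan active and map the relevant users into the TRUCKS group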