Skip to Main Content

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, Gabriel.

Asked: April 15, 2014 - 6:44 pm UTC

Last updated: April 17, 2014 - 5:32 pm UTC

Version: 11.2.0.4

Viewed 10K+ times! This question is

You Asked

Hello Mr. Kyte,

Can you please explain to me how can I use materialized views to replicate a subset of the data of a table based on the comparison of the date found in a column with the current time.
The logical query would be of type

where date_column < sysdate - 7;


I was able to create a writeable materialized view that contained all the data from my base table. I then ran a delete on the mview, deleting all data

where date_column < sysdate - 7;


However, on the next fast refresh, instead of getting just the new and modified data, all of the data, including the one I deleted was reinserted. I always thought that mlogs would store the rowids of the modified records and move these. Of course I was wrong as explained by the documentation on writeable mviews.

So, is there a way to replicate just a subset of data from a table based on a comparison with a date that is not in the same table?

Thank you very much,

and Tom said...

the goal, the sole goal, of a materialized view is to synchronize the materialized view with the defining table(s).


when you refresh a writeable materialized view, the data - by definition - will be whatever is in the defining table(s).


If you use sysdate, you cannot fast refresh - your refresh was a complete refresh. Read only materialized views cannot fast refresh with sysdate either.


You would have to use a complete refresh of this particular materialized view - or go to a more sophisticated method, such as golden gate, where you would stream over the modifications and periodically delete the old data. A changes only materialized view is not going to work here.



think about for a minute...

In order for "create materialized view mv refresh fast as select * from t where date_column > sysdate-7" to work - we'd have to put a "delete" record into the materialized view log every time sysdate advanced by one second and a new record(s) was to be deleted.

If the materialized view log were to be used, we would need to know

a) NEW records
b) MODIFIED records
(so far, so good)

c) records to be deleted.... we can get the records that were really deleted, but how would you suggest to record all of the records that need to be deleted based on the "date_column > sysdate-7" constraint? every second, we'd have to poll the table to find any records that would need to be deleted!!


Rating

  (1 rating)

Is this answer out of date? If it is, please let us know via a Comment

Comments

am i doing something wrong?

Gabriel Schor, April 17, 2014 - 2:11 pm UTC

Hello Mr. Kyte,

I agreed with everything you said, but then, being so stubborn, i decided to give it another try. Here is what I came up with:


On the "production" side:

create table gs_my_objects as select * from dba_objects;

create table gs_my_tables as select * from dba_tables;

CREATE TABLE GS_MY_DATE
(MY_DATE  DATE);

Insert into GS_MY_DATE (MY_DATE)
 Values (TO_DATE('04/26/2014 00:00:00', 'MM/DD/YYYY HH24:MI:SS'));
COMMIT;

alter table gs_my_objects add constraint pk_gs_my_objects primary key (OBJECT_ID);

alter table gs_my_tables add constraint pk_gs_my_tables primary key (owner, table_name);

alter table gs_my_date add constraint pk_gs_my_date primary key (my_date);

create materialized view log on gs_my_objects with rowid;
create materialized view log on gs_my_tables with rowid;
create materialized view log on gs_my_date with rowid;

on the "replication" side:

drop MATERIALIZED VIEW gs_join;
create MATERIALIZED VIEW gs_join 
REFRESH FAST ON DEMAND
WITH PRIMARY KEY AS 
select o.owner, t.table_name, d.my_date,  o.rowid o_rowid, d.rowid d_rowid, t.rowid t_rowid
from rschema.gs_my_objects@mylink o,
rschema.gs_my_tables@mylink t,
rschema.gs_my_date@mylink d
where o.object_name=t.table_name
and t.last_analyzed > d.my_date - 30; 

begin
update rschema.gs_my_date@mylink d set my_date=my_date+1;
commit;
dbms_mview.refresh ('GS_JOIN','F');
end;

select count(*) from GS_JOIN;



the count results change as I execute the pl/sql more and more...

Is there something wrong with this that I don't see? Previously when I was trying this I had the two data tables on the "production" database and the date table on the "replication" database. But now that I have them all on one side it seems to work... unless there is something fundamentally wrong with what I did...

Thank you,
Tom Kyte
April 17, 2014 - 5:32 pm UTC

well, since you are not referencing SYSDATE, you are making this all table driven and it is a multi-table join - and we see the update you are artificially making and have logs for it and all...

no reference to sysdate (rownum, analytics, etc etc etc etc) and a properly formed query as a multi-table join, that follows all of the rules for a fast refresh, that is a different question.

it is not an automagic "just make it refresh on a schedule", it is a process you are going through.