Skip to Main Content
  • Questions
  • Using Materialized views with Oracle Change Data Capture

Breadcrumb

Question and Answer

Connor McDonald

Thanks for the question, Paolo.

Asked: March 20, 2017 - 5:06 am UTC

Last updated: March 28, 2017 - 12:49 am UTC

Version: 11.2.0.4

Viewed 1000+ times

You Asked

Tom,

I would value your opinion on my intended solution for replicating data aggregations to external applications.

We have a Data Warehouse with a table containing 100 million rows with several tens of thousands of new rows added each day. This data needs to be regularly aggregated (daily basis at least) and the aggregated changes sent to 3rd party systems and applications. The aggregation might itself be a couple of million rows but with say 10,000 rows being updated/added each day, over the course of the day.

My proposed solution is to use a Materialized view to create the aggregation and make this a fast-refreshable view so it updates as and when new data is loaded to the warehouse.

Using oracle Change Data Capture I would track the changes occurring against the Materialized View and generate “UPSERT” transactions that will be sent to the 3rd Party applications (either by calling their API’s or sending flat files).

I’d value your thoughts on this approach and whether there are better techniques/applications for implementing such a solution.


and Connor said...

*Conceptually* it appears fine, but I would not be building a solution with CDC. From the 12c docs:

"Oracle Change Data Capture is not included in Oracle Database 12c"

So you can with your MV option, but seeing as its only a couple of million rows, I'd consider manually doing the before-and-after comparison, which I think would way more efficient than CDC anyway. Here's an example of what you could do


SQL> create table t
  2  as select
  3       d.OWNER
  4      ,d.OBJECT_NAME
  5      ,d.SUBOBJECT_NAME
  6      ,rownum OBJECT_ID
  7      ,d.DATA_OBJECT_ID
  8      ,d.OBJECT_TYPE
  9      ,d.CREATED
 10      ,d.LAST_DDL_TIME
 11      ,d.TIMESTAMP
 12      ,d.STATUS
 13    from dba_objects d,
 14     ( select 1 from dual connect by level <= 20 );

Table created.

SQL>
SQL> create global temporary table gtt on commit preserve rows
  2  as select * from t where 1=0;

Table created.

--
-- So our "mat view" T has 1.5million rows in it.
--

--
-- we take a copy of what it is currently into a global temp table
--

SQL> insert /*+ APPEND */ into gtt select * from t;

1552240 rows created.

SQL> commit;

Commit complete.

--
-- Now here is our "mods" to the mat view which occurred as part of a refresh
--

SQL>
SQL> delete from t
  2  where owner like 'APEX%'
  3  and rownum <= 10000;

10000 rows deleted.

SQL>
SQL> insert into t
  2  select
  3       d.OWNER
  4      ,d.OBJECT_NAME
  5      ,d.SUBOBJECT_NAME
  6      ,-1*d.OBJECT_ID
  7      ,d.DATA_OBJECT_ID
  8      ,d.OBJECT_TYPE
  9      ,d.CREATED
 10      ,d.LAST_DDL_TIME
 11      ,d.TIMESTAMP
 12      ,d.STATUS
 13  from dba_objects d
 14  where rownum <= 10000;

10000 rows created.

--
-- Now I can do a full outer join to pick up 
-- deletes, inserts and updates
--

SQL> set timing on
SQL> select t.object_id, gtt.object_id
  2  from t
  3   full outer join gtt
  4   on ( t.object_id = gtt.object_id )
  5  where
  6    (  t.object_id is null or
  7       gtt.object_id is null or
  8       decode(t.owner,gtt.owner,1,0) = 0 or
  9       decode(t.object_name,gtt.object_name,1,0) = 0 or
 10       decode(t.subobject_name,gtt.subobject_name,1,0) = 0
          ...
          ...
 11    )
 12   /

20000 rows selected.

Elapsed: 00:00:03.55
SQL>


and on a laptop, its a few seconds for those 1.5 million rows.

Rating

  (1 rating)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Paolo Godinich, March 27, 2017 - 9:45 am UTC

Thanks for your response. Pretty much the alternative I was thinking of but felt as though I would be re-inventing the wheel by not using in-built replication technology. I'm not familiar with Goldengate which replaces CDC but will look into it as we have many structures that need to be replicated to a variety of 3rd party applications - some of which need near real-time updates.
Connor McDonald
March 28, 2017 - 12:49 am UTC

Goldgengate is an awesome replication technology especially for heterogeneous replication scenarios....

... but awesome != cheap :-)

More to Explore

Hints

Need more information on hints? Check out the complete hint descriptions