Queries from a big table are slow during merge

Question and Answer

Connor McDonald

Thanks for the question, Robert.

Asked: July 16, 2020 - 11:43 am UTC

Last updated: July 30, 2020 - 1:08 am UTC

Version: 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production (soon will be migrated to 18c Enterprise Edition Release 18.0.0.0.0 - Production, version 18.7.0.0.0)

Viewed 1000+ times

You Asked

Our application reads data from a big table (~120 million rows). Indexes are set properly, so read queries are quick (tens of milliseconds at most). The application does not change the data in the table; it only reads it, using the same query with a varying WHERE clause.

Once a day an ETL job populates the changes (from an external system) into a delta table and then runs a stored procedure that merges the delta into the main table. Last time there were 16 million records in the delta table.
Our problem is that some application queries are slow (1-35 seconds instead of a few milliseconds) during the merge. That is only ~0.056% of daily queries, but we need to mitigate it anyway. The queries are the same; only the WHERE clause can have more values in the "IN" list for some columns (e.g.
PRODUCT_CODE IN (:5 , :6, :7)
vs.
PRODUCT_CODE IN (:5)
).

What solution should we use to eliminate the influence of the data update on the read queries?
We see two possibilities:
1) use a materialized view - a 1:1 copy of the main table, refreshed only after the ETL+merge has completed on the main table
2) use a table synonym together with two versions of the main table (active and passive). Updates would be applied to the passive table and, once done (including index and statistics recalculation), the synonym would be switched to the passive table, making it active.
Would either of the mentioned solutions resolve our problem?
Thanks, Robert

and Connor said...

Probably the first thing you need to do is identify the cause of slowness. Potential causes are:

a- machine is starved of CPU during the merge
b- machine is starved of IO bandwidth during the merge
c- queries are doing block cleanout
d- queries are doing lots of read-consistency undo work

It's important to find the cause first, because (for example) if it's (a) or (b) then your suggested options won't have much benefit, because you're still doing just as much work.

So I'd be looking at some tracing of your queries during the process (e.g. via dbms_session), plus some snapshots of their session statistics (the cleanout and "undo records applied" statistics are the relevant ones here).
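As a rough sketch of what that could look like (the SID/SERIAL# values are placeholders to pick from V$SESSION, and dbms_monitor is used here rather than dbms_session because the affected queries run in other sessions):

-- Enable SQL trace for one of the affected application sessions
exec dbms_monitor.session_trace_enable(session_id => 123, serial_num => 4567, waits => true, binds => false);

-- Snapshot the cleanout / read-consistency statistics for that session
-- before and after a slow query, and compare the deltas
select sn.name, st.value
from   v$sesstat  st,
       v$statname sn
where  st.statistic# = sn.statistic#
and    st.sid = 123
and    ( sn.name like '%cleanout%'
      or sn.name like '%undo records applied%' );

-- Turn tracing back off when done
exec dbms_monitor.session_trace_disable(session_id => 123, serial_num => 4567);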

Assuming your machine is fine and you're getting hit by (c) or (d), then in terms of the options you proposed, one thing to be careful of is that when you "flick the switch" all queries will have to go through a parse phase again. Whether that is a concern really depends on volume/frequency.

Also, if you head down the passive/active route, a create-table-as-select is often much more effective than a merge.
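For example (the table, index, and column names below are purely illustrative, not your schema), the passive copy could be rebuilt in one pass rather than merged into:

-- Rebuild the passive copy: changed rows from the delta plus untouched rows from the current table
create table main_table_new nologging as
select d.*
from   delta d
union all
select m.*
from   main_table m
where  not exists ( select null
                    from   delta d
                    where  d.product_code = m.product_code );

-- Then rebuild indexes, gather statistics, and repoint the synonym
create index main_table_new_ix1 on main_table_new ( product_code );
exec dbms_stats.gather_table_stats(user, 'MAIN_TABLE_NEW');
create or replace synonym main_table for main_table_new;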


Comments

We found the root cause - so, any new recommendation?

Robert (original requestor), July 29, 2020 - 3:15 pm UTC

Our DBA identified root cause as:
Most of your application queries use the DCT_NETWORK_SEGMENT_IND1 index (via a hint), so while the merge statement is running against the table, that index is updated very heavily: all the new blocks have to be read from disk and the global cache has to be updated, so all dependent application queries slow down.

Unfortunately there is not much we can do to reduce the load on this single index, so we recommend pushing the merge load outside business hours, or using an alternative refresh design (pre-loaded tables, materialized views, synonym switch, etc.).

So, AskTom:
Pushing the merge out of business hours (one of your suggestions) is not possible.
Will a materialized view help then? For such a big table, and such a large update done by the merge, I guess even the MV refresh could affect reads. We have 100-200 queries per second against the table.
A synonym switch could be a better alternative, but how will the database handle queries already running during the switch? Will those be answered using the table the synonym pointed to before the switch?
Thanks.
Connor McDonald
July 30, 2020 - 1:08 am UTC

OK...here's a mechanism that lets you flip without interrupting existing queries.

Let's assume table T1 is our current large table

SQL> create table t1 
  2  as select d.* from dba_objects d,
  3    ( select 1 from dual connect by level <= 200 );

Table created.

SQL>
SQL> create index t1_ix1 on t1 ( object_id ) ;

Index created.

SQL> create index t1_ix2 on t1 ( object_name ) ;

Index created.


This is "version 1" of our data. When we want to evolve to version 2, we clone the data and make our changes. For this demo, our "change" is to lowercase some of the data

SQL> create table t2 
  2  as select * from t1;

Table created.

SQL>
SQL> create index t2_ix1 on t2 ( object_id ) ;

Index created.

SQL> create index t2_ix2 on t2 ( object_name ) ;

Index created.

SQL> update t2 set owner = lower(owner)
  2  where object_id < 100;

19600 rows updated.

SQL> commit;

Commit complete.



Now we need a mechanism to control whether users see T1 or T2. I will use a context that is available cross-session for that

SQL> create context CTX using pkg accessed globally;

Context created.

SQL> create or replace
  2  package pkg is
  3    procedure setver(p_ver varchar2);
  4  end;
  5  /

Package created.

SQL>
SQL> create or replace
  2  package body pkg is
  3    procedure setver(p_ver varchar2) is
  4    begin
  5      dbms_session.set_context('CTX','VER',p_ver);
  6    end;
  7  end;
  8  /

Package body created.

SQL>




We now have a view that consolidates BOTH sets of data, with a predicate to limit access based on our context

SQL> create or replace view USER_DATA as
  2  select * from t1
  3  where sys_context('CTX','VER') = '1'
  4  union all
  5  select * from t2
  6  where sys_context('CTX','VER') = '2';

View created.


and we are now good to go... Here's a sample query

SQL> set autotrace traceonly explain
SQL> select *
  2  from USER_DATA
  3  where object_id = 23;

Execution Plan
----------------------------------------------------------
Plan hash value: 194448029

----------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |           |   394 |    98K|   173K  (1)| 00:00:07 |
|   1 |  VIEW                                  | USER_DATA |   394 | 52008 |   402   (0)| 00:00:01 |
|   2 |   UNION-ALL                            |           |       |       |            |          |
|*  3 |    FILTER                              |           |       |       |            |          |
|   4 |     TABLE ACCESS BY INDEX ROWID BATCHED| T1        |   197 | 26004 |   201   (0)| 00:00:01 |
|*  5 |      INDEX RANGE SCAN                  | T1_IX1    |   197 |       |     3   (0)| 00:00:01 |
|*  6 |    FILTER                              |           |       |       |            |          |
|   7 |     TABLE ACCESS BY INDEX ROWID BATCHED| T2        |   197 | 26004 |   201   (0)| 00:00:01 |
|*  8 |      INDEX RANGE SCAN                  | T2_IX1    |   197 |       |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter(SYS_CONTEXT('CTX','VER')='1')
   5 - access("OBJECT_ID"=23)
   6 - filter(SYS_CONTEXT('CTX','VER')='2')
   8 - access("OBJECT_ID"=23)



Notice the FILTER lines. We immediately exclude one side of the UNION ALL from access based on the context value. So let's prove that with some test cases.

A query to T1 directly gives the following cost:

SQL> set autotrace traceonly stat
SQL> select *
  2  from T1
  3  where object_id = 23;

200 rows selected.


Statistics
---------------------------------------------------------
          1  recursive calls
          0  db block gets
        217  consistent gets
          0  physical reads
          0  redo size
      29941  bytes sent via SQL*Net to client
        542  bytes received via SQL*Net from client
         15  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        200  rows processed

SQL> set autotrace off


So a query to USER_DATA should be the same when we are using version 1 of the data.

SQL> set autotrace traceonly stat
SQL> select *
  2  from USER_DATA
  3  where object_id = 23;

200 rows selected.


Statistics
------------------------------------------------------
          0  recursive calls
          0  db block gets
        217  consistent gets
          0  physical reads
          0  redo size
       6934  bytes sent via SQL*Net to client
        549  bytes received via SQL*Net from client
         15  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        200  rows processed

SQL> set autotrace off


And now, just a flick of the switch on our context controls which version of the data queries see.

SQL> exec pkg.setver('1');

PL/SQL procedure successfully completed.

SQL> set autotrace off
SQL> select owner
  2  from USER_DATA
  3  where object_id = 23
  4  and rownum <= 10;

OWNER
------------------------------
SYS
SYS
SYS
SYS
SYS
SYS
SYS
SYS
SYS
SYS

10 rows selected.

SQL>
SQL> exec pkg.setver('2');

PL/SQL procedure successfully completed.

SQL> select owner
  2  from USER_DATA
  3  where object_id = 23
  4  and rownum <= 10;

OWNER
------------------------------
sys
sys
sys
sys
sys
sys
sys
sys
sys
sys

10 rows selected.


When you are ready for version 3 of your data, you just truncate T1, load the new data into it, flick the version back to 1, and so forth. Running queries are unaffected because they see the value of the context as it was when the query started; new queries pick up the new context value.

Because it's a global context, you change it once in your session (the one that does the data population) and every session sees the new value immediately.
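As a sketch of that refresh cycle (how you actually build the next version of the data is up to you; the direct-path copy from T2 below is just a placeholder for your real load):

-- T2 is currently the active version, so T1 is free to be rebuilt
truncate table t1;

-- Load the next version of the data into T1; a direct-path insert keeps the load cheap
insert /*+ append */ into t1
select * from t2;
commit;

-- Refresh optimizer statistics on the rebuilt copy
exec dbms_stats.gather_table_stats(user, 'T1');

-- Flick all new queries over to version 1; running queries keep the context value they started with
exec pkg.setver('1');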
