Reduce Redo log with GTT+Direct Path & "Force Logging" option


Question and Answer

Tom Kyte

Thanks for the question, Vincent.

Asked: March 15, 2012 - 11:38 am UTC

Last updated: October 29, 2012 - 4:54 am UTC

Version: 10.2.0.4.0

Viewed 10K+ times

You Asked

Hi Tom

For years our systems have relied on GTTs (with the "nologging" option, direct-path insert /*+ append */, ...) to manage many very complex data transformations (intermediate result sets in memory with PL/SQL arrays + the final result set in the GTT). That was, and still is, very efficient.

In the meantime, Data Guard has been implemented and the DBA set the "force logging" option.
Consequently, the amount of redo has increased significantly.

Our system is growing, and now we have to deal with a huge amount of redo generated by those data transformations (15 GB/hr, or much more, depending on user activity).

We have business requirements in the pipeline that will require even more data transformation!

My question is: how can I minimize/regulate this amount of redo? (Considering that we cannot extract the data with a single, even complex, SQL statement.)

1. Set "force logging" back to No, and manage the logging clause individually, per object.
Our DBA will probably reject this option... and how can we make sure that all new tables are created with the right logging clause?

2. Store the transformed data in a collection instead of a GTT:
> PL/SQL collection + pipelined function to read the data
> collection based on a database type + SQL to read the data

However, this option will likely consume much more CPU,
and I'm concerned about performance/stability with large data sets in memory.

3. Others ...?


Before starting tests, I would appreciate your advice on which approach is most relevant.

Thanks
Vincent

and Tom said...

Not to worry - your insert /*+ APPEND */ into global temporary tables is not generating redo; we do not generate redo for operations on TEMP. The only redo a global temporary table would generate would be for the undo - however, the direct path load skips the undo.

So, no redo is generated by this (unless of course you have indexes - the index maintenance would generate undo, and redo to protect that undo).

ops$tkyte%ORA11GR2> create global temporary table gtt as select * from all_objects where 1=0;

Table created.

ops$tkyte%ORA11GR2> create table t nologging as select * from all_objects where 1=0;

Table created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> select log_mode, force_logging from v$database;

LOG_MODE     FOR
------------ ---
ARCHIVELOG   YES

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> set autotrace on statistics
ops$tkyte%ORA11GR2> insert /*+ append */ into gtt select * from all_objects;

72255 rows created.


Statistics
----------------------------------------------------------
        168  recursive calls
       1047  db block gets
      47367  consistent gets
          0  physical reads
        256  redo size
        895  bytes sent via SQL*Net to client
       1024  bytes received via SQL*Net from client
          4  SQL*Net roundtrips to/from client
       1497  sorts (memory)
          0  sorts (disk)
      72255  rows processed

ops$tkyte%ORA11GR2> insert /*+ append */ into t select * from all_objects;

72255 rows created.


Statistics
----------------------------------------------------------
        456  recursive calls
       1704  db block gets
      47528  consistent gets
          0  physical reads
    8538760  redo size
        898  bytes sent via SQL*Net to client
       1022  bytes received via SQL*Net from client
          4  SQL*Net roundtrips to/from client
       1497  sorts (memory)
          0  sorts (disk)
      72255  rows processed

ops$tkyte%ORA11GR2> commit;

Commit complete.



Comments

Excellent !!!!

Vincent, March 16, 2012 - 11:25 am UTC

Your demo is so straightforward and explicit.
Thanks very much.


More generally, when would you recommend using table functions (collection types, pipelined functions, ...)?
Obviously there is no magic, generic answer, but I'm sure you have guidelines in mind!

With a table function there is no DML (as opposed to the insert into the GTT), but can we reasonably handle large data sets with this kind of approach (10,000 .. 100,000 rows)?

At least with the GTT we can use indexes, and overall the performance can be stable - linear - with regard to the size of the result set.

What happens once the data from the table function is used in complex queries, with multiple joins?
The system has no statistics for the data coming from the table function, no indexes... basically nothing to help choose a good plan to manipulate the data!


Thanks .
Vincent

Tom Kyte
March 16, 2012 - 12:06 pm UTC

More generally, when would you recommend using table functions (collection types,
pipelined functions, ...)?
Obviously there is no magic, generic answer, but I'm sure you have guidelines in
mind!


Pipelined functions are neat for doing ETL - they do not eat up copious amounts of RAM, since the data is truly "pipelined" in most cases. Your PL/SQL table function produces a few rows, feeds them to the next step, that step does a bit of work, and so on.

Do not use a collection variable here; truly pipeline the data.
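
For instance, a minimal sketch of such a truly pipelined transformation might look like this (the type, function, and table names - transform_row_t, transform_tab_t, etl_transform, source_table - are illustrative assumptions, not objects from this thread):

create type transform_row_t as object ( id number, val varchar2(100) );
/
create type transform_tab_t as table of transform_row_t;
/

create or replace function etl_transform( p_cur sys_refcursor )
return transform_tab_t
pipelined
as
    l_id   number;
    l_val  varchar2(100);
begin
    loop
        fetch p_cur into l_id, l_val;
        exit when p_cur%notfound;
        -- do the complex transformation here, then hand the row straight
        -- to the next step instead of collecting it in a PL/SQL variable
        pipe row( transform_row_t( l_id, upper(l_val) ) );
    end loop;
    close p_cur;
    return;
end;
/

-- consumed like a table, a few rows at a time:
select * from table( etl_transform( cursor( select id, val from source_table ) ) );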


What happens once the data from the table function is used in complex
queries, with multiple joins?
The system has no statistics for the data coming from the table function, no
indexes... basically nothing to help choose a good plan to manipulate the
data!


The cardinality hint is very useful here - you know roughly how much data it will return, and you can tell us that (just be approximate; it doesn't have to be exact).
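
As a sketch (reusing the hypothetical etl_transform function and an assumed big_table from the example above), it would look something like this:

-- tell the optimizer to expect roughly 100 rows from the table function,
-- instead of the blocksize-based default guess, so it can pick a sensible
-- join order and join method
select /*+ cardinality( t 100 ) */ b.*
  from table( etl_transform( cursor( select id, val from source_table ) ) ) t,
       big_table b
 where b.id = t.id;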

Great

Vincent ZANINI, March 20, 2012 - 11:15 am UTC

Thanks very much for that clarification.
Best Regards
Vincent

GTT Deletion without Undo

Vincent ZANINI, March 26, 2012 - 8:34 am UTC

Hi Again

> I would like to know if there is a way to delete from my GTT (created with ON COMMIT PRESERVE ROWS) without generating any undo/redo.
Re-using your example, the insertion of ~60,000 rows does not generate redo, while the deletion generates about 17 MB of redo and takes more than 2 seconds.
TRUNCATE would be an option, but it's a bit blunt!


> Concerning the insert process, I surprisingly noticed no major difference between
. a direct insert into a GTT
. a direct insert into a normal table with LOGGING active

Obviously the amount of redo is dramatically different, but the response times are very close. Naturally I would have expected a much bigger difference.
I ran several tests with larger data sets; a rough estimate gives a 10% difference in favor of the GTT. Is that a normal, expected figure in your experience?

Test ~ 600'000 rows

vini>REM
vini>REM Bulk DIRECT insert into GTT
vini>REM
vini>
vini>
vini>insert /*+ append */ into gtt SELECT /*+ ALL_ROWS */ o.* FROM all_objects o,(SELECT ROWNUM FROM all_objects WHERE ROWNUM <= 10) x ;

585000 rows created.

Elapsed: 00:00:07.04

Statistics
----------------------------------------------------------
287 recursive calls
8079 db block gets
101613 consistent gets
0 physical reads
328 redo size
361 bytes sent via SQL*Net to client
423 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
1020 sorts (memory)
0 sorts (disk)
585000 rows processed


vini>REM
vini>REM Same Bulk DIRECT insert into regular table with NOLOGGING
vini>REM
vini>insert /*+ append */ into t SELECT /*+ ALL_ROWS */ o.* FROM all_objects o,(SELECT ROWNUM FROM all_objects WHERE ROWNUM <= 10) x ;

585000 rows created.

Elapsed: 00:00:07.39

Statistics
----------------------------------------------------------
3756 recursive calls
10700 db block gets
102790 consistent gets
8217 physical reads
66664024 redo size
361 bytes sent via SQL*Net to client
421 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
1020 sorts (memory)
0 sorts (disk)
585000 rows processed

Thanks
Vincent

Tom Kyte
March 26, 2012 - 11:15 am UTC

Truncate is the way to empty a global temporary table without generating redo or undo. If you use delete, there will be less redo than for a 'regular' table, but all of the undo will have redo generated for it (and delete generates the MOST undo).
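
For example, emptying the GTT from the demo above could look like this (a minimal sketch; the exact statistics will vary):

set autotrace on statistics
delete from gtt;         -- generates undo, plus redo to protect that undo
rollback;
truncate table gtt;      -- (nearly) no undo or redo; removes only this session's rows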


As for point two, about your direct path test - response times in such a simple test, with such a tiny bit of data, are very misleading.

You have to think about many other things as well. We'll now have to archive that redo. You'll have to back up that redo. If other things were happening, we'd have to coordinate the generation of that redo.

No one said "a global temporary table is going to be faster than a regular table".

They've only said "they are temporary tables".

They behave pretty much just like a regular table in most ways.

Mayank, May 02, 2012 - 5:49 am UTC

Hi Tom,
You've suggested use of Cardinality hint with collections:
"the cardinality hint is very useful here - you know about how much data it will return - you can tell us that (just be approximate, doesn't have to be exact)"
I'm running Oracle 11.2.0.3.0, and using a nested table together with large tables (4 tables with approximately 4 million records each) in a SQL query. The nested table has only 10-20 rows, but Oracle produces a plan involving full table scans and hash joins, resulting in a high-cost plan. A nested-loop hint resolves the issue, but I'm not sure whether these hints (nested loop or cardinality) will keep working as the data continues to grow over time. Any thoughts?
Also, is a costly plan due to lack of statistics a common problem with table or pipelined table functions? This is the first time I've seen the problem in 4 years.
Tom Kyte
May 02, 2012 - 1:47 pm UTC

Pipelined table functions have no statistics - ever. So we assign them a default estimated cardinality (typically 8,168 rows in an 8k block size database).

The cardinality hint is a nice way to let the optimizer know how many rows you typically expect from the function. I would suggest trying a cardinality of 10. That would give the optimizer enough information to figure out that it should use a nested loops join (or maybe something even more efficient) to retrieve the data.
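
For a collection of 10-20 rows joined to a large table, that might look like the following sketch (:l_ids is assumed to be bound to a SQL collection type, big_t1 is a hypothetical ~4 million row table, and big_t1.id is assumed to be indexed):

-- with an estimate of 10 rows driving the join, the optimizer should
-- favour an index-driven nested loops plan over full scans + hash joins
select /*+ cardinality( ids 10 ) */ b.*
  from table( :l_ids ) ids,
       big_t1 b
 where b.id = ids.column_value;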

What exists in redo log files

Prashant, October 29, 2012 - 4:14 am UTC

Hi Tom,

We are recovering data from the redo log files.
What is stored in the redo log files? Is it all the SQL statements that ran after the backup was taken? If the answer is no, how does the Oracle engine identify all the transactions that need to be recovered?


Tom Kyte
October 29, 2012 - 4:54 am UTC

The redo logs, in short, contain a big list of bytes to change.

When you make a change to a block, you change some bytes from one value to another - not all of them, just some. We call this a 'redo vector'. This set of changed bytes gets recorded in the redo log. So, in the redo log you'll see things like "go to file N, block M and change the bytes at offsets X through Y to this: 0xXXXXXXXXXXXXXXXXXXXXX".

You will not find any SQL in there (well, you might if you have certain options turned on for Data Guard, but that will be DDL, not DML). You'll find a list of changes that need to be applied to blocks, in order, so that they can be recovered to a point in time.
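
If you want to see those change vectors in a readable form, LogMiner can present them and reconstruct equivalent SQL from them (the SQL you see is rebuilt from the vectors, not something that was stored). A minimal sketch - the log file path and the 'SCOTT' schema filter are assumptions; use one of your own redo or archived logs:

begin
    -- register the log of interest and start LogMiner with the online dictionary
    dbms_logmnr.add_logfile( logfilename => '/u01/app/oracle/arch/redo01.log',
                             options     => dbms_logmnr.new );
    dbms_logmnr.start_logmnr( options => dbms_logmnr.dict_from_online_catalog );
end;
/

-- SQL_REDO is reconstructed from the change vectors, not read back verbatim
select operation, seg_name, sql_redo
  from v$logmnr_contents
 where seg_owner = 'SCOTT';

exec dbms_logmnr.end_logmnr;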
