
Question and Answer

Tom Kyte

Thanks for the question, Joe.

Asked: February 09, 2007 - 11:30 am UTC

Last updated: February 22, 2007 - 7:49 pm UTC

Version: 10.2.1

Viewed 1000+ times

You Asked

I have written a PL/SQL package that copies data across multiple databases using a database link to our data warehouse. It uses BULK COLLECT with a limit of 1,000-2,500, depending on the column count of each table. I am committing outside the loop, so there is only one commit.
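
A minimal sketch of that pattern, for context - all table and link names here are hypothetical, invented for illustration:

declare
    cursor c is select c1, c2 from src_table@dw_link;  -- hypothetical source and link
    type t_rows is table of c%rowtype index by pls_integer;
    l_rows t_rows;
begin
    open c;
    loop
        fetch c bulk collect into l_rows limit 1000;
        exit when l_rows.count = 0;
        -- whole-record insert: the target's columns must match the
        -- cursor's select list (referencing individual record fields
        -- inside FORALL is restricted in 10g)
        forall i in 1 .. l_rows.count
            insert into tgt_table values l_rows(i);
    end loop;
    close c;
    commit;  -- the single commit, outside the loop
end;
/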

The DBAs say the amount of redo being generated is incredible, and they have had to increase the disk space available for Oracle to keep up. They haven't specifically stated that the issue is from my code, but the problem appeared around the time it was implemented.

How can I monitor redo generation, and how should I write code so that logging is kept to a minimum?

and Tom said...

By doing things in single SQL statements as much as possible.

The biggest impact on redo will be the amount of work done per call, per execution.

Also, look for opportunities to do direct path, non-logged operations (but coordinate with the DBAs!!! they need a backup right after).
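
For example, a hedged sketch using the same table t as the demo below (the actual redo savings depend on the database's ARCHIVELOG and FORCE LOGGING settings, and index maintenance is still logged):

alter table t nologging;

-- direct path load: writes formatted blocks above the high water mark,
-- bypassing the buffer cache; with NOLOGGING (and no database-level
-- FORCE LOGGING) only minimal redo is generated for the table data.
-- the indexes t_idx1 and t_idx2 are still maintained and logged.
insert /*+ append */ into t
select object_id, 'x', created from all_objects;
commit;

-- the newly loaded data is not in the redo stream - coordinate that
-- backup with the DBAs right away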

ops$tkyte%ORA10GR2> create table t ( x int primary key, y char(10), z date );

Table created.

ops$tkyte%ORA10GR2> create index t_idx1 on t(y);

Index created.

ops$tkyte%ORA10GR2> create index t_idx2 on t(z);

Index created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> @mystat "redo size"
ops$tkyte%ORA10GR2> set echo off

NAME                        VALUE
---------------------- ----------
redo size                84538288

ops$tkyte%ORA10GR2> begin
  2          for x in (select object_id a, 'x' b, created c from all_objects)
  3          loop
  4                  insert into t values ( x.a, x.b, x.c );
  5          end loop;
  6  end;
  7  /

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2> @mystat2
ops$tkyte%ORA10GR2> set echo off

NAME                        VALUE DIFF
---------------------- ---------- ------------------
redo size               144124840       59,586,552

ops$tkyte%ORA10GR2> truncate table t;

Table truncated.

ops$tkyte%ORA10GR2> @mystat "redo size"
ops$tkyte%ORA10GR2> set echo off

NAME                        VALUE
---------------------- ----------
redo size               144294508

ops$tkyte%ORA10GR2> begin
  2          insert into t select object_id, 'x', created from all_objects;
  3  end;
  4  /

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2> @mystat2
ops$tkyte%ORA10GR2> set echo off

NAME                        VALUE DIFF
---------------------- ---------- ------------------
redo size               168114280       23,819,772
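
The @mystat and @mystat2 helper scripts are not shown in the transcript; a sketch consistent with their output (and with the versions Tom has published elsewhere) would be:

-- mystat.sql: show a statistic for the current session and remember
-- its value in substitution variable V for mystat2.sql to diff against
set echo off
set verify off
column value new_val V
define S="&1"

select a.name, b.value
  from v$statname a, v$mystat b
 where a.statistic# = b.statistic#
   and lower(a.name) = lower('&S')
/
set echo on

-- mystat2.sql: show the statistic again, plus the change since mystat.sql
set echo off
set verify off
select a.name, b.value, to_char(b.value - &V, '999,999,999,999') diff
  from v$statname a, v$mystat b
 where a.statistic# = b.statistic#
   and lower(a.name) = lower('&S')
/
set echo on

Whatever the exact scripts, the numbers above make the point: the row-by-row loop generated 59,586,552 bytes of redo, while the single INSERT ... SELECT generated 23,819,772 - less than half.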


Comments

vadim, February 11, 2007 - 6:03 pm UTC

Hi Tom,

Isn't it because of the two-phase commit logic? I have always used RPCs over db links, with PL/SQL table OUT parameters, to avoid SELECTs over db links. That was common practice since Oracle 7.3. Is it still the better approach with 10g?

Regards,

Vadim
Tom Kyte
February 12, 2007 - 10:31 am UTC

I was not using a dblink.

It is the difference between slow by slow processing versus bulk processing.

I would not avoid selects over dblinks ever - you have the same 2pc stuff going on with remote procedure calls like that.
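
To illustrate the difference in the original question's context (link and table names hypothetical):

-- slow by slow: one insert execution per row, with per-execution
-- redo/undo bookkeeping
begin
    for r in ( select c1, c2 from src_table@dw_link )
    loop
        insert into tgt_table (c1, c2) values (r.c1, r.c2);
    end loop;
    commit;
end;
/

-- bulk: one statement, one execution; the select over the db link is
-- fine, and two-phase commit is involved either way
insert into tgt_table (c1, c2)
select c1, c2 from src_table@dw_link;
commit;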

Redo Generation using DB link

Ghulam Rasool, February 20, 2007 - 6:59 am UTC

I consider you my GURU and I am a humble student of yours.
I have two very simple questions:
1. Should we use a DB link if we can avoid it?
2. I have two databases, A and B, on two different servers. A big table resides on B. A background process (call it BGP) runs on the server where database A resides. BGP reads data from database A and writes it to database B through a database link. The question is: where will the redo logs be generated? Please give an example as proof of concept.
TIA
Tom Kyte
February 20, 2007 - 9:50 am UTC

1) should we use a database link if we can avoid it...

not sure how to answer that. Don't do anything you can avoid doing - the fastest way to do something is "not to do it", so if you don't have to do something....

would need specifics before answering something like that.


2) The database being written to will obviously generate redo - it is the thing that needs to be able to REDO the operation. The database being read from might generate redo as well, since a select can and will generate redo under many circumstances (delayed block cleanout, for example).

No proof of concept needs to be done for something like this; the database you write to obviously needs redo to protect itself.
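
If you want to see it for yourself, the query that the @mystat script above wraps can be run in the working session on each database:

-- run before and after the copy, in the session doing the work, on each
-- database; the difference in value is the redo that session generated
select b.value redo_bytes
  from v$statname a, v$mystat b
 where a.statistic# = b.statistic#
   and a.name = 'redo size';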

Ghulam Rasool, February 21, 2007 - 4:06 am UTC

I am sorry I was not able to make myself clear. I'll try one more time, giving the full scenario in detail. We are talking about a telecom database (RBO mode). The size of the database is almost 5 TB.

We have some very large call tables, 500 GB each. One of our consultants suggests that:

- Each call table should have a separate database.
- These databases can be created either on the production server or on any other server, depending on the availability of resources.
- These tables will be accessible to online agents through database links, such that the end user will not feel any difference.

Disadvantages of having these tables in the production DB:
- They generate archive logs very frequently; in turn, the database spends valuable time archiving them, and this degrades overall performance.
- They occupy a huge amount of space inside the TABS database, which makes backup, restore, and recovery take a very long time.
- The huge space occupied by these tables makes cloning the database very hard; it seems undoable.

Advantages of hosting all these large tables in a separate instance:
- Offload TABS when running long-running reports.
- Generate fewer archive logs, which will improve the overall performance of TABS.
- Shrink the production database so that it is easy to back up, easy to restore, easy to clone, and easy to keep a standby system for.

MY POINT IS:
Instead of having separate databases for these tables, we can have:

- One HISTORY database in which we can archive the data purged from production.
- In order to reduce redo, we can create these tables and their associated indexes with the NOLOGGING option. For recovery purposes, we can use export dumps of the temporary tables. (We create a temporary table for the call detail records, process the data, insert into the large call tables, export the temp table, and drop it.)
- To me, the maintenance/handling of the call tables and other big tables is the real issue. We cannot afford to keep months of data in an OLTP system. Therefore, for performance and manageability, the production database should keep only the most recent data required for the whole operation.

To address the issue, there should be a purging and archiving policy that is followed strictly for the production database as well as the history database. If we purge the data properly, I don't think the size of the database would exceed 1 TB.
Three types of tables are essentially candidates for purging:

- Call tables
- Log tables
- Tables used to run the system

Ideally, for the call tables it is suggested that:

- Keep the current month plus one month of data in all call tables. The rest of the data should be moved to the history database (data retention there depends on business requirements).
- The advantage of this approach is less scanning and, in turn, better database performance.
- If there is a need to read the combined data for reporting or queries, we can create a view over both tables (a sketch follows below).
- The DBA team will need to schedule housekeeping for these tables.
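
A minimal sketch of such a view, with hypothetical table and link names:

-- calls_current holds the recent data in production; calls_hist lives
-- in the history database, reached over a db link named histdb
create or replace view calls_all as
select * from calls_current
union all
select * from calls_hist@histdb;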

Advantages:
- It will also eliminate the dependency of system operation on other databases.
- This option would eliminate the requirement of a database link for overall system operation. If you look at how a database link works, you will find the following additional steps needed to complete a request, which would definitely impact performance:

  - SELECT * FROM TABLE@DBLINK
  - The database resolves DBLINK to a host name; it will use TNSNAMES.ORA unless the link is defined with a full connect descriptor.
  - Name resolution (DNS, NIS, etc.) resolves the host to a TCP/IP address.
  - A connection is made to a listener at that TCP/IP address.
  - The listener for that port resolves the SID and completes the connection to the database.

I hope this time the question is clear.
TIA
Tom Kyte
February 21, 2007 - 10:54 am UTC

... Each Call table should have a separate db. ..
Allow me to be blunt. That would be stupid. 100%.



None of the reasoning makes sense. Archiving, on a properly configured system, adds approximately 0% overhead to the performance of the database.

None of the backup, recovery, whatever scenarios make sense either - since you, well, have to back them up anyway.

All multiple databases will do is:

consume more disk
consume more ram
consume more of your time
make patching a nightmare
increase your complexity
make tuning a machine virtually, no - physically, impossible.

Thanks for the reply

Ghulam Rasool, February 22, 2007 - 3:56 pm UTC

What about the second option, where I am talking about a purging policy and having a history database?


Tom Kyte
February 22, 2007 - 7:49 pm UTC

I personally do not see the need for another database; that would be up to you.

large databases are not "slow" or "scary"
