Tuning heterogeneous services generic connectivity to SQL Server


Question and Answer

Tom Kyte

Thanks for the question, Cristian.

Asked: February 16, 2003 - 2:16 pm UTC

Last updated: February 22, 2005 - 8:44 am UTC

Version: 8.1.7

Viewed 1000+ times

You Asked

I'm loading data from SQL Server 2000 into an Oracle 8i database; the databases are on different machines, both running Windows 2000 Server. For this task I'm using heterogeneous services generic connectivity to SQL Server, with an ODBC driver for SQL Server. I managed to get it working: I can select the data from SQL Server and insert it into Oracle. Now I'm looking to improve the speed of the load. I have to load approximately 3 million rows, distributed across roughly 15 tables. So far I have:

1.- Truncated the tables prior to loading them, using the reuse storage clause.
2.- Done the inserts with insert /*+ append */ into t1 select * from t1@sqlserver.
3.- Set the indexes unusable prior to inserting, and rebuilt them with nologging after loading.
4.- Run the inserts concurrently, but no more than 4 at a time, by opening multiple SQL*Plus windows.
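
For reference, a minimal sketch of that per-table sequence -- t1 and the database link name come from the question above, the index name t1_idx is made up:

truncate table t1 reuse storage;

alter index t1_idx unusable;
alter session set skip_unusable_indexes = true;  -- ignore the unusable (non-unique) indexes during the load

insert /*+ append */ into t1 select * from t1@sqlserver;
commit;

alter index t1_idx rebuild nologging;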


What I want to know is whether the following would be performance improvements:

1.- Using an OLE DB driver instead of ODBC.
2.- Configuring multiple agents (I noticed that when I open multiple SQL*Plus windows and run the insert ... select from SQL Server at the same time, multiple agents are started -- multiple instances of hsodbc.exe running).
3.- Changing the values of HS_RPC_FETCH_SIZE and HS_RPC_FETCH_REBLOCKING to see what happens.
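
On point 3: those parameters live in the generic connectivity agent's init file under ORACLE_HOME\hs\admin (init<hs_sid>.ora). A sketch with purely illustrative values -- the DSN name is made up, and the right fetch size would have to come from your own testing:

# init<hs_sid>.ora -- heterogeneous services agent parameters
HS_FDS_CONNECT_INFO = sqlserver_dsn      # ODBC data source name for SQL Server
HS_RPC_FETCH_SIZE = 500000               # bytes fetched per round trip
HS_RPC_FETCH_REBLOCKING = ON             # keep filling the fetch buffer before sending it across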


Any recommendation would be greatly appreciated. Are there any benchmarks on this?


and Tom said...

I would never use odbc for 3 million of anything.

You want to load fast? bcp the data out of sqlserver into a flat file and use sqlldr to reload.

On my laptop, I can do about 1.8 million records in about 30 seconds that way.


You want speed -- use load utilities.
You want slow -- odbc ;)


the sqlldr utility, when loading into a nologging table using the direct=y option, is blindingly fast -- no undo, no redo, just raw speed.
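
A hedged sketch of that path -- the database, file, login and column names below are all made up.

On the SQL Server side, dump the table to a delimited flat file:

bcp mydb.dbo.t1 out t1.dat -c -t"|" -S sqlserver_host -U loader -P secret

A minimal SQL*Loader control file, t1.ctl (truncate empties the table before the load):

load data
infile 't1.dat'
truncate
into table t1
fields terminated by '|'
(col1, col2, col3)

Then the direct path load on the Oracle side:

sqlldr userid=scott/tiger control=t1.ctl direct=y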



Comments

Tuning heterogeneous services generic connectivity

Cristian Perez, February 16, 2003 - 3:09 pm UTC

Thanks for your answer. I completely agree with you regarding the sqlldr reload approach. I wanted to know how fast it could go with heterogeneous services generic connectivity, and whether any improvements could be made to that approach.

Tom Kyte
February 16, 2003 - 3:34 pm UTC

Not very fast.

the odbc layer is going to kill you here.

bcp+sqlldr would be the only method I would consider -- in the time we took to have this back and forth -- it would have been loaded (and indexed and queried a couple of times)


but if you go down this path:

insert /*+ append */ into a NOLOGGING table

set the indexes unusable

in as many sessions as you can -- if you have 2 cpus, 8 sessions would not be bad. Use a where clause that doesn't kill sqlserver to split the data up.
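
A sketch of that split, assuming a hypothetical numeric id column that carves the rows into ranges cheaply on the sqlserver side. (If several sessions must load the same target table, drop the append hint or give each session its own staging table -- direct-path inserts serialize on a table lock.)

-- session 1
insert /*+ append */ into t1 select * from t1@sqlserver where id < 1000000;
commit;

-- session 2
insert /*+ append */ into t1 select * from t1@sqlserver where id >= 1000000 and id < 2000000;
commit;

-- session 3
insert /*+ append */ into t1 select * from t1@sqlserver where id >= 2000000;
commit;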




Performance of Heterogeneous Service

Matt, February 21, 2005 - 5:47 am UTC

I am just thinking about how to migrate data from a mainframe system to Oracle.

Initially I envisaged generating flat file extracts from the mainframe and validating/loading these as external tables. However, by using heterogeneous services I can eliminate this step (and some of the dev effort around the flat files). I could then carry out a pure SQL migration by pulling the data from the remote database, validating it and loading it into Oracle. My main concern with this approach is that the (possibly) complicated SQL that would retrieve the data from the mainframe and load it into Oracle would be hard to tune if slow (lack of mainframe database knowledge on my part).
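
For comparison, the flat-file route in Oracle terms is roughly the following -- the directory path, file and column names are all hypothetical:

create directory mf_extract_dir as '/data/mainframe_extracts';

create table customers_ext (
  cust_id    number,
  cust_name  varchar2(100)
)
organization external (
  type oracle_loader
  default directory mf_extract_dir
  access parameters (
    records delimited by newline
    fields terminated by '|'
    missing field values are null
  )
  location ('customers.dat')
)
reject limit unlimited;

-- validate/transform with plain SQL against customers_ext, then load the real table
insert /*+ append */ into customers select * from customers_ext;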

From your comment above, I understand that heterogeneous services *may* not be the most performant approach. Obviously I would need to actually prototype something to put some figures against the speed of this (and some thought into how fast it *needs* to be).

An alternative would be to avoid any complex distributed SQL and pull the data over to Oracle table by table into a copy of the remote schema (say as GTTs). The remaining processing is then local to this data and access will be faster.
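
A sketch of that staging idea -- the database link name 'mainframe' and the table names are made up; on commit preserve rows keeps the copied rows around for the rest of the session:

-- build an empty staging copy of each remote table (one shown here)
create global temporary table stg_customers
on commit preserve rows
as select * from customers@mainframe where 1 = 0;

-- pull the data across once, then do all further work against the local copy
insert /*+ append */ into stg_customers select * from customers@mainframe;
commit;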

From your experience, which might be the most efficient approach:

1) Writing flat file data extracts and transferring these to the local DB to be accessed as external tables; or

2) Writing a data transfer process to pull the data into Oracle as GTTs (CTAS or insert /*+ append */) and accessing the GTTs directly.

Best Regards,

Tom Kyte
February 21, 2005 - 10:47 am UTC

if I could have sql access to the data for a one time "copy it over and stage it to transform it"

rather than "get a programmer to extract the data, ftp it converting ebcdic to ascii, find out how many mis-communications we had -- redo the extract, ftp it again, finally get the sqlldr control files/external tables created to load it properly"


I would go with #1 to start, if it worked, it would be my path of least resistance (i'm assuming odbc access to the mainframe data source, else you are in the domain of the gateways and for a one time conversion -- they might be overkill, they are nice for ongoing processing)

Clarification

Matt, February 21, 2005 - 10:59 am UTC

Your #1 option being to configure the heterogeneous service to allow SQL access (using the ODBC API) to the mainframe data (rather than mess with flat files)?

With SQL access, I can then either:
i) write a complex query to retrieve and validate the data; or
ii) collect the data using SQL a table at a time.

Most likely I will try i) and fall back to ii) if I have performance issues.



Tom Kyte
February 21, 2005 - 12:47 pm UTC

option one is sql, yes.


I would probably do "ii)" out of the box so as to not impact the mainframe and just to get it into something I work with better -- you'll find it'll be faster to try and retry against local data.

Something just struck me...

Matt, February 22, 2005 - 6:40 am UTC

It just struck me that what this migration is actually trying to achieve is very similar to an ETL process.

It may make sense to use OWB with the heterogeneous service to define a process that extracts the data from the mainframe and places it into local copies of the mainframe tables; the next stage could then be driven by various OWB steps to validate and translate the data and load (or merge) it into the target Oracle DB.

Is this approach feasible?

It seems to me that the migration process (as I understand it) might be quicker and more efficient if I 'rolled my own'. However, I may not be the person doing the implementation, or maybe there will be a team of people doing it. I have yet to get to grips with the size of this migration.

I'm just looking for ways to speed up the development and maintenance (and maybe save some dollars) of the migration process.

Regards,

Tom Kyte
February 22, 2005 - 8:44 am UTC

generally (and I'm speaking loosely here) the tools are useful for things that have a long life and many people involved. They tend not so much to speed things up as to bring a certain level of documentation, rigor, commonality and reuse into play -- and then to schedule and run these processes over and over and over.

For a one time mass migrate by a small team, you might be better off doing it yourself -- unless you know the tool already (so no learning curve) and feel comfortable with it.
