Monday, March 26, 2012

need faster data transfer :: oledb

i need to fetch data into multiple tables from a remote server (ss 2005)...so inside a sequence container , i have placed various data flow task...each task in turn gets data from oledb source to oledb destination.....there is minimal transformation involved...but the data is huge...and its taking unexpectably long...(im gettin only fresh data..identified by datetime fields of source tables..thats the only check.),

and way to make it faster...source db in not in my control...so i cant get indexes on it..:(

There are way to many reasons for it to be slow. It could be your server, their server, network, etc.

The best performance you will get would be a "Execute SQL Task" and not a Data Task. You might need to do a "SELECT * INTO temptable FROM REMOTE.table" and then do your transforms into the real target table on your server.

When I have to do this kind of work, I usually just transfer the entire target to a new database and then do my work on my server.|||

Nitin Khurana wrote:

i need to fetch data into multiple tables from a remote server (ss 2005)...so inside a sequence container , i have placed various data flow task...each task in turn gets data from oledb source to oledb destination.....there is minimal transformation involved...but the data is huge...and its taking unexpectably long...(im gettin only fresh data..identified by datetime fields of source tables..thats the only check.),

and way to make it faster...source db in not in my control...so i cant get indexes on it..:(

There are so many "it depends" answers to this that its hardly worth replying.

There is a webcast on performance tuning that you may wish to watch:

Donald Farmer's Technet webcast
(http://blogs.conchango.com/jamiethomson/archive/2006/06/14/4076.aspx)

And info here about a performance tuning whitepaper:

Whitepaper on Performance Tuning Techniques
(http://blogs.conchango.com/jamiethomson/archive/2006/04/09/3594.aspx)

-Jamie

|||

Nitin Khurana wrote:

i need to fetch data into multiple tables from a remote server (ss 2005)...so inside a sequence container , i have placed various data flow task...each task in turn gets data from oledb source to oledb destination.....there is minimal transformation involved...but the data is huge...and its taking unexpectably long...(im gettin only fresh data..identified by datetime fields of source tables..thats the only check.),

and way to make it faster...source db in not in my control...so i cant get indexes on it..:(

Be sure you are using a SQL select statement in your OLE DB Source. Select just the columns you need, and if you need them all, "select *" has shown to be faster than simply selecting the table from the drop-down box.

Are your data flow tasks hooked sequentially to each other, or are you running as many as you can in parallel? If you're running them in parallel, try completing one first before moving on to the next one.|||okkk...let me tell a bit more.... network/server config etc are not my scope of prob...there r other packages performing gud...to be specific, im using the fast load option, (oledb source and destination) , in few tables , number of rows are over a few million (after filtering ,i.e which need to be transfered), these few tables are the bottleneck ..consider both source and destination as highend servers...what are the options i can set for rows per batch,..commit size,etc , size of rows is not much either , say 200 - 300 KB.... so what are the options i have..|||

Nitin Khurana wrote:

okkk...let me tell a bit more.... network/server config etc are not my scope of prob...there r other packages performing gud...to be specific, im using the fast load option, (oledb source and destination) , in few tables , number of rows are over a few million (after filtering ,i.e which need to be transfered), these few tables are the bottleneck ..consider both source and destination as highend servers...what are the options i can set for rows per batch,..commit size,etc , size of rows is not much either , say 200 - 300 KB.... so what are the options i have..

Are there indexes in place on the destination tables?

Try this:
Copy your package and in the new one (package_copy), take your OLE DB source and go right into a row counter and only a row counter -- nothing else. Run the package. How long does it take?

Then create another copy of your "master" package and in that copy (package_copy2) replace your destination with a row counter. How long does that take?

Create a copy of your "package_copy" package and remove the row counter and replace it with a raw file destination. Run the package. Then, delete everything and start over with a raw file source (pointing to the file just created) and then attach it to a row counter. How long does this new package take to run?

=================

The first test tells you how long it takes to get data from the source as fast as you can. (VALUEA)
The second test tells you how long it takes to get through the data flow, not counting the destination. (VALUEB)
The third test tells you how efficient the source pull is. (VALUEC)

Take the total package run time (your problem) and subtract from it VALUEB. This number tells you how long it takes to put data into the destination table. (VALUED)

Take VALUED and subtract from it VALUEA. This tells you how long it takes to transform the data.

Take VALUEA and subtract from it VALUEC. This tells you how inefficient the source pull is if the difference is not zero. (not likely)

*** NOTE: I first learned of this technique by watching a Webcast by Donald Farmer titled, SQL Server 2005 Integration Services: Performance and Scale.|||

Nitin Khurana wrote:

okkk...let me tell a bit more.... network/server config etc are not my scope of prob...there r other packages performing gud...to be specific, im using the fast load option, (oledb source and destination) , in few tables , number of rows are over a few million (after filtering ,i.e which need to be transfered), these few tables are the bottleneck ..consider both source and destination as highend servers...what are the options i can set for rows per batch,..commit size,etc , size of rows is not much either , say 200 - 300 KB.... so what are the options i have..

Only you can answer those questions. Test and measure is my advice to you.

There is some useful performance tuning info in the whitepaper that I linked to earlier.

You might also want to look at using the SQL Server Destination if at all possible.

-Jamie

|||

<<after filtering ,i.e which need to be transfered>>

This is where I would look first. That and network lag.

|||

I am having the same problem. Taking data from 1 table and copying it to another table. The table only consists of 3 columns, 1 is a data type image. In SSMS it also takes forever to open the table so I can see part of the reason why this package is taking so long. But it is taking about 20 minutes to get the data from the source and another 20 minutes to commit the data. With the old dts package it used a copy sql server objects task and the whole job ran in about 20 minutes. So I also tried using a transfer sql server objects task to copy the data from this one table and it is also taking around 40 minutes.

I have no idea why it would take so much longer.

Sam

|||

samssb wrote:

I am having the same problem. Taking data from 1 table and copying it to another table. The table only consists of 3 columns, 1 is a data type image. In SSMS it also takes forever to open the table so I can see part of the reason why this package is taking so long. But it is taking about 20 minutes to get the data from the source and another 20 minutes to commit the data. With the old dts package it used a copy sql server objects task and the whole job ran in about 20 minutes. So I also tried using a transfer sql server objects task to copy the data from this one table and it is also taking around 40 minutes.

I have no idea why it would take so much longer.

Sam

Did you run the test that I indicated above? That will help determine if it's the source pull or the destination push that is causing the bottleneck. However, ensure that you are using FAST LOAD in the OLE DB destination, otherwise you'll get slow performance.sql

No comments:

Post a Comment