Hello,
I have a system that I am writing to automatically import Credit Transaction data into a SQL Server 2000 database. I am using a VB.Net application to detect when the file arrives and prep it for parsing. The file is approximately 10 MB of relatively complex hierarchical data that is defined by 2-character tokens at the beginning of each data line. The structure breaks out into approximately 6 parent-child related tables with numerous lookup tables on each primary table.
The previous approach loaded each line of raw data into an Import table and used a Stored Procedure to break down the information into the tables. The Stored Procedure is large and very complex. It also has the problem that the parsing takes so long that I need to set a very long timeout or I get errors. It is also difficult to track progress through the import for the user to see.
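(Just to make that workaround concrete: the long timeout has to be set on the ADO.NET command rather than on the connection itself. The following is only a rough sketch; the connection string and procedure name are placeholders, not the real ones in this system.)

' Sketch only: running a long parsing procedure from VB.Net with an extended timeout.
Imports System.Data
Imports System.Data.SqlClient

Module TimeoutSketch
    Sub RunImportParse()
        ' Placeholder connection string and procedure name.
        Dim cn As New SqlConnection("Server=myServer;Database=myDb;Integrated Security=SSPI;")
        Dim cmd As New SqlCommand("usp_ParseImport", cn)
        cmd.CommandType = CommandType.StoredProcedure
        ' The command timeout (not the connection timeout) governs how long
        ' ADO.NET waits for the procedure to finish; 0 means wait indefinitely.
        cmd.CommandTimeout = 0
        cn.Open()
        Try
            cmd.ExecuteNonQuery()
        Finally
            cn.Close()
        End Try
    End Sub
End Module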
The options I see at this point are:
1. Use the current approach of loading the raw data into an Import table and parsing with a Stored Procedure.
2. Write the parsing routine in VB.NET and use ADO.NET to load the data into the appropriate tables.
3. Use a mixed approach of breaking the massive Stored Procedure into smaller procedures and managing the steps with VB.Net.
I am looking for any tips and advice that the gurus out there are willing to give, especially a comparison of the efficiency of doing ADO.NET bulk inserts vs. Stored Procedure parsing.
Thanks for any help given.
Doug
PS - Sorry for the cross-posting in dotnet.general, but I am hoping to get a balanced set of views on this issue.

Doug,
I have heard that SQL Server is not good at parsing routines, which may lead you to do it in your client app.
What about using XML to represent this complex hierarchical data?
AMB
"Doug R" wrote:
> Hello,
> I have a system that I am writing to automaticly import Credit Transaction
> data into a SQL Server 2000 Database. I am using a VB.Net application to
> detect when the file arives and prep it for parsing. The file is
> aproximately 10Mb of relatively complex hierarchal data that is defined by
2
> character tokens at the begining of each data line. The structure breaks
out
> into aproximately 6 parent-child related tables with numerous lookup table
s
> on each primary table.
> The previous aproach loaded each line of raw data into an Import table and
> used a Stored Procedure to break down the information into the tables. Th
e
> Stored Procedure is large and very complex. It also has the problem that
the
> Parsing takes so long I need to implement a very long timeout on my
> connection or I get errors. It is also difficult to track progress throug
h
> the Import for the user to see.
> The options I see at this point are;
> 1. Use the current approach of loading the raw data into an Import table a
nd
> Parsing with a Stored Procedure.
> 2. Write the Parsing Routine in VB.NET and use ADO.NET to load the data in
to
> the apropriate tables.
> 3. Use a mixed aproach of breaking the massive Stored Procedure into small
er
> procedures and managing the steps with VB.Net.
> I am looking for any tips and advice that the guru's out there are willing
> to give. Especially a comparision of the effiency of doing ADO.NET bulk
> inserts vs. Stored Procedure parsing.
> Thanks for any help given.
> Doug
> PS - Sorry for the cross posting in dotnet.general but I am hoping to get
a

I have done the same type of thing in the recent past. The methodology I have used is to execute a BCP statement into a properly structured table.
BCP is native to SQL Server, much faster than a Bulk Insert statement, deals with security in-line (you specify the user and password in the command), and if the data is consistent (the same type of data appears in the same position in each line) the BCP command will import into the properly formatted table without any further parsing required. The syntax for the command is listed in Books Online.
You will need to create a format file for the import to use. This is simply a matter of executing a BCP FORMAT command against the desired SQL Server destination table. Once the format file is built, you use it to tell the BCP IN command the structure of the data you are importing. It is an extra step to create the format file, but once you have it, you place it in a location where your process can access it and it will do the heavy lifting for you.
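For illustration only, generating that format file could be kicked off from the VB.Net watcher application as shown below; the server, database, table, file path, and login are all hypothetical placeholders, and running bcp via Process.Start is just one way to issue the command line.

' Sketch only: builds and runs the one-time BCP FORMAT command.
' Server, database, table, paths, and login are placeholders.
Imports System.Diagnostics

Module BcpFormatSketch
    Sub BuildFormatFile()
        ' Ask bcp to emit a character-mode format file describing dbo.ImportRaw.
        Dim fmtArgs As String = _
            "MyDb.dbo.ImportRaw format nul -f C:\Import\ImportRaw.fmt -c " & _
            "-S MYSERVER -U importUser -P importPwd"
        Dim p As Process = Process.Start("bcp.exe", fmtArgs)
        p.WaitForExit()
    End Sub
End Module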
The advantages to this process are several. First, because you are pre-defining a data structure for the import process, no additional parsing is needed. Second, because BCP runs as a command-line utility it is very fast. Third, because the user ID and password are contained in the command line, connection strings and other security items are unnecessary.
There is one additional caveat to using BCP in a procedure. You will have to run it through the command shell using the xp_cmdshell extended stored procedure. This allows SQL Server to run operating-system commands from within a stored procedure. The user ID that runs this will need execute permission on the master.dbo.xp_cmdshell proc, which can be granted in SQL Server Enterprise Manager.
My suggestion for implementation would be to write a simple stored procedure that builds the BCP statement as a varchar variable and passes it to master.dbo.xp_cmdshell to run. Call this stored procedure using the ExecuteNonQuery method of the ADO.NET SqlCommand object in VB.Net. Doing this will allow for full automation of the import process.
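A minimal sketch of the idea follows. All names (server, database, staging table, format file path, login) are hypothetical, and for brevity the BCP string is built in VB.Net and sent straight to xp_cmdshell rather than wrapped in the stored procedure described above; the mechanics are the same either way.

' Sketch only: runs BCP IN through xp_cmdshell via ExecuteNonQuery.
Imports System.Data
Imports System.Data.SqlClient

Module BcpImportSketch
    Sub ImportFile(ByVal dataFile As String)
        Dim bcpCmd As String = _
            "bcp MyDb.dbo.ImportRaw in " & dataFile & _
            " -f C:\Import\ImportRaw.fmt -S MYSERVER -U importUser -P importPwd"
        ' Double up single quotes so the command can sit inside a T-SQL literal.
        Dim sql As String = _
            "EXEC master.dbo.xp_cmdshell '" & bcpCmd.Replace("'", "''") & "'"

        Dim cn As New SqlConnection("Server=MYSERVER;Database=MyDb;Integrated Security=SSPI;")
        Dim cmd As New SqlCommand(sql, cn)
        cmd.CommandTimeout = 0   ' the bulk load of a 10 MB file may take a while
        cn.Open()
        Try
            cmd.ExecuteNonQuery()
        Finally
            cn.Close()
        End Try
    End Sub
End Module

From there the existing parsing procedure, or the smaller procedures from option 3, can be run against the staging table.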
If you need any further assistance you can contact me at
groutme_alternate@.sbcglobal.net.
Good Luck:
groutme in SO Cal.
"Doug R" <DougR@.discussions.microsoft.com> wrote in message
news:D7FEFA89-E6BD-49C9-A0E5-5464130B3440@.microsoft.com...
> Hello,
> I have a system that I am writing to automaticly import Credit Transaction
> data into a SQL Server 2000 Database. I am using a VB.Net application to
> detect when the file arives and prep it for parsing. The file is
> aproximately 10Mb of relatively complex hierarchal data that is defined by
2
> character tokens at the begining of each data line. The structure breaks
out
> into aproximately 6 parent-child related tables with numerous lookup
tables
> on each primary table.
> The previous aproach loaded each line of raw data into an Import table and
> used a Stored Procedure to break down the information into the tables.
The
> Stored Procedure is large and very complex. It also has the problem that
the
> Parsing takes so long I need to implement a very long timeout on my
> connection or I get errors. It is also difficult to track progress
through
> the Import for the user to see.
> The options I see at this point are;
> 1. Use the current approach of loading the raw data into an Import table
and
> Parsing with a Stored Procedure.
> 2. Write the Parsing Routine in VB.NET and use ADO.NET to load the data
into
> the apropriate tables.
> 3. Use a mixed aproach of breaking the massive Stored Procedure into
smaller
> procedures and managing the steps with VB.Net.
> I am looking for any tips and advice that the guru's out there are willing
> to give. Especially a comparision of the effiency of doing ADO.NET bulk
> inserts vs. Stored Procedure parsing.
> Thanks for any help given.
> Doug
> PS - Sorry for the cross posting in dotnet.general but I am hoping to get
a

Thanks for the reply. Regrettably I have no control over the form of the input data, or I would be using XML. The data is being sent to us from an outside business.
Are you familiar with ADO.NET for doing bulk inserts? I would be very
interested in hearing the Pros and Cons of Stored Proc vs ADO.NET.
An alternative would possibly be something that parses the import data file into XML before the insert. Any ideas on how this could be done efficiently?
"Alejandro Mesa" wrote:
> Doug,
> I have heard that sql server is not good for parsing routines, so this can
> lead you to do it in your client app.
> What about using XML to represent this complex hierarchal data?
>
> AMB
> "Doug R" wrote:

If your stored procedure uses cursors or some other method to process the data line by line, then you can probably rewrite some or all of it as set-based SQL code for a significant performance improvement.
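As a rough illustration only (the table and column names and the '01' token are invented, not Doug's actual schema), the set-based idea is to shred each record type in one statement instead of walking the import table row by row; each statement can also be issued step by step from VB.Net as in option 3.

' Sketch only: one set-based INSERT per record type instead of a cursor.
' ImportRaw, RawLine, CreditHeader, and the '01' token are hypothetical names.
Imports System.Data.SqlClient

Module SetBasedSketch
    Sub LoadHeaders(ByVal cn As SqlConnection)   ' expects an open connection
        Dim sql As String = _
            "INSERT INTO dbo.CreditHeader (AccountNo, TranDate) " & _
            "SELECT SUBSTRING(RawLine, 3, 16), SUBSTRING(RawLine, 19, 8) " & _
            "FROM dbo.ImportRaw WHERE LEFT(RawLine, 2) = '01'"
        Dim cmd As New SqlCommand(sql, cn)
        cmd.CommandTimeout = 0
        cmd.ExecuteNonQuery()   ' one statement handles every '01' line at once
    End Sub
End Module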
> Especially a comparison of the efficiency of doing ADO.NET bulk
> inserts vs. Stored Procedure parsing
Bulk Insert is equally possible from TSQL or ADO.NET code, so this doesn't seem to me like a meaningful comparison. The real question is whether TSQL or client code can transform the data efficiently. Since this is fundamentally about data manipulation my money would be on TSQL ... but that's without seeing the detail of the actual problem.
David Portas
SQL Server MVP
--

Doug,
If you are subscribed to SQL Server Magazine, there is a good article about this topic.
Bulk-Insert Options for ADO.NET
http://www.windowsitpro.com/Article...oad_7pv0.asp
ADO.NET 2.0 will include this feature.
http://www.windowsitpro.com/SQLServ...62.html
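As a rough sketch of how that ADO.NET 2.0 feature (the SqlBulkCopy class) can be used once it is available; the destination table and the DataTable of parsed lines here are placeholders, not anything from this thread.

' Sketch only, assuming ADO.NET 2.0's SqlBulkCopy class.
Imports System.Data
Imports System.Data.SqlClient

Module BulkCopySketch
    Sub BulkLoad(ByVal parsedLines As DataTable, ByVal connStr As String)
        Dim bulk As New SqlBulkCopy(connStr)
        bulk.DestinationTableName = "dbo.ImportRaw"   ' hypothetical staging table
        bulk.BulkCopyTimeout = 0                      ' no timeout for the large file
        bulk.WriteToServer(parsedLines)               ' streams all rows in one bulk operation
        bulk.Close()
    End Sub
End Module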
AMB
"Doug R" wrote:
> Thanks for the reply. Regretably I have no control over the form of the
> input data or I would be using XML. The data is being sent to us from an
> outside business.
> Are you familiar with ADO.NET for doing bulk inserts? I would be very
> interested in hearing the Pros and Cons of Stored Proc vs ADO.NET.
> An alternative would possibly be something that parses the import data fil
e
> into XML before the insert. Any ideas on how this could be done efficient
ly?
>
> "Alejandro Mesa" wrote:
>