I receive a monthly XLSX file that needs to be imported into SQL Server using an SSIS package. Unfortunately, the sender does not follow a consistent naming convention for the filename or worksheets, and our recent migration to SQL Server 2012 has caused the package to fail, even when using an Excel Connection Manager. We have also tried sending them a template, but they refuse to follow it, and we have no leverage to force them to do so.
I have been attempting an update to the package which would use a Script Task to import each of the two Excel worksheets into its own System.Object variable, which I could then either query or loop through to import the data into the destination SQL Server tables.
So far, using the examples from Microsoft here, I have successfully read the Excel file path/name and both worksheet names into Object variables. However, this does not create an Object containing the actual datasets from either worksheet.
Based on examples here and elsewhere around the web, I've started a C# script that I believe will output the worksheet data into an Object variable, but I'm not very proficient with C# and am having difficulty debugging it without a complete example to copy from. This is my code so far:
using System;
using System.Data;
using System.Data.OleDb;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;

[Microsoft.SqlServer.Dts.Tasks.ScriptTask.SSISScriptTaskEntryPointAttribute]
public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
{
    public DataSet Main()
    {
        string fileName;
        string connectionString;

        fileName = Dts.Variables["ExcelFile"].Value.ToString();
        Console.WriteLine(fileName);

        connectionString = "Provider=Microsoft.ACE.OLEDB.12.0;" +
            "Data Source=" + fileName + ";Extended Properties=Excel 12.0 Xml";
        Console.WriteLine(connectionString);

        DataSet data = new DataSet();
        using (OleDbConnection con = new OleDbConnection(connectionString))
        {
            con.Open();
            OleDbDataAdapter adapter = new OleDbDataAdapter("SELECT * FROM [Sheet1$]", connectionString);
            adapter.Fill(data);
        }
        return data;
    }
}
The code builds successfully, but when I run the package I receive a nondescript error:
Error: 0x1 at Script Task: Exception has been thrown by the target of an invocation.
Task failed: Script Task
I do not get any output from either of my Console.WriteLine commands, so I believe the Script Task is failing immediately. I do have Delay Validation = True, though changing it did not make a difference. Do you see any obvious/novice errors in my script? I've worked with SQL and SSIS for a number of years, but my C#/VB/Java/etc. knowledge and experience is limited.
Also, if I'm overlooking a better method to accomplish this in SSIS (other than the Excel Connection, which does not work), please let me know.
UPDATE - 5/31/16: I had a little time to work on the project today, and made a bit of progress, I think. I've updated my Script Task to include the following:
DataSet data = new DataSet();
using (OleDbConnection con = new OleDbConnection(connectionString))
{
    con.Open();
    OleDbDataAdapter adapter = new OleDbDataAdapter(query, con);
    //OleDbDataAdapter adapter = new OleDbDataAdapter("SELECT * FROM [Indemnity Scores$]", con);
    adapter.Fill(data);
    Dts.Variables["ExcelDataTable_IndemnityScores"].Value = data;
}
The Script Task completes successfully now, so I added a Foreach Loop container afterward, set it to Foreach From Variable Enumerator, and selected ExcelDataTable_IndemnityScores as the collection.
However, now I'm having difficulty extracting data from this Object variable. It has (or at least should have) two columns, which I've set in the Variable Mappings, and I am using an Execute SQL task to insert the values into a table. Unfortunately, only a single blank value for each column is inserted.
So next I replaced the Execute SQL task with a simple Script Task to return the values of each variable. Unfortunately, instead of the value, it returns "Microsoft.SqlServer.Dts.Runtime.Variable". I presume this is a novice error on my part, but I've not found anything online explaining it yet.
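A likely explanation, offered as a sketch and not a confirmed diagnosis: the text "Microsoft.SqlServer.Dts.Runtime.Variable" is what the default ToString() produces when it is called on the Variable object itself instead of on its Value property. Separately, once the DataSet has been retrieved from the Object variable (in a Script Task that would presumably be a cast such as (DataSet)Dts.Variables["ExcelDataTable_IndemnityScores"].Value), its rows can be walked as shown below. The class name, method names, column names, and sample values are all hypothetical, and the in-memory DataSet only stands in for the worksheet data so the example runs without SSIS or Excel.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class DataSetRows
{
    // Returns every row of the first table as an array of strings,
    // converting DBNull cells to empty strings.
    public static List<string[]> ReadRows(DataSet data)
    {
        var rows = new List<string[]>();
        if (data == null || data.Tables.Count == 0)
        {
            return rows;
        }

        foreach (DataRow row in data.Tables[0].Rows)
        {
            var values = new string[row.ItemArray.Length];
            for (int i = 0; i < values.Length; i++)
            {
                object cell = row[i];
                values[i] = cell is DBNull ? "" : cell.ToString();
            }
            rows.Add(values);
        }
        return rows;
    }

    // Builds a small in-memory DataSet standing in for one worksheet.
    // Column names and values are made up for illustration.
    public static DataSet BuildSample()
    {
        var table = new DataTable("Sheet");
        table.Columns.Add("F1", typeof(string));
        table.Columns.Add("F2", typeof(string));
        table.Rows.Add("A100", "0.75");
        table.Rows.Add("B200", DBNull.Value);

        var data = new DataSet();
        data.Tables.Add(table);
        return data;
    }
}
```

Calling DataSetRows.ReadRows(DataSetRows.BuildSample()) yields two rows of two values each, with the DBNull cell returned as an empty string.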
UPDATE 6/14/2016: I finally completed the package and it ran successfully in production yesterday. I ended up using advice presented here, as well as examples found elsewhere. My general workflow required triple-nested Foreach Loops to get both worksheets imported from the source workbook(s). I only expect one workbook per month, but nothing has been 100% consistent with this task.
My outermost loop simply enumerates my import directory to find the file(s) downloaded by the FTP process. It contains two script tasks. The first one simply confirms the filename of the first spreadsheet downloaded by the FTP process. I used the Microsoft link above for my code, with only minor modification for my variable names.
The second task gets all worksheet names from the first spreadsheet, and was also built using the Microsoft link above. However, I exclude any worksheet names containing "#" to prevent the XML database from being assigned to my variable.
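The "#" filter described above could look roughly like this. It is a minimal sketch that assumes the names have already been read into a list of strings; the class and method names are hypothetical, and the sample names in the usage note are only illustrative of what a workbook might report.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WorksheetNames
{
    // Keeps only the names that look like real worksheets:
    // blank entries and any name containing '#' are dropped.
    public static List<string> KeepWorksheets(IEnumerable<string> tableNames)
    {
        return tableNames
            .Where(name => !string.IsNullOrWhiteSpace(name))
            .Where(name => !name.Contains("#"))
            .ToList();
    }
}
```

For example, passing the names 'Indemnity Scores$', Sheet2$, and Sheet2$_xlnm#_FilterDatabase would keep the first two and drop the third.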
The second loop (the first inner loop) enumerates each worksheet name collected in the first loop. It contains three script tasks, the first of which imports the data from the current worksheet into my object variable.
public void Main()
{
    try
    {
        string fileName;
        string connectionString;
        string worksheetName;
        string query;

        // The workbook path and worksheet name come from package variables
        fileName = Dts.Variables["ExcelFile"].Value.ToString();
        //MessageBox.Show("InsertWorksheetDataIntoObject - Filename: " + fileName);

        connectionString = String.Format("Provider=Microsoft.ACE.OLEDB.12.0;" +
            "Data Source={0};Extended Properties=Excel 12.0 Xml;", fileName);
        //MessageBox.Show("Connection: " + connectionString);

        worksheetName = Dts.Variables["ExcelTable"].Value.ToString();
        // Remove any single quotes around the worksheet name
        worksheetName = worksheetName.Replace("'", "");
        //MessageBox.Show("InsertWorksheetDataIntoObject - Worksheet: " + worksheetName);

        // Pass the name as a format argument so braces in it cannot break string.Format
        query = string.Format("SELECT * FROM [{0}]", worksheetName);
        //MessageBox.Show("Query: " + query);

        DataSet data = new DataSet();
        using (OleDbConnection con = new OleDbConnection(connectionString))
        {
            con.Open();
            OleDbDataAdapter adapter = new OleDbDataAdapter(query, con);
            adapter.Fill(data);
            // Hand the filled DataSet to the package through the Object variable
            Dts.Variables["ExcelDataTable"].Value = data;
        }
        Dts.TaskResult = (int)ScriptResults.Success;
    }
    catch (Exception ex)
    {
        // Surface the full exception text in the SSIS log instead of the generic invocation error
        Dts.Events.FireError(-1, "ErrorMessage", ex.ToString(), "", 0);
        Dts.TaskResult = (int)ScriptResults.Failure;
    }
    //return data;
}
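The query-building step in the script above can be pulled into a small helper, which makes it easier to test outside of SSIS. This is only a sketch: the class and method names are hypothetical, and rejecting square brackets in the name is an added precaution of mine, not something the original script does.

```csharp
using System;

public static class WorksheetQuery
{
    // Builds "SELECT * FROM [name]" from a raw worksheet name,
    // after removing single quotes and surrounding whitespace.
    public static string BuildSelect(string rawName)
    {
        if (rawName == null)
        {
            throw new ArgumentNullException("rawName");
        }

        string name = rawName.Replace("'", "").Trim();
        if (name.Length == 0)
        {
            throw new ArgumentException("Worksheet name is empty.", "rawName");
        }

        // A bracket inside the name would end the [ ] identifier early,
        // so this sketch refuses such names instead of guessing at an escape.
        if (name.Contains("[") || name.Contains("]"))
        {
            throw new ArgumentException("Worksheet name contains a square bracket.", "rawName");
        }

        return "SELECT * FROM [" + name + "]";
    }
}
```

With this helper, the raw name 'Indemnity Scores$' becomes the query SELECT * FROM [Indemnity Scores$].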
The second script task in this loop simply deletes any blank rows from Excel. I could have incorporated it into the script above, but I kept it separate so it stays portable for potential reuse elsewhere.
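A blank-row cleanup of this kind might look like the following. It is a sketch under the assumption that the rows are removed from the in-memory DataTable filled from the worksheet, not from the workbook file itself; the class and method names are hypothetical.

```csharp
using System;
using System.Data;
using System.Linq;

public static class BlankRows
{
    // Removes rows in which every cell is null, DBNull, or whitespace.
    // Returns the number of rows removed.
    public static int RemoveBlankRows(DataTable table)
    {
        int removed = 0;

        // Walk backwards so removing a row does not shift the rows still to be visited.
        for (int i = table.Rows.Count - 1; i >= 0; i--)
        {
            DataRow row = table.Rows[i];
            bool isBlank = row.ItemArray.All(
                cell => cell == null || cell is DBNull || string.IsNullOrWhiteSpace(cell.ToString()));

            if (isBlank)
            {
                table.Rows.RemoveAt(i);
                removed++;
            }
        }
        return removed;
    }
}
```

In a Script Task this would presumably be called on the first table of the DataSet stored in the Object variable, before the row-by-row loop runs.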
The third script task in this loop uses the worksheet name to set a variable which is used in the next loop to determine my destination table.
The third loop (the second inner loop) enumerates the rows in the object variable containing the data from the worksheet. It contains a single Execute SQL task which imports the data from the two source columns into the correct destination table, based on the variable value set from the worksheet name above. Since the column names have not always been consistent, this loop connects directly to my object variable, which eliminates the need to call the source columns by name. Instead, I just assign each one to a destination variable in the Foreach loop, and pass that data into my table row by row.
Thanks again for everyone's help and suggestions!