Monday, 9 May 2011

Update / Insert ( UPSERT ) MySQL Table From SQL Server Table - SSIS

There are lots of links which described how to UPSERT( Update / Insert ) between two SQL Server tables. which mostly used Lookup

Transform with OLEDB Command, like this:
http://www.insideit.fr/post/2010/06/17/Lookup-simple-sur-SSIS

But when you work with MySQL there is a limitation, you can not use OLEDB Connections, you can use ODBC or ADO.NET Connectors from

.NET to MySQL Database, as described briefly here:
http://blogs.msdn.com/b/mattm/archive/2008/03/03/connecting-to-mysql-from-ssis.aspx

So, How you can UPSERT with MySQL Tables? You will face problem with Lookup transform because you have no OLEDB Connection to mysql,

and you can not use OLEDB Command on the other side.

This is a solution for Upsert ( Update/Insert) with MySQL table:
(Note that in this solution Cache connection manager used which is only available in SSIS 2008)

 Assume Structure of SQL Server table is :

CREATE TABLE [dbo].[SourceSqlServerTable](
    [ID] [int] NOT NULL,
    [FirstName] [varchar](255) NULL,
    [LastName] [varchar](255) NULL)


and Structure of MySQL table is:

CREATE TABLE mysqltable (
  ID int(11) DEFAULT NULL,
  FirstName varchar(255) DEFAULT NULL,
  LastName varchar(255) DEFAULT NULL
)


Data in SQL Server Table:


Data in MySQL Table:


Solution:

There are three steps as follows,
First Step:

Add a data flow task to control flow, name it as "Fill Cache Component"
Add an ADO.NET Source which pointing to the MySQL Table.
For finding how to create connection from SSIS to MySQL look at this article:
http://blogs.msdn.com/b/mattm/archive/2008/03/03/connecting-to-mysql-from-ssis.aspx
 

Then add a Cache Transform and connect data path ( green arrow ) from ado.net source to this cache transform.

Double click on connection manager,
Click on New to create new cache connection.
In cache connection manager editor, check the "Use file cache" checkbox.
And browse a path to save cache file there,


In the columns tab, set PK of mysql table as index ( set index position as 1 ),then OK.


Map columns.



The first step fills Cache file with data from MySQL Table ( Which is lookup table ).

Second Step:

Go back to control flow.
Create new Variable of OBJECT data type in package scope and name it as UpdatedRows.
Add another data flow task, name this one as "Lookup".
Connect precedence constraint ( green arrow ) from first data flow task to second data flow task.
Then double click on second data flow task.
Add an OLEDB Source ( setup a new oledb connection to sql server database ) and point it to SQL Server table.


Then add a Lookup transform, connect green arrow from oledb source to lookup.
Then double click on lookup, and in lookup transformation editor,
Set Connection Type as "Cache Connection manager".
And set "specify how to handle rows with no match entries" with "redirect no match rows to no match output".


Then in connection tab, select cache connection manager from the drop down list.


And in columns tab, map joining fields


Add an ado.net Destination pointing to the MySQL database, and map columns.


This will do INSERT part of UPSERT.
Add a RecordSet Destination, and double click on it, in Component properties tab, set VariableName with User::UpdatedRows.



In the second step you INSERT new rows in MySQL table, and fills UpdatedRows to an object datatype variable.
This is whole schema of this second Data flow task:



Third Step:

Go back to control flow,
Create these Variables:

Name        Scope        DataType      Value
----------------------------------------------
ID          Package      Int32         0
FirstName   Package      String
LastName    Package      String


Add a Foreach Loop Container, and connect precedence constraint from "lookup" data flow to this container.
In Foreach Loop Editor, Set Enumerator as ADO Enumerator.
And set Ado object source variable with User::UpdatedRows.


Then in Variable mappings tab, do these mappings:



Then add an execute sql task inside foreach loop container,
And set connection to MySQL database there,
Write update command in SQLStatement property,as below:
Update MySQLTable
Set FirstName=@FirstName
,LastName=@LastName
where ID=@ID
 


Then in Parameter mappings tab, do these mappings:



That's all, this is the whole schema of package:



Now run the package, and result in MySQL table after Upsert will be:


The End.

1 comment:

  1. Thank you so much for providing information about SSIS functions such as Update,Insert and Delete.

    SSIS Upsert

    ReplyDelete