Monday, 9 May 2011

Looping through specific files - SSIS

As you may know, there is an enuerator in Foreach Loop Container which help you to loop through files, named File Foreach Enumerator. you can specify a location for searching through files and get files as enuerator in loop.
But, there are some times which you need to loop through specific files not all of files in the directory.
you can use MASK in file names and then just loop through match files. I want to talk about another common real world problem now.

Suppose you have a table which has file names inside, and you have a directory of physical files. now you want to loop through files in directory and check if the file is in table do task1 else do task2.

let's go on sample, through this sample you will find how to implement foreach loop container for file enumerator and use execute sql task with precedence constraint.

First of all our table structure in database is:
CREATE TABLE [dbo].[RawFiles](
    [ID] [bigint] IDENTITY(1,1) NOT NULL,
    [DateTime] [datetime] NOT NULL,
    [FileName] [nvarchar](max) NULL,
    [SourceFileBin] [varbinary](max) NULL,
    [FileType] [int] NULL,
    [Switch] [bigint] NULL,

    [ID] ASC

the [FileName] column has exact file name values inside.

now create a SSIS package,
Add two package scope variables:
FileName    datatype: string
Status        datatype: Int32
Add a Foreach Loop Container
double click on foreach container and Foreach Loop Editor window will open,
go to Collection tab, set Enumerator as Foreach File Enumerator
in Enumerator configuration, set Folder to the folder your physical files exists there.
in the Retrieve file name section select NAME ONLY , because we have only file names in table we want to lookup.

go to Variable Mappings tab,
select User::FileName in variable column, and set Index column as 0.

now Foreach loop container setting is finished, this will loop through files in the specified folder and in each iteration fills the FileName variable.

Add an Execute SQL Task inside the Foreach Loop container,
double click on it to open Execute SQL Task Editor,
connect it to your database with an OLE DB Connection Manager,
set ResultSet property to Single Row.
set SQLStatement as :

select count(*) as cnt from RawFiles where FileName=?

question mark in this statement means parameter,

now go to Parameter Mapping tab,
Add a line there,
set Variable Name as User::FileName ,
set Data Type as Nvarchar and parameter size as 500 , this is because filename field in table has Nvarchar data type,
set Parameter Name as 0 ,
Note that when you use OLE DB Connection managers, then you should name your parameters with 0,1,2,3 ...

this will fill the question mark parameter with the value from FileName variable.

now go to Result Set tab,
Add a line there,
set Result Name as cnt ,
set Variable Name as User::Status ,

this will fill the Status variable with result of query above, that means if there was a record with this filename in table then Status will be 1 or more, else will be 0 .

now , filling status made complete, the only step remains is that you should check value of Status and send it to appropriate task. suppose you want to do this:
status>0  -> message box which shows "exists!"
status<=0  -> message box which shows "NOT exists!"

so, Add two Script tasks after Execute sql task inside the foreach loop,
connect green arrow from execute sql task to these script task,
for the first script task, right click on green arrow, select Edit
Precedence Constraint Editor will open,
set Evaluationi Operation as EXPRESSION,
type this expression in EXPRESSION text box:

do same thing for another script task and set expression there as @Status<=0

write a MessageBox.Show("exists") in first script task, and MessageBox.Show("NOT exists") in second script task.

All done, just run the package.
this is whole schema:


No comments:

Post a Comment