SQL Saturday Charlotte 2017

I am excited to announce I am finally ready to present my advanced ETL Framework with Biml seminar at SQL Saturday Charlotte on Oct 14.  And I have a 10AM slot!!!   Wooo!

SQLSat683_Speaking_300x225

Implementing a SSIS Framework and enforcing SSIS Patterns (with Biml).

(Using Biml to Automate the Implementation of a SSIS Framework)

Let’s use Biml to automate the implementation of a standard SSIS Framework. I.e. Logging, error handling, etc. Business Intelligence Markup Language (Biml) is great at automating the creation of SSIS packages. With Biml we can generate a template package that implements a standard SSIS framework. In this fast-paced session, we will create the tables and load some metadata, write the Biml to implement logging and error handling, and generate some packages that implement our standard framework. In this session, you don’t need to know Biml, but some familiarity with XML, TSQL or C# will help. By the end of this session, you will know how to use Biml to automatically generate packages that implement a simple SSIS Framework with Audit Logging and Error Handling.

BMI KeyNotes – Toastmasters Speech on Biml

As practice for delivering a 50-minute seminar to my fellow ETL Developers at BMI, I presented Speech 5 from the Toastmasters Technical Presentations manual.  It was a whopping 12-15 minute speech and I didn’t go over time!!!   Yay!!!  (I normally blow through the red card and end up a minute or two over my time before the Toastmaster yanks off the podium.)

I think the speech went well.  I only got off script a couple of times.  (But at least I didn’t read it verbatim, which I too often do.)  I lost some people during the demonstration portion of the speech, but that is where I cut a 40-minute demonstration down to 4 minutes, so I knew it was going to be too fast and too technical.  I do hope that most of my non-technical audience at least got a glimpse into the world of an ETL developer, even if they didn’t understand it fully.

Please use the comments section below to ask questions and provide feedback.  I welcome comments, questions, and constructive criticism about the speech and the content.  Thank you.

Below are links to the PowerPoint slide decks I used, the speech I delivered and the support files for the demonstration I gave.

How to move a ton of data from the Mainframe to the Cloud – PowerPoint

How to move a ton of data from the Mainframe to the Cloud – Speech

Using Biml to Automate the Generation of SSIS Packages – PowerPoint

Intro to Biml Speech – Support Files

SQL Saturday Atlanta!

I am excited to announce that I will be giving my BIML seminar in Atlanta in July!!! Here’s the abstract for my session:

SQL Saturday 652

How to move a ton of data from the Mainframe to the Cloud (with Biml).

So, you need to move data from 75 tables on the mainframe to new tables in SQL Azure. Do you: a) hand code one package to load all 75 tables, b) hand code 75 packages that move a table each, or c) wish there was a better way?
There is! Business Intelligence Markup Language (Biml) can automate the creation of packages so that they all follow the same script. In this session, we will create some simple metadata to be able to generate multiple packages and their associated connection managers. You will see Biml in action. You will see the XML that Biml uses to define packages and connections. You will see the C# code that Biml uses to fetch metadata and dynamically generate packages in SSIS. And you will see packages and connection managers generated from Biml before your very eyes.

SQL Saturday!!!

Yay!!! I’ll be speaking about Biml at SQL Saturday in Chattanooga in June!
I’m very excited. In typical fashion, I will probably cram too much into the hour, but I’ll try not to. Here’s the abstract for my session.

SQL Saturday

Using Biml to Automate the Generation of SSIS Packages

So, you need to move data from 75 tables on the mainframe to new tables in SQL Azure. Do you: a) hand code one package to load all 75 tables, b) hand code 75 packages that move a table each, or c) wish there was a better way?
There is! Business Intelligence Markup Language (Biml) can automate the creation of packages so that they all follow the same script. In this session, we will create some simple metadata to be able to generate multiple packages and their associated connection managers. You will see Biml in action. You will see the XML that Biml uses to define packages and connections. You will see the C# code that Biml uses to fetch metadata and dynamically generate packages in SSIS. And you will see packages and connection managers generated from Biml before your very eyes.

Biml Tutorial Part 3 – Adding Connection Managers to our Package Generator

Now that we can generate simple empty packages, the next step is to add connection managers to the project and package.  After that, we will add a DataFlow.  This article will focus on the Connections though, and we will start with a pair of standard OleDb Connections.  Recall the metadata tables from the last article:

Setting up the Source and Target Databases

To insert the connection information, we’ll need to add a source, a target, the type (which is Ole Db) and then we’ll need to hook them up to the Package.
For the source, I’ll use the AdventureWorks2014 Database for SQL Server 2014.  Any DB will do though.
For the target, I’ve created an empty database called MySqlDatabase and added the Person table (from AdventureWorks2014) with the following SQL:

CREATE DATABASE MySqlDatabase;
GO
USE MySqlDatabase;
GO
CREATE TABLE dbo.[Person](
	[BusinessEntityID] [int] NOT NULL CONSTRAINT [PK_Person_BusinessEntityID] PRIMARY KEY CLUSTERED,
	[PersonType] [nchar](2) NOT NULL,
	[NameStyle] bit NOT NULL CONSTRAINT [DF_Person_NameStyle] DEFAULT 0,
	[Title] [nvarchar](8) NULL,
	[FirstName] [nvarchar](50) NOT NULL,
	[MiddleName] [nvarchar](50) NULL,
	[LastName] [nvarchar](50) NOT NULL,
	[Suffix] [nvarchar](10) NULL,
	[EmailPromotion] [int] NOT NULL CONSTRAINT [DF_Person_EmailPromotion] DEFAULT 0,
	[AdditionalContactInfo] [xml] NULL,
	[Demographics] [xml] NULL,
	[rowguid] [uniqueidentifier] ROWGUIDCOL  NOT NULL CONSTRAINT [DF_Person_rowguid] DEFAULT (newid()),
	[ModifiedDate] [datetime] NOT NULL CONSTRAINT [DF_Person_ModifiedDate] DEFAULT (getdate()))
GO

We need to fashion an OleDb Connection String for each database, like so:

Source:  Data Source=.;Initial Catalog=AdventureWorks2014;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;
Target:  Data Source=.;Initial Catalog=MySqlDatabase;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;

Then we can insert the connection records, starting by adding a Connection Type of OleDb.

INSERT INTO Biml.ConnectionType ([ConnectionTypeName]) VALUES ('OleDb');
INSERT INTO Biml.Connection ([ConnectionName], [CreateInProject], [ConnectionTypeId],
    [OLEDB_ConnectionString],
    [OLEDB_DatabaseName], [ConnectionGuid])
VALUES ('My First Source', 1, (SELECT ConnectionTypeId FROM ConnectionType WHERE ConnectionTypeName = 'OleDb'),
    'Data Source=.;Initial Catalog=AdventureWorks2014;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;',
    'AdventureWorks2014', NEWID());
INSERT INTO Biml.Connection ([ConnectionName], [CreateInProject], [ConnectionTypeId],
    [OLEDB_ConnectionString],
    [OLEDB_DatabaseName], [ConnectionGuid])
VALUES ('My First Target', 1, (SELECT ConnectionTypeId FROM ConnectionType WHERE ConnectionTypeName = 'OleDb'),
    'Data Source=.;Initial Catalog=MySqlDatabase;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;',
    'MySqlDatabase', NEWID());

Now that we have the connection strings in the database, we need to hook them up to the package.  In the PackageConnection table, we also need to provide the TableSchema and TableName or SQL Statement.  In this case, we’ll pull from Person.Person.  Here are the SQL statements to insert the required rows into the PackageConnection table:

INSERT INTO Biml.PackageConnection (PackageId, ConnectionId,
    IsSource, IsTarget, TableSchema, TableName, DirectInputSql)
VALUES ((SELECT PackageId FROM Biml.Package WHERE PackageName = 'My First Biml Package'),
    (SELECT c.ConnectionId FROM Biml.Connection c WHERE c.ConnectionName = 'My First Source'),
    1,0,'Person','Person',NULL);
INSERT INTO Biml.PackageConnection (PackageId, ConnectionId,
    IsSource, IsTarget, TableSchema, TableName, DirectInputSql)
VALUES ((SELECT PackageId FROM Biml.Package WHERE PackageName = 'My First Biml Package'),
    (SELECT c.ConnectionId FROM Biml.Connection c WHERE c.ConnectionName = 'My First Target'),
    0,1,'dbo','Person',NULL);

Configuring the Biml file for Connections

Because we are using the OLEDB drivers, and have tables in the source and target that have the same structure, we do not need to map the fields.  Biml and SSIS will automatically map those.
So, we have populated the metadata.  Now, we need to modify the Biml so that it includes the Connection information.  In the structure of a Biml file, the connections are fully described in the root

<Connections></Connections>

section, and are then referenced inside the package

<Connections></Connections>

section.  Here’s what this would look like if we hard-coded the connections in the Biml file:

<Biml xmlns="http://schemas.varigence.com/biml.xsd">

	<!--Root level Connections-->
    <Connections>
        <Connection Name="My First Source"
    		ConnectionString="Provider=SQLNCLI11;Server=.;Database=AdventureWorks2014;Trusted_Connection=yes;"
			CreateInProject="true"
			CreatePackageConfiguration="false" RetainSameConnection="true" />
        <Connection Name="My First Target"
    		ConnectionString="Provider=SQLNCLI11;Server=.;Database=MySqlDatabase;Trusted_Connection=yes;"
			CreateInProject="true"
			CreatePackageConfiguration="false" RetainSameConnection="true" />
    </Connections>

    <Packages>
        <Package Name="My First Biml Package" ConstraintMode="Parallel"
			ProtectionLevel="EncryptSensitiveWithUserKey">
			
			<!--Package level Connection references-->
            <Connections>
                <Connection ConnectionName="My First Source" />
                <Connection ConnectionName="My First Target" />
             </Connections>
             <Tasks>
                 <Container Name="SEQC - Main Control Flow" ConstraintMode="Linear">
                     <Variables></Variables>
                     <Tasks>
                         <Expression Name="Placeholder" Expression="1" />
                     </Tasks>
                 </Container>
             </Tasks>
        </Package>
    </Packages>
</Biml>

Now that we’ve looked at what the Biml code would look like if we hard coded everything, let’s get back to the version of the code we had at the end of the last article:

<#@ template language="C#" hostspecific="true" tier="0"#>
<#@ import namespace="System" #>
<#@ import namespace="System.Data" #>

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <# var bimlConfigConnection = (AstDbConnectionNode)RootNode.Connections["MainBimlConnection"]; #>
    <Packages>
        <# var packagesToCreate = ExternalDataAccess.GetDataTable(bimlConfigConnection.ConnectionString, 
        "SELECT PackageId, PackageName, ConstraintMode, ProtectionLevel, GeneratePackage FROM Biml.Package p "
        + " WHERE GeneratePackage = 1 ORDER BY p.PackageName"); 
        foreach(DataRow package in packagesToCreate.Rows) { 
        #>
            <!-- Build Package Here -->
            <Package Name="<#=package["PackageName"] #>" ConstraintMode="<#=package["ConstraintMode"] #>" 
                     ProtectionLevel="<#=package["ProtectionLevel"] #>">
                <Tasks>
    		    <Container Name="SEQC - Main Control Flow" ConstraintMode="Linear">
                        <Variables></Variables>
            	        <Tasks>
            	            <Expression Name="Placeholder" Expression="1" />
                        </Tasks>
                    </Container> 
                </Tasks>
            </Package>
        <# } #>
    </Packages>
</Biml>

Adding the Dynamic Connections to the Package

Root Level Connections

With this as the starting point, we’ll need to add a SQL Statement and a foreach() loop to pull in the connection info at the root level, like so:

<#@ template language="C#" hostspecific="true" tier="0"#>
<#@ import namespace="System" #>
<#@ import namespace="System.Data" #>

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <# var bimlConfigConnection = (AstDbConnectionNode)RootNode.Connections["MainBimlConnection"]; #>
    <Connections>
	<# var connectionsToCreate = ExternalDataAccess.GetDataTable(bimlConfigConnection.ConnectionString, 
        "SELECT ConnectionId, ConnectionName, OLEDB_ConnectionString, CreateInProject FROM Biml.Connection Where ConnectionTypeId = 1 order by ConnectionName"); 
        foreach(DataRow connection in connectionsToCreate.Rows) { #>
           <Connection Name="<#=connection["ConnectionName"] #>" ConnectionString="<#=connection["OLEDB_ConnectionString"] #>" 
		   CreateInProject="<#=connection["CreateInProject"] #>" CreatePackageConfiguration="false" 
		   RetainSameConnection="true" />
        <# } #>
    </Connections> 

Package Level Connection References

We also a need the SQL Statement and foreach() loop to pull in the connection info at the package level. Here we’ll use the PackageId from the Package loop as a criterion in the WHERE clause:

<Package Name="<#=package["PackageName"] #>" ConstraintMode="<#=package["ConstraintMode"] #>" 
    ProtectionLevel="<#=package["ProtectionLevel"] #>">
    <Connections>
        <# connectionsToCreate = ExternalDataAccess.GetDataTable(bimlConfigConnection,
            "SELECT mbc.ConnectionName, CreateInProject, ConnectionGuid "
            + " FROM Biml.Connection mbc"
            + " INNER JOIN Biml.PackageConnection mbpc ON mbc.ConnectionId = mbpc.ConnectionId"
            + " INNER JOIN Biml.Package mbp ON mbpc.PackageId = mbp.PackageId"
            + " WHERE mbp.GeneratePackage = 1 and mbpc.PackageId = " + package["PackageId"].ToString()
            + " ORDER BY mbc.ConnectionTypeId;");
        foreach(DataRow connection in connectionsToCreate.Rows) { 
        #>
            <Connection ConnectionName="<#=connection["ConnectionName"] #>" />
        <# } #>
    </Connections>

Gotcha! #1

Now, here’s a nice little Gotcha! In Sql Server, in the Biml.Connection table, we store the CreateInProject flag as a bit. Biml rightly converts this to a bool, with values of True or False. However, the CreateInProject attribute requires values of true or false, with lowercase first letters. In order to correct this behavior, we need to add a function that can convert the bool values into a string that is either “true” or “false”. This is accomplished by adding a new Biml file, let’s call it -code.biml, and then creating a function in that file:

<#+ 
string FixBool(object inBool)
{
	
    if (inBool == "False" || inBool == "0")
    {
		return "false";
    }
    else
    {
        return "true";
    }
}
#>

Notice the symbols at the top and bottom of the function. These allow Biml to use this function as an include file. In the main biml file, we need to add a line:

<#@ include file="-code.biml" #>

And now we can wrap the CreateInProject flag inside the FixBool() function, like so:

   foreach(DataRow connection in connectionsToCreate.Rows) { #>
		<Connection Name="<#=connection["ConnectionName"] #>" ConnectionString="<#=connection["OLEDB_ConnectionString"] #>" 
		CreateInProject="<#=FixBool((bool)connection["CreateInProject"]) #>" 
		CreatePackageConfiguration="false" RetainSameConnection="true" />
	<# } #>
</Connections>

Testing it all: Generating the package.

Let’s test it all.

  1. Shift-click to select the MainBimlConnection.biml file and the My Second Package.biml file.
  2. Right-click and Check Biml for Errors. Fix any errors.
  3. Right-click and Generate SSIS Packages.

If everything went well, you should now have two new Package Level connection managers and a new SSIS Package.
So, it looks like everything worked.

Gotcha! #2

There is one last Gotcha! though. This one has to do with the Guids of the Connection Managers. If three different developers generate the packages you’ve defined here, they will get three different Guids for each connection manager.
To fix this, we need everyone to use connection managers with the same Guid.
Which Guid should it be? It should be the one in the Connection table.
How do I change it? That field is grayed out. Open the connection manager in Code view instead of Design view and you can paste in the Guid that was used in the Connection table.
Changing the Guid for a Connection Manager
Once you’ve changed the connection manager’s Guid to match the one in the Connection table, we need to tell Biml to use that Guid when it’s generating the package, like so:

        foreach(DataRow connection in connectionsToCreate.Rows) { 
        #>
            <Connection ConnectionName="<#=connection["ConnectionName"] #>" Id="<#=connection["ConnectionGuid"] #>" />
        <# } #>
    </Connections>

Then regenerate the packages. Don’t let it regenerate the connection managers though. You can prevent this by unchecking the connection managers in the confirmation box:
Biml Confirmation Dialog
As you increase the number of packages, you will get a lot more connection managers generated. The easiest way I have found to not let it regenerate them, is to check the ones you want to keep into Source Code Management (TFS, Git, etc.), then let Biml generate new connection managers and then Undo the Pending changes, so that you revert to the saved and checked-in version of the connection managers.

In Conclusion

Putting it all together, we get this Biml file:

<#@ template language="C#" hostspecific="true" tier="0"#>
<#@ import namespace="System" #>
<#@ import namespace="System.Data" #>
<#@ include file="-code.biml" #>

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <# var bimlConfigConnection = (AstDbConnectionNode)RootNode.Connections["MainBimlConnection"]; #>
    
	<Connections>
	    <# var connectionsToCreate = ExternalDataAccess.GetDataTable(bimlConfigConnection.ConnectionString, 
        "SELECT ConnectionId, ConnectionName, OLEDB_ConnectionString, CreateInProject FROM Biml.Connection Where ConnectionTypeId = 1 order by ConnectionName"); 
       foreach(DataRow connection in connectionsToCreate.Rows) { #>
            <Connection Name="<#=connection["ConnectionName"] #>" ConnectionString="<#=connection["OLEDB_ConnectionString"] #>" 
            CreateInProject="<#=FixBool((bool)connection["CreateInProject"]) #>" 
            CreatePackageConfiguration="false" RetainSameConnection="true" />
		<# } #>
    </Connections>    
    
    <Packages>
        
        <# var packagesToCreate = ExternalDataAccess.GetDataTable(bimlConfigConnection.ConnectionString, 
        "SELECT PackageId, PackageName, ConstraintMode, ProtectionLevel, GeneratePackage FROM Biml.Package p "
        + " WHERE GeneratePackage = 1 ORDER BY p.PackageName"); 
       foreach(DataRow package in packagesToCreate.Rows) { 
            #>
            
        	<!-- Build Package Here -->
        	<Package Name="<#=package["PackageName"] #>" ConstraintMode="<#=package["ConstraintMode"] #>" 
        	        ProtectionLevel="<#=package["ProtectionLevel"] #>">
        	        
    	        <Connections>
            		<# connectionsToCreate = ExternalDataAccess.GetDataTable(bimlConfigConnection,
            		   "SELECT mbc.ConnectionName, CreateInProject, ConnectionGuid "
            			+ " FROM Biml.Connection mbc"
            			+ " INNER JOIN Biml.PackageConnection mbpc ON mbc.ConnectionId = mbpc.ConnectionId"
            			+ " INNER JOIN Biml.Package mbp ON mbpc.PackageId = mbp.PackageId"
            			+ " WHERE mbp.GeneratePackage = 1 and mbpc.PackageId = " + package["PackageId"].ToString()
            			+ " ORDER BY mbc.ConnectionTypeId;");
                   foreach(DataRow connection in connectionsToCreate.Rows) { 
            		   #>
                    <Connection ConnectionName="<#=connection["ConnectionName"] #>" Id="<#=connection["ConnectionGuid"] #>" />
            		<# } #>
                 </Connections>
        	        
                <Tasks>
            		<Container Name="SEQC - Main Control Flow" ConstraintMode="Linear">
            			<Variables></Variables>
            			<Tasks>
            				<Expression Name="Placeholder" Expression="1" />
            			</Tasks>
                    </Container> 
                </Tasks>
            </Package>
        <# } #>
    </Packages>
</Biml>

And it will generate this package (notice the Connection Managers added at the bottom):
Generated Package - After Adding Connections

endmark endmark endmark

Biml Tutorial 2 – Creating a Simple Package Generator

So I spent yesterday morning entering the metadata to describe some 70 or so packages that are all part of a big data load that moves data off the mainframe (DB2) into SQL Azure.  Occasionally, I would build the packages with BIML Express to ensure I hadn’t introduced bad metadata.  Today, I tweaked the BIML to allow for column level mappings between source and target tables.  Almost all of the tables are being staged into SQL Azure in the same format that they appear in DB2, so this was not necessary for those packages as BIML and SSIS will sort out the mappings automatically.

Before I get into the advanced material though, I need to finish the introduction that I started before the new job and move to Tennessee.

♦  ♦  ♦

 In the last post, I wrote about the metadata we need so that we can create ETL Packages.

We’re going to need at least the following tables:

  • Package
  • Connection
  • PackageConnection
  • ConnectionType

Additionally, Packages can have a Parent Package.

A Data Model Diagram might look like this:

With these tables, we can:

  • Define a Package to be Generated
  • Store the connection strings we need for our Development connections
  • Identify which source and target connections to use
  • Identify the Type of a Connection so we generate the right Source and Target components
  • Choose which packages to generate
  • Identify Parent-Child package relationships
  • Select whether a connection is at the Package or Project level

Defining a Package in MetaData

By populating the Package table, we can define a package to be generated.

Package
PackageId Int Identity, Primary Key, Not Null
PackageName Nvarchar(200) Not Null
ConstraintMode Nvarchar(10) Not Null, Default: Parallel
ProtectionLevel Nvarchar(50) Not Null, Default: EncryptSensitiveWithUserKey
ParentPackageId Int Nullable, Foreign Key to Package.PackageId
GeneratePackage Bit Not Null, Default: 1

Inserting a row into the table can be as simple as:

INSERT INTO Biml.Package (PackageName) VALUES ('My First Package');

With the table populated, we can now look at the Biml Code to Generate a Package from the MetaData defined so far.

To get started:

  1. Make sure you have Visual Studio 2015 and the latest SSDT, and that BIML Express is installed.
  2. Create a new Integration Services project named “My First Biml Project”
  3. Delete the Package1.dtsx that is created for you
  4. Right-click on the Project and select Add New Biml File
  5. Right-click again to add a second Biml file and rename the file to be: “My First Biml”

Now we have two blank Biml files, we need to setup a connection to the Biml MetaData tables.

In the Main Biml Connection file, enter this code:

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Connections>
        <Connection Name="MainBimlConnection" ConnectionString="Data Source=MySqlServerInstance;Initial Catalog=BimlMetaDataDB;Provider=SQLNCLI11.1;Integrated Security=SSPI;" />
    </Connections>
</Biml>

That’s it. This file is done and won’t need to be changed unless you move your BimlMetaData database to a new server.

In the second file, we’ll be getting a bit more code in.

We’ll start by declaring our code language as C# and importing a couple of namespaces:

<#@ template language="C#" hostspecific="true" tier="0" #>
<#@ import namespace="System" #>
<#@ import namespace="System.Data" #>

Next, comes the Root Biml tag:

<Biml xmlns="http://schemas.varigence.com/biml.xsd">

Now, we need to use the connection we defined in the other file:

<# var bimlConfigConnection = (AstDbConnectionNode)RootNode.Connections["MainBimlConnection"]; #>

Then we’ll start the Packages tag:

<Packages>

Now, we pull in the records from the Package table to define our package(s) and close the Packages tag:

<# var  packagesToCreate = ExternalDataAccess.GetDataTable(bimlConfigConnection.ConnectionString,    "SELECT PackageId, PackageName, ConstraintMode, ProtectionLevel, GeneratePackage FROM Biml.Package p WHERE GeneratePackage = 1 ORDER BY p.PackageName");     foreach(DataRow package in packagesToCreate.Rows) {         #>
        <!-- Build Package Here -->
    <# } #>
</Packages>

Where we put the comment, we need to put the code that builds the package itself, such as:

<Package Name="<#=package["PackageName"] #>" ConstraintMode="<#=package["ConstraintMode"] #>" ProtectionLevel="<#=package["ProtectionLevel"] #>">
    <Tasks>
        <Container Name="SEQC - Main Control Flow" ConstraintMode="Linear">
            <Variables></Variables>
            <Tasks>
                <Expression Name="Placeholder" Expression="1" />
             </Tasks>
         </Container>
    </Tasks>
</Package>

The final Biml file should look like this:

First Package Code 1

Now, to generate the package.  First, select both Biml files (shift-click).  Then right-click and choose “Check Biml For Errors”:

Check for Errors

If there are any errors, read the error and check the Biml against the sample.

If there are no errors, then right-click again and choose “Generate SSIS Packages”:

Generate Package

In a few seconds, the Biml engine should generate the package we defined:

Generated Package

That’s it for now.  I’ll be back on as soon as I can to add connections to our package.

endmark endmark endmark

Biml Tutorial 1 – Introducing BIML

Intro to Automated Package and Template Generation with BIML

In order to simplify and automate the template creation and package creation process, we will create a Proof-of-concept (POC) that uses the Business Intelligence Markup Language (BIML, pronounced bim-ul).  BIML provides a language with which to generate SSIS packages using highly configurable metadata.

Business Intelligence Markup Language (BIML)

BIML has been a free component of BIDS Helper for years.  BIDS Helper is a well-established open-source product that has been part of the SSIS developer’s toolkit through many versions of SQL Server, up to and including SQL Server 2014.  BIML is an XML-based language, with code embedded in the XML document to perform specific functions.  It looks and feels a lot like working with standard ASP pages.

example-of-biml

BIDS Helper is available for SQL Server 2008R2 through SQL Server 2014 and Visual Studio 2010 through Visual Studio 2013.  For Visual Studio 2015, SQL Server 2014 and SQL Server 2016, a new tool, BIML Express is available.  Like BIDS Helper, it is a Visual Studio add-in.  However, BIML Express gives us the ability to edit BIML pages inside Visual Studio, with Intellisense and Syntax-Highlighting, something that was very awkward with BIDS Helper.

The high-end editor for BIML is Varigence’s Mist product (available for $249 per developer per month).  Varigence also hosts a free online editor called BIML Online.

If your development shop can afford it, Mist is the way to go.  If not, you will be using a combination of Visual Studio 2015 with BIML Express to edit/update BIML files, BIML Online to reverse engineer *.dtsx packages into BIML files, and Notepad++ (with the XML Plugin).

SSIS Template Generation

BIML works to automate template generation by using a combination of metadata (stored in SQL Server tables) and C# code, stored in the *.biml files themselves.

The basic structure of a package in BIML is as follows:

sample-package

You can see the tag that declares the Package Name.  Then it lsample-generated-packageists any package level .  (Project level connections are automatically available.)   Then it lists the that are declared in the Package.  Then we see the Control Flow , which we have divided into three s.  Within each container are the Control Flow for that container.  In the “Main Control Flow , you would put the logic for this particular package, both Control Flow Tasks, and Data Flow Tasks can be defined in this container.

If I run the BIML script above, it will output a standard *.dtsx SSIS package, shown at right.

Given the static XML above, and the standard package at right, what if I wanted to Generate several packages, each with a different name, but conforming to all our standards?  To accomplish this with BIML, we first introduce a table in the BIML schema that will hold package specifications, and then we modify the XML above to use parameters to generate the packages.

The table could look like this:

biml-package

The change to the BIML file is to add some C# code to loop through the packages in the metadata table defined above, and then to add some BIML specific tags that evaluate the expressions.  The tags are written inside the quotes for the attributes like Name and ProtectionLevel.  Like so:

sample-package-bimlized

Why Bother?

Why bother with the parameters?  Why not just run the script, rename the package, run the script again and rename the second package, run the script again and rename the third package?  (Sounds tedious, doesn’t it.)  The reason we bother is so that we can generate a lot of package templates, and then regenerate the package templates when the ETL Framework is updated.

For example, to accommodate a new method for delivering notifications, say we modified the ETL Framework a little.  Without the metadata for BIML to use, we could ignore this change and only apply it to template packages going forward, using the script and rename technique.  Or we could manually edit each package in our solution to retrofit it so that the framework change can be used in the existing packages as well as new ones.  Or we could regenerate all the packages with BIML, and then edit each to copy the Main Control Flow container from the existing package to the new one.  This is, debatably, a better way of handling changes to the Framework, but it still incurs the tedium of copying the Main Control Flow container from the old to the new.  Not ideal.

Perhaps we could generate all or most of the main package, and move the custom tasks into a separate, child package?

It turns out, we can.

SSIS Package / Template Generation

Most packages do much the same thing.  They fetch some data from some source and load it into a database into a staging table.  Then they perform some transformations and load the staged data into some final tables.  The transformations in the middle might be custom, but the rest is not.  The Extract from the source to the staging tables can be configured with metadata and coded for in BIML.  The Load from intermediary tables to the final tables can be configured too.  The only bit that is data specific is the Transformation happening in the middle of the process.  If we isolate that in its own package, we can fully automate the master package generation and automate the child template generation.

This is accomplished by capturing connection data for the different sources, mapping data from the source to staging table, and mapping data from the intermediate table to the final table.

A simple data model can accommodate the metadata we need.  The Connection table needs to contain all the parameters for the different kinds of connections we might encounter, and BIML will identify which columns are needed for each connection type.  Packages need to know which connections they use.  The Extract into staging tables needs to know the field mappings from source to target.  Likewise, the Load from the intermediary tables to the final tables needs to know its field mappings from source to target.

Continued here:  Introducing BIML – Continued