Intro to Automated Package and Template Generation with BIML
In order to simplify and automate the template creation and package creation process, we will create a Proof-of-concept (POC) that uses the Business Intelligence Markup Language (BIML, pronounced bim-ul). BIML provides a language with which to generate SSIS packages using highly configurable metadata.
BIML has been a free component of BIDS Helper for years. BIDS Helper is a well-established open-source product that has been part of the SSIS developer’s toolkit through many versions of SQL Server, up to and including SQL Server 2014. BIML is an XML-based language, with code embedded in the XML document to perform specific functions. It looks and feels a lot like working with standard ASP pages.
BIDS Helper is available for SQL Server 2008R2 through SQL Server 2014 and Visual Studio 2010 through Visual Studio 2013. For Visual Studio 2015, SQL Server 2014 and SQL Server 2016, a new tool, BIML Express is available. Like BIDS Helper, it is a Visual Studio add-in. However, BIML Express gives us the ability to edit BIML pages inside Visual Studio, with Intellisense and Syntax-Highlighting, something that was very awkward with BIDS Helper.
If your development shop can afford it, Mist is the way to go. If not, you will be using a combination of Visual Studio 2015 with BIML Express to edit/update BIML files, BIML Online to reverse engineer *.dtsx packages into BIML files, and Notepad++ (with the XML Plugin).
BIML works to automate template generation by using a combination of metadata (stored in SQL Server tables) and C# code, stored in the *.biml files themselves.
The basic structure of a package in BIML is as follows:
You can see the tag that declares the Package Name. Then it lists any package level . (Project level connections are automatically available.) Then it lists the that are declared in the Package. Then we see the Control Flow , which we have divided into three s. Within each container are the Control Flow for that container. In the “Main Control Flow , you would put the logic for this particular package, both Control Flow Tasks, and Data Flow Tasks can be defined in this container.
If I run the BIML script above, it will output a standard *.dtsx SSIS package, shown at right.
Given the static XML above, and the standard package at right, what if I wanted to Generate several packages, each with a different name, but conforming to all our standards? To accomplish this with BIML, we first introduce a table in the BIML schema that will hold package specifications, and then we modify the XML above to use parameters to generate the packages.
The table could look like this:
The change to the BIML file is to add some C# code to loop through the packages in the metadata table defined above, and then to add some BIML specific tags that evaluate the expressions. The tags are written inside the quotes for the attributes like Name and ProtectionLevel. Like so:
Why bother with the parameters? Why not just run the script, rename the package, run the script again and rename the second package, run the script again and rename the third package? (Sounds tedious, doesn’t it.) The reason we bother is so that we can generate a lot of package templates, and then regenerate the package templates when the ETL Framework is updated.
For example, to accommodate a new method for delivering notifications, say we modified the ETL Framework a little. Without the metadata for BIML to use, we could ignore this change and only apply it to template packages going forward, using the script and rename technique. Or we could manually edit each package in our solution to retrofit it so that the framework change can be used in the existing packages as well as new ones. Or we could regenerate all the packages with BIML, and then edit each to copy the Main Control Flow container from the existing package to the new one. This is, debatably, a better way of handling changes to the Framework, but it still incurs the tedium of copying the Main Control Flow container from the old to the new. Not ideal.
Perhaps we could generate all or most of the main package, and move the custom tasks into a separate, child package?
It turns out, we can.
Most packages do much the same thing. They fetch some data from some source and load it into a database into a staging table. Then they perform some transformations and load the staged data into some final tables. The transformations in the middle might be custom, but the rest is not. The Extract from the source to the staging tables can be configured with metadata and coded for in BIML. The Load from intermediary tables to the final tables can be configured too. The only bit that is data specific is the Transformation happening in the middle of the process. If we isolate that in its own package, we can fully automate the master package generation and automate the child template generation.
This is accomplished by capturing connection data for the different sources, mapping data from the source to staging table, and mapping data from the intermediate table to the final table.
A simple data model can accommodate the metadata we need. The Connection table needs to contain all the parameters for the different kinds of connections we might encounter, and BIML will identify which columns are needed for each connection type. Packages need to know which connections they use. The Extract into staging tables needs to know the field mappings from source to target. Likewise, the Load from the intermediary tables to the final tables needs to know its field mappings from source to target.
Continued here: Introducing BIML – Continued