Azure Data Factory Publishing Error

I am using Azure Data Factory v2. It is bound to an Azure DevOps Git repository. I made some changes in my branch, including deleting some obsolete items. Then I merged my changes back to the ‘master’ branch.

Now, I’m trying to publish my changes from the ‘master’ branch to the Azure Data Factory. When I do, it says it’s deployed 33 of 33 changes, and then it fails, saying:

“Publishing Error

The document cannot be deleted since it is referenced by <some Pipeline>.”

I searched high and low for any evidence that the offending Pipeline still existed anywhere in the Git repo. It did not.

Then, I discovered that you can switch from the Azure DevOps Git version of the Data Factory to the actual (published) Data Factory version by selecting the latter from the dropdown in the top left corner of the Data Factory editor.

This saved the day. I was able to locate and delete the offending Pipeline(s) directly from the actual Data Factory. Then, when I switched back to Azure DevOps Git mode, it allowed me to publish the changes from the ‘master’ branch. Woohoo!
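If you would rather script that cleanup than click through the portal, the AzureRM.DataFactoryV2 cmdlets can list and delete pipelines in the live (published) factory. This is just a minimal sketch; the resource group, factory, and pipeline names are placeholders for your own:

# List the pipelines that actually exist in the live Data Factory
Get-AzureRmDataFactoryV2Pipeline -ResourceGroupName "myResourceGroup" -DataFactoryName "myDataFactoryv2" | Select-Object Name

# Remove the orphaned pipeline that is blocking the publish
Remove-AzureRmDataFactoryV2Pipeline -ResourceGroupName "myResourceGroup" -DataFactoryName "myDataFactoryv2" -Name "TheOffendingPipeline" -Force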

Connecting to IBM DB2 zOS from Azure Data Factory v2

Connecting to IBM DB2 zOS from Azure Data Factory v1 was a matter of setting up the Azure Data Gateway on an on-prem server that had the IBM DB2 client installed and creating an ODBC connection to DB2 (I called it DB2Test). Then, in the Data Factory v1 Copy Wizard, select the ODBC source, pick the Gateway, and enter DSN=DB2Test as the Connection String. This worked for us.
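For reference, here is roughly what that v1 setup looks like when expressed as a linked service definition deployed with the AzureRM.DataFactories module. This is a sketch, not our actual configuration; the linked service name, gateway name, and credentials below are placeholders:

# Sketch of an ADF v1 on-premises ODBC linked service pointing at the DB2 DSN
$v1LinkedService = @'
{
    "name": "DB2Odbc_v1",
    "properties": {
        "type": "OnPremisesOdbc",
        "typeProperties": {
            "authenticationType": "Basic",
            "connectionString": "DSN=DB2Test",
            "userName": "myDb2User",
            "password": "myDb2Password",
            "gatewayName": "myDataGateway"
        }
    }
}
'@
$v1LinkedService | Out-File "DB2Odbc_v1.json" -Encoding utf8
New-AzureRmDataFactoryLinkedService -ResourceGroupName "myResourceGroup" -DataFactoryName "myDataFactoryV1" -File "DB2Odbc_v1.json"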

Azure Data Factory v2

First, the Azure Data Gateway is now called the “Self-Hosted Integration Runtime”.  So download and install the IR client on your on-prem gateway machine.  On my machine, it auto-configured to use the existing Data Factory Gateway configuration, which is NOT what I wanted.  After uninstalling and reinstalling the IR client a couple of times, it stopped auto-configuring and asked me for a key.  To get the key, I had our Azure Dev configuration guy run the following PowerShell:

# Requires the AzureRM modules (AzureRM.DataFactoryV2 in particular)
Import-Module AzureRM
$dataFactoryName = "myDataFactoryv2NoSSIS"
$resourceGroupName = "myResourceGroup"
$selfHostedIntegrationRuntimeName = "mySelfHostedIntegrationRuntime"
Login-AzureRmAccount

# Create (or update) the self-hosted Integration Runtime entry in the Data Factory
Set-AzureRmDataFactoryV2IntegrationRuntime -ResourceGroupName $resourceGroupName -DataFactoryName $dataFactoryName -Name $selfHostedIntegrationRuntimeName -Type SelfHosted -Description "selfhosted IR description"

# Retrieve the authentication key to paste into the on-prem IR configuration screen
Get-AzureRmDataFactoryV2IntegrationRuntimeKey -ResourceGroupName $resourceGroupName -DataFactoryName $dataFactoryName -Name $selfHostedIntegrationRuntimeName

I then pasted the Key into the Integration Runtime Configuration screen, and it connected properly to myDataFactoryv2NoSSIS.  Tada:

[Screenshot: the Integration Runtime configuration screen showing a successful connection to myDataFactoryv2NoSSIS]
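If you want to confirm the registration from PowerShell rather than from the configuration screen, the same module has a status switch for the IR; a quick check, using the same placeholder names as the script above:

# Show the self-hosted IR along with the status of its registered node(s)
Get-AzureRmDataFactoryV2IntegrationRuntime -ResourceGroupName $resourceGroupName -DataFactoryName $dataFactoryName -Name $selfHostedIntegrationRuntimeName -Status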

Next was to test the connection to DB2.  I went to the Diagnostics tab and entered the DSN and credentials, just like I did for Data Factory v1:

Failed to connect to the database. Error message: ERROR [HY010] [IBM][CLI Driver] CLI0125E Function sequence error. SQLSTATE=HY010

Dang! Much googling later, I found this obscure note.

I added “Autocommit=Off” after the DSN in the connection string, and voila, the connection worked.  So my final diagnostic looked like this:

[Screenshot: the Diagnostics tab showing a successful test with connection string DSN=DB2Test;Autocommit=Off]

YAY!!
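For completeness, here is a rough sketch of what that working connection could look like as an ADF v2 ODBC linked service deployed from PowerShell, rather than tested from the Diagnostics tab. The JSON shape follows the v2 ODBC connector format as I understand it, and the linked service name, credentials, and IR reference are placeholders; the important bits are the Autocommit=Off in the connection string and the connectVia reference to the self-hosted Integration Runtime:

# Sketch of an ADF v2 ODBC linked service that routes through the self-hosted IR
$v2LinkedService = @'
{
    "name": "DB2Odbc_v2",
    "properties": {
        "type": "Odbc",
        "typeProperties": {
            "connectionString": { "type": "SecureString", "value": "DSN=DB2Test;Autocommit=Off" },
            "authenticationType": "Basic",
            "userName": "myDb2User",
            "password": { "type": "SecureString", "value": "myDb2Password" }
        },
        "connectVia": {
            "referenceName": "mySelfHostedIntegrationRuntime",
            "type": "IntegrationRuntimeReference"
        }
    }
}
'@
$v2LinkedService | Out-File "DB2Odbc_v2.json" -Encoding utf8
Set-AzureRmDataFactoryV2LinkedService -ResourceGroupName $resourceGroupName -DataFactoryName $dataFactoryName -Name "DB2Odbc_v2" -DefinitionFile "DB2Odbc_v2.json"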