SSIS 2016: Retrieving the ErrorDescription and ErrorColumnName

Since SQL Server 2016 there is a ‘built-in’ function to retrieve the name of the column that caused the error. There is a flaw, though: when the error is raised by a database constraint, the ‘ErrorColumn’ input column stays empty. So I’ve added an ELSE branch for the case where it doesn’t receive a value:

How to set it up

1) Copy the script below

public override void Input0_ProcessInputRow(Input0Buffer Row)
{
    // Translate the numeric error code into a readable description
    Row.ErrorDescription = this.ComponentMetaData.GetErrorDescription(Row.ErrorCode);

    // IDTSComponentMetaData130 (new in SSIS 2016) exposes GetIdentificationStringByID
    IDTSComponentMetaData130 componentMetaData = this.ComponentMetaData as IDTSComponentMetaData130;

    if (Row.ErrorColumn > 0)
    {
        // Resolve the lineage ID of the failing column to its identification string
        Row.ErrorColumnName = componentMetaData.GetIdentificationStringByID(Row.ErrorColumn);
    }
    else
    {
        // ErrorColumn is 0 when the error comes from the database itself (e.g. a constraint violation)
        Row.ErrorColumnName = "Database constraint :)";
    }
}


2) Paste it into the script window and save it


3) Select both ‘ErrorCode’ and ‘ErrorColumn’ as input columns


4) Now add two new output columns as shown below

In my case both output columns are of data type string (DT_STR) with a size of 250.

5) Add a Derived Column to the Data Flow after the Script Component and use the following SSIS expression to replace the original output value of ‘ErrorColumnName’

REPLACE( REVERSE( SUBSTRING( REVERSE( ErrorColumnName ), 1, FINDSTRING( REVERSE(ErrorColumnName), "[", 1) - 1) ), "]", " ")

(The original output value also contains the component’s name and we’re just interested in the column name itself)
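For example, the raw ErrorColumnName value might look like ‘Lookup Customer.Inputs[Lookup Input].Columns[CustomerName]’ (a hypothetical value); the expression strips everything up to the last ‘[’ and removes the trailing ‘]’, leaving just ‘CustomerName’.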

 

There you go!
Many thanks to my colleague Gigi for the SSIS Expression 🙂

SSIS The package contains two objects with the duplicate name of output column “ErrorCode”

Grrr…this one can be annoying!
But when you know how to cope with it, it’s an easy fix.

The problem: The SSIS package is failing validation with a message to the effect of “The package contains two objects with the duplicate name of “output column “ErrorCode” (123)” and “output column “ErrorCode” (456)”.

The explanation: The Data Flow source (at least for OLE DB Sources) has an Error Output that by default contains all the columns of the source plus two additional columns: ErrorCode and ErrorColumn.

The solution: The source column could be renamed, but that is often not practical or possible; the better solution is to rename the Error Output column. With the data flow open:

1) Choose the Data Flow Source
2) Choose Properties
3) Click ‘Show Advanced Editor’
4) Navigate to ‘Input and Output Properties’
5) Open the Source Error Output
6) Scroll to the bottom of the list and rename the last ‘ErrorCode’ column.

SSIS advanced editor errorcode errorcolumn



Thanks to: http://bturnip.com/nerdalerts/archives/146








SSIS not working with temp tables

Received this fine error message during ETL?
The metadata could not be determined because statement ” in procedure ” uses a temp table.

If you are working with SSIS 2012, it uses the system stored procedure sp_describe_first_result_set to fetch the metadata of the result set, and that procedure does not support temporary tables. You can switch to other options like table variables and CTEs, which work fine.
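For completeness, the table-variable alternative would look roughly like this (a minimal sketch; the procedure, column and source table names are made up):

CREATE PROCEDURE [dbo].[usp_SSIS_DataSet_TableVar]  -- hypothetical procedure name
AS
BEGIN
    -- A table variable instead of a temp table, so sp_describe_first_result_set can read the metadata
    DECLARE @Result TABLE
    (
        [DateID] INT NOT NULL,
        [Sales]  DECIMAL(16, 4) NOT NULL
    );

    INSERT INTO @Result ([DateID], [Sales])
    SELECT [DateID], [Sales]
    FROM [dbo].[FactSales];  -- made-up source table

    SELECT [DateID], [Sales]
    FROM @Result;
END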
In my case, since I’m a bit stubborn, I looked for other solutions and found a few:



Solution 1:
One option is to put a SET FMTONLY ON block in front of your actual query; this sends a fake "first result set" with the correct column structure to SSIS. So you can do something like:

SET FMTONLY ON
-- Fake first result set: only the column names and data types matter here
SELECT 0 AS a, 1 AS b, 'test' AS c, GETDATE() AS d
SET FMTONLY OFF

-- The actual query against the temp table
SELECT a, b, c, d FROM ##TempTable

When SSIS runs sp_describe_first_result_set, it will return the metadata and column names of your FMTONLY command, and won’t complain about not being able to determine the metadata of your temp table because it won’t even try.



Solution 2:
The best solution, in my opinion, is to simply provide SSIS with the metadata up front, like this:

EXEC [dbo].[usp_SSIS_DataSet] @Date = '2012-01-01'
WITH RESULT SETS
(
(
[DateID] [int] NOT NULL,
[Sales] [decimal](16, 4) NOT NULL
)
)

NOTE: The round brackets are needed twice because the outer pair encloses the list of result sets (a procedure can return more than one), and each inner pair describes the columns of a single result set.
You can use this in your OLE DB Source component as a query (or in a variable).
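If the procedure returned more than one result set, each would get its own inner pair of brackets (the second result set below is purely hypothetical):

EXEC [dbo].[usp_SSIS_DataSet] @Date = '2012-01-01'
WITH RESULT SETS
(
    ( [DateID] [int] NOT NULL, [Sales] [decimal](16, 4) NOT NULL ),  -- first result set
    ( [CustomerID] [int] NOT NULL, [Amount] [money] NOT NULL )       -- hypothetical second result set
);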








How to create a view (with end-date) for an INSERT-ONLY SCD Type 2 table?

Well Albert Garcia Diaz wrote a great example for that.
He uses LEAD() combined with the OVER-clause to easily accomplish this:
“…LEAD provides access to a row at a given physical offset that follows the current row…”

More info about LEAD():
https://msdn.microsoft.com/en-us/library/hh213125.aspx


CREATE VIEW [hstg].[vw_Table_History]
AS
SELECT
    ETL_EXECUTION_DT AS START_DATE,
    ISNULL(LEAD(ETL_EXECUTION_DT) OVER (PARTITION BY BusinessKey ORDER BY ETL_EXECUTION_DT), '9999-12-31 00:00:00.0000000') AS END_DATE,
    [CURRENT_RECORD] = CASE
                           WHEN LEAD(ETL_EXECUTION_DT) OVER (PARTITION BY BusinessKey ORDER BY ETL_EXECUTION_DT) IS NULL THEN 'Y'
                           ELSE 'N'
                       END,

    [BusinessKey],
    [Field1],
    [Field2],

    ETL_BATCH_DT AS ETL_ROW_START_DT,
    S.ETL_BATCH_ROW_ID,
    CONVERT(DATETIME, '99991231') AS ETL_ROW_END_DT,
    ETL_ROW_HASH,
    CONVERT(NVARCHAR(512), 'AdvWorks') AS ETL_ROW_SOURCE,
    ETL_CDC_OPERATION,
    ETL_EXECUTION_DT
FROM [hstg].[Table] S
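A typical use of such a view is a point-in-time lookup. A hedged example (the date and columns simply follow the view above):

-- Which version of each business key was valid on 1 June 2016?
SELECT [BusinessKey], [Field1], [Field2], START_DATE, END_DATE
FROM [hstg].[vw_Table_History]
WHERE '2016-06-01' >= START_DATE
  AND '2016-06-01' <  END_DATE;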








SSIS BufferTempStoragePath: The buffer manager cannot write 8 bytes to file

Hi there!

This morning I received this nice little error during ETL:

"The buffer manager cannot write 8 bytes to file C:\....There was insufficient disk space or quota."

ssis-BufferTempStoragePath-buffer-manager-cannot-write-8-bytes-to-file-error

First thing Google told me was:

Check whether you are experiencing any memory problems or errors...


Okay…you’re right again Google, we do have a memory shortage (for heavy SSIS).
Then I found this blog post by BI Monkey explaining a few things:
http://www.bimonkey.com/2008/04/blobtempstoragepath-and-buffertempstoragepath/

Two innocuous properties on every Data Flow in SSIS. You’ve likely never noticed them. But they can easily kill your jobs if not set properly.

They tell SSIS where to write to on disk under given circumstances. BufferTempStoragePath is where on disk it will write buffer contents in the event that there is no available memory. BLOBTempStoragePath is where on disk it will write any BLOB data in the buffer.


One of the tips from MSSQLTips about this:
http://www.mssqltips.com/sqlservertip/1867/sql-server-integration-services-ssis-performance-best-practices/

Best Practice #8 – BufferTempStoragePath and BLOBTempStoragePath

If there is a lack of memory resources, i.e. Windows triggers a low memory notification event, memory overflow or memory pressure, the incoming records (except BLOBs) will be spooled to the file system by SSIS. The file system location is set by the BufferTempStoragePath property of the data flow task. By default its value is blank; in that case the location will be based on the value of the TEMP/TMP system variable.

Likewise SSIS may choose to write the BLOB data to the file system before sending it to the destination because BLOB data is typically large and cannot be stored in the SSIS buffer. Once again the file system location for the spooling BLOB data is set by the BLOBTempStoragePath property of the data flow task. By default its value is blank. In that case the location will be the value of the TEMP/TMP system variable. As I said, if you don’t specify the values for these properties, the values of TEMP and TMP system variables will be considered as locations for spooling. The same information is recorded in the log if you enable logging of the PipelineInitialization event of the data flow task as shown below.

User:PipelineInitialization,ARSHADALI-LAP,FAREAST\arali,Data Flow Task,{C80814F8-51A4-4149-8141-D840C9A81EE7},{D1496B27-9FC7-4760-821E-80285C33E74D},10/11/2009 1:38:10 AM,10/11/2009 1:38:10 AM,0,0x,No temporary BLOB data storage locations were provided. The buffer manager will consider the directories in the TEMP and TMP environment variables.

So far so good. What is important here is to change the default values of the BufferTempStoragePath/BLOBTempStoragePath properties and specify locations to which the user executing the package (or the SQL Server Agent service account, if the package is run from a SQL Server Agent job) has access. Preferably both locations should refer to separate fast drives (with separate spindles) to maximize I/O throughput and improve performance.


In the Dataflow properties:
ssis-BufferTempStoragePath-default-dataflow-properties


He advises setting up two package variables in every (template) SSIS package so these paths can be configured during deployment (which makes sense), or at least changing the default C-drive (which is in our case the smallest disk) to a different one.
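One way to do that (assuming a string variable such as User::BufferTempPath already exists in the package) is to open the Expressions collection of the Data Flow Task and bind the BufferTempStoragePath property to @[User::BufferTempPath]; BLOBTempStoragePath can be bound the same way.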

Conclusion
Put this one on your SQL Server / SSIS checklist!
Overwrite the default paths for BLOBTempStoragePath and BufferTempStoragePath if your C-drive is too small or too slow (SSD perhaps?).







SQL Server, SSIS and Biml data types

Cathrine Wilhelmsen wrote a blog post with a mapping table (cheat sheet) of the SQL Server (DB), SSIS and Biml data types
(you can sort the columns on Cathrine’s original post):


The table below shows a simplified mapping between SQL Server, SSIS (SQL Server Integration Services) and Biml (Business Intelligence Markup Language) data types. The current version of the table does not include all type systems, all possible mappings or all data types, but is instead meant as a quick reference while developing and learning Biml. It was mainly made as a cheat sheet for myself, but I hope other new Biml developers will find it useful.


biml-data-types








SSIS: Extract package information (e.g. DataFlowTaskName, TaskDescription, PrecedenceConstraint, etc.)

Found this great blog post from Saravanan:
http://sqlmall.wordpress.com/2014/05/29/extract-ssis-2012-package-data/

He describes a way to search through the XML definition of your SSIS packages, for example to look for a certain Data Flow called ‘RAW_DWH_Main‘.

Step 1) Enable FILESTREAM via SQL Server Configuration Manager

http://sqlserverscribbles.com/2013/07/09/how-to-enable-and-configure-filestream-in-sql-server-2008-2012/
http://msdn.microsoft.com/en-us/library/cc645923.aspx

Step 2) Create the FileTable (and a new database)
NOTE: Make sure to fill in an existing path for the directory.


EXEC sp_configure filestream_access_level, 2
RECONFIGURE
GO

CREATE DATABASE PackageDefinition
ON PRIMARY
    (NAME = PackageDefinition,
     FILENAME = 'E:\FileTableDB\PackageDefinition.mdf'),
FILEGROUP FTFG CONTAINS FILESTREAM
    (NAME = Package,
     FILENAME = 'E:\FileTableDB\PackageDefinition')
LOG ON
    (NAME = PackageDefinitionLog,
     FILENAME = 'E:\FileTableDB\PackageDefinitionLog.ldf')
WITH FILESTREAM (NON_TRANSACTED_ACCESS = FULL,
                 DIRECTORY_NAME = N'PackageDefinition');
GO

USE PackageDefinition;
GO

-- CREATE SCHEMA has to be the only statement in its batch
CREATE SCHEMA FT;
GO

CREATE TABLE FT.PackageDefinition AS FileTable
WITH (FileTable_Directory = 'PackageDirectory');
GO

Here’s your new FileTable:

Search-in-ssis-packages-with-sql-filetable-structure


Explore the directory and copy/paste your SSIS packages here

Search-in-ssis-packages-with-sql-filetable-explore-directory


Now copy (or perhaps move/relocate) your SSIS packages from your TFS workspace to this directory.

Almost done…run the query!

;WITH XMLNAMESPACES
(
'www.microsoft.com/SqlServer/Dts' AS DTS
),

PackageDefinition
AS
( /*Query the file content([file_stream]) from filetable*/
SELECT *
,FileTableRootPath() AS RootPath
,file_stream.GetFileNamespacePath() as FilePath
,path_locator.GetLevel() AS PathLevel
,PackageData = CAST(CAST([file_stream] AS VARBINARY(MAX)) AS XML )
FROM [FT].[PackageDefinition]
WHERE file_type = 'dtsx' --SSIS Packages
--AND Name LIKE '%RAW_DWH_Main%' --FileName
AND file_stream.GetFileNamespacePath() LIKE '\PackageDirectory\DWH.SSIS\%' --Project folder
AND path_locator.GetLevel() = 2

)

SELECT r.*
FROM
(
SELECT
RootPath
,FilePath
,PathLevel
,PackageName = Exe.pkg.value ('@DTS:ObjectName' ,'VARCHAR(250)')
,CreationDate = Exe.pkg.value ('@DTS:CreationDate' ,'DATETIME2(0)')
,CreatorName = Exe.pkg.value ('@DTS:CreatorName' ,'VARCHAR(250)')
,DataFlowTaskName = Task.pkg.value ('@DTS:ObjectName' ,'VARCHAR(250)')
,TaskDescription = Task.pkg.value ('@DTS:Description' ,'VARCHAR(250)')
,PrecedenceConstraint = '['+REPLACE((pc.pkg.value ('@DTS:From','VARCHAR(250)')),'Package\','')+'] ' +'---->' + '['+REPLACE((pc.pkg.value ('@DTS:To' ,'VARCHAR(250)')),'Package\','') +']'
,refId = Con.pkg.value ('@DTS:refId' ,'VARCHAR(250)')
,ConnectionProvider = Con.pkg.value ('@DTS:CreationName' ,'VARCHAR(250)')
,ConnectionManagerName = Con.pkg.value ('@DTS:ObjectName' ,'VARCHAR(250)')
,ConnectionString = CM.pkg.value ('@DTS:ConnectionString' ,'VARCHAR(250)')
,ParameterName = Par.pkg.value ('@DTS:ObjectName' ,'VARCHAR(250)')
,ParameterValue = Par.pkg.value ('.' ,'VARCHAR(250)')
,VariableName = Var.pkg.value ('@DTS:ObjectName' ,'VARCHAR(250)')
,VariableValue = Var.pkg.value ('.' ,'VARCHAR(250)')
,PackageData
FROM PackageDefinition
OUTER APPLY PackageData.nodes('/DTS:Executable') AS Exe(pkg)
OUTER APPLY Exe.pkg.nodes('DTS:Executables/DTS:Executable') AS Task(pkg)
OUTER APPLY Exe.pkg.nodes('DTS:PackageParameters/DTS:PackageParameter') AS Par(pkg) --Will only return data while using the 'Package Deployment'-model I guess.
OUTER APPLY Exe.pkg.nodes('DTS:Variables/DTS:Variable') AS Var(pkg)
OUTER APPLY Exe.pkg.nodes('DTS:ConnectionManagers/DTS:ConnectionManager/DTS:ObjectData/DTS:ConnectionManager') AS CM(pkg)
OUTER APPLY Exe.pkg.nodes('DTS:ConnectionManagers/DTS:ConnectionManager') AS Con(pkg)
OUTER APPLY Exe.pkg.nodes('DTS:PrecedenceConstraints/DTS:PrecedenceConstraint') AS pc(pkg)
) AS r
WHERE r.DataFlowTaskName LIKE '%MyDataFlow%'
ORDER BY FilePath



The results:

Search-in-ssis-packages-with-sql-filetable-xml-query-results


Important note:
The query doesn’t return tasks that are nested inside containers; as a crude workaround you can search the raw package XML with a LIKE-operator:
WHERE CAST(PackageData AS NVARCHAR(MAX)) LIKE '%%'
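Another hedged option (an untested sketch that reuses the FileTable from above) is to query all descendant Executable nodes with the XQuery '//' axis, which also reaches tasks nested inside Sequence and Loop containers:

;WITH XMLNAMESPACES ('www.microsoft.com/SqlServer/Dts' AS DTS)
SELECT
     file_stream.GetFileNamespacePath() AS FilePath
    ,TaskName = Exe.pkg.value('@DTS:ObjectName' ,'VARCHAR(250)')
    ,TaskType = Exe.pkg.value('@DTS:CreationName' ,'VARCHAR(250)')
FROM [FT].[PackageDefinition]
CROSS APPLY (SELECT CAST(CAST([file_stream] AS VARBINARY(MAX)) AS XML)) AS x(PackageData)
-- The '//' axis walks the whole XML tree instead of one fixed level
CROSS APPLY x.PackageData.nodes('//DTS:Executable') AS Exe(pkg)
WHERE file_type = 'dtsx'
AND Exe.pkg.value('@DTS:ObjectName' ,'VARCHAR(250)') LIKE '%MyDataFlow%'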

If you have a solution for this, please let me know or post the solution in the comments. Thanks!


More info about FileTables:
http://blog.tallan.com/2012/01/01/its-a-file-system-its-a-database-table-its-sql-server-denali-filetable/

How-filetables-work-more-info-sql-server