How to automatically export/backup the MDS model

I noticed that I was making backups of the MDS model manually, so it’s time to automate that in a basic but effective way 🙂
When using CMD, it’s always tricky to get the correct timestamp (which I’ll use in the output filename). If found a nice script on: https://stackoverflow.com/questions/1064557/creating-a-file-name-as-a-timestamp-in-a-batch-job

Normally when you would like to import or export/backup a MDS model, you’ll use these scripts:

CREATE:
CD C:\Program Files\Microsoft SQL Server\130\Master Data Services\Configuration
MDSModelDeploy createpackage -service MDS1 -model "Biml meta-data" -version "VERSION_1" -package "E:\MDS\BimlMetaData_20170804.pkg" -includedata

DEPLOY:
CD C:\Program Files\Microsoft SQL Server\130\Master Data Services\Configuration
MDSModelDeploy deployclone -package "E:\MDS\BimlMetaData_20170804.pkg" -service MDS1

 

To automate this, create a BATCH-file (*.bat) in your local ‘MDS backup directory’ (in my case ‘E:\MDS\’) and copy/paste the following script into it:

:: THIS CODE WILL DISPLAY A 2-DIGIT TIMESTAMP FOR USE IN APPENDING FILENAMES

:: CREATE VARIABLE %TIMESTAMP%

for /f "tokens=1-8 delims=.:/-, " %%i in ('echo exit^|cmd /q /k"prompt $D $T"') do (
for /f "tokens=2-4 delims=/-,() skip=1" %%a in ('echo.^|date') do (
set dow=%%i
set mm=%%j
set dd=%%k
set yy=%%l
set hh=%%m
set min=%%n
set sec=%%o
set hsec=%%p
)
)

:: ensure that hour is always 2 digits

if %hh%==0 set hh=00
if %hh%==1 set hh=01
if %hh%==2 set hh=02
if %hh%==3 set hh=03
if %hh%==4 set hh=04
if %hh%==5 set hh=05
if %hh%==6 set hh=06
if %hh%==7 set hh=07
if %hh%==8 set hh=08
if %hh%==9 set hh=09

:: assign timeStamp:
:: Add the date and time parameters as necessary - " yy-mm-dd-dow-min-sec-hsec "

set timeStamp=%yy%%mm%%dd%_%hh%%min%%sec%

:: --------- TIME STAMP DIAGNOSTICS -------------------------

:: Un-comment these lines to test output

:: echo dayOfWeek = %dow%
:: echo year = %yy%
:: echo month = %mm%
:: echo day = %dd%
:: echo hour = %hh%
:: echo minute = %min%
:: echo second = %sec%
:: echo hundredthsSecond = %hsec%
:: echo.
:: echo Hello!
:: echo Today is %dow%, %mm%/%dd%.
:: echo.
:: echo Your timestamp will look like this: %timeStamp%
:: echo.
:: echo.
:: echo.
:: pause

CD C:\Program Files\Microsoft SQL Server\130\Master Data Services\Configuration

MDSModelDeploy createpackage -service MDS1 -model "Biml meta-data" -version "VERSION_1" -package "E:\MDS\BimlMetaData_%timeStamp%.pkg" -includedata

 

IMPORTANT: Make sure that you change the ‘MDS backup directory’ in the code above.

Now create a scheduled task (standard Windows functionality) that executes the BATCH-file and runs either on a specific time and/or system startup:



 



 



 

 


 


 


 


 


 


 


 


 


 


 


 


 

Other handy scripts:
PowerShell: Delete old files based on a given retention period (days) #maintenance

Use PowerShell to script all database objects

 


 


 

Advertisements

SSIS 2016: Retrieving the ErrorDescription and ErrorColumnName

Since SQL Server 2016 there is a ‘built-in’ function to retrieve the actual column name of the column which caused the error. But there’s also a flaw, in case of a database constraint the ‘ErrorColumn’ (input parameter) will remain empty. So I’ve added an ELSE-statement to it in case it doesn’t receive a value:

How to set it up

1) Copy the script below

public override void Input0_ProcessInputRow(Input0Buffer Row)
{
Row.ErrorDescription = this.ComponentMetaData.GetErrorDescription(Row.ErrorCode);

IDTSComponentMetaData130 componentMetaData = this.ComponentMetaData as IDTSComponentMetaData130;

if (Row.ErrorColumn > 0)
{
Row.ErrorColumnName = componentMetaData.GetIdentificationStringByID(Row.ErrorColumn);
}
else
{
Row.ErrorColumnName = "Database constraint :)";
};
}


2) Paste it into the script window and save it


3) Select both ‘ErrorCode’ and ‘ErrorColumn’ as input parameters


4) Now add two new output columns as shown below

In my case both output column are of data type string (DT_STR) with a size 250.

5) Add a Derived Column to the Data Flow after the Script Component and use the following SSIS expression to replace the original output value of ‘ErrorColumnName’

REPLACE( REVERSE( SUBSTRING( REVERSE( ErrorColumnName ),1,FINDSTRING( REVERSE(ErrorColumnName) ,”[“,1) – 1) ),”]”,” “)

(The original output value also contains the component’s name and we’re just interested in the column name itself)

 

There you go!
Many thanks to my colleague Gigi for the SSIS Expression 🙂

 


 


 

Hashing within the Azure Data landscape

The research

I’ve did a small research about what’s the best option is for hashing (e.g. determining deltas or hashing concatenated natural keys). Up until now, the algorithm used for hashing is MD5. Please note that we’re not discussing the hashing of passwords. Want to read more? https://www.linkedin.com/feed/update/urn:li:activity:6295982490112925696

 

Which hashing algorithm should we use?

MD5 has a minimal performance impact and the output (VARBINARY or CHAR(32)) has the lowest storage footprint. But…as of SQL Server 2016, all hashing algorithms except SHA2_256 and SHA2_512 are deprecated:

“…Beginning with SQL Server 2016, all algorithms other than SHA2_256, and SHA2_512 are deprecated. Older algorithms (not recommended) will continue working, but they will raise a deprecation event…”

https://docs.microsoft.com/en-us/sql/t-sql/functions/hashbytes-transact-sql

 

Basically Microsoft is saying good bye to all other/older algorithms. When you look at the current Azure data landscape, you bet that for the near future you’re unable to continue using MD5 in your Data Warehouse or Data Mart (SQL Server 2017+, Azure SQL DB and Azure SQL DWH).

So… SHA2_256 it is for hashing keys. I guess for hashing passwords or other sensitive values, use SHA2_512 in combination with salt.

 

Wouldn’t that impact the (ETL) performance?

Yep, unfortunately. But please keep in mind that we just need to continue evolving and Microsoft is in charge 🙂 Just make sure that you only need to hash values once, e.g. in your staging/landing area.

 

How should we use it?

Within SQL Server, Azure SQL DWH or Azure SQL DB, you’ll probably use HASHBYTES() to calculate the hash and it returns a VARBINARY. Personally I’m not a big fan of storing ‘varbinaries’ in a DWH, that why I’m storing it as a CHAR(). I’m not sure whether VARBINARY is supported in all of Azure’s services (ML, etc.?).

Depending on the hashing algorithm, HASHBYTES() will return hashes with different sizes/lengths (bytes):



 


 


 

The VARBINARY values are:



 


 

Now…how to cast/convert it to CHAR()? On LinkedIn we had a short discussion about using CAST() or CONVERT(), since they output different strings. My advice would be to use CONVERT(), since it represents the actual VARBINARY-value (with or without the ‘0x’-prefix).

If you want to have the prefix included in the output string, use:
SELECT CONVERT(CHAR(),HASHBYTES('',''),(1))

Personally I think that you can lose the prefix, because you can always concatenate it if needed. So:
SELECT CONVERT(CHAR(),HASHBYTES('',''),(2))



 


 

Now when you’re converting the VARBINARY to a CHAR(xx), the size/length increases:



 


 

Conclusion

So basically it all come to:

Generate a SHA2_256 hash like SELECT CONVERT(CHAR(64),HASHBYTES('SHA2_256',@TESTVAL),(2))
Generate a SHA2_512 hash like SELECT CONVERT(CHAR(128),HASHBYTES('SHA2_512,@TESTVAL),(2))

Store a SHA2_256 hash as CHAR(64)
Store a SHA2_512 hash as CHAR(128)

Unless Microsoft decides to support MD5 or SHA1 within SQL Server vNext / HASHBYTES(), my advice will be to start using SHA2_256 for key hashes.

In case you don’t need to combine Azure services and just stick to a ‘normal’ DWH on SQL Server, storing a SHA2_256 as BINARY(32) would be the best choice. BINARY() as a datatype could not be supported in a specific service or tool.

Happy hashing! 🙂

 


 


 

How to Encrypt and Decrypt Text in SQL Server (2008+)

Here’s a simple example on how to encrypt (and decrypt) values in SQL Server:

How to Encrypt and Decrypt Text in SQL Server

 

--How to Encrypt and Decrypt Text in SQL Server (2008+)
--Example by Clint Huijbers

DECLARE
@Key1 NVARCHAR(50) = N'8g87g8ag4r8hinsg^%$#F&^F&^F^F&ÛIBOUG(%*^R&$%&#%^C5'
,@Key2 NVARCHAR(50) = N'&*^%R%^**F$F*OUYBYUB*F%74d654d7f685f65f56f6v6vc88d'
,@ValueToEncrypt VARCHAR(150) = 'This is the value that will be encrypted by the built-in function of SQL Server with two keys and 128-bits encryption :) blablablablablablablablablabl'
,@EncryptedValue VARBINARY(350) --Increase the VARBINARY size when you increase the lengh of @ValueToEncrypt!

SELECT @EncryptedValue = ENCRYPTBYPASSPHRASE((@Key1+N'||'+@Key2),@ValueToEncrypt)
SELECT @EncryptedValue

SELECT CONVERT(VARCHAR(150),DECRYPTBYPASSPHRASE((@Key1+N'||'+@Key2),@EncryptedValue))

 


 


 

SQL Table/CTE for SQL Server, SSIS and Biml data types

Cathrine Wilhelmsen posted an easy lookup table for the data types within SQL Server, SSIS and Biml.

For loading the meta-data information based on source tables, I find it handy to provide all three data types in the result set of the query. So here’s an easy Common Table Expression / Table Value Constructor which might help you also.


WITH DataTypeMatrix (DataTypeSQL, DataTypeSSIS, DataTypeBiml)
AS
(
SELECT *
FROM
(
VALUES
('bigint','DT_I8','Int64')
,('binary','DT_BYTES','Binary')
,('bit','DT_BOOL','Boolean')
,('char','DT_STR','AnsiStringFixedLength')
,('date','DT_DBDATE','Date')
,('datetime','DT_DBTIMESTAMP','DateTime')
,('datetime2','DT_DBTIMESTAMP2','DateTime2')
,('datetimeoffset','DT_DBTIMESTAMPOFFSET','DateTimeOffset')
,('decimal','DT_NUMERIC','Decimal')
,('float','DT_R8','Double')
,('geography','DT_IMAGE','Object')
,('geometry','DT_IMAGE','Object')
,('hierarchyid','DT_BYTES','Object')
,('image','DT_IMAGE','Binary')
,('int','DT_I4','Int32')
,('money','DT_CY','Currency')
,('nchar','DT_WSTR','StringFixedLength')
,('ntext','DT_NTEXT','String')
,('numeric','DT_NUMERIC','Decimal')
,('nvarchar','DT_WSTR','String')
,('real','DT_R4','Single')
,('rowversion','DT_BYTES','Binary')
,('smalldatetime','DT_DBTIMESTAMP','DateTime')
,('smallint','DT_I2','Int16')
,('smallmoney','DT_CY','Currency')
,('sql_variant','DT_WSTR','Object')
,('text','DT_TEXT','AnsiString')
,('time','DT_DBTIME2','Time')
,('timestamp','DT_BYTES','Binary')
,('tinyint','DT_UI1','Byte')
,('uniqueidentifier','DT_GUID','Guid')
,('varbinary','DT_BYTES','Binary')
,('varchar','DT_STR','AnsiString')
,('xml','DT_NTEXT','Xml')
) AS F (DataTypeSQL, DataTypeSSIS, DataTypeBiml)
)

SELECT *
FROM DataTypeMatrix




Need to re-format the T-SQL code? Use the Notepad++ plugin ‘Poor Man’s T-Sql Formatter

 


 


 

MDS: How to create or deploy a package from a model

A few lines I need to remember for MDS 🙂

CREATE:

CD C:\Program Files\Microsoft SQL Server\130\Master Data Services\Configuration

MDSModelDeploy createpackage -service MDS1 -model “Biml meta-data” -version “VERSION_1” -package “C:\MDS\BimlMetaData_20170614.pkg” -includedata

DEPLOY:

CD C:\Program Files\Microsoft SQL Server\130\Master Data Services\Configuration

MDSModelDeploy deployclone -package “C:\MDS\BimlMetaData_20170614.pkg” -service MDS1

MDS Create package from model