Ensemble Modeling Forms (DWH Models)

This is a copy of Hans Hultgren’s blog post about the forms of Ensemble Modeling:
https://hanshultgren.wordpress.com/2013/06/16/ensemble-modeling-forms/

Ensemble Modeling Forms: Modeling the Agile Data Warehouse

Anchor Modeling. Data Vault Modeling. Focal Point Modeling. To name a few. In fact, dozens of data warehouse data modeling patterns have been introduced over the past decade. The leading ones share a set of defining characteristics, and these characteristics are combined in the definition of Ensemble modeling forms (AKA Data Warehouse Modeling). See the original post for links to coverage notes and a summary of the Next Generation DWH Modeling conference.

The differences between them define the flavors of Ensemble Modeling. These flavors have vastly more in common than they have differences. Compared with 3NF or Dimensional modeling, the Ensemble forms follow an 80/20 rule of commonality: most of their defining characteristics are shared.

All these forms practice Unified Decomposition (breaking concepts into component parts) with a central unique instance as a centerstone (Anchor, Hub, Focal Point, etc.).
Each separates context attributes into dedicated table forms that are directly attached to the centerstone.
Each uncouples relationships from the concepts they seek to relate.
Each purposefully manages historical time-slice data with varying degrees of sophistication concerning temporal variations.
Each recognizes the differences between static and dynamic data.
Each recognizes the reality of working with constantly changing sources, transformations and rules.
Each recognizes the dichotomy of the enterprise-wide natural business key.
From that foundation of commonality, the various forms of Ensembles begin to take on their own flavors.

While Data Vault takes the natural business key as the foundation of the centerstone (Hub), both Anchor and Focal Point center on a unique instance of a concept, where the business key is strongly coupled to but separate from the centerstone (Anchor, Focal Point).
Both Data Vault and Anchor aim to model Ensembles at the Core Business Concept level while Focal Point tends to deploy slightly more abstracted or generic forms of concepts.
Data Vault and Focal Point utilize forms of attribute clusters (logical combinations) for context while Anchor relies on single attribute context tables.
And there are other differentiating factors as well.
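
The shared characteristics listed above can be sketched in a few tables. This is only a minimal illustration, not any specific flavor: it uses Python's built-in sqlite3 module, and the "customer" and "product" concepts, all table names and all column names are assumptions made up for the example. It shows a centerstone per concept, a context table attached to the centerstone with a simple time slice, and a relationship table kept separate from the concepts it relates.

```python
import sqlite3

# Hypothetical ensemble: every table and column name here is illustrative.
SCHEMA = """
CREATE TABLE customer (                -- centerstone: one row per unique instance
    customer_id INTEGER PRIMARY KEY
);
CREATE TABLE customer_name (           -- context attribute, historized per change
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    name        TEXT    NOT NULL,
    valid_from  TEXT    NOT NULL,
    PRIMARY KEY (customer_id, valid_from)
);
CREATE TABLE product (                 -- centerstone of a second concept
    product_id INTEGER PRIMARY KEY
);
CREATE TABLE customer_product (        -- relationship, uncoupled from both concepts
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    product_id  INTEGER NOT NULL REFERENCES product(product_id),
    PRIMARY KEY (customer_id, product_id)
);
"""


def build_demo():
    """Create the tables in memory and load a tiny data set."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO customer (customer_id) VALUES (1)")
    conn.execute("INSERT INTO product (product_id) VALUES (10)")
    conn.executemany(
        "INSERT INTO customer_name VALUES (?, ?, ?)",
        [(1, "Acme Ltd", "2013-01-01"), (1, "Acme Inc", "2013-06-01")],
    )
    conn.execute("INSERT INTO customer_product VALUES (1, 10)")
    return conn


def current_name(conn, customer_id):
    """Return the most recent name for a customer, or None if unknown."""
    row = conn.execute(
        "SELECT name FROM customer_name WHERE customer_id = ? "
        "ORDER BY valid_from DESC LIMIT 1",
        (customer_id,),
    ).fetchone()
    return row[0] if row else None
```

A name change adds a row to customer_name only; the centerstone and the relationship table stay untouched, which is the point of the decomposition.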

There is one thing that we can all agree on: modeling the agile data warehouse means applying some form of Ensemble modeling approach. The specific flavor choice (Data Vault, Anchor, Focal Point, etc.) should be based on the specific characteristics of your data warehouse program.

* Learn more on Anchor Modeling with Lars Rönnbäck here: Anchor Modeling

More info about Ensemble Modeling: https://hanshultgren.wordpress.com/2012/11/20/ensemble-modeling/

SSIS Anchor Modeling example (tutorial)

About
This Anchor Modeling SSIS example is brought to you by:

Clint Huijbers (BumbleBI IT)

Bas van den Berg (C2H)

The first version of the demo was created by Bas and shared on AnchorModeling.com.

Tutorial Video by Bas
I advise you to watch it first 🙂

Tutorial-by-Bas-van-den-Berg


New version of the SSIS package
The fast-performing package (the final step in his tutorial) was rebuilt by Clint and is described below:

Download
To download the SSIS package and T-SQL scripts, click here.


The Anchor Model (AM)
The AM models telephone calls and includes two Anchors (Call and Phone), two Ties and three Attributes:

Anchor Model with mnemonics
AM-Model-with-mnemonics

Anchor Model without mnemonics
AM-Model-without-mnemonics


How to install

1) Create a new database called ‘AM_SSIS_Example’

2) Create all AM-objects by executing the script: ‘Generated SQL Script by AM Online Modeler.sql’

3) Generate source data by executing the script: ‘Create dbo_Calls table (source).sql’
Give it a few minutes; it will generate 691,200 rows

The table dbo.Calls is already filled with demo data:
dbo_Calls

4) Create a Batch-table for metadata by executing the script: ‘Create BA_Batch table.sql’

The database should now have a couple of tables (also Views, SPs and UDFs):
SQL-Server-Tables

5) Open the SSIS package in BIDS/SSDT and execute the package

The SSIS package demos two scenarios:
#1 – Insert records by using the INSERT-trigger on view dbo.lPhone
#2 – Insert records directly into the Anchor and Attributes
(by using the fast load / bulk insert and parallel processing of the Attributes)
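
The difference between the two scenarios can be sketched outside SSIS. The example below is a rough illustration under stated assumptions, not the package itself: SQLite (through Python's sqlite3 module) stands in for SQL Server, only the view name lPhone comes from the text above, and the table names PH_Phone and PH_NUM_Phone_Number, the column names and the trigger name are hypothetical. Scenario #1 inserts through an INSTEAD OF INSERT trigger on the view, which fires once per row; scenario #2 assigns the identities up front and loads the Anchor and the Attribute as two plain batches.

```python
import sqlite3

# Hypothetical Anchor (PH_Phone) with one Attribute and a "latest" view on top.
SCHEMA = """
CREATE TABLE PH_Phone (PH_ID INTEGER PRIMARY KEY);
CREATE TABLE PH_NUM_Phone_Number (
    PH_ID               INTEGER PRIMARY KEY REFERENCES PH_Phone(PH_ID),
    PH_NUM_Phone_Number TEXT NOT NULL
);
CREATE VIEW lPhone AS
    SELECT p.PH_ID, n.PH_NUM_Phone_Number
    FROM PH_Phone p
    LEFT JOIN PH_NUM_Phone_Number n ON n.PH_ID = p.PH_ID;
CREATE TRIGGER it_lPhone INSTEAD OF INSERT ON lPhone
BEGIN
    -- Create the Anchor row first, then attach the Attribute to its new identity.
    INSERT INTO PH_Phone (PH_ID) VALUES (NULL);
    INSERT INTO PH_NUM_Phone_Number (PH_ID, PH_NUM_Phone_Number)
    VALUES (last_insert_rowid(), NEW.PH_NUM_Phone_Number);
END;
"""


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def load_via_view(conn, numbers):
    """Scenario #1: row-by-row inserts handled by the trigger on the view."""
    conn.executemany(
        "INSERT INTO lPhone (PH_NUM_Phone_Number) VALUES (?)",
        [(number,) for number in numbers],
    )


def load_direct(conn, numbers):
    """Scenario #2: assign identities up front, then load each table as a batch."""
    start = conn.execute("SELECT COALESCE(MAX(PH_ID), 0) FROM PH_Phone").fetchone()[0]
    ids = list(range(start + 1, start + 1 + len(numbers)))
    conn.executemany("INSERT INTO PH_Phone (PH_ID) VALUES (?)", [(i,) for i in ids])
    conn.executemany(
        "INSERT INTO PH_NUM_Phone_Number VALUES (?, ?)", list(zip(ids, numbers))
    )
```

Because the direct load knows the identities before it writes anything, the Attribute batches no longer depend on the trigger and could be loaded in parallel, which is the idea behind the second scenario.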

SSIS-Package-Design

Raw File
A Raw File Source is used to temporarily store and quickly re-load the full dataset, located in ‘C:\Temp\AM_SSIS_Example_RAW_FILE’. Important: make sure to manually create/verify this Raw File. You’ll be able to find it here:

SSIS-Raw-File-Connection

Open the ‘Raw File Destination Editor’ window (right-click and ‘Edit’) and verify the Raw File’s location:

SSIS-Raw-File

6) Watch the show

SSIS-Package-Executed

7) Verify the results by executing the script: ‘Verify the results.sql’

AM-data-loaded

8) Increase the volume!
Would you like to re-process all data, perhaps with a lot more records? 😀
Truncate all tables by executing the script: ‘Truncate and delete all data.sql’
Now, drop the table dbo.Calls.
Execute the script ‘Create dbo_Calls table (source).sql’ again, but this time increase the variable @EndDateTime by just a couple of days.
Have fun!


SQLTimer: a SQL Server Timer

So…you’ve got a fancy Business Intelligence (BI) solution? Predictive Analytics or perhaps Actionable Analytics? Ever thought about the right time to contact your (potential) customers?



SQLTimer might be a handy addition to your BI Suite!
Like a timer for your lighting, it checks every hour whether it’s the right moment to execute jobs for categories like:

BusinessHour: Indicates whether or not it’s the right time to contact your customer during business hours (included in the SQLTimer script)

Additional time schedules are available for download:
Twitter: Indicates whether or not it’s the right time to contact your Twitter users
FaceBook: Indicates whether or not it’s the right time to contact your FaceBook users
LinkedIn: Indicates whether or not it’s the right time to contact your LinkedIn users
GooglePlus: Indicates whether or not it’s the right time to contact your Google+ users
Pinterest: Indicates whether or not it’s the right time to contact your Pinterest users

In this blogpost, I’ll be diving into the SQLTimer itself and the free time schedule called ‘BusinessHour’ (included in the script).

Download the complete script free on:
SQLTimer – How to install


How it works:
Just like the timer for your lightbulb, it is configured on an hourly basis:

SQL-Timer-Execute-TimeSchedule

The values indicate, for each hour of the week, whether it is a bad, a better or even the best time to contact your customers:

BusinessHour values:
0 = Outside of business hours
1 = Inside of business hours
2 = Preferred hours
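
To make the hourly grid concrete, here is a small sketch in Python. It is not the SQLTimer script: the concrete hours (Monday to Friday, 09:00 to 17:00 as business hours, 10:00 to 12:00 as preferred hours) and all function names are assumptions chosen for the example. Only the three values 0, 1 and 2 come from the list above.

```python
from datetime import datetime

OUTSIDE, INSIDE, PREFERRED = 0, 1, 2


def build_week_schedule():
    """Return a {(weekday, hour): value} grid covering all 168 hours of a week.

    The hours used here are assumptions: Monday-Friday 09:00-17:00 counts as
    business hours and 10:00-12:00 as preferred hours.
    """
    schedule = {}
    for weekday in range(7):          # 0 = Monday ... 6 = Sunday
        for hour in range(24):
            if weekday >= 5:
                value = OUTSIDE       # weekend
            elif 10 <= hour < 12:
                value = PREFERRED
            elif 9 <= hour < 17:
                value = INSIDE
            else:
                value = OUTSIDE
            schedule[(weekday, hour)] = value
    return schedule


def business_hour_value(schedule, moment: datetime) -> int:
    """Look up the schedule value for the hour that contains `moment`."""
    return schedule[(moment.weekday(), moment.hour)]
```

An hourly job only needs the lookup for the current hour, for example business_hour_value(build_week_schedule(), datetime.now()).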

Important note: SQLTimer is set up on a day-to-day basis, so you might want to tweak the time schedule a bit for local holidays or other important days in the year:

SQL-Timer-TimeSchedule

SQLTimer objects:
(An existing database is required)
Let’s have a look at all the SQLTimer objects:

SQL-Timer-Objects

*** Tables ***

timer.TimerLogging
This table contains the logging (when a time schedule was executed (StartTime/EndTime) and what value the time schedule had)

SQLTimer’s logging in timer.TimerLogging:

SQL-Timer-Logging

timer.TimeSchedule
This table actually contains the time schedule itself

timer.TimeScheduleConfiguration
This table is all about the configuration

SQLTimer’s configuration in timer.TimeScheduleConfiguration
In case you would like to change the name of the schema, don’t forget to change the configuration settings:

SQL-Timer-TimeScheduleConfiguration


*** Stored Procedures ***

timer.usp_Hourly_Timer
This SP is the timer itself and needs to be scheduled in SQL Agent to be run on an hourly basis

timer.usp_Select_TimeSchedule
You might want to use the timer.usp_Select_TimeSchedule() stored procedure to analyze the time schedule for the coming week:

SQL-Timer-Execute-Select-TimeSchedule

timer.usp_Execute_BusinessHour_Jobs
This SP will execute the preferred jobs (for example: T-SQL code, SP’s or SQL Agent Jobs) for the time schedule ‘BusinessHour’

Here’s what timer.usp_Execute_BusinessHour_Jobs looks like.
You need to add your T-SQL code, SP’s or SQL Agent Jobs here that need to be executed if the @TimeScheduleValue reaches a certain value.

SQL-Timer-Execute-BusinessHour-Jobs
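
The idea behind this stored procedure, running a job only once the schedule value reaches a certain level and logging the run, can be sketched as follows. This is an illustration in Python rather than the actual T-SQL: the function name, the job list format and the log record layout are assumptions; only the StartTime/EndTime logging and the threshold on the schedule value are taken from the descriptions above.

```python
from datetime import datetime


def run_business_hour_jobs(schedule_value, jobs, log):
    """Run every job whose minimum schedule value has been reached.

    jobs: list of (minimum_value, callable) pairs (hypothetical format).
    log:  list that receives one record per run, loosely mirroring the
          StartTime/EndTime/value logging described for timer.TimerLogging.
    Returns the names of the jobs that were executed.
    """
    start_time = datetime.now()
    executed = []
    for minimum_value, job in jobs:
        if schedule_value >= minimum_value:
            job()
            executed.append(job.__name__)
    log.append(
        {
            "StartTime": start_time,
            "EndTime": datetime.now(),
            "TimeScheduleValue": schedule_value,
            "Executed": executed,
        }
    )
    return executed
```

With jobs registered as [(1, send_newsletter), (2, call_customer)] (both hypothetical functions), a value of 1 would run only the first job and a value of 2 would run both.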


*** Install the SQLTimer ***

So much for the SQLTimer itself; it’s time to install it!
1) Download the T-SQL script here: SQLTimer – How to install

2) Unzip the ZIP-file and open the *.sql file in SSMS or SSDT

3) Change the database name and uncomment the T-SQL code:

SQL-Timer-T-SQL-Script

4) Execute the T-SQL script
Output of the script:

SQL-Timer-T-SQL-Script-executed

5) Create a SQL Agent Job (Execute T-SQL statements) which executes the timer.usp_Hourly_Timer stored procedure every hour


6) Don’t forget to update the time schedule with your local (company) holidays 🙂
Table: timer.TimeSchedule


Websites:
SQLTimer
BumbleBI IT


DateTimeOffset to Date Time Offset (Dimension) Identities

These SQL functions might come in handy for your ETL and DWH.
I’m currently designing a DWH which will have three ‘TimeDimensions’:
DateDimension (grain = day)
TimeDimension (grain = second)
TimeZoneDimension (grain = offset in minutes)

Simply put, these functions convert a DATETIMEOFFSET value into three types of integer identity values (which can be used in your dimensions).

DateTimeOffset to Date Time Offset Identities - Results
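
The downloadable SQL code is not reproduced here, but the kind of conversion involved can be sketched in Python. The encodings below are assumptions, not necessarily the ones used by the original functions: the date identity is formed as YYYYMMDD, the time identity as seconds since midnight, and the time zone identity as the UTC offset in minutes, matching the three grains listed above. The function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def to_dimension_ids(value: datetime):
    """Split an offset-aware datetime into (date_id, time_id, offset_id).

    Assumed encodings (not taken from the original SQL functions):
      date_id   = YYYYMMDD                (grain = day)
      time_id   = seconds since midnight  (grain = second)
      offset_id = UTC offset in minutes   (grain = offset in minutes)
    """
    offset = value.utcoffset()
    if offset is None:
        raise ValueError("value must carry a UTC offset")
    date_id = value.year * 10000 + value.month * 100 + value.day
    time_id = value.hour * 3600 + value.minute * 60 + value.second
    offset_id = offset // timedelta(minutes=1)
    return date_id, time_id, offset_id
```

For instance, 2014-03-15 13:45:30 +01:00 maps to the date identity 20140315, the time identity 49530 and the offset identity 60 under these assumptions.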

Want it? 🙂
Download it here:
DateTimeOffset to Date Time Offset Identities – SQL Code