Overview

I have been asked multiple times if ER/Studio supports Data Vault Modeling. The answer is yes, ER/Studio does support multiple approaches of data modeling including Data Vault. This article will introduce some key facts behind the Data Vault Architecture and explain why it has become so prominent in the Enterprise Data Warehouse world today.

Data vault architecture

Organizational data management and architectural needs have changed massively in recent decades. Today’s enterprise data solutions must be agile, flexible to rapid changes, scalable, reliable, and cheaper than ever.

Dan Linstedt, creator of the Data Vault method, gives the following formal definition (Source: Wikipedia) for the Data Vault methodology:

“The Data Vault is a detail oriented, historical tracking and uniquely linked set of normalized tables that support one or more functional areas of business. It is a hybrid approach encompassing the best of breed between 3rd normal form (3NF) and star schema. The design is flexible, scalable, consistent, and adaptable to the needs of the enterprise. It is a data model that is architected specifically to meet the needs of today’s enterprise data warehouses.”

Business changes drive most of the data modeling projects in an enterprise. These changes have a huge impact on the downstream related integrated systems and are associated with large development, test and implementation costs.

The Data Vault modeling approach has been introduced to address agility, flexibility, compliance, auditing and scalability issues that exist in traditional approaches for Data Warehouse data modeling according to Kimball and Inmon and to reduce large change-related costs.

Data Vault differentiates three core types of entities and is based on the concept of the business process decoupling business keys from description and context.

  1. Hub – representing a list of unique business keys
  2. Link – describes a unique list of relationships/interactions between business keys
  3. Satellites – contain descriptions and the contexts of the business keys or links

ER/Studio objects for data modeling

ER/Studio provides a number of components like business data objects, shapes, text blocks, and data dictionary elements to represent Data Vault patterns.

Diagram 1 below shows a Data Vault model with Business Data Objects and text blocks. ER/Studio’s Business Data Objects (BDOs) are containers for describing a business concept. Business Data Objects allow you to combine entities into groups that can be used to describe and graphically represent entities/tables that share a common characteristic. Different text fonts and background colors can be used to visualize Hubs, Satellites and Links.

Additional descriptions and notes can be added to the BDO container to document the grouping purposes. Notes are included as part of the HTML report when users generate a report for the model. You can format the notes using standard HTML tags. You can collapse a BDO containment frame by clicking the minus sign in the top right hand corner of the BDO frame to show or hide the entities in the respective hub, link or satellite group. ER/Studio attachments can be used to bind an external piece of information like core architecture of the data warehouse to the BDO and to document the enterprise data warehouse architecture.

Diagram 1: BDO representing containers of Data Vault related objects

Diagram 2 below shows the explorer view of ER/Studio showing where the BDOs and shapes are created. These items can be easily found in the navigation tree within the models in which they are created.

 

Diagram 2: Explorer view of objects

Diagram 3 below depicts a functional view of a Data Vault model where Hubs and Links and their respective satellites are functionally grouped to show the different domains or functional areas. This has been achieved by using different shapes, text blocks, and coloring.

Diagram 3: Functional views of entities

ER/Studio data dictionaries and macros

Additionally, ER/Studio has local and enterprise data dictionary (shown in Diagram 4) components where domains can be defined as reusable attributes for common Data Vault attributes/columns like Surrogate_ID, Business_Key, Record_Source, Date_Time_Stamp. These attributes can be defined once and then dragged and dropped in an entity or a table.

Diagram 4: Data dictionary objects related to Data Vault

Last but not least, ER/Studio provides an automation interface and gets shipped with macros which allows you to create Hub, Link and Satellite template entities with predefined Data Vault attributes and use those template objects while creating recurring objects.

Diagram 5: Macros to create reusable entities

Anonymous
  • ER/Studio does not get shipped with macros for the creation of data vault related objectes. But it is easy to create such a macro.
    The following macro for example creates a hub with some basic attributes. You can create satellites and links with simple modifications
    of this macro.

    '#Language "WWB-COM"

    ' This macro creates a Hub entity with the following requirements:
    '
    ' Creates a hub entity name based on your input and a prefix.
    ' The entity has the following attributes:
    ' Name: "Your_Input_SQN" type Data type Integer, and has a PK
    ' Name: "Your_Input_Business Key" Data type Varchar 50 and Not Null. The business key name is unique for each hub and must be changed manually.
    ' Name: Load Time Stamp Data Type Datetime Not Null
    ' Name: Record Source Data Type Varchar 150 Not Null


    Sub Main

    Dim theDiagram As Diagram
    Dim theModel As Model
    Dim theHubEnt As Entity
    Dim theAttr As AttributeObj

    Begin Dialog UserDialog 260,112 ' %GRID:10,7,1,1
    TextBox 30,28,210,21,.Prefix
    Text 30,7,160,14,"Hub_Name :",.Text1
    OKButton 80,70,90,21
    End Dialog

    Dim dlg As UserDialog

    ' Run the dialog box.

    Dialog dlg

    ' Get the prefix from the dialog box.

    Prefix = dlg.Prefix


    Set theDiagram = DiagramManager.ActiveDiagram
    Set theModel = theDiagram.ActiveModel

    Set theHubEnt = theModel.Entities.Add(100, 100)
    theHubEnt.EntityName = "HUB_" +Prefix
    theHubEnt.TableName = Prefix

    Set theAttr = theHubEnt.Attributes.Add(Prefix+"_SQN", True)
    theAttr.Datatype = "Integer"

    Set theAttr = theHubEnt.Attributes.Add(Prefix+ "_Business Key", False)
    theAttr.Datatype = "VARCHAR"
    theAttr.DataLength = 50
    theAttr.NullOption = True

    Set theAttr = theHubEnt.Attributes.Add("Load_Time_Stamp", False)
    theAttr.Datatype = "DATETIME"
    theAttr.NullOption = True
    theAttr.NullOption = True

    Set theAttr = theHubEnt.Attributes.Add("Record_Data_Source", False)
    theAttr.Datatype = "VARCHAR"
    theAttr.DataLength = 150
    theAttr.NullOption = True


    MsgBox "Done"

    End Sub

  • Thank you for your feedback Dan. Yes, I do agree that Data Vault is far more than just a data model and encompasses a system of business intelligence, integration, methodology, standards and enterprise data warehouse architecture.The lines in the diagram are a bit overlapping and do not show the connections properly.

    But the following rules apply:

    A hub is never joined to another Hub. Each Satellite is never dependent on more than one parent table and is never a parent table to any other table. A satellite is based on a non-identifying business element opposite to links which are based on identifying business element relationships.

  • A better layout of the relationship lines in diagram 1 would be helpful (as well as a better readable screenshot).