Data Vault Modeling with ER/Studio Data Architect

by Apr 18, 2018

Overview

I have been asked multiple times if ER/Studio supports Data Vault Modeling. The answer is yes, ER/Studio does support multiple approaches of data modeling including Data Vault. This article will introduce some key facts behind the Data Vault Architecture and explain why it has become so prominent in the Enterprise Data Warehouse world today.

Data vault architecture

Organizational data management and architectural needs have changed massively in recent decades. Today’s enterprise data solutions must be agile, flexible to rapid changes, scalable, reliable, and cheaper than ever.

Dan Linstedt, creator of the Data Vault method, gives the following formal definition (Source: Wikipedia) for the Data Vault methodology:

“The Data Vault is a detail oriented, historical tracking and uniquely linked set of normalized tables that support one or more functional areas of business. It is a hybrid approach encompassing the best of breed between 3rd normal form (3NF) and star schema. The design is flexible, scalable, consistent, and adaptable to the needs of the enterprise. It is a data model that is architected specifically to meet the needs of today’s enterprise data warehouses.”

Business changes drive most of the data modeling projects in an enterprise. These changes have a huge impact on the downstream related integrated systems and are associated with large development, test and implementation costs.

The Data Vault modeling approach has been introduced to address agility, flexibility, compliance, auditing and scalability issues that exist in traditional approaches for Data Warehouse data modeling according to Kimball and Inmon and to reduce large change-related costs.

Data Vault differentiates three core types of entities and is based on the concept of the business process decoupling business keys from description and context.

  1. Hub – representing a list of unique business keys
  2. Link – describes a unique list of relationships/interactions between business keys
  3. Satellites – contain descriptions and the contexts of the business keys or links

ER/Studio objects for data modeling

ER/Studio provides a number of components like business data objects, shapes, text blocks, and data dictionary elements to represent Data Vault patterns.

Diagram 1 below shows a Data Vault model with Business Data Objects and text blocks. ER/Studio’s Business Data Objects (BDOs) are containers for describing a business concept. Business Data Objects allow you to combine entities into groups that can be used to describe and graphically represent entities/tables that share a common characteristic. Different text fonts and background colors can be used to visualize Hubs, Satellites and Links.

Additional descriptions and notes can be added to the BDO container to document the grouping purposes. Notes are included as part of the HTML report when users generate a report for the model. You can format the notes using standard HTML tags. You can collapse a BDO containment frame by clicking the minus sign in the top right hand corner of the BDO frame to show or hide the entities in the respective hub, link or satellite group. ER/Studio attachments can be used to bind an external piece of information like core architecture of the data warehouse to the BDO and to document the enterprise data warehouse architecture.

Diagram 1: BDO representing containers of Data Vault related objects

Diagram 2 below shows the explorer view of ER/Studio showing where the BDOs and shapes are created. These items can be easily found in the navigation tree within the models in which they are created.

 

Diagram 2: Explorer view of objects

Diagram 3 below depicts a functional view of a Data Vault model where Hubs and Links and their respective satellites are functionally grouped to show the different domains or functional areas. This has been achieved by using different shapes, text blocks, and coloring.

Diagram 3: Functional views of entities

ER/Studio data dictionaries and macros

Additionally, ER/Studio has local and enterprise data dictionary (shown in Diagram 4) components where domains can be defined as reusable attributes for common Data Vault attributes/columns like Surrogate_ID, Business_Key, Record_Source, Date_Time_Stamp. These attributes can be defined once and then dragged and dropped in an entity or a table.

Diagram 4: Data dictionary objects related to Data Vault

Last but not least, ER/Studio provides an automation interface and gets shipped with macros which allows you to create Hub, Link and Satellite template entities with predefined Data Vault attributes and use those template objects while creating recurring objects.

Diagram 5: Macros to create reusable entities