Data Architecture: The Foundation for Enterprise Architecture and Governance

by Oct 27, 2018

"Data modeling is all about understanding the data within our organizations and then representing this data in a precise visual (the “data model”) which aids communication – saving time during development, saving money on support, and improving data quality. The best place to equip your team with data modeling skills is Data Modeling Zone (DMZ), an annual conference in the US, Europe, and Asia Pacific." – Steve Hoberman
   
We have sponsored every US DMZ since the conference began.  The value of data modeling cannot be overstated, and in fact it is more important now than ever before.  It is fundamental to unraveling and understanding the complex data ecosystems that today's organizations are facing, which comprise multiple technologies and formats including relational, NoSQL, raw data, document stores, data lakes and data warehouses.  I am pleased to share a few highlights of the topic I presented at DMZ in Madison, WI last week, combined with how the concepts are implemented in ER/Studio:
   
Data is used to represent all tangible and intangible items of interest to an organization.  It exists in and migrates through a proliferation of disparate systems, with updates and transformations at virtually any stage in its life cycle.  Enterprise architecture as a whole is essential to align processes and technology in order to execute an organization's strategy.  This cannot be accomplished without a robust data architecture foundation, which supports all other aspects of enterprise architecture.  Today's enterprise architecture frameworks originated and grew from data architecture, expanding to encompass business architecture, application architecture and technical architecture.  The entire structure is necessary to enable governance.  While data architecture is the foundation, business architecture can be regarded as the central pillar of the structure.

Integrated data and process modeling in particular is essential to understanding what is important to an organization and how it operates, as well as how the organization needs to adapt to attain strategic objectives.  In a nutshell, data represents "WHAT" and business process represents "HOW", while also providing necessary context.  Data is a strategic asset.  To take full advantage of it we must fully understand the data value chain.

As we proceed from left to right, we are building the data value chain.  Facts are captured, stored, and expressed as data.  Information is data in context.  Without context, data is meaningless; we create meaningful information by interpreting the context around data, which includes:

  • business meaning of data elements and related terms
  • format in which the data is presented
  • timeframe represented by the data
  • relevance of the data to a given usage
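
To make "information is data in context" concrete, here is a minimal Python sketch that pairs a raw value with the four kinds of context listed above.  The class names, field names and sample values are illustrative assumptions, not part of ER/Studio or any other product.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Context:
    """The four kinds of context listed above (field names are illustrative)."""
    business_meaning: str   # business meaning of the data element
    data_format: str        # format in which the data is presented
    period_start: date      # timeframe represented by the data
    period_end: date
    relevant_uses: tuple    # usages for which the data is relevant


@dataclass(frozen=True)
class Information:
    """A raw data value paired with the context that gives it meaning."""
    value: float
    context: Context

    def describe(self) -> str:
        c = self.context
        return (f"{c.business_meaning}: {self.value} ({c.data_format}), "
                f"{c.period_start} to {c.period_end}")


raw_value = 1250000.0  # made-up sample value; on its own it says nothing

info = Information(
    value=raw_value,
    context=Context(
        business_meaning="Net sales",
        data_format="USD",
        period_start=date(2018, 7, 1),
        period_end=date(2018, 9, 30),
        relevant_uses=("quarterly reporting",),
    ),
)
print(info.describe())
```

The same number with a different business meaning, currency or period would be a different piece of information, which is why the context has to travel with the value.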

Knowledge is information in perspective, integrated into a viewpoint based on the recognition and interpretation of patterns combined with other information and experience.  Knowledge is applied through informed decisions and actions.

ER/Studio is a powerful modeling suite that allows you to truly understand the data, transforming it into knowledge.  To do so, we locate, map and decode the data into visual models which promote understanding.  This is done through reverse engineering and metadata imports from various platforms and sources.  Naming standards are used to clarify what the data represents.  There are often multiple instances of the same concept scattered across multiple data stores in an organization, often with different names.  Universal mappings, a capability unique to ER/Studio, can be used to link those entity instances.  More granular linking at the attribute level can also be done, which is especially useful for critical data elements.
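
The idea of linking differently named instances of the same concept can be sketched in a few lines of Python.  This is only a conceptual illustration under assumed names: the data stores, entity names and the group_by_concept helper are hypothetical and do not represent how ER/Studio stores or exposes universal mappings.

```python
from collections import defaultdict

# Hypothetical inventory: (data store, entity name) pairs found by reverse engineering.
entities = [
    ("crm_db", "Customer"),
    ("billing_db", "CLIENT"),
    ("warehouse", "DimCustomer"),
    ("billing_db", "INVOICE"),
]

# Hypothetical mapping from each physical entity to one shared business concept.
concept_of = {
    ("crm_db", "Customer"): "Customer",
    ("billing_db", "CLIENT"): "Customer",
    ("warehouse", "DimCustomer"): "Customer",
    ("billing_db", "INVOICE"): "Invoice",
}


def group_by_concept(entities, concept_of):
    """Return {concept: [(store, entity), ...]}, keeping input order."""
    groups = defaultdict(list)
    for entity in entities:
        groups[concept_of[entity]].append(entity)
    return dict(groups)


linked = group_by_concept(entities, concept_of)
for store, name in linked["Customer"]:
    print(f"{store}.{name}")
```

Once the instances are grouped this way, a question such as "where does customer data live?" has a single answer regardless of what each system calls it.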

To further understand where data originated and how it is used, ER/Studio can create visual data lineage models including sources, targets and transformations.
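
A lineage model of sources, targets and transformations is, in essence, a directed graph.  The Python sketch below traces everything that feeds a given target; the edge list, object names and the upstream function are assumptions made for illustration and are not an ER/Studio interface.

```python
# Hypothetical lineage edges: (source, transformation, target).
lineage = [
    ("crm_db.Customer", "deduplicate", "staging.Customer"),
    ("billing_db.CLIENT", "rename columns", "staging.Customer"),
    ("staging.Customer", "load dimension", "warehouse.DimCustomer"),
]


def upstream(target, edges):
    """Return every source that feeds `target`, directly or indirectly."""
    found = []
    frontier = [target]
    while frontier:
        node = frontier.pop()
        for source, _transformation, dest in edges:
            # The `not in found` check also keeps the loop finite if edges form a cycle.
            if dest == node and source not in found:
                found.append(source)
                frontier.append(source)
    return found


origins = upstream("warehouse.DimCustomer", lineage)
print(origins)
```

Reversing the direction of the walk (following targets instead of sources) would give the corresponding impact analysis: everything affected by a change to one source.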

ER/Studio process modeling capabilities implement the full BPMN 2.0 specification, with significant extensions beyond it.  Integrated business process models are used to define and document important processes, functional responsibilities, goals, objectives, data store usage and links between processes and the applications that implement them.

Process models can incorporate entities and tables from associated data models, even specifying Create, Read, Update and Delete (CRUD) operations if required.  This builds a very rich specification of how the data is used in an organization, as well as visibility of the data life cycle.
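
Recording CRUD operations per process amounts to building a process-by-entity matrix.  The sketch below, in Python, shows how such a matrix exposes the life cycle of an entity; the process names, entity names and the life_cycle helper are hypothetical examples rather than anything taken from ER/Studio.

```python
# Hypothetical process-to-entity usage, each marked with CRUD letters.
crud_matrix = {
    "Register Customer": {"Customer": "C", "Address": "C"},
    "Place Order": {"Customer": "R", "Order": "C"},
    "Fulfil Order": {"Order": "RU"},
    "Close Account": {"Customer": "UD", "Address": "D"},
}


def life_cycle(entity, matrix):
    """Return {operation: [process, ...]} for one entity."""
    ops = {"C": [], "R": [], "U": [], "D": []}
    for process, usage in matrix.items():
        for letter in usage.get(entity, ""):
            ops[letter].append(process)
    return ops


customer = life_cycle("Customer", crud_matrix)
print(customer)

# An empty list points at a gap, e.g. no process ever deletes an Order.
print(life_cycle("Order", crud_matrix)["D"])
```

Gaps of this kind (data that is created but never read, or never deleted) are often the first useful finding from combining process and data models.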

As discussed above, it is essential to understand the business meaning of data elements and related terms.  This is facilitated by true enterprise data dictionaries and business glossaries.  The architecture of the enterprise dictionaries and models allows governance and other user-defined characteristics to be defined and attached in the models themselves, and the associated metadata is published to the Team Server collaboration platform.  These characteristics can be used to define and designate important classifications for reference and master data management, security, data retention policies, data quality characteristics and more.  The information is very easy to communicate, since it can be visually displayed in the model diagrams as well.

In ER/Studio Enterprise Team Edition, the business glossary is not restricted to business terms and their definitions.  The glossary structure is extremely flexible, supporting linked and nested glossaries that can be used to represent functional areas, related terms, catalogs of governance policies and even serve as a directory to link reference and master data.  Thus, governance policies and reference data can be linked to the associated metadata and model constructs.
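
A nested glossary whose terms link to model constructs can be pictured as a simple tree.  The following Python sketch is one possible representation under assumed names (Term, Glossary, linked_objects and the sample entries are all hypothetical); it does not describe the internal structure of the ER/Studio glossary.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    name: str
    definition: str
    linked_objects: list = field(default_factory=list)  # e.g. entity or attribute names


@dataclass
class Glossary:
    name: str
    terms: list = field(default_factory=list)
    children: list = field(default_factory=list)  # nested glossaries

    def find(self, term_name, path=()):
        """Depth-first search; returns (glossary path, Term) or None."""
        here = path + (self.name,)
        for term in self.terms:
            if term.name == term_name:
                return here, term
        for child in self.children:
            hit = child.find(term_name, here)
            if hit is not None:
                return hit
        return None


policies = Glossary("Governance Policies", terms=[
    Term("Customer Data Retention", "Retain customer records for a defined period.",
         linked_objects=["Customer"]),
])
sales = Glossary("Sales", terms=[
    Term("Customer", "A party that buys goods or services.",
         linked_objects=["crm_db.Customer"]),
])
enterprise = Glossary("Enterprise", children=[sales, policies])

path, term = enterprise.find("Customer Data Retention")
print(" > ".join(path), "->", term.linked_objects)
```

Here a governance policy sits in its own nested glossary yet still points at the model construct it governs, which is the kind of linkage described above.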

As time has progressed, business has become increasingly complex.  Most organizations use only 15-20% of their internal data effectively.  That figure is declining, since the amount of data that organizations must contend with is growing at an exponential rate.  As Dr. Peter Aiken eloquently stated in one of his conference sessions last week, "There will never be less data than right now!"  Therefore, in order to survive, organizations must embrace and improve their data architecture capabilities, with a focus on data modeling in particular.  Data modeling provides the visual "blueprints" that allow organizations to understand their data and add the necessary context and perspective to create actionable knowledge.  Data is a fundamental building block in every organization.  It is rooted in the past, a key indicator in the present and a strategic asset to enable the future.
  
“Organizations that do not understand the overwhelming importance of managing data and information as tangible assets in the new economy will not survive.” – Tom Peters, 2001