Building a Business-Driven Data Architecture

by Oct 6, 2016

In September, IDERA’s Ron Huizenga joined Eric Little and Malcolm Chisholm for a Hot Technologies session, hosted by Eric Kavanagh of the Bloor Group, on building a business-driven data architecture. Over the past several decades, there has been a shift from a process focus toward an emphasis on data for analytics and business intelligence, and with it a growing recognition of the importance of data architecture. Businesses are now thinking about the value of data, and how data can drive the business strategy. Data is no longer regarded as an end result, but is central to business models.

Complex data environments can drive significant business value, but technology is moving fast, and it is hard to keep up. How to integrate big data, how to incorporate semantics, how to leverage legacy data: these are some of the challenges people are facing. Barriers to implementing a business-driven data architecture include top-down and waterfall workflows, which demand detailed requirements up front and a start-to-finish process. These days, data development needs to adapt to a sprint-based workflow that delivers in phases and responds quickly to change. That means putting data at the heart of the development process, and testing the data, not just the functionality.

The variety of data – structured, semi-structured, and unstructured – coming from multiple data sources – relational, NoSQL, document, graph and other data stores – calls for a way to organize the chaos. Many organizations are using multiple platforms and various solutions such as ERP or SaaS systems, and may also have legacy systems to manage. It requires a large coordinated effort to manage data effectively, and that starts with good data models. Data-centric development involves data modelers interacting with developers to improve data quality and consistency.

How well do you know what your data is and what it means? How can you provide the necessary context for all the types of data in your enterprise? Data models are the blueprints for databases, and both of these feed into the larger enterprise architecture targeted for business use. The model layer should provide a logical representation of the physical database structures, with common elements and vocabularies clearly described. It’s also critical to maintain reference data, such as naming standards, to ensure consistency, even as the business evolves. ER/Studio can easily attach naming standards to models, submodels, or tables to provide a correlation between logical and physical model naming conventions. Elements that can be commonly reused can be added to a data dictionary. Additionally, ER/Studio offers Universal Mappings to connect terms or entities across models. All of these features help to improve data quality and consistency between models and databases.
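
To make the idea of a naming standard concrete, here is a minimal sketch of how logical attribute names could be translated into physical column names using an agreed abbreviation list. It is not ER/Studio's implementation; the abbreviation table, the `to_physical_name` function, and the 30-character limit are all assumptions chosen for illustration.

```python
import re

# Hypothetical abbreviation list; in practice the data team would
# maintain this as part of the naming standard.
ABBREVIATIONS = {
    "customer": "CUST",
    "address": "ADDR",
    "number": "NBR",
    "identifier": "ID",
}


def to_physical_name(logical_name: str, max_length: int = 30) -> str:
    """Translate a logical attribute name into a physical column name.

    Each word is replaced by its standard abbreviation when one exists,
    otherwise it is upper-cased. Words are joined with underscores and
    the result is truncated to max_length (an assumed platform limit).
    """
    words = re.findall(r"[A-Za-z0-9]+", logical_name)
    parts = [ABBREVIATIONS.get(word.lower(), word.upper()) for word in words]
    return "_".join(parts)[:max_length]


print(to_physical_name("Customer Address Line 1"))  # CUST_ADDR_LINE_1
print(to_physical_name("Order Number"))             # ORDER_NBR
```

Keeping the abbreviations in one shared table is what makes the logical and physical names traceable to each other: the same logical word always produces the same physical token.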

To further expand the metadata, companies should establish business glossaries where the business and IT teams can collaborate on the terms and definitions, so that everyone is working from a common knowledge base, with a complete understanding rather than just one individual’s perspective. ER/Studio offers the Team Server web-based platform that helps organizations establish and maintain these glossaries, with inputs from both business and data professionals. The business owns the data and must be involved in providing the source of truth for these contents. Having a common foundation will enable business users to make better analyses and decisions using the corporate data.

We received several questions during the session that we didn’t have time to answer. Those Q&As are captured here:

Q: Can ER/Studio generate an XML schema from an ER model?

A: Yes, ER/Studio Data Architect will export XML schemas. There are also many additional options for exports to other tools and frameworks.

Q: Can I build data security into the extensions, so that they could be part of my overall data firewall strategy?

A: Through the data security properties as well as attachments (metadata extensions), you can add metadata to most model constructs. So if there are properties relevant to your data firewall strategy that you wish to document, this approach should work.

Q: Do you have a web interface for modeling, e.g. in the Enterprise Team Edition version?

A: There is an interactive viewer in Team Server to view models, drill down to see metadata, etc. However, data models are created and changed in ER/Studio Data Architect, which is a client-based tool, and then published to Team Server. Similarly, changes to business process models are made in the ER/Studio Business Architect tool and published to Team Server.

Q: Do you provide a merge/split feature as part of glossary management?

A: The architecture in Team Server is such that a term can be associated with multiple glossaries if desired. Therefore, there is no need for a split or merge to work around the limitation, found in some other products, that restricts a term to a single glossary.
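
The many-to-many relationship between terms and glossaries described in this answer can be sketched as a small data structure. This is only an illustration of the concept, not Team Server's data model or API; the `GlossaryCatalog` class and the sample term and glossary names are hypothetical.

```python
from collections import defaultdict


class GlossaryCatalog:
    """Hypothetical catalog in which a term may belong to many glossaries."""

    def __init__(self):
        # Two indexes so lookups work in both directions.
        self._glossaries_by_term = defaultdict(set)
        self._terms_by_glossary = defaultdict(set)

    def associate(self, term: str, glossary: str) -> None:
        """Link a term to a glossary; repeating a link has no effect."""
        self._glossaries_by_term[term].add(glossary)
        self._terms_by_glossary[glossary].add(term)

    def glossaries_for(self, term: str) -> list:
        return sorted(self._glossaries_by_term.get(term, set()))

    def terms_in(self, glossary: str) -> list:
        return sorted(self._terms_by_glossary.get(glossary, set()))


catalog = GlossaryCatalog()
catalog.associate("Customer", "Sales")
catalog.associate("Customer", "Finance")
catalog.associate("Invoice", "Finance")

print(catalog.glossaries_for("Customer"))  # ['Finance', 'Sales']
print(catalog.terms_in("Finance"))         # ['Customer', 'Invoice']
```

Because the term is stored once and only linked to each glossary, there is a single definition to maintain, which is why no merge or split operation is needed.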

Q: What security does ER/Studio have to access the repository? Can that be integrated with an enterprise directory service?

A: The repository uses a project structure as well as user roles (groups). It can be integrated with Active Directory.

Q: Does the tool provide mapping, i.e. from a source data model to a target data model in the same modeling repository? Can it generate an exception report if any anomaly is found in the mapping semantics?

A: Source-to-target mapping with transformations is supported in the data lineage capabilities. There is also a concept called Universal Mappings, which allows linking of equivalent concepts across models that are in the repository. Universal Mappings can be defined at the entity/table level or at the attribute/column level.
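
The answer above does not describe a built-in exception report. As a rough sketch of what such a check could look like outside the tool, assume the column-level mappings and the two models have been exported into simple Python structures. Everything here is hypothetical: the `ColumnMapping` record, the `find_mapping_exceptions` function, the sample models, and the rule that a data type change requires a documented transformation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnMapping:
    """One source-to-target column mapping (hypothetical export format)."""
    source: str
    target: str
    transformation: str = ""  # empty means a straight copy


def find_mapping_exceptions(mappings, source_model, target_model):
    """Return a list of human-readable problems found in the mappings.

    source_model and target_model map 'TABLE.COLUMN' to a data type name.
    """
    exceptions = []
    for mapping in mappings:
        source_known = mapping.source in source_model
        target_known = mapping.target in target_model
        if not source_known:
            exceptions.append(f"unknown source column: {mapping.source}")
        if not target_known:
            exceptions.append(f"unknown target column: {mapping.target}")
        if source_known and target_known:
            types_differ = source_model[mapping.source] != target_model[mapping.target]
            # Assumed rule: a type change must come with a transformation.
            if types_differ and not mapping.transformation:
                exceptions.append(
                    f"type mismatch without transformation: "
                    f"{mapping.source} -> {mapping.target}"
                )
    return exceptions


source_model = {"CUSTOMER.CUST_ID": "INTEGER", "CUSTOMER.SIGNUP_DT": "VARCHAR"}
target_model = {"DIM_CUSTOMER.CUSTOMER_KEY": "INTEGER", "DIM_CUSTOMER.SIGNUP_DATE": "DATE"}

mappings = [
    ColumnMapping("CUSTOMER.CUST_ID", "DIM_CUSTOMER.CUSTOMER_KEY"),
    ColumnMapping("CUSTOMER.SIGNUP_DT", "DIM_CUSTOMER.SIGNUP_DATE"),
    ColumnMapping("CUSTOMER.REGION", "DIM_CUSTOMER.REGION"),
]

for problem in find_mapping_exceptions(mappings, source_model, target_model):
    print(problem)
```

With the sample data, the first mapping passes, the second is flagged for changing type without a transformation, and the third is flagged twice because neither column exists in its model.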

Q: If I have to link one glossary term to one of my objects defined in the semantic layer of a BI tool, what is the best way to achieve that?

A: If you have a model that represents the semantic layer in the BI tool, then you can link a glossary term directly to the corresponding model construct.