Hadoop HBase database - when supported?

  • Hello - I just saw the MongoDB demo. Great. Hadoop's native NoSQL database is HBase, and we are planning some very large implementations. We will be moving away from relational over time. HBase allows for embedded entities, essentially allowing child entities to be part of the parent row in a somewhat denormalized fashion. Rows will be very wide. We need a tool that will allow us to model these rows with embedded entities (repeating groups) and to create the HBase physical model and the code to instantiate it (the DDL equivalent). Is this on the Embarcadero roadmap?
  • Thanks for your note. Glad you liked the video. We do offer support for Hadoop Hive, including reverse engineering and generating DDL. You can use our support for Hive to create HBase tables. We'll have another video and blog soon on our Hadoop Hive support.
  • In reply to Joy Ruff:

    Hello Joy, are repeating groups / embedded entities supported now? They're not in XE4; are they in XE7?
  • In reply to Michael R2593:

    Hi George,

    We introduced native NoSQL support for Hadoop Hive and MongoDB in the XE6 version of ER/Studio with further enhancements in XE7.

    Hive and HBase are both built on top of the Hadoop Distributed File System (HDFS), but that's where the similarity ends. Hive lets you define tables stored on HDFS via HQL, which resembles SQL DDL for creating the file constructs. By contrast, HBase is a key/value store that runs on top of HDFS. Unlike Hive, HBase organizes data into tables that are further split into column families, each of which groups together a certain set of columns. This presents a significantly different challenge for forward and reverse engineering than we see with Hive, which is much closer to a relational pattern.

    However, we are looking at ways to support the different NoSQL implementations. We have already done so with MongoDB, which is a document store that contains embedded objects and arrays, which are not relational concepts. We are able to determine the objects and arrays by parsing the JSON. We have also created our own notation as extensions to relational notation to depict the embedded objects and arrays.

    We do not directly support HBase at this time, but we are evaluating it as well as other NoSQL implementations. We do not have a firm time frame for implementing HBase yet. For HBase and other NoSQL implementations, we are interested in hearing from customers about your plans to use specific platforms, use cases outlining how you wish to use those platforms, and the challenges you are facing. This will assist us in extending our industry-leading modeling capabilities further, enabling you to achieve maximum benefit from the new platforms.
  • In reply to Ron Huizenga:

    Thanks for the update, Ron. Have you read Thomas Frisendal's book "Graph Data Modeling for NoSQL and SQL"? I'm reading it at the moment, and it's worth a read. See https://technicspub.com/nosql/
  • In reply to Michael R2593:

    Hello Michael
    My client has the same issue: we need to record data lineage to and from HBase and want to continue using ER/Studio if we can. We could cobble something together, such as creating 'fake' tables and references to mimic the HBase structure. Have you found a method that works and is usable within Data Lineage?
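
For readers following the thread, here is a rough sketch of the HBase layout discussed above: a table split into column families, with child entities (repeating groups) embedded in a wide parent row. It is an illustration only and has nothing to do with ER/Studio. The class names, the `flatten_row` helper, the `info`/`orders` families and the zero-padded qualifier scheme are all assumptions made up for this example. The only real HBase syntax is the shell `create` command the sketch prints, which is the closest thing HBase has to DDL.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ColumnFamily:
    name: str
    # Hypothetical modeling flag: True when the family holds a repeating group.
    repeating: bool = False


@dataclass
class HBaseTable:
    name: str
    families: List[ColumnFamily] = field(default_factory=list)

    def to_shell_create(self) -> str:
        """Render an HBase shell `create` command for this table."""
        specs = ", ".join(f"{{NAME => '{cf.name}'}}" for cf in self.families)
        return f"create '{self.name}', {specs}"


def flatten_row(parent: Dict[str, object],
                children: Dict[str, List[Dict[str, object]]],
                parent_family: str = "info") -> Dict[str, str]:
    """Flatten a parent entity and its child entities into one wide row.

    Parent attributes become `<parent_family>:<attribute>` cells. Each child
    in a repeating group becomes `<family>:<index>_<attribute>` cells, an
    assumed naming scheme that keeps a child's columns adjacent.
    """
    cells = {f"{parent_family}:{key}": str(value) for key, value in parent.items()}
    for family, rows in children.items():
        for index, child in enumerate(rows):
            for key, value in child.items():
                cells[f"{family}:{index:04d}_{key}"] = str(value)
    return cells


if __name__ == "__main__":
    table = HBaseTable(
        "customer",
        [ColumnFamily("info"), ColumnFamily("orders", repeating=True)],
    )
    print(table.to_shell_create())

    row = flatten_row(
        {"name": "Acme"},
        {"orders": [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 1}]},
    )
    for column, value in sorted(row.items()):
        print(column, "=", value)
```

Only the column families are declared up front. The per-child column qualifiers are created at write time, which is why this kind of row is hard to capture in a relational-style physical model.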