Agile + Data Modeling = Greater Success

Let me start by saying that agile development is fundamentally sound and has serious merit. It may in fact be the best development methodology there is. However, that does not mean there are no serious issues with it in practice. In this blog, I will point out some of the major shortcomings that I have seen in its practice, even among some of the most gifted developers that I’ve ever worked with. My hope is to educate developers so that these shortcomings can be reduced if not eliminated.

Most developers exercise deliberate and prudent care when developing their application code classes. Some use diagramming tools, while the rest simply rely upon methodical design planning and review processes, which are often entirely manual. Regardless, application developers generally create very flexible and reusable classes. Moreover, they can explain their classes to new team members in a very concise and understandable fashion. I’ve never heard a senior developer, when questioned about their classes, say “I don’t know what it’s for” or “you’d have to read the code to figure out what it’s for”. In general, they value classes both fundamentally and intrinsically, since classes are the key item they work with. Hence they just know what the classes are for, because that’s where their focus is.

But when I ask these same senior developers, following a strict agile development methodology, what various tables are for, the answers are quite surprising. Some will reply “I simply don’t know”. Others will say “I think it’s for X but it also might be for Y”. When I spot duplicate tables, the answer is that developer #1 uses one group and developer #2 uses the other. Of course, when I ask to see a data model for all their tables, there is none, even if they did use diagramming tools to design their classes. It appears that persistent data storage, whether via tables or any other database storage mechanism (e.g. key-value store, wide column store, document, or graph), is treated as simply an unavoidable means to an end, and thus as less important. The understandable result is both a poor design and a poor understanding of it.
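Duplicate tables of this kind are often easy to surface mechanically. Here is a minimal sketch, assuming a SQLite database and treating two tables as suspected duplicates when they share exactly the same column names and declared types. The table names (customer, client, invoice) and the helper functions are hypothetical, invented purely for illustration, and a real check would likely need fuzzier matching.

```python
import sqlite3
from collections import defaultdict


def find_structural_duplicates(conn):
    """Group tables whose (column name, declared type) sets are identical."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name")]
    by_signature = defaultdict(list)
    for table in tables:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        signature = frozenset(
            (name.lower(), ctype.upper()) for _, name, ctype, *_ in columns)
        by_signature[signature].append(table)
    # Only signatures shared by more than one table are suspicious.
    return [group for group in by_signature.values() if len(group) > 1]


def build_demo_database():
    """Hypothetical schema: 'customer' and 'client' were created by two developers."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, email TEXT);
        CREATE TABLE client   (id INTEGER PRIMARY KEY, email TEXT, name TEXT);
        CREATE TABLE invoice  (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    """)
    return conn


if __name__ == "__main__":
    print(find_structural_duplicates(build_demo_database()))
```

A check like this only flags candidates. Deciding which table survives is still a design conversation, which is exactly the conversation a shared data model forces a team to have.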

Why should anyone care about these observations and issues? As a product manager, when I perform my final user acceptance testing of a product before release, I tend to find major product issues that almost invariably are data related. Moreover, it’s not uncommon for a competent QA team to encounter many such errors as well, even though they may not have the same appreciation of the product’s intended audience. In either case, the problem has already been baked into the cake and generally takes far longer to correct. Worse yet, due to time constraints the solution is quite often a band-aid to get the product out the door rather than going back and fixing it properly.

Once the product is out the door, imagine being a tech support analyst trying to troubleshoot a user problem which seems to defy logic. The majority of those issues which are not just because of stupid user error end up being data related. The answer quite often is that it will be fixed in the next version (since it’s a design flaw that cannot be admitted), and the fix often requires an export/import of the data or the running of a one-time data cleansing utility. Either way, the answer becomes a forced update with a painful update process for the end-user.

Another problem occurs when a sales rep says the customer is sophisticated and requires a data model or some such road map of the database the product uses. As a product manager, I always prefer to have one ready for such requirements, but many companies or VPs of product management will not allow such disclosures under the guise of protecting the product’s intellectual property (i.e. the database design). To that, I say “total hogwash”, because any potential customer smart enough to ask for one will simply reverse engineer the data model during their trial period and examine it closely, reasoning that we must be hiding something if we cannot just share it. I’ve lost many deals when this occurs.

OK, so I’ve identified the problem. So what is my answer? It’s easy: integrate intelligent persistent data storage design into the agile development methodology. Make the first few scrum iterations focused on collecting a basic understanding of the data requirements and how best to persist that data. Then, where appropriate, create a data model or some other diagrammatic representation and treat that deliverable as part of the application itself. Share that data model or diagram early on with product management to sanity check it, then share it with QA to help them properly test all possible scenarios, and finally share it with any potential customer who asks, to show them the quality of your product.
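To make "treat the data model as part of the application" concrete, here is one possible sketch, not a prescribed approach: the model lives in the code base as a single documented definition, and both the schema and a shareable data dictionary are generated from it. Everything named here (the Column and Table classes, DATA_MODEL, the customer and invoice tables, the helper functions) is hypothetical and chosen only for illustration; a dedicated modeling tool would do far more.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    sql_type: str
    description: str  # what the column is for, in plain language


@dataclass(frozen=True)
class Table:
    name: str
    description: str  # what the table is for, in plain language
    columns: tuple


# Hypothetical model: the single source of truth for schema and documentation.
DATA_MODEL = (
    Table("customer", "One row per paying customer.", (
        Column("id", "INTEGER PRIMARY KEY", "Surrogate key."),
        Column("email", "TEXT NOT NULL", "Login and contact address."),
    )),
    Table("invoice", "One row per invoice issued to a customer.", (
        Column("id", "INTEGER PRIMARY KEY", "Surrogate key."),
        Column("customer_id", "INTEGER NOT NULL REFERENCES customer(id)",
               "Owning customer."),
        Column("total", "REAL NOT NULL", "Invoice total."),
    )),
)


def to_ddl(table):
    """Render one table definition as a CREATE TABLE statement."""
    cols = ",\n".join(f"    {c.name} {c.sql_type}" for c in table.columns)
    return f"CREATE TABLE {table.name} (\n{cols}\n);"


def create_schema(conn, model):
    """Create every table in the model on the given SQLite connection."""
    for table in model:
        conn.execute(to_ddl(table))


def to_data_dictionary(model):
    """Render a plain-text data dictionary for product management, QA, or customers."""
    lines = []
    for table in model:
        lines.append(f"{table.name}: {table.description}")
        for c in table.columns:
            lines.append(f"  {c.name} ({c.sql_type}): {c.description}")
    return "\n".join(lines)


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    create_schema(connection, DATA_MODEL)
    print(to_data_dictionary(DATA_MODEL))
```

Because the documentation and the schema come from the same definition, nobody can add a table without also saying what it is for, and the dictionary handed to QA or a customer cannot drift out of date.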

For all such needs, IDERA’s ER/Studio ranks among the very best. It really should be a must-have tool for anyone building internal applications for their own company and for any company creating software to sell to others. If every application developer and software vendor integrated a tool like IDERA’s ER/Studio into their agile development process, the quality of software everywhere would improve dramatically.

Anonymous