Recently, I wrote about treating a data process as the business process that it truly is. The premise was that we must look at data management processes in a more structured way. By spotting repeatable patterns and automating the mechanical aspects of the process, we can optimize it and improve performance.
I’d like to continue that theme and stay focused on the end-to-end process of building the data warehouse. There are many micro-steps that go into building the warehouse. We’ll keep the discussion at a high level and inspect the process as follows:
Gather and Analyze Requirements → Create and Manage Representative Models → Integrate Data → Enable Data Access Through a BI Layer → Test and Balance → Release to Production
You may use different terms, but you get the gist. The problem that presents itself is the speed at which these functional steps can be performed while still producing the high-quality foundation for analytics that is required to support critical business decision making. Traditional methodologies suggest that this process can take 12-18 months or more depending on resources, scope, and complexity. And, that’s if you get it right the first time.
A year and a half is an eternity in terms of a business cycle. So, what can we do to address information requirements at the speed of business without sacrificing quality or short-circuiting the “process” of implementing our information infrastructure, while ensuring that what is actually produced is what was expected?
Let’s begin at the beginning. It all starts with gathering and analyzing requirements. Without a comprehensive set of requirements defined by the business users in need of information to support decision-making, the data warehouse process will never get off the ground. I admit that this is a blinding glimpse of the obvious, but if we look at the requirements gathering process and how we leverage the information gained during these exercises, we really start to question whether we take it seriously enough.
Traditionally, requirements are documented in some semi-formal way (more often than not, spreadsheets are used to capture them). Then the requirements are translated into some form of conceptual model. The conceptual model gets presented back to the business, gets refined, and eventually reaches an agreed-upon state of accuracy.
It’s the whole notion of “translating” the requirements that bothers me. Every time we go through a step to “translate,” we risk losing the value in the original message. To make matters worse, there are multiple points at which translations must take place in order to get to the end state of the implemented data warehouse. Requirements are translated into conceptual models. Conceptual models are translated into logical data models. Logical data models are translated into a series of physical data models (3NF for the integrated data layer, dimensional models for the data marts or reporting schemas, to say nothing of landing zones and staging tables). And all of these steps may be done by different people with different skills! It reminds me of the old telephone game, where the first person in line whispers a message to the next, who passes it on to the best of their recollection, and so on down the line. We all know what comes out the other end of that line: something barely recognizable as the original message.
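To make those translation points concrete, here is a minimal illustrative sketch (all names and schemas are hypothetical, not drawn from any particular project) of how a single business statement mutates as it passes through each modeling layer on its way into the warehouse:

```python
# Illustrative sketch of the translation chain: each layer below is a
# different person's re-interpretation of the same business requirement.
# All names and structures are hypothetical examples.

# 1. Requirement, as captured from the business (often a spreadsheet row):
requirement = "Each customer can place many orders; we report revenue by customer."

# 2. Conceptual model (business analyst): entities and a relationship.
conceptual = {
    "entities": ["Customer", "Order"],
    "relationships": [("Customer", "places", "Order", "1:N")],
}

# 3. Logical model (data architect): attributes, keys, and datatypes appear.
logical = {
    "Customer": {"customer_id": "PK", "name": "varchar"},
    "Order": {"order_id": "PK",
              "customer_id": "FK -> Customer",
              "order_total": "decimal"},
}

# 4. Physical models (DBA): the same facts reshaped twice more --
#    a 3NF integration layer and a dimensional reporting schema.
physical_3nf = ["CUSTOMER(customer_id, name)",
                "ORDERS(order_id, customer_id, order_total)"]
physical_dimensional = ["DIM_CUSTOMER(customer_key, name)",
                        "FACT_ORDER(customer_key, order_total)"]

# Count the hand-offs: each one is a chance for meaning to drift.
layers = [requirement, conceptual, logical, physical_3nf, physical_dimensional]
print(f"{len(layers) - 1} translation points between requirement and warehouse")
```

Running the sketch counts four distinct hand-offs for one simple requirement, which is exactly where the telephone-game drift creeps in.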
We must streamline the process to create a more direct line from business requirements to the resultant reporting data, ensuring that we take the most direct route from business event to supporting information. The chain from business user to business analyst to data architect to database administrator leaves too much margin for error. Once requirements are taken out of their original form (the one produced in conjunction with the business), disconnects can go undiscovered until much, much later in the build and deployment process. How many times have you presented a report to the business (one that you think they asked for) and gotten a response of “that’s not what I wanted,” “it’s not right,” or “what I meant was…”?
Kalido recognized this flaw in the process early on and created the Business Information Modeler to clearly (and collaboratively) document end user business requirements directly in the form of a mutually understandable (and agreed-upon) model. The resultant business information model contains all of the business concepts expressed by business users. The model also embeds all of the structural validation necessary to ensure the integrity and quality of data based on how the business uses the data, not how a database stores it. Representing business requirements in their raw form, without forcing business users to interpret a technically oriented data model they do not understand, ensures accuracy in the resulting work product and eliminates a translation point in the process.
Next time around, we’ll extend the notion of the business information model and explore how it can be leveraged to further reduce data warehouse implementation risk and enable a truly iterative approach to building the data warehouse foundation from the ground up, without the “lost in translation” issue that has plagued these projects for years. Eliminating a risk point and streamlining the communication channel is a huge step toward optimizing our process. Pass it on.