Kimball Group puts out some fantastic design tips, and one I especially liked in the past year came from Joy Mundy (Design Tip #158, Making Sense of the Semantic Layer). Its arrival in my inbox was extremely timely, because at the time I was in a disagreement with a BI consultant who specialized in one of the new breed of BI tools. His position was that semantic layers were unnecessary and the realm of “traditional, IT-centric reporting environments.” If I didn’t understand that, well, I just didn’t understand the niche that particular product fills: enabling ad-hoc end-user reporting without IT involvement. The argument started because I had asked whether third-party solutions like Kalido could preconfigure the semantic layer for the tool rather than forcing someone to configure it manually.
The cynical side of me leans toward Upton Sinclair’s quote here: “It is difficult to get a man to understand something, when his salary depends upon his not understanding it.” In other words, if a BI tool doesn’t yet support automatically creating a semantic layer, it is understandable that the vendor takes the position that a semantic layer isn’t needed, right up until the point they deliver that feature.
In this area, Joy didn’t mince words:
“Is the semantic layer a mandatory component of a DW/BI architecture? The answer is yes, if you plan to open the doors to your DW/BI system for ad hoc use. Every hour you spend on building a rich semantic layer will pay off in improved adoption and success in the user community. Don’t simply run the wizard for your BI tool and generate a semantic layer that’s equivalent to the underlying tables (even if they’re dimensional). Spend the time to make it as rational and polished as you can.”
Let me give just a few quick examples of how I think this can be useful for any BI tool, even the new breed that prides itself on not requiring semantic layers.
1) Let’s say you have 1,000 tables, all with German acronyms. You have already mapped every field in the unfriendly physical schema to a business-friendly name, and you even have business definitions for each field, but now it is time for some ad-hoc reporting. Do you A) turn your end users loose on the repository as is, or B) spend a significant amount of time retyping your mappings and definitions into the BI tool’s import layer to make the fields easier for end users to understand? Obviously (B) is what happens in the real world, but why should you have to go through that effort? Every mapping you make manually in the BI tool is one that could have been created automatically if the BI layer exposed a programmatic API.
2) Kalido can create ‘hybrid’ slowly changing dimensions that are consumed by joining to the surrogate key of an object and constraining it at report time with a SQL WHERE clause for the flavor of time variance you want: the current hierarchy, the hierarchy at the time of the transaction, or the hierarchy at a specific point in time. Most BI tools try to auto-join the keys, but here auto-join alone doesn’t work, because the result must be further constrained. For tools with configurable semantic layers, Kalido can provide the join the end user wants automatically. For less sophisticated BI tools, manual work is needed for each join, if the BI tool supports it at all. For BI tools that don’t, you may have to maintain a view layer that resolves the time-variance issues to fit how the tool prefers to see the data. In other words, you have to adapt your physical design to the limitations of the BI tool, which conflicts directly with the message from many BI vendors that their tools can accept the data just as it is.
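To make the time-variance idea in example 2 concrete, here is a minimal sketch in Python’s built-in sqlite3 of a type-2-style dimension queried two ways: joined as-is for the hierarchy at the time of the transaction, and re-joined through the natural key with a WHERE clause for the current hierarchy. The table and column names are illustrative, not Kalido’s actual schema.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# A type-2-style customer dimension: one row per version, with validity dates.
cur.executescript("""
CREATE TABLE dim_customer (
    customer_sk  INTEGER PRIMARY KEY,  -- surrogate key
    customer_id  TEXT,                 -- natural (business) key
    region       TEXT,                 -- slowly changing attribute
    valid_from   TEXT,
    valid_to     TEXT                  -- '9999-12-31' marks the current version
);
CREATE TABLE fact_sales (
    customer_sk  INTEGER,              -- surrogate key of version in force at sale time
    sale_date    TEXT,
    amount       REAL
);
INSERT INTO dim_customer VALUES
    (1, 'C100', 'EMEA', '2010-01-01', '2012-06-30'),
    (2, 'C100', 'APAC', '2012-07-01', '9999-12-31');
INSERT INTO fact_sales VALUES
    (1, '2011-03-15', 100.0),          -- booked while the customer was in EMEA
    (2, '2013-02-01', 250.0);
""")

# Hierarchy at time of transaction: the fact already carries the surrogate
# key of the version in force when the sale happened, so a plain join works.
historical = cur.execute("""
    SELECT d.region, SUM(f.amount)
    FROM fact_sales f JOIN dim_customer d ON f.customer_sk = d.customer_sk
    GROUP BY d.region ORDER BY d.region
""").fetchall()

# Current hierarchy: re-map each fact to the current version of its customer
# via the natural key, constrained by a WHERE clause on the validity date.
current = cur.execute("""
    SELECT cur_d.region, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_customer tx_d  ON f.customer_sk = tx_d.customer_sk
    JOIN dim_customer cur_d ON cur_d.customer_id = tx_d.customer_id
    WHERE cur_d.valid_to = '9999-12-31'
    GROUP BY cur_d.region
""").fetchall()

print(historical)  # [('APAC', 250.0), ('EMEA', 100.0)]
print(current)     # [('APAC', 350.0)]
```

The only difference between the two reports is the extra join and WHERE clause, which is exactly the kind of constraint an auto-join-only semantic layer cannot express on its own.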
The reason I feel so passionately about this is Kalido is able to automatically generate the semantic layer for leading BI tools with a simple 7-tab wizard we call the Kalido Universal Information Director (see attached image).
This wizard makes it fast and easy to generate BI tool semantic layers, and because it reads from the Kalido business information model, changes to the underlying warehouse structure can be rapidly reflected in your BI tool. We intend to add support for more BI tools, but we are limited in what we can do if the BI tool vendors themselves don’t offer a programmatic API for us to talk to. Fortunately, as the newer tools mature and win more enterprise business, and their users face significant manual effort to hook them up to larger enterprise data warehouses, we are seeing positive movement toward opening the BI configuration layers of some of these tools to third parties. As Joy points out, some BI layers are “microscopically thin” while others are “rich and robust,” but Kalido should be able to save you significant design and build work in either case. And that’s a good thing! This is another example of the automation Kalido brings to data warehousing.
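To illustrate why a programmatic API matters, here is a minimal Python sketch of pushing mappings you already maintain into a BI tool’s semantic layer instead of retyping them. The `SemanticLayerClient` class and its `register_field` method are purely hypothetical stand-ins for a BI tool’s API, and the German SAP-style field codes are illustrative.

```python
# Mappings you already maintain elsewhere:
# physical field name -> business-friendly display name and definition.
field_mappings = {
    "KUNNR": {"name": "Customer Number", "description": "Unique customer identifier"},
    "MATNR": {"name": "Material Number", "description": "Product/material identifier"},
    "WERKS": {"name": "Plant",           "description": "Manufacturing plant code"},
}

class SemanticLayerClient:
    """Hypothetical stand-in for a BI tool's semantic-layer API."""

    def __init__(self):
        self.fields = {}

    def register_field(self, physical_name, display_name, description):
        # A real API would create the field in the tool's repository;
        # here we just record it in memory.
        self.fields[physical_name] = (display_name, description)

# One loop replaces hours of retyping in the BI tool's import layer.
client = SemanticLayerClient()
for physical, meta in field_mappings.items():
    client.register_field(physical, meta["name"], meta["description"])

print(client.fields["KUNNR"][0])  # Customer Number
```

The point is not the five lines of code but where the mappings live: if they exist once in a central model, any tool with an open API can consume them automatically.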
If you are a data warehouse professional and aren’t subscribed to Kimball Design Tips, I highly encourage you to do so. Even if you are an expert, there are always things to learn.