Starting this week, I’ll be publishing a series of blogs on enterprise data governance. As always, I welcome your comments and feedback.
When it comes to data management, presentations and whitepapers all strike a very consistent theme: Data is important, and we need to do something about it. The vendor landscape has changed. Technology fashions have changed. But the message remains the same, almost as if nobody were aware of the problem or had done anything about it.
Let’s look at the facts. How much have we spent on data management over the last 10 years? Gartner says that worldwide IT spend in 2008 was a whopping $3.4 trillion, an 8% growth over 2007. That’s roughly the GDP of Germany. Of that, about $40 billion was spent on data management software alone. Assuming typical labor-software-hardware ratios and the same 8% annual growth rate over the prior decade, a rough answer to my question is $1.4 trillion in cumulative data management spend. Clearly, we’ve done plenty about data. But are we doing the right thing?
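For readers who want to check the arithmetic, here is the back-of-envelope calculation. The ratio is my own assumption, not a Gartner figure: I take software to be roughly 20% of total cost, with labor and hardware making up the rest, and project the $40 billion 2008 software figure backwards at 8% per year.

```python
# Back-of-envelope check of the $1.4 trillion figure.
# Assumptions (mine, not Gartner's):
#   - $40B spent on data management software in 2008
#   - spend grew 8% per year over the prior decade
#   - software is roughly 20% of total cost (labor and
#     hardware make up the rest), i.e. a 5x multiplier

SOFTWARE_2008_B = 40.0   # $B, data management software, 2008
GROWTH = 1.08            # 8% annual growth
YEARS = 10
TOTAL_MULTIPLIER = 5.0   # total spend / software spend (assumed)

# Work backwards: 2008 is the last of the 10 years.
software_total = sum(SOFTWARE_2008_B / GROWTH**i for i in range(YEARS))
total_spend = software_total * TOTAL_MULTIPLIER

print(f"Cumulative software spend: ${software_total:.0f}B")
print(f"Estimated total spend: ${total_spend / 1000:.1f}T")
```

Under these assumptions, cumulative software spend comes to roughly $290 billion, and the total lands at about $1.4 trillion. Different ratio assumptions move the number, but not the order of magnitude.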
Near the end of 2008, the global financial system stood at the edge of the abyss. In a conference call with analysts, the CEO of a global banking giant was repeatedly asked to quantify mortgage-backed security holdings on the bank’s books. “I don’t have that information,” said the CEO over and over.
During the previous 10 years, this bank had spent a total of $37 billion on IT operations alone. We now know the bank was solvent. But at that moment, facing a collapsing stock and plummeting market confidence, the CEO couldn’t produce the one piece of data that could’ve saved his company and his job. After $37 billion spent. It was staggering.
That was not an isolated incident. Survey after survey points to chronic data problems in most organizations. $1.4 trillion hasn’t done the trick. There’s no reason to believe that spending more money doing the same things will make things any better. We need to rethink the problem and come up with a different approach. But to get to the right approach, we first need to identify the root cause.
Let’s take an example of a simple data quality problem. The finance department can’t send out an invoice because the customer’s billing address is missing. So finance calls the salesperson. If the salesperson doesn’t have it on hand, someone will need to contact the customer. A few days later, the right billing address is unearthed and an invoice is sent. Stories like this are repeated every day, everywhere. Payment is delayed, affecting cash flow. Normal business processes break down, increasing cost. The economic impact is very real.
The obvious solution is to make sure that each sales rep enters a complete and accurate billing address when entering an order. The best way to tackle a data quality problem is to do it as far upstream as possible: at the point of entry, the moment someone captures the real world in bits and bytes. There is just one problem: who will tell Sales?
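To make “validate at the point of entry” concrete, here is a minimal sketch of what an order-entry check might look like. The field names and rules are hypothetical, not any particular CRM’s schema; the point is that an incomplete order is rejected at capture time, not cleaned up downstream by finance.

```python
# Minimal sketch of point-of-entry validation: an order with an
# incomplete billing address is flagged at capture time, before it
# can propagate downstream. Field names are illustrative only.

REQUIRED_BILLING_FIELDS = ("street", "city", "postal_code", "country")

def validate_order(order: dict) -> list[str]:
    """Return a list of data quality errors; an empty list means accepted."""
    errors = []
    billing = order.get("billing_address") or {}
    for field in REQUIRED_BILLING_FIELDS:
        if not billing.get(field, "").strip():
            errors.append(f"missing billing_address.{field}")
    return errors

# A sales rep tries to enter an order with a partial billing address:
order = {
    "customer": "Acme Corp",
    "amount": 12500.00,
    "billing_address": {"street": "1 Main St", "city": "Springfield"},
}
print(validate_order(order))
```

The technology here is trivial; the organizational question the rest of this post raises is who has the authority to turn such a check on, given that Sales bears the cost and finance reaps the benefit.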
The elephant in the room is that good data has a cost. It takes time and discipline to investigate, verify, and put good data in a system. And this piece of data — billing address — is not required for the Sales function. But finance needs it. And other business processes need it for operations or analysis. Data generated by one business function is consumed by multiple business functions downstream, often very distant from the point of entry. In other words, a large group of people benefit from good data, but they are usually not the same people who bear the cost of good data.
This poses an organizational and behavioral challenge: How do we make people accountable for good data that benefits others, most of whom they don’t even know about? Where does the authority come from? What are the positive and negative incentives?
Another challenge is that Sales is not the only group that can create, change, or access customer data. Customer Service can, and so can Finance. An even larger group of people can see and report problems with data. If we assign sole ownership of customer data to Sales, we absolve the rest of the organization of its responsibility.
Important data assets have multiple providers and multiple consumers who are often unaware of each other, and data quality is often not in the immediate interest of data providers. There is no transparency and no accountability. This is the root cause of bad data. In the case of our global bank, traders and risk management staff in thousands of pockets around the globe were the providers of the data the CEO needed. With no transparency and accountability, the aggregate data was untrustworthy. This is one of the key reasons the big bank’s CEO lost his job and its shareholders were nearly wiped out.
In my next blog, I’ll discuss various approaches to the problem and their merits.
This blog is part 1 of a multi-part series on the topic of Enterprise Data Governance.