Putting two hot topics together doesn’t often make sense. But if you have one minute to read this blog, I hope to convince you that data governance and cloud computing are indeed highly intertwined.
To CIOs, cloud computing is a multi-tier migration of a swath of IT services outside a company’s firewalls. At the infrastructure level, it’s about provisioning computational and storage capacity. At the next level, it’s about services, like backup and recovery, transaction processing, credit checks, etc. Finally, it’s about full-blown applications, like Salesforce.com. For small and mid-sized companies, the cloud is the great equalizer: world-class IT services are within reach, and better yet, these services are elastic. For big enterprises with sophisticated data centers and IT management, the cloud is a growing nuisance that needs urgent attention. Lines of business will continue to buy cloud-based services, and someday IT will surely be called upon to integrate them with everything else.
How does the cloud impact data management? The cloud sets off some big trends. The first one is clear: more and more data will be stored outside a company’s firewall. The obvious concern here is data security. I won’t dwell on it because so much has been written on this topic.
A less obvious trend is that cloud computing will lead to greater inter-dependence among companies for each other’s data. More and more data will originate from outside a company’s firewalls, and more and more data will leave for systems outside a company’s firewalls. “Outside the firewall,” no matter how you cut it, means outside a company’s full control.
Data quality will suffer. How do you prevent a business partner from giving you bad data? Is it in the contract? How much do you know about the data coming from your data vendor, or data used by your service vendor? How do you know if it’s any good? In a world of great inter-dependence, how would one even know the origin of certain data? How do you enforce any common data definitions and standards if your SaaS applications are nearly, or outright, impossible to customize? How do you ensure your data is retired and archived properly from a multi-tenant database? Who owns your data anyway, like the information about your company’s employees so brazenly displayed on LinkedIn?
The cloud is going to make it even harder than before to get your data house in order. Good data governance practices can help. In fact, all four sub-disciplines of data governance (data definition, lifecycle, quality, and security) are relevant. Your data policies should not only govern how data is managed inside your firewall, but also figure into how you work with vendors and partners in the ether.
The cloud means you’ll have less control over how your data is produced, stored, and consumed. So make sure data governance is a part of your cloud strategy, so you can reap the rewards of cloud computing without wreaking havoc on your data.