A Brief History of Data Governance

Data management has gone through significant changes in the 20 years that I’ve been in this business. Data emerged out of the lockboxes of disparate legacy transactional systems, and data management came to be a separate and sophisticated discipline enabled by advanced software and hardware. Gone are the days when most people needed to be convinced that data is a valuable asset. Through three recessions (1990, 2001, 2008), the data management industry marched forward nearly unscathed; spending continues to increase faster than overall IT spending.

Over this period, the degree to which data is governed as an asset went through three distinct eras: the Application Era, the Enterprise Repository Era, and the Policy Era. I’m fortunate enough to have experienced the tail end of the Application Era, spent most of my career building enterprise repositories, and can now see the emergence of the Policy Era. Note that I’ve defined these labels by how an organization as a whole thinks about data, not to minimize the important role that applications and enterprise repositories play today or will play in the future.

Application Era (1960-1990)

When organizations began mainstream adoption of data processing technology, systems were built to support transactional business processes and make them less labor intensive: taking an order, balancing a general ledger, etc. Data was seen as a byproduct of running the business and had little value beyond the transaction and the application that processed it. In those days, data was not treated as a valuable, shared asset, so the need for governance did not arise.

Some organizations attempted to govern data through enterprise data modeling. Their success was limited for two reasons: first, these efforts were driven by IT without the broad organizational support and authority needed to enforce compliance; second, the rigidity of packaged applications further reduced their effectiveness. So the idea of data governance through enterprise modeling was mostly an academic exercise.

Enterprise Repository Era (1990-2010)

Starting from roughly 1990 (you can debate the exact year), most organizations began to realize that the value of data extends beyond transactions. Decision making increasingly relied on data analysis. In addition, business processes consumed increasingly large amounts of data created in distant parts of the organization for different purposes. This led to a trend of thinking about data for broad use cases beyond the localized context of a transaction.

We tackled this problem by building large-scale repositories, such as data warehouses, that take an enterprise perspective. ERP and ERP consolidation (the notion of having a single, integrated set of plumbing run the business) are driven by the same philosophy. More recently, we’ve accepted the idea that not all data is of equal value, and that it is more cost-effective to pay extra attention to the data that describes core business entities that are widely referenced. This led to the current build-out of master data repositories.

While enterprise repositories yield a lot of benefits, they also have the deserved reputation of being very expensive and risky undertakings. Creating a view of data that supports multiple use cases invariably results in conflicts, and ultimately it’s up to the business (the consumers of data) to resolve these conflicts. As a result, data governance gradually but surely came to be recognized as critical to success.

However, because data-related activities are generally carried out on a system-by-system basis, governance is typically siloed around individual enterprise repositories: data governance for a data warehouse or an ERP system, for example, or data governance for master data management. Also, governance is informal, lacking a distinct organizational structure and clearly defined and executed processes.

Policy Era (2010-?)

Even though consolidation is the theme of many enterprise data initiatives, in reality, the opposite has occurred. Systems and data repositories have proliferated. It’s not hard to see why. Data complexity and volume continue to explode; businesses have grown more sophisticated in their use of data, which drives new demands that require different ways to combine, manipulate, store, and present information. Mergers, acquisitions, and other strategic business changes lay waste to long-term plans in the midst of execution. In other words, enterprise repositories alone have been unable to keep up with business reality.

Forward-thinking companies recognized this and began to solve the data problem in a different way. They formed business-led governance organizations to care for data for the enterprise, and created collaborative processes to manage a core set of data deemed critical for the business. More significantly, they took a policy-centric approach to data models, data quality standards, data security, and lifecycle management. Rather than envisioning ever-larger and more encompassing repositories, they put processes in place for defining, implementing, and enforcing policies for data. It is acceptable for the same type of data to be stored in multiple places as long as every copy adheres to the same set of policies. Enterprise repositories continue to be important, but they’re built on governed platforms integrated with enterprise data policies.
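To make the policy-centric idea a little more concrete, here is a minimal Python sketch of one shared set of data quality policies being checked against several repositories that hold the same type of data. Everything in it is hypothetical and only for illustration: the `Policy` class, the two example rules, the repository names, and the record fields are assumptions, not a description of any particular governance product.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]


@dataclass(frozen=True)
class Policy:
    """A named rule that a single record either satisfies or violates."""
    name: str
    check: Callable[[Record], bool]


# Hypothetical enterprise-wide policies, defined once and shared by all systems.
POLICIES = [
    Policy("customer_id_present", lambda r: bool(r.get("customer_id"))),
    Policy(
        "country_is_two_uppercase_letters",
        lambda r: len(r.get("country", "")) == 2 and r.get("country", "").isupper(),
    ),
]


def find_violations(
    repositories: Dict[str, List[Record]], policies: List[Policy]
) -> List[Tuple[str, int, str]]:
    """Apply every policy to every record in every repository.

    Returns (repository name, record index, policy name) for each failure.
    """
    found = []
    for repo_name, records in repositories.items():
        for index, record in enumerate(records):
            for policy in policies:
                if not policy.check(record):
                    found.append((repo_name, index, policy.name))
    return found


# Hypothetical repositories that each store their own copy of customer data.
REPOSITORIES = {
    "crm": [{"customer_id": "C1", "country": "US"}],
    "warehouse": [
        {"customer_id": "C1", "country": "usa"},
        {"customer_id": "", "country": "DE"},
    ],
}

for repo_name, index, policy_name in find_violations(REPOSITORIES, POLICIES):
    print(f"{repo_name}[{index}] violates {policy_name}")
```

The point of the sketch is the shape rather than the rules themselves: the policies live in one place, and each repository is measured against them, instead of all data being forced into a single store.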

Emerging from this is a shift in mindset: Business takes increasing responsibility for data content, and data is widely recognized as one of the most valuable corporate assets throughout the organization. For IT, a policy-centric approach is liberating. It affords more flexibility in designing systems to serve business needs without giving up consistency and control.

The Future

Successful implementations of policy-centric data governance will produce pervasive and long-lasting improvements in business performance. Over time, the scope of these data governance programs will increase to cover all major areas of competence: model, quality, security and lifecycle. And clearly defined and enforced policies will cover all high-value data assets, the business processes that produce and consume them, and systems that store and manipulate them.

More importantly, a strong culture that values data will become firmly entrenched in every aspect of doing business. The organization for data governance will become distinct and institutionalized, viewed as critical to business in a way no different than other permanent business functions like human resources and finance.

Do you agree? I welcome your comments.

6 replies
  1. Jim Harris says:

    Excellent post, Winston.

    I also began my career at the tail end of the Application Era, but I would probably say that my career has been more of a 50/50 split between applications and enterprise repositories. As you know, history does not move forward at the same pace for all organizations, including software vendors. By that I mean that my professional experience was influenced more by working for vendors selling application-based solutions than by working with clients who were, let’s just say, less than progressive.

    As you also know, the theory of the diffusion of innovations describes the five stages and the rate at which new ideas and technology spread through cultures, starting with Innovators (2.5%) and Early Adopters (13.5%), then the Early Majority (34%) and Late Majority (34%), and finally ending with the Laggards (16%).

    Therefore, as you stated, the exact starting points of the three eras you described can easily be debated, because progress can be painfully slow until enough of the Early Majority begins to embrace the new ideas and technology and causes the so-called Tipping Point, where progress begins to accelerate enough for the mainstream to take notice.

    For example, it could be argued that master data management (MDM) reached its tipping point in late 2009, and with the wave of acquisitions in early 2010, MDM stepped firmly on the gas pedal of the Early Majority, and we are perhaps just beginning to see the start of MDM’s Late Majority.

    It is much harder to estimate where we are with the diffusion of the innovation of data governance. I think we are still in the Early Adopter phase in 2010, and perhaps 2011 will be “The Year of Data Governance” in much the same way that some have declared 2010 to be “The Year of MDM.” That would mean it may be another six to twelve months before we can claim the Early Majority has not just truly embraced the idea of data governance, but has realistically begun its journey toward making it happen.

    But then again, my name isn’t Nostradamus.

    Sorry for the long comment, thanks again for the excellent post.

    Best Regards,


    • Winston Chen says:

      Thanks for your comments, Jim.

      I did grapple with this point and debated whether I should put years in my Eras. You’re absolutely right that maturity and adoption are uneven. Every once in a while I run into a company that doesn’t have any data warehouses or BI. People started building decision support systems 40 years ago!

      When I thought about the years, I did exactly what you suggested and thought about it in terms of “crossing the chasm”, “tipping point”, or the beginning of mainstream adoption. I tried to recall when I stopped talking mostly about why and started to talk more about how. Even then, the boundary years are very much up for debate. I also agree with you that MDM is probably at the height of the adoption curve.

      As for data governance, I have had this conversation with several people. There is a consensus that data governance has just crossed the chasm into the early majority. The caveat is that we generally talk to Global 2000 companies, so it’s a skewed sample of all the organizations out there.
