EMC announced last week that it had acquired Greenplum, maker of an MPP analytic database, for an undisclosed sum. According to Dave Kellogg’s blog, the valuation of Greenplum could have been as much as $300 to $400 million. For a young startup, that is a fabulous exit and was surely the right answer for them.
But are performance and capacity really the most pressing customer issues in data warehousing?
When you consider data growth, the desire for more analytics (which can require lots of data to sift through) and the performance needed to serve user requirements, it’s clear that this solution’s benefit for the data warehousing market is fast performance and increased capacity. EMC is apparently planning to create appliances as well. Yet customers already have many options for dealing with performance and capacity, including Kalido partners Teradata and Netezza, which play in this space and offer appliances.
The biggest problem we see in data warehousing isn’t just performance and capacity; it’s getting data warehouses up and running fast and then handling the inevitable change due to new requirements, new data sources and changes in the business. The EMC/Greenplum combination doesn’t help customers deal with this. Their customers will still build their warehouses the old-fashioned way – lots of hand-coded ETL that piles up as more data from more sources is added and business users continually demand more and different views. How will customers deal with thousands of ETL jobs? How long will it take to make the data warehouse current when a company makes an acquisition, sells a division, reorganizes a sales force, introduces new products and retires old ones, opens and closes stores, and moves off internally managed OLTP systems to cloud-based SaaS data and application providers? Who has the time and resources to untangle that hairball?
And that’s not even mentioning the master data challenges.
In the end, your data warehouse is valuable only as long as it keeps delivering business value to your organization. Because business isn’t static, the data warehouse can’t be static either. While it is great that EMC/Greenplum will provide another option for addressing the scale and performance issues in data warehousing, more attention needs to be paid to how fast those data warehouses can be built and how they are going to remain relevant as they age.
What do you think? Is another speed and scale answer what data warehousing needs now, or is handling change and keeping the warehouse agile the right question to address?