When recently pondering how balance data should be modeled, I was reminded of my favorite Star Trek episode, The Balance of Terror. Those sneaky Romulans had crossed the neutral zone and were destroying Federation outposts. Naturally, only the Enterprise was close enough to intervene. The Enterprise was faster than the Romulan ship and agile enough to avoid their weapons, but the Romulan ship had a cloaking device that rendered it invisible. Though their capabilities were very different, each ship had distinct advantages—and either was capable of winning the day.
Balances need not be terrifying, but they can be tricky. At first glance a balance seems like any other transaction, but when you look deeper it also starts to resemble a reference data attribute. The most relevant definition for a balance I could find was from Investorworlds.com: “The amount of money in an account, equal to the net of credits and debits at that point in time for that account.” The tricky part is the “at that point in time” bit. Most transactions occur at single point in time; however, a balance represents every measured point in time until it changes. Which raises an important question: How do we represent that efficiently in a data warehouse?
Big data is delivering ever-increasing data volumes to data warehouses, and users are expecting more information to be delivered to them faster than ever. Both are challenging architects to deliver efficient solutions that optimize storage cost, query and load performance, design complexity, and system management. There are no absolutes—and the only right answers are the ones that balance these considerations best for your organization.
Having an agile data management environment helps balance these concerns, while presenting alternative solutions to challenges like managing balance data. Several options for managing balances in an agile environment are described below.
Record a balance snapshot – The current value of every balance is recorded at every measured period—daily, for example. This is the simplest design, but not the most efficient since balance transactions are loaded even when the values haven’t changed. As data volumes grow, so do storage costs and system management challenges.
Perform a balance calculation – The current value of every balance is recorded only at significant periods—at month-end, for example. The changes that occur between significant periods are also recorded, but as a separate type of transaction. An agile data management solution can utilize derived data objects to automate the balance calculation at any point in time, managing the complexity that accompanies the storage and performance gains of the option.
Define the balance as a time variant attribute – Instead of presuming that a balance must be a transaction, consider modeling it as reference data. Time-variant attributes generate small data sets because balances are only recorded when their values change. This approach introduces a bit of complexity around representing reference data attributes as transactions to reporting tools; however, this won’t be difficult if your data management solutions are agile.
Balances are tricky. They seem like transactions, but behave like reference data. Much like the starships from The Balance of Terror, there are well-matched alternatives to managing them. Like the Enterprise, performing a balance calculation utilizes the speed and flexibility of an agile data management process. And like the fictitious Romulan cloaking device, implementing balances as time-variant reference data is a cutting-edge approach made practical through agile data management. One thing is certain, the more agile your data management processes are, the better your options for managing balances will be.