All Things in Time

Matt Johnson-Pint

Mar 29, 2012 • 3 min read

Let's face it, we live in a world that is governed by time. We describe all kinds of things with time at different scales. At one end of the scale we might measure the age of the universe in billions of years. At the other end we might measure the movement of electrons in nanoseconds or picoseconds. And there are all kinds of scales in between.

As software developers, we are usually interested in scales ranging from a few years down to minutes, or sometimes seconds. Now, I am specifically talking about business software development and capturing information about time. This is quite different from game programming or performance testing, where we certainly do care about time down to millisecond precision. But for the sake of discussion, let's keep this focused on typical business domain concerns.

So if time is so important, why do we gloss over it so easily when modeling a domain? Let's consider a somewhat familiar business domain:

using System.Collections.Generic;

public class Author
{
    public int Id { get; set; }
    public string Name { get; set; }
    // Ids of the books this author has written.
    public IEnumerable<int> BookIds { get; set; }
}

public class Book
{
    public int Id { get; set; }
    public string Title { get; set; }
    public int AuthorId { get; set; }
}

public class Store
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Address { get; set; }
    public string Website { get; set; }
    // Ids of the books this store sells.
    public IEnumerable<int> BookIds { get; set; }
}

There are several things you may notice about this domain. Each of these classes is an aggregate, so each has its own operations and lives in its own repository. The Author and Store classes each keep a collection of BookIds. (Depending on the underlying database technology, this may or may not be the most effective way to model the data, but it is certainly valid from a domain perspective.) The Author's book list represents the books they've written, and the Store's book list represents the books they sell.

So what's wrong with this domain model? Everything! It lives in a fantasy universe where time does not exist! When were the books written? When were they put in the stores? Was the store always around? Did the store stop selling certain books? None of these questions can be answered by the above code. The best we can say is that the domain represents a snapshot of "now", and that we keep it as current as possible.

So many, many applications follow this paradigm implicitly. They only track the current data. So what's wrong with that? Well, for starters, it's not how things work in the real world. Books don't just instantly come into existence. They go through a process of being written, published, sold, and maybe discontinued. And stores don't just keep track of books they sell. They also need to know about books they're going to start selling in the future, books they used to sell in the past, and they need the system to keep track of these things in an efficient manner. Even the author could change. How? Perhaps the author got married and changed their name. Is that a different author? NO.
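To make the name-change case concrete, here is a minimal sketch of an author whose identity stays fixed while the name varies over time. It is only one possible shape, not the design this blog will settle on: the TemporalAuthor class, its ChangeName and NameAsOf methods, and the sample names and dates are all hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: one author identity, with a name that changes over time.
public class TemporalAuthor
{
    // Each entry maps the date a name took effect to that name,
    // kept sorted by effective date.
    private readonly SortedList<DateTime, string> _names =
        new SortedList<DateTime, string>();

    public int Id { get; }

    public TemporalAuthor(int id, string name, DateTime effectiveFrom)
    {
        Id = id;
        _names[effectiveFrom] = name;
    }

    // Records a new name without discarding the old one.
    public void ChangeName(string newName, DateTime effectiveFrom)
    {
        _names[effectiveFrom] = newName;
    }

    // Returns the name in effect on the given date, or null if the date
    // falls before the first recorded name.
    public string NameAsOf(DateTime date)
    {
        string result = null;
        foreach (var entry in _names)
        {
            if (entry.Key > date) break;
            result = entry.Value;
        }
        return result;
    }
}

public static class AuthorDemo
{
    public static void Main()
    {
        var author = new TemporalAuthor(1, "Jane Smith", new DateTime(2005, 1, 1));
        author.ChangeName("Jane Jones", new DateTime(2010, 6, 15));

        Console.WriteLine(author.NameAsOf(new DateTime(2008, 3, 1)));  // Jane Smith
        Console.WriteLine(author.NameAsOf(new DateTime(2012, 3, 29))); // Jane Jones
    }
}
```

The Id never changes, so books that point at AuthorId 1 still resolve to the same author regardless of which name was in effect when they were written.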

Almost every object we could possibly model in any domain has some sense of time. And no, just adding fields for DateCreated, DateUpdated, etc. does not solve the problem - those are usually just noise fields, and serve very little real purpose.

We can solve these domain concerns by introducing several different temporal patterns. I will spend much of this blog discussing the various patterns and how they work, but if you want to jump ahead, you can start reading Martin Fowler's pages on the subject of Effectivity and the Temporal Property and Temporal Object patterns.
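As a rough illustration of the effectivity idea (attaching a date range to a fact), the store's flat list of BookIds could become a list of dated listings. This is a sketch under my own assumptions rather than Fowler's reference implementation: Listing, TemporalStore, StartSelling, StopSelling and BookIdsAsOf are hypothetical names, and the range is treated as half-open, with a null end meaning "still in effect".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch: a book listing with an effectivity range [Start, End).
public class Listing
{
    public int BookId { get; }
    public DateTime Start { get; }
    public DateTime? End { get; private set; } // null = open-ended

    public Listing(int bookId, DateTime start)
    {
        BookId = bookId;
        Start = start;
    }

    public bool IsEffectiveOn(DateTime date) =>
        Start <= date && (End == null || date < End.Value);

    public void EndOn(DateTime end) => End = end;
}

public class TemporalStore
{
    private readonly List<Listing> _listings = new List<Listing>();

    // The start date may be in the future: the store already knows about the book.
    public void StartSelling(int bookId, DateTime from) =>
        _listings.Add(new Listing(bookId, from));

    // Closes any open listing for the book instead of deleting it.
    public void StopSelling(int bookId, DateTime on)
    {
        foreach (var listing in _listings.Where(x => x.BookId == bookId && x.End == null))
        {
            listing.EndOn(on);
        }
    }

    public IEnumerable<int> BookIdsAsOf(DateTime date) =>
        _listings.Where(x => x.IsEffectiveOn(date)).Select(x => x.BookId);
}

public static class StoreDemo
{
    public static void Main()
    {
        var store = new TemporalStore();
        store.StartSelling(42, new DateTime(2011, 1, 1));
        store.StopSelling(42, new DateTime(2012, 1, 1));
        store.StartSelling(7, new DateTime(2012, 6, 1)); // a future listing

        Console.WriteLine(string.Join(", ", store.BookIdsAsOf(new DateTime(2011, 6, 1)))); // 42
        Console.WriteLine(string.Join(", ", store.BookIdsAsOf(new DateTime(2012, 7, 1)))); // 7
    }
}
```

With this shape, "books we used to sell" and "books we are going to sell" are both answered by the same query with a different date.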

You might think that implementing these patterns on the domain classes would take care of things and we could call it done, but you would be wrong. Introducing temporal logic into the domain has many side effects. Most databases aren't designed to work well with temporal data. Although you can certainly work temporal concerns into your database schema (see here for example), you will quickly find that querying for information can be very difficult and inefficient.

To really address all temporal and bitemporal concerns, we have to consider the entire system, from the user interface, to the domain model, and through to the database. We also need to think about how temporal concerns are addressed in patterns such as CQRS and Event Sourcing. In fact, these last two can greatly simplify the temporal problem, as time is much more easily associated with behavior than with state. Think about it: you always "do something" at a specific time, and that behavior can be captured as commands and events.
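The point about behavior carrying its own time can be sketched with a few timestamped events. This is a deliberately small illustration, not a full Event Sourcing setup: the StoreEvent hierarchy and the StoreProjection.BooksOnSale helper are hypothetical names, and the events are held in memory rather than in an event store.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch: every event records when the behavior happened.
public abstract class StoreEvent
{
    public DateTime OccurredAt { get; }
    public int BookId { get; }

    protected StoreEvent(DateTime occurredAt, int bookId)
    {
        OccurredAt = occurredAt;
        BookId = bookId;
    }
}

public sealed class BookAddedToStore : StoreEvent
{
    public BookAddedToStore(DateTime occurredAt, int bookId) : base(occurredAt, bookId) { }
}

public sealed class BookRemovedFromStore : StoreEvent
{
    public BookRemovedFromStore(DateTime occurredAt, int bookId) : base(occurredAt, bookId) { }
}

public static class StoreProjection
{
    // Replays events up to and including asOf, in time order,
    // to rebuild the set of books on sale at that moment.
    public static SortedSet<int> BooksOnSale(IEnumerable<StoreEvent> events, DateTime asOf)
    {
        var books = new SortedSet<int>();
        foreach (var e in events.Where(x => x.OccurredAt <= asOf).OrderBy(x => x.OccurredAt))
        {
            if (e is BookAddedToStore) books.Add(e.BookId);
            else if (e is BookRemovedFromStore) books.Remove(e.BookId);
        }
        return books;
    }
}

public static class EventDemo
{
    public static void Main()
    {
        var history = new List<StoreEvent>
        {
            new BookAddedToStore(new DateTime(2011, 1, 10), 1),
            new BookAddedToStore(new DateTime(2011, 5, 1), 2),
            new BookRemovedFromStore(new DateTime(2012, 2, 1), 1),
        };

        Console.WriteLine(string.Join(", ",
            StoreProjection.BooksOnSale(history, new DateTime(2011, 12, 31)))); // 1, 2
        Console.WriteLine(string.Join(", ",
            StoreProjection.BooksOnSale(history, new DateTime(2012, 3, 29))));  // 2
    }
}
```

Nothing is ever overwritten here: the current state is just the replay up to "now", and any earlier state is the same replay stopped at an earlier date.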

I will be blogging much more on this subject (and others), and including real-world cases with concrete examples in C#. Please check my blog often, subscribe to the RSS feed, or follow me on Twitter @mj1856.

See you next time. :)
