Working with Date and Time in RavenDB

The following is an article I wrote for the knowledge base on the RavenDB web site back in August of 2013. Recently, the site was revamped and the knowledge base was removed. Since the information is still relavent, I thought it would be worth republishing here. Enjoy!


Working with Date and Time in RavenDB

Handling Dates in JSON

In RavenDB, documents are stored in JSON format, but the JSON specification doesn't mention dates at all. It's generally left up to each implementation to decide what format should be used. Formatting is important, because everything is essentially a string.

Here are three common ways you might encounter dates in JSON format:

"1375559852"
"1375559852706"
  • As a raw number, this usually represents seconds or milliseconds since Jan 1, 1970 UTC.
  • It is usually not recommended, as it is very difficult for a human to read.
  • You should avoid using these in RavenDB. If for some reason you must use them, then be sure to use SortOptions.Long in your indexes, so that they are ordered properly.
"/Date(1375559852706)/"
"/Date(1375559852706-0700)/"
  • This string format wraps the numeric milliseconds value with some text, and optionally provides an offset.
  • It was invented by Microsoft, and is the default if you use DataContractJsonSerializer or the JavaScriptSerializer classes.
  • It was the default format used by ASP.Net MVC3, so it is sometimes refered to as an "ASP.Net JSON Date". It is no longer the default in MVC4.
  • These were used by early versions of RavenDB 1.0.
  • You should never use these in newer versions of RavenDB.
"2013-08-03T12:57:32.7060000"
"2013-08-03T12:57:32.7060000-07:00"
"2013-08-03T19:57:32.7060000Z"
  • These formats are part of a larger specification known as ISO8601.
  • They are human readable, and in a format that can't be misinterpreted.
  • They are interchangeable with other computer systems, and easily recognized.
  • They are lexicographically sortable when the offset is either not provided, or when all data shares the same offset.
  • This is the default format provided by the JSON.Net serializer, used by RavenDB and by WebAPI, MVC4 and others.
  • This is the format that RavenDB uses, and the recommended format anywhere else you may use date and time in JSON.
  • Specifically in RavenDB:

    • While ISO8601 allows a space to separate date and time parts, we always use the T separator.
    • Seconds and fractional seconds are optional in the ISO8601 spec, but we always specifiy seconds, and use 7 digits of precision for decimals.
    • Note that range checking would be fine with any level of precision, but equality checks would fail. While "2013-08-03T19:57:32.7060000Z" and "2013-08-03T19:57:32.706Z" represent the same moment in time, they are not the exact same string so they are not lexicographically equal.
    • The ISO8601 spec also allows for an offset to be presented as -0700, but we always use the form with the colon separator, -07:00.
    • Our format comes directly from dateTime.ToString("o") or dateTimeOffset.ToString("o") in .NET. This also aligns with RFC 3339.

Working with Date and Time in .Net

RavenDB is built on the Microsoft .Net Framework, which supplies several data types for working with date and time. When using the RavenDB .NET Client API, this is the most common way you will interact with date and time values.

It is very important to understand the behavior that these types present. Sometimes, the behavior is quirky or non-obvious. Here are some external articles that you can read describing just a few of the issues you may encounter:

While these issues are not anything special for RavenDB, they are probably things you will have to deal with when storing and retrieving date and time values.

Supported .Net Data Types

DateTime

The DateTime type is probably the most common you will encounter. It is used to store a date and a time. While this may sound simple, the context is also important to consider.

As an analogy, imagine if you had an integer called Distance and you put the number 2 into it. What does that represent? Is it 2 inches? meters? miles? The data type doesn't record this information, so is would be better to call your field something like DistanceInKilometers so it is meaningful.

Likewise, there are different types of values you might store in a DateTime field. This is represented by the .Kind property, and it is very important to pay attention to the kind of values you are working with. There are three possible kinds:

DateTimeKind.Utc
  • This represent an exact moment in time, in the UTC time zone.
  • UTC and GMT mean the same thing in this context.
  • No matter where in the world you are, values of this kind are unambiguous.
  • These values are serialized into RavenDB with a Z character at the end.
    • Example: "2013-08-03T19:57:32.7060000Z"
DateTimeKind.Unspecified
  • This represents a position on a calendar, and a time on a clock.
  • This is not the same thing as an exact moment in time, because you do not know whose calendar or clock it represents. Even when you have that information separately, you could misapply it due to daylight saving time rules.
  • Typically, these values have been localized to a particular time zone, but not necessarily any time zone that you happen to be using. It might align with the time zone of your user, but that would be handled in your own application logic. There is nothing in the data itself that will tell you what actual moment in time this aligns with.
  • They are typically obtained from the new DateTime(...) constructor, or by parsing a string of user input data with DateTime.Parse and similar methods.
  • You can use these in RavenDB, but they should be reserved for values that don't need to align to a precise universal moment in time.
  • For example, if a company says "On Monday April 1st 2013, all of our offices worldwide will open at 10:00 AM" - that is a value you would store with an Unspecified DateTime.
  • RavenDB stores these without any specifier at the end.

    • Example: "2013-08-03T12:57:32.7060000"
  • Because .Net does not have a separate Date type, sometimes you will use an unspecified DateTime where the time is midnight, but really you mean just the whole calendar date. So while you may only care about "2013-08-03", it is still stored in RavenDB as "2013-08-03T00:00:00.0000000". This is a limitation of .Net - not of JSON or RavenDB, and can be overcome by using the Noda Time library (see below). The important part is this: Once you start treating a DateTime as if it were just a date, never ever use the time portion for anything.

DateTimeKind.Local
  • This represents a position on the calendar and clock in use by the computer that the code happens to be running on.
  • It is commonly encountered with DateTime.Now.
  • While this may sound useful, it's fairly dangerous to use in server-side code. After all, you don't want the behavior or meaning of your data to change based on where you happen to deploy your application.
  • When serialized, this also gets the same representation as Unspecified kinds.

    • Example: "2013-08-03T12:57:32.7060000".
  • When deserialized, there is nothing in this string to indicate that it was originally local time. So it it will become Unspecified. RavenDB will not set it back to local kind for you, because there is no way it could know that is what you meant.

  • You should avoid working with local kinds whenever possible. Even in desktop applications, you are always better off using DateTimeOffset instead.
DateTimeOffset

The DateTimeOffset type represents a date and time, and includes an offset. The offset tells you exactly how the date and time values are related to UTC. Therefore, it can be used for both an exact moment in time, and for a position on a calendar and clock. For a better understanding of DateTimeOffset, please see the StackOverflow entry on DateTime vs DateTimeOffset.

  • In RavenDB, a DateTimeOffset is stored with its full ISO representation.

    • Example: "2013-08-03T12:57:32.7060000-07:00".
  • However - it is only stored that way in a document. When it is indexed, it is first converted to its UTC equivalent DateTime representation.

    • Example: "2013-08-03T19:57:32.7060000Z".
  • Notice in these examples, that it's not just the end of the string that is changing, but the hour is also changing as the offset is applied.

  • This is done specifically so that they are ordered lexicographically in the index. This enables querying and sorting.
  • The RavenDB .Net Client API will automatically manage the conversions between DateTimeOffset and UTC DateTime for you when you use the LINQ API, or when using the LuceneQuery methods that accept parameters.
  • Most of the time, you will not need to concern yourself with this translation. However there are a few scenarios where you have to deal with this:

    • If you are constructing your own Lucene queries from text - you will be bypassing this behavior. You must translate the inputs to UTC DateTime before adding them to your queries.
    • If you are projecting the index using ProjectFromIndexFieldsInto or similar mechanisms, then you will only have the UTC equivalent. This means you are still talking about the same moment in time, but you will find your DateTimeOffset values to have been adjusted to zero for the offset. (One way to work around this is to map the offset as a separate field, and apply it manually when returned.)
TimeSpan

The TimeSpan type is intended to represent a duration of time, such as "3 hours and 20 minutes". However, it is sometimes also used to represent the time-of-day without a date, in 24-hour format - such as "03:20". These really are two separate concerns, but there is no Time type provided by .Net - so this is an acceptable compromise. If distinguishing between these purposes is important to you, you can use the Noda Time library (see below).

When serialized by .Net using ToString(), a value greater to or equal than 24 hours is presented as a quantity of standard days. For example, 25 hours is serialized as "1.01:00:00", or "one day, one hour, zero minute and zero seconds".

The precision of a TimeSpan can be as small as 1 tick (or 100 nanoseconds), so it might have any number of decimals to the right. It can store values from slightly more than 10,675,199 days, to as low as negative 10,675,199 days. (A negative TimeSpan represents moving backwards in time.) Because of this, there is no possible string representation that is lexicographically sortable. That presents a challenge for RavenDB.

  • In RavenDB, TimeSpan values are stored in documents using their default string representation as described above.
  • But when indexed, the value of the .Ticks property is stored in the index as a long integer.
  • The RavenDB .Net Client API will translate this for you, so most of the time you will not have to do anything special.
  • But if you are building Lucene queries manually, you need to consider that range queries are going to to have to be translated. (The best advice is to avoid doing that.)

Working with "Now"

When you need to get the current time, you should never call DateTime.Now. This returns the time local to the computer you are running on, but throws away any information about the time zone or daylight saving time status that might be in effect.

For example, in the US Eastern Time zone, in 2013 daylight saving time ends on November 3rd at 2:00 AM local time. So the clock goes from 1:59:59 back to 1:00:00. If you were to call DateTime.Now during this period, you might get a value of "2013-11-03T01:00:00.0000000", but you have absolutely no way to tell which of the two different 1:00 AM times you were talking about. Querying RavenDB with this time might provide unexpected results.

Therefore, when querying an index you can use any of the following without issue:

  • DateTime.UtcNow
  • DateTimeOffset.UtcNow
  • DateTimeOffset.Now

Using any of these will query with an exact moment in time. In our example, it would be one of the following:

  • "2013-11-03T05:00:00.0000000Z" for EDT (UTC-4)
  • "2013-11-03T06:00:00.0000000Z" for EST (UTC-5)

Note, if you are writing a RavenDB bundle, you should use the Raven.Abstractions.SystemTime.UtcNow abstraction, as it can be mocked for easier testing.

You can't say "Now" in an Index

Don't try to use any "now" equivalent in an index definition. You can't. It would be dangerous, since the value of "now" is constantly changing, and indexes are only built when data is stored or changed. Newer versions of RavenDB will prevent you from doing this.

An example scenario would be a map-reduce an index that stays up to date with your top customers "over the past 30 days". It would need a "now" to determine the period to calculate with. There have been some attempts at a workaround, but it is ill-advised since it can lead to heavy pressure on the index at peak times. Consider any attempt at this functionality to be highly experimental and not recommended or supported.

Instead, you might consider building an index that calculates daily totals, for every day. You could then query for the past 30 days of data. You can sum the values to obtain the same results. Of course, deciding what a "day" is, leads into our next topic about time zones.

Time Zone Conversions

It is common for humans to think in terms of "days", especially in business. We use words like "today, "tomorrow", "yesterday", and sometimes we qualify them like "Wednesday", or "December 25th 2013". But these are all positions on a calendar - not unique moments in time. In order to make these values useful and accurate, we need to consider the context. This is typically discussed in terms of "time zones".

Let's take the example of an online ordering system. Customers from all over the world place orders, and we store a timestamp called OrderPlaced with each order. If we're doing things properly, the timestamp will either be a DateTime with UTC kind, or it will be a DateTimeOffset. When the order is placed, we'll obtain the timestamp with one of the three options discussed earlier in the section "working with now".

Now our boss asks to add some functionality into the back-end of this system that will show order totals by day. Immediately we have a problem. The question to ask your boss: "Whose day are you talking about?"

Of course, your boss may not understand the question, as humans can be horribly self-centered. Show him or her the visualization at everytimezone.com. This isn't really every time zone, but it does have many of them, and demonstrates how we experience time differently depending on where we are in the world.

There are a few different types of responses you might get:

  1. Our customers are all over the world, so we'll conform to UTC for our reporting purposes.
  2. Our customers are all over the world, so give us back totals by their time zone.
  3. Our company headquarters is in New York, so give us totals for each "New York" day.
  4. Our company has offices in New York, Los Angeles, London and Singapore. Give each office their own reports and totals accordingly.

Option #1 is the easiest. In a map-reduce index, group by the UTC calendar date. It will be OrderPlaced.Date if you are using a UTC DateTime, or will be OrderPlaced.UtcDateTime.Date if you are using a DateTimeOffset. Be careful, if you are using a DateTimeOffset and you just call OrderPlaced.Date - you will be talking about the local date, not the UTC date.

Option #2 can be easy, but only if we store our OrderPlaced using a DateTimeOffset. In a client-server application, we could populate that field using DateTimeOffset.Now, but since this is an online ordering system it is likely that our business logic is on a web server. Calling DateTimeOffset.Now would give the offset of the server, not our customer. Somewhere in our application we'll need the time zone of our customer so that we can save a DateTimeOffset where the offset is correct for the customer's time zone.

If we don't want to ask the user for their time zone, another way would be to build the full ISO string for the DateTimeOffset in the browser. This would have to be done in JavaScript. There are many ways to do this, but the very easiest way that is fully compatible with all browsers is with the moment.js library, using moment().format().

Once we have a DateTimeOffset stored, we can create a map-reduce index and group by OrderPlaced.Date. This will put each order in a group with a logical calendar date, but that date doesn't really represent anyone in particular's local calendar.

A different approach to #2 would be to store OrderPlaced as a UTC DateTime, and to store the customer's time zone in a separate document. You could convert the order time to the customer's time zone in the index map, similar to option #3 below. (You would have to use LoadDocument to pull in the customer's time zone.) I'm not going to show an example of this, because I don't recommend it. The problem is that if the customer changes their time zone, it will change the output of the index even on past data. Since an order is placed at a particular point in time, the time zone should be fixed at that moment. The DateTimeOffset approach does that for us implicitly.

Option #3 is very common. We'll still want to group by a whole date, but it is important to first translate the OrderPlaced timestamp to the time zone we are interested in. This can be done with the TimeZoneInfo class in our index map. For example:

Map = salesOrders =>  
    from salesOrder in salesOrders
    select new {
        Date = TimeZoneInfo.ConvertTimeBySystemTimeZoneId(
                               salesOrder.OrderPlaced,
                               "Eastern Standard Time")
                           .Date,
        Total = salesOrder.Total
}

Now when group by the Date field in the reduce part of our index, we are aligned with the US Eastern time zone. Note that "Eastern Standard Time" is the value from TimeZoneInfo.Id that we want the results in. Don't get fooled by the name, it represents both EST and EDT.

Option #4 is less common, but it does occur. We'll follow the example set in option #3, but since the dates will be used as keys in the map-reduce field we can't just add additional time zones as new fields in the same index. Instead, we'll have to create separate indexes for each time zone.

So instead of just SalesOrders/ByDay, you would have separate indexes:

  • SalesOrders/ByDay/ForNewYork
  • SalesOrders/ByDay/ForLosAngeles
  • SalesOrders/ByDay/ForLondon
  • SalesOrders/ByDay/ForSingapore

In your application logic, you would have to determine which index was appropriate to query.

Noda Time Support

The API offered by .Net via the DateTime, DateTimeOffset, TimeSpan and TimeZoneInfo classes is fully functional, but it can also be a bit confusing. There are many nuances to understand, and many places where you could do things wrong if you are not careful.

Fortunately, the open-source community has put together a better offering called Noda Time. It offers date types that cannot be misinterpreted, such a Instant, LocalDateTime and ZonedDateTime. There are also separate types for LocalTime, Duration, Offset and Period, rather than just one TimeSpan type. It also has a type LocalDate which represents a date without a time - something that doesn't exist in .Net.

Another key reason to use Noda Time is for improved time zone support. The TimeZoneInfo class is limited to Microsoft Windows time zones, while much of the rest of the world uses IANA/Olson time zones. Noda Time supports both. For more information, see the timezone tag wiki on StackOverflow.

While you don't have to do anything special to use Noda Time in your application logic, if you would like to use Noda Time types in your entities that get stored in RavenDB - then you need support for serialization of these types. You can find this support in the RavenDB-NodaTime integration library.

Additional Help

Like always, you can ask for help in the RavenDB Google Group. If you have date/time related questions that are not about RavenDB, please post to StackOverflow using the [datetime] and/or [timezone] tags.