Happy New LEAP Year!
If you haven't realized it yet, 2016 is a leap year. For most people, this may just be an interesting oddity: an extra day to work or play. But for developers, the leap year can cause significant pain. It's January 1st as I'm writing this. If you are just now thinking about checking your code for leap year bugs, you had better move quickly. In fact, you are probably already experiencing the effects, and may not even realize it!
"Meh," you say. "My code is just fine. We have unit tests."
"Oh really", I say. "Do your tests properly mock the clock? Do they test edge cases including February 29th and December 31st? Have you tested any low-level C++ code you might have as well as the rest of your system? Do you even know what a leap year bug looks like?" Most often, blank stares.
There's a lot to cover here, so let's start with the most important things first.
- February 29th is not the only day affected by the leap year. Another very important date is December 31st, because it is the 366th day of the year and many applications mistakenly hard-code a year as 365 days.
- Leap year bugs can be found anywhere, but are most dangerous in C / C++ code, where they can cause application crashes or buffer overflows (which are a security risk).
- Past leap years have included some high-impact, high-profile bugs, such as:
  - The 2012 Microsoft Azure outage - where miscalculation of a certificate expiration date caused service disruptions for up to 12 hours.
  - The 2010 Sony PlayStation Network outage - caused by misidentification of 2010 as a leap year.
  - The 2008 bricking of Microsoft Zune 30 devices - caused by a logic error on December 31st.
  - The 2008 Microsoft Exchange management bug, which prevented administrators from doing much of anything on February 29th.
  - Lotus 1-2-3's miscalculation of year 1900, which still impacts Microsoft Excel today, over 30 years later!
These are just the big ones that made the news. I'm sure thousands more occurred with varying degrees of impact and noticeability.
The two most dangerous leap year bugs
#1: Adding or subtracting years in C / C++
In C / C++ code that uses the Win32 API, the SYSTEMTIME structure is a common representation of civil time. It has distinct fields for each part of a date, separating the year, month, and day values (and others). It is very common to see the following code:
SYSTEMTIME st; // declare a SYSTEMTIME variable
GetSystemTime(&st); // set it to the current date and time
st.wYear++; // increment it by one year
This code will succeed without error. However, the risk is that if the code is called on February 29th, the resulting value will still be on February 29th, but in a non-leap year. For example, 2016-02-29 + 1 year = 2017-02-29, which does not exist!
This value might be passed around quite a bit before it ultimately ends up as a parameter to another function, such as SystemTimeToFileTime, where it will cause the function to fail with a return value of zero. Unfortunately, it is extremely common to find code that calls this function without checking the return value. This can lead to unpredictable results, such as leaving a FILETIME value in its uninitialized state.
- Always check the status result of Win32 functions, especially SystemTimeToFileTime.
- Correctly add a year to a SYSTEMTIME by checking the validity of the result and adjusting when necessary:
SYSTEMTIME st; // declare a SYSTEMTIME variable
GetSystemTime(&st); // set it to the current date and time
st.wYear++; // increment it by one year
// check to see if it's a leap year
bool leap = st.wYear % 4 == 0 && (st.wYear % 100 != 0 || st.wYear % 400 == 0);
// If it's Feb 29th, but it's not a leap year, then move back to Feb 28th
st.wDay = st.wMonth == 2 && st.wDay == 29 && !leap ? 28 : st.wDay;
Note that a similar bug can also occur in standard C++ (non-Windows) code. There, the tm struct is used instead of SYSTEMTIME, and it behaves slightly differently. Months are 0-11 instead of 1-12, so February is month 1. Instead of SystemTimeToFileTime, you might call mktime, timegm, or (in the Microsoft C runtime) _mkgmtime to produce a time_t value. The key difference, though, is that instead of failing when passed February 29th in a non-leap year, it will produce a value that represents March 1st. Your application may be expecting February 28th, and if so it will need to adjust.
#2: Declaring an array of values for each day of the year
int items[365];
items[dayOfYear - 1] = x;
The above C code could just as easily be rewritten in C# or another language, or use a string or some other data type instead of an integer. The key point is that we're declaring a fixed-size array to hold data, and assuming that there will be a location in the array for every day of the year. The problem, of course, is that in a leap year, there will not be a place in the array for the 366th day, December 31st.
The effects of this vary considerably by language. In C#, this will cause an IndexOutOfRangeException. In C, where array accesses are not bounds-checked unless your toolchain adds such checks, this writes past the end of the array: a buffer overflow whose effects could be negligible or considerable. JavaScript developers have less to worry about in this regard, as a 366th element will be added automatically.
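Whatever the language, one way to avoid the hard-coded 365 is to derive the number of slots from the year itself. The JavaScript sketch below only illustrates that idea; the helper names daysInYear, dayOfYear, and createDailySlots are my own, not part of any library, and the code works in UTC to keep daylight saving time out of the picture.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Number of days in a given year (365 or 366), derived from the calendar
// rather than hard-coded. UTC days are always exactly 24 hours long.
function daysInYear(year) {
  return (Date.UTC(year + 1, 0, 1) - Date.UTC(year, 0, 1)) / MS_PER_DAY;
}

// 1-based day of the year for a UTC calendar date (month is 1-12 here).
function dayOfYear(year, month, day) {
  return (Date.UTC(year, month - 1, day) - Date.UTC(year, 0, 1)) / MS_PER_DAY + 1;
}

// Size the per-day storage from the year instead of assuming 365.
function createDailySlots(year) {
  return new Array(daysInYear(year)).fill(0);
}

const items = createDailySlots(2016);
items[dayOfYear(2016, 12, 31) - 1] = 42; // index 365, the 366th slot
```

The same approach carries over to C: compute the element count from the year (or simply allocate 366 slots) before indexing by day of year.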
Data Filtering Issues
There are other effects of leap year bugs that can affect data anywhere from February 28th of the prior year, to March 1st of the following year. Usually these are in data filtering, where a range query doesn't account for the extra day - either by assuming a year is always 365 days, or assuming February is always 28 days. Consider a SQL statement such as:
SELECT AVG(Total) as AverageOrder, SUM(Total) as GrandTotal
FROM Orders WHERE OrderDate >= @startdate AND OrderDate < @enddate
This query is fine, but consider what happens if @enddate is set to today, and @startdate is set to today minus 365 days. If the range happens to include the February 29th leap day, then it's not covering an entire year. The start date is one day short, and thus the values are incorrect, assuming the intent was indeed to represent one year's worth of data.
When evaluating bugs like this one, ask yourself what the impact of the bug is. In this case, where are these values displayed? For example, if the average order amount is feeding a chart on a dashboard that gets updated every day, it might not be quite as important as when the total sales for the year is listed in a company's financial report such as an SEC filing. Of course, this assessment requires someone familiar with the application and its usage; there is no one-size-fits-all rule to follow.
It may be tempting to solve this problem with an approach such as this:
TimeSpan oneYear = TimeSpan.FromDays(DateTime.IsLeapYear(endDate.Year) ? 366 : 365);
DateTime startDate = endDate - oneYear;
However, this approach is flawed. One cannot determine the number of days to add just by evaluating the year alone. Consider that endDate could be 2016-01-01, and though 2016 is a leap year, there are only 365 days to subtract to reach 2015-01-01. Instead, you have to consider whether or not the February 29th leap day is included in the range. This leads to some fairly complex code if you try to do it by hand, especially when you consider covering multiple years instead of just one.
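To make the point concrete, here is a small sketch in JavaScript (chosen only because it is easy to run; the arithmetic is the same in any language). The minusDays and isoDate helpers are hypothetical, and the dates are handled in UTC so that daylight saving time doesn't interfere.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Subtract a fixed number of 24-hour days from a date.
function minusDays(date, days) {
  return new Date(date.getTime() - days * MS_PER_DAY);
}

// Format a date as YYYY-MM-DD (UTC).
function isoDate(date) {
  return date.toISOString().slice(0, 10);
}

const jan1 = new Date(Date.UTC(2016, 0, 1)); // months are zero-based
const mar1 = new Date(Date.UTC(2016, 2, 1));

console.log(isoDate(minusDays(jan1, 365))); // 2015-01-01 (no Feb 29th in range)
console.log(isoDate(minusDays(mar1, 365))); // 2015-03-02 (one day short)
console.log(isoDate(minusDays(mar1, 366))); // 2015-03-01 (Feb 29th is in range)
```

Both end dates fall in the leap year 2016, yet they need different offsets, which is why looking at the year of the end date alone cannot work.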
Ultimately, it comes down to the fact that TimeSpan in .NET (and similar types in other languages) is a representation of absolute time, and both "year" and "month" are units of civil time. The absolute amount of time in a year or month is variable depending on which years or months you are describing. (The same can actually be said for a "day" when you consider daylight saving time, though that's getting off-topic for this post.)
The correct solution for .NET is:
DateTime startDate = endDate.AddYears(-1);
The AddYears method correctly implements all of the necessary logic to determine how many days to move forward, or backwards in the case of a negative value.
Adding a year in JavaScript
JavaScript developers really should be using moment.js for this, where it's as simple as:
var m = moment();
m.add(1, 'years');
However, some folks still like to do things the hard way so you'll often see this:
var d = new Date();
d.setFullYear(d.getFullYear() + 1);
The problem here is one I mentioned earlier. If today is February 29th of a leap year, then the resulting value will be March 1st. That may or may not be acceptable to you. Consider that for every other date, the result is in the same month as the original value. Also consider that your application may be expecting an end-of-month date instead of a start-of-month date.
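A quick way to see the rollover is to start from a fixed date instead of the current one. This is just an illustrative sketch using the standard Date API in local time:

```javascript
// Start from the leap day (months are zero-based, so 1 is February).
const leapDay = new Date(2016, 1, 29);

// 2017-02-29 does not exist, so the Date object normalizes it.
leapDay.setFullYear(leapDay.getFullYear() + 1);

console.log(leapDay.getMonth() + 1, leapDay.getDate()); // 3 1 (March 1st)
```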
Here is a function you can use to correctly add years in JavaScript without requiring a full library:
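function addYears(d, n) {
  var m = d.getMonth();                // remember the original month
  d.setFullYear(d.getFullYear() + n);  // may roll Feb 29th over to March 1st
  if (d.getMonth() !== m) {
    d.setDate(d.getDate() - 1);        // step back to February 28th
  }
}

// example usage (note that addYears modifies d in place)
var d = new Date();
addYears(d, 1);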
This implementation adds the years, and then checks to see if the rollover to March occurred or not, and compensates if it did. Again, do not try to implement this by figuring out exactly how many days to add - unless you really know what you're doing.
Other Common Mistakes
There are many other things developers get wrong related to the leap year, such as:
- Messing up the leap year algorithm. It's not just every four years. It's every four years as long as the year is not divisible by 100, unless it's divisible by 400. 1900 was not a leap year. 2000 was a leap year. 2100 will not be a leap year.
- Using an array of days in each month, where February has 28 days. When using such an array, you must account for the 29th day in a leap year. A better approach is to use a different array for leap years than for common years. Or better yet - use the APIs you have (when available) instead of trying to do the math yourself.
- Branching the code for leap years, and then not testing all code paths. For example, the code from the Zune bug has an IsLeapYear(year) branch at the top, which clearly was never tested.
- Using separate year, month, and day values without validating them. For example, you may have a UI with separate drop-down controls to pick each component. It's not enough to test that the day is valid within the month. You also have to consider the year.
- Using the average length of a year, such as 365.25, or 365.2425 days in date math. While this may be scientifically accurate, it is never appropriate for actual manipulation of civil time. At least, not if you care about accurate values. It's fine if you only need an approximation, but the associated time-of-day will likely be off in the result.
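Several of the points above fit in a few lines of code. The following JavaScript is a minimal sketch rather than a vetted library; isLeapYear, daysInMonth, and isValidDate are names I've made up for illustration, and months are 1-12 here (unlike the zero-based months of the built-in Date). Where your platform already offers these checks, prefer its APIs.

```javascript
// Gregorian rule: divisible by 4, except century years not divisible by 400.
function isLeapYear(year) {
  return year % 4 === 0 && (year % 100 !== 0 || year % 400 === 0);
}

// Separate month-length tables for common years and leap years.
const DAYS_COMMON = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31];
const DAYS_LEAP = [31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31];

// Number of days in a month (month is 1-12).
function daysInMonth(year, month) {
  return (isLeapYear(year) ? DAYS_LEAP : DAYS_COMMON)[month - 1];
}

// Validate year, month, and day together, e.g. from three drop-downs.
function isValidDate(year, month, day) {
  return Number.isInteger(year) && Number.isInteger(month) && Number.isInteger(day) &&
    month >= 1 && month <= 12 &&
    day >= 1 && day <= daysInMonth(year, month);
}

console.log(isLeapYear(1900), isLeapYear(2000), isLeapYear(2100)); // false true false
console.log(isValidDate(2016, 2, 29)); // true
console.log(isValidDate(2017, 2, 29)); // false
```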
How do I catch leap year bugs?
- Scrutinize your code carefully. Search for anything time related, and go over it with a fine-toothed comb.
- Make sure you have lots of unit tests, and know how to "mock the clock" (described in the next section).
- Test year-round, not just before leap years.
- Validate all inputs, including configuration.
- Validate results and complete scenarios. Have a failure strategy!
I'm often asked about two other approaches:
Static Code Analysis
It would be wonderful if there was just a set of tools you could run against your code that would point out where you had leap-year bugs. Unfortunately, I don't know of any. Simple string-search, or even regex search, can only get you so far.
Truly what is needed for .NET is a comprehensive set of Roslyn Analyzers that can catch common date/time bugs including leap year, time zone, daylight saving time, parsing, and more. Unfortunately, I have not the time to create such analyzers myself. Maybe I'll get to that at some point in the future, but it doesn't exist today.
It would also be nice to have similar tools for C++, JavaScript, and other languages, though I know of none.
Time Warp
Why not just move the clock forward and see what happens? That might actually work for some systems, but there are a few problems with this idea.
- Your unit tests might still not catch everything. You probably won't catch data filtering errors unless you actually look (manually) at every screen and report of your entire application. This is error prone for sure.
- You might develop a false sense of security, believing everything to be ok. Only when your customers call complaining on February 29th or March 1st will you realize how wrong you were.
- Many systems have to authenticate with domain servers, or use other authentication schemes that are time sensitive. Recognize that the Kerberos protocol has strict time sync requirements, with a default tolerance of 5 minutes. Also consider that SSL certificates, code signing certificates, and other security-related things are clock dependent and will fail if you try to lie about what time it really is.
So, in general, I recommend against this approach.
Mock The Clock
How do you test code that behaves differently on different dates? Mock the clock!
This is a common pattern found in many reliable systems. The key point is that the system clock - you know, the thing that tells you what time it is - should not be used haphazardly. Application logic should never make a direct call to DateTime.Now or DateTime.UtcNow or new Date() or GetSystemTime or whatever the equivalent is in your language to get the current date and time.
Instead, you should treat the clock as a service (in the DDD sense), and like any service, you should be able to mock it.
For example, in .NET, instead of directly calling DateTimeOffset.UtcNow (or similar APIs) from application logic:
- Create an interface IClock with a method GetCurrentTime that returns a DateTimeOffset.
- Create a SystemClock class that implements IClock, where GetCurrentTime calls DateTimeOffset.UtcNow.
- Create a FakeClock class that implements IClock, which accepts a fixed value as a constructor parameter and where GetCurrentTime simply returns the fixed value.
- In your application logic, only depend on the IClock interface. Typically this is constructor injected.
- Use a FakeClock during testing, and wire up a SystemClock at runtime.
This may sound like a lot of work, but once you go through the motions you'll see where it has advantages. This is truly the only way to ensure all of your code is tested when the current date and time are dependencies.
I intentionally did not provide code for this here, as the pattern should be the same for many different languages. Also, there's already a really great implementation of this in Noda Time, which comes with IClock and SystemClock in the main assembly, and FakeClock in the NodaTime.Testing assembly. You'd do well to use Noda Time for this, and many other reasons.
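That said, a rough sketch may help readers who have not seen the pattern before. The JavaScript below is only an illustration under my own assumptions: JavaScript has no interfaces, so the "IClock" contract is simply any object with a getCurrentTime() method, and isLeapDay is a made-up piece of application logic. It does not mirror Noda Time's API.

```javascript
// Production clock: the only place that reads the system time.
class SystemClock {
  getCurrentTime() {
    return new Date();
  }
}

// Test clock: always reports the instant it was constructed with.
class FakeClock {
  constructor(fixedTime) {
    this.fixedTime = new Date(fixedTime.getTime());
  }
  getCurrentTime() {
    // Return a copy so callers cannot mutate the stored value.
    return new Date(this.fixedTime.getTime());
  }
}

// Application logic depends on "a clock", never on new Date() directly.
function isLeapDay(clock) {
  const now = clock.getCurrentTime();
  return now.getUTCMonth() === 1 && now.getUTCDate() === 29;
}

// In a test, pin the clock to the date you want to exercise.
const leapDayClock = new FakeClock(new Date(Date.UTC(2016, 1, 29, 12)));
const newYearsEveClock = new FakeClock(new Date(Date.UTC(2016, 11, 31)));

console.log(isLeapDay(leapDayClock));     // true
console.log(isLeapDay(newYearsEveClock)); // false
```

At runtime, the same isLeapDay function would be handed a SystemClock instead, so the code under test and the code in production are identical.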
Conclusion
Leap year is here. It's not Y2K, or Y2038, but it is something we have to contend with on a regular basis. How much code did you write over the last four years? Are you sure it's all up to par? Take the time now to test and scan through your code. You will probably find a few things you didn't realize were lurking in the shadows.