Why Don't We Check Our Math?
One of life’s little mysteries is why so few traditional mainstream languages support catching overflows in fixed-width arithmetic types. Java, for all its concerns about bounds checking and memory errors, doesn’t really provide any mechanism for catching overflows. C’s view of the matter is to make all unsigned math wrap around and to leave the signed case literally undefined. C++ did nothing to improve upon C’s behavior. It’s just a mess.
One could perhaps make the argument that these kinds of errors rarely show up, but I see them all the time when I review code.
I can’t count how often I’ve seen code like this:
size_t buffer_size;
...
/* skip on down to the evil stuff */
unsigned char *iter = buffer;
int c;  /* int rather than unsigned char, so the comparison with EOF works */
while ((c = getc(file)) != EOF) {
    *iter++ = (unsigned char)c;
    if ((size_t)(iter - buffer) == buffer_size) {
        buffer_size += buffer_size;  /* wraps around once buffer_size > SIZE_MAX/2 */
        buffer = realloc(buffer, buffer_size);
        iter = buffer + buffer_size / 2;  /* realloc may have moved the block */
    }
}
All is well and good unless buffer_size ever gets to be greater than SIZE_MAX/2, at which point the doubling wraps around and suddenly you are writing off into la-la land. Yeah, that’d mean realloc() would have to succeed in allocating more than SIZE_MAX/2 bytes, but with our modern systems still primarily running 32-bit code, despite having multiple gigs of memory, this isn’t exactly unheard of. Code like this can be found everywhere. Heck, if you check back a few generations of GNU’s corelib functions you’ll find something almost exactly like the above.
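For comparison, here is a minimal sketch of the same doubling step with the overflow check written out by hand. The helper name grow_buffer and the starting capacity of 16 are made up for illustration; this is not taken from any particular library.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: doubles the capacity of a malloc'd buffer, refusing to
// grow when doubling would wrap around SIZE_MAX. On failure it returns false
// and leaves both the buffer and the capacity untouched.
bool grow_buffer(unsigned char*& buffer, std::size_t& capacity) {
    if (capacity > SIZE_MAX / 2) {
        return false;  // capacity * 2 would not fit in size_t
    }
    // An empty buffer gets an arbitrary starting size so doubling makes progress.
    const std::size_t new_capacity = (capacity == 0) ? 16 : capacity * 2;
    void* grown = std::realloc(buffer, new_capacity);
    if (grown == nullptr) {
        return false;  // allocation failed; the old block is still valid
    }
    buffer = static_cast<unsigned char*>(grown);
    capacity = new_capacity;
    return true;
}
```

The check has to happen before the multiplication. Testing the result afterwards is too late, because unsigned wraparound is silent.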
Statically typed functional programming languages tend to handle this issue through boxed types. Dynamically typed languages tend to handle it by simply checking for overflow and automagically promoting to wider and wider arithmetic types whenever an overflow occurs. Both approaches are decent approximations of an ideal solution, but they are both responses to a problem that mainstream languages seem to have their heads in the sand about.
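To make the promotion idea concrete, here is a rough sketch of a single promotion step, from 32 bits to 64 bits. Real dynamic languages keep going to arbitrary precision; the type PromotedInt and the function add_promoting are hypothetical names used only for this illustration.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical tagged result: `wide` records whether the sum had to be
// promoted out of the 32-bit range.
struct PromotedInt {
    bool wide;
    std::int64_t value;
};

PromotedInt add_promoting(std::int32_t a, std::int32_t b) {
    // Add in the wider type, where two 32-bit operands cannot overflow.
    const std::int64_t sum = static_cast<std::int64_t>(a) + b;
    const bool fits = sum >= std::numeric_limits<std::int32_t>::min() &&
                      sum <= std::numeric_limits<std::int32_t>::max();
    return PromotedInt{!fits, sum};
}
```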
I’ve mentioned this to some people, and have received comments like “well, C is very close to the metal, so they want to expose you to how the CPU does the math.” Great! Most CPUs have an overflow flag just waiting to let you know that all hell has broken loose, so surely C takes advantage of this? ;-)
The reality is that with a simple check of a register value, we can save ourselves a ton of bugs. This is a really cheap safety feature that one could always disable in performance-sensitive code that has been carefully reviewed.
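As a sketch of what such a check can look like in practice, GCC and Clang provide a builtin, __builtin_add_overflow, that reports whether an addition overflowed. It is a compiler extension rather than standard C++ and is not something the original code used. On x86 it is typically compiled to an add followed by a branch on the overflow flag. The wrapper name checked_add is hypothetical.

```cpp
#include <stdexcept>

// Assumes GCC or Clang: __builtin_add_overflow is a compiler extension.
// It stores the wrapped sum in `result` and returns true when the
// mathematically correct sum did not fit in the result type.
int checked_add(int a, int b) {
    int result = 0;
    if (__builtin_add_overflow(a, b, &result)) {
        throw std::overflow_error("integer overflow in checked_add");
    }
    return result;
}
```

Carefully reviewed hot paths can keep using plain `+`, so the cost is only paid where the check is wanted.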
What brought this to mind was that I was dusting off some old code that I’ve recovered from a crashed drive, and I found an old project of mine called “checkedmath” which addressed this gap in C++. C++, for all of its shortcomings, provides just enough support for metaprogramming that you can generally come up with a way to address a lot of them in code. In this case, I added overflow checking by taking advantage of operator overloading. I’m going to polish it off a bit before posting it, but the basics look something like this:
#include <limits>

// arithmetic_bounds_exception comes from the rest of the project (not shown here).
template <typename T>
struct CheckedNumber {
    // Returns a reference so that the result refers to this object, not a copy.
    CheckedNumber<T>& operator+=(const T aNumber) {
        if (value >= 0) {
            // Positive overflow: aNumber exceeds the headroom left below max().
            if ((std::numeric_limits<T>::max() - value) < aNumber) {
                throw arithmetic_bounds_exception(*this, aNumber, "+");
            }
        } else if ((std::numeric_limits<T>::min() - value) > aNumber) {
            // Negative overflow: aNumber reaches below min().
            throw arithmetic_bounds_exception(*this, aNumber, "+");
        }
        value += aNumber;
        return *this;
    }
private:
    T value;
};
Now, that doesn’t take advantage of the hardware’s overflow detection, but my plan was always to get out a generic version that could pretty much work on any platform and then write some more efficient specializations in inline assembler (if I ever got around to re-bootstrapping my assembly programming knowledge) at a later date. The actual code is more generic than the above (probably more than it needs to be really), but you get the idea.
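For anyone who wants to try the idea without the rest of the project, here is a self-contained variant of the snippet above. It makes several assumptions that are not in the original: std::overflow_error stands in for arithmetic_bounds_exception, a constructor and a get() accessor are added so the class can be used on its own, and T is assumed to be an integer type.

```cpp
#include <limits>
#include <stdexcept>

// Stand-alone sketch; assumes T is an integer type.
template <typename T>
class CheckedNumber {
public:
    explicit CheckedNumber(T initial) : value(initial) {}

    CheckedNumber& operator+=(const T aNumber) {
        if (value >= 0) {
            // Positive overflow: aNumber exceeds the headroom left below max().
            if ((std::numeric_limits<T>::max() - value) < aNumber) {
                throw std::overflow_error("overflow in +=");
            }
        } else if ((std::numeric_limits<T>::min() - value) > aNumber) {
            // Negative overflow: aNumber reaches below min().
            throw std::overflow_error("overflow in +=");
        }
        value += aNumber;  // safe: the checks above ruled out overflow
        return *this;
    }

    T get() const { return value; }  // hypothetical accessor, added for the demo

private:
    T value;
};
```

Both comparisons are arranged so that the subtraction in the check can never overflow itself, which is the part that is easy to get wrong when writing these tests by hand. When the check throws, the stored value is left unchanged.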
The reason I never finished this project was that after I figured out how to do it right, it occurred to me that surely someone else had already done the same thing. It is now a year later and I have yet to see anything like this. So, I’m going to throw it out to the blogosphere: anyone seen anything like this?
UPDATE: Apparently VB does handle overflow.
UPDATE: Looks like Microsoft has SafeInt. It doesn’t do boxed types and lacks optimizations, but it’s still a good start. I may still push my CheckedNumber implementation out at some point, but at least there is a semi-decent implementation of checked arithmetic out there.
“VB has these type promotion tables for their operators, which while dazzling, fail to prevent overflow.”
In every version of VB I have used, if you give the variable a specific type (say Integer or Single), it will throw overflow errors as expected.
C# at least has the option for overflow checking, though I think Microsoft was brain dead for turning it off by default.
One of the reasons I won’t use Java for business applications is that I cannot figure out how to check for overflows. I don’t think it is even possible.
Yes, this has been done. See SafeInt:
http://tinyurl.com/ql9pm
Original URL: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dncode/html/secure05052005.asp
You might find this interesting:
http://www.deez.info/sengelha/blog/2004/01/16/implemented-integer-overflow-class/