Floating-point exceptions and the CLR
I’ve decided to share some of our experiences during the development of our math and statistics libraries in the hope that they may contribute to improvements in the .NET platform as the next version is being designed.
The CLR is a general-purpose runtime environment, and cannot be expected to support every application at the fastest possible speed. However, I do expect it to perform reasonably well, and if a performance hit can be avoided, then it should be. The absence of any floating-point exception mechanism incurs such a performance hit in some fairly common situations.
As an example, let’s take an implementation of complex numbers. This is a type for general use, and has to give accurate results whenever possible. For obvious reasons, we want the core operations to be as fast as possible. This means we want to inline when we can, and make our code fast, too. Most operations are fairly straightforward, but division isn’t. Let’s start with the ‘naïve’ implementation:
struct Complex
{
    private double re, im;

    public Complex(double re, double im) { this.re = re; this.im = im; }

    public static Complex operator /(Complex z1, Complex z2)
    {
        double d = z2.re * z2.re + z2.im * z2.im;
        double resultRe = z1.re * z2.re + z1.im * z2.im;
        double resultIm = z1.im * z2.re - z1.re * z2.im;
        return new Complex(resultRe / d, resultIm / d);
    }
}
If any of the values d, resultRe, and resultIm underflow, the result will lose accuracy, because subnormal numbers by definition don’t have the full 53-bit precision of a normal double. The CLR also offers no indication that underflow has occurred. This can be fixed, mostly, by modifying the above to:
public static Complex operator /(Complex z1, Complex z2)
{
    if (Math.Abs(z2.re) > Math.Abs(z2.im))
    {
        // Scale by the larger component (the real part) of the divisor.
        double t = z2.im / z2.re;
        double d = z2.re + t * z2.im;
        double resultRe = (z1.re + t * z1.im);
        double resultIm = (z1.im - t * z1.re);
        return new Complex(resultRe / d, resultIm / d);
    }
    else
    {
        // Scale by the larger component (the imaginary part) of the divisor.
        double t = z2.re / z2.im;
        double d = t * z2.re + z2.im;
        double resultRe = (t * z1.re + z1.im);
        double resultIm = (t * z1.im - z1.re);
        return new Complex(resultRe / d, resultIm / d);
    }
}
This will give accurate results in a larger domain, but is slower because of the extra division. Worse still, some operations that one would expect to give exact results now aren’t exact. For example, if z1 = 27-21i and z2 = 9-7i, the exact result is 3, but the round-off in the division by 9 destroys the exact result.
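To make the trade-off concrete, here is a minimal sketch that runs both formulas side by side. It is not the library code: the helper names DivideNaive and DivideScaled are made up for this illustration, complex numbers are modelled as (Re, Im) tuples, and it assumes a recent C# compiler (top-level statements) on a runtime that evaluates double arithmetic in strict 64-bit precision.

```csharp
using System;

// Hypothetical helpers for illustration; a complex number is a (Re, Im) tuple.

// Straightforward formula: fast, but the squares and products can underflow.
static (double Re, double Im) DivideNaive((double Re, double Im) z1, (double Re, double Im) z2)
{
    double d = z2.Re * z2.Re + z2.Im * z2.Im;
    double re = z1.Re * z2.Re + z1.Im * z2.Im;
    double im = z1.Im * z2.Re - z1.Re * z2.Im;
    return (re / d, im / d);
}

// Scaled formula: one extra division, but no squaring of the divisor.
static (double Re, double Im) DivideScaled((double Re, double Im) z1, (double Re, double Im) z2)
{
    if (Math.Abs(z2.Re) > Math.Abs(z2.Im))
    {
        double t = z2.Im / z2.Re;
        double d = z2.Re + t * z2.Im;
        return ((z1.Re + t * z1.Im) / d, (z1.Im - t * z1.Re) / d);
    }
    else
    {
        double t = z2.Re / z2.Im;
        double d = t * z2.Re + z2.Im;
        return ((t * z1.Re + z1.Im) / d, (t * z1.Im - z1.Re) / d);
    }
}

// Case 1: (27 - 21i) / (9 - 7i). Every intermediate of the naive formula is a
// small integer, so it returns exactly 3 + 0i. The scaled formula first rounds
// -7/9, so its result may differ in the last bits.
Console.WriteLine(DivideNaive((27.0, -21.0), (9.0, -7.0)));
Console.WriteLine(DivideScaled((27.0, -21.0), (9.0, -7.0)));

// Case 2: tiny operands. The squares in the naive formula underflow to zero,
// so it computes 0/0 and returns NaN. The scaled formula returns 1 + 0i.
Console.WriteLine(DivideNaive((1e-200, 1e-200), (1e-200, 1e-200)));
Console.WriteLine(DivideScaled((1e-200, 1e-200), (1e-200, 1e-200)));
```

The second case is an extreme one (complete rather than gradual underflow), chosen because the failure is easy to see; the accuracy loss from subnormal intermediates is subtler but has the same cause.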
IEEE-754 exceptions would come to the rescue here – if they were available. Exceptions (a term with a specific meaning in the IEEE-754 standard - not to be confused with CLR exceptions) raise a flag in the FPU’s status register, and can also be trapped by the operating system. We don’t need a trap here. We can do what we need to do with a flag. The code would look something like:
public static Complex operator /(Complex z1, Complex z2)
{
    // FloatingPoint is a hypothetical API; the CLR offers nothing like it.
    FloatingPoint.ClearExceptionFlag(
        FloatingPointExceptionFlag.Underflow);
    double d = z2.re * z2.re + z2.im * z2.im;
    double resultRe = z1.re * z2.re + z1.im * z2.im;
    double resultIm = z1.im * z2.re - z1.re * z2.im;
    if (FloatingPoint.IsExceptionFlagRaised(
        FloatingPointExceptionFlag.Underflow))
    {
        // Code for the special cases.
    }
    return new Complex(resultRe / d, resultIm / d);
}
Note that the CLR strategy to “continue with default values” won’t work here, because complete underflow is defaulted to 0, which cannot be distinguished from the common case when the result is exactly zero (and therefore no special action is required). The only way to do it right would be a whole series of ugly comparisons, which would make the code slower and harder to read and maintain. Even if a language supported a mechanism to check for underflow (by inserting comparisons to a suitable small value before and after storing the value), this would bloat the IL, introduce unnecessary round-off (by forcing a conversion from extended to double on each operation), and slow things down unnecessarily.
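The ambiguity is easy to reproduce. The snippet below is only an illustration with made-up variable names, again assuming a recent C# compiler and strict 64-bit double arithmetic: a product that underflows completely compares equal to a product that really is zero, and a product that underflows gradually comes back as a subnormal with no sign that precision was lost.

```csharp
using System;

double tiny = 1e-200;

// The true product is 1e-400, far below the smallest subnormal (about 4.9e-324),
// so it is silently replaced by zero.
double underflowed = tiny * tiny;

// A product that is genuinely zero.
double exactZero = 0.0 * tiny;

// Prints True: from the values alone the two cases cannot be told apart.
Console.WriteLine(underflowed == exactZero);

// Gradual underflow: 1e-320 is representable only as a subnormal, which here
// keeps roughly 11 significant bits instead of 53.
double subnormal = 1e-160 * 1e-160;
const double smallestNormal = 2.2250738585072014e-308;

// Prints True: the value is nonzero but below the normal range.
Console.WriteLine(subnormal > 0.0 && subnormal < smallestNormal);
```

A comparison against smallestNormal can catch the second case after the fact, but nothing short of inspecting the operands catches the first, which is exactly what an underflow flag would report for free.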
This type of scenario occurs many times in numerical calculations. You perform a calculation the quick and dirty way, and if it turns out you made a mess, you try again but you’re more cautious. The complex number example is the most significant one I have come across while developing our numerical libraries.
Nearly all hardware that the CLR (or its clones) runs on supports this floating-point exception mechanism. I found it somewhat surprising that a ‘standard’ virtual execution environment would not adopt another, well-established standard (IEEE-754) for a specific subset of its functionality.
For more on the math jargon used in this post, see my article Floating-Point in .NET Part 1: Concepts and Formats.