...making Linux just a little more fun!

**[ In reference to "A Question Of Rounding" in LG#143 ]**

**
Mauro Orlandini - IASF/Bologna [orlandini at iasfbo.inaf.it]
**

Talking about rounding, I almost went crazy to find out why a perl program I was writing did not give the right result. Here is the code:

#!/usr/bin/perl # $f = 4.95; $F = (4.8+5.1)/2; $F_spr = sprintf("%.10f", (4.8+5.1)/2); printf " f: %32.30f\n", $f; printf " F: %32.30f\n", $F; printf " F_spr: %32.30f\n", $F_spr; if ($f == $F) { print "\n f equals F!\n\n"; }

The test fails even if $f and $F are the same! I solved by using $F_spr instead of $F, but it took me a day to find it out... 8-(

Ciao, Mauro

--_^__^_(_)------------------------------------------------------------------(_) | / | Mauro Orlandini Email: orlandini at iasfbo.inaf.it | \ | | / | INAF/IASF Bologna Voice: +39-051-639-8667 | \ | | / | Via Gobetti 101 Fax: +39-051-639-8723 | \ | | / | 40129 Bologna - Italy WWW: http://www.iasfbo.inaf.it/~mauro/ | \ | | / |--------------------------------------------------------------------| \ | | / | Today's quote: | \ |

Top Back

**
Ben Okopnik [ben at linuxgazette.net]
**

On Tue, Oct 02, 2007 at 09:53:23AM +0200, Mauro Orlandini - IASF/Bologna wrote:

> Talking about rounding, I almost went crazy to find out why a perl program > I was writing did not give the right result. Here is the code: > > ---------------------------------------------- > #!/usr/bin/perl > # > $f = 4.95; > $F = (4.8+5.1)/2; > $F_spr = sprintf("%.10f", (4.8+5.1)/2); > > printf " f: %32.30f\n", $f; > printf " F: %32.30f\n", $F; > printf " F_spr: %32.30f\n", $F_spr; > > if ($f == $F) { print "\n f equals F!\n\n"; } > -------------------------------------------------- > > The test fails even if $f and $F are the same! I solved by using $F_spr > instead of $F, but it took me a day to find it out... 8-(

ben at Tyr:~$ perl -wle'print "diff: ", 4.95 - (4.8+5.1)/2' diff: 8.88178419700125e-16

Conclusion: PCs suck. ((

-- * Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *

Top Back

**
Jan Medlock [medlock at turboshower.net]
**

Are you sure this isn't because FE_TONEAREST is the default, which rounds 'N +
0.5' to the nearest **EVEN** integer?

Top Back

**
Paul Sephton [paul at inet.co.za]
**

On Tue, 2007-10-02 at 16:06 -0400, Jan Medlock wrote:

> Are you sure this isn't because FE_TONEAREST is the default, which rounds 'N + > 0.5' to the nearestEVENinteger?

Of course what you refer to is the reason behind the mistake. Rounding the result of calculations to nearest binary representable value, or to nearest even binary representable value on boundaries (where the closest even and closest odd is equidistant) is indeed the default rounding mode FOR BINARY FLOATING POINT OPERATIONS, which is as it should be.

However, in my opinion, this rounding mode should only apply to the results of binary floating point arithmetic operations, and cannot define how a decimal value should be rounded arithmetically in the process of converting that number to a textual representation for display.

The general approach for conversion of a binary IEEE number to a decimal notation, is to repeatedly subtract the modulus base 10 of the number starting with the most significant digit and for as many decimals as you like:

printf("%.64f\n", 1/3.0); 0.3333333333333333148296162562473909929394721984863281250000000000

The point of the article is that this is plain crazy. The width of the storage format for an IEEE double is 53 bits. Actually 52, but the 53rd bit exists because the FPU does not store the value zero. This limits the actual number of decimal digits that may be precisely represented to a maximum of 15. It does not matter where the decimal point is; we might have 0.333333333333333 3.33333333333333 33.3333333333333 or even 0.000000000000333333333333333

Regardless of the exponent, we can have at maximum 15 digits of precision stored accurately.

Therefore, one might rewrite the value 0.3333333333333333148296162562473909929394721984863281250000000000 as 0.333333333333333315 or 0.333333333333333 with 0.000000000000000315 as the portion which is imprecise, or possibly in error.

What the algorithm at the end of the article does to convert the binary number to decimal, is start at the last precise decimal digit (least sigificant decimal 15) and work it's way backwards toward the most significant digit, carrying the error forward in each step- similar to how one one would perform a long division manually.

This allows one to take the imprecise portion of the stored value into account for the purposes of decimal mode rounding, as opposed to the normal approach which works it's way from the most significant digit forwards.

If you work forwards in the generally accepted way, you are never able to incorporate the imprecise portion into the calculation for the purposes of decimal mode rounding.

Therefore, values such as 5000.525, stored internally as a binary number approximating 5000.524999999999636202 end up being rounded downwards to 5000.525, or 5000.52 depending on your choice of precision. This incorrect binary to decimal conversion is due to the error (or imprecise storage of the number) not being taken into account, and fails for both Microsoft as well as GLibC library implementations. It succeeds with my algorithm.

On the other hand, for values which are indeed precisely stored in binary form, notably values in the series 1.5, 2.5, 3.5... etc., we have an entirely different problem for the GLibC library. Note:

printf("%.18f\n", 1.5); 1.500000000000000000there is absolutely no error in the storage of the number. Hence, these numbers by definition lie on a boundary between integers.

For these values, GLibC (but not Microsoft) says that when converting the binary number to a decimal representation (precision 1 significant digit), the default IEEE rounding mode applies. As the value 1.5 is no closer to the number 1 than the number 2, it rounds to the value to the closest even, i.e. 2. Likewise, the value 2.5 is no closer to the value 2 than it is to the value 3- i.e. it is on a boundary between 2 and 3, and should therefore be rounded to the value 2, being the closest even.

Shorthand: 1.5 => 2 2.5 => 2 3.5 => 4 4.5 => 4

Now, as the article says, in real life, and according to the world of decimal arithmetic and Microsoft we would never round in this way. The expected rounded numbers for decimal rounding rules are well known: 1.5 => 2 2.5 => 3 3.5 => 4 4.5 => 5

As the operation of binary to decimal conversion has no direct relation to the process of approximating the result of an IEEE floating point OP, or the storage of an imprecisly represented number, the point of the article is that it would be incorrect to apply the IEEE default rounding mode in the conversion process.

IEEE, and certainly GLibC developers, cannot redefine the rules of decimal arithmetic.

Regards, Paul Sephton

> +-+--------------------------------------------------------------------+-+ > You've asked a question of The Answer Gang, so you've been sent the reply > directly as a courtesy. The TAG list has also been copied. Please send > all replies to tag at lists.linuxgazette.net, so that we can help our other > readers by publishing the exchange in our monthly Web magazine: > Linux Gazette (http://linuxgazette.net/) > +-+--------------------------------------------------------------------+-+ >_____________________________________________> TAG mailing list > TAG at lists.linuxgazette.net > http://lists.linuxgazette.net/mailman/listinfo/tag

Top Back

**
Jan Medlock [medlock at turboshower.net]
**

On Tuesday 02 October 2007 06:11:33 pm Paul Sephton wrote:

> On Tue, 2007-10-02 at 16:06 -0400, Jan Medlock wrote: > > Are you sure this isn't because FE_TONEAREST is the default, which rounds > > 'N + 0.5' to the nearestEVENinteger? > > Of course what you refer to is the reason behind the mistake. Rounding > the result of calculations to nearest binary representable value, or to > nearest even binary representable value on boundaries (where the closest > even and closest odd is equidistant) is indeed the default rounding mode > FOR BINARY FLOATING POINT OPERATIONS, which is as it should be. > > ...

Wow, thanks for the detailed reply and my apologies for not having understood
the depth of the problem. I see that IEEE rounding direction is **not** the
problem.

I'm afraid I don't understand your backward-stepping algorithm.

It seems to me that the problem is that rounding is inherently sensitive to the errors in storage of a number as a floating-point representation. I.e. round(x - e) != round(x + e), where e is the error between x and its float-point representation. I don't see that this is resolvable.

E.g. once you've stored 5000.525 and 5000.524999999999999999999999999999991 as floating point, there's no way to determine whether the value of the original number to 2 digits should be 5000.53 or 5000.52.

Top Back

**
Paul Sephton [paul at inet.co.za]
**

Hi, Jan

On Thu, 2007-10-04 at 17:04 -0400, Jan Medlock wrote:

> On Tuesday 02 October 2007 06:11:33 pm Paul Sephton wrote: > > On Tue, 2007-10-02 at 16:06 -0400, Jan Medlock wrote: > > > Are you sure this isn't because FE_TONEAREST is the default, which rounds > > > 'N + 0.5' to the nearestEVENinteger? > > > > Of course what you refer to is the reason behind the mistake. Rounding > > ... > > Wow, thanks for the detailed reply and my apologies for not having understood > the depth of the problem. I see that IEEE rounding direction isnotthe > problem.

Well, the only problem really, when it comes right down to it, is that different libraries follow different conventions for sprintf(), and if you are not aware of those differences when coding for multiple platforms, you have a chance (normally a very small chance) of getting unexpected results...

> I'm afraid I don't understand your backward-stepping algorithm.Sorry- let's see if I can explain it better...

> It seems to me that the problem is that rounding is inherently sensitive to > the errors in storage of a number as a floating-point representation. I.e. > round(x - e) != round(x + e), where e is the error between x and its > float-point representation. I don't see that this is resolvable. > > E.g. once you've stored > 5000.525 > and > 5000.524999999999999999999999999999991 > as floating point, there's no way to determine whether the value of the > original number to 2 digits should be > 5000.53 > or > 5000.52.

Absolutely. The c99 spec says (loosely) that conversion to decimal format and back to IEEE format should be the identity function for precisions <= DECIMAL_DIG. That is, IEEE=>decimal=>IEEE must produce the same number originally stored. To do that, the conversion must apply appropriate rounding to the last digit, as it is in many cases impossible to represent the actual binary number (radix 2) accurately as base 10.

The point you make though as per your example, is that the opposite is also true. It is also often impossible to represent the decimal number accurately as binary. Especially since in the case of decimal=>binary the size of the FPU is constrained. In other words, there is generally a loss in accuracy for the conversion process.

*_______________________________________________*
Now here is where I depart from convention. I must warn you that many
people disagree with me from this point onwards, and there is every
chance that they, being highly qualified, are right.

Quite simply, I recognise the loss of precision in the conversion from decimal=>binary, by fixing the precision at 15 significant digits (not decimal places). In other words, I state that "the converted binary number is accurate to a maximum precision of 15" (DECIMAL_DIG for IEEE doubles).

This is true, since the largest normalised mantissa (or significand) storable in the FPU is 16 decimal digits.

The value "5000.524999999999999999999999999999991" when rounded to 15 significant decimal digits is "5000.52500000000" (regardless of which rounding rules you apply).

The algorithm would convert the value "5000.524999999999999999999999999999991" to decimal by splitting it into "5000.52499999999" + "0.000000000009999999999999999999991" It then rounds the value "0.000000000009999999999999999999991" to "0.00000000001", and passes it into the first stage as the carry. The first iteration adds the carry to the value:

"5000.52499999999" + "0.00000000001" = "5000.52500000000" and then strips the last digit and remembers it. The process is then repeated: "5000.52500000000" = "5000.5250000000" + "0.00000000000" dec:0 e-11 "5000.5250000000" = "5000.525000000" + "0.0000000000" dec:0 e-10 "5000.525000000" = "5000.52500000" + "0.000000000" dec:0 e-9 "5000.52500000" = "5000.5250000" + "0.00000000" dec:0 e-8 "5000.5250000" = "5000.525000" + "0.0000000" dec:0 e-7 "5000.525000" = "5000.52500" + "0.000000" dec:0 e-6 "5000.52500" = "5000.5250" + "0.00000" dec:0 e-5 "5000.5250" = "5000.525" + "0.0000" dec:0 e-4 "5000.525" = "5000.52" + "0.005" dec:5 e-3 "5000.52" = "5000.5" + "0.02" dec:2 e-2 "5000.5" = "5000" + "0.5" dec:5 e-1 "5000" = "5000" + "0" dec:0 e+0 "5000" = "5000" + "0" dec:0 e+1 "5000" = "5000" + "0" dec:0 e+2 "5000" = "0" + "5000" dec:5 e+3Recursion ends here, and the recursion rollback fills in the digits.

Important to realise here, is that the accuracy of the original binary IEEE number is not retained. This is implied by the fact that I constrain the precision to 15 decimal digits, and the decimal representation does not contain enough information to reproduce the original number.

Whilst in this case converting back to binary would be the identity function, this would not be true in cases where the binary number was, for example, the result of a calculation.

Recognising the number 5000.525 as it is and not 5000.524999... is more important _to me_ though, than loss in accuracy. It means that when I display the number to a precision of 6, I get a rounded value 5000.53 and not 5000.52.

Mileage will vary.

Kind Regards, Paul Sephton

Top Back