Exploiting Structure in Floating-Point Arithmetic

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9582)

Abstract

The analysis of algorithms in IEEE floating-point arithmetic is most often carried out via repeated applications of the so-called standard model, which bounds the relative error of each basic operation by a common epsilon depending only on the format. While this approach has been eminently useful for establishing many accuracy and stability results, it fails to capture most of the low-level features that make floating-point arithmetic so highly structured. In this paper, we survey some of those properties and how to exploit them in rounding error analysis. In particular, we review some recent improvements of several classical, Wilkinson-style error bounds from linear algebra and complex arithmetic that all rely on such structure properties.

Keywords

Floating-point arithmetic IEEE standard 754-2008 Rounding error analysis High relative accuracy 

References

  1. 1.
    Baudin, M.: Error bounds of complex arithmetic, June 2011. http://forge.scilab.org/upload/compdiv/files/complexerrorbounds_v0.2.pdf
  2. 2.
    Brent, R.P., Percival, C., Zimmermann, P.: Error bounds on complex floating-point multiplication. Math. Comput. 76, 1469–1481 (2007)MathSciNetCrossRefMATHGoogle Scholar
  3. 3.
    Brent, R.P., Zimmerman, P.: Modern Computer Arithmetic. Cambridge University Press, Cambridge (2010)CrossRefGoogle Scholar
  4. 4.
    Brunie, N.: Contributions to Computer Arithmetic and Applications to Embedded Systems. Ph.D. thesis, École Normale Supérieure de Lyon, Lyon, France, May 2014. https://tel.archives-ouvertes.fr/tel-01078204
  5. 5.
    Champagne, W.P.: On finding roots of polynomials by hook or by crook. Master’s thesis, University of Texas, Austin, Texas (1964)Google Scholar
  6. 6.
    Corless, R.M., Fillion, N.: A Graduate Introduction to Numerical Methods, From the Viewpoint of Backward Error Analysis. Springer, New York (2013)CrossRefMATHGoogle Scholar
  7. 7.
    Cornea, M., Harrison, J., Tang, P.T.P.: Scientific Computing on Itanium\({}^{\textregistered }\)-based Systems. Intel Press, Hillsboro (2002)Google Scholar
  8. 8.
    Dekker, T.J.: A floating-point technique for extending the available precision. Numer. Math. 18, 224–242 (1971)MathSciNetCrossRefMATHGoogle Scholar
  9. 9.
    Demmel, J.W.: Applied Numerical Linear Algebra. SIAM, Philadelphia (1997)CrossRefMATHGoogle Scholar
  10. 10.
    Goldberg, D.: What every computer scientist should know about floating-point arithmetic. ACM Comput. Surv. 23(1), 5–48 (1991)CrossRefGoogle Scholar
  11. 11.
    Golub, G.H., Van Loan, C.F.: Matrix Computations, 4th edn. Johns Hopkins University Press, Baltimore (2013)MATHGoogle Scholar
  12. 12.
    Graillat, S., Lefèvre, V., Muller, J.M.: On the maximum relative error when computing integer powers by iterated multiplications in floating-point arithmetic. Numer. Algorithms 70, 653–667 (2015). http://link.springer.com/article/10.1007/s11075-015-9967-8 MathSciNetCrossRefMATHGoogle Scholar
  13. 13.
    Hauser, J.R.: Handling floating-point exceptions in numeric programs. ACM Trans. Program. Lang. Syst. 18(2), 139–174 (1996)CrossRefGoogle Scholar
  14. 14.
    Higham, N.J.: Accuracy and Stability of Numerical Algorithms, 2nd edn. SIAM, Philadelphia (2002)CrossRefMATHGoogle Scholar
  15. 15.
    Higham, N.J.: Floating-point arithmetic. In: Higham, N.J., Dennis, M.R., Glendinning, P., Martin, P.A., Santosa, F., Tanner, J. (eds.) The Princeton Companion to Applied Mathematics, pp. 96–97. Princeton University Press, Princeton (2015)Google Scholar
  16. 16.
    Holm, J.E.: Floating-Point Arithmetic and Program Correctness Proofs. Ph.D. thesis, Cornell University, Ithaca, NY, USA, August 1980Google Scholar
  17. 17.
    IEEE Computer Society: IEEE Standard for Binary Floating-Point Arithmetic, ANSI/IEEE Standard 754–1985. IEEE Computer Society, New York (1985)Google Scholar
  18. 18.
    IEEE Computer Society: IEEE Standard for Floating-Point Arithmetic, IEEE Standard 754–2008. IEEE Computer Society, New York (2008)Google Scholar
  19. 19.
    Jeannerod, C.P.: A radix-independent error analysis of the Cornea-Harrison-Tang method, to appear in ACM Trans. Math. Softw. https://hal.inria.fr/hal-01050021
  20. 20.
    Jeannerod, C.P., Kornerup, P., Louvet, N., Muller, J.M.: Error bounds on complex floating-point multiplication with an FMA, to appear in Math. Comput. https://hal.inria.fr/hal-00867040v4
  21. 21.
    Jeannerod, C.P., Louvet, N., Muller, J.M.: Further analysis of Kahan’s algorithm for the accurate computation of \(2\times 2\) determinants. Math. Comput. 82(284), 2245–2264 (2013)MathSciNetCrossRefMATHGoogle Scholar
  22. 22.
    Jeannerod, C.P., Louvet, N., Muller, J.M., Plet, A.: A library for symbolic floating-point arithmetic (2015). https://hal.inria.fr/hal-01232159
  23. 23.
    Jeannerod, C.-P., Louvet, N., Muller, J.-M., Plet, A.: Sharp error bounds for complex floating-point inversion. Numer. Algorithms 1–26 (2016). https://hal-ens-lyon.archives-ouvertes.fr/ensl-01195625
  24. 24.
    Jeannerod, C.P., Rump, S.M.: Improved error bounds for inner products in floating-point arithmetic. SIAM J. Matrix Anal. Appl. 34(2), 338–344 (2013)MathSciNetCrossRefMATHGoogle Scholar
  25. 25.
    Jeannerod, C.P., Rump, S.M.: On relative errors of floating-point operations: optimal bounds and applications (2014). https://hal.inria.fr/hal-00934443
  26. 26.
    Kahan, W.: Further remarks on reducing truncation errors. Commun. ACM 8(1), 40 (1965)CrossRefGoogle Scholar
  27. 27.
    Knuth, D.E.: The Art of Computer Programming. Seminumerical Algorithms, vol. 2, 3rd edn. Addison-Wesley, Reading (1998)MATHGoogle Scholar
  28. 28.
    Møller, O.: Quasi double-precision in floating point addition. BIT 5, 37–50 (1965)MathSciNetCrossRefMATHGoogle Scholar
  29. 29.
    Monniaux, D.: The pitfalls of verifying floating-point computations. ACM Trans. Program. Lang. Syst. 30(3), 12:1–12:41 (2008)CrossRefGoogle Scholar
  30. 30.
    Muller, J.M.: On the error of computing \(ab+cd\) using Cornea, Harrison and Tang’s method. ACM Trans. Math. Softw. 41(2), 7:1–7:8 (2015)CrossRefGoogle Scholar
  31. 31.
    Muller, J.M., Brisebarre, N., de Dinechin, F., Jeannerod, C.P., Lefèvre, V., Melquiond, G., Revol, N., Stehlé, D., Torres, S.: Handbook of Floating-Point Arithmetic. Birkhäuser, Boston (2010)CrossRefMATHGoogle Scholar
  32. 32.
    Ogita, T., Rump, S.M., Oishi, S.: Accurate sum and dot product. SIAM J. Sci. Comput. 26(6), 1955–1988 (2005)MathSciNetCrossRefMATHGoogle Scholar
  33. 33.
    Overton, M.L.: Numerical Computing with IEEE Floating Point Arithmetic: Including One Theorem, One Rule of Thumb, and One Hundred and One Exercises. Society for Industrial and Applied Mathematics, Philadelphia (2001)CrossRefMATHGoogle Scholar
  34. 34.
    Priest, D.M.: On Properties of Floating Point Arithmetics: Numerical Stability and the Cost of Accurate Computations. Ph.D. thesis, Mathematics Department, University of California, Berkeley, CA, USA, November 1992Google Scholar
  35. 35.
    Rump, S.M.: Ultimately fast accurate summation. SIAM J. Sci. Comput. 31(5), 3466–3502 (2009)MathSciNetCrossRefMATHGoogle Scholar
  36. 36.
    Rump, S.M.: Error estimation of floating-point summation and dot product. BIT 52(1), 201–220 (2012)MathSciNetCrossRefMATHGoogle Scholar
  37. 37.
    Rump, S.M., Ogita, T., Oishi, S.: Accurate floating-point summation part I: faithful rounding. SIAM J. Sci. Comput. 31(1), 189–224 (2008)MathSciNetCrossRefMATHGoogle Scholar
  38. 38.
    Rump, S.M., Bünger, F., Jeannerod, C.P.: Improved error bounds for floating-point products and Horner’s scheme. BIT (2015). http://link.springer.com/article/10.1007/s10543-015-0555-z
  39. 39.
    Rump, S.M., Jeannerod, C.P.: Improved backward error bounds for LU and Cholesky factorizations. SIAM J. Matrix Anal. Appl. 35(2), 684–698 (2014)MathSciNetCrossRefMATHGoogle Scholar
  40. 40.
    Sterbenz, P.H.: Floating-Point Computation. Prentice-Hall, Englewood Cliffs (1974)Google Scholar
  41. 41.
    Trefethen, L.N.: Computing numerically with functions instead of numbers. Math. Comput. Sci. 1(1), 9–19 (2007)MathSciNetCrossRefMATHGoogle Scholar
  42. 42.
    Trefethen, L.N., Bau III, D.: Numerical Linear Algebra. SIAM, Philadelphia (1997)CrossRefMATHGoogle Scholar
  43. 43.
    Wilkinson, J.H.: Error analysis of floating-point computation. Numer. Math. 2, 319–340 (1960)MathSciNetCrossRefMATHGoogle Scholar
  44. 44.
    Wilkinson, J.H.: The Algebraic Eigenvalue Problem. Oxford University Press, Oxford (1965)MATHGoogle Scholar

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. 1.Inria, Laboratoire LIP (U. Lyon, CNRS, ENSL, Inria, UCBL)Lyon Cedex 07France

Personalised recommendations