Enhanced Floating-Point Sums, Dot Products, and Polynomial Values

  • Jean-Michel Muller
  • Nicolas Brunie
  • Florent de Dinechin
  • Claude-Pierre Jeannerod
  • Mioara Joldes
  • Vincent Lefèvre
  • Guillaume Melquiond
  • Nathalie Revol
  • Serge Torres


In this chapter, we focus on the computation of sums and dot products, and on the evaluation of polynomials in IEEE 754 floating-point arithmetic. Such calculations arise in many fields of numerical computing. Sums are needed, for example, in numerical integration and in the computation of means and variances. Dot products appear everywhere in numerical linear algebra. Polynomials are used to approximate many functions (see Chapter 10).
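As a rough illustration of the three computations named above, the following Python sketch implements the straightforward left-to-right sum, dot product, and Horner polynomial evaluation. It is not taken from the chapter: the function names `recursive_sum`, `dot`, and `horner` are hypothetical, and `math.fsum` from the Python standard library is used only as a more accurate reference for the sum.

```python
import math

def recursive_sum(xs):
    """Left-to-right summation; each addition may incur a rounding error."""
    s = 0.0
    for x in xs:
        s += x
    return s

def dot(xs, ys):
    """Plain dot product: one rounded multiply and one rounded add per term."""
    s = 0.0
    for x, y in zip(xs, ys):
        s += x * y
    return s

def horner(coeffs, x):
    """Evaluate a polynomial by Horner's rule.

    coeffs lists the coefficients from the highest degree down to the
    constant term (an assumption of this sketch).
    """
    r = 0.0
    for c in coeffs:
        r = r * x + c
    return r

values = [0.1] * 10
naive = recursive_sum(values)    # accumulates rounding errors
reference = math.fsum(values)    # standard-library accurate summation
print(naive, reference)
print(dot([1.0, 2.0], [3.0, 4.0]))
print(horner([1.0, -3.0, 2.0], 2.0))  # x^2 - 3x + 2 at x = 2
```

In binary64 arithmetic the naive sum of ten copies of 0.1 differs from the `math.fsum` result, which is the kind of discrepancy that motivates more careful summation algorithms.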



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Jean-Michel Muller (1)
  • Nicolas Brunie (2)
  • Florent de Dinechin (3)
  • Claude-Pierre Jeannerod (4)
  • Mioara Joldes (5)
  • Vincent Lefèvre (4)
  • Guillaume Melquiond (6)
  • Nathalie Revol (4)
  • Serge Torres (7)

  1. CNRS - LIP, Lyon, France
  2. Kalray, Grenoble, France
  3. INSA-Lyon - CITI, Villeurbanne, France
  4. Inria - LIP, Lyon, France
  5. CNRS - LAAS, Toulouse, France
  6. Inria - LRI, Orsay, France
  7. ENS-Lyon - LIP, Lyon, France
