BIT Numerical Mathematics, Volume 53, Issue 4, pp 897–924

Some issues related to double rounding

  • Érik Martin-Dorel
  • Guillaume Melquiond
  • Jean-Michel Muller

Abstract

Double rounding is a phenomenon that may occur when different floating-point precisions are available on the same system. Although double rounding is, in general, innocuous, it may change the behavior of some useful small floating-point algorithms. We analyze the potential influence of double rounding on the Fast2Sum and 2Sum algorithms, on some summation algorithms, and on Veltkamp’s splitting.
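
To make the terms in the abstract concrete, here is a minimal Python sketch, not taken from the paper. The helper `round_nearest_even` is a hypothetical exact model of round-to-nearest-even with an unbounded exponent range; it shows one value for which rounding first to 64 bits and then to 53 bits differs from rounding directly to 53 bits. The functions `fast_two_sum`, `two_sum`, and `veltkamp_split` follow the textbook definitions of those algorithms and assume that Python floats are IEEE 754 binary64 values evaluated without intermediate extended precision, which is the usual case but not guaranteed on every platform.

```python
from fractions import Fraction


def round_nearest_even(x: Fraction, p: int) -> Fraction:
    """Round x to p significand bits, ties to even (exponent range unbounded)."""
    if x == 0:
        return Fraction(0)
    sign = -1 if x < 0 else 1
    x = abs(x)
    # Find e such that 2**e <= x < 2**(e + 1).
    e = x.numerator.bit_length() - x.denominator.bit_length()
    if Fraction(2) ** e > x:
        e -= 1
    ulp = Fraction(2) ** (e - p + 1)
    q = x / ulp
    m = q.numerator // q.denominator  # floor, since q > 0
    r = q - m
    if r > Fraction(1, 2) or (r == Fraction(1, 2) and m % 2 == 1):
        m += 1
    return sign * m * ulp


def fast_two_sum(a: float, b: float) -> tuple[float, float]:
    """Fast2Sum: requires |a| >= |b|; returns (s, t) with s = fl(a + b)."""
    s = a + b
    z = s - a
    t = b - z
    return s, t


def two_sum(a: float, b: float) -> tuple[float, float]:
    """2Sum: same result as Fast2Sum, without any ordering assumption."""
    s = a + b
    a_prime = s - b
    b_prime = s - a_prime
    delta_a = a - a_prime
    delta_b = b - b_prime
    return s, delta_a + delta_b


def veltkamp_split(x: float, s: int = 27) -> tuple[float, float]:
    """Veltkamp's splitting: x = hi + lo, with hi holding the leading bits."""
    c = float(2 ** s + 1)
    gamma = c * x
    delta = x - gamma
    hi = gamma + delta
    lo = x - hi
    return hi, lo


# A value slightly above the midpoint of 1 and 1 + 2**-52.
x = 1 + Fraction(1, 2 ** 53) + Fraction(1, 2 ** 65)
direct = round_nearest_even(x, 53)                          # single rounding
twice = round_nearest_even(round_nearest_even(x, 64), 53)   # double rounding
print("direct:", direct, "double:", twice, "differ:", direct != twice)
```

In the absence of double rounding and overflow, the pair returned by `two_sum` (or by `fast_two_sum` when its ordering condition holds) represents the exact sum of its inputs; the paper studies what remains of such properties when double rounding can occur.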

Keywords

Floating-point arithmetic, Double rounding, Correct rounding, 2Sum, Fast2Sum, Summation algorithms

Mathematics Subject Classification (2010)

65G99, 65Y04, 68M15

Notes

Acknowledgements

We are extremely grateful to the anonymous referees, whose suggestions have been very helpful for revising this paper. In particular, one of them suggested a drastic simplification of the proof of Theorem 4.1.

Copyright information

© Springer Science+Business Media Dordrecht 2013

Authors and Affiliations

  • Érik Martin-Dorel (1)
  • Guillaume Melquiond (2)
  • Jean-Michel Muller (3)
  1. Inria Sophia Antipolis - Méditerranée, Marelle team, Sophia Antipolis Cedex, France
  2. Inria Saclay–Île-de-France, Toccata team, LRI Lab., CNRS, Orsay Cedex, France
  3. CNRS, lab. LIP, Inria Aric team, Université de Lyon, Lyon Cedex 07, France