Some issues related to double rounding

BIT Numerical Mathematics

Abstract

Double rounding is a phenomenon that may occur when different floating-point precisions are available on the same system. Although double rounding is, in general, innocuous, it may change the behavior of some useful small floating-point algorithms. We analyze the potential influence of double rounding on the Fast2Sum and 2Sum algorithms, on some summation algorithms, and on Veltkamp’s splitting.
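To make the phenomenon concrete, here is a minimal sketch of a value for which rounding first to a wider precision and then to the target precision differs from rounding directly. It is not taken from the article: the helper name `round_to_precision`, the precisions 64 and 53 (chosen to mimic an extended format followed by binary64), and the test value are assumptions made for illustration, and the exponent range is treated as unbounded.

```python
from fractions import Fraction


def round_to_precision(x: Fraction, p: int) -> Fraction:
    """Round a positive rational x to the nearest p-bit binary
    floating-point value, ties to even (hypothetical helper,
    unbounded exponent range assumed)."""
    if x <= 0:
        raise ValueError("this sketch only handles positive inputs")
    # Find e such that 2**e <= x < 2**(e + 1).
    e = x.numerator.bit_length() - x.denominator.bit_length()
    if Fraction(2) ** e > x:
        e -= 1
    # Spacing of p-bit floating-point numbers in [2**e, 2**(e + 1)).
    ulp = Fraction(2) ** (e - p + 1)
    # round() on a Fraction rounds to the nearest integer, ties to even.
    return round(x / ulp) * ulp


# Assumed test value: slightly above the midpoint of two 53-bit numbers.
x = 1 + Fraction(1, 2**53) + Fraction(1, 2**70)

direct = round_to_precision(x, 53)                           # one rounding
double = round_to_precision(round_to_precision(x, 64), 53)   # two roundings

print(direct == 1 + Fraction(1, 2**52))  # True
print(double == 1)                       # True
```

The first rounding to 64 bits discards the small term that placed `x` above the midpoint, so the second rounding sees an exact tie and resolves it to the even neighbor. The two results differ by one unit in the last place of the 53-bit format.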


Notes

  1. The FMA instruction evaluates expressions of the form xy+z with one final rounding only.
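The difference between one final rounding and two roundings can be sketched as follows. This is an illustration under stated assumptions, not a hardware FMA: the function `fma_emulated` is a hypothetical helper that computes xy+z exactly with rationals and then rounds once to binary64, and the operands are chosen for illustration. Overflow is not handled.

```python
from fractions import Fraction


def fma_emulated(x: float, y: float, z: float) -> float:
    """Emulate xy+z with a single final rounding (hypothetical helper).

    The product and sum are formed exactly as rationals; converting the
    result to float performs one round-to-nearest step.
    """
    exact = Fraction(x) * Fraction(y) + Fraction(z)
    return float(exact)


# Assumed operands: x*x = 1 + 2**-29 + 2**-60 is not representable in binary64.
x = 1.0 + 2.0**-30
z = -(1.0 + 2.0**-29)

unfused = x * x + z              # product rounded, then sum rounded
fused = fma_emulated(x, x, z)    # single rounding of the exact result

print(unfused)            # 0.0
print(fused == 2.0**-60)  # True
```

With separate multiplication and addition, the low-order term of the product is lost before the addition takes place; with a single final rounding it survives.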


Acknowledgements

We are extremely grateful to the anonymous referees, whose suggestions have been very helpful for revising this paper. In particular, one of them suggested a drastic simplification of the proof of Theorem 4.1.

Author information

Corresponding author

Correspondence to Jean-Michel Muller.

Additional information

Communicated by Axel Ruhe.

This work is partly supported by the TaMaDi project of the French Agence Nationale de la Recherche.

About this article

Cite this article

Martin-Dorel, É., Melquiond, G. & Muller, JM. Some issues related to double rounding. Bit Numer Math 53, 897–924 (2013). https://doi.org/10.1007/s10543-013-0436-2


  • DOI: https://doi.org/10.1007/s10543-013-0436-2
