
Efficiency of Reproducible Level 1 BLAS

  • Chemseddine Chohra
  • Philippe Langlois
  • David Parello
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9553)

Abstract

Numerical reproducibility failures appear in massively parallel floating-point computations. One way to guarantee reproducibility is to extend the correct rounding required by IEEE 754 to longer computing sequences, e.g. to the BLAS routines. Is the extra cost of numerical reproducibility acceptable in practice? We present solutions and experiments for the level 1 BLAS and draw conclusions about their efficiency.
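
The reproducibility issue described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes Python's standard `math.fsum` as a stand-in for a correctly rounded sum (on typical IEEE 754 binary64 platforms), and `naive_sum` is a hypothetical helper that mimics the order-dependent accumulation a parallel reduction may perform.

```python
import math


def naive_sum(values):
    """Left-to-right floating-point accumulation; the result depends on the order."""
    total = 0.0
    for v in values:
        total += v
    return total


# The same four numbers in two different orders, as two different
# parallel schedules might deliver them to a reduction.
data = [1e16, 1.0, -1e16, 1.0]
reordered = [1e16, -1e16, 1.0, 1.0]

# Plain accumulation is not associative: 1e16 + 1.0 rounds back to 1e16,
# so the first ordering loses one of the small terms.
print(naive_sum(data), naive_sum(reordered))

# A correctly rounded sum (assumed here: math.fsum) depends only on the
# exact mathematical value, so every ordering yields the same result.
print(math.fsum(data), math.fsum(reordered))
```

Because the correctly rounded result is a function of the exact sum alone, it is reproducible for any summation order or thread count; the question the paper raises is what this guarantee costs at run time.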

Keywords

Reproducible Levels; Correct Rounding; Numerical Reproducibility; Error-free Transform; Reproductive Analogy
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer International Publishing Switzerland 2016

Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 2.5 International License (http://creativecommons.org/licenses/by-nc/2.5/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

Authors and Affiliations

  • Chemseddine Chohra (1, 2), email author
  • Philippe Langlois (1, 2)
  • David Parello (1, 2)

  1. Digits, Architectures et Logiciels Informatiques, Univ. Perpignan Via Domitia, Perpignan, France
  2. Laboratoire d’Informatique Robotique et de Microélectronique de Montpellier, Univ. Montpellier II, UMR 5506, CNRS, Montpellier, France
