
AVX Acceleration of DD Arithmetic Between a Sparse Matrix and Vector

  • Conference paper
  • First Online:
Parallel Processing and Applied Mathematics (PPAM 2013)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8384)

Abstract

High precision arithmetic can improve the convergence of Krylov subspace methods; however, it is very costly. One form of high precision arithmetic is double-double (DD) arithmetic, which uses more than 20 double precision operations for a single DD operation. We accelerated DD arithmetic using AVX SIMD instructions. With 4 threads, DD vector operations attain 51–59 % of peak performance when the data fit in cache and are bounded by memory access speed when they do not. For SpMV, we used a double precision sparse matrix A and a DD vector x to reduce memory access, and achieved 17–41 % of peak performance by applying padding. For the transposed SpMV, we achieved 9–33 % of peak performance. In these cases, the performance was not bounded by memory access.
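To make the cost of a DD operation concrete, the sketch below shows a DD addition vectorized with AVX intrinsics over four lanes, using the standard error-free two-sum construction (Knuth, Dekker) as popularized by Bailey's QD library. The dd4 type and dd4_add routine are illustrative names introduced here; this is a plausible formulation under those assumptions, not the paper's actual kernel.

#include <immintrin.h>

/* Illustrative sketch (not the authors' code): a DD value is an
   unevaluated sum hi + lo of two doubles; __m256d holds four lanes. */
typedef struct {
    __m256d hi;   /* four high words */
    __m256d lo;   /* four low words  */
} dd4;

/* c = a + b in DD precision, four lanes at a time. */
static inline dd4 dd4_add(dd4 a, dd4 b)
{
    /* error-free two-sum of the high words: s + e == a.hi + b.hi exactly */
    __m256d s = _mm256_add_pd(a.hi, b.hi);
    __m256d v = _mm256_sub_pd(s, a.hi);
    __m256d e = _mm256_add_pd(_mm256_sub_pd(a.hi, _mm256_sub_pd(s, v)),
                              _mm256_sub_pd(b.hi, v));

    /* add the low words and renormalize (quick two-sum) */
    e = _mm256_add_pd(e, _mm256_add_pd(a.lo, b.lo));
    dd4 c;
    c.hi = _mm256_add_pd(s, e);
    c.lo = _mm256_sub_pd(e, _mm256_sub_pd(c.hi, s));
    return c;
}

Even this addition needs 11 double precision operations per lane; DD multiplication additionally requires splitting the operands (or a fused multiply-add), which pushes the per-operation count past 20.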



Acknowledgement

The authors would like to thank the reviewers for their helpful comments.

Author information

Corresponding author

Correspondence to Toshiaki Hishinuma.



Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hishinuma, T., Fujii, A., Tanaka, T., Hasegawa, H. (2014). AVX Acceleration of DD Arithmetic Between a Sparse Matrix and Vector. In: Wyrzykowski, R., Dongarra, J., Karczewski, K., Waśniewski, J. (eds) Parallel Processing and Applied Mathematics. PPAM 2013. Lecture Notes in Computer Science, vol 8384. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-55224-3_58


  • DOI: https://doi.org/10.1007/978-3-642-55224-3_58

  • Published:

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-55223-6

  • Online ISBN: 978-3-642-55224-3

  • eBook Packages: Computer Science, Computer Science (R0)
