Abstract
High-precision arithmetic can improve the convergence of Krylov subspace methods; however, it is computationally expensive. One system of high-precision arithmetic is double-double (DD) arithmetic, which requires more than 20 double-precision operations per DD operation. We accelerated DD arithmetic using AVX SIMD instructions. The resulting vector operations on 4 threads attain 51–59% of peak performance when the data fit in cache and are bounded by memory access speed otherwise. For SpMV, we used a double-precision sparse matrix A and a DD vector x to reduce memory access, and achieved 17–41% of peak performance by applying padding at execution time. We also achieved 9–33% of peak performance for transposed SpMV. In these cases, performance was not bounded by memory access.
Acknowledgement
The authors would like to thank the reviewers for their helpful comments.
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
Cite this paper
Hishinuma, T., Fujii, A., Tanaka, T., Hasegawa, H. (2014). AVX Acceleration of DD Arithmetic Between a Sparse Matrix and Vector. In: Wyrzykowski, R., Dongarra, J., Karczewski, K., Waśniewski, J. (eds) Parallel Processing and Applied Mathematics. PPAM 2013. Lecture Notes in Computer Science(), vol 8384. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-55224-3_58
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-55223-6
Online ISBN: 978-3-642-55224-3
eBook Packages: Computer Science (R0)