
AVX Acceleration of DD Arithmetic Between a Sparse Matrix and Vector

  • Conference paper
  • First Online:
Parallel Processing and Applied Mathematics (PPAM 2013)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8384)

Abstract

High precision arithmetic can improve the convergence of Krylov subspace methods; however, it is very costly. One form of high precision arithmetic is double-double (DD) arithmetic, which uses more than 20 double precision operations for a single DD operation. We accelerated DD arithmetic using AVX SIMD instructions. With 4 threads, DD vector operations attain 51–59 % of peak performance when the data fit in cache and are bounded by memory access speed when they do not. For SpMV, we used a double precision sparse matrix A and a DD vector x to reduce memory access, and achieved 17–41 % of peak performance by applying padding. For the transposed SpMV, we achieved 9–33 % of peak performance. In these cases, the performance was not bounded by memory access.
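To make the cost of a DD operation concrete, the sketch below shows a DD addition vectorized with AVX intrinsics over four lanes, using the standard error-free two-sum construction (Knuth, Dekker) as popularized by Bailey's QD library. The dd4 type and dd4_add routine are illustrative names introduced here; this is a plausible formulation under those assumptions, not the paper's actual kernel.

#include <immintrin.h>

/* Illustrative sketch (not the authors' code): a DD value is an
   unevaluated sum hi + lo of two doubles; __m256d holds four lanes. */
typedef struct {
    __m256d hi;   /* four high words */
    __m256d lo;   /* four low words  */
} dd4;

/* c = a + b in DD precision, four lanes at a time. */
static inline dd4 dd4_add(dd4 a, dd4 b)
{
    /* error-free two-sum of the high words: s + e == a.hi + b.hi exactly */
    __m256d s = _mm256_add_pd(a.hi, b.hi);
    __m256d v = _mm256_sub_pd(s, a.hi);
    __m256d e = _mm256_add_pd(_mm256_sub_pd(a.hi, _mm256_sub_pd(s, v)),
                              _mm256_sub_pd(b.hi, v));

    /* add the low words and renormalize (quick two-sum) */
    e = _mm256_add_pd(e, _mm256_add_pd(a.lo, b.lo));
    dd4 c;
    c.hi = _mm256_add_pd(s, e);
    c.lo = _mm256_sub_pd(e, _mm256_sub_pd(c.hi, s));
    return c;
}

Even this addition needs 11 double precision operations per lane; DD multiplication additionally requires splitting the operands (or a fused multiply-add), which pushes the per-operation count past 20.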



Acknowledgement

The authors would like to thank the reviewers for their helpful comments.

Author information

Corresponding author

Correspondence to Toshiaki Hishinuma.



Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hishinuma, T., Fujii, A., Tanaka, T., Hasegawa, H. (2014). AVX Acceleration of DD Arithmetic Between a Sparse Matrix and Vector. In: Wyrzykowski, R., Dongarra, J., Karczewski, K., Waśniewski, J. (eds) Parallel Processing and Applied Mathematics. PPAM 2013. Lecture Notes in Computer Science, vol 8384. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-55224-3_58


  • DOI: https://doi.org/10.1007/978-3-642-55224-3_58

  • Published:

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-55223-6

  • Online ISBN: 978-3-642-55224-3

  • eBook Packages: Computer Science, Computer Science (R0)
