Abstract
In this paper, we report the acceleration of multi-component binary64 multiple precision block and Strassen matrix multiplications with AVX2. We target double-double (DD), triple-double (TD), and quad-double (QD) precision arithmetic built on error-free transformation (EFT) operations. We implement SIMDized EFT functions that operate on four binary64 numbers simultaneously in an x86_64 computing environment, and with their help we develop SIMDized DD, TD, and QD additions and multiplications. In addition, we adopt AVX2 load/store functions to speed up reading and storing matrix elements from/to memory. Owing to these combined techniques, our multiple precision matrix multiplications run more than three times faster than the non-accelerated ones. The accelerated matrix multiplication also benefits from parallelization with OpenMP.
Supported by JSPS KAKENHI (Grant Number JP20K11843) and Shizuoka Institute of Science and Technology.
© 2021 Springer Nature Switzerland AG
Kouya, T. (2021). Acceleration of Multiple Precision Matrix Multiplication Based on Multi-component Floating-Point Arithmetic Using AVX2. In: Gervasi, O., et al. Computational Science and Its Applications – ICCSA 2021. Lecture Notes in Computer Science, vol 12953. Springer, Cham. https://doi.org/10.1007/978-3-030-86976-2_14
Print ISBN: 978-3-030-86975-5
Online ISBN: 978-3-030-86976-2