Abstract
The ubiquity of many-core devices such as GPUs, together with their enormous computational power, motivates the study of sparse matrix operations on this hardware. The essential sparse kernels in scientific computing, such as sparse matrix-vector multiplication (SpMV), usually have many different high-performance GPU implementations. Sparse matrix problems typically involve memory-bound operations, and this characteristic is particularly limiting on massively parallel processors. This work revisits the main ideas behind reducing the volume of data required by sparse storage formats and advances the understanding of some compression techniques. In particular, we study the use of index compression combined with sparse matrix reordering techniques. A systematic experimental evaluation on a large set of real-world matrices confirms that this approach is promising, achieving meaningful reductions in data storage.
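The core idea behind index compression can be sketched in a few lines. The following is a minimal illustration, not the paper's actual scheme: it delta-encodes the column indices of a CSR matrix row by row and counts the bits needed, versus storing every index at full width. The matrix and the per-element variable bit widths are illustrative assumptions; the point is that after a bandwidth-reducing reordering (e.g. Cuthill-McKee), nonzeros cluster near the diagonal, so the gaps between consecutive column indices are small and compress well.

```python
# Hedged sketch: delta-encoding CSR column indices to shrink index storage.
# The matrix below and the variable-width bit accounting are illustrative.

def bits_needed(v):
    """Bits to represent a non-negative integer (at least 1)."""
    return max(1, v.bit_length())

# Small CSR example: a 4x8 matrix described by its row pointers and
# column indices (values are irrelevant for index storage).
row_ptr = [0, 3, 5, 8, 10]
col_idx = [0, 1, 4, 1, 2, 2, 3, 6, 5, 7]

# Absolute indexing: every index needs enough bits for the largest column.
abs_bits = bits_needed(max(col_idx)) * len(col_idx)

# Delta encoding: store each row's first index, then the gaps to the next.
# Low-bandwidth matrices (e.g. after Cuthill-McKee reordering) have small
# gaps, so the deltas need far fewer bits than absolute indices.
delta_bits = 0
for r in range(len(row_ptr) - 1):
    row = col_idx[row_ptr[r]:row_ptr[r + 1]]
    deltas = [row[0]] + [b - a for a, b in zip(row, row[1:])]
    delta_bits += sum(bits_needed(d) for d in deltas)

print(abs_bits, delta_bits)  # → 30 16: deltas use roughly half the bits
```

Real formats typically round the delta width up to a fixed size per row or block so the indices stay decodable in parallel, trading some compression for GPU-friendly access.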
Acknowledgments
We acknowledge support of the ANII MPG Independent Research Group: Efficient Heterogeneous Computing at UdelaR, a partner group of the Max Planck Institute in Magdeburg. This work is partially funded by the UDELAR CSIC-INI project CompactDisp: Formatos dispersos eficientes para arquitecturas de hardware modernas (efficient sparse formats for modern hardware architectures). We also thank PEDECIBA Informática and the University of the Republic, Uruguay.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Freire, M., Marichal, R., Dufrechou, E., Ezzatti, P. (2023). Towards Reducing Communications in Sparse Matrix Kernels. In: Naiouf, M., Rucci, E., Chichizola, F., De Giusti, L. (eds) Cloud Computing, Big Data & Emerging Topics. JCC-BD&ET 2023. Communications in Computer and Information Science, vol 1828. Springer, Cham. https://doi.org/10.1007/978-3-031-40942-4_2
Print ISBN: 978-3-031-40941-7
Online ISBN: 978-3-031-40942-4