DIESEL: A novel deep learning-based tool for SpMV computations and solving sparse linear equation systems

Mohammed, Thaha; Albeshri, Aiiad; Katib, Iyad; Mehmood, Rashid

doi:10.1007/s11227-020-03489-3

Published: 30 November 2020

Volume 77, pages 6313–6355 (2021)

The Journal of Supercomputing

Abstract

Sparse linear algebra is central to many areas of engineering, science, and business. The community has done considerable work on new methods for sparse matrix-vector multiplication (SpMV) computations and iterative sparse solvers on graphics processing units (GPUs). Because matrix features vary widely, no single method performs well across all sparse matrices. A few tools for automatically predicting the best-performing SpMV kernel have emerged recently, but considerable further effort is needed to realize their full potential. Existing SpMV kernels utilize a GPU far below its full capacity. Moreover, the development and performance analysis of SpMV techniques on GPUs have not been studied in sufficient depth. This paper proposes DIESEL, a deep learning-based tool that predicts and executes the best-performing SpMV kernel for a given matrix, using a feature set that we devised carefully through rigorous empirical and mathematical instruments. The dataset comprises 1056 matrices from 26 different real-life application domains, including computational fluid dynamics, materials, electromagnetics, economics, and more. We propose a range of new metrics and methods for performance analysis, visualization, and comparison of SpMV tools. DIESEL achieves an accuracy of \(88.2\%\), a workload accuracy of \(91.96\%\), and an average relative loss of \(4.4\%\), compared with \(85.9\%\), \(85.31\%\), and \(7.65\%\), respectively, for the next best-performing artificial intelligence (AI)-based SpMV tool. The extensive results and analyses presented in this paper provide several key insights into the performance of SpMV tools and how it relates to the matrix datasets and the performance metrics, allowing the community to further improve and compare basic and AI-based SpMV tools in the future.
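For readers unfamiliar with the operation the abstract refers to, the following is a minimal sketch of SpMV over the widely used compressed sparse row (CSR) layout, written as plain sequential Python. It is an illustration only and not the DIESEL implementation or any of its GPU kernels. The function names `csr_spmv` and `row_nnz` are hypothetical, and the per-row nonzero count is shown merely as one example of a matrix property that can differ between matrices, not as a feature taken from the paper.

```python
def csr_spmv(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    values  : nonzero entries, stored row by row
    col_idx : column index of each entry in `values`
    row_ptr : row i occupies values[row_ptr[i]:row_ptr[i + 1]]
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # Accumulate the dot product of row i with x over its nonzeros only.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


def row_nnz(row_ptr):
    """Number of nonzeros in each row (illustrative matrix property)."""
    return [row_ptr[i + 1] - row_ptr[i] for i in range(len(row_ptr) - 1)]


# Example matrix (hypothetical data):
# [[4, 0, 1],
#  [0, 2, 0],
#  [3, 0, 5]]
values = [4.0, 1.0, 2.0, 3.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
x = [1.0, 2.0, 3.0]

y = csr_spmv(values, col_idx, row_ptr, x)
print(y)                 # [7.0, 4.0, 18.0]
print(row_nnz(row_ptr))  # [2, 1, 2]
```

The inner loop length depends on how many nonzeros each row holds, which is one intuitive reason why a storage format or kernel that suits one matrix may suit another poorly.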

Acknowledgements

This project was funded by the Deanship of Scientific Research (DSR) at King Abdulaziz University, Jeddah, under Grant Number RG-6-611-40. The authors, therefore, acknowledge with thanks the DSR for their technical and financial support. This work is supported by the HPC Center at King Abdulaziz University, Jeddah, Saudi Arabia. The experiments reported in this paper were performed on the Aziz supercomputer at KAU. We are thankful to the anonymous reviewers whose comments have greatly improved the quality of this paper.

Author information

Authors and Affiliations

Department of Computer Science, Aalto University, Espoo, 02150, Finland
Thaha Mohammed
Department of Computer Science, King Abdulaziz University, Jeddah, 21589, Saudi Arabia
Aiiad Albeshri & Iyad Katib
High Performance Computing Center, King Abdulaziz University, Jeddah, 21589, Saudi Arabia
Rashid Mehmood

Corresponding author

Correspondence to Rashid Mehmood.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Mohammed, T., Albeshri, A., Katib, I. et al. DIESEL: A novel deep learning-based tool for SpMV computations and solving sparse linear equation systems. J Supercomput 77, 6313–6355 (2021). https://doi.org/10.1007/s11227-020-03489-3

Accepted: 23 October 2020
Published: 30 November 2020
Issue Date: June 2021
DOI: https://doi.org/10.1007/s11227-020-03489-3
