Abstract
Graph neural networks (GNNs) are emerging as a powerful technique for modeling graph structures. Because real-world graph data are sparse, GNN performance is limited by the extensive sparse matrix multiplication (SpMM) operations involved in computation. While the best sparse matrix storage format varies across input data, existing deep learning frameworks employ a single, static storage format, leaving much room for improvement. This paper investigates how the choice of sparse matrix storage format affects GNN performance. We observe that choosing a suitable storage format can significantly improve GNN training performance, but the right format depends on the input workload and can change as the GNN iterates over the input graph. We then develop a predictive model that dynamically chooses the sparse matrix storage format a GNN layer should use, based on the input matrices. Our model is first trained offline on sample matrices, and the trained model can then be applied to any input matrix and any GNN kernel with SpMM computation. We implement our approach on top of PyTorch and apply it to five representative GNN models running on a multi-core CPU using real-life and synthetic datasets. Experimental results show that our approach gives an average speedup of 1.17x (up to 3x) on GNN running time.
Z. Wang—This project was supported in part by an Alibaba Innovative Research Programme.
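The workflow the abstract describes, namely labeling training matrices offline by timing SpMM in each candidate format, extracting cheap sparsity features, and fitting a classifier that picks a format at runtime, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the feature set, the candidate formats, and the decision-tree model are assumptions for the sketch (the paper targets PyTorch kernels).

```python
# Sketch of input-adaptive sparse-format selection for SpMM.
# Assumed candidate formats and features; illustrative only.
import time
import numpy as np
import scipy.sparse as sp
from sklearn.tree import DecisionTreeClassifier

FORMATS = ["csr", "csc", "coo"]  # candidate storage formats (assumed set)

def matrix_features(m: sp.spmatrix) -> list:
    """Cheap sparsity statistics used as model input."""
    m = m.tocsr()
    rows, cols = m.shape
    nnz_per_row = np.diff(m.indptr)
    return [
        rows, cols,
        m.nnz / (rows * cols),   # density
        nnz_per_row.mean(),      # average row length
        nnz_per_row.std(),       # row-length spread signals irregularity
    ]

def best_format(m: sp.spmatrix, dense: np.ndarray) -> str:
    """Offline labelling: time SpMM in each format, keep the fastest."""
    timings = {}
    for f in FORMATS:
        mm = m.asformat(f)
        t0 = time.perf_counter()
        _ = mm @ dense          # SpMM: sparse adjacency times dense features
        timings[f] = time.perf_counter() - t0
    return min(timings, key=timings.get)

# Offline training on sample matrices. At runtime, a GNN layer would call
# clf.predict([matrix_features(A)]) and convert A to the predicted format
# before its SpMM.
rng = np.random.default_rng(0)
X, y = [], []
for _ in range(20):
    a = sp.random(200, 200, density=rng.uniform(0.01, 0.2), random_state=rng)
    X.append(matrix_features(a))
    y.append(best_format(a, rng.standard_normal((200, 16))))
clf = DecisionTreeClassifier(max_depth=4).fit(X, y)
```

The key design point is that feature extraction must be far cheaper than the SpMM it optimizes; the statistics above are O(nnz) single passes, so the prediction cost is amortized when the same graph is reused across training iterations.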
Copyright information
© 2022 Springer Nature Switzerland AG
Cite this paper
Qiu, S., You, L., Wang, Z. (2022). Optimizing Sparse Matrix Multiplications for Graph Neural Networks. In: Li, X., Chandrasekaran, S. (eds) Languages and Compilers for Parallel Computing. LCPC 2021. Lecture Notes in Computer Science, vol 13181. Springer, Cham. https://doi.org/10.1007/978-3-030-99372-6_7
Print ISBN: 978-3-030-99371-9
Online ISBN: 978-3-030-99372-6