Abstract
In this paper, we use graphics processing units (GPUs) to accelerate sparse and arbitrarily structured neural networks. Sparse networks contain nodes that are not fully connected to the nodes in preceding and following layers, and arbitrarily structured neural networks have a different number of nodes in each layer. Sparse neural networks with arbitrary structures typically arise from processes such as neural network pruning and evolutionary machine learning strategies. We show that we can gain significant speedup for the full activation of such neural networks using GPUs. We first run a preprocessing step that determines dependency groups for all the nodes in a network, and we use that information to guide the progression of activation through the network. We then compute the activation of each node in its own GPU thread, which allows for massive parallelization. We implement our approach with the CUDA framework and compare the results of sequential and GPU implementations. Our results show that the activation of sparse neural networks lends itself very well to GPU acceleration and can help speed up machine learning strategies that generate such networks, as well as other processes with similar structure.
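To make the two steps described above concrete, the following is a minimal CUDA sketch of the approach as it can be reconstructed from the abstract alone. The CSR-style incoming-edge arrays, the sigmoid activation, the block size of 256, and all identifiers (computeGroups, activateGroup, edgeStart, srcNode, and so on) are hypothetical choices for illustration, not the authors' actual code.

// Hypothetical sketch of dependency-grouped activation; the data layout and
// all names are illustrative assumptions, not the paper's implementation.
#include <cuda_runtime.h>
#include <math.h>

// Preprocessing (host, sequential): assign each node a dependency group equal
// to one plus the maximum group of its predecessors, so every input of a node
// lies in a strictly earlier group. Assumes nodes are indexed in topological
// order; input nodes with no incoming edges land in group 0.
void computeGroups(int numNodes, const int *edgeStart, const int *srcNode,
                   int *group) {
    for (int i = 0; i < numNodes; ++i) {
        int g = 0;
        for (int e = edgeStart[i]; e < edgeStart[i + 1]; ++e)
            if (group[srcNode[e]] + 1 > g)
                g = group[srcNode[e]] + 1;
        group[i] = g;
    }
}

// One thread per node of the current group. Node i's incoming edges occupy
// srcNode[edgeStart[i] .. edgeStart[i+1]) with matching weights in weight[].
__global__ void activateGroup(const int *groupNodes, int groupSize,
                              const int *edgeStart, const int *srcNode,
                              const float *weight, const float *bias,
                              float *activation) {
    int t = blockIdx.x * blockDim.x + threadIdx.x;
    if (t >= groupSize) return;
    int node = groupNodes[t];
    float sum = bias[node];
    for (int e = edgeStart[node]; e < edgeStart[node + 1]; ++e)
        sum += weight[e] * activation[srcNode[e]];   // inputs already computed
    activation[node] = 1.0f / (1.0f + expf(-sum));   // sigmoid
}

// Full activation (host): groups run in ascending order; all nodes within a
// group activate concurrently. d_groupNodes[g] is a device array holding the
// node indices of group g.
void activateNetwork(int numGroups, int *const *d_groupNodes,
                     const int *groupSizes, const int *d_edgeStart,
                     const int *d_srcNode, const float *d_weight,
                     const float *d_bias, float *d_activation) {
    for (int g = 0; g < numGroups; ++g) {
        int blocks = (groupSizes[g] + 255) / 256;
        activateGroup<<<blocks, 256>>>(d_groupNodes[g], groupSizes[g],
                                       d_edgeStart, d_srcNode, d_weight,
                                       d_bias, d_activation);
        // Launches on the default stream already serialize; the explicit sync
        // just makes the group-by-group ordering visible here.
        cudaDeviceSynchronize();
    }
}

The only sequential constraint in this sketch is between groups: by construction no node depends on another node in its own group, so each thread can read its predecessors' activations and write its own without synchronization.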
Acknowledgements
This material is based in part upon work supported by the National Science Foundation under grant number IIA-1301726. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Gajurel, A., Louis, S.J., Wu, R., Barford, L., Harris, F.C. (2021). GPU Acceleration of Sparse Neural Networks. In: Latifi, S. (eds) ITNG 2021 18th International Conference on Information Technology-New Generations. Advances in Intelligent Systems and Computing, vol 1346. Springer, Cham. https://doi.org/10.1007/978-3-030-70416-2_41
DOI: https://doi.org/10.1007/978-3-030-70416-2_41
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-70415-5
Online ISBN: 978-3-030-70416-2
eBook Packages: Engineering (R0)