A Flexible Sparsity-Aware Accelerator with High Sensitivity and Efficient Operation for Convolutional Neural Networks


Abstract

Convolutional neural networks involve a large amount of computation, much of it caused by information redundancy in the inter-layer activations. To address this challenge, a flexible sparsity-aware accelerator is proposed in this paper. It handles basic data transmission with coarse-grained control and sparse data transmission with fine-grained control, and a corresponding data arrangement scheme is designed to make full use of the off-chip bandwidth. To improve inference performance without any loss of accuracy, the sparsity perceptron module compresses the sparse activations, eliminating ineffectual (zero-valued) activations while preserving their topology information. To improve power efficiency, the computational load is rationally allocated across the multiply-accumulate array, and the convolution operation is decoupled by an adder tree with FIFOs. The accelerator is implemented on a Xilinx VCU108 FPGA, where 97.27% of the operations performed involve non-zero activations. Running in sparse mode, the accelerator is more than 2.5 times faster than in dense mode, and power consumption is reduced to 8.3 W. Furthermore, this flexible sparsity-aware architecture can be widely applied to large-scale deep convolutional neural networks.
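The compression idea summarized above is easiest to see with a small example. The sketch below (Python, purely illustrative, and not the paper's actual sparsity perceptron logic or data format) drops zero-valued activations and keeps a one-bit position mask, so the non-zero values can later be restored to their original locations; in this way the topology information is preserved while ineffectual operations are skipped.

# Minimal sketch (illustrative only): compress an activation tile by dropping
# zeros and keeping a 1-bit position mask so each non-zero value can be
# restored to its original location (the "topology" of the tile).

from typing import List, Tuple

def compress_activations(tile: List[int]) -> Tuple[List[int], List[int]]:
    """Return (non-zero values, bitmap) for one activation tile."""
    values = [a for a in tile if a != 0]
    bitmap = [1 if a != 0 else 0 for a in tile]
    return values, bitmap

def decompress_activations(values: List[int], bitmap: List[int]) -> List[int]:
    """Rebuild the dense tile; each set bit consumes the next stored value."""
    it = iter(values)
    return [next(it) if bit else 0 for bit in bitmap]

if __name__ == "__main__":
    tile = [0, 3, 0, 0, 7, 1, 0, 0]   # post-ReLU activations are often zero
    values, bitmap = compress_activations(tile)
    assert decompress_activations(values, bitmap) == tile
    # Only positions with set bits would be issued as multiply-accumulate
    # work, so zero activations contribute no computation.
    print(values, bitmap)

In an accelerator of this kind, such a mask could be generated on the fly as activations leave the ReLU stage, with only the values flagged by set bits streamed to the multiply-accumulate array; the exact format used in the paper may differ.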


Data Availability

The datasets generated and/or analyzed during the current study are available from the corresponding author on reasonable request.


Acknowledgements

This research work was supported by the Natural Science Foundation of Beijing Municipality (4172010), the National Natural Science Foundation of China (61001049), and the Beijing Innovation Center for Future Chip (KYJJ2018009).

Author information

Corresponding author

Correspondence to Haiying Yuan.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Yuan, H., Zeng, Z., Cheng, J. et al. A Flexible Sparsity-Aware Accelerator with High Sensitivity and Efficient Operation for Convolutional Neural Networks. Circuits Syst Signal Process 41, 4370–4389 (2022). https://doi.org/10.1007/s00034-022-01992-x

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00034-022-01992-x

Keywords

Navigation