A Flexible Sparsity-Aware Accelerator with High Sensitivity and Efficient Operation for Convolutional Neural Networks


Abstract

Convolutional neural networks involve a large amount of computation, much of it caused by information redundancy in the inter-layer activations. To address this challenge, a flexible sparsity-aware accelerator is proposed in this paper. It handles basic data transmission with coarse-grained control and sparse data transmission with fine-grained control, and a corresponding data arrangement scheme is designed to make full use of the off-chip bandwidth. To improve inference performance without any loss of accuracy, the sparsity perceptron module compresses the sparse activations, eliminating ineffectual (zero-valued) activations while preserving their topology information. To improve power efficiency, the computational load is rationally allocated across the multiply-accumulate array, and the convolution operation is decoupled by an adder tree with FIFOs. The accelerator is implemented on a Xilinx VCU108 FPGA, where 97.27% of the operations performed involve non-zero activations. Running in sparse mode, the accelerator is more than 2.5 times faster than in dense mode, and power consumption is reduced to 8.3 W. Furthermore, this flexible sparsity-aware architecture can be widely applied to large-scale deep convolutional neural networks.
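The compression idea summarized above is easiest to see with a small example. The sketch below (Python, purely illustrative, and not the paper's actual sparsity perceptron logic or data format) drops zero-valued activations and keeps a one-bit position mask, so the non-zero values can later be restored to their original locations; in this way the topology information is preserved while ineffectual operations are skipped.

# Minimal sketch (illustrative only): compress an activation tile by dropping
# zeros and keeping a 1-bit position mask so each non-zero value can be
# restored to its original location (the "topology" of the tile).

from typing import List, Tuple

def compress_activations(tile: List[int]) -> Tuple[List[int], List[int]]:
    """Return (non-zero values, bitmap) for one activation tile."""
    values = [a for a in tile if a != 0]
    bitmap = [1 if a != 0 else 0 for a in tile]
    return values, bitmap

def decompress_activations(values: List[int], bitmap: List[int]) -> List[int]:
    """Rebuild the dense tile; each set bit consumes the next stored value."""
    it = iter(values)
    return [next(it) if bit else 0 for bit in bitmap]

if __name__ == "__main__":
    tile = [0, 3, 0, 0, 7, 1, 0, 0]   # post-ReLU activations are often zero
    values, bitmap = compress_activations(tile)
    assert decompress_activations(values, bitmap) == tile
    # Only positions with set bits would be issued as multiply-accumulate
    # work, so zero activations contribute no computation.
    print(values, bitmap)

In an accelerator of this kind, such a mask could be generated on the fly as activations leave the ReLU stage, with only the values flagged by set bits streamed to the multiply-accumulate array; the exact format used in the paper may differ.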


Data Availability

The datasets generated and/or analyzed during the current study are available from the corresponding author on reasonable request.


Acknowledgements

This research work was supported by the Natural Science Foundation of Beijing Municipality (4172010), the National Natural Science Foundation of China (61001049), and the Beijing Innovation Center for Future Chip (KYJJ2018009).

Author information

Corresponding author

Correspondence to Haiying Yuan.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Yuan, H., Zeng, Z., Cheng, J. et al. A Flexible Sparsity-Aware Accelerator with High Sensitivity and Efficient Operation for Convolutional Neural Networks. Circuits Syst Signal Process 41, 4370–4389 (2022). https://doi.org/10.1007/s00034-022-01992-x

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00034-022-01992-x

Keywords

Navigation