
Reconfigurable spatial-parallel stochastic computing for accelerating sparse convolutional neural networks

  • Research Paper
  • Published in Science China Information Sciences

Abstract

Edge devices play an increasingly important role in convolutional neural network (CNN) inference. However, the large computation and storage requirements of CNNs are challenging for resource- and power-constrained hardware. These limitations might be overcome by exploiting (a) error tolerance via approximate computing, such as stochastic computing (SC), and (b) data sparsity, including weight and activation sparsity. Although SC can perform complex calculations with compact and simple arithmetic circuits, traditional SC-based accelerators suffer from low reconfigurability and long bitstreams, which further makes it difficult to benefit from data sparsity. In this paper, we propose spatial-parallel stochastic computing (SPSC), which maximizes the spatial parallelism of the SC-based multiplier while consuming fewer logic gates than a fixed-point implementation. Moreover, we present SPA, a highly reconfigurable SPSC-based sparse CNN accelerator with the proposed hybrid zero-skipping scheme (HZSS), which efficiently applies different zero-skipping strategies to different types of layers. Comprehensive experiments show that SPA outperforms several existing binary-weight accelerators, SC-based accelerators, and a sparse CNN accelerator in energy efficiency, achieving up to 2477.6 Gops/W.
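
For readers unfamiliar with stochastic computing, the following minimal Python sketch illustrates the two ideas the abstract relies on: in unipolar SC, a value in [0, 1] is encoded as the probability of 1s in a random bitstream, so multiplication reduces to a bitwise AND, and a zero operand makes the product exactly zero, which is what zero-skipping exploits. This is not the paper's design; the function names and the 256-bit stream length are illustrative assumptions.

```python
import random

def to_bitstream(value, length=256, rng=random):
    """Encode a value in [0, 1] as a unipolar stochastic bitstream,
    where each bit is 1 with probability `value`."""
    return [1 if rng.random() < value else 0 for _ in range(length)]

def sc_multiply(stream_a, stream_b):
    """Unipolar SC multiplication: one AND gate per pair of bits."""
    return [a & b for a, b in zip(stream_a, stream_b)]

def from_bitstream(stream):
    """Decode a bitstream back to a value: the fraction of 1s."""
    return sum(stream) / len(stream)

def sc_multiply_skip(value_a, value_b, length=256):
    """Zero-skipping (the idea behind HZSS): a zero operand makes the
    product exactly zero, so the bitstream work can be skipped."""
    if value_a == 0 or value_b == 0:
        return [0] * length
    return sc_multiply(to_bitstream(value_a, length), to_bitstream(value_b, length))

# Example: approximate 0.75 * 0.5 = 0.375 with 256-bit streams.
x = to_bitstream(0.75)
w = to_bitstream(0.5)
print(from_bitstream(sc_multiply(x, w)))  # close to 0.375, with stochastic error
```

Because accuracy improves only with longer streams, serial SC designs pay a latency and energy cost per bit; this is the long-bitstream limitation that the spatial parallelism of SPSC targets.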

Acknowledgements

This work was supported by the National Key Research and Development Program (Grant No. 2020YFB2-205500).

Author information

Correspondence to Jienan Chen.

About this article

Cite this article

Xia, Z., Wan, R., Chen, J. et al. Reconfigurable spatial-parallel stochastic computing for accelerating sparse convolutional neural networks. Sci. China Inf. Sci. 66, 162404 (2023). https://doi.org/10.1007/s11432-021-3519-1
