
Reconfigurable spatial-parallel stochastic computing for accelerating sparse convolutional neural networks

  • Research Paper
  • Published in Science China Information Sciences

Abstract

Edge devices play an increasingly important role in convolutional neural network (CNN) inference. However, the large computation and storage requirements of CNNs are challenging for resource- and power-constrained hardware. These limitations might be overcome by exploiting (a) error tolerance via approximate computing, such as stochastic computing (SC), and (b) data sparsity, including weight and activation sparsity. Although SC can perform complex calculations with compact and simple arithmetic circuits, traditional SC-based accelerators suffer from low reconfigurability and long bitstreams, which further makes it difficult to benefit from data sparsity. In this paper, we propose spatial-parallel stochastic computing (SPSC), which maximizes the spatial parallelism of the SC-based multiplier while consuming fewer logic gates than a fixed-point implementation. Moreover, we present SPA, a highly reconfigurable SPSC-based sparse CNN accelerator with the proposed hybrid zero-skipping scheme (HZSS), which efficiently applies different zero-skipping strategies to different types of layers. Comprehensive experiments show that SPA outperforms several existing binary-weight accelerators, SC-based accelerators, and a sparse CNN accelerator in energy efficiency, achieving up to 2477.6 Gops/W.
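
For readers unfamiliar with stochastic computing, the following minimal Python sketch illustrates the two ideas the abstract relies on: in unipolar SC, a value in [0, 1] is encoded as the probability of 1s in a random bitstream, so multiplication reduces to a bitwise AND, and a zero operand makes the product exactly zero, which is what zero-skipping exploits. This is not the paper's design; the function names and the 256-bit stream length are illustrative assumptions.

```python
import random

def to_bitstream(value, length=256, rng=random):
    """Encode a value in [0, 1] as a unipolar stochastic bitstream,
    where each bit is 1 with probability `value`."""
    return [1 if rng.random() < value else 0 for _ in range(length)]

def sc_multiply(stream_a, stream_b):
    """Unipolar SC multiplication: one AND gate per pair of bits."""
    return [a & b for a, b in zip(stream_a, stream_b)]

def from_bitstream(stream):
    """Decode a bitstream back to a value: the fraction of 1s."""
    return sum(stream) / len(stream)

def sc_multiply_skip(value_a, value_b, length=256):
    """Zero-skipping (the idea behind HZSS): a zero operand makes the
    product exactly zero, so the bitstream work can be skipped."""
    if value_a == 0 or value_b == 0:
        return [0] * length
    return sc_multiply(to_bitstream(value_a, length), to_bitstream(value_b, length))

# Example: approximate 0.75 * 0.5 = 0.375 with 256-bit streams.
x = to_bitstream(0.75)
w = to_bitstream(0.5)
print(from_bitstream(sc_multiply(x, w)))  # close to 0.375, with stochastic error
```

Because accuracy improves only with longer streams, serial SC designs pay a latency and energy cost per bit; this is the long-bitstream limitation that the spatial parallelism of SPSC targets.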

Acknowledgements

This work was supported by the National Key Research and Development Program (Grant No. 2020YFB2-205500).

Author information

Correspondence to Jienan Chen.

About this article

Cite this article

Xia, Z., Wan, R., Chen, J. et al. Reconfigurable spatial-parallel stochastic computing for accelerating sparse convolutional neural networks. Sci. China Inf. Sci. 66, 162404 (2023). https://doi.org/10.1007/s11432-021-3519-1
