Accelerating Inference on Binary Neural Networks with Digital RRAM Processing

  • Conference paper
Part of the book series: IFIP Advances in Information and Communication Technology (IFIPAICT, volume 586)

Abstract

The need for efficient Convolutional Neural Networks (CNNs) targeting embedded systems has led to the popularization of Binary Neural Networks (BNNs), which significantly reduce execution time and memory requirements by representing operands with a single bit. Moreover, since roughly 90% of the operations executed by CNNs and BNNs are convolutions, a quest has begun for custom accelerators that optimize the convolution operation and reduce data movement, among which Resistive Random Access Memory (RRAM)-based accelerators have proven to be of particular interest. This work presents a custom Binary Dot Product Engine (BDPE) for BNNs that exploits the low-level compute capabilities enabled by RRAMs. The new engine accelerates the inference phase of BNNs by locally storing the most frequently used kernels and performing the binary convolutions using RRAM devices and optimized custom circuitry. Results show that the novel BDPE improves performance by 11.3% and energy efficiency by 7.4%, and reduces the number of memory accesses by 10.7%, at a cost of less than 0.3% additional die area.
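With one-bit operands, the core of a BNN convolution reduces to a bitwise dot product that can be computed with XNOR and population count. The sketch below illustrates that arithmetic in software; it is not the paper's BDPE circuit, and the function name and bit-packing convention are illustrative assumptions.

```python
def binary_dot(a_bits: int, b_bits: int, n: int) -> int:
    """Dot product of two n-element {-1, +1} vectors packed as bitmasks,
    where bit i = 1 encodes element i = +1 and bit i = 0 encodes -1.

    XNOR marks the positions where the operands agree, so the signed dot
    product is (#agreements) - (#disagreements) = 2 * popcount(xnor) - n.
    """
    xnor = ~(a_bits ^ b_bits) & ((1 << n) - 1)  # agreement mask, n bits wide
    return 2 * bin(xnor).count("1") - n

# Example: a = [+1, -1, +1, +1] -> 0b1101, b = [+1, +1, -1, +1] -> 0b1011
# Elementwise products are +1, -1, -1, +1, so the dot product is 0.
print(binary_dot(0b1101, 0b1011, 4))  # prints 0
```

This XNOR-popcount formulation is why one-bit representations cut both storage and compute: a hardware engine can evaluate many such products in parallel with simple logic instead of multipliers.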

J. Vieira—At the time of this work, João Vieira was affiliated with the University of Utah.



Acknowledgements

This work was primarily supported by grant 2016016 from the United States-Israel Binational Science Foundation.

Other supporting grants are SFRH/BD/144047/2019 from Fundação para a Ciência e a Tecnologia (FCT), Portugal; ERC Consolidator Grant COMPUSAPIEN (GA No. 725657); ERC starting grant Real-PIM-System (GA No. 757259); and EC H2020 EUROLAB4HPC2 project (GA No. 800962).

Author information

Correspondence to João Vieira.

Copyright information

© 2020 IFIP International Federation for Information Processing

About this paper

Cite this paper

Vieira, J. et al. (2020). Accelerating Inference on Binary Neural Networks with Digital RRAM Processing. In: Metzler, C., Gaillardon, PE., De Micheli, G., Silva-Cardenas, C., Reis, R. (eds) VLSI-SoC: New Technology Enabler. VLSI-SoC 2019. IFIP Advances in Information and Communication Technology, vol 586. Springer, Cham. https://doi.org/10.1007/978-3-030-53273-4_12

  • DOI: https://doi.org/10.1007/978-3-030-53273-4_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-53272-7

  • Online ISBN: 978-3-030-53273-4
