
Journal of Signal Processing Systems, Volume 90, Issue 10, pp 1439–1451

A Rank Decomposed Statistical Error Compensation Technique for Robust Convolutional Neural Networks in the Near Threshold Voltage Regime

  • Yingyan Lin
  • Sai Zhang
  • Naresh R. Shanbhag

Abstract

There has been growing interest in implementing complex machine learning algorithms such as convolutional neural networks (CNNs) on low-power embedded platforms to enable on-device learning and inference. Many of these platforms are to be deployed as low-power sensor nodes with low to medium throughput requirements. Near threshold voltage (NTV) designs are well suited to these applications but suffer from a significant increase in variations. In this paper, we propose a variation-tolerant architecture for CNNs capable of operating in the NTV regime for energy efficiency. The architecture relies on a statistical error compensation (SEC) technique referred to as rank decomposed SEC (RD-SEC). The key idea is to exploit the inherent redundancy within matrix-vector multiplication (or dot product ensemble), a power-hungry operation in CNNs, to derive low-cost estimators for error detection and compensation. When evaluated in CNNs for both the MNIST and CIFAR-10 datasets, simulation results in 45 nm CMOS show that RD-SEC enables robust CNNs operating in the NTV regime. Specifically, the proposed architecture can achieve up to an 11× improvement in variation tolerance and enable up to a 113× reduction in the standard deviation of detection accuracy P_det while incurring marginal degradation in the median detection accuracy.
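The abstract does not spell out how the estimators are derived, so the following is only a minimal sketch of the general idea, not the authors' architecture. It assumes the weight matrix W is approximately the product of a small coefficient matrix C and a few basis rows B, that the dot products with B are computed reliably, and that large errors appear in the main computation. The function name `rd_sec_sketch`, the matrices `C` and `B`, the injected error, and the threshold value are all hypothetical choices made for illustration.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(M, x):
    return [dot(row, x) for row in M]

def rd_sec_sketch(y_noisy, C, B, x, threshold):
    """Flag and replace outputs that deviate strongly from a low-rank estimate.

    Assumption: W is close to C @ B, so C @ (B @ x) approximates W @ x using
    r*n + m*r multiplications instead of m*n.
    """
    z = matvec(B, x)       # r dot products, assumed error-free
    y_est = matvec(C, z)   # low-cost estimates of all m outputs
    corrected, flagged = [], []
    for i, (y, est) in enumerate(zip(y_noisy, y_est)):
        if abs(y - est) > threshold:   # error detection
            corrected.append(est)      # error compensation
            flagged.append(i)
        else:
            corrected.append(y)
    return corrected, flagged

rng = random.Random(0)
m, n, r = 8, 16, 2
B = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(r)]
C = [[rng.uniform(-1, 1) for _ in range(r)] for _ in range(m)]
# W equals C @ B plus a small residual, i.e. it is only approximately rank r.
W = [[sum(C[i][k] * B[k][j] for k in range(r)) + rng.uniform(-0.01, 0.01)
      for j in range(n)] for i in range(m)]
x = [rng.uniform(-1, 1) for _ in range(n)]

y_exact = matvec(W, x)
y_noisy = list(y_exact)
y_noisy[3] += 50.0   # hypothetical large error injected into one output

corrected, flagged = rd_sec_sketch(y_noisy, C, B, x, threshold=1.0)
print(flagged)
```

In this toy setup the estimate differs from the exact output only by the small residual term, so the threshold separates the injected error from normal estimation noise. A real design would have to choose the rank and the threshold from the error statistics of the hardware, which this sketch does not model.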

Keywords

Convolutional neural networks; Statistical error compensation; Rank decomposition; Near threshold voltage regime

Notes

Acknowledgements

This work was supported in part by Systems on Nanoscale Information fabriCs (SONIC), one of the six SRC STARnet Centers, sponsored by MARCO and DARPA.


Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, Urbana, USA
  2. Apple Inc., Cupertino, USA
