# A Rank Decomposed Statistical Error Compensation Technique for Robust Convolutional Neural Networks in the Near Threshold Voltage Regime


## Abstract

There is growing interest in implementing complex machine learning algorithms such as convolutional neural networks (CNNs) on low-power embedded platforms to enable on-device learning and inference. Many of these platforms are to be deployed as low-power sensor nodes with low to medium throughput requirements. Near threshold voltage (NTV) designs are well suited to these applications but suffer from a significant increase in variations. In this paper, we propose a variation-tolerant architecture for CNNs capable of operating in the NTV regime for energy efficiency. A statistical error compensation (SEC) technique referred to as rank decomposed SEC (RD-SEC) is proposed. The key idea is to exploit inherent redundancy within matrix-vector multiplication (or dot product ensemble), a power-hungry operation in CNNs, to derive low-cost estimators for error detection and compensation. When evaluated in CNNs for both the MNIST and CIFAR-10 datasets, simulation results in 45 nm CMOS show that RD-SEC enables robust CNNs operating in the NTV regime. Specifically, the proposed architecture can achieve up to 11× improvement in variation tolerance and enable up to 113× reduction in the standard deviation of detection accuracy $P_{det}$ while incurring marginal degradation in the median detection accuracy.
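The idea of using redundancy in a matrix-vector product to build a cheap estimator can be illustrated with a minimal NumPy sketch. This is not the RD-SEC architecture itself, whose details are not given in the abstract. It assumes a weight matrix that is exactly low rank, uses a truncated SVD as a stand-in for the rank decomposition, and models hardware errors as large additive offsets on a few outputs. The function names `low_rank_factors` and `compensate` and the threshold `tau` are hypothetical.

```python
import numpy as np

def low_rank_factors(W, rank):
    """Factor W (m x n) into A (m x rank) and B (rank x n) via truncated SVD.
    Assumption: the SVD stands in for the paper's rank decomposition."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]   # scale the kept left singular vectors
    B = Vt[:rank, :]
    return A, B

def compensate(y_noisy, x, A, B, tau):
    """Detect and correct errors in y_noisy using the low-cost estimate A @ (B @ x).
    An output is flagged when it deviates from the estimate by more than tau
    (a hypothetical threshold) and is then replaced by the estimate."""
    y_est = A @ (B @ x)          # costs rank*(m+n) multiplies instead of m*n
    flagged = np.abs(y_noisy - y_est) > tau
    return np.where(flagged, y_est, y_noisy), flagged

# Toy setup: a 16 x 32 weight matrix that is exactly rank 4.
rng = np.random.default_rng(0)
m, n, r = 16, 32, 4
W = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))
x = rng.standard_normal(n)

y_true = W @ x
y_noisy = y_true.copy()
y_noisy[[3, 10]] += 50.0         # inject two large errors (assumed error model)

A, B = low_rank_factors(W, r)
y_fixed, flagged = compensate(y_noisy, x, A, B, tau=1.0)
```

In this toy setting the estimator needs 4 × (16 + 32) = 192 multiplications against 512 for the full product. With real CNN weights the matrix is only approximately low rank, so the estimate is coarser and the threshold has to trade missed detections against false alarms.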

## Keywords

Convolutional neural networks · Statistical error compensation · Rank decomposition · Near threshold voltage regime

## Notes

### Acknowledgements

This work was supported in part by Systems on Nanoscale Information fabriCs (SONIC), one of the six SRC STARnet Centers, sponsored by MARCO and DARPA.
