Thanks for Nothing: Predicting Zero-Valued Activations with Lightweight Convolutional Neural Networks

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12355)

Abstract

Convolutional neural networks (CNNs) deliver state-of-the-art results for various tasks at the price of high computational demands. Inspired by the observation that spatial correlation exists in CNN output feature maps (ofms), we propose a method to dynamically predict whether ofm activations are zero-valued or not according to their neighboring activation values, thereby skipping the computation of predicted zero-valued activations and reducing the number of convolution operations. We implement the zero activation predictor (ZAP) with a lightweight CNN, which imposes negligible overhead and is easy to deploy on existing models. ZAPs are trained by mimicking hidden layer outputs, thereby enabling parallel and label-free training. Furthermore, without retraining, each ZAP can be tuned to a different operating point, trading accuracy for multiply-accumulate (MAC) reduction.
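To make the mechanism concrete, the following is a minimal PyTorch-style sketch of the idea described in the abstract: a cheap side network is trained, without dataset labels, to reproduce the zero/non-zero pattern of a target layer's post-ReLU output, and at inference its predictions gate that layer's activations, with a threshold acting as the tunable operating point. All names here (ZeroActivationPredictor, zap_training_step, threshold) are illustrative assumptions rather than the authors' implementation; for simplicity the predictor reads the layer input instead of neighboring ofm activations, and the dense convolution is still executed, so the sketch shows the training and gating logic rather than actual MAC savings, which require sparsity-aware execution.

```python
# Hypothetical sketch of a zero-activation predictor (ZAP); names and
# hyperparameters are illustrative assumptions, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ZeroActivationPredictor(nn.Module):
    """Lightweight CNN that guesses which ofm pixels will be zero after ReLU."""

    def __init__(self, in_channels: int, out_channels: int):
        super().__init__()
        # A single cheap 3x3 convolution; far fewer MACs than the target layer.
        self.predict = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Logits > 0 mean "activation predicted non-zero".
        return self.predict(x)


def zap_training_step(zap, target_layer, x, optimizer):
    """Label-free training: the target layer's own output provides the labels."""
    with torch.no_grad():
        ofm = F.relu(target_layer(x))      # teacher signal from the hidden layer
        labels = (ofm > 0).float()         # 1 where the activation is non-zero
    logits = zap(x)
    loss = F.binary_cross_entropy_with_logits(logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


def gated_inference(zap, target_layer, x, threshold=0.5):
    """Zero out activations the ZAP predicts to be zero.

    Raising `threshold` skips more activations at some accuracy cost;
    lowering it is more conservative. The dense convolution is kept here
    only to keep the example self-contained.
    """
    mask = (torch.sigmoid(zap(x)) > threshold).float()
    return F.relu(target_layer(x)) * mask


if __name__ == "__main__":
    layer = nn.Conv2d(64, 128, kernel_size=3, padding=1)
    zap = ZeroActivationPredictor(64, 128)
    opt = torch.optim.Adam(zap.parameters(), lr=1e-3)
    x = torch.randn(8, 64, 32, 32)
    print("loss:", zap_training_step(zap, layer, x, opt))
    print("gated ofm shape:", gated_inference(zap, layer, x).shape)
```

Because the ZAP's targets come from the frozen target layer itself, each ZAP can be trained independently of the others and of the task labels, which is what allows the parallel, label-free training and the post-training choice of operating point mentioned above.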

Keywords

Convolutional neural networks · Dynamic pruning

Notes

Acknowledgments

We acknowledge the support of NVIDIA Corporation for its donation of a Titan V GPU used for this research.

Supplementary material

Supplementary material 1 (PDF, 329 KB)

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Faculty of Electrical Engineering, Technion, Haifa, Israel
  2. Habana Labs (an Intel company), Caesarea, Israel
