
Convolution without multiplication: A general speed up strategy for CNNs

  • Article
  • Science China Technological Sciences

Abstract

Convolutional neural networks (CNNs) have achieved great success in many computer vision tasks. However, it is difficult to deploy CNN models on low-cost devices with limited power budgets, because most existing CNN models are computationally expensive. CNN model compression and acceleration have therefore become a hot research topic in the deep learning area. Typical schemes for speeding up the feed-forward process with a slight accuracy loss include parameter pruning and sharing, low-rank factorization, compact convolutional filters, and knowledge distillation. In this study, we propose a general acceleration scheme in which floating-point multiplication is replaced by integer addition. The motivation is that every floating-point number can be represented as the summation of an exponential series; therefore, the multiplication between two floating-point numbers can be converted into additions among the exponentials. In the experiments, we directly apply the proposed scheme to AlexNet, VGG, and ResNet for image classification, and to Faster R-CNN for object detection. The results on ImageNet and PASCAL VOC show that the proposed quantized scheme performs well even when only one exponential term is retained. Moreover, we analyze the efficiency of our method on mainstream FPGAs. The experimental results show that the proposed quantized scheme achieves acceleration on FPGA with only a slight accuracy loss.
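The core idea above is that a floating-point weight can be written as a short sum of signed exponential (power-of-two) terms, so multiplying an activation by that weight reduces to adding exponents, i.e., bit shifts and sign flips followed by accumulation, with no floating-point multiplication by the weight. The snippet below is a minimal sketch of that reading of the abstract, not the authors' implementation: the helper names to_pow2_terms and approx_multiply, the greedy decomposition, and the single-term default are assumptions made here for illustration.

```python
import numpy as np

def to_pow2_terms(x, num_terms=1):
    """Greedily approximate x as a signed sum of powers of two:
    x ~ sum_i s_i * 2**e_i. Returns a list of (sign, exponent) pairs.
    (Hypothetical helper; the paper's exact decomposition may differ.)"""
    terms, residual = [], float(x)
    for _ in range(num_terms):
        if residual == 0.0:
            break
        sign = 1.0 if residual > 0 else -1.0
        exp = int(np.floor(np.log2(abs(residual))))
        terms.append((sign, exp))
        residual -= sign * 2.0 ** exp
    return terms

def approx_multiply(weight, activation, num_terms=1):
    """Approximate weight * activation without multiplying by the weight.
    With weight ~ sum_i s_i * 2**e_i, each partial product is the activation
    shifted by e_i (an addition of exponents), and the partials are summed."""
    mantissa, exp_a = np.frexp(activation)      # activation = mantissa * 2**exp_a
    total = 0.0
    for sign, exp_w in to_pow2_terms(weight, num_terms):
        # ldexp adds the exponents: mantissa * 2**(exp_a + exp_w) -- a shift, not a multiply
        partial = np.ldexp(mantissa, int(exp_a) + exp_w)
        total += partial if sign > 0 else -partial
    return total

# Toy check: with a single exponential term the product is a coarse approximation.
w, a = 0.37, 2.5
print(approx_multiply(w, a, num_terms=1), w * a)   # ~0.625 vs 0.925
print(approx_multiply(w, a, num_terms=3), w * a)   # closer with more terms
```

With num_terms=1 the product is a coarse approximation, mirroring the abstract's "only one exponential term" setting; more terms tighten the approximation at the cost of additional shift-and-add operations, which are cheap on FPGA hardware.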

Author information

Correspondence to Bin Huang or SongZhi Su.

Additional information

This work was supported by the National Natural Science Foundation of China (Grant Nos. 41971424, 61701191), the Key Technical Project of Xiamen Ocean Bureau (Grant No. 18CZB033HJ11), the Natural Science Foundation of Fujian Province (Grant Nos. 2019J01712, 2020J01701), the Key Technical Project of Xiamen Science and Technology Bureau (Grant Nos. 3502Z20191018, 3502Z20201007, 3502Z20191022, 3502Z20203057), and the Science and Technology Project of Education Department of Fujian Province (Grant Nos. JAT190321, JAT190318, JAT190315).

About this article

Cite this article

Cai, G., Yang, S., Du, J. et al. Convolution without multiplication: A general speed up strategy for CNNs. Sci. China Technol. Sci. 64, 2627–2639 (2021). https://doi.org/10.1007/s11431-021-1936-2
