Training Binary Weight Networks via Semi-Binary Decomposition

  • Qinghao Hu
  • Gang Li
  • Peisong Wang
  • Yifan Zhang
  • Jian Cheng
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11217)

Abstract

Binary weight networks have recently attracted significant attention due to their high computational efficiency and small parameter size, yet they still suffer from large accuracy drops because of their limited representation capacity. In this paper, we propose a novel semi-binary decomposition method that decomposes a matrix into two binary matrices and a diagonal matrix. Since the product of two binary matrices can take more numerical values than a single binary matrix, the proposed semi-binary decomposition has greater representation capacity. We also propose an alternating optimization method that solves the semi-binary decomposition problem while preserving the binary constraints. Extensive experiments on AlexNet, ResNet-18, and ResNet-50 demonstrate that our method outperforms state-of-the-art methods by a large margin (5% higher top-1 accuracy). We also implement binary weight AlexNet on an FPGA platform, showing that our method achieves \(\sim \)9\(\times \) speed-ups while significantly reducing the consumption of on-chip memory and dedicated multipliers.
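
For intuition, the following is a minimal NumPy sketch of the idea described in the abstract: a real matrix W is approximated as B1 · diag(d) · B2 with both factors constrained to binary values, fitted by alternating updates. The function name, the {-1, +1} alphabet, and the specific update rules (a least-squares step for d, sign-of-correlation steps for the binary factors) are illustrative assumptions, not the paper's actual algorithm.

```python
import numpy as np

def semi_binary_decompose(W, k, n_iter=30, seed=0):
    # Approximate W (m x n) as B1 @ diag(d) @ B2, where
    # B1 in {-1,+1}^(m x k), B2 in {-1,+1}^(k x n), and d is real.
    # Generic alternating minimization; NOT the paper's exact updates.
    rng = np.random.default_rng(seed)
    m, n = W.shape
    sgn = lambda x: np.where(x >= 0, 1.0, -1.0)  # sign() without zeros
    B1 = sgn(rng.standard_normal((m, k)))
    B2 = sgn(rng.standard_normal((k, n)))
    d = np.ones(k)
    for _ in range(n_iter):
        # d-step: exact least squares, since
        # W ~ sum_j d_j * outer(B1[:, j], B2[j, :])
        # is linear in d once the binary factors are fixed.
        M = np.stack([np.outer(B1[:, j], B2[j, :]).ravel()
                      for j in range(k)], axis=1)
        d, *_ = np.linalg.lstsq(M, W.ravel(), rcond=None)
        # Binary steps: refit one rank-1 term at a time against the
        # residual of the others; under a {-1,+1} constraint the
        # entrywise optimum is the sign of a correlation.
        for j in range(k):
            R = W - B1 @ np.diag(d) @ B2 + d[j] * np.outer(B1[:, j], B2[j, :])
            B1[:, j] = sgn(d[j] * (R @ B2[j, :]))
            B2[j, :] = sgn(d[j] * (B1[:, j] @ R))
    return B1, d, B2

# Usage: k = 48 semi-binary terms approximating a random 64 x 32 matrix.
W = np.random.default_rng(1).standard_normal((64, 32))
B1, d, B2 = semi_binary_decompose(W, k=48)
err = np.linalg.norm(W - B1 @ np.diag(d) @ B2) / np.linalg.norm(W)
print(f"relative reconstruction error: {err:.3f}")
```

Each step above is a coordinatewise minimizer of the Frobenius reconstruction error, so the objective is non-increasing across iterations; the paper's alternating optimization addresses the same constrained problem, though with its own update rules.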

Keywords

Deep neural networks · Binary weight networks · Deep network acceleration and compression

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (Nos. 61332016 and 61572500), the Scientific Research Key Program of Beijing Municipal Commission of Education (KZ201610005012), and the Strategic Priority Research Program of the Chinese Academy of Sciences (Grant No. XDBS01000000).

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China
  2. University of Chinese Academy of Sciences, Beijing, China
  3. Center for Excellence in Brain Science and Intelligence Technology, Beijing, China
