Abstract
This paper studies binary neural networks (BNNs), in which both weights and activations are binarized to 1-bit values, greatly reducing memory usage and computational complexity. Because modern deep neural networks adopt sophisticated architectures in pursuit of accuracy, the distributions of weights and activations vary widely across layers, and the conventional sign function cannot effectively binarize such diverse full-precision values. To this end, we present a simple yet effective approach called AdaBin that adaptively learns the optimal binary set \(\{b_1, b_2\}\) (\(b_1, b_2 \in \mathbb{R}\)) of weights and activations for each layer, instead of a fixed set (i.e., \(\{-1, +1\}\)). The proposed method can thus better fit different distributions and increase the representation ability of binarized features. In practice, we parameterize the 1-bit values by their center position and distance, which defines a new binary quantization function. For the weights, we propose an equalization method that aligns the symmetric center of the binary distribution with that of the real-valued distribution and minimizes the Kullback-Leibler divergence between them. For the activations, we introduce a gradient-based optimization method that learns these two parameters jointly with the network in an end-to-end manner. Experimental results on benchmark models and datasets demonstrate that the proposed AdaBin achieves state-of-the-art performance. For instance, we obtain 66.4% Top-1 accuracy on ImageNet with the ResNet-18 architecture, and 69.4 mAP on PASCAL VOC with SSD300.
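To make the quantization function concrete, the following is a minimal PyTorch-style sketch of the idea described above: each layer maps values to an adaptive binary set \(\{b_1, b_2\} = \{\beta - \alpha, \beta + \alpha\}\) rather than the fixed \(\{-1, +1\}\). The names (`ste_sign`, `binarize_weights`, `AdaBinActivation`), the clipped straight-through gradient, and the use of the mean and standard deviation as the weight center and distance are illustrative assumptions for this sketch, not the authors' released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def ste_sign(z: torch.Tensor) -> torch.Tensor:
    """Hard sign in the forward pass; clipped-identity (straight-through)
    gradient in the backward pass."""
    z_clip = z.clamp(-1.0, 1.0)
    hard = torch.where(z >= 0, torch.ones_like(z), -torch.ones_like(z))
    return (hard - z_clip).detach() + z_clip

def binarize_weights(w: torch.Tensor) -> torch.Tensor:
    """Analytic weight equalization: align the symmetric center of the binary
    set to the real-valued distribution. Taking the mean as the center and
    the standard deviation as the distance is an assumption of this sketch."""
    beta = w.mean()
    alpha = (w - beta).pow(2).mean().sqrt()
    # Output values lie in {beta - alpha, beta + alpha}.
    return beta + alpha * ste_sign(w - beta)

class AdaBinActivation(nn.Module):
    """Per-layer learnable center/distance for activations, trained end-to-end."""
    def __init__(self):
        super().__init__()
        self.beta = nn.Parameter(torch.zeros(1))   # center, (b1 + b2) / 2
        self.alpha = nn.Parameter(torch.ones(1))   # half-distance, (b2 - b1) / 2

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Quantize to the adaptive binary set {beta - alpha, beta + alpha}.
        return self.beta + self.alpha * ste_sign(x - self.beta)

# Per-layer demo: weights are equalized analytically, while activations learn
# their own binary set jointly with the rest of the network.
x = torch.randn(8, 16)
w = torch.randn(32, 16)
y = F.linear(AdaBinActivation()(x), binarize_weights(w))
```

Note that each binarized tensor is an affine map \(\beta + \alpha \cdot s\) of a \(\pm 1\) sign tensor \(s\), so the two scalars can be folded out of the convolution at inference time and the usual XNOR/popcount kernels still apply.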
Acknowledgments
This work was supported in part by the Key-Area Research and Development Program of Guangdong Province (No. 2019B010153003), the Key Research and Development Program of Shaanxi (No. 2022ZDLGY01-08), and the Fundamental Research Funds for Xi'an Jiaotong University (No. xhj032021005-05).