Network Binarization via Contrastive Learning

  • Conference paper
Computer Vision – ECCV 2022 (ECCV 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13671)

Abstract

Neural network binarization accelerates deep models by quantizing their weights and activations to 1 bit. However, a large performance gap remains between Binary Neural Networks (BNNs) and their full-precision (FP) counterparts. Because earlier work has already reduced the quantization error caused by weight binarization, activation binarization has become the major obstacle to further accuracy improvements. BNNs have a distinctive structure in which the binary and latent FP activations coexist in the same forward pass (i.e. \(\text{Binarize}(\textbf{a}_F) = \textbf{a}_B\)). To mitigate the information degradation caused by binarizing FP activations, we establish a contrastive learning framework for training BNNs through the lens of Mutual Information (MI) maximization. MI serves as the metric for the information shared between the binary and FP activations, and contrastive learning is used to maximize it. Specifically, the representation ability of BNNs is strengthened by pulling together positive pairs, formed by the binary and FP activations of the same input sample, and pushing apart negative pairs drawn from different samples (the number of negative pairs can be exponentially large). This benefits downstream tasks, including not only classification but also segmentation and depth estimation. Experimental results show that our method can be stacked as an add-on module on existing state-of-the-art binarization methods and considerably improves their performance on CIFAR-10/100 and ImageNet, and that it generalizes well on NYUD-v2. The code is available at https://github.com/42Shawn/CMIM.
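
The pairing scheme described in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation (see the repository linked above for that); it assumes sign binarization, a cosine-similarity critic, a temperature of 0.5, and the widely used InfoNCE contrastive loss, and the function names `binarize`, `cosine` and `info_nce` are hypothetical. For a batch of N samples, minimizing this loss L raises the standard InfoNCE lower bound on the mutual information between binary and FP activations, log N - L.

```python
import math


def binarize(a_f):
    """Sign binarization (assumed): map each FP activation to +1 or -1."""
    return [1.0 if x >= 0 else -1.0 for x in a_f]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v + 1e-12)


def info_nce(fp_batch, temperature=0.5):
    """InfoNCE loss over a batch of FP activation vectors.

    Positive pair: (Binarize(a_F[i]), a_F[i]), i.e. the same input sample.
    Negative pairs: (Binarize(a_F[i]), a_F[j]) for j != i.
    """
    bin_batch = [binarize(a) for a in fp_batch]
    n = len(fp_batch)
    total = 0.0
    for i in range(n):
        # Similarity of the i-th binary activation to every FP activation.
        logits = [cosine(bin_batch[i], fp_batch[j]) / temperature
                  for j in range(n)]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Negative log-probability of picking the positive among all pairs.
        total += log_denom - logits[i]
    return total / n


# Toy usage: three 2-dimensional "activations" from three different inputs.
batch = [[1.0, 2.0], [-1.0, -2.0], [1.0, -2.0]]
loss = info_nce(batch)
mi_lower_bound = math.log(len(batch)) - loss
print(f"loss={loss:.4f}, MI lower bound={mi_lower_bound:.4f}")
```

In an actual training loop the loss would be computed on intermediate activations and added to the task loss, with gradients passed through the binarization step by an estimator such as the straight-through estimator; those details are outside the scope of this sketch.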

Acknowledgements

This research was partially supported by NSF CNS-1908658 (ZZ, YY), NeTS-2109982 (YY), Early Career Scheme of the Research Grants Council (RGC) of the Hong Kong SAR under grant No. 26202321 (DX), HKUST Startup Fund No. R9253 (DX) and the gift donation from Cisco (YY). This article solely reflects the opinions and conclusions of its authors and not the funding agents.

Author information

Corresponding author

Correspondence to Yan Yan.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 237 KB)

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Shang, Y., Xu, D., Zong, Z., Nie, L., Yan, Y. (2022). Network Binarization via Contrastive Learning. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13671. Springer, Cham. https://doi.org/10.1007/978-3-031-20083-0_35

  • DOI: https://doi.org/10.1007/978-3-031-20083-0_35

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-20082-3

  • Online ISBN: 978-3-031-20083-0

  • eBook Packages: Computer Science, Computer Science (R0)
