Learning Architectures for Binary Networks

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12357)


Backbone architectures of most binary networks are well-known floating-point (FP) architectures such as the ResNet family. Questioning whether architectures designed for FP networks are also the best choice for binary networks, we propose to search architectures for binary networks (BNAS) by defining a new search space for binary architectures and a novel search objective. Specifically, building on the cell-based search method, we define a new search space of binary layer types, design a new cell template, and rediscover the utility of the Zeroise layer, proposing to use it as a genuine layer type rather than a mere placeholder. The novel search objective diversifies early search to learn better-performing binary architectures. We show that our method searches architectures with stable training curves despite the quantization error inherent in binary networks. Quantitative analyses demonstrate that our searched architectures outperform the architectures used in state-of-the-art binary networks, and outperform or perform on par with state-of-the-art binary networks that employ various techniques other than architectural changes.
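To make the cell-based idea in the abstract concrete, the following is a minimal, hypothetical sketch (not the paper's actual code or API) of how a cell edge might choose among candidate binary layer types, with the Zeroise layer treated as a real, selectable candidate rather than a placeholder. All names (`binary_conv`, `zeroise`, `skip`, `select_op`) and the example weights are illustrative assumptions:

```python
def sign(x):
    """Binarize a value to +1/-1, as in binary (1-bit) networks."""
    return 1.0 if x >= 0 else -1.0

def binary_conv(inputs, weights=(0.3, -0.7, 0.5)):
    """Toy 1-D 'convolution' with sign-binarized inputs and weights
    (XNOR-style multiply-accumulate)."""
    return sum(sign(i) * sign(w) for i, w in zip(inputs, weights))

def zeroise(inputs):
    """Zeroise layer: outputs zero regardless of input. Intuitively, it
    can suppress quantization noise on an edge, which is why it is kept
    as a candidate layer type instead of a mere placeholder."""
    return 0.0

def skip(inputs):
    """Identity (skip) connection: passes the first input through."""
    return inputs[0]

CANDIDATE_OPS = {"binary_conv": binary_conv, "zeroise": zeroise, "skip": skip}

def select_op(scores):
    """Pick the candidate op with the highest architecture score for an
    edge, as in discretizing a cell-based search."""
    best = max(scores, key=scores.get)
    return best, CANDIDATE_OPS[best]

name, op = select_op({"binary_conv": 0.2, "zeroise": 0.5, "skip": 0.3})
print(name, op([1.0, -2.0, 0.5]))  # the Zeroise candidate wins here
```

The point of the sketch is only the structural one: because Zeroise sits in the same candidate set as binary convolution and skip, the search can legitimately place it on an edge when that yields a better binary architecture.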


Binary networks · Backbone architecture · Architecture search



This work was partly supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2019R1C1C1009283), Institute of Information & communications Technology Planning & Evaluation (IITP) grants funded by the Korea government (MSIT) (No. 2019-0-01842, Artificial Intelligence Graduate School Program (GIST), and No. 2019-0-01351, Development of Ultra Low-Power Mobile Deep Learning Semiconductor With Compression/Decompression of Activation/Kernel Data), the “GIST Research Institute (GRI) GIST-CNUH research Collaboration” grant funded by the GIST in 2020, and a study on the “HPC Support” Project, supported by the Ministry of Science and ICT and NIPA.

The authors would like to thank Dr. Mohammad Rastegari for valuable comments and training details of XNOR-Net and Dr. Chunlei Liu and other authors of [25] for sharing their code.

Supplementary material 1 (zip, 2801 KB)


  1. Bender, G., Kindermans, P.J., Zoph, B., Vasudevan, V., Le, Q.: Understanding and simplifying one-shot architecture search. In: ICML (2018)
  2. Bulat, A., Martínez, B., Tzimiropoulos, G.: BATS: binary architecture search. arXiv preprint arXiv:2003.01711 (2020)
  3. Bulat, A., Tzimiropoulos, G.: XNOR-Net++: improved binary neural networks. In: BMVC (2019)
  4. Cai, H., Zhu, L., Han, S.: ProxylessNAS: direct neural architecture search on target task and hardware. In: ICLR (2019)
  5. Chen, X., Xie, L., Wu, J., Tian, Q.: Progressive differentiable architecture search: bridging the depth gap between search and evaluation. In: ICCV, pp. 1294–1303 (2019)
  6. Courbariaux, M., Bengio, Y., David, J.P.: BinaryConnect: training deep neural networks with binary weights during propagations. In: NIPS (2015)
  7. Courbariaux, M., Hubara, I., Soudry, D., El-Yaniv, R., Bengio, Y.: Binarized neural networks: training deep neural networks with weights and activations constrained to +1 or −1. arXiv preprint arXiv:1602.02830 (2016)
  8. Dong, J.-D., Cheng, A.-C., Juan, D.-C., Wei, W., Sun, M.: DPP-Net: device-aware progressive search for Pareto-optimal neural architectures. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11215, pp. 540–555. Springer, Cham (2018)
  9. Dong, X., Yang, Y.: Searching for a robust neural architecture in four GPU hours. In: CVPR (2019)
  10. Gu, J., et al.: Projection convolutional neural networks for 1-bit CNNs via discrete back propagation. In: AAAI (2019)
  11. Gu, J., et al.: Bayesian optimized 1-bit CNNs. In: CVPR (2019)
  12. Han, S., Pool, J., Tran, J., Dally, W.: Learning both weights and connections for efficient neural network. In: NIPS (2015)
  13. Hinton, G., Vinyals, O., Dean, J.: Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531 (2015)
  14. Howard, A.G., et al.: MobileNets: efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861 (2017)
  15. Iandola, F.N., Han, S., Moskewicz, M.W., Ashraf, K., Dally, W.J., Keutzer, K.: SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size. arXiv preprint arXiv:1602.07360 (2016)
  16. Jaderberg, M., Vedaldi, A., Zisserman, A.: Speeding up convolutional neural networks with low rank expansions. arXiv preprint arXiv:1405.3866 (2014)
  17. Jang, E., Gu, S., Poole, B.: Categorical reparameterization with Gumbel-softmax. In: ICLR (2017)
  18. Kim, H., Kim, K., Kim, J., Kim, J.J.: BinaryDuo: reducing gradient mismatch in binary activation network by coupling binary activations. In: ICLR (2020)
  19. Krizhevsky, A.: Learning multiple layers of features from tiny images. Technical report (2009)
  20. Li, F., Zhang, B., Liu, B.: Ternary weight networks. arXiv preprint arXiv:1605.04711 (2016)
  21. Li, L., Talwalkar, A.: Random search and reproducibility for neural architecture search. arXiv preprint arXiv:1902.07638 (2019)
  22. Lin, X., Zhao, C., Pan, W.: Towards accurate binary convolutional neural network. In: NIPS (2017)
  23. Liu, C., et al.: Auto-DeepLab: hierarchical neural architecture search for semantic image segmentation. In: CVPR (2019)
  24. Liu, C., et al.: Progressive neural architecture search. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11205, pp. 19–35. Springer, Cham (2018)
  25. Liu, C., et al.: Circulant binary convolutional networks: enhancing the performance of 1-bit DCNNs with circulant back propagation. In: CVPR (2019)
  26. Liu, H., Simonyan, K., Vinyals, O., Fernando, C., Kavukcuoglu, K.: Hierarchical representations for efficient architecture search. In: ICLR (2018)
  27. Liu, H., Simonyan, K., Yang, Y.: DARTS: differentiable architecture search. In: ICLR (2019)
  28. Liu, Z., Wu, B., Luo, W., Yang, X., Liu, W., Cheng, K.-T.: Bi-Real Net: enhancing the performance of 1-bit CNNs with improved representational capability and advanced training algorithm. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11219, pp. 747–763. Springer, Cham (2018)
  29. Luo, R., Tian, F., Qin, T., Chen, E., Liu, T.Y.: Neural architecture optimization. In: NIPS (2018)
  30. Pham, H., Guan, M., Zoph, B., Le, Q., Dean, J.: Efficient neural architecture search via parameters sharing. In: ICML (2018)
  31. Phan, H., Huynh, D., He, Y., Savvides, M., Shen, Z.: MoBiNet: a mobile binary network for image classification. arXiv preprint arXiv:1907.12629 (2019)
  32. Rastegari, M., Ordonez, V., Redmon, J., Farhadi, A.: XNOR-Net: ImageNet classification using binary convolutional neural networks. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9908, pp. 525–542. Springer, Cham (2016)
  33. Russakovsky, O., et al.: ImageNet large scale visual recognition challenge. IJCV 115(3), 211–252 (2015)
  34. Shen, M., Han, K., Xu, C., Wang, Y.: Searching for accurate binary neural architectures. In: ICCV Workshop (2019)
  35. Sifre, L., Mallat, S.: Rigid-motion scattering for image classification (2014)
  36. Tan, M., Le, Q.V.: EfficientNet: rethinking model scaling for convolutional neural networks. In: ICML (2019)
  37. Tan, S., Caruana, R., Hooker, G., Koch, P., Gordo, A.: Learning global additive explanations for neural nets using model distillation. arXiv preprint arXiv:1801.08640 (2018)
  38. Wu, B., et al.: FBNet: hardware-aware efficient convnet design via differentiable neural architecture search. In: CVPR (2019)
  39. Xie, S., Kirillov, A., Girshick, R., He, K.: Exploring randomly wired neural networks for image recognition. arXiv preprint arXiv:1904.01569 (2019)
  40. Xie, S., Zheng, H., Liu, C., Lin, L.: SNAS: stochastic neural architecture search. In: ICLR (2019)
  41. Zhang, C., Ren, M., Urtasun, R.: Graph hypernetworks for neural architecture search. In: ICLR (2019)
  42. Zhang, X., Zhou, X., Lin, M., Sun, J.: ShuffleNet: an extremely efficient convolutional neural network for mobile devices. In: CVPR (2018)
  43. Zhou, Y., Ebrahimi, S., Arık, S.Ö., Yu, H., Liu, H., Diamos, G.: Resource-efficient neural architect. arXiv preprint arXiv:1806.07912 (2018)
  44. Zhuang, B., Shen, C., Tan, M., Liu, L., Reid, I.: Towards effective low-bitwidth convolutional neural networks. In: CVPR (2018)
  45. Zoph, B., Le, Q.V.: Neural architecture search with reinforcement learning. In: ICLR (2017)
  46. Zoph, B., Vasudevan, V., Shlens, J., Le, Q.V.: Learning transferable architectures for scalable image recognition. In: CVPR (2018)

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. GIST (Gwangju Institute of Science and Technology), Gwangju, South Korea
  2. Indian Institute of Technology (IIT) Roorkee, Roorkee, India
