
Resolution Switchable Networks for Runtime Efficient Image Recognition

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12360)

Abstract

We propose a general method to train a single convolutional neural network that can switch image resolutions at inference time, so the running speed can be selected to meet different computational resource limits. Networks trained with the proposed method are named Resolution Switchable Networks (RS-Nets). The basic training framework shares network parameters across input resolutions while keeping separate batch normalization layers for each. Although this design is parameter-efficient, it leads to inconsistent accuracy variations across resolutions, which we analyze in detail from the perspective of the train-test recognition discrepancy. We further design a multi-resolution ensemble distillation, in which a teacher is learned on the fly as a weighted ensemble over resolutions. Thanks to the ensemble and knowledge distillation, RS-Nets achieve accuracy improvements over individually trained models at a wide range of resolutions. Extensive experiments on the ImageNet dataset are provided, and we additionally consider quantization. Code and models are available at https://github.com/yikaiw/RS-Nets.
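To make the two ingredients concrete, the sketch below shows, in PyTorch (which the released code also uses), how shared weights with resolution-private batch normalization and an ensemble-distillation loss could look. All names here (SwitchableBatchNorm2d, med_loss, alphas) and the exact loss composition are illustrative assumptions, not the authors' implementation; consult the linked repository for the actual code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SwitchableBatchNorm2d(nn.Module):
    """One private BatchNorm2d per training resolution; every other
    layer in the network shares its weights across resolutions."""

    def __init__(self, num_features, num_resolutions):
        super().__init__()
        self.bns = nn.ModuleList(
            [nn.BatchNorm2d(num_features) for _ in range(num_resolutions)])
        self.idx = 0  # index of the currently active resolution

    def forward(self, x):
        return self.bns[self.idx](x)


def med_loss(logits_per_res, targets, alphas, T=1.0):
    """Illustrative multi-resolution ensemble distillation loss.
    `logits_per_res` holds the shared network's logits for the same
    batch rescaled to each resolution; `alphas` are learnable weights
    that form the on-the-fly teacher as a convex combination."""
    w = F.softmax(alphas, dim=0)
    ensemble = sum(wi * li for wi, li in zip(w, logits_per_res))
    # Cross-entropy on every resolution and on the ensemble (teacher).
    ce = sum(F.cross_entropy(l, targets) for l in logits_per_res)
    ce = ce + F.cross_entropy(ensemble, targets)
    # Distill the (detached) ensemble back into each resolution.
    teacher = F.softmax(ensemble.detach() / T, dim=1)
    kd = sum(F.kl_div(F.log_softmax(l / T, dim=1), teacher,
                      reduction='batchmean') * T * T
             for l in logits_per_res)
    return ce + kd
```

At inference time one would set idx on every switchable BN layer to the branch matching the chosen input resolution; all convolutional and fully connected weights stay shared, so only the small per-resolution BN statistics and affine parameters differ.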

Keywords

Efficient design · Multi-resolution · Ensemble distillation

Notes

Acknowledgement

This work is jointly supported by the National Science Foundation of China (NSFC) and the German Research Foundation (DFG) in project Cross Modal Learning, NSFC 61621136008/DFG TRR-169. We thank Aojun Zhou for the insightful discussions.

Supplementary material

Supplementary material 1: 504470_1_En_32_MOESM1_ESM.pdf (PDF, 3.2 MB)


Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Beijing National Research Center for Information Science and Technology (BNRist), State Key Lab on Intelligent Technology and Systems, Department of Computer Science and Technology, Tsinghua University, Beijing, China
  2. Cognitive Computing Laboratory, Intel Labs China, Beijing, China
