
EAT-NAS: elastic architecture transfer for accelerating large-scale neural architecture search

  • Research Paper
  • Science China Information Sciences

Abstract

Neural architecture search (NAS) methods have been proposed to relieve human experts from tedious architecture engineering. However, most current methods are constrained to small-scale search because of their huge computational resource consumption. Meanwhile, architectures searched on small datasets offer no performance guarantee when applied directly to large datasets, owing to the discrepancy between datasets. This limitation impedes the wide use of NAS on large-scale tasks. To overcome this obstacle, we propose an elastic architecture transfer mechanism for accelerating large-scale NAS (EAT-NAS). In our implementation, architectures are first searched on a small dataset, e.g., CIFAR-10, and the best one is chosen as the basic architecture. The search process on a large dataset, e.g., ImageNet, is then initialized with the basic architecture as the seed, which accelerates the large-scale search. We propose not only a NAS method but also a mechanism for architecture-level transfer learning. In our experiments, we obtain two final models, EATNet-A and EATNet-B, which achieve competitive accuracies of 75.5% and 75.6%, respectively, on ImageNet. Both models surpass models searched from scratch on ImageNet under the same settings. In terms of computational cost, EAT-NAS takes less than 5 days on 8 TITAN X GPUs, which is significantly less than the consumption of state-of-the-art large-scale NAS methods.
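To make the transfer mechanism concrete, the sketch below illustrates the two-stage idea in Python: an evolutionary search runs on the small proxy dataset, the best architecture found there becomes the seed, and the population for the large-scale search is initialized with perturbed copies of that seed rather than with random architectures. This is a minimal illustrative sketch, not the authors' implementation; the architecture encoding, mutation operator, population size, and all function names (random_architecture, mutate, evaluate, evolutionary_search) are assumptions introduced here, and the fitness function is a placeholder for actually training and validating each candidate.

```python
import copy
import random

# Hypothetical architecture encoding: per-block choices of kernel size,
# expansion ratio, and width multiplier (illustrative only).
def random_architecture(num_blocks=5):
    return [{"kernel": random.choice([3, 5, 7]),
             "expansion": random.choice([3, 6]),
             "width": random.choice([0.75, 1.0, 1.25])}
            for _ in range(num_blocks)]

def mutate(arch):
    """Perturb one randomly chosen attribute of one block."""
    choices = {"kernel": [3, 5, 7], "expansion": [3, 6], "width": [0.75, 1.0, 1.25]}
    child = copy.deepcopy(arch)
    block = random.choice(child)
    key = random.choice(list(choices))
    block[key] = random.choice(choices[key])
    return child

def evaluate(arch, dataset):
    """Placeholder fitness. In practice this would briefly train the network
    encoded by `arch` on `dataset` and return its validation accuracy."""
    return random.random()

def evolutionary_search(population, dataset, generations=10):
    """Generic evolutionary loop: score, keep the better half, refill with mutants."""
    for _ in range(generations):
        ranked = sorted(population, key=lambda a: evaluate(a, dataset), reverse=True)
        survivors = ranked[: len(population) // 2]
        offspring = [mutate(random.choice(survivors))
                     for _ in range(len(population) - len(survivors))]
        population = survivors + offspring
    return max(population, key=lambda a: evaluate(a, dataset))

# Stage 1: search on the small proxy dataset (e.g., CIFAR-10) and keep the best
# architecture as the "basic architecture".
basic_arch = evolutionary_search([random_architecture() for _ in range(20)], "cifar10")

# Stage 2: initialize the large-scale (e.g., ImageNet) population with the basic
# architecture as the seed plus perturbed copies of it, instead of random
# architectures, so the large-scale search starts from a good region.
seed_population = [basic_arch] + [mutate(basic_arch) for _ in range(19)]
final_arch = evolutionary_search(seed_population, "imagenet")
```

Under this view, the reported speed-up comes from initializing the large-scale population around the seed found on the small dataset, so the ImageNet search needs far fewer evaluations than searching from scratch.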



Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (NSFC) (Grant Nos. 61876212, 61976208, 61733007), Zhejiang Lab (Grant No. 2019NB0AB02), and the HUST-Horizon Computer Vision Research Center. We thank Liangchen SONG and Guoli WANG for their discussions and assistance.

Author information


Corresponding author

Correspondence to Xinggang Wang.


Cite this article

Fang, J., Chen, Y., Zhang, X. et al. EAT-NAS: elastic architecture transfer for accelerating large-scale neural architecture search. Sci. China Inf. Sci. 64, 192106 (2021). https://doi.org/10.1007/s11432-020-3112-8
