Abstract
Neural architecture search (NAS) methods have been proposed to relieve human experts from tedious architecture engineering. However, most current methods are confined to small-scale search because of their huge computational cost. Meanwhile, directly applying architectures searched on small datasets to large datasets offers no performance guarantee, owing to the discrepancy between datasets. This limitation impedes the wide use of NAS on large-scale tasks. To overcome this obstacle, we propose an elastic architecture transfer mechanism for accelerating large-scale NAS (EAT-NAS). In our implementation, architectures are first searched on a small dataset, e.g., CIFAR-10, and the best one is chosen as the basic architecture. The search process on a large dataset, e.g., ImageNet, is then initialized with the basic architecture as the seed, which accelerates the large-scale search. We thus propose not only a NAS method but also a mechanism for architecture-level transfer learning. In our experiments, we obtain two final models, EATNet-A and EATNet-B, which achieve competitive accuracies of 75.5% and 75.6%, respectively, on ImageNet. Both models also surpass models searched from scratch on ImageNet under the same settings. In terms of computational cost, EAT-NAS takes fewer than 5 days on 8 TITAN X GPUs, significantly less than that of state-of-the-art large-scale NAS methods.
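The core seeding idea described above can be sketched in a few lines: the winner of the small-dataset search becomes the seed of the large-dataset search population, with the rest of the population filled by mutated variants of it. This is a minimal illustrative sketch only; the list-of-operations encoding, the operator set `OPS`, and the mutation rate below are hypothetical stand-ins, not the paper's actual search space or evolutionary algorithm.

```python
import random

def mutate(arch, ops, rate=0.3):
    """Return a perturbed copy of an architecture (a list of op names),
    resampling each position independently with the given probability."""
    return [random.choice(ops) if random.random() < rate else op
            for op in arch]

def init_population(basic_arch, ops, size):
    """Seed the large-scale search with the small-dataset winner:
    keep the basic architecture itself and fill the remainder of the
    population with mutated variants of it, instead of starting
    from random architectures."""
    population = [list(basic_arch)]
    while len(population) < size:
        population.append(mutate(basic_arch, ops))
    return population

# Toy example: a 5-layer chain architecture found on the small dataset.
OPS = ["conv3x3", "conv5x5", "dwconv3x3", "skip"]
basic = ["conv3x3", "dwconv3x3", "conv5x5", "skip", "conv3x3"]
pop = init_population(basic, OPS, size=8)
```

Because every individual starts near an architecture already known to perform well, the evolutionary search on the large dataset explores a much smaller neighborhood than a from-scratch search, which is the source of the acceleration the abstract reports.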
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China (NSFC) (Grant Nos. 61876212, 61976208, 61733007), Zhejiang Lab (Grant No. 2019NB0AB02), and the HUST-Horizon Computer Vision Research Center. We thank Liangchen Song and Guoli Wang for the discussion and assistance.
Fang, J., Chen, Y., Zhang, X. et al. EAT-NAS: elastic architecture transfer for accelerating large-scale neural architecture search. Sci. China Inf. Sci. 64, 192106 (2021). https://doi.org/10.1007/s11432-020-3112-8