
A Review of One-Shot Neural Architecture Search Methods

  • Conference paper
Advances in Neural Computation, Machine Learning, and Cognitive Research VI (NEUROINFORMATICS 2022)

Part of the book series: Studies in Computational Intelligence (SCI, volume 1064)


Abstract

Neural network architecture design is a challenging and computationally expensive problem. For this reason, training a one-shot model has become a popular way to obtain several architectures, or to find the best one for different requirements, without retraining. In this paper we summarize the existing one-shot NAS methods, highlight their basic concepts, and compare the considered methods in terms of accuracy, the number of GPU hours needed for training, and ranking quality.
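
To make the weight-sharing idea concrete, the sketch below trains a tiny supernet with single-path uniform sampling in the spirit of SPOS [17] and then scores candidate sub-networks with the inherited weights, without retraining. It is illustrative only and not taken from any reviewed method: the three candidate operations, the layer sizes, and the random batch are arbitrary assumptions.

    # Illustrative sketch of a weight-sharing supernet with single-path uniform
    # sampling (hypothetical toy example; real methods use full search spaces,
    # validation data, and resource constraints).
    import random
    import torch
    import torch.nn as nn

    class MixedLayer(nn.Module):
        """One searchable layer: candidate ops whose weights live in the supernet."""
        def __init__(self, channels):
            super().__init__()
            self.candidates = nn.ModuleList([
                nn.Conv2d(channels, channels, 3, padding=1),  # 3x3 conv
                nn.Conv2d(channels, channels, 5, padding=2),  # 5x5 conv
                nn.Identity(),                                # skip connection
            ])

        def forward(self, x, choice):
            return self.candidates[choice](x)

    class Supernet(nn.Module):
        def __init__(self, channels=16, depth=4, num_classes=10):
            super().__init__()
            self.stem = nn.Conv2d(3, channels, 3, padding=1)
            self.layers = nn.ModuleList(MixedLayer(channels) for _ in range(depth))
            self.head = nn.Linear(channels, num_classes)

        def forward(self, x, path):
            x = self.stem(x)
            for layer, choice in zip(self.layers, path):
                x = layer(x, choice)
            return self.head(x.mean(dim=(2, 3)))  # global average pooling

    def random_path(depth=4, num_ops=3):
        # Uniform single-path sampling: one candidate op per searchable layer.
        return [random.randrange(num_ops) for _ in range(depth)]

    supernet = Supernet()
    optimizer = torch.optim.SGD(supernet.parameters(), lr=0.05, momentum=0.9)
    images, labels = torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))

    # One supernet training step: only weights on the sampled path get gradients.
    loss = nn.functional.cross_entropy(supernet(images, random_path()), labels)
    optimizer.zero_grad(); loss.backward(); optimizer.step()

    # Search phase: rank candidate architectures with inherited weights, no retraining.
    with torch.no_grad():
        scores = {tuple(p): supernet(images, p).argmax(1).eq(labels).float().mean().item()
                  for p in (random_path() for _ in range(5))}

In the surveyed methods this evaluation is done on a held-out validation set, and the sampling, weight-sharing, and ranking schemes differ between methods, which is what the comparison in the appendix summarizes.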


References

  1. Bender, G., Kindermans, P.J., Zoph, B., Vasudevan, V., Le, Q.: Understanding and simplifying one-shot architecture search. In: Proceedings of the 35th International Conference on Machine Learning, PMLR, vol. 80 (2018). http://proceedings.mlr.press/v80/bender18a/bender18a.pdf

  2. Cai, H., Gan, C., Wang, T., Zhang, Z., Han, S.: Once-for-all: Train one network and specialize it for efficient deployment on diverse hardware platforms. arXiv preprint arXiv:1908.09791 (2019)

  3. Cai, H., Zhu, L., Han, S.: ProxylessNAS: Direct neural architecture search on target task and hardware. arXiv preprint arXiv:1812.00332 (2019)

  4. Chen, B., et al.: BN-NAS: Neural architecture search with batch normalization. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 307–316 (2021)


  5. Chen, X., Xie, L., Wu, J., Tian, Q.: Progressive differentiable architecture search: Bridging the depth gap between search and evaluation. arXiv preprint arXiv:1904.12760 (2019)

  6. Chu, X., Li, X., Lu, Y., Zhang, B., Li, J.: MixPath: A unified approach for one-shot neural architecture search. arXiv preprint arXiv:2001.05887 (2020)

  7. Chu, X., Zhang, B., Li, J., Li, Q., Xu, R.: SCARLET-NAS: Bridging the gap between stability and scalability in weight-sharing neural architecture search. arXiv preprint arXiv:1908.06022 (2019)

  8. Chu, X., Zhang, B., Xu, R., Li, J.: FairNAS: Rethinking evaluation fairness of weight sharing neural architecture search. arXiv preprint arXiv:1907.01845 (2019)

  9. Chu, X., Zhou, T., Zhang, B., Li, J.: Fair DARTS: Eliminating unfair advantages in differentiable architecture search. arXiv preprint arXiv:1911.12126 (2019)

  10. Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: AutoAugment: Learning augmentation policies from data. arXiv preprint arXiv:1805.09501 (2018)

  11. Dong, P., et al.: Prior-guided one-shot neural architecture search. arXiv preprint arXiv:2206.13329 (2022)

  12. Dong, X., Yang, Y.: One-shot neural architecture search via self-evaluated template network. In: 2019 IEEE/CVF International Conference on Computer Vision (ICCV) (2019). https://doi.org/10.1109/ICCV.2019.00378

  13. Dong, X., Yang, Y.: One-shot neural architecture search via self-evaluated template network. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 3681–3690 (2019)


  14. Geada, R., Prangle, D., McGough, A.S.: Bonsai-Net: One-shot neural architecture search via differentiable pruners. arXiv preprint arXiv:2006.09264 (2020)

  15. Guo, R., et al.: Powering one-shot topological nas with stabilized share-parameter proxy. arXiv preprint arXiv:2005.10511 (2020)

  16. Guo, Y., et al.: Breaking the curse of space explosion: Towards efficient NAS with curriculum search. In: International Conference on Machine Learning, pp. 3822–3831. PMLR (2020)


  17. Guo, Z., et al.: Single path one-shot neural architecture search with uniform sampling. arXiv preprint arXiv:1904.00420 (2019)

  18. Hu, H., Langford, J., Caruana, R., Mukherjee, S., Horvitz, E., Dey, D.: Efficient forward architecture search. arXiv preprint arXiv:1905.13360 (2019)

  19. Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7132–7141 (2018)


  20. Hu, S., Xie, S., Zheng, H., Liu, C., Shi, J., Liu, X., Lin, D.: DSNAS: Direct neural architecture search without parameter retraining. arXiv preprint arXiv:2002.09128 (2020)

  21. Hu, Y., Wang, X., Li, L., Gu, Q.: Improving one-shot NAS with shrinking-and-expanding supernet. Pattern Recognition 118, 108025 (2021)


  22. Huang, S.Y., Chu, W.T.: PONAS: Progressive one-shot neural architecture search for very efficient deployment. In: 2021 International Joint Conference on Neural Networks (IJCNN), pp. 1–9. IEEE (2021)


  23. Li, C., et al.: Block-wisely supervised neural architecture search with knowledge distillation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2020)


  24. Li, G., Qian, G., Delgadillo, I.C., Müller, M., Thabet, A., Ghanem, B.: SGAS: Sequential greedy architecture search. arXiv preprint arXiv:1912.00195 (2019)

  25. Li, L., Talwalkar, A.: Random search and reproducibility for neural architecture search. arXiv preprint arXiv:1902.07638 (2019)

  26. Liang, H., et al.: DARTS+: Improved differentiable architecture search with early stopping. arXiv preprint arXiv:1909.06035 (2020)

  27. Liu, H., Simonyan, K., Yang, Y.: DARTS: Differentiable architecture search. arXiv preprint arXiv:1806.09055 (2018)

  28. Luo, R., Tian, F., Qin, T., Chen, E., Liu, T.Y.: Neural architecture optimization. arXiv preprint arXiv:1808.07233 (2019)

  29. Ma, N., Zhang, X., Zheng, H.T., Sun, J.: ShuffleNet V2: Practical guidelines for efficient CNN architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018)


  30. Mei, J., et al.: AtomNAS: Fine-grained end-to-end neural architecture search. arXiv preprint arXiv:1912.09640 (2019)

  31. Pham, H., Guan, M.Y., Zoph, B., Le, Q.V., Dean, J.: Efficient neural architecture search via parameter sharing. arXiv preprint arXiv:1802.03268 (2018)

  32. Ramachandran, P., Zoph, B., Le, Q.V.: Searching for activation functions. arXiv preprint arXiv:1710.05941 (2017)

  33. Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.C.: MobileNetV2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4510–4520 (2018)


  34. Shen, Z., Qian, J., Zhuang, B., Wang, S., Xiao, J.: BS-NAS: Broadening-and-shrinking one-shot NAS with searchable numbers of channels. arXiv preprint arXiv:2003.09821 (2020)

  35. Stamoulis, D., et al.: Single-path NAS: Designing hardware-efficient ConvNets in less than 4 hours. arXiv preprint arXiv:1904.02877 (2019)

  36. Su, X., et al.: K-shot NAS: Learnable weight-sharing for NAS with k-shot supernets. In: International Conference on Machine Learning, pp. 9880–9890. PMLR (2021)


  37. Szegedy, C., Ioffe, S., Vanhoucke, V., Alemi, A.: Inception-v4, Inception-ResNet and the impact of residual connections on learning. arXiv preprint arXiv:1602.07261 (2016)

  38. Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the inception architecture for computer vision. arXiv preprint arXiv:1512.00567 (2015)

  39. Wang, X., et al.: ROME: Robustifying memory-efficient NAS via topology disentanglement and gradients accumulation. arXiv preprint arXiv:2011.11233 (2020)

  40. Wu, B., et al.: FBNet: Hardware-aware efficient ConvNet design via differentiable neural architecture search. arXiv preprint arXiv:1812.03443 (2019)

  41. Wu, B., et al.: FBNet: Hardware-aware efficient ConvNet design via differentiable neural architecture search. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019). https://doi.org/10.1109/CVPR.2019.01099

  42. Xia, X., Xiao, X., Wang, X., Zheng, M.: Progressive automatic design of search space for one-shot neural architecture search. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 2455–2464 (2022)


  43. Xu, Y., et al.: PC-DARTS: Partial channel connections for memory-efficient architecture search. arXiv preprint arXiv:1907.05737 (2019)

  44. You, S., Huang, T., Yang, M., Wang, F., Qian, C., Zhang, C.: GreedyNAS: Towards fast one-shot NAS with greedy supernet. arXiv preprint arXiv:2003.11236 (2020)

  45. Yu, J., et al.: BigNAS: Scaling up neural architecture search with big single-stage models. arXiv preprint arXiv:2003.11142 (2020)

  46. Yu, K., Ranftl, R., Salzmann, M.: How to train your super-net: An analysis of training heuristics in weight-sharing NAS. arXiv preprint arXiv:2003.04276 (2020)

  47. Yu, K., Sciuto, C., Jaggi, M., Musat, C., Salzmann, M.: Evaluating the search phase of neural architecture search. arXiv preprint arXiv:1902.08142 (2019)

  48. Zela, A., Elsken, T., Saikia, T., Marrakchi, Y., Brox, T., Hutter, F.: Understanding and robustifying differentiable architecture search. arXiv preprint arXiv:1909.09656 (2019)

  49. Zela, A., Elsken, T., Saikia, T., Marrakchi, Y., Brox, T., Hutter, F.: Understanding and robustifying differentiable architecture search. arXiv preprint arXiv:1909.09656 (2020)

  50. Zhang, X., Hou, P., Zhang, X., Sun, J.: Neural architecture search with random labels. arXiv preprint arXiv:2101.11834 (2021)

  51. Zhang, Y., et al.: Deeper insights into weight sharing in neural architecture search. arXiv preprint arXiv:2001.01431 (2020)

  52. Zhao, Y., Wang, L., Tian, Y., Fonseca, R., Guo, T.: Few-shot neural architecture search. arXiv preprint arXiv:2006.06863 (2020)

  53. Zheng, X., Ji, R., Tang, L., Zhang, B., Liu, J., Tian, Q.: Multinomial distribution learning for effective neural architecture search. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1304–1313 (2019)


  54. Zoph, B., Vasudevan, V., Shlens, J., Le, Q.V.: Learning transferable architectures for scalable image recognition. arXiv preprint arXiv:1707.07012 (2018)


Author information


Corresponding author

Correspondence to Ilia Zharikov.


6 Appendix


Table 1. Accuracies on the CIFAR-10 dataset of the architectures found by each method, as reported in the corresponding papers. The GPU hours column indicates how many hours the method was trained on the hardware stated in the Resources column, with the total number of epochs given in brackets \((\cdot )\); Transfer means that the architecture was selected and pretrained on a different dataset. Top-1 gives the corresponding accuracy. FLOPs indicates the number of floating point operations and N the number of parameters of the particular architecture. A dash (-) means that the information was not reported by the authors. Only methods that reported results on this dataset are included. Best viewed in zoom.
Table 2. Accuracies on the ImageNet dataset of the architectures found by each method, as reported in the corresponding papers. The GPU hours column indicates how many hours the method was trained on the hardware stated in the Resources column, with the total number of epochs given in brackets \((\cdot )\); Transfer means that the architecture was selected and pretrained on a different dataset. Top-1 and Top-5 give the corresponding accuracies; values in italics correspond to results obtained with combined tricks such as Squeeze-and-Excitation [19], Swish activations [32], and AutoAugment [10]. FLOPs indicates the number of floating point operations and N the number of parameters of the particular architecture. A dash (-) means that the information was not reported by the authors. Only methods that reported results on this dataset are included. The table is divided into two parts; the second part is presented in Table 3. Best viewed in zoom.
Table 3. Second part of Table 2: accuracies on the ImageNet dataset of the architectures found by each method, as reported in the corresponding papers. The GPU hours column indicates how many hours the method was trained on the hardware stated in the Resources column, with the total number of epochs given in brackets \((\cdot )\); Transfer means that the architecture was selected and pretrained on a different dataset. Top-1 and Top-5 give the corresponding accuracies; values in italics correspond to results obtained with combined tricks such as Squeeze-and-Excitation [19], Swish activations [32], and AutoAugment [10]. FLOPs indicates the number of floating point operations and N the number of parameters of the particular architecture. A dash (-) means that the information was not reported by the authors. Only methods that reported results on this dataset are included. Best viewed in zoom.
Table 4. Correlation metrics of the proposed methods, as reported in the corresponding papers. Dataset refers to the dataset of models and/or the dataset of images used to obtain the accuracies of the models from the indicated model dataset. The Kendall-Tau column gives the corresponding metric values; Corr. means that the authors demonstrate correlation without this metric or using another metric. Only methods that reported results connected with correlation measurement are included. Best viewed in zoom.
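
As a side note on the ranking-quality metric in Table 4, the minimal sketch below (with made-up accuracy values, not taken from any surveyed paper) shows how the Kendall-Tau correlation between accuracies estimated with inherited supernet weights and accuracies after standalone retraining of the same candidates is typically computed:

    # Hypothetical example of Kendall-Tau rank correlation for ranking quality.
    from scipy.stats import kendalltau

    # Accuracies (%) of five candidate architectures (made-up numbers).
    supernet_acc   = [62.1, 60.4, 64.8, 58.9, 63.5]   # evaluated with shared supernet weights
    standalone_acc = [74.0, 72.8, 75.9, 71.5, 75.1]   # evaluated after full standalone training

    tau, p_value = kendalltau(supernet_acc, standalone_acc)
    print(f"Kendall-Tau = {tau:.3f}")
    # A value close to 1 means the supernet ranks candidates almost the same way
    # as standalone training does, i.e. the one-shot model is a reliable proxy.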


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Zharikov, I., Krivorotov, I., Maximov, E., Korviakov, V., Letunovskiy, A. (2023). A Review of One-Shot Neural Architecture Search Methods. In: Kryzhanovsky, B., Dunin-Barkowski, W., Redko, V., Tiumentsev, Y. (eds) Advances in Neural Computation, Machine Learning, and Cognitive Research VI. NEUROINFORMATICS 2022. Studies in Computational Intelligence, vol 1064. Springer, Cham. https://doi.org/10.1007/978-3-031-19032-2_14
