
A Review of One-Shot Neural Architecture Search Methods

  • Conference paper
Advances in Neural Computation, Machine Learning, and Cognitive Research VI (NEUROINFORMATICS 2022)

Part of the book series: Studies in Computational Intelligence (SCI, volume 1064)


Abstract

Neural network architecture design is a challenging and computationally expensive problem. For this reason, training a one-shot model has become a popular way to obtain several architectures, or to find the best one for different requirements, without retraining. In this paper we summarize the existing one-shot NAS methods, highlight their basic concepts, and compare the considered methods in terms of accuracy, the number of GPU hours needed for training, and ranking quality.
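
To make the weight-sharing idea concrete, the sketch below trains a tiny supernet with single-path uniform sampling in the spirit of SPOS [17] and then scores candidate sub-networks with the inherited weights, without retraining. It is illustrative only and not taken from any reviewed method: the three candidate operations, the layer sizes, and the random batch are arbitrary assumptions.

    # Illustrative sketch of a weight-sharing supernet with single-path uniform
    # sampling (hypothetical toy example; real methods use full search spaces,
    # validation data, and resource constraints).
    import random
    import torch
    import torch.nn as nn

    class MixedLayer(nn.Module):
        """One searchable layer: candidate ops whose weights live in the supernet."""
        def __init__(self, channels):
            super().__init__()
            self.candidates = nn.ModuleList([
                nn.Conv2d(channels, channels, 3, padding=1),  # 3x3 conv
                nn.Conv2d(channels, channels, 5, padding=2),  # 5x5 conv
                nn.Identity(),                                # skip connection
            ])

        def forward(self, x, choice):
            return self.candidates[choice](x)

    class Supernet(nn.Module):
        def __init__(self, channels=16, depth=4, num_classes=10):
            super().__init__()
            self.stem = nn.Conv2d(3, channels, 3, padding=1)
            self.layers = nn.ModuleList(MixedLayer(channels) for _ in range(depth))
            self.head = nn.Linear(channels, num_classes)

        def forward(self, x, path):
            x = self.stem(x)
            for layer, choice in zip(self.layers, path):
                x = layer(x, choice)
            return self.head(x.mean(dim=(2, 3)))  # global average pooling

    def random_path(depth=4, num_ops=3):
        # Uniform single-path sampling: one candidate op per searchable layer.
        return [random.randrange(num_ops) for _ in range(depth)]

    supernet = Supernet()
    optimizer = torch.optim.SGD(supernet.parameters(), lr=0.05, momentum=0.9)
    images, labels = torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))

    # One supernet training step: only weights on the sampled path get gradients.
    loss = nn.functional.cross_entropy(supernet(images, random_path()), labels)
    optimizer.zero_grad(); loss.backward(); optimizer.step()

    # Search phase: rank candidate architectures with inherited weights, no retraining.
    with torch.no_grad():
        scores = {tuple(p): supernet(images, p).argmax(1).eq(labels).float().mean().item()
                  for p in (random_path() for _ in range(5))}

In the surveyed methods this evaluation is done on a held-out validation set, and the sampling, weight-sharing, and ranking schemes differ between methods, which is what the comparison in the appendix summarizes.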


References

  1. Bender, G., Kindermans, P.J., Zoph, B., Vasudevan, V., Le, Q.: Understanding and simplifying one-shot architecture search. In: Proceedings of the 35th International Conference on Machine Learning, PMLR, vol. 80 (2018). http://proceedings.mlr.press/v80/bender18a/bender18a.pdf

  2. Cai, H., Gan, C., Wang, T., Zhang, Z., Han, S.: Once-for-all: Train one network and specialize it for efficient deployment on diverse hardware platforms. arXiv preprint arXiv:1908.09791 (2019)

  3. Cai, H., Zhu, L., Han, S.: ProxylessNAS: Direct neural architecture search on target task and hardware. arXiv preprint arXiv:1812.00332 (2019)

  4. Chen, B., et al.: BN-NAS: Neural architecture search with batch normalization. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 307–316 (2021)


  5. Chen, X., Xie, L., Wu, J., Tian, Q.: Progressive differentiable architecture search: Bridging the depth gap between search and evaluation. arXiv preprint arXiv:1904.12760 (2019)

  6. Chu, X., Li, X., Lu, Y., Zhang, B., Li, J.: MixPath: A unified approach for one-shot neural architecture search. arXiv preprint arXiv:2001.05887 (2020)

  7. Chu, X., Zhang, B., Li, J., Li, Q., Xu, R.: SCARLET-NAS: Bridging the gap between stability and scalability in weight-sharing neural architecture search. arXiv preprint arXiv:1908.06022 (2019)

  8. Chu, X., Zhang, B., Xu, R., Li, J.: FairNAS: Rethinking evaluation fairness of weight sharing neural architecture search. arXiv preprint arXiv:1907.01845 (2019)

  9. Chu, X., Zhou, T., Zhang, B., Li, J.: Fair DARTS: Eliminating unfair advantages in differentiable architecture search. arXiv preprint arXiv:1911.12126 (2019)

  10. Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: AutoAugment: Learning augmentation policies from data. arXiv preprint arXiv:1805.09501 (2018)

  11. Dong, P., et al.: Prior-guided one-shot neural architecture search. arXiv preprint arXiv:2206.13329 (2022)

  12. Dong, X., Yang, Y.: One-shot neural architecture search via self-evaluated template network. In: 2019 IEEE/CVF International Conference on Computer Vision (ICCV) (2019). https://doi.org/10.1109/ICCV.2019.00378

  13. Dong, X., Yang, Y.: One-shot neural architecture search via self-evaluated template network. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 3681–3690 (2019)


  14. Geada, R., Prangle, D., McGough, A.S.: Bonsai-Net: One-shot neural architecture search via differentiable pruners. arXiv preprint arXiv:2006.09264 (2020)

  15. Guo, R., et al.: Powering one-shot topological nas with stabilized share-parameter proxy. arXiv preprint arXiv:2005.10511 (2020)

  16. Guo, Y., et al.: Breaking the curse of space explosion: Towards efficient NAS with curriculum search. In: International Conference on Machine Learning, pp. 3822–3831. PMLR (2020)


  17. Guo, Z., et al.: Single path one-shot neural architecture search with uniform sampling. arXiv preprint arXiv:1904.00420 (2019)

  18. Hu, H., Langford, J., Caruana, R., Mukherjee, S., Horvitz, E., Dey, D.: Efficient forward architecture search. arXiv preprint arXiv:1905.13360 (2019)

  19. Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7132–7141 (2018)


  20. Hu, S., Xie, S., Zheng, H., Liu, C., Shi, J., Liu, X., Lin, D.: DSNAS: Direct neural architecture search without parameter retraining. arXiv preprint arXiv:2002.09128 (2020)

  21. Hu, Y., Wang, X., Li, L., Gu, Q.: Improving one-shot NAS with shrinking-and-expanding supernet. Pattern Recognition 118, 108025 (2021)


  22. Huang, S.Y., Chu, W.T.: PONAS: Progressive one-shot neural architecture search for very efficient deployment. In: 2021 International Joint Conference on Neural Networks (IJCNN), pp. 1–9. IEEE (2021)


  23. Li, C., et al.: Block-wisely supervised neural architecture search with knowledge distillation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2020)


  24. Li, G., Qian, G., Delgadillo, I.C., Müller, M., Thabet, A., Ghanem, B.: SGAS: Sequential greedy architecture search. arXiv preprint arXiv:1912.00195 (2019)

  25. Li, L., Talwalkar, A.: Random search and reproducibility for neural architecture search. arXiv preprint arXiv:1902.07638 (2019)

  26. Liang, H., et al.: DARTS+: Improved differentiable architecture search with early stopping. arXiv preprint arXiv:1909.06035 (2020)

  27. Liu, H., Simonyan, K., Yang, Y.: DARTS: Differentiable architecture search. arXiv preprint arXiv:1806.09055 (2018)

  28. Luo, R., Tian, F., Qin, T., Chen, E., Liu, T.Y.: Neural architecture optimization. arXiv preprint arXiv:1808.07233 (2019)

  29. Ma, N., Zhang, X., Zheng, H.T., Sun, J.: ShuffleNet V2: Practical guidelines for efficient CNN architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018)


  30. Mei, J., et al.: AtomNAS: Fine-grained end-to-end neural architecture search. arXiv preprint arXiv:1912.09640 (2019)

  31. Pham, H., Guan, M.Y., Zoph, B., Le, Q.V., Dean, J.: Efficient neural architecture search via parameter sharing. arXiv preprint arXiv:1802.03268 (2018)

  32. Ramachandran, P., Zoph, B., Le, Q.V.: Searching for activation functions. arXiv preprint arXiv:1710.05941 (2017)

  33. Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.C.: MobileNetV2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4510–4520 (2018)


  34. Shen, Z., Qian, J., Zhuang, B., Wang, S., Xiao, J.: BS-NAS: Broadening-and-shrinking one-shot NAS with searchable numbers of channels. arXiv preprint arXiv:2003.09821 (2020)

  35. Stamoulis, D., et al.: Single-path NAS: Designing hardware-efficient ConvNets in less than 4 hours. arXiv preprint arXiv:1904.02877 (2019)

  36. Su, X., et al.: K-shot NAS: Learnable weight-sharing for NAS with k-shot supernets. In: International Conference on Machine Learning, pp. 9880–9890. PMLR (2021)


  37. Szegedy, C., Ioffe, S., Vanhoucke, V., Alemi, A.: Inception-v4, Inception-ResNet and the impact of residual connections on learning. arXiv preprint arXiv:1602.07261 (2016)

  38. Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the inception architecture for computer vision. arXiv preprint arXiv:1512.00567 (2015)

  39. Wang, X., et al.: ROME: Robustifying memory-efficient NAS via topology disentanglement and gradients accumulation. arXiv preprint arXiv:2011.11233 (2020)

  40. Wu, B., et al.: FBNet: Hardware-aware efficient ConvNet design via differentiable neural architecture search. arXiv preprint arXiv:1812.03443 (2019)

  41. Wu, B., et al.: FBNet: Hardware-aware efficient ConvNet design via differentiable neural architecture search. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019). https://doi.org/10.1109/CVPR.2019.01099

  42. Xia, X., Xiao, X., Wang, X., Zheng, M.: Progressive automatic design of search space for one-shot neural architecture search. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 2455–2464 (2022)


  43. Xu, Y., et al.: PC-DARTS: Partial channel connections for memory-efficient architecture search. arXiv preprint arXiv:1907.05737 (2019)

  44. You, S., Huang, T., Yang, M., Wang, F., Qian, C., Zhang, C.: GreedyNAS: Towards fast one-shot NAS with greedy supernet. arXiv preprint arXiv:2003.11236 (2020)

  45. Yu, J., et al.: BigNAS: Scaling up neural architecture search with big single-stage models. arXiv preprint arXiv:2003.11142 (2020)

  46. Yu, K., Ranftl, R., Salzmann, M.: How to train your super-net: An analysis of training heuristics in weight-sharing NAS. arXiv preprint arXiv:2003.04276 (2020)

  47. Yu, K., Sciuto, C., Jaggi, M., Musat, C., Salzmann, M.: Evaluating the search phase of neural architecture search. arXiv preprint arXiv:1902.08142 (2019)

  48. Zela, A., Elsken, T., Saikia, T., Marrakchi, Y., Brox, T., Hutter, F.: Understanding and robustifying differentiable architecture search. arXiv preprint arXiv:1909.09656 (2019)

  49. Zela, A., Elsken, T., Saikia, T., Marrakchi, Y., Brox, T., Hutter, F.: Understanding and robustifying differentiable architecture search. arXiv preprint arXiv:1909.09656 (2020)

  50. Zhang, X., Hou, P., Zhang, X., Sun, J.: Neural architecture search with random labels. arXiv preprint arXiv:2101.11834 (2021)

  51. Zhang, Y., et al.: Deeper insights into weight sharing in neural architecture search. arXiv preprint arXiv:2001.01431 (2020)

  52. Zhao, Y., Wang, L., Tian, Y., Fonseca, R., Guo, T.: Few-shot neural architecture search. arXiv preprint arXiv:2006.06863 (2020)

  53. Zheng, X., Ji, R., Tang, L., Zhang, B., Liu, J., Tian, Q.: Multinomial distribution learning for effective neural architecture search. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1304–1313 (2019)


  54. Zoph, B., Vasudevan, V., Shlens, J., Le, Q.V.: Learning transferable architectures for scalable image recognition. arXiv preprint arXiv:1707.07012 (2018)


Author information


Corresponding author

Correspondence to Ilia Zharikov.


6 Appendix


Table 1. Accuracies on the CIFAR-10 dataset of the architectures found by each method, as reported in the corresponding papers. The GPU hours column indicates how many hours the method was trained on the hardware stated in the Resources column, with the total number of epochs given in brackets \((\cdot )\); Transfer means that the architecture was selected and pretrained on a different dataset. Top-1 gives the corresponding accuracy. FLOPs indicates the number of floating point operations and N the number of parameters of the particular architecture. A dash (-) means that the information was not reported by the authors. Only methods that reported results on this dataset are included. Best viewed in zoom.
Table 2. Accuracies on the ImageNet dataset of the architectures found by each method, as reported in the corresponding papers. The GPU hours column indicates how many hours the method was trained on the hardware stated in the Resources column, with the total number of epochs given in brackets \((\cdot )\); Transfer means that the architecture was selected and pretrained on a different dataset. Top-1 and Top-5 give the corresponding accuracies; values in italics correspond to results obtained with combined tricks such as Squeeze-and-Excitation [19], Swish activations [32], and AutoAugment [10]. FLOPs indicates the number of floating point operations and N the number of parameters of the particular architecture. A dash (-) means that the information was not reported by the authors. Only methods that reported results on this dataset are included. The table is divided into two parts; the second part is presented in Table 3. Best viewed in zoom.
Table 3. Second part of Table 2: accuracies on the ImageNet dataset of the architectures found by each method, as reported in the corresponding papers. The GPU hours column indicates how many hours the method was trained on the hardware stated in the Resources column, with the total number of epochs given in brackets \((\cdot )\); Transfer means that the architecture was selected and pretrained on a different dataset. Top-1 and Top-5 give the corresponding accuracies; values in italics correspond to results obtained with combined tricks such as Squeeze-and-Excitation [19], Swish activations [32], and AutoAugment [10]. FLOPs indicates the number of floating point operations and N the number of parameters of the particular architecture. A dash (-) means that the information was not reported by the authors. Only methods that reported results on this dataset are included. Best viewed in zoom.
Table 4. Correlation metrics of the proposed methods, as reported in the corresponding papers. Dataset refers to the dataset of models and/or the dataset of images used to obtain the accuracies of the models from the indicated model dataset. The Kendall-Tau column gives the corresponding metric values; Corr. means that the authors demonstrate correlation without this metric or using another metric. Only methods that reported results connected with correlation measurement are included. Best viewed in zoom.
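
As a side note on the ranking-quality metric in Table 4, the minimal sketch below (with made-up accuracy values, not taken from any surveyed paper) shows how the Kendall-Tau correlation between accuracies estimated with inherited supernet weights and accuracies after standalone retraining of the same candidates is typically computed:

    # Hypothetical example of Kendall-Tau rank correlation for ranking quality.
    from scipy.stats import kendalltau

    # Accuracies (%) of five candidate architectures (made-up numbers).
    supernet_acc   = [62.1, 60.4, 64.8, 58.9, 63.5]   # evaluated with shared supernet weights
    standalone_acc = [74.0, 72.8, 75.9, 71.5, 75.1]   # evaluated after full standalone training

    tau, p_value = kendalltau(supernet_acc, standalone_acc)
    print(f"Kendall-Tau = {tau:.3f}")
    # A value close to 1 means the supernet ranks candidates almost the same way
    # as standalone training does, i.e. the one-shot model is a reliable proxy.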


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Zharikov, I., Krivorotov, I., Maximov, E., Korviakov, V., Letunovskiy, A. (2023). A Review of One-Shot Neural Architecture Search Methods. In: Kryzhanovsky, B., Dunin-Barkowski, W., Redko, V., Tiumentsev, Y. (eds) Advances in Neural Computation, Machine Learning, and Cognitive Research VI. NEUROINFORMATICS 2022. Studies in Computational Intelligence, vol 1064. Springer, Cham. https://doi.org/10.1007/978-3-031-19032-2_14
