
CLOSE: Curriculum Learning on the Sharing Extent Towards Better One-Shot NAS

  • Conference paper
Computer Vision – ECCV 2022 (ECCV 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13680)

Abstract

One-shot Neural Architecture Search (NAS) has been widely used to discover architectures due to its efficiency. However, previous studies reveal that one-shot performance estimations of architectures might not be well correlated with their performances in stand-alone training because of the excessive sharing of operation parameters (i.e., large sharing extent) between architectures. Thus, recent methods construct even more over-parameterized supernets to reduce the sharing extent. But these improved methods introduce a large number of extra parameters and thus cause an undesirable trade-off between the training costs and the ranking quality. To alleviate the above issues, we propose to apply Curriculum Learning On Sharing Extent (CLOSE) to train the supernet both efficiently and effectively. Specifically, we train the supernet with a large sharing extent (an easier curriculum) at the beginning and gradually decrease the sharing extent of the supernet (a harder curriculum). To support this training strategy, we design a novel supernet (CLOSENet) that decouples the parameters from operations to realize a flexible sharing scheme and adjustable sharing extent. Extensive experiments demonstrate that CLOSE can obtain a better ranking quality across different computational budget constraints than other one-shot supernets, and is able to discover superior architectures when combined with various search strategies. Code is available at https://github.com/walkerning/aw_nas.
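The curriculum described in the abstract (large sharing extent first, smaller sharing extent later) can be illustrated with a minimal sketch. Everything named here is a hypothetical stand-in, not the CLOSENet implementation: `num_param_blocks` assumes a linear schedule over epochs, and `assign_ops_to_blocks` assumes a simple round-robin mapping from operations to parameter blocks. The abstract does not specify either choice; see the released code for the actual scheme.

```python
from typing import List


def num_param_blocks(epoch: int, total_epochs: int, max_blocks: int) -> int:
    """Hypothetical linear curriculum on the sharing extent.

    Few parameter blocks means many operations share the same parameters
    (large sharing extent, the easier curriculum). The count grows from 1
    to max_blocks as training proceeds (smaller sharing extent, harder).
    """
    if total_epochs <= 1:
        return max_blocks
    # Clamp the epoch so out-of-range values stay within the schedule.
    epoch = max(0, min(epoch, total_epochs - 1))
    # Integer arithmetic keeps the schedule monotone and deterministic.
    return 1 + (epoch * (max_blocks - 1)) // (total_epochs - 1)


def assign_ops_to_blocks(num_ops: int, num_blocks: int) -> List[int]:
    """Hypothetical round-robin assignment: operation i uses block i % num_blocks.

    Because parameters live in blocks rather than in operations, changing
    num_blocks changes how many operations share each block.
    """
    return [i % num_blocks for i in range(num_ops)]


# Schedule for 5 epochs and at most 4 parameter blocks.
schedule = [num_param_blocks(e, total_epochs=5, max_blocks=4) for e in range(5)]

# Block assignment for 6 operations at the start and end of training.
early_assignment = assign_ops_to_blocks(6, schedule[0])
late_assignment = assign_ops_to_blocks(6, schedule[-1])
```

With one block, all six operations map to the same parameters; with four blocks, at most two operations share a block. This only illustrates the idea of an adjustable sharing extent under the stated assumptions.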

Z. Zhou and X. Ning—Equal contribution.



This work was supported by the National Natural Science Foundation of China (Nos. U19B2019, 61832007), the National Key Research and Development Program of China (No. 2019YFF0301500), the Tsinghua EE Xilinx AI Research Fund, the Beijing National Research Center for Information Science and Technology (BNRist), and the Beijing Innovation Center for Future Chips.


Corresponding author

Correspondence to Yu Wang.


Electronic supplementary material

Supplementary material 1 (PDF, 682 KB)


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Zhou, Z. et al. (2022). CLOSE: Curriculum Learning on the Sharing Extent Towards Better One-Shot NAS. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13680. Springer, Cham. https://doi.org/10.1007/978-3-031-20044-1_33


  • DOI: https://doi.org/10.1007/978-3-031-20044-1_33


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-20043-4

  • Online ISBN: 978-3-031-20044-1

  • eBook Packages: Computer Science, Computer Science (R0)
