CLOSE: Curriculum Learning on the Sharing Extent Towards Better One-Shot NAS

Zhou, Zixuan; Ning, Xuefei; Cai, Yi; Han, Jiashu; Deng, Yiping; Dong, Yuhan; Yang, Huazhong; Wang, Yu

doi:10.1007/978-3-031-20044-1_33

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13680)

Included in the following conference series:

European Conference on Computer Vision

Abstract

One-shot Neural Architecture Search (NAS) has been widely used to discover architectures due to its efficiency. However, previous studies reveal that one-shot performance estimations of architectures might not be well correlated with their performances in stand-alone training because of the excessive sharing of operation parameters (i.e., large sharing extent) between architectures. Thus, recent methods construct even more over-parameterized supernets to reduce the sharing extent. But these improved methods introduce a large number of extra parameters and thus cause an undesirable trade-off between the training costs and the ranking quality. To alleviate the above issues, we propose to apply Curriculum Learning On Sharing Extent (CLOSE) to train the supernet both efficiently and effectively. Specifically, we train the supernet with a large sharing extent (an easier curriculum) at the beginning and gradually decrease the sharing extent of the supernet (a harder curriculum). To support this training strategy, we design a novel supernet (CLOSENet) that decouples the parameters from operations to realize a flexible sharing scheme and adjustable sharing extent. Extensive experiments demonstrate that CLOSE can obtain a better ranking quality across different computational budget constraints than other one-shot supernets, and is able to discover superior architectures when combined with various search strategies. Code is available at https://github.com/walkerning/aw_nas.
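
The curriculum described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the sharing extent can be approximated by the number of parameter groups that operations draw from (fewer groups means more sharing), and the names `num_param_groups` and `assign_groups`, the linear schedule, and the modulo assignment are all hypothetical choices made for illustration.

```python
def num_param_groups(epoch, total_epochs, min_groups, max_groups):
    """Hypothetical linear curriculum over the number of parameter groups.

    Few groups early (large sharing extent, the easier curriculum),
    more groups later (small sharing extent, the harder curriculum).
    """
    if total_epochs <= 1:
        return max_groups
    # Fraction of training completed, clamped to [0, 1].
    frac = min(max(epoch / (total_epochs - 1), 0.0), 1.0)
    return min_groups + round(frac * (max_groups - min_groups))


def assign_groups(op_ids, n_groups):
    """Toy assignment (an assumption): operation i uses group i mod n_groups."""
    return {op: op % n_groups for op in op_ids}


# Five epochs, growing from one fully shared group to five groups.
schedule = [num_param_groups(e, 5, 1, 5) for e in range(5)]
print(schedule)

# With a single group every operation shares the same parameters.
print(assign_groups(range(4), 1))
# With three groups, six operations are spread over three parameter sets.
print(assign_groups(range(6), 3))
```

Because parameters are looked up through an assignment rather than owned by each operation, the number of groups can change during training without redefining the operations themselves, which is the kind of decoupling the abstract attributes to CLOSENet.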

Z. Zhou and X. Ning—Equal contribution.

This work was supported by the National Natural Science Foundation of China (Nos. U19B2019, 61832007), the National Key Research and Development Program of China (No. 2019YFF0301500), the Tsinghua EE Xilinx AI Research Fund, the Beijing National Research Center for Information Science and Technology (BNRist), and the Beijing Innovation Center for Future Chips.

Author information

Authors and Affiliations

Department of Electronic Engineering, Tsinghua University, Beijing, China
Zixuan Zhou, Xuefei Ning, Yi Cai, Jiashu Han, Huazhong Yang & Yu Wang
Huawei TCS Lab, Shanghai, China
Xuefei Ning & Yiping Deng
Tsinghua Shenzhen International Graduate School, Shenzhen, China
Zixuan Zhou & Yuhan Dong

Corresponding author

Correspondence to Yu Wang.

Editor information

Editors and Affiliations

Tel Aviv University, Tel Aviv, Israel
Shai Avidan
University College London, London, UK
Gabriel Brostow
Google AI, Accra, Ghana
Moustapha Cissé
University of Catania, Catania, Italy
Giovanni Maria Farinella
Facebook (United States), Menlo Park, CA, USA
Tal Hassner

Cite this paper

Zhou, Z. et al. (2022). CLOSE: Curriculum Learning on the Sharing Extent Towards Better One-Shot NAS. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13680. Springer, Cham. https://doi.org/10.1007/978-3-031-20044-1_33

DOI: https://doi.org/10.1007/978-3-031-20044-1_33
Published: 20 October 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-20043-4
Online ISBN: 978-3-031-20044-1
eBook Packages: Computer Science, Computer Science (R0)
