Abstract
One-shot Neural Architecture Search (NAS) has been widely used to discover architectures because of its efficiency. However, previous studies reveal that one-shot performance estimations of architectures might not correlate well with their performance in stand-alone training, because of the excessive sharing of operation parameters (i.e., a large sharing extent) between architectures. Thus, recent methods construct even more over-parameterized supernets to reduce the sharing extent. However, these improved methods introduce a large number of extra parameters and thus cause an undesirable trade-off between training costs and ranking quality. To alleviate these issues, we propose to apply Curriculum Learning On Sharing Extent (CLOSE) to train the supernet both efficiently and effectively. Specifically, we train the supernet with a large sharing extent (an easier curriculum) at the beginning, and gradually decrease the sharing extent of the supernet (a harder curriculum). To support this training strategy, we design a novel supernet (CLOSENet) that decouples the parameters from operations to realize a flexible sharing scheme and an adjustable sharing extent. Extensive experiments demonstrate that CLOSE obtains better ranking quality across different computational budget constraints than other one-shot supernets, and discovers superior architectures when combined with various search strategies. Code is available at https://github.com/walkerning/aw_nas.
Z. Zhou and X. Ning—Equal contribution.
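To make the curriculum described in the abstract concrete, below is a minimal PyTorch sketch of the core idea, not the authors' implementation (that is available in the aw_nas repository linked above). It assumes a toy supernet whose edges draw their convolution weights from a shared pool: the pool size sets the sharing extent, and growing the pool on a schedule moves training from the easier curriculum (heavy sharing) toward harder ones (less sharing). The names SharedOpPool and ToySupernet, the modulo edge-to-block assignment, and the single grow() step are all illustrative assumptions; CLOSENet learns the parameter-to-operation assignment rather than fixing it.

```python
# Illustrative sketch only: a toy supernet whose operation parameters are
# drawn from a shared pool, so the sharing extent can be adjusted on a
# curriculum schedule. Names and the assignment rule are hypothetical,
# not the paper's CLOSENet.
import copy

import torch
import torch.nn as nn


class SharedOpPool(nn.Module):
    """A pool of weight blocks shared by all edges of a toy supernet.

    The pool size controls the sharing extent: a single block means every
    edge reuses the same parameters (the largest sharing extent, i.e., the
    easiest curriculum); growing the pool gives edges more private parameters.
    """

    def __init__(self, channels: int, init_blocks: int = 1):
        super().__init__()
        self.blocks = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1, bias=False)
            for _ in range(init_blocks)
        ])

    def grow(self) -> None:
        # One curriculum step toward a smaller sharing extent. Cloning an
        # existing block lets the new block inherit trained weights. In real
        # training, the optimizer must be told about the new parameters.
        self.blocks.append(copy.deepcopy(self.blocks[-1]))

    def forward(self, x: torch.Tensor, edge_id: int) -> torch.Tensor:
        # Simplest possible edge-to-block assignment (modulo). CLOSENet
        # instead learns this assignment; that mechanism is omitted here.
        return self.blocks[edge_id % len(self.blocks)](x)


class ToySupernet(nn.Module):
    """Four sequential edges drawing their weights from one shared pool."""

    def __init__(self, channels: int = 16, num_edges: int = 4):
        super().__init__()
        self.pool = SharedOpPool(channels)
        self.num_edges = num_edges

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for edge_id in range(self.num_edges):
            x = torch.relu(self.pool(x, edge_id))
        return x


if __name__ == "__main__":
    net = ToySupernet()
    x = torch.randn(2, 16, 8, 8)
    net(x)           # easier curriculum: all four edges share one block
    net.pool.grow()  # schedule step: two blocks, smaller sharing extent
    net(x)           # edges 0 and 2 use block 0; edges 1 and 3 use block 1
```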
This work was supported by the National Natural Science Foundation of China (Nos. U19B2019 and 61832007), the National Key Research and Development Program of China (No. 2019YFF0301500), the Tsinghua EE Xilinx AI Research Fund, the Beijing National Research Center for Information Science and Technology (BNRist), and the Beijing Innovation Center for Future Chips.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Zhou, Z. et al. (2022). CLOSE: Curriculum Learning on the Sharing Extent Towards Better One-Shot NAS. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13680. Springer, Cham. https://doi.org/10.1007/978-3-031-20044-1_33
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-20043-4
Online ISBN: 978-3-031-20044-1