Abstract
Neural architecture search (NAS) aims to automate architecture design and improve the performance of deep neural networks. Platform-aware NAS methods consider both performance and complexity and can find well-performing architectures that require few computational resources. Whereas ordinary NAS methods incur tremendous computational costs owing to repeated model training, one-shot NAS, which trains the weights of a supernetwork containing all candidate architectures only once during the search process, has been reported to achieve a much lower search cost. This study focuses on architecture complexity-aware one-shot NAS, which optimizes an objective function composed of a weighted sum of two metrics, such as predictive performance and the number of parameters. In existing methods, the architecture search must be run multiple times with different coefficients of the weighted sum to obtain multiple architectures with different complexities. This study aims to reduce the search cost of finding multiple such architectures. The proposed method uses multiple distributions to generate architectures with different complexities and updates each distribution using the samples obtained from all distributions based on importance sampling. This allows multiple architectures with different complexities to be obtained in a single architecture search, reducing the overall search cost. The proposed method is applied to the architecture search of convolutional neural networks on the CIFAR-10 and ImageNet datasets. Compared with baseline methods, the proposed method finds multiple architectures with varying complexities while requiring less computational effort.
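To make the mechanism described above concrete, the following is a minimal sketch (not the authors' implementation; the search space, placeholder objective, utility values, and learning rate are all illustrative assumptions). It maintains K per-layer categorical distributions, one per complexity coefficient, samples one architecture from each, and updates every distribution with all K samples reweighted by importance ratios in an IGO-style natural-gradient step:

```python
# Sketch of importance-sampling updates across K architecture distributions.
# All numeric choices (search space size, lambdas, eta) are assumptions.
import numpy as np

rng = np.random.default_rng(0)

n_layers, n_choices = 6, 4                  # hypothetical search space: 4 ops per layer
lambdas = [0.0, 0.1, 1.0]                   # assumed complexity coefficients
K = len(lambdas)
thetas = [np.full((n_layers, n_choices), 1.0 / n_choices) for _ in range(K)]
eta = 0.1                                   # assumed learning rate

def log_prob(theta, arch):
    """log p_theta(arch) under a per-layer categorical distribution."""
    return float(np.sum(np.log(theta[np.arange(n_layers), arch])))

def objective(arch, lam):
    """Placeholder for loss(arch) + lam * n_params(arch); stand-ins only."""
    return float(np.sum(arch)) + lam * float(np.sum(arch ** 2))

for step in range(100):
    # Snapshot the distributions that generate this step's samples.
    gen = [th.copy() for th in thetas]
    archs = [np.array([rng.choice(n_choices, p=g[l]) for l in range(n_layers)])
             for g in gen]
    for k in range(K):
        th = thetas[k]
        # Importance weight of each sample: p_k(a) / p_j(a), where j generated a.
        ws = np.array([np.exp(log_prob(th, a) - log_prob(gen[j], a))
                       for j, a in enumerate(archs)])
        fs = np.array([objective(a, lambdas[k]) for a in archs])
        # Ranking-based utilities: smaller objective value -> larger utility.
        ranks = np.argsort(np.argsort(fs))
        u = -(ranks - ranks.mean())
        # Natural-gradient-style update of the categorical parameters.
        grad = np.zeros_like(th)
        for a, w, ui in zip(archs, ws, u):
            onehot = np.zeros_like(th)
            onehot[np.arange(n_layers), a] = 1.0
            grad += w * ui * (onehot - th)
        th = th + (eta / K) * grad
        # Keep each row on the probability simplex.
        th = np.clip(th, 1e-6, None)
        thetas[k] = th / th.sum(axis=1, keepdims=True)
```

Because every distribution reuses all K samples through the importance ratios, a single search pass can steer the K distributions toward architectures of different complexities instead of running K independent searches.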
Keywords
- Neural architecture search
- Convolutional neural network
- Importance sampling
- Natural gradient
Acknowledgments
This work was partially supported by NEDO (JPNP18002), JSPS KAKENHI Grant Number JP20H04240, and JST PRESTO Grant Number JPMJPR2133.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Noda, Y., Saito, S., Shirakawa, S. (2022). Efficient Search of Multiple Neural Architectures with Different Complexities via Importance Sampling. In: Pimenidis, E., Angelov, P., Jayne, C., Papaleonidas, A., Aydin, M. (eds) Artificial Neural Networks and Machine Learning – ICANN 2022. ICANN 2022. Lecture Notes in Computer Science, vol 13532. Springer, Cham. https://doi.org/10.1007/978-3-031-15937-4_51