
Saliency: a new selection criterion of important architectures in neural architecture search

  • Original Article
  • Published in: Neural Computing and Applications

Abstract

Neural architecture search (NAS) has achieved great success in automatically designing high-performance neural networks for given tasks. However, early NAS approaches suffered from excessive computational cost. Recently, some NAS approaches, such as gradient-based ones, have significantly reduced this cost. Yet gradient-based methods introduce significant bias into architecture selection because they simply use the values of the architecture parameters as an importance index, which often causes the architecture selected from the search space to be sub-optimal. To address this problem, we propose architecture saliency as a new selection criterion for optimal architectures. Concretely, we define the saliency of a candidate architecture as the squared change in network loss induced by removing that architecture from the network. Because saliency directly reflects a candidate architecture's contribution to network performance, the proposed criterion eliminates this selection bias. Furthermore, we approximate architecture saliency with a Taylor series expansion to obtain a more efficient implementation. Extensive experiments show that our approach achieves competitive or even better evaluation performance than other NAS approaches on multiple datasets.
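The saliency criterion lends itself to a compact implementation. Below is a minimal sketch, assuming a DARTS-style supernet in PyTorch in which "removing" a candidate operation corresponds to zeroing its architecture parameter α: a first-order Taylor expansion then gives L(α=0) − L(α) ≈ −(∂L/∂α)·α, and the saliency is the square of that estimated loss change. The `supernet` object and its `arch_parameters()` accessor are hypothetical names for illustration, not the authors' released code.

```python
# Minimal sketch of first-order architecture saliency for a DARTS-style
# supernet. Assumption: candidate operations are weighted by architecture
# parameters alpha, and "removing" an operation means setting alpha to 0.
# `supernet` and `arch_parameters()` are hypothetical names.
import torch

def architecture_saliency(supernet, criterion, inputs, targets):
    """Per-parameter saliency: squared first-order estimate of the loss
    change caused by zeroing an architecture parameter, i.e.
        saliency(alpha) = ((dL/dalpha) * alpha) ** 2
    """
    loss = criterion(supernet(inputs), targets)
    alphas = list(supernet.arch_parameters())   # candidate-op weights
    grads = torch.autograd.grad(loss, alphas)   # dL/dalpha per tensor
    # Element-wise squared loss change; larger values mark operations
    # whose removal would perturb the loss more.
    return [(g * a).pow(2).detach() for g, a in zip(grads, alphas)]
```

During architecture selection, candidate operations on each edge would then be ranked by this saliency rather than by the raw α values, which is the source of bias the abstract describes.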



Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Grant 61772120.

Author information

Corresponding author

Correspondence to William Zhu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

All authors have seen the manuscript and approved its submission to the journal.

Informed consent

We confirm that the content of the manuscript has not been published or submitted for publication elsewhere.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Hao, J., Cai, Z., Li, R. et al. Saliency: a new selection criterion of important architectures in neural architecture search. Neural Comput & Applic 34, 1269–1283 (2022). https://doi.org/10.1007/s00521-021-06418-4

