
Architecture search of accurate and lightweight CNNs using genetic algorithm

Published in: Genetic Programming and Evolvable Machines

Abstract

Convolutional neural networks (CNNs) are widely used across AI fields, yet designing CNN architectures still depends heavily on domain expertise. Evolutionary neural architecture search (ENAS) methods can discover neural architectures automatically using evolutionary computation algorithms such as the genetic algorithm. However, most existing ENAS methods focus solely on network accuracy, which tends to evolve large networks and incurs a huge cost in computation resources and search time. Although some ENAS works use multi-objective techniques to optimize both the accuracy and the size of CNNs, they are complex and time/resource-consuming. In this work, two new ENAS methods are designed to evolve accurate and lightweight CNN architectures efficiently using a genetic algorithm (GA). They are termed GACNN_WS (GA CNN Weighted Sum) and GACNN_LE (GA CNN Local Elitism), respectively. Specifically, GACNN_WS evaluates candidate networks with a weighted-sum fitness of two terms (accuracy and size). GACNN_LE uses accuracy as its fitness, like most other ENAS methods, and adds a local elitism strategy to account for network size. Thus, GACNN_WS and GACNN_LE can search for accurate and lightweight CNNs without resorting to multi-objective techniques. Results show that the proposed methods have better search ability than state-of-the-art NAS methods: they consume less time and generate CNNs with lower error rates and fewer parameters for classification on CIFAR-10. Moreover, the evolved CNNs generally outperform eleven hand-designed CNNs.
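The two fitness schemes named in the abstract can be sketched in a few lines of Python. Everything below is a hypothetical illustration, not the authors' implementation: the function names, the size normalization by a `max_params` budget, the weight `alpha`, and the tie-breaking rule in the local elitism step are all assumptions made for clarity.

```python
def weighted_sum_fitness(accuracy, num_params, max_params, alpha=0.9):
    """GACNN_WS-style fitness (sketch): a weighted sum of accuracy and a
    normalized size score, so smaller networks score higher at equal accuracy.
    `alpha` (assumed) trades accuracy against compactness."""
    size_score = 1.0 - num_params / max_params  # 1.0 for tiny nets, 0.0 at the budget
    return alpha * accuracy + (1.0 - alpha) * size_score


def local_elitism_select(parent, child):
    """GACNN_LE-style survivor selection (sketch): accuracy alone is the fitness,
    but when a child is no more accurate than its parent, the smaller network is
    kept. `parent`/`child` are dicts with 'accuracy' and 'params' keys (assumed)."""
    if child["accuracy"] > parent["accuracy"]:
        return child
    if child["accuracy"] == parent["accuracy"] and child["params"] < parent["params"]:
        return child
    return parent
```

For example, with a 10M-parameter budget, a 1M-parameter network at 90% accuracy gets `0.9 * 0.90 + 0.1 * 0.90 = 0.90` under the weighted sum; under local elitism, a child that ties its parent on accuracy but has fewer parameters replaces it.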




Data availability

The datasets generated and/or analysed during the current study are available in the Kaggle repository, https://www.kaggle.com/competitions/cifar-10/data.


Acknowledgements

This study was funded by the National Natural Science Foundation of China (Grant Number 61902281).

Author information

Authors and Affiliations

Authors

Contributions

HC conducted the experiments and result visualization. JL was responsible for the methodology, result analysis, and writing. YL was in charge of proofreading.

Corresponding author

Correspondence to Jiayu Liang.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Ethical approval and informed consent

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Area Editor: Sebastian Risi.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Liang, J., Cao, H., Lu, Y. et al. Architecture search of accurate and lightweight CNNs using genetic algorithm. Genet Program Evolvable Mach 25, 13 (2024). https://doi.org/10.1007/s10710-024-09484-4
