A survey of designing convolutional neural network using evolutionary algorithms

Published in: Artificial Intelligence Review

Abstract

Convolutional neural networks (CNNs) are highly effective for image classification and other computer vision tasks. The accuracy of a CNN depends on the design of its architecture and the selection of optimal parameters. Because the number of parameters increases exponentially with each connected layer in a deep CNN, the manual selection of efficient parameters remains largely ad hoc. Addressing this problem requires a careful examination of the relationship between architecture depth, input parameters, and model accuracy. Evolutionary algorithms are prominent tools for tackling the challenges of architecture design and parameter selection; however, adopting them is itself challenging, since computational cost increases as the evolution proceeds. Moreover, the performance of an evolutionary algorithm depends on the encoding technique used to represent a CNN architecture. In this article, we present a comprehensive study of recent approaches to the design and training of CNN architectures. We discuss the advantages and disadvantages of selecting a CNN architecture using evolutionary algorithms, and we compare manually designed architectures against automatically generated ones in terms of accuracy and parameter count on existing benchmark datasets. Finally, we discuss the open issues and challenges in evolutionary-algorithm-based CNN architecture design.
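To make the encoding idea concrete, the following is a minimal, illustrative sketch (not taken from the surveyed paper) of a direct, variable-length encoding that represents a CNN architecture as a genome of layer genes, together with a simple structural mutation operator of the kind evolutionary architecture search relies on. All names (`LayerGene`, `random_genome`, `mutate`) and parameter ranges are hypothetical choices made for illustration.

```python
# Illustrative sketch of a direct (variable-length) genome encoding for
# evolutionary CNN architecture search. All names and value ranges here
# are hypothetical; they do not come from any specific surveyed method.
import random
from dataclasses import dataclass

@dataclass
class LayerGene:
    kind: str          # "conv" or "pool"
    filters: int = 32  # number of feature maps (conv layers only)
    kernel: int = 3    # spatial kernel size

def random_gene() -> LayerGene:
    # Bias toward convolutional layers; pooling layers need no parameters here.
    if random.random() < 0.7:
        return LayerGene("conv",
                         filters=random.choice([16, 32, 64, 128]),
                         kernel=random.choice([1, 3, 5]))
    return LayerGene("pool")

def random_genome(min_len: int = 3, max_len: int = 8) -> list[LayerGene]:
    # A genome is an ordered, variable-length list of layer genes.
    return [random_gene() for _ in range(random.randint(min_len, max_len))]

def mutate(genome: list[LayerGene]) -> list[LayerGene]:
    # One of three structural mutations: add, remove, or perturb a layer.
    g = genome.copy()
    op = random.choice(["add", "remove", "perturb"])
    if op == "add":
        g.insert(random.randrange(len(g) + 1), random_gene())
    elif op == "remove" and len(g) > 1:
        g.pop(random.randrange(len(g)))
    else:
        g[random.randrange(len(g))] = random_gene()
    return g

if __name__ == "__main__":
    parent = random_genome()
    child = mutate(parent)
    print("parent:", parent)
    print("child :", child)
```

In a full evolutionary loop, each genome would be decoded into a trainable network, its fitness estimated by (partially) training it on the target dataset, and the best genomes selected for further mutation and crossover; the choice of encoding constrains which architectures this loop can ever reach.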



Author information

Authors and Affiliations

Authors

Contributions

All authors take public responsibility for the content of the work submitted for review. The authors confirm their contributions to the paper as follows: VM: conception and design of the work, methodology, writing of the original draft, and critical revision of the article; LK: supervision; LK and VM: final approval of the version to be communicated.

Corresponding author

Correspondence to Vidyanand Mishra.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Mishra, V., Kane, L. A survey of designing convolutional neural network using evolutionary algorithms. Artif Intell Rev 56, 5095–5132 (2023). https://doi.org/10.1007/s10462-022-10303-4

