A Novel Automatic CNN Architecture Design Approach Based on Genetic Algorithm

  • Amr AbdelFatah Ahmed
  • Saad M. Saad Darwish
  • Mohamed M. El-Sherbiny
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 1058)


Deep convolutional neural networks (CNNs) have achieved great success on a broad range of computer vision tasks. However, training CNN structures consumes a massive amount of computing resources. Researchers in this field are concerned with designing CNN structures that maximize performance and accuracy. The main design methods are hand-crafted, fixed model structures and automatically generated models. We propose an automatic CNN structure design approach based on a genetic algorithm that is concerned with generating lightweight CNN structures. We also introduce a novel chromosome representation for the CNN structure. Unlike existing approaches, the proposed methodology is designed to work with limited computing resources while still achieving high accuracy. It uses advanced training methods to reduce the load on the computing resources involved in the process. Our experimental results indicate the effectiveness of the proposed model compared with related methods.
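To make the idea of a genetic search over CNN structures concrete, the following is a minimal, generic sketch and not the authors' method. The abstract does not specify the chromosome encoding, the genetic operators, or the fitness function, so everything here is an assumption made for illustration: the gene format `(filters, kernel_size)`, the names `FILTERS`, `KERNELS`, `MAX_LAYERS`, `param_count`, `fitness`, and `evolve`, the parameter budget, and the caller-supplied `accuracy_fn`, which stands in for actually training and validating each candidate network.

```python
import random

# Hypothetical gene choices; the paper's real chromosome encoding is not
# described in the abstract, so these values are illustrative assumptions.
FILTERS = (8, 16, 32, 64)
KERNELS = (1, 3, 5)
MAX_LAYERS = 6


def random_chromosome(rng):
    """A chromosome is a list of (filters, kernel_size) genes, one per conv layer."""
    n_layers = rng.randint(1, MAX_LAYERS)
    return [(rng.choice(FILTERS), rng.choice(KERNELS)) for _ in range(n_layers)]


def param_count(chrom, in_channels=3):
    """Weights plus biases of a plain stack of convolution layers."""
    total, c_in = 0, in_channels
    for filters, k in chrom:
        total += k * k * c_in * filters + filters
        c_in = filters
    return total


def fitness(chrom, accuracy_fn, budget=50_000):
    """Accuracy minus a penalty for exceeding an assumed parameter budget."""
    over = max(0, param_count(chrom) - budget)
    return accuracy_fn(chrom) - over / budget


def crossover(a, b, rng):
    """One-point crossover with independent cut points, capped at MAX_LAYERS."""
    child = (a[: rng.randint(0, len(a))] + b[rng.randint(0, len(b)):])[:MAX_LAYERS]
    return child or [rng.choice(a + b)]  # never return an empty network


def mutate(chrom, rng, rate=0.2):
    """Resample each gene independently with probability `rate`."""
    return [
        (rng.choice(FILTERS), rng.choice(KERNELS)) if rng.random() < rate else gene
        for gene in chrom
    ]


def evolve(accuracy_fn, generations=20, pop_size=12, seed=0):
    """Elitist truncation selection: the top half survives and breeds the rest."""
    rng = random.Random(seed)
    pop = [random_chromosome(rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda c: fitness(c, accuracy_fn), reverse=True)
        parents = pop[: pop_size // 2]
        children = [
            mutate(crossover(*rng.sample(parents, 2), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        pop = parents + children
    return max(pop, key=lambda c: fitness(c, accuracy_fn))
```

A toy call such as `evolve(lambda c: min(1.0, 0.5 + 0.1 * len(c)))` runs the loop with a made-up accuracy proxy. In a real setting `accuracy_fn` would train each decoded network and return its validation accuracy, which is the expensive step that a lightweight design approach aims to keep affordable.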


Keywords: Convolutional neural networks, CNNs, Genetic algorithm, Automatic model design



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Amr AbdelFatah Ahmed (1)
  • Saad M. Saad Darwish (2)
  • Mohamed M. El-Sherbiny (3)
  1. Department of Computer Engineering, Alexandria High Institute of Engineering and Technology, Alexandria, Egypt
  2. Department of Information Technology, Institute of Graduate Studies and Research, Alexandria University, Alexandria, Egypt
  3. Department of Material Science, Institute of Graduate Studies and Research, Alexandria University, Alexandria, Egypt
