Size/Accuracy Trade-Off in Convolutional Neural Networks: An Evolutionary Approach

  • Tomaso Cetto
  • Jonathan Byrne
  • Xiaofan Xu
  • David Moloney
Conference paper
Part of the Proceedings of the International Neural Networks Society book series (INNS, volume 1)


In recent years, the shift from hand-crafted design of Convolutional Neural Networks (CNNs) to an automatic approach (AutoML) has garnered much attention. However, most of this work has concentrated on generating state-of-the-art (SOTA) architectures that set new standards of accuracy. In this paper, we use the NSGA-II algorithm for multi-objective optimization to optimize the size/accuracy trade-off in CNNs. This approach is motivated by the need for simple, effective, mobile-sized architectures that can easily be re-trained on any dataset. The optimization is carried out using a Grammatical Evolution approach which, implemented alongside NSGA-II, automatically generates valid network topologies that best optimize the size/accuracy trade-off. Furthermore, we investigate how the algorithm responds to an increase in the size of the search space, moving from strict topology optimization (number of layers, size of filter, number of kernels, etc.) to a search space that also includes variations in other hyper-parameters such as the type of optimizer, dropout rate, batch size, or learning rate, amongst others.
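To make the size/accuracy trade-off concrete, the following is a minimal sketch of the Pareto-dominance relation that NSGA-II's non-dominated sorting is built on. It is not the authors' implementation and is not a full NSGA-II (it omits crowding distance, selection, and the evolutionary loop). The `Candidate` type, the network names, and every parameter count and accuracy value are hypothetical, chosen only for illustration.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    name: str        # hypothetical label for an evolved network
    n_params: int    # model size, to be minimised
    accuracy: float  # validation accuracy, to be maximised


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a.n_params <= b.n_params and a.accuracy >= b.accuracy
    strictly_better = a.n_params < b.n_params or a.accuracy > b.accuracy
    return no_worse and strictly_better


def pareto_front(population: List[Candidate]) -> List[Candidate]:
    """Return the candidates that no other candidate dominates."""
    return [c for c in population
            if not any(dominates(other, c) for other in population)]


# Illustrative values only; these are not results from the paper.
population = [
    Candidate("net_a", 50_000, 0.62),
    Candidate("net_b", 200_000, 0.71),
    Candidate("net_c", 250_000, 0.69),    # larger and less accurate than net_b
    Candidate("net_d", 1_200_000, 0.78),
    Candidate("net_e", 60_000, 0.60),     # larger and less accurate than net_a
]

front = pareto_front(population)
for c in sorted(front, key=lambda c: c.n_params):
    print(f"{c.name}: {c.n_params} parameters, accuracy {c.accuracy:.2f}")
```

In this toy population, `net_c` and `net_e` are dominated and drop out, leaving a front in which each remaining network trades additional parameters for additional accuracy. A multi-objective search returns such a front rather than a single best network.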


CNN, Grammatical evolution



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Tomaso Cetto (1)
  • Jonathan Byrne (1)
  • Xiaofan Xu (1)
  • David Moloney (1)
  1. Advanced Architecture Group, Intel Corporation, Leixlip, Ireland
