DENSER: deep evolutionary network structured representation

  • Filipe Assunção
  • Nuno Lourenço
  • Penousal Machado
  • Bernardete Ribeiro

Abstract

Deep evolutionary network structured representation (DENSER) is a novel evolutionary approach to the automatic generation of deep neural networks (DNNs) which combines the principles of genetic algorithms (GAs) with those of dynamic structured grammatical evolution (DSGE). The GA level encodes the macro structure of evolution, i.e., the layers, learning, and/or data augmentation methods (among others); the DSGE level specifies the parameters of each GA evolutionary unit and the valid range of those parameters. The use of a grammar makes DENSER a general-purpose framework for generating DNNs: one only needs to adapt the grammar to deal with different network and layer types or problems, or to change the range of the parameters. DENSER is tested on the automatic generation of convolutional neural networks (CNNs) for the CIFAR-10 dataset, with the best-performing networks reaching accuracies of up to 95.22%. Furthermore, we take the fittest networks evolved on CIFAR-10 and apply them to the classification of MNIST, Fashion-MNIST, SVHN, Rectangles, and CIFAR-100. The results show that the DNNs discovered by DENSER during evolution generalise, are robust, and scale. The most impressive result is the 78.75% classification accuracy on the CIFAR-100 dataset, which, to the best of our knowledge, sets a new state of the art among methods that seek to automatically design CNNs.
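To make the two-level representation concrete, the following is a minimal Python sketch of how a DENSER-style genotype could be encoded and sampled. It is an illustration only, not the authors' implementation: the grammar fragment, the parameter ranges, and the names GRAMMAR, sample_parameters, and random_individual are hypothetical simplifications of the scheme described in the abstract (a GA-level sequence of evolutionary units, each expanded at the DSGE level into parameters drawn from grammar-specified ranges).

    import random

    # Hypothetical, simplified grammar: each GA-level unit (e.g. a layer type)
    # expands, at the DSGE level, into parameters with explicit valid ranges.
    GRAMMAR = {
        "convolution": {
            "num_filters": ("int", 32, 256),
            "kernel_size": ("int", 2, 5),
            "activation": ("cat", ["relu", "sigmoid", "tanh"]),
        },
        "pooling": {
            "pool_type": ("cat", ["max", "avg"]),
            "pool_size": ("int", 2, 5),
        },
        "fully_connected": {
            "num_units": ("int", 64, 2048),
            "activation": ("cat", ["relu", "sigmoid", "tanh"]),
        },
    }

    def sample_parameters(nonterminal):
        """DSGE level: draw each parameter uniformly from its valid range."""
        params = {}
        for name, spec in GRAMMAR[nonterminal].items():
            if spec[0] == "int":
                params[name] = random.randint(spec[1], spec[2])
            else:  # categorical choice
                params[name] = random.choice(spec[1])
        return params

    def random_individual(min_layers=3, max_layers=10):
        """GA level: an ordered sequence of evolutionary units (the macro
        structure), each paired with its DSGE-level parameters."""
        n_layers = random.randint(min_layers, max_layers)
        body = [random.choice(["convolution", "pooling"]) for _ in range(n_layers)]
        genotype = [(unit, sample_parameters(unit)) for unit in body]
        genotype.append(("fully_connected", sample_parameters("fully_connected")))
        return genotype

    if __name__ == "__main__":
        for unit, params in random_individual():
            print(unit, params)

Under this encoding, changing network or layer types, target problems, or parameter ranges amounts to editing the grammar table, which is what makes the grammar-based approach a general-purpose framework.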

Keywords

Automated machine learning · NeuroEvolution · Deep neural networks · Convolutional neural networks · Dynamic structured grammatical evolution

Acknowledgements

This work is partially funded by Fundação para a Ciência e Tecnologia (FCT), Portugal, under Grant SFRH/BD/114865/2016. We would also like to thank NVIDIA for providing us with Titan X GPUs.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. CISUC - Department of Informatics Engineering, University of Coimbra, Coimbra, Portugal
