
Automatic model selection for fully connected neural networks

International Journal of Dynamics and Control

Abstract

Neural networks and deep learning are changing the way artificial intelligence is practiced. Efficiently choosing a suitable network architecture and fine-tuning its hyper-parameters for a specific dataset is a time-consuming task, given the staggering number of possible alternatives. In this paper, we address the problem of model selection by means of a fully automated framework that efficiently selects a neural network model for a given task, whether classification or regression. The algorithm, named Automatic Model Selection (AMS), is a modified micro-genetic algorithm that automatically and efficiently finds the most suitable fully connected neural network model for a given dataset. The main contributions of this method are: a simple, list-based encoding of neural networks, used as the genotype in our evolutionary algorithm; novel crossover and mutation operators; a fitness function that accounts for both the accuracy and the complexity of the neural network; and a method for measuring the similarity between two neural networks. AMS is evaluated on two different datasets. By comparing models obtained with AMS to state-of-the-art models for each dataset, we show that AMS can automatically find efficient neural network models. Furthermore, AMS is computationally efficient and can exploit distributed computing paradigms to further boost its performance.
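To make the abstract's ingredients concrete, the sketch below illustrates what a list-based genotype with crossover, mutation, a complexity-penalized fitness, and a network similarity measure could look like. This is a minimal illustration only, not the authors' implementation (their released code is linked from the paper): the gene layout, value ranges, the trade-off weight `alpha`, and the similarity formula are all assumed here for exposition.

```python
import random

# Assumed genotype layout: one (units, activation) pair per hidden layer,
# e.g. [(128, "relu"), (64, "tanh")]. The paper's actual encoding may differ.
ACTIVATIONS = ["relu", "tanh", "sigmoid"]

def random_genotype(max_layers=4, max_units=256):
    """Sample a random fully connected architecture as a list of genes."""
    n_layers = random.randint(1, max_layers)
    return [(random.randint(8, max_units), random.choice(ACTIVATIONS))
            for _ in range(n_layers)]

def crossover(parent_a, parent_b):
    """One-point crossover on the layer lists (a stand-in for the paper's
    novel operator, which is not reproduced here)."""
    cut_a = random.randint(1, len(parent_a))
    cut_b = random.randint(1, len(parent_b))
    return parent_a[:cut_a] + parent_b[cut_b:]

def mutate(genotype, p=0.2):
    """Perturb layer widths and activations with probability p per gene."""
    mutated = []
    for units, act in genotype:
        if random.random() < p:
            units = max(8, int(units * random.uniform(0.5, 1.5)))
            act = random.choice(ACTIVATIONS)
        mutated.append((units, act))
    return mutated

def fitness(accuracy, genotype, alpha=1e-5):
    """Accuracy penalized by complexity (here, total hidden units);
    alpha is an assumed trade-off weight, not a value from the paper."""
    complexity = sum(units for units, _ in genotype)
    return accuracy - alpha * complexity

def similarity(g1, g2):
    """Crude layer-wise similarity in [0, 1]; illustrative only, the paper
    defines its own measure."""
    score = 0.0
    for (u1, a1), (u2, a2) in zip(g1, g2):
        score += 0.5 * (min(u1, u2) / max(u1, u2)) + 0.5 * (a1 == a2)
    return score / max(len(g1), len(g2))
```

In a micro-genetic loop of the kind the abstract describes, a small population of such genotypes would be decoded into networks, briefly trained, scored by the fitness function, and recombined via crossover and mutation; the similarity measure can help keep the small population diverse.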



Funding

Funding was provided by Conacyt (Grant No. 285599).

Author information


Corresponding author

Correspondence to Jian-Qiao Sun.

Generated neural network models

See Tables 19, 20 and 21.

Table 19 Neural network model corresponding to \(S_1\)
Table 20 Neural network model corresponding to \(S_2\)
Table 21 Neural network model corresponding to \(S_3\)


About this article


Cite this article

Laredo, D., Ma, S.F., Leylaz, G. et al. Automatic model selection for fully connected neural networks. Int. J. Dynam. Control 8, 1063–1079 (2020). https://doi.org/10.1007/s40435-020-00708-w

