Abstract
Neural networks and deep learning are reshaping how artificial intelligence is practiced. Efficiently choosing a suitable network architecture and fine-tuning its hyper-parameters for a specific dataset is a time-consuming task given the staggering number of possible alternatives. In this paper, we address the problem of model selection with a fully automated framework that efficiently selects a neural network model for a given task, whether classification or regression. The algorithm, named Automatic Model Selection (AMS), is a modified micro-genetic algorithm that automatically and efficiently finds the most suitable fully connected neural network model for a given dataset. The main contributions of this method are: a simple, list-based encoding for neural networks, used as the genotype in our evolutionary algorithm; novel crossover and mutation operators; a fitness function that considers both the accuracy of the neural network and its complexity; and a method to measure the similarity between two neural networks. AMS is evaluated on two different datasets. By comparing models obtained with AMS to state-of-the-art models for each dataset, we show that AMS can automatically find efficient neural network models. Furthermore, AMS is computationally efficient and can exploit distributed computing paradigms to further boost its performance.
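To make the abstract's ingredients concrete, the sketch below illustrates what a list-based genotype, one-point crossover, mutation, and an accuracy-versus-complexity fitness might look like. All names, gene layouts, and the penalty weighting here are illustrative assumptions for a fully connected network search, not the exact scheme used by AMS.

```python
import random

# Hypothetical list-based genotype: each gene is (units, activation) for one
# hidden layer of a fully connected network. This encoding, the operators,
# and the fitness weighting are illustrative assumptions, not AMS's scheme.
ACTIVATIONS = ["relu", "tanh", "sigmoid"]

def random_genotype(max_layers=4, max_units=256):
    """Sample an architecture as a flat list of layer genes."""
    n_layers = random.randint(1, max_layers)
    return [(random.randint(8, max_units), random.choice(ACTIVATIONS))
            for _ in range(n_layers)]

def complexity(genotype):
    """A simple proxy for model complexity: total hidden units."""
    return sum(units for units, _ in genotype)

def fitness(accuracy, genotype, penalty=1e-4):
    """Reward validation accuracy while penalizing large networks."""
    return accuracy - penalty * complexity(genotype)

def crossover(parent_a, parent_b):
    """One-point crossover on the layer lists: keep a prefix of one parent
    and append a suffix of the other, allowing depth to change."""
    cut_a = random.randint(1, len(parent_a))
    cut_b = random.randint(1, len(parent_b))
    return parent_a[:cut_a] + parent_b[cut_b:]

def mutate(genotype, max_units=256):
    """Mutate a single gene: resample either its width or its activation."""
    i = random.randrange(len(genotype))
    units, act = genotype[i]
    if random.random() < 0.5:
        units = random.randint(8, max_units)
    else:
        act = random.choice(ACTIVATIONS)
    child = list(genotype)
    child[i] = (units, act)
    return child
```

Under such an encoding, a micro-genetic algorithm would evaluate each genotype by training the corresponding network briefly and feeding its validation accuracy into `fitness`, so that a smaller network with comparable accuracy outranks a larger one.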
Funding
Funding was provided by CONACYT (Grant No. 285599).
Cite this article
Laredo, D., Ma, S.F., Leylaz, G. et al. Automatic model selection for fully connected neural networks. Int. J. Dynam. Control 8, 1063–1079 (2020). https://doi.org/10.1007/s40435-020-00708-w