
Automatic model selection for fully connected neural networks

International Journal of Dynamics and Control

Abstract

Neural networks and deep learning are changing the way artificial intelligence is practiced. Efficiently choosing a suitable network architecture and fine-tuning its hyper-parameters for a specific dataset is a time-consuming task, given the staggering number of possible alternatives. In this paper, we address the problem of model selection by means of a fully automated framework that efficiently selects a neural network model for a given task, whether classification or regression. The algorithm, named Automatic Model Selection (AMS), is a modified micro-genetic algorithm that automatically and efficiently finds the most suitable fully connected neural network model for a given dataset. The main contributions of this method are: a simple, list-based encoding of neural networks, used as the genotype in our evolutionary algorithm; novel crossover and mutation operators; a fitness function that accounts for both the accuracy and the complexity of the neural network; and a method for measuring the similarity between two neural networks. AMS is evaluated on two different datasets. By comparing models obtained with AMS to state-of-the-art models for each dataset, we show that AMS can automatically find efficient neural network models. Furthermore, AMS is computationally efficient and can exploit distributed computing paradigms to further boost its performance.
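To make the abstract's ingredients concrete, the sketch below illustrates what a list-based genotype with crossover, mutation, a complexity-penalized fitness, and a network similarity measure could look like. This is a minimal illustration only, not the authors' implementation (their released code is linked from the paper): the gene layout, value ranges, the trade-off weight `alpha`, and the similarity formula are all assumed here for exposition.

```python
import random

# Assumed genotype layout: one (units, activation) pair per hidden layer,
# e.g. [(128, "relu"), (64, "tanh")]. The paper's actual encoding may differ.
ACTIVATIONS = ["relu", "tanh", "sigmoid"]

def random_genotype(max_layers=4, max_units=256):
    """Sample a random fully connected architecture as a list of genes."""
    n_layers = random.randint(1, max_layers)
    return [(random.randint(8, max_units), random.choice(ACTIVATIONS))
            for _ in range(n_layers)]

def crossover(parent_a, parent_b):
    """One-point crossover on the layer lists (a stand-in for the paper's
    novel operator, which is not reproduced here)."""
    cut_a = random.randint(1, len(parent_a))
    cut_b = random.randint(1, len(parent_b))
    return parent_a[:cut_a] + parent_b[cut_b:]

def mutate(genotype, p=0.2):
    """Perturb layer widths and activations with probability p per gene."""
    mutated = []
    for units, act in genotype:
        if random.random() < p:
            units = max(8, int(units * random.uniform(0.5, 1.5)))
            act = random.choice(ACTIVATIONS)
        mutated.append((units, act))
    return mutated

def fitness(accuracy, genotype, alpha=1e-5):
    """Accuracy penalized by complexity (here, total hidden units);
    alpha is an assumed trade-off weight, not a value from the paper."""
    complexity = sum(units for units, _ in genotype)
    return accuracy - alpha * complexity

def similarity(g1, g2):
    """Crude layer-wise similarity in [0, 1]; illustrative only, the paper
    defines its own measure."""
    score = 0.0
    for (u1, a1), (u2, a2) in zip(g1, g2):
        score += 0.5 * (min(u1, u2) / max(u1, u2)) + 0.5 * (a1 == a2)
    return score / max(len(g1), len(g2))
```

In a micro-genetic loop of the kind the abstract describes, a small population of such genotypes would be decoded into networks, briefly trained, scored by the fitness function, and recombined via crossover and mutation; the similarity measure can help keep the small population diverse.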



Funding

Funding was provided by Conacyt (Grant No. 285599).

Author information


Corresponding author

Correspondence to Jian-Qiao Sun.

Generated neural network models

See Tables 19, 20 and 21.

Table 19 Neural network model corresponding to \(S_1\)
Table 20 Neural network model corresponding to \(S_2\)
Table 21 Neural network model corresponding to \(S_3\)


About this article


Cite this article

Laredo, D., Ma, S.F., Leylaz, G. et al. Automatic model selection for fully connected neural networks. Int. J. Dynam. Control 8, 1063–1079 (2020). https://doi.org/10.1007/s40435-020-00708-w

