Abstract
In machine learning, hyperparameter optimization is a challenging task that is usually approached either by experienced practitioners or in a computationally expensive, brute-force manner such as grid search. Recent research therefore proposes to exploit hyperparameter performance observed on already solved problems (i.e., data sets) in order to speed up the search for promising hyperparameter configurations within the sequential model-based optimization (SMBO) framework.
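To make the framework concrete, the following minimal sketch outlines a generic SMBO loop with an expected-improvement acquisition function. The surrogate interface, function names, and constants are illustrative assumptions, not the paper's implementation; any surrogate exposing a predictive mean and standard deviation would fit.

# Minimal sketch of sequential model-based optimization (SMBO).
# All names and the surrogate interface are illustrative assumptions.
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, y_best):
    """Expected improvement over the best observed loss (minimization);
    mu/sigma are the surrogate's predictive mean and standard deviation."""
    sigma = np.maximum(sigma, 1e-9)  # guard against zero predictive variance
    z = (y_best - mu) / sigma
    return (y_best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

def smbo(surrogate, candidates, evaluate, n_iter=30, n_init=5):
    """surrogate: offers fit(X, y) and predict(X) -> (mean, std).
    candidates: list of hyperparameter configuration vectors.
    evaluate(x): trains the learner with configuration x, returns its loss."""
    rng = np.random.default_rng(0)
    idx = rng.choice(len(candidates), size=n_init, replace=False)
    X = [candidates[i] for i in idx]          # random initial design
    y = [evaluate(x) for x in X]
    for _ in range(n_iter):
        surrogate.fit(np.asarray(X), np.asarray(y))
        mu, sigma = surrogate.predict(np.asarray(candidates))
        ei = expected_improvement(mu, sigma, min(y))
        x_next = candidates[int(np.argmax(ei))]  # most promising candidate
        X.append(x_next)
        y.append(evaluate(x_next))
    return X[int(np.argmin(y))], min(y)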
In this paper, we propose multilayer perceptrons as surrogate models, since they are able to model highly nonlinear hyperparameter response surfaces. However, because interactions of hyperparameters, data sets, and metafeatures are only learned implicitly in the subsequent layers, we improve the performance of multilayer perceptrons by explicitly factorizing the interaction weights, and call the resulting model a factorized multilayer perceptron. Additionally, we evaluate different ways of obtaining predictive uncertainty, a key ingredient for a sensible trade-off between exploration and exploitation. Our experimental results on two public meta data sets demonstrate the efficiency of our approach compared to a variety of published baselines. For reproducibility, we make our data sets and all program code publicly available on our supplementary webpage (http://hylap.org/publications/hyper-opt-with-factorized-multilayer-perceptrons).
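The exact architecture is detailed in the full paper; as a rough illustration of the underlying idea, the sketch below augments a hidden layer's linear term with factorization-machine-style low-rank pairwise interactions (in the spirit of Rendle's factorization machines) over the concatenated hyperparameter and metafeature inputs. All shapes, names, and the tanh activation are assumptions for illustration only, not the paper's formulation.

# Hedged sketch of a "factorized" hidden layer: pairwise input interactions
# are modeled with explicit low-rank factors instead of being left implicit
# in a dense layer. Shapes and names below are assumptions for illustration.
import numpy as np

class FactorizedLayer:
    def __init__(self, n_inputs, n_hidden, n_factors, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(scale=0.1, size=(n_inputs, n_hidden))  # linear weights
        self.b = np.zeros(n_hidden)
        # one k-dimensional factor vector per (hidden unit, input) pair:
        self.V = rng.normal(scale=0.1, size=(n_hidden, n_inputs, n_factors))

    def forward(self, x):
        """x: (n_inputs,) concatenated hyperparameters and metafeatures."""
        linear = x @ self.W + self.b
        # FM identity: sum_{i<j} <v_i, v_j> x_i x_j
        #   = 0.5 * sum_k ((sum_i v_ik x_i)^2 - sum_i v_ik^2 x_i^2)
        xv = np.einsum('i,hik->hk', x, self.V)           # (n_hidden, k)
        x2v2 = np.einsum('i,hik->hk', x**2, self.V**2)   # (n_hidden, k)
        interactions = 0.5 * (xv**2 - x2v2).sum(axis=1)  # (n_hidden,)
        return np.tanh(linear + interactions)

The low-rank factors keep the number of interaction parameters linear in the input dimension while still letting every hyperparameter-metafeature pair interact explicitly.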
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Schilling, N., Wistuba, M., Drumond, L., Schmidt-Thieme, L. (2015). Hyperparameter Optimization with Factorized Multilayer Perceptrons. In: Appice, A., Rodrigues, P., Santos Costa, V., Gama, J., Jorge, A., Soares, C. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2015. Lecture Notes in Computer Science, vol. 9285. Springer, Cham. https://doi.org/10.1007/978-3-319-23525-7_6
Print ISBN: 978-3-319-23524-0
Online ISBN: 978-3-319-23525-7