Abstract
A major problem in deep learning is identifying appropriate hyperparameter configurations for deep architectures. The issue matters for two reasons: (1) an inappropriate hyperparameter configuration leads to mediocre performance, and (2) little expert experience is available to guide an informed choice. Random search is a straightforward option for this problem; however, the high time cost of each evaluation makes running numerous trials impractical. Our solution is based on data modeling with random forests, which we use to analyze how the performance of a deep architecture varies with its hyperparameters and to explore the underlying interactions among hyperparameters. The method is general and applies to any type of deep architecture. We test our approach on a deep belief network: the error rate drops from \(1.2\,\%\) to \(0.89\,\%\) after changing only three hyperparameter values.
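To make the idea concrete, the following is a minimal sketch of modeling observed performance as a function of hyperparameters with a random forest. It assumes previously evaluated configurations and their error rates are available as arrays; the variable names, hyperparameters, numbers, and the use of scikit-learn's RandomForestRegressor are illustrative assumptions, not the authors' exact implementation.

```python
# Sketch: fit a random forest on (hyperparameter configuration -> error rate)
# data, inspect hyperparameter importances, and score unseen configurations.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Each row is one evaluated configuration, e.g.
# [learning_rate, n_hidden_units, n_epochs]; y holds the observed error rates.
# (Values below are made up for illustration.)
X = np.array([
    [0.10,  500,  50],
    [0.01,  500,  50],
    [0.10, 1000,  50],
    [0.01, 1000, 100],
    [0.05,  800,  80],
])
y = np.array([0.018, 0.015, 0.014, 0.012, 0.013])

forest = RandomForestRegressor(n_estimators=200, random_state=0)
forest.fit(X, y)

# Relative importance of each hyperparameter to the modeled error,
# a rough view of which hyperparameters (and interactions) matter most.
print(dict(zip(["learning_rate", "n_hidden", "n_epochs"],
               forest.feature_importances_)))

# Predict the error of candidate configurations without training the network,
# and keep the most promising one for an actual run.
candidates = np.array([[0.02, 1200, 100],
                       [0.08,  600,  60]])
best = candidates[np.argmin(forest.predict(candidates))]
print("most promising candidate:", best)
```

In this setup only the configuration judged most promising by the surrogate model needs to be trained and evaluated for real, which is what makes the approach cheaper than plain random search when each evaluation is expensive.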
Acknowledgments
This research is supported in part by NSFC (Grant Nos. 61201348, 61472144), the National Science and Technology Support Plan (Grant Nos. 2013BAH65F01-2013BAH65F04), GDNSF (Grant Nos. S2011020000541, S2012040008016), GDSTP (Grant No. 2012A010701001), and the Research Fund for the Doctoral Program of Higher Education of China (Grant No. 20120172110023).
Copyright information
© 2015 Springer International Publishing Switzerland
Cite this paper
Li, Z.Z., Zhong, Z.Y., Jin, L.W. (2015). Identifying Best Hyperparameters for Deep Architectures Using Random Forests. In: Dhaenens, C., Jourdan, L., Marmion, M.E. (eds.) Learning and Intelligent Optimization. LION 2015. Lecture Notes in Computer Science, vol. 8994. Springer, Cham. https://doi.org/10.1007/978-3-319-19084-6_4
Print ISBN: 978-3-319-19083-9
Online ISBN: 978-3-319-19084-6