Identifying Best Hyperparameters for Deep Architectures Using Random Forests

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8994)

Abstract

A major problem in deep learning is identifying appropriate hyperparameter configurations for deep architectures. The issue matters because (1) inappropriate hyperparameter configurations lead to mediocre performance, and (2) little expert experience is available to guide an informed choice. Random search is a straightforward option, but the high time cost of each trial makes running many of them impractical. Our solution is based on data modeling with random forests: the forest is used to analyze how the performance of a deep architecture varies with its hyperparameters and to explore the underlying interactions among them. The method is general and applies to any type of deep architecture. We test the approach on a deep belief network, where the error rate drops from 1.2% to 0.89% after changing only three hyperparameter values.
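
To make the approach concrete, here is a minimal Python sketch of the general recipe the abstract describes: fit a random-forest surrogate to a handful of expensive (configuration, error) measurements, then use it to rank hyperparameters and screen new configurations cheaply. This is not the authors' code; the hyperparameter names and ranges, the synthetic evaluation function, and the use of scikit-learn's RandomForestRegressor are illustrative assumptions.

import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Hypothetical DBN hyperparameters: learning rate, hidden-layer width,
# number of contrastive-divergence steps, and pre-training epochs.
def sample_configs(n):
    return np.column_stack([
        10 ** rng.uniform(-3, -1, n),   # learning rate (log scale)
        rng.integers(100, 2000, n),     # hidden units
        rng.integers(1, 10, n),         # CD-k steps
        rng.integers(5, 100, n),        # pre-training epochs
    ])

def evaluate(config):
    # Placeholder for the expensive step: train the deep architecture with
    # this configuration and return its validation error rate. A synthetic
    # response is used here only so the sketch runs end to end.
    lr, hidden, cd_k, epochs = config
    return (0.02
            + 0.01 * (np.log10(lr) + 2.0) ** 2
            - 0.001 * np.log(hidden)
            + rng.normal(0.0, 0.002))

# 1) Evaluate a small random set of configurations (the expensive part).
X = sample_configs(30)
y = np.array([evaluate(c) for c in X])

# 2) Fit a random forest that models error as a function of the hyperparameters.
forest = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

# 3) Inspect which hyperparameters drive performance ...
print(dict(zip(["lr", "hidden", "cd_k", "epochs"], forest.feature_importances_)))

# 4) ... and screen a large pool of cheap candidate configurations with the
#    surrogate, re-training the network only for the most promising ones.
candidates = sample_configs(10000)
promising = candidates[np.argsort(forest.predict(candidates))[:5]]
print(promising)

Only the surrogate, not the network, is evaluated over the candidate pool, so the expensive training step is run just for the few configurations the forest flags as promising; the feature importances also suggest which hyperparameters and interactions deserve closer study, which is the exploratory use the abstract emphasizes.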


Notes

  1. http://www.cs.toronto.edu/hinton/MatlabForSciencePaper.html.

  2. https://randomforest-matlab.googlecode.com/files/Windows-Precompiled-RF_Mexstandalone-v0.02-.zip.


Acknowledgments

This research is supported in part by NSFC (Grant Nos. 61201348, 61472144), the National Science and Technology Support Plan (Grant Nos. 2013BAH65F01-2013BAH65F04), GDNSF (Grant Nos. S2011020000541, S2012040008016), GDSTP (Grant No. 2012A010701001), and the Research Fund for the Doctoral Program of Higher Education of China (Grant No. 20120172110023).

Author information

Corresponding author: Zhen-Zhen Li

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Li, ZZ., Zhong, ZY., Jin, LW. (2015). Identifying Best Hyperparameters for Deep Architectures Using Random Forests. In: Dhaenens, C., Jourdan, L., Marmion, ME. (eds) Learning and Intelligent Optimization. LION 2015. Lecture Notes in Computer Science, vol 8994. Springer, Cham. https://doi.org/10.1007/978-3-319-19084-6_4

  • DOI: https://doi.org/10.1007/978-3-319-19084-6_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-19083-9

  • Online ISBN: 978-3-319-19084-6
