Identifying Best Hyperparameters for Deep Architectures Using Random Forests

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8994)

Abstract

A major problem in deep learning is identifying appropriate hyperparameter configurations for deep architectures. The issue matters because (1) inappropriate hyperparameter configurations lead to mediocre performance, and (2) little expert experience is available to guide an informed choice. Random search is a straightforward option, but the high time cost of each trial makes running many of them impractical. Our solution is based on data modeling with random forests: the forest is used to analyze how the performance of a deep architecture varies with its hyperparameters and to explore the underlying interactions among them. The method is general and applies to any type of deep architecture. We test the approach on a deep belief network, where the error rate drops from 1.2% to 0.89% after changing only three hyperparameter values.
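
To make the approach concrete, here is a minimal Python sketch of the general recipe the abstract describes: fit a random-forest surrogate to a handful of expensive (configuration, error) measurements, then use it to rank hyperparameters and screen new configurations cheaply. This is not the authors' code; the hyperparameter names and ranges, the synthetic evaluation function, and the use of scikit-learn's RandomForestRegressor are illustrative assumptions.

import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Hypothetical DBN hyperparameters: learning rate, hidden-layer width,
# number of contrastive-divergence steps, and pre-training epochs.
def sample_configs(n):
    return np.column_stack([
        10 ** rng.uniform(-3, -1, n),   # learning rate (log scale)
        rng.integers(100, 2000, n),     # hidden units
        rng.integers(1, 10, n),         # CD-k steps
        rng.integers(5, 100, n),        # pre-training epochs
    ])

def evaluate(config):
    # Placeholder for the expensive step: train the deep architecture with
    # this configuration and return its validation error rate. A synthetic
    # response is used here only so the sketch runs end to end.
    lr, hidden, cd_k, epochs = config
    return (0.02
            + 0.01 * (np.log10(lr) + 2.0) ** 2
            - 0.001 * np.log(hidden)
            + rng.normal(0.0, 0.002))

# 1) Evaluate a small random set of configurations (the expensive part).
X = sample_configs(30)
y = np.array([evaluate(c) for c in X])

# 2) Fit a random forest that models error as a function of the hyperparameters.
forest = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

# 3) Inspect which hyperparameters drive performance ...
print(dict(zip(["lr", "hidden", "cd_k", "epochs"], forest.feature_importances_)))

# 4) ... and screen a large pool of cheap candidate configurations with the
#    surrogate, re-training the network only for the most promising ones.
candidates = sample_configs(10000)
promising = candidates[np.argsort(forest.predict(candidates))[:5]]
print(promising)

Only the surrogate, not the network, is evaluated over the candidate pool, so the expensive training step is run just for the few configurations the forest flags as promising; the feature importances also suggest which hyperparameters and interactions deserve closer study, which is the exploratory use the abstract emphasizes.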


Notes

  1. http://www.cs.toronto.edu/hinton/MatlabForSciencePaper.html.

  2. https://randomforest-matlab.googlecode.com/files/Windows-Precompiled-RF_Mexstandalone-v0.02-.zip.


Acknowledgments

This research is supported in part by NSFC (Grant Nos. 61201348, 61472144), the National Science and Technology Support Plan (Grant Nos. 2013BAH65F01-2013BAH65F04), GDNSF (Grant Nos. S2011020000541, S2012040008016), GDSTP (Grant No. 2012A010701001), and the Research Fund for the Doctoral Program of Higher Education of China (Grant No. 20120172110023).

Author information

Corresponding author: Zhen-Zhen Li

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Li, ZZ., Zhong, ZY., Jin, LW. (2015). Identifying Best Hyperparameters for Deep Architectures Using Random Forests. In: Dhaenens, C., Jourdan, L., Marmion, ME. (eds) Learning and Intelligent Optimization. LION 2015. Lecture Notes in Computer Science, vol 8994. Springer, Cham. https://doi.org/10.1007/978-3-319-19084-6_4

  • DOI: https://doi.org/10.1007/978-3-319-19084-6_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-19083-9

  • Online ISBN: 978-3-319-19084-6
