Kernel and Acquisition Function Setup for Bayesian Optimization of Gradient Boosting Hyperparameters
This paper investigates bank credit scoring with a Gradient Boosting classifier as an application scenario and shows how the classifier's hyperparameters can be optimized within the Bayesian Optimization paradigm. All the evaluated methods are based on a Gaussian Process model but differ in the kernel and the acquisition function. The main purpose of the research is to confirm experimentally that it is reasonable to tune both the kernel function and the acquisition function when applying Bayesian Optimization to Gradient Boosting hyperparameters. Moreover, the paper provides results indicating that, at least in the investigated application scenario, the superiority of some of the evaluated Bayesian Optimization methods over others depends strongly on the size of the optimization budget.
Keywords: Binary classification, Gradient Boosting, Hyperparameters, Bayesian Optimization, Gaussian Process, Kernel function, Acquisition function, Bank credit scoring
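To make concrete what "differing in the kernel and the acquisition function" can mean, the following is a minimal sketch in plain Python. The abstract does not name the kernels or acquisition functions that were evaluated, so the choices here (squared-exponential and Matérn 5/2 kernels; Expected Improvement, Probability of Improvement, and Upper Confidence Bound) are assumptions based on common practice, not the paper's setup. The candidate names, posterior means and standard deviations, and the `best_so_far` score are hypothetical illustration values, and all scores are treated as quantities to maximize.

```python
import math


def rbf_kernel(x, y, length_scale=1.0):
    """Squared-exponential (RBF) kernel between two points given as sequences."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-0.5 * sq_dist / length_scale ** 2)


def matern52_kernel(x, y, length_scale=1.0):
    """Matern kernel with smoothness 5/2 between two points given as sequences."""
    r = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y))) / length_scale
    s = math.sqrt(5.0) * r
    return (1.0 + s + s * s / 3.0) * math.exp(-s)


def _norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mu, sigma, best, xi=0.0):
    """Expected Improvement over `best` for a Gaussian posterior N(mu, sigma^2)."""
    if sigma <= 0.0:
        return max(mu - best - xi, 0.0)
    z = (mu - best - xi) / sigma
    return (mu - best - xi) * _norm_cdf(z) + sigma * _norm_pdf(z)


def probability_of_improvement(mu, sigma, best, xi=0.0):
    """Probability that a draw from N(mu, sigma^2) exceeds `best + xi`."""
    if sigma <= 0.0:
        return 1.0 if mu > best + xi else 0.0
    return _norm_cdf((mu - best - xi) / sigma)


def upper_confidence_bound(mu, sigma, kappa=2.0):
    """Optimistic score: posterior mean plus kappa posterior standard deviations."""
    return mu + kappa * sigma


# Hypothetical posterior (mean, std) of a validation score at two
# hyperparameter candidates: A is a safe small gain, B is uncertain.
candidates = {"A": (0.78, 0.01), "B": (0.74, 0.08)}
best_so_far = 0.77  # hypothetical best score observed so far

acquisitions = {
    "EI": lambda mu, sigma: expected_improvement(mu, sigma, best_so_far),
    "PI": lambda mu, sigma: probability_of_improvement(mu, sigma, best_so_far),
    "UCB": lambda mu, sigma: upper_confidence_bound(mu, sigma),
}

# For each acquisition function, pick the candidate it scores highest.
picks = {
    name: max(candidates, key=lambda c: acq(*candidates[c]))
    for name, acq in acquisitions.items()
}
print(picks)
```

With these illustration values, the acquisition functions do not all prefer the same candidate: Probability of Improvement favors the low-variance candidate, while the other two favor the uncertain one. This is one simple way to see why the acquisition function, like the kernel that shapes the posterior, can be treated as a tunable design choice.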
This work was supported by the Polish National Science Centre, grant DEC-2011/01/D/ST6/06788, and by Poznan University of Technology under grant 04/45/DSPB/0163.