Abstract
We consider a new method, based on modified loss functions, for improving the quality of training in gradient boosting and increasing its generalization performance. Computational experiments on real data show that the method can be applied to improve the quality of gradient boosting on various classification and regression problems.
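The abstract does not state which modification of the loss is used, so the following is only a general illustration of where a loss function enters gradient boosting, not the authors' method. It is a minimal sketch that assumes one-dimensional inputs and regression stumps as base learners. All names (`fit_stump`, `boost`, `squared_neg_grad`, `huber_neg_grad`) are hypothetical, and the Huber-style gradient is a stand-in for "some modified loss" chosen purely for illustration.

```python
from typing import Callable, List, Tuple

Stump = Tuple[float, float, float]  # (threshold, left value, right value)


def fit_stump(x: List[float], r: List[float]) -> Stump:
    """Least-squares regression stump fitted to targets r on 1-D inputs x."""
    best = None
    for t in sorted(set(x)):
        left = [ri for xi, ri in zip(x, r) if xi <= t]
        right = [ri for xi, ri in zip(x, r) if xi > t]
        lv = sum(left) / len(left)
        rv = sum(right) / len(right) if right else lv
        err = sum((ri - (lv if xi <= t else rv)) ** 2 for xi, ri in zip(x, r))
        if best is None or err < best[0]:
            best = (err, (t, lv, rv))
    return best[1]


def predict_stump(s: Stump, xi: float) -> float:
    t, lv, rv = s
    return lv if xi <= t else rv


def boost(x: List[float], y: List[float],
          neg_grad: Callable[[float, float], float],
          n_rounds: int = 50, lr: float = 0.1) -> List[float]:
    """Generic gradient boosting loop; returns the training predictions.

    The loss enters only through neg_grad(y_i, prediction_i), so swapping
    the loss means swapping this one function.
    """
    f0 = sum(y) / len(y)              # constant initial model
    pred = [f0] * len(y)
    for _ in range(n_rounds):
        # Pseudo-residuals: negative gradient of the loss at the current model.
        r = [neg_grad(yi, pi) for yi, pi in zip(y, pred)]
        s = fit_stump(x, r)
        pred = [pi + lr * predict_stump(s, xi) for xi, pi in zip(x, pred)]
    return pred


def squared_neg_grad(y: float, p: float) -> float:
    """Negative gradient of 0.5 * (y - p)^2 with respect to p."""
    return y - p


def huber_neg_grad(y: float, p: float, delta: float = 1.0) -> float:
    """Negative gradient of the Huber loss (illustrative stand-in only)."""
    d = y - p
    return d if abs(d) <= delta else delta * (1.0 if d > 0 else -1.0)


def mse(y: List[float], p: List[float]) -> float:
    return sum((yi - pi) ** 2 for yi, pi in zip(y, p)) / len(y)


# Toy data (invented for this sketch): a quadratic trend with one outlier.
x = [float(v) for v in range(10)]
y = [v * v for v in x]
y[4] = 100.0

baseline_mse = mse(y, [sum(y) / len(y)] * len(y))
sq_pred = boost(x, y, squared_neg_grad)
huber_pred = boost(x, y, huber_neg_grad)
sq_mse = mse(y, sq_pred)
```

With the squared loss, each round fits a stump to the ordinary residuals, so the training error falls below that of the constant model. With the Huber-style gradient, large residuals are clipped to `delta`, so the outlier pulls the ensemble less strongly. Whether a changed loss improves generalization on real data is the question the paper studies experimentally.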
Notes
https://drive.google.com/file/d/1EyiNNQ_u0CzQ7qYEdZEeEwFTwjkkv2xL/view.
https://drive.google.com/file/d/1ADa975pas6WPm5SDmPCRF4oPrAyoBkx4/view?usp=sharing.
Additional information
Translated by V. Potapchouck
Cite this article
Korolev, N.S., Senko, O.V. Method for Improving Gradient Boosting Learning Efficiency Based on Modified Loss Functions. Autom Remote Control 83, 1935–1943 (2022). https://doi.org/10.1134/S00051179220120074