On Incremental Learning for Gradient Boosting Decision Trees

  • Chongsheng Zhang
  • Yuan Zhang
  • Xianjin Shi (corresponding author)
  • George Almpanidis
  • Gaojuan Fan
  • Xiajiong Shen


Abstract

Boosting algorithms, as a class of ensemble learning methods, have become very popular in data classification owing to their strong theoretical guarantees and outstanding prediction performance. However, most boosting algorithms were designed for static data, so they cannot be directly applied to online or incremental learning. In this paper, we propose a novel algorithm, iGBDT, that incrementally updates a classification model built upon gradient boosting decision trees (GBDT). The main idea of iGBDT is to learn a new model incrementally, without running GBDT from scratch, as new data arrives in batches. We conduct large-scale experiments to validate the effectiveness and efficiency of iGBDT. All the experimental results show that, in terms of model building/updating time, iGBDT significantly outperforms the conventional practice of rerunning GBDT from scratch whenever a new batch of data arrives, while keeping the same classification accuracy. iGBDT can be used in many applications that require timely analysis of continuously arriving or real-time user-generated data, such as behaviour targeting, Internet advertising, and recommender systems.
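To make the contrast in the abstract concrete, the sketch below shows a toy batch-incremental boosting interface in Python. It is not the paper's iGBDT algorithm (whose internals are not given in this excerpt): it uses squared loss, one-split regression stumps, and an `update` method that keeps all previously fitted trees and only appends new ones fitted to residuals over the enlarged data set, instead of refitting the whole ensemble from scratch. All class and method names here are illustrative assumptions.

```python
# Toy sketch of a batch-incremental boosting interface. NOT the paper's
# iGBDT algorithm; it only illustrates the contrast between retraining
# from scratch and appending trees when a new batch arrives.

class Stump:
    """One-split regression tree fitted to residuals (squared loss)."""
    def fit(self, X, r):
        best = (float("inf"), 0, 0.0, 0.0, 0.0)  # (sse, feature, thr, left_mean, right_mean)
        for j in range(len(X[0])):
            for t in sorted({x[j] for x in X}):
                left = [ri for x, ri in zip(X, r) if x[j] <= t]
                right = [ri for x, ri in zip(X, r) if x[j] > t]
                if not left or not right:
                    continue
                lm, rm = sum(left) / len(left), sum(right) / len(right)
                sse = (sum((ri - lm) ** 2 for ri in left)
                       + sum((ri - rm) ** 2 for ri in right))
                if sse < best[0]:
                    best = (sse, j, t, lm, rm)
        _, self.j, self.t, self.lm, self.rm = best
        return self

    def predict(self, x):
        return self.lm if x[self.j] <= self.t else self.rm


class IncrementalGBDT:
    """Gradient boosting with stumps, plus an incremental update step."""
    def __init__(self, lr=0.5):
        self.lr, self.trees, self.base = lr, [], 0.0

    def fit(self, X, y, n_trees):
        # Conventional practice: (re)build the whole ensemble from scratch.
        self.trees, self.base = [], sum(y) / len(y)
        self._boost(X, y, n_trees)
        return self

    def update(self, X_new, y_new, n_trees, X_old, y_old):
        # Incremental step: keep existing trees, append new ones fitted on
        # residuals over old + new data. (A simplification: the actual
        # iGBDT avoids full recomputation more cleverly.)
        self._boost(X_old + X_new, y_old + y_new, n_trees)
        return self

    def _boost(self, X, y, n_trees):
        pred = [self._raw(x) for x in X]
        for _ in range(n_trees):
            resid = [yi - pi for yi, pi in zip(y, pred)]
            tree = Stump().fit(X, resid)
            self.trees.append(tree)
            pred = [p + self.lr * tree.predict(x) for p, x in zip(pred, X)]

    def _raw(self, x):
        return self.base + self.lr * sum(t.predict(x) for t in self.trees)

    def predict(self, x):
        return 1 if self._raw(x) >= 0.5 else 0
```

Fitting on an initial batch and then calling `update` with a new batch keeps the old trees intact, which is what saves the retraining cost the abstract refers to; only the residual refit over the data touches the new examples.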


Keywords

Gradient boosting · Gradient boosting decision tree · Incremental learning · Ensemble learning




Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  • Chongsheng Zhang (1)
  • Yuan Zhang (1)
  • Xianjin Shi (1, 2) (corresponding author)
  • George Almpanidis (1)
  • Gaojuan Fan (1)
  • Xiajiong Shen (1)

  1. Henan University, Kaifeng, China
  2. Education Information Centre of Henan Province, Zhengzhou, China
