Automatically Optimized Gradient Boosting Trees for Classifying Large Volume High Cardinality Data Streams Under Concept Drift

  • Jobin Wilson
  • Amit Kumar Meher
  • Bivin Vinodkumar Bindu
  • Santanu Chaudhury
  • Brejesh Lall
  • Manoj Sharma
  • Vishakha Pareek
Conference paper
Part of The Springer Series on Challenges in Machine Learning book series (SSCML)


Data abundance, combined with a scarcity of machine learning experts and domain specialists, necessitates progressive automation of end-to-end machine learning workflows. To this end, Automated Machine Learning (AutoML) has emerged as a prominent research area. Real-world data often arrives as streams or batches, and the data distribution evolves over time, causing concept drift. Models need to handle data that is not independent and identically distributed (iid), and to transfer knowledge across time through continuous self-evaluation and adaptation while adhering to resource constraints. Creating autonomous, self-maintaining models that not only discover an optimal pipeline but also automatically adapt to concept drift in a lifelong learning setting was the crux of the NeurIPS 2018 AutoML challenge. We describe our winning solution to the challenge, entitled AutoGBT, which combines an adaptive, self-optimized end-to-end machine learning pipeline based on gradient boosting trees with automatic hyper-parameter tuning using Sequential Model-Based Optimization (SMBO). We report experimental results on the challenge datasets as well as on several benchmark datasets affected by concept drift, and compare AutoGBT with the baseline model for the challenge and with Auto-sklearn. The results indicate the effectiveness of the proposed methodology in this context.
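To make the idea of Sequential Model-Based Optimization concrete, the following is a minimal, standard-library-only sketch of a generic SMBO loop. It is not the AutoGBT implementation. The objective function, the kernel-weighted surrogate, the distance-based exploration bonus, and all names and default values (`objective`, `surrogate`, `smbo`, `bandwidth`, `kappa`) are assumptions chosen for illustration. In practice the objective would be a validation score of a gradient boosting model, and the surrogate and acquisition function would come from a dedicated library.

```python
import math
import random


def objective(log_lr):
    """Hypothetical validation loss as a function of log10(learning rate).

    A stand-in for training and scoring a model; its minimum is at -1.5.
    """
    return (log_lr + 1.5) ** 2 + 0.1


def surrogate(x, history, bandwidth=0.3):
    """Cheap surrogate: kernel-weighted average of the losses observed so far."""
    weights = [math.exp(-((x - xi) / bandwidth) ** 2) for xi, _ in history]
    total = sum(weights)
    if total < 1e-12:
        # Far from every observation: fall back to the mean observed loss.
        return sum(y for _, y in history) / len(history)
    return sum(w * y for w, (_, y) in zip(weights, history)) / total


def smbo(objective, low, high, n_init=5, n_iter=25, n_candidates=200,
         kappa=0.5, seed=0):
    """Toy SMBO loop over one continuous hyper-parameter in [low, high]."""
    rng = random.Random(seed)
    # 1. Evaluate a few random configurations to seed the history.
    history = []
    for _ in range(n_init):
        x = rng.uniform(low, high)
        history.append((x, objective(x)))

    for _ in range(n_iter):
        candidates = [rng.uniform(low, high) for _ in range(n_candidates)]

        # 2. Acquisition: predicted loss minus a bonus for unexplored regions
        #    (an assumed heuristic; real systems often use expected improvement).
        def acquisition(x):
            nearest = min(abs(x - xi) for xi, _ in history)
            return surrogate(x, history) - kappa * nearest

        # 3. Evaluate the expensive objective only at the most promising candidate.
        x_next = min(candidates, key=acquisition)
        history.append((x_next, objective(x_next)))

    best = min(history, key=lambda pair: pair[1])
    return best, history


best, history = smbo(objective, low=-4.0, high=0.0)
print("best log10(learning rate):", best[0], "loss:", best[1])
```

The key design point is that the expensive objective is evaluated once per iteration, while the cheap surrogate is queried many times to decide where that single evaluation should go.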


AutoML, Concept drift, Lifelong machine learning, Hyperopt, Auto-sklearn



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Jobin Wilson (1, 2)
  • Amit Kumar Meher (1)
  • Bivin Vinodkumar Bindu (1, 2)
  • Santanu Chaudhury (2)
  • Brejesh Lall (2)
  • Manoj Sharma (3)
  • Vishakha Pareek (3)

  1. R&D Department, Flytxt, Trivandrum, India
  2. Department of Electrical Engineering, Indian Institute of Technology Delhi, New Delhi, India
  3. CSIR-CEERI, Pilani, India
