Hybrid decision trees for data streams based on Incremental Flexible Naive Bayes prediction at leaf nodes

  • C. Sweetlin Hemalatha
  • Ravi Pathak
  • V. Vaidehi
Research Paper


Mining data streams in a single pass with constant memory is a challenging task. Decision trees are among the most popular classifiers for both batch and incremental learning because of their high interpretability, ease of construction and good accuracy. The most popular decision tree for stream classification is the Hoeffding Tree, which is based on the Hoeffding bound. The literature reports a few variants of decision trees based on different bounds. The default class prediction method adopted in decision trees is the “majority class” approach. Prediction accuracy was later improved by hybrid decision trees in which a Naive Bayes classifier is used for prediction. Flexible Naive Bayes employs Kernel Density Estimation (KDE) for classification, but it is suited only to modeling static data sets. This paper proposes an Incremental Flexible Naive Bayes (IFNB) based hybrid decision tree paradigm that uses KDE to model continuous attributes at the leaf nodes of the tree to improve class prediction accuracy. Experimental results on both synthetic and real datasets show that the proposed IFNB based leaf classifiers achieve an improvement over the class prediction methods adopted in existing decision trees for data streams.
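
The ideas named in the abstract can be illustrated with a small sketch. It is not the paper's IFNB algorithm: it is a toy leaf predictor that, as an assumption, stores every observed value per class and attribute (so it does not meet the constant-memory requirement) and uses a Gaussian kernel with bandwidth 1/sqrt(n_c), the choice commonly associated with Flexible Naive Bayes. The names `hoeffding_bound` and `KdeNaiveBayesLeaf` are hypothetical and introduced only for illustration.

```python
import math
from collections import defaultdict


def hoeffding_bound(value_range, delta, n):
    """Hoeffding bound epsilon after n observations of a variable with the given range."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


class KdeNaiveBayesLeaf:
    """Toy leaf predictor: naive Bayes with one Gaussian-kernel density per class and attribute.

    Assumption: all values are kept in memory, which a real stream learner would avoid.
    """

    def __init__(self):
        self.class_counts = defaultdict(int)
        # values[c][j] holds every value of attribute j observed with class c
        self.values = defaultdict(lambda: defaultdict(list))

    def update(self, x, label):
        """Absorb one labelled example incrementally."""
        self.class_counts[label] += 1
        for j, v in enumerate(x):
            self.values[label][j].append(v)

    def _density(self, samples, v):
        """Average of Gaussian kernels centred on the stored samples."""
        h = 1.0 / math.sqrt(len(samples))  # assumed bandwidth: 1/sqrt(n_c)
        norm = 1.0 / (h * math.sqrt(2.0 * math.pi))
        total = sum(norm * math.exp(-0.5 * ((v - s) / h) ** 2) for s in samples)
        return total / len(samples)

    def predict(self, x):
        """Return the class with the highest log posterior score, or None if empty."""
        total = sum(self.class_counts.values())
        best, best_score = None, -math.inf
        for c, n_c in self.class_counts.items():
            score = math.log(n_c / total)
            for j, v in enumerate(x):
                # small constant guards against log(0) far from all samples
                score += math.log(self._density(self.values[c][j], v) + 1e-300)
            if score > best_score:
                best, best_score = c, score
        return best

    def majority_class(self):
        """Baseline prediction: the most frequent class seen at this leaf."""
        return max(self.class_counts, key=self.class_counts.get)


leaf = KdeNaiveBayesLeaf()
for x, y in [([0.0], "a"), ([0.1], "a"), ([-0.1], "a"), ([5.0], "b"), ([5.2], "b")]:
    leaf.update(x, y)

print(leaf.majority_class())   # majority-class baseline ignores the attribute value
print(leaf.predict([4.8]))     # KDE-based prediction uses the attribute value
```

In this toy stream the majority-class baseline always answers with the more frequent class, while the KDE-based leaf can assign a query near 5 to the less frequent class, which is the kind of gain a Naive Bayes style leaf predictor is meant to provide.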


Data stream mining, Decision trees, Hoeffding bound, Kernel density estimation, Incremental Flexible Naive Bayes




Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  • C. Sweetlin Hemalatha (1)
  • Ravi Pathak (2)
  • V. Vaidehi (1)
  1. School of Computing Science and Engineering, VIT, Chennai, India
  2. Global Biodiversity Information Facility (GBIF) Secretariat, Copenhagen, Denmark
