Machine Learning, Volume 106, Issue 9–10, pp 1469–1495

Adaptive random forests for evolving data stream classification

  • Heitor M. Gomes
  • Albert Bifet
  • Jesse Read
  • Jean Paul Barddal
  • Fabrício Enembreck
  • Bernhard Pfahringer
  • Geoff Holmes
  • Talel Abdessalem
Part of the following topical collections:
  1. Special Issue of the ECML PKDD 2017 Journal Track

Abstract

Random forests is currently one of the most widely used machine learning algorithms in the non-streaming (batch) setting. This preference is attributable to its high learning performance and low demands with respect to input preparation and hyper-parameter tuning. However, in the challenging context of evolving data streams, there is no random forests algorithm that can be considered state-of-the-art in comparison to bagging- and boosting-based algorithms. In this work, we present the adaptive random forest (ARF) algorithm for classification of evolving data streams. In contrast to previous attempts at replicating random forests for data stream learning, ARF includes an effective resampling method and adaptive operators that can cope with different types of concept drift without complex optimizations for different data sets. We present experiments with a parallel implementation of ARF which shows no degradation in classification performance in comparison to a serial implementation, since trees and adaptive operators are independent of one another. Finally, we compare ARF with state-of-the-art algorithms in a traditional test-then-train evaluation and a novel delayed labelling evaluation, and show that ARF is accurate and uses a feasible amount of resources.

Keywords

  • Data stream mining
  • Random forests
  • Ensemble learning
  • Concept drift

Copyright information

© The Author(s) 2017

Authors and Affiliations

  1. PPGIa, Pontifícia Universidade Católica do Paraná, Curitiba, Brazil
  2. LTCI, Télécom ParisTech, Université Paris-Saclay, Paris, France
  3. LIX, École Polytechnique, Palaiseau, France
  4. Department of Computer Science, University of Waikato, Hamilton, New Zealand
  5. UMI CNRS IPAL & School of Computing, National University of Singapore, Singapore, Singapore
