Adaptive random forests for evolving data stream classification

  • Heitor M. Gomes
  • Albert Bifet
  • Jesse Read
  • Jean Paul Barddal
  • Fabrício Enembreck
  • Bernhard Pfahringer
  • Geoff Holmes
  • Talel Abdessalem
Part of the following topical collections:
  1. Special Issue of the ECML PKDD 2017 Journal Track

Abstract

Random forests is currently one of the most widely used machine learning algorithms in the non-streaming (batch) setting. This preference is attributable to its high learning performance and low demands with respect to input preparation and hyper-parameter tuning. However, in the challenging context of evolving data streams, there is no random forests algorithm that can be considered state-of-the-art in comparison to bagging and boosting based algorithms. In this work, we present the adaptive random forest (ARF) algorithm for classification of evolving data streams. In contrast to previous attempts at replicating random forests for data stream learning, ARF includes an effective resampling method and adaptive operators that can cope with different types of concept drift without complex optimizations for different data sets. We present experiments with a parallel implementation of ARF, which shows no degradation in classification performance compared with a serial implementation, since trees and adaptive operators are independent of one another. Finally, we compare ARF with state-of-the-art algorithms in a traditional test-then-train evaluation and a novel delayed labelling evaluation, and show that ARF is accurate and uses a feasible amount of resources.

Keywords

  • Data stream mining
  • Random forests
  • Ensemble learning
  • Concept drift

Copyright information

© The Author(s) 2017

Authors and Affiliations

  1. PPGIa, Pontifícia Universidade Católica do Paraná, Curitiba, Brazil
  2. LTCI, Télécom ParisTech, Université Paris-Saclay, Paris, France
  3. LIX, École Polytechnique, Palaiseau, France
  4. Department of Computer Science, University of Waikato, Hamilton, New Zealand
  5. UMI CNRS IPAL & School of Computing, National University of Singapore, Singapore, Singapore
