Tapped Delay Lines for GP Streaming Data Classification with Label Budgets

  • Ali Vahdat
  • Jillian Morgan
  • Andrew R. McIntyre
  • Malcolm I. Heywood
  • A. Nur Zincir-Heywood
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9025)

Abstract

Streaming data classification requires that a model be available for classifying stream content while simultaneously detecting and reacting to changes in the underlying process generating the data. Because only a fraction of the stream is ‘visible’ at any point in time (i.e. through some form of window interface), it is difficult to guarantee that a classifier will encounter a ‘well mixed’ distribution of classes across the stream. Moreover, streaming data classifiers are also required to operate under a limited label budget (labelling all the data is too expensive). We take these requirements to motivate the use of an active learning strategy for decoupling genetic programming training epochs from stream throughput. The content of a data subset is controlled by a combination of Pareto archiving and stochastic sampling. In addition, a significant benefit is attributed to support for a tapped delay line (TDL) interface to the stream, but this also increases the dimensionality of the task. We demonstrate that the benefits of assuming the TDL can be maintained through the use of oversampling without recourse to additional label information. Benchmarking on four datasets demonstrates that the approach is particularly effective when reacting to shifts in the underlying properties of the stream. Moreover, an online formulation of the class-wise detection rate is assumed, which robustly characterizes classifier performance throughout the stream.
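
As a rough illustration of two ideas named in the abstract, the following sketch shows one way a tapped delay line could be placed over a stream and how a class-wise detection rate could be updated online. It is not the authors' implementation. The names `tapped_delay_line`, `taps`, `stride` and `OnlineDetectionRate` are hypothetical, and the detection rate is assumed here to be the per-class recall tp/(tp + fn) averaged over the classes seen so far; the abstract itself does not give the formula.

```python
from collections import defaultdict, deque


def tapped_delay_line(stream, taps=3, stride=1):
    """Yield one concatenated vector per step: the newest record followed by
    (taps - 1) older records spaced `stride` steps apart (assumed layout)."""
    span = (taps - 1) * stride + 1           # records the window must cover
    buffer = deque(maxlen=span)
    for record in stream:
        buffer.append(list(record))
        if len(buffer) == span:              # wait until every tap is filled
            window = []
            for k in range(taps):
                window.extend(buffer[-1 - k * stride])
            yield window                     # dimensionality grows by a factor of `taps`


class OnlineDetectionRate:
    """Running class-wise detection rate (assumed: mean per-class recall)."""

    def __init__(self):
        self.tp = defaultdict(int)           # correct predictions per true class
        self.fn = defaultdict(int)           # missed predictions per true class

    def update(self, true_label, predicted_label):
        if predicted_label == true_label:
            self.tp[true_label] += 1
        else:
            self.fn[true_label] += 1
        return self.value()

    def value(self):
        classes = set(self.tp) | set(self.fn)
        if not classes:
            return 0.0                       # nothing observed yet
        rates = [self.tp[c] / (self.tp[c] + self.fn[c]) for c in classes]
        return sum(rates) / len(rates)


if __name__ == "__main__":
    stream = [[1], [2], [3], [4]]
    print(list(tapped_delay_line(stream, taps=2, stride=1)))

    metric = OnlineDetectionRate()
    for truth, guess in [(0, 0), (1, 0), (0, 0)]:
        print(metric.update(truth, guess))
```

In this sketch each class contributes equally to the score regardless of how often it appears, which is why a classifier that always predicts the majority class scores 0.5 on the two-class example even though two of its three predictions are correct.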

Keywords

  • Streaming data classification
  • Non-stationary
  • Class imbalance
  • Benchmarking

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Ali Vahdat (1)
  • Jillian Morgan (1)
  • Andrew R. McIntyre (1)
  • Malcolm I. Heywood (1)
  • A. Nur Zincir-Heywood (1)

  1. Faculty of Computer Science, Dalhousie University, Halifax, Canada
