Outlier and anomaly pattern detection on data streams

A data stream is a sequence of data generated continuously over time. It is typically too large to be stored in memory, and its underlying data distribution may change over time. Outlier detection aims to find data instances that deviate significantly from the underlying data distribution. While most outlier detection methods work in batch mode, where all data samples are available at once, the need for efficient outlier and anomaly pattern detection methods for data streams has increased. Outlier detection is performed at the level of individual instances, whereas anomaly pattern detection identifies a point in time at which the behavior of the data becomes unusual and differs from normal behavior. Concept drift detection methods, in turn, find the point at which the concept changes in the streaming data and try to adapt the model to the newly emerging pattern. In this paper, we provide a review of outlier detection, anomaly pattern detection, and concept drift detection for streaming data.
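
To make the streaming constraint concrete, the following is a minimal sketch of instance-level outlier detection that never stores the stream. It is an illustration only and not a method taken from the reviewed literature: it keeps a running mean and variance (Welford's single-pass update) and flags an instance whose distance from the mean exceeds a multiple of the running standard deviation. The names `StreamingZScore` and `detect_outliers`, and the parameters `threshold` and `warmup`, are hypothetical choices made for this example.

```python
import math


class StreamingZScore:
    """Flags instances far from the running mean using constant memory."""

    def __init__(self, threshold=3.0, warmup=10):
        self.threshold = threshold  # assumed cut-off, in standard deviations
        self.warmup = warmup        # assumed number of samples seen before scoring
        self.n = 0                  # number of samples absorbed so far
        self.mean = 0.0             # running mean
        self.m2 = 0.0               # running sum of squared deviations from the mean

    def update(self, x):
        """Score x against the statistics seen so far, then absorb x."""
        is_outlier = False
        if self.n >= self.warmup:
            std = math.sqrt(self.m2 / (self.n - 1))
            is_outlier = std > 0 and abs(x - self.mean) > self.threshold * std
        # Welford's single-pass update of mean and squared deviations
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        return is_outlier


def detect_outliers(stream, threshold=3.0, warmup=10):
    """Return the positions of flagged instances in an iterable stream."""
    detector = StreamingZScore(threshold, warmup)
    return [i for i, x in enumerate(stream) if detector.update(x)]


if __name__ == "__main__":
    readings = [10.0, 10.5, 9.5, 10.2, 9.8] * 4 + [50.0, 10.1, 9.9]
    print(detect_outliers(readings))
```

This sketch assumes a stationary distribution: every instance, including a flagged one, is absorbed into the running statistics, and nothing is ever forgotten. It therefore does not adapt when the underlying distribution changes, which is the gap that anomaly pattern detection and concept drift detection methods address.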

This research was supported by Korea Electric Power Corporation (Grant number: R18XA05).

