OPTICS-OF: Identifying Local Outliers

* Final gross prices may vary according to local VAT.

Get Access

Abstract

For many KDD applications finding the outliers, i.e. the rare events, is more interesting and useful than finding the common cases, e.g. detecting criminal activities in E-commerce. Being an outlier, however, is not just a binary property. Instead, it is a property that applies to a certain degree to each object in a data set, depending on how ‘isolated’ this object is, with respect to the surrounding clustering structure. In this paper, we formally introduce a new notion of outliers which bases outlier detection on the same theoretical foundation as density-based cluster analysis. Our notion of an outlier is ‘local’ in the sense that the outlier-degree of an object is determined by taking into account the clustering structure in a bounded neighborhood of the object. We demonstrate that this notion of an outlier is more appropriate for detecting different types of outliers than previous approaches, and we also present an algorithm for finding them. Furthermore, we show that by combining the outlier detection with a density-based method to analyze the clustering structure, we can get the outliers almost for free if we already want to perform a cluster analysis on a data set.