AI 2010: Advances in Artificial Intelligence

Volume 6464 of the series Lecture Notes in Computer Science pp 122-131

An Effective Pattern Based Outlier Detection Approach for Mixed Attribute Data

  • Ke Zhang, College of Engineering & Computer Science, Australian National University
  • Huidong Jin, College of Engineering & Computer Science, Australian National University; CSIRO Mathematics, Informatics and Statistics



Detecting outliers in mixed attribute datasets is one of the major challenges in real-world applications. Existing outlier detection methods lack effectiveness on mixed attribute datasets, mainly because they cannot account for interactions among attributes of different types, e.g., numerical and categorical attributes. To address this issue, we propose a novel Pattern based Outlier Detection approach (POD). A pattern in this paper is defined to describe the majority of the data as well as to capture interactions among different types of attributes. In POD, the more an object deviates from these patterns, the higher its outlier factor. We use logistic regression to learn patterns and then formulate the outlier factor for mixed attribute datasets. A series of experimental results illustrates that POD performs statistically significantly better than several classic outlier detection methods.
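
The abstract does not give the exact definition of a pattern or of the outlier factor, so the following is only a minimal sketch of the general idea under stated assumptions, not the POD formulation itself. It assumes a single binary categorical attribute predicted from the numerical attributes by logistic regression, and it assumes the outlier factor is the negative log-probability of the observed category. The names `fit_logistic` and `outlier_factor`, the toy data, and the hyperparameters are all hypothetical.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(xs, ys, lr=0.1, epochs=2000, l2=0.01):
    """Fit weights (bias first) by batch gradient descent on the
    L2-regularised log-loss. Hyperparameters are illustrative."""
    n, d = len(xs), len(xs[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for x, y in zip(xs, ys):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
            err = sigmoid(z) - y
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        for j in range(d + 1):
            reg = l2 * w[j] if j > 0 else 0.0  # bias is not regularised
            w[j] -= lr * (grad[j] / n + reg)
    return w

def outlier_factor(w, x, y):
    """ASSUMED score: negative log-probability of the observed category
    given the numerical attributes, under the learned pattern."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    p = sigmoid(z)
    p_obs = p if y == 1 else 1.0 - p
    return -math.log(max(p_obs, 1e-12))

# Toy mixed attribute data: one numerical and one binary categorical attribute.
numeric = [[-2.0], [-1.5], [-1.0], [-0.5], [0.5], [1.0], [1.5], [2.0], [1.8]]
category = [0, 0, 0, 0, 1, 1, 1, 1, 0]  # the last object breaks the pattern

weights = fit_logistic(numeric, category)
scores = [outlier_factor(weights, x, y) for x, y in zip(numeric, category)]
most_outlying = max(range(len(scores)), key=scores.__getitem__)
```

In this toy setting the learned pattern is "large numerical value goes with category 1", so the last object, which pairs a large value with category 0, should receive the largest score. A detector for real data would need multi-valued categorical attributes and several patterns, which this sketch does not attempt.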


outlier detection; mixed attribute data; pattern based outlier detection