DWOF: A Robust Density-Based Outlier Detection Approach

  • Rana Momtaz
  • Nesma Mohssen
  • Mohammad A. Gowayyed
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7887)

Abstract

Unsupervised outlier detection is challenging, especially when the structure of the data is unknown. This paper presents a new density-based outlier detection technique that detects the top-n outliers. It addresses limitations of existing approaches, such as low accuracy and high sensitivity to parameters. Our approach assigns each object a score called the Dynamic-Window Outlier Factor (DWOF). DWOF extends the Resolution-based Outlier Factor (ROF) method to handle clusters of varying density, which improves the ranking of outliers even when both methods report the same set of outliers. Experiments show that DWOF's average accuracy is higher than that of existing approaches and that it is less sensitive to its parameter.
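
To make the top-n framing concrete, the sketch below ranks objects by a per-object outlier score and returns the n highest-scoring ones. It is not DWOF or ROF, whose definitions are not given in this abstract. The score used here (mean distance to the k nearest neighbours) is a simple stand-in chosen for illustration, and the function names, the parameter k, and the sample points are all assumptions.

```python
import math


def knn_distance_scores(points, k):
    """Illustrative stand-in score (NOT DWOF): mean distance from each
    point to its k nearest neighbours. Larger means more isolated."""
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


def top_n_outliers(points, k, n):
    """Return the indices of the n points with the highest scores."""
    scores = knn_distance_scores(points, k)
    ranked = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    return ranked[:n]


# Hypothetical data: four tightly grouped points and one isolated point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
print(top_n_outliers(points, k=2, n=1))  # -> [4]
```

A single global score like this one tends to misrank objects when clusters differ in density, which is the situation the abstract says DWOF is designed to handle.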

Keywords

Unsupervised Outlier Detection; Density-Based Outlier Factor; Resolution-Based Outlier Factor



Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Rana Momtaz (1)
  • Nesma Mohssen (1)
  • Mohammad A. Gowayyed (1)
  1. Computer and Systems Engineering, Alexandria University, Egypt
