Grid-ODF: Detecting Outliers Effectively and Efficiently in Large Multi-dimensional Databases

  • Conference paper
Computational Intelligence and Security (CIS 2005)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3801)

Abstract

In this paper, we propose a novel outlier mining algorithm, called Grid-ODF, that takes into account both the local and global perspectives of outliers for effective detection. The notion of Outlying Degree Factor (ODF), which reflects both density and distance, is introduced to rank outliers. A grid structure that partitions the data space is employed so that Grid-ODF can be implemented efficiently. Experimental results show that Grid-ODF outperforms existing outlier detection algorithms such as LOF and KNN-distance in terms of both effectiveness and efficiency.
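
The abstract does not state the ODF formula, so the following Python sketch only illustrates the general idea of ranking points by a score that combines grid-cell density with distance. The function name `grid_outlier_scores`, the "dense cell" rule, and the scoring formula are all hypothetical choices made for illustration; they are not the paper's definition of ODF.

```python
import math
from collections import Counter


def grid_outlier_scores(points, cell_size):
    """Toy grid-based score (NOT the paper's ODF): distance from a point to the
    centre of the nearest dense cell, divided by the population of its own cell."""
    # Map every point to the integer coordinates of the grid cell containing it.
    cells = [tuple(math.floor(x / cell_size) for x in p) for p in points]
    counts = Counter(cells)

    # Assumed rule: a cell is "dense" if it holds at least the mean population
    # of the non-empty cells. The most populated cell always qualifies.
    mean_count = sum(counts.values()) / len(counts)
    dense = [c for c, n in counts.items() if n >= mean_count]

    scores = []
    for p, c in zip(points, cells):
        # Distance component: how far the point lies from the nearest dense region.
        dist = min(
            math.dist(p, [(i + 0.5) * cell_size for i in d]) for d in dense
        )
        # Density component: points in sparsely populated cells score higher.
        scores.append(dist / counts[c])
    return scores


if __name__ == "__main__":
    pts = [(0.1, 0.2), (0.3, 0.4), (0.5, 0.5), (0.6, 0.7), (0.8, 0.9), (9.5, 9.5)]
    ranked = sorted(zip(grid_outlier_scores(pts, 1.0), pts), reverse=True)
    for score, point in ranked:
        print(point, round(score, 3))
```

In this toy data set the isolated point `(9.5, 9.5)` receives the highest score, because it sits alone in its cell and far from the single dense cell. Bucketing points into cells means density is read from a counter rather than from pairwise neighbour searches, which is the kind of saving a grid partition is meant to provide.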

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wang, W., Zhang, J., Wang, H. (2005). Grid-ODF: Detecting Outliers Effectively and Efficiently in Large Multi-dimensional Databases. In: Hao, Y., et al. Computational Intelligence and Security. CIS 2005. Lecture Notes in Computer Science, vol 3801. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11596448_113

  • DOI: https://doi.org/10.1007/11596448_113

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-30818-8

  • Online ISBN: 978-3-540-31599-5

  • eBook Packages: Computer Science, Computer Science (R0)
