Grid-ODF: Detecting Outliers Effectively and Efficiently in Large Multi-dimensional Databases

Wang, Wei; Zhang, Ji; Wang, Hai

doi:10.1007/11596448_113


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3801)

Included in the following conference series:

International Conference on Computational and Information Science


Abstract

In this paper, we propose a novel outlier mining algorithm, called Grid-ODF, that takes into account both the local and global perspectives of outliers for effective detection. The notion of Outlying Degree Factor (ODF), which reflects both density and distance, is introduced to rank outliers. A grid structure that partitions the data space enables Grid-ODF to be implemented efficiently. Experimental results show that Grid-ODF outperforms existing outlier detection algorithms such as LOF and KNN-distance in terms of both effectiveness and efficiency.
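
The abstract does not give the ODF formula, so the following Python sketch is only an illustration of the general idea: partition the space into grid cells, then rank points by a score that mixes a density term and a distance term. The function name `grid_outlier_scores`, the parameters `cell_size` and `dense_threshold`, and the score itself (distance to the nearest dense-cell centroid divided by the point's own cell count) are assumptions made for this example and should not be read as the paper's definition.

```python
from collections import defaultdict
import math


def grid_outlier_scores(points, cell_size, dense_threshold):
    """Illustrative grid-based outlier score (hypothetical, not the paper's ODF).

    points: iterable of equal-length coordinate tuples (assumed distinct).
    cell_size: edge length of each grid cell (assumed parameter).
    dense_threshold: minimum point count for a cell to count as dense.
    """
    # Assign each point to a grid cell by flooring its scaled coordinates.
    cells = defaultdict(list)
    for p in points:
        key = tuple(math.floor(x / cell_size) for x in p)
        cells[key].append(p)

    # Compute the centroid of every dense cell.
    dense_centroids = []
    for members in cells.values():
        if len(members) >= dense_threshold:
            dim = len(members[0])
            centroid = tuple(
                sum(m[d] for m in members) / len(members) for d in range(dim)
            )
            dense_centroids.append(centroid)

    # Score = (distance to nearest dense centroid) / (own cell count).
    # Points far from dense regions and sitting in sparse cells score high.
    scores = {}
    for members in cells.values():
        density = len(members)
        for p in members:
            if dense_centroids:
                dist = min(math.dist(p, c) for c in dense_centroids)
            else:
                dist = 0.0  # no dense region found; distance term is undefined
            scores[p] = dist / density
    return scores


points = [(0.1, 0.1), (0.2, 0.2), (0.3, 0.1), (0.15, 0.25), (5.0, 5.0)]
scores = grid_outlier_scores(points, cell_size=1.0, dense_threshold=3)
top_outlier = max(scores, key=scores.get)
```

In this toy data set the four clustered points share one dense cell, while `(5.0, 5.0)` sits alone in a distant cell and therefore receives the highest score. Working on cells rather than on all point pairs is what makes a grid attractive for large data sets, since the distance computation runs against a small set of cell summaries.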



Author information

Authors and Affiliations

College of Educational Science, Nanjing Normal University, China
Wei Wang
Faculty of Computer Science, Dalhousie University, Halifax, Canada
Ji Zhang
Sobey School of Business, Saint Mary’s University, Halifax, Canada
Hai Wang


Editor information

Editors and Affiliations

Microelectronic Institute, Xidian University, 710071, Xi’an, China
Yue Hao
Department of Computer Science, Hong Kong Baptist University, Kowloon Tong, Hong Kong
Jiming Liu
School of Computer Science and Technology, Xidian University, Xi’an, China
Yuping Wang
Department of Computer Science, Hong Kong Baptist University, Hong Kong
Yiu-ming Cheung
School of Electrical and Electronic Engineering, University of Manchester, UK
Hujun Yin
Life Science Research Center, School of Electronic Engineering, Xidian University, 710071, Xi’an, Shaanxi, China
Licheng Jiao
Key Laboratory of Computer Networks and Information Security (Ministry of Education), Xidian University, 710071, Xi’an, China
Jianfeng Ma
National Laboratory of Antennas and Microwave Technology, Xidian University, 710071, Xi’an, Shaanxi, P.R. China
Yong-Chang Jiao


About this paper

Cite this paper

Wang, W., Zhang, J., Wang, H. (2005). Grid-ODF: Detecting Outliers Effectively and Efficiently in Large Multi-dimensional Databases. In: Hao, Y., et al. Computational Intelligence and Security. CIS 2005. Lecture Notes in Computer Science, vol 3801. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11596448_113


DOI: https://doi.org/10.1007/11596448_113
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-30818-8
Online ISBN: 978-3-540-31599-5
eBook Packages: Computer Science, Computer Science (R0)
