A Nonparametric Outlier Detection for Effectively Discovering Top-N Outliers from Engineering Data

Fan, Hongqin; Zaïane, Osmar R.; Foss, Andrew; Wu, Junfeng

doi:10.1007/11731139_66

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3918)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining

Abstract

We present a novel resolution-based outlier notion and a nonparametric outlier-mining algorithm that can efficiently identify the top-listed outliers in a wide variety of datasets. The algorithm generates reasonable outlier results by taking both local and global features of a dataset into consideration. Experiments are conducted using both synthetic datasets and a real-life construction equipment dataset from a large building contractor. Comparison with current outlier-mining algorithms indicates that the proposed algorithm is more effective.

Author information

Authors and Affiliations

Department of Civil Engineering, University of Alberta, Canada
Hongqin Fan
Department of Computing Science, University of Alberta, Canada
Osmar R. Zaïane, Andrew Foss & Junfeng Wu

Editor information

Editors and Affiliations

Nanyang Technological University, Singapore
Wee-Keong Ng
Institute of Industrial Science, The University of Tokyo, 4-6-1 Komaba, Meguro-ku, 153-8505, Tokyo, Japan
Masaru Kitsuregawa
School of Computer Science and Technology, Heilongjiang University, China
Jianzhong Li
School of Computer Engineering, Nanyang Technological University, 639798, Singapore, Singapore
Kuiyu Chang

About this paper

Cite this paper

Fan, H., Zaïane, O.R., Foss, A., Wu, J. (2006). A Nonparametric Outlier Detection for Effectively Discovering Top-N Outliers from Engineering Data. In: Ng, WK., Kitsuregawa, M., Li, J., Chang, K. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2006. Lecture Notes in Computer Science, vol 3918. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11731139_66

DOI: https://doi.org/10.1007/11731139_66
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33206-0
Online ISBN: 978-3-540-33207-7
eBook Packages: Computer Science, Computer Science (R0)
