Abstract
We present a novel resolution-based outlier notion and a nonparametric outlier-mining algorithm that can efficiently identify the top-listed outliers in a wide variety of datasets. The algorithm produces reasonable outlier results by taking both local and global features of a dataset into consideration. Experiments are conducted on synthetic datasets and on a real-life construction equipment dataset from a large building contractor. Comparison with current outlier-mining algorithms indicates that the proposed algorithm is more effective.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fan, H., Zaïane, O.R., Foss, A., Wu, J. (2006). A Nonparametric Outlier Detection for Effectively Discovering Top-N Outliers from Engineering Data. In: Ng, W.K., Kitsuregawa, M., Li, J., Chang, K. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2006. Lecture Notes in Computer Science, vol 3918. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11731139_66
DOI: https://doi.org/10.1007/11731139_66
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33206-0
Online ISBN: 978-3-540-33207-7
eBook Packages: Computer Science, Computer Science (R0)