FindOut: Finding Outliers in Very Large Datasets

  • Original Paper
Knowledge and Information Systems

Abstract.

Finding the rare instances or the outliers is important in many KDD (knowledge discovery and data mining) applications, such as detecting credit card fraud or finding irregularities in gene expressions. Signal-processing techniques have been introduced to transform images for enhancement, filtering, restoration, analysis, and reconstruction. In this paper, we present a new method in which we apply signal-processing techniques to solve important problems in data mining. In particular, we introduce a novel deviation (or outlier) detection approach, termed FindOut, based on the wavelet transform. The main idea in FindOut is to remove the clusters from the original data and then identify the outliers. Although previous research showed that such techniques may not be effective because of the nature of the clustering, FindOut can successfully identify outliers from large datasets. Experimental results on very large datasets are presented which show the efficiency and effectiveness of the proposed approach.
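
The idea of removing clusters and keeping what is left can be illustrated with a small sketch. This is not the FindOut algorithm itself, whose details are not given in the abstract. It is a simplified stand-in that assumes 2-D points in the unit square, quantizes them onto a grid, smooths the cell counts with a 2x2 block average (the one-level Haar approximation up to a constant scale factor), treats densely populated smoothed cells as clusters, and reports the remaining points. The function name `find_outliers_sketch` and the parameters `cells` and `density_threshold` are hypothetical and chosen only for illustration.

```python
def find_outliers_sketch(points, cells=16, density_threshold=2.0):
    """Illustrative sketch only (not the published FindOut method).

    points: list of (x, y) tuples with 0 <= x, y <= 1 (an assumption).
    cells: grid resolution per axis; must be even for the 2x2 averaging.
    density_threshold: smoothed count at or above which a block is
        treated as part of a cluster (hypothetical parameter).
    """
    if cells % 2 != 0:
        raise ValueError("cells must be even")

    def cell_of(p):
        # Map a point to its grid cell, clamping x == 1.0 or y == 1.0.
        return (min(int(p[0] * cells), cells - 1),
                min(int(p[1] * cells), cells - 1))

    # Step 1: quantize the feature space and count points per cell.
    counts = [[0] * cells for _ in range(cells)]
    for p in points:
        i, j = cell_of(p)
        counts[i][j] += 1

    # Step 2: low-pass step. Averaging each 2x2 block equals the
    # one-level 2-D Haar approximation up to a constant factor.
    half = cells // 2
    approx = [
        [(counts[2 * i][2 * j] + counts[2 * i + 1][2 * j]
          + counts[2 * i][2 * j + 1] + counts[2 * i + 1][2 * j + 1]) / 4.0
         for j in range(half)]
        for i in range(half)
    ]

    # Step 3: drop points in dense (cluster) blocks; keep the rest.
    outliers = []
    for p in points:
        i, j = cell_of(p)
        if approx[i // 2][j // 2] < density_threshold:
            outliers.append(p)
    return outliers


# Deterministic demo data: one tight cluster plus two isolated points.
cluster = [(0.27 + 0.005 * a, 0.27 + 0.005 * b)
           for a in range(10) for b in range(10)]
isolated = [(0.9, 0.9), (0.1, 0.8)]
demo_outliers = find_outliers_sketch(cluster + isolated)
print(demo_outliers)
```

In this sketch the grid resolution and the threshold jointly decide what counts as a cluster, so both would need tuning on real data.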

Additional information

Received 7 September 2000 / Revised 2 February 2001 / Accepted in revised form 31 May 2001

Correspondence and offprint requests to: A. Zhang, Department of Computer Science and Engineering, State University of New York at Buffalo, Buffalo, NY 14260, USA. Email: azhang@cse.buffalo.edu

About this article

Cite this article

Yu, D., Sheikholeslami, G. & Zhang, A. FindOut: Finding Outliers in Very Large Datasets. Knowl Inform Sys 4, 387–412 (2002). https://doi.org/10.1007/s101150200013

  • DOI: https://doi.org/10.1007/s101150200013