Approach based on wavelet analysis for detecting and amending anomalies in dataset

  • Peng Xiao-qi 
  • Song Yan-po
  • Tang Ying 
  • Zhang Jian-zhi 
Article

Abstract

Anomalies in which the matching relationship among some data attributes differs markedly from that of the other samples in a dataset are difficult to detect. To address this problem, an approach based on wavelet analysis for detecting and amending anomalous samples is proposed. By exploiting the multi-resolution and local-analysis properties of wavelet analysis, the approach detects and amends anomalous samples effectively. To compute the wavelet transform of a discrete sequence rapidly, a modified algorithm based on the Newton-Cotes formula is also proposed. Experimental results show that the approach is feasible, effective, and practical.
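The abstract gives no algorithmic detail, so the following is only a minimal illustrative sketch of the general idea, not the authors' modified algorithm. It assumes a Mexican hat mother wavelet, approximates the wavelet transform of a discrete sequence with composite Simpson's rule (one member of the Newton-Cotes family), flags samples whose coefficient deviates from the mean by more than three standard deviations, and amends them by averaging the nearest unflagged neighbours. All function names, the wavelet choice, the edge handling, and the threshold are assumptions made for illustration.

```python
import math


def mexican_hat(t):
    """Mexican hat mother wavelet, unnormalized (an assumed choice)."""
    return (1.0 - t * t) * math.exp(-0.5 * t * t)


def simpson(values, h):
    """Composite Simpson's rule, a Newton-Cotes formula, for equally spaced samples."""
    n = len(values)
    if n < 3 or n % 2 == 0:
        raise ValueError("need an odd number of samples, at least 3")
    total = values[0] + values[-1]
    total += 4.0 * sum(values[1:-1:2])  # odd interior indices
    total += 2.0 * sum(values[2:-1:2])  # even interior indices
    return total * h / 3.0


def wavelet_coeffs(x, scale=1.0):
    """Approximate W(scale, b) for every position b of the sequence x."""
    k = int(math.ceil(4 * scale))  # window half-width: wavelet is ~0 beyond 4 scales
    n = len(x)
    coeffs = []
    for b in range(n):
        window = []
        for j in range(-k, k + 1):
            idx = min(max(b + j, 0), n - 1)  # repeat the edge value (assumption)
            window.append(x[idx] * mexican_hat(j / scale))
        coeffs.append(simpson(window, 1.0) / math.sqrt(scale))
    return coeffs


def detect_anomalies(x, scale=1.0, n_sigma=3.0):
    """Indices whose coefficient lies more than n_sigma std devs from the mean."""
    c = wavelet_coeffs(x, scale)
    mean = sum(c) / len(c)
    std = math.sqrt(sum((v - mean) ** 2 for v in c) / len(c))
    if std == 0.0:
        return []
    return [i for i, v in enumerate(c) if abs(v - mean) > n_sigma * std]


def amend(x, flagged):
    """Replace each flagged sample by the mean of its nearest unflagged neighbours."""
    bad = set(flagged)
    fixed = list(x)
    n = len(x)
    for i in flagged:
        left, right = i - 1, i + 1
        while left >= 0 and left in bad:
            left -= 1
        while right < n and right in bad:
            right += 1
        neighbours = [x[p] for p in (left, right) if 0 <= p < n]
        if neighbours:
            fixed[i] = sum(neighbours) / len(neighbours)
    return fixed


# Usage: a flat sequence with one injected spike (synthetic data, for illustration only).
signal = [5.0] * 41
signal[20] = 15.0
flagged = detect_anomalies(signal)
amended = amend(signal, flagged)
print(flagged, amended[20])
```

Because Simpson's weights alternate between 4 and 2, the response to an isolated spike depends on where it falls in the integration window; a practical implementation would need to account for this, which may be one motivation for modifying the basic Newton-Cotes rule.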

Key words

data preprocessing; wavelet analysis; anomaly detecting; data mining

CLC number

TP39 

Copyright information

© Science Press 2001

Authors and Affiliations

  • Peng Xiao-qi (1, 2)
  • Song Yan-po (1, 2), corresponding author
  • Tang Ying (3)
  • Zhang Jian-zhi (1)
  1. School of Energy Science and Engineering, Central South University, Changsha, China
  2. School of Information Science and Engineering, Central South University, Changsha, China
  3. School of Physics Science and Technology, Central South University, Changsha, China
