Parameter Rating by Diffusion Gradient

Modeling, Simulation and Optimization for Science and Technology

Part of the book series: Computational Methods in Applied Sciences (COMPUTMETHODS, volume 34)

Abstract

Anomaly detection is a central task in high-dimensional data analysis. It can be performed by using dimensionality reduction methods to obtain a low-dimensional representation of the data, which reveals the geometry and the patterns that govern it. Usually, anomaly detection methods classify the high-dimensional vectors that represent data points as either normal or abnormal. Revealing the parameters (i.e., features) that cause the detected abnormal behaviors is critical in many applications. However, this problem is not addressed by recent anomaly detection methods and, specifically, by nonparametric methods, which are based on feature-free analysis of the data. In this chapter, we provide an algorithm that rates (i.e., ranks) the parameters that cause an abnormal behavior to occur. We assume that the anomalies have already been detected by other anomaly detection methods, and we treat them in this chapter as prior knowledge. Our algorithm is based on the underlying potential of the diffusion process that is used in Diffusion Maps (DM) for dimensionality reduction. We show that the gradient of this potential indicates the direction from an anomalous data point to a cluster that represents normal behavior. We use this direction to rate the parameters that cause the abnormal behavior to occur. The algorithm was applied successfully to rate the measured parameters in process control and networking applications.
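
The idea of rating parameters by a gradient direction can be illustrated with a minimal sketch. This is not the chapter's algorithm: it assumes a plain Gaussian kernel density q(x) = sum_i exp(-||x - x_i||^2 / eps) over the normal points as a stand-in for the diffusion potential (with U = -log q, the gradient of q is parallel to the negative gradient of U), and it rates each parameter by the magnitude of the corresponding gradient component at the anomalous point. The function names, the toy data, and the value of eps are hypothetical choices made only for this illustration.

```python
import math


def density_gradient(x, normal_points, eps):
    """Gradient at x of q(x) = sum_i exp(-||x - x_i||^2 / eps).

    Assumption: q is only a stand-in for the diffusion potential.
    The gradient points from x toward regions dense in normal points.
    """
    grad = [0.0] * len(x)
    for p in normal_points:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, p))
        weight = math.exp(-sq_dist / eps)
        for j, (a, b) in enumerate(zip(x, p)):
            grad[j] += (2.0 / eps) * (b - a) * weight
    return grad


def rate_parameters(x, normal_points, eps):
    """Return parameter indices sorted by |gradient component|, largest first."""
    grad = density_gradient(x, normal_points, eps)
    order = sorted(range(len(x)), key=lambda j: abs(grad[j]), reverse=True)
    return order, grad


# Toy data (hypothetical): a small normal cluster around the origin and
# one anomalous point that deviates only in the third parameter (index 2).
normal_points = [
    [0.1 * i, 0.1 * j, 0.1 * k]
    for i in (-1, 0, 1)
    for j in (-1, 0, 1)
    for k in (-1, 0, 1)
]
anomaly = [0.0, 0.0, 2.0]

order, grad = rate_parameters(anomaly, normal_points, eps=4.0)
print("parameter rating (most responsible first):", order)
print("gradient at the anomaly:", grad)
```

In this toy setting the direction toward the normal cluster lies along the third coordinate, so that parameter receives the highest rating. In practice the parameters would need to be brought to comparable scales before their gradient components are compared, and eps would have to be chosen to match the data.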

Acknowledgments

The author thanks Ido Weinberg, Avihai Ankri, Shmulik Cohen and Avi Aboody from Applied Materials, Israel, for their constant help and support. This research was partially supported by the Israel Science Foundation (Grant No. 1041/10), the Ministry of Science & Technology (Grant No. 3-9096) and the BSF (Grant No. 201182). The first author was also supported by the Eshkol Fellowship from the Israeli Ministry of Science & Technology and by a graduate fellowship from the University of Jyväskylä.

Author information

Corresponding author

Correspondence to Amir Averbuch.

Copyright information

© 2014 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Wolf, G., Averbuch, A., Neittaanmäki, P. (2014). Parameter Rating by Diffusion Gradient. In: Fitzgibbon, W., Kuznetsov, Y., Neittaanmäki, P., Pironneau, O. (eds) Modeling, Simulation and Optimization for Science and Technology. Computational Methods in Applied Sciences, vol 34. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-9054-3_13

  • DOI: https://doi.org/10.1007/978-94-017-9054-3_13

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-94-017-9053-6

  • Online ISBN: 978-94-017-9054-3

  • eBook Packages: Engineering, Engineering (R0)
