Parameter Rating by Diffusion Gradient

Part of the Computational Methods in Applied Sciences book series (COMPUTMETHODS, volume 34)


Anomaly detection is a central task in high-dimensional data analysis. It can be performed by using dimensionality reduction methods to obtain a low-dimensional representation of the data, which reveals the geometry and the patterns that govern it. Usually, anomaly detection methods classify high-dimensional vectors that represent data points as either normal or abnormal. In many applications it is critical to also reveal the parameters (i.e., features) that cause a detected abnormal behavior. However, this problem is not addressed by recent anomaly-detection methods and, specifically, not by nonparametric methods, which are based on feature-free analysis of the data. In this chapter, we provide an algorithm that rates (i.e., ranks) the parameters that cause an abnormal behavior to occur. We assume that the anomalies have already been detected by other anomaly-detection methods, and we treat them in this chapter as prior knowledge. Our algorithm is based on the underlying potential of the diffusion process that is used in Diffusion Maps (DM) for dimensionality reduction. We show that the gradient of this potential indicates the direction from an anomalous data point to a cluster that represents normal behavior. We use this direction to rate the parameters that cause the abnormal behavior. The algorithm was applied successfully to rate the measured parameters in process-control and networking applications.
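The idea of the gradient-based rating can be sketched in a few lines. The snippet below is a minimal illustration, not the chapter's algorithm: it substitutes a Gaussian kernel-density estimate for the diffusion process and uses the potential U(x) = -log p(x) as a stand-in for the underlying diffusion potential. The function name `rate_parameters`, the bandwidth `eps`, and the use of a plain Gaussian kernel are all assumptions made for the sketch; the gradient of U at a detected anomaly points away from the dense (normal) cluster, so its components are used to rank the features that drive the abnormality.

```python
import numpy as np

def rate_parameters(X, x_anom, eps=1.0):
    """Rank the features of a detected anomaly by the gradient of a
    kernel-density potential U(x) = -log p(x) (a stand-in for the
    diffusion potential described in the chapter).

    X      : (n, d) array of data points representing normal behavior
    x_anom : (d,) anomalous point, assumed detected beforehand
    eps    : Gaussian kernel bandwidth (illustrative choice)
    """
    diffs = X - x_anom                                     # vectors from the anomaly toward the data
    w = np.exp(-np.sum(diffs**2, axis=1) / (2 * eps**2))   # kernel weights
    p = w.sum()
    # Gradient of log p at x_anom; it points from the anomaly toward the
    # dense cluster (i.e., minus the gradient of the potential U).
    grad = (w[:, None] * diffs).sum(axis=0) / (eps**2 * p)
    # Features whose components dominate this direction are rated highest.
    ranking = np.argsort(-np.abs(grad))
    return ranking, grad
```

For a cluster near the origin and an anomaly displaced along the first coordinate, the first coordinate dominates the gradient and is rated as the cause of the abnormal behavior.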


Keywords: Parameter rating · Diffusion maps · Feature ranking · Feature selection · Underlying potential · Abnormal behavior



The authors thank Ido Weinberg, Avihai Ankri, Shmulik Cohen and Avi Aboody from Applied Materials, Israel, for their constant help and support. This research was partially supported by the Israel Science Foundation (Grant No. 1041/10), the Ministry of Science & Technology (Grant No. 3-9096) and the BSF (Grant No. 201182). The first author was also supported by the Eshkol Fellowship from the Israeli Ministry of Science & Technology and by a graduate fellowship from the University of Jyväskylä.



Copyright information

© Springer Science+Business Media Dordrecht 2014

Authors and Affiliations

  1. Tel Aviv University, Tel Aviv, Israel
  2. Department of Mathematical Information Technology, University of Jyväskylä, Jyväskylä, Finland
