Metric Anomaly Detection via Asymmetric Risk Minimization

  • Aryeh Kontorovich
  • Danny Hendler
  • Eitan Menahem
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7005)


We propose what appears to be the first anomaly detection framework that learns from positive examples only and is sensitive to substantial differences in the presentation and penalization of normal vs. anomalous points. Our framework introduces a novel type of asymmetry between how false alarms (misclassifications of a normal instance as an anomaly) and missed anomalies (misclassifications of an anomaly as normal) are penalized: whereas each false alarm incurs a unit cost, our model assumes that a high global cost is incurred if one or more anomalies are missed.
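As an illustration of the asymmetry described above, the following minimal sketch computes a cost in which every false alarm adds one unit while any number of missed anomalies adds a single global penalty. The function name `asymmetric_cost` and the parameter `miss_cost` are hypothetical and chosen here for illustration only; the abstract does not give the paper's formal risk definitions, which may differ.

```python
from typing import Sequence


def asymmetric_cost(y_true: Sequence[bool],
                    y_pred: Sequence[bool],
                    miss_cost: float) -> float:
    """Illustrative asymmetric cost (an assumption, not the paper's definition).

    True means "anomaly", False means "normal".
    Each false alarm costs 1; missing one or more anomalies costs
    `miss_cost` once, regardless of how many were missed.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    # Normal points flagged as anomalies: unit cost each.
    false_alarms = sum(1 for truth, pred in pairs if not truth and pred)
    # Anomalies labeled as normal: a single global penalty if any occur.
    missed_any = any(truth and not pred for truth, pred in pairs)
    return false_alarms + (miss_cost if missed_any else 0.0)
```

Under this sketch, missing two anomalies costs the same as missing one, whereas two false alarms cost twice as much as one.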

We define a few natural notions of risk along with efficient minimization algorithms. Our framework is applicable to any metric space with a finite doubling dimension. We make minimalistic assumptions that naturally generalize notions such as margin in Euclidean spaces. We provide a theoretical analysis of the risk and show that under mild conditions, our classifier is asymptotically consistent. The learning algorithms we propose are computationally and statistically efficient and admit a further tradeoff between running time and precision. Some experimental results on real-world data are provided.
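To make the metric-space setting concrete, here is a generic distance-threshold detector that needs only a sample of normal points and a distance function. It is a sketch under stated assumptions, not the algorithm from the paper: the name `is_anomaly`, the `threshold` parameter, and the brute-force nearest-neighbor scan are all introduced here for illustration.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def is_anomaly(x: T,
               normal_sample: Sequence[T],
               dist: Callable[[T, T], float],
               threshold: float) -> bool:
    """Flag x as anomalous if it lies farther than `threshold` from every
    normal training point (hypothetical rule; only a metric is required)."""
    if not normal_sample:
        raise ValueError("normal_sample must contain at least one point")
    # Brute-force scan over the sample: O(n) distance evaluations per query.
    nearest = min(dist(x, s) for s in normal_sample)
    return nearest > threshold


# Usage on the real line with the absolute-difference metric.
sample = [0.0, 1.0, 2.0]
metric = lambda a, b: abs(a - b)
near_point_flagged = is_anomaly(1.4, sample, metric, threshold=0.5)
far_point_flagged = is_anomaly(5.0, sample, metric, threshold=0.5)
```

Because the rule touches the data only through `dist`, the same code applies to strings under edit distance or to any other metric, which is the sense in which such a detector is not tied to Euclidean space.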


Keywords: False Alarm; False Alarm Rate; Separation Distance; Anomaly Detection; Doubling Dimension





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Aryeh Kontorovich (1, 2)
  • Danny Hendler (1, 2)
  • Eitan Menahem (1, 3)

  1. Deutsche Telekom Laboratories, Ben-Gurion University of the Negev, Israel
  2. Department of Computer Science, Ben-Gurion University of the Negev, Israel
  3. Department of Information Systems Engineering, Ben-Gurion University of the Negev, Israel
