Anomaly Ranking in a High Dimensional Space: The Unsupervised TreeRank Algorithm

  • S. Clémençon
  • N. Baskiotis
  • N. Vayatis


Ranking unsupervised data in a multivariate feature space \(\mathcal{X} \subset \mathbb{R}^{d}\), d ≥ 1, by degree of abnormality is of crucial importance in many applications (e.g., fraud surveillance, monitoring of complex systems/infrastructures such as energy networks or aircraft engines, system management in data centers). However, the learning aspect of unsupervised ranking has only recently received attention in the machine-learning community. The Mass-Volume (MV) curve has been introduced in order to evaluate the performance of any scoring function \(s: \mathcal{X} \rightarrow \mathbb{R}\) with regard to its ability to rank unlabeled data. It is expected that relevant scoring functions will induce a preorder similar to that induced by the density function f(x) of the (supposedly continuous) probability distribution of the statistical population under study. As far as we know, there is no efficient algorithm to build a scoring function from (unlabeled) training data with a nearly optimal MV curve when the dimension d of the feature space is high. The major purpose of this chapter is to introduce such an algorithm, which we call the Unsupervised TreeRank algorithm. Beyond its description and the statistical analysis of its performance, numerical experiments are presented to provide empirical evidence of its accuracy.
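To make the evaluation criterion concrete, the MV curve of a scoring function s plots, for each threshold t, the Lebesgue measure (volume) of the level set {x : s(x) ≥ t} against the probability mass of that set. The sketch below is a minimal, illustrative Monte Carlo estimate of both quantities, assuming a toy 2-D Gaussian population and a hypothetical scoring function (the unnormalized Gaussian density); it is not the chapter's algorithm, only a way to see what the MV curve measures.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-D "normal behavior" sample: standard Gaussian data.
X = rng.normal(size=(5000, 2))

# Hypothetical scoring function: unnormalized Gaussian density,
# so high score = typical point, low score = anomalous point.
def score(x):
    return np.exp(-0.5 * np.sum(x ** 2, axis=1))

# Uniform Monte Carlo points over a box enclosing the data,
# used to estimate the Lebesgue measure of level sets.
lo, hi = X.min(axis=0), X.max(axis=0)
box_volume = np.prod(hi - lo)
U = rng.uniform(lo, hi, size=(20000, 2))

s_data, s_mc = score(X), score(U)

# Sweep thresholds over the empirical score quantiles; for each t:
#   mass(t)   ~ fraction of data with s(X) >= t
#   volume(t) ~ Lebesgue measure of {x : s(x) >= t}
thresholds = np.quantile(s_data, np.linspace(0.01, 0.99, 50))
mass = np.array([(s_data >= t).mean() for t in thresholds])
volume = np.array([(s_mc >= t).mean() * box_volume for t in thresholds])
```

A scoring function whose level sets match those of the true density yields the pointwise smallest volume for each mass, which is why the MV curve serves as a performance measure for anomaly ranking.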



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. Institut Mines Telecom, LTCI UMR No. 5141, Telecom ParisTech & CNRS, Paris, France
  2. Université Paris, Paris, France
  3. CMLA, ENS Cachan, Cachan, France
