New Generation Computing, Volume 35, Issue 1, pp 69–86

Joint Analysis of Multiple Algorithms and Performance Measures

  • Cassio P. de Campos
  • Alessio Benavoli
Special Feature


There has been increasing interest in developing new methods that use Pareto optimality to deal with multi-objective criteria (for example, accuracy and time complexity). Once one has developed an approach to a problem of interest, the question becomes how to compare it with the state of the art. In machine learning, algorithms are typically evaluated by comparing their performance on different data sets by means of statistical tests. The standard tests used for this purpose can jointly consider neither multiple performance measures nor multiple competitors at once. The aim of this paper is to resolve these issues by developing statistical procedures that account for multiple competing measures at the same time and compare multiple algorithms altogether. In particular, we develop two tests: a frequentist procedure based on the generalized likelihood ratio test and a Bayesian procedure based on a multinomial-Dirichlet conjugate model. We further extend both by discovering conditional independences among measures, which reduces the number of parameters of the models, since the number of cases studied in such comparisons is usually small. Data from a comparison among general-purpose classifiers are used to show a practical application of our tests.
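The Bayesian procedure mentioned above can be illustrated with a minimal sketch. Note the details here are assumptions for illustration, not the paper's exact formulation: the outcome encoding, the counts, and the prior strength `s` are all hypothetical. The idea is that each data set is classified into one of the joint outcomes of comparing two algorithms on two measures, the multinomial counts are combined with a symmetric Dirichlet prior (the conjugate model), and the posterior is sampled to estimate the probability of a dominance statement.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-data-set results: each of 20 data sets falls into one
# of four joint outcomes when comparing algorithms A and B on two
# measures (e.g. accuracy and running time):
#   index 0: A better on both measures
#   index 1: A better only on measure 1
#   index 2: A better only on measure 2
#   index 3: B better on both measures
counts = np.array([11, 4, 3, 2])  # illustrative counts, not real data

# Multinomial-Dirichlet conjugate model: with a symmetric Dirichlet
# prior of total strength s spread over the 4 outcomes, the posterior
# over the outcome probabilities is again Dirichlet.
s = 1.0
posterior_params = counts + s / len(counts)

# Monte Carlo estimate of the dominance statement
# P(theta_0 > theta_3 | data): the posterior probability that
# "A better on both measures" is more probable than
# "B better on both measures".
theta = rng.dirichlet(posterior_params, size=100_000)
p_dominance = (theta[:, 0] > theta[:, 3]).mean()
print(f"posterior probability of dominance: {p_dominance:.3f}")
```

With counts this lopsided, the posterior probability of dominance is close to 1; in a real comparison one would report it alongside the frequentist test's p-value.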


Keywords: Bayesian network; Conditional independence; Dominance statement; Generalized likelihood ratio test; Null hypothesis significance test



Copyright information

© Ohmsha, Ltd. and Springer Japan 2016

Authors and Affiliations

  1. Queen’s University, Belfast, UK
  2. IDSIA, Manno, Switzerland
