Expectation Propagation for Rating Players in Sports Competitions

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4702)


Rating players in sports competitions based on game results is one example of paired comparison data analysis. Since an exact Bayesian treatment is intractable, several techniques for approximate inference have been proposed in the literature. In this paper we compare variants of expectation propagation (EP). EP generalizes assumed density filtering (ADF) by iteratively improving the approximations that are made in the filtering step of ADF. Furthermore, we distinguish between two variants of EP: EP-Correlated, which takes into account the correlations between the strengths of the players, and EP-Independent, which ignores those correlations. We evaluate the different approaches on a large tennis dataset and find that EP does significantly better than ADF (iterative improvement indeed helps) and that EP-Correlated does significantly better than EP-Independent (correlations do matter).
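To make the filtering step concrete, the following is a minimal sketch of a single ADF pass with independent Gaussian beliefs over player strengths, scored with a Brier score on the predictions made before each update. It is not the paper's implementation. The probit win model, the unit performance variance `beta2`, the standard normal prior, and all function names are assumptions chosen for illustration; the abstract does not specify the likelihood or the exact form of the Brier score.

```python
import math


def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def win_probability(mu_a, var_a, mu_b, var_b, beta2=1.0):
    """Predictive P(a beats b) under the assumed probit model."""
    return norm_cdf((mu_a - mu_b) / math.sqrt(var_a + var_b + beta2))


def adf_update(winner, loser, beta2=1.0):
    """Moment-match two independent Gaussian beliefs after one game.

    Each belief is a (mean, variance) pair. Correlations between the
    two players are discarded, as in an independent approximation.
    """
    (mu_w, var_w), (mu_l, var_l) = winner, loser
    c2 = var_w + var_l + beta2
    c = math.sqrt(c2)
    t = (mu_w - mu_l) / c
    v = norm_pdf(t) / norm_cdf(t)   # shift of the mean, in units of c
    w = v * (v + t)                 # relative shrinkage of the variance
    new_winner = (mu_w + var_w / c * v, var_w * (1.0 - var_w / c2 * w))
    new_loser = (mu_l - var_l / c * v, var_l * (1.0 - var_l / c2 * w))
    return new_winner, new_loser


def adf_pass(games, n_players, prior=(0.0, 1.0)):
    """Process (winner, loser) index pairs once, in order.

    Returns the final beliefs and the mean squared error of the
    predicted win probabilities (one common form of the Brier score).
    """
    beliefs = [prior] * n_players
    squared_errors = []
    for w, l in games:
        p = win_probability(*beliefs[w], *beliefs[l])
        squared_errors.append((1.0 - p) ** 2)  # observed outcome is 1
        beliefs[w], beliefs[l] = adf_update(beliefs[w], beliefs[l])
    return beliefs, sum(squared_errors) / len(squared_errors)


if __name__ == "__main__":
    games = [(0, 1), (0, 1), (1, 2)]  # hypothetical results
    beliefs, brier = adf_pass(games, n_players=3)
    print(beliefs, brier)
```

ADF visits each game once, so the first prediction between two players with identical priors is 0.5 by symmetry. An EP variant would instead keep one approximate term per game and revisit the games repeatedly, removing a game's old term before recomputing it; a correlated variant would additionally track a full covariance matrix over player strengths rather than one variance per player.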


Keywords (added by machine, not by the authors): Posterior Distribution; Expectation Propagation; Brier Score; Term Approximation; Iterative Improvement


Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  1. Institute for Computing and Information Sciences, Radboud University Nijmegen, Toernooiveld 1, 6525 ED Nijmegen, The Netherlands