Whole-History Rating: A Bayesian Rating System for Players of Time-Varying Strength

  • Rémi Coulom
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5131)


Whole-History Rating (WHR) is a new method to estimate the time-varying strengths of players involved in paired comparisons. Like many variations of the Elo rating system, the whole-history approach is based on the dynamic Bradley-Terry model. But, instead of using incremental approximations, WHR directly computes the exact maximum a posteriori over the whole rating history of all players. This additional accuracy comes at a higher computational cost than traditional methods, but computation is still fast enough to be easily applied in real time to large-scale game servers (a new game is added in less than 0.001 second). Experiments demonstrate that, in comparison to Elo, Glicko, TrueSkill, and decayed-history algorithms, WHR produces better predictions.
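As a rough illustration of the maximum a posteriori idea, the sketch below evaluates the quantity such a method would maximise for one player: the Bradley-Terry log-likelihood of that player's game results plus a log-prior that penalises fast rating changes. The abstract does not spell out the prior, so the random-walk (Wiener-process) form used here is an assumption. The function names, the natural-log rating scale, the drift variance `w2`, and the toy data are likewise hypothetical choices made for this example only.

```python
import math

def win_probability(r_a, r_b):
    """Bradley-Terry probability that A beats B (ratings in natural-log units)."""
    return 1.0 / (1.0 + math.exp(r_b - r_a))

def log_posterior(ratings, days, games, w2):
    """Log posterior, up to an additive constant, of one player's rating history.

    ratings[i] is the player's rating on days[i].
    games is a list of (day_index, opponent_rating, won) tuples, with
    opponent ratings treated as fixed.
    w2 is the assumed variance of rating drift per day (random-walk prior).
    """
    total = 0.0
    # Likelihood term: one Bradley-Terry factor per game.
    for idx, opp, won in games:
        p = win_probability(ratings[idx], opp)
        total += math.log(p if won else 1.0 - p)
    # Prior term: Gaussian penalty on the change between consecutive days.
    for i in range(len(ratings) - 1):
        dt = days[i + 1] - days[i]
        diff = ratings[i + 1] - ratings[i]
        total -= 0.5 * diff * diff / (w2 * dt)
    return total

# Toy data: two losses on day 0, two wins on day 100, all against a 0-rated opponent.
days = [0, 100]
games = [(0, 0.0, False), (0, 0.0, False), (1, 0.0, True), (1, 0.0, True)]
flat = log_posterior([0.0, 0.0], days, games, w2=0.01)
rising = log_posterior([-0.5, 0.5], days, games, w2=0.01)
print(rising > flat)  # True: the rising history explains these results better
```

This only scores candidate rating histories. It does not search for the maximum, and it handles a single player with fixed opponents, whereas the method described in the abstract optimises over the rating histories of all players.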





Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Rémi Coulom
    Université Charles de Gaulle, INRIA SEQUEL, CNRS GRAPPA, Lille, France
