Whole-History Rating: A Bayesian Rating System for Players of Time-Varying Strength
Abstract
Whole-History Rating (WHR) is a new method for estimating the time-varying strengths of players involved in paired comparisons. Like many variations of the Elo rating system, the whole-history approach is based on the dynamic Bradley-Terry model. Instead of using incremental approximations, however, WHR directly computes the exact maximum a posteriori (MAP) estimate over the whole rating history of all players. This additional accuracy comes at a higher computational cost than traditional methods, but computation is still fast enough to be applied in real time to large-scale game servers (a new game is added in less than 0.001 second). Experiments demonstrate that WHR produces better predictions than Elo, Glicko, TrueSkill, and decayed-history algorithms.
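To make the quantity being maximized more concrete, the following minimal sketch evaluates a log-posterior for a single player's rating history, combining Bradley-Terry game likelihoods with a Wiener-process prior on rating changes. It is an illustration under stated assumptions, not the paper's implementation: the natural-log rating scale (strength = exp(rating)), the fixed opponent ratings, the function names, and the parameter `w2` (prior variance per unit time) are all hypothetical choices made for this example.

```python
import math

def win_probability(r_i, r_j):
    """Bradley-Terry probability that player i beats player j.

    Assumes strengths gamma = exp(r), so that
    P(i beats j) = gamma_i / (gamma_i + gamma_j).
    """
    return 1.0 / (1.0 + math.exp(r_j - r_i))

def log_wiener_prior(ratings, times, w2):
    """Log-density of successive rating changes under a Wiener process.

    Each change r[k] - r[k-1] is modeled as Gaussian with mean 0 and
    variance w2 * (t[k] - t[k-1]); w2 is an assumed parameter.
    """
    total = 0.0
    for k in range(1, len(ratings)):
        var = w2 * (times[k] - times[k - 1])
        diff = ratings[k] - ratings[k - 1]
        total += -0.5 * math.log(2.0 * math.pi * var) - diff * diff / (2.0 * var)
    return total

def log_posterior(ratings, times, games, w2):
    """Unnormalized log-posterior of one player's whole rating history.

    games is a list of (time_index, opponent_rating, won) tuples;
    opponent ratings are treated as fixed for simplicity.
    """
    total = log_wiener_prior(ratings, times, w2)
    for idx, opponent_rating, won in games:
        p = win_probability(ratings[idx], opponent_rating)
        total += math.log(p if won else 1.0 - p)
    return total

# Hypothetical data: two rating points one time unit apart, one win and one loss.
games = [(0, 0.0, True), (1, 0.0, False)]
flat = log_posterior([0.0, 0.0], [0.0, 1.0], games, w2=1.0)
jumpy = log_posterior([3.0, -3.0], [0.0, 1.0], games, w2=1.0)
```

In this toy setting the prior penalizes the large jump in the second history far more than the likelihood rewards it, so the flat history scores higher. A MAP estimate over the whole history would be obtained by maximizing such a function over all rating points of all players jointly, which this sketch does not attempt.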
Keywords
Wiener Process; Prediction Rate; Rating Algorithm; Rating Uncertainty; Incremental Algorithm