Abstract
This paper analyzes the procedure used by FIFA up until 2018 to rank national football teams and define by random draw the groups for the initial phase of the World Cup finals. A predictive model is calibrated to form a reference ranking to evaluate the performance of a series of simple changes to that procedure. These proposed modifications are guided by a qualitative and statistical analysis of the FIFA ranking. We then analyze the use of this ranking to determine the groups for the World Cup finals. After enumerating a series of deficiencies in the group assignments for the 2014 World Cup, a mixed integer linear programming model is developed and used to balance the difficulty levels of the groups.
Notes
Starting in 2026, the number of teams that qualify for the final stage is expected to increase to 48.
In our simulations, the team with the highest number of points by the end of the tournament wins the competition. We use the goal-difference and goals scored criteria in a hierarchical manner to resolve ties.
The confederation strength factor is based on the number of wins by all of the teams in a confederation in the last three World Cups. Before the 2014 World Cup, the values were 1 for CONMEBOL and UEFA, 0.88 for CONCACAF, 0.86 for the AFC and the CAF, and 0.85 for the OFC. For further details, see FIFA (2017).
The ranking is also used in setting up the World Cup qualifying tournaments and confederation cups in some confederations.
This is explained by the fact that, in our data from October 2005 to October 2013, only 1951 of the 8049 games (about 24%) ended in a tie. The estimated model will therefore assign a higher probability to one of the sides winning than to a draw.
Both criteria are estimators of the relative quality of a statistical model for given data, and are typically used for model selection: for further details, see, for example, Hastie et al. (2001).
Although some of the UEFA countries have great achievements in the history of football (such as Germany and Italy), the confederation comprises 55 teams in total, including some that consistently rank among the worst in the world (such as San Marino, Andorra, Malta, and Liechtenstein), which helps explain this result.
In this case, because a higher rating translates into a lower ranking, we set \(w_{r}= c \; (1-e^{-r})\), where r denotes the rating of a team and c is a normalizing constant.
Because of the dynamic nature of this score, we did not compute it for the cases of the Reference ranking, the modified FIFA ranking and the World Football Elo ranking.
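The tie-breaking rule described in the note on our simulations (points first, then goal difference, then goals scored) can be sketched as follows. This is only an illustrative sketch, not the code used in the paper: the function name `final_standings`, the input format, and the 3/1/0 points scheme for a win, draw, and loss are assumptions made for the example.

```python
def final_standings(results):
    """Order teams by points, then goal difference, then goals scored.

    `results` is an iterable of (home, away, home_goals, away_goals) tuples
    (hypothetical input format). Points are assumed to be 3/1/0.
    """
    table = {}
    for home, away, home_goals, away_goals in results:
        # Update both teams, each seen from its own side of the score.
        for team, scored, conceded in (
            (home, home_goals, away_goals),
            (away, away_goals, home_goals),
        ):
            row = table.setdefault(team, {"points": 0, "gd": 0, "gf": 0})
            if scored > conceded:
                row["points"] += 3
            elif scored == conceded:
                row["points"] += 1
            row["gd"] += scored - conceded
            row["gf"] += scored
    # Hierarchical criteria: a later key is only consulted when earlier keys tie.
    return sorted(
        table,
        key=lambda t: (table[t]["points"], table[t]["gd"], table[t]["gf"]),
        reverse=True,
    )
```

The first team in the returned list is the simulated winner. Teams that remain tied on all three criteria keep their order of first appearance; a real implementation would need a further rule for that case.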
References
Alarcón, F., Durán, G., Guajardo, M., Miranda, J., Muñoz, H., Ramírez, L., et al. (2017). Operations research transforms the scheduling of Chilean soccer leagues and South American World Cup qualifiers. Interfaces, 47(1), 52–69.
Anderson, S. P., de Palma, A., & Thisse, J.-F. (1992). Discrete choice theory of product differentiation. Cambridge, MA: MIT Press.
Bezanson, J., Edelman, A., Karpinski, S., & Shah, V. B. (2017). Julia: A fresh approach to numerical computing. SIAM Review, 59(1), 65–98.
Bradley, R. A., & Terry, M. E. (1952). Rank analysis of incomplete block designs: I. The method of paired comparisons. Biometrika, 39(3/4), 324–345.
Coleman, B. J. (2005). Minimizing game score violations in college football rankings. Interfaces, 35(6), 483–496.
Davidson, R. R. (1970). On extending the Bradley–Terry model to accommodate ties in paired comparison experiments. Journal of the American Statistical Association, 65(329), 317–328.
Dixon, M., & Robinson, M. (1998). A birth process model for association football matches. Journal of the Royal Statistical Society: Series D (The Statistician), 47(3), 523–538.
Downward, P., & Jones, M. (2007). Effects of crowd size on referee decisions: Analysis of the FA Cup. Journal of Sports Sciences, 25(14), 1541–1545.
Durán, G., Guajardo, M., & Sauré, D. (2017). Scheduling the South American Qualifiers to the 2018 FIFA World Cup by integer programming. European Journal of Operational Research, 262(3), 1109–1115.
Dyte, D., & Clarke, S. R. (2000). A ratings based Poisson model for World Cup soccer simulation. The Journal of the Operational Research Society, 51(8), 993–998.
FIFA. (2014a). 2014 FIFA World Cup Brazil technical report and statistics.
FIFA. (2014b). 2014 FIFA World Cup Brazil television audience report.
FIFA. (2017). FIFA ranking. http://www.fifa.com/fifa-world-ranking/. Accessed March 07, 2017.
FIFA. (2018). Revision of the FIFA/Coca-Cola World Ranking. https://img.fifa.com/image/upload/edbm045h0udbwkqew35a.pdf. Accessed April 24, 2019.
Gurobi Optimizer Reference Manual. http://www.gurobi.com. Accessed April 24, 2019.
Guyon, J. (2014). The World Cup draw is unfair. Here’s a better way. The New York Times. https://www.nytimes.com/2014/06/05/upshot/the-world-cup-draw-is-unfair-heres-a-better-way.html. Accessed Feb 05, 2019.
Guyon, J. (2015). Rethinking the FIFA World Cup final draw. Journal of Quantitative Analysis in Sports, 11(3), 169–182.
Hastie, T., Tibshirani, R., & Friedman, J. (2001). The elements of statistical learning. New York: Springer.
Huchette, J., & Lubin, M. (2017). JuMP: A modeling language for mathematical optimization. SIAM Review, 59(2), 295–320.
Karlis, D., & Ntzoufras, I. (2003). Analysis of sports data by using bivariate Poisson models. Journal of the Royal Statistical Society: Series D (The Statistician), 52(3), 381–393.
Karlis, D., & Ntzoufras, I. (2008). Bayesian modelling of football outcomes: using the Skellam’s distribution for the goal difference. IMA Journal of Management Mathematics, 20(2), 133–145.
Kendall, M. G. (1938). A new measure of rank correlation. Biometrika, 30(1/2), 81–93.
Lasek, J., Szlávik, Z., & Bhulai, S. (2013). The predictive power of ranking systems in association football. International Journal of Applied Pattern Recognition, 1(1), 27–46.
Lasek, J., Szlávik, Z., Gagolewski, M., & Bhulai, S. (2016). How to improve a team’s position in the FIFA ranking? A simulation study. Journal of Applied Statistics, 43(7), 1349–1368.
Lillestøl, J., & Andersson, J. (2011). The Z-Poisson distribution with application to the modelling of soccer score probabilities. Statistical Modelling, 11(6), 507–522.
Maher, M. J. (1982). Modelling association football scores. Statistica Neerlandica, 36(3), 109–118.
Martinich, J. (2002). College football rankings: Do the computers know best? Interfaces, 32(5), 85–94.
McHale, I., & Davies, S. (2007). Statistical analysis of the effectiveness of the FIFA world rankings. In Statistical thinking in sports (pp. 77–89). CRC Press.
Pollard, R. (1986). Home advantage in soccer: A retrospective analysis. Journal of Sports Sciences, 4(3), 237–248.
Rue, H., & Salvesen, O. (2000). Prediction and retrospective analysis of soccer matches in a league. Journal of the Royal Statistical Society: Series D (The Statistician), 49(3), 399–418.
Scarf, P. A., & Yusof, M. M. (2011). A numerical study of tournament structure and seeding policy for the soccer World Cup Finals. Statistica Neerlandica, 65(1), 43–57.
Suzuki, K. A., Salasar, B. L. E., Leite, G. J., & Louzada-Neto, F. (2010). A Bayesian approach for predicting match outcomes: The 2006 (Association) Football World Cup. Journal of the Operational Research Society, 61(10), 1530–1539.
Van Haaren, J., & Van den Broeck, G. (2015). Relational learning for football-related predictions. In Latest advances in inductive logic programming (pp. 237–244).
World Football Elo Ratings. http://www.eloratings.net. Accessed April 24, 2019.
Acknowledgements
We would like to sincerely thank two reviewers for their valuable suggestions that allowed us to considerably improve a preliminary version of this work. We also thank ISCI, Chile (CONICYT PIA FB0816) for its support. The second author was partially financed by ANPCyT PICT Grant 2015-2218 (Argentina) and UBACyT Grant 20020170100495BA (Argentina).
Appendices
Appendix A: Mixed integer linear programming model
Sets
G: groups.
C: confederations.
I: teams.
S: seeded teams (\(S \subset I\)).
\(J_c\): teams in confederation c (\(J_c \subset I, c \in C\)).
Parameters
\(R_{i}\): ranking of team i (\(i \in I\)).
\(L_c\): minimum number of teams of confederation c in each group (\(c \in C\)).
\(U_c\): maximum number of teams of confederation c in each group (\(c \in C\)).
SR: minimum accepted sum ranking difference.
\(\hat{R}\): upper bound on the ranking of any team.
Decision variables
\(w_{min}\): ranking sum of the group with the lowest ranking sum value.
\(w_{max}\): ranking sum of the group with the highest ranking sum value.
\(z_{min}\): range of the group with the lowest range.
\(z_{max}\): range of the group with the highest range.
Objective function
Constraints
Constraints (A.2) ensure that every team is assigned to exactly one group. Constraints (A.3) specify that each group contains exactly four teams, while constraints (A.4) require that exactly one of the four teams is a seed. Constraints (A.5) and (A.6) impose upper and lower bounds on the number of teams from a single confederation assigned to a single group (in the 2014 World Cup, only one team from each confederation was allowed in a group, except for the European confederation, for which the limit was two; in addition, at least one UEFA team per group is necessary). Constraints (A.7)–(A.8) define the minimum and maximum ranking sums over all groups, while constraint (A.9) imposes a bound on the difference between these two values. Constraints (A.10)–(A.14) compute the range of each group, and constraints (A.15)–(A.16) define the minimum and maximum ranges over all groups. Constraints (A.18) define the nature of the variables, and finally, the objective function (A.1) minimizes the difference between the maximum and minimum ranges.
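To make the constraint description above concrete, the following sketch checks a given group assignment against the conditions it lists and returns the objective value. It is not the MILP itself and does not call a solver; it only evaluates a candidate draw. The function name `evaluate_draw`, the argument names, and the data layout are hypothetical, and the "range" of a group is assumed here to be the difference between the highest and lowest ranking among its four teams.

```python
from collections import Counter


def evaluate_draw(groups, ranking, seeds, confed, lower, upper, max_sum_gap):
    """Return z_max - z_min for a feasible draw, or None if a constraint fails.

    groups:      list of lists of team names (one inner list per group)
    ranking:     dict team -> R_i
    seeds:       set of seeded teams S
    confed:      dict team -> confederation
    lower/upper: dict confederation -> L_c / U_c (per-group bounds)
    max_sum_gap: SR, the bound on w_max - w_min
    """
    # (A.2) every team appears in exactly one group.
    assigned = [team for group in groups for team in group]
    if sorted(assigned) != sorted(ranking):
        return None

    sums, ranges = [], []
    for group in groups:
        # (A.3) four teams per group and (A.4) exactly one seed.
        if len(group) != 4 or sum(team in seeds for team in group) != 1:
            return None
        # (A.5)-(A.6) confederation bounds; unlisted bounds default to 0 and 4.
        count = Counter(confed[team] for team in group)
        for c in set(lower) | set(upper):
            if not lower.get(c, 0) <= count[c] <= upper.get(c, 4):
                return None
        ranks = [ranking[team] for team in group]
        sums.append(sum(ranks))                 # feeds w_min and w_max
        ranges.append(max(ranks) - min(ranks))  # assumed definition of range

    # (A.9) bound on the difference between the extreme ranking sums.
    if max(sums) - min(sums) > max_sum_gap:
        return None
    # (A.1) objective: difference between the largest and smallest range.
    return max(ranges) - min(ranges)
```

A checker of this kind is useful mainly for validating a solver's output or for comparing an actual draw against an optimized one on the same terms.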
Appendix B: Estimates for the predictive model of Section 2
See Table 17.
Cite this article
Cea, S., Durán, G., Guajardo, M. et al. An analytics approach to the FIFA ranking procedure and the World Cup final draw. Ann Oper Res 286, 119–146 (2020). https://doi.org/10.1007/s10479-019-03261-8
DOI: https://doi.org/10.1007/s10479-019-03261-8