
An analytics approach to the FIFA ranking procedure and the World Cup final draw

Annals of Operations Research

Abstract

This paper analyzes the procedure used by FIFA up until 2018 to rank national football teams and define by random draw the groups for the initial phase of the World Cup finals. A predictive model is calibrated to form a reference ranking to evaluate the performance of a series of simple changes to that procedure. These proposed modifications are guided by a qualitative and statistical analysis of the FIFA ranking. We then analyze the use of this ranking to determine the groups for the World Cup finals. After enumerating a series of deficiencies in the group assignments for the 2014 World Cup, a mixed integer linear programming model is developed and used to balance the difficulty levels of the groups.


Notes

  1. Starting in 2026, the number of teams that qualify for the final stage is expected to increase to 48.

  2. In our simulations, the team with the highest number of points by the end of the tournament wins the competition. We use the goal-difference and goals scored criteria in a hierarchical manner to resolve ties.

  3. The confederation strength factor is based on the number of wins by all of the teams in a confederation in the last three World Cups. Before the 2014 World Cup, the values were 1 for CONMEBOL and UEFA, 0.88 for CONCACAF, 0.86 for the AFC and the CAF, and 0.85 for the OFC. For further details, see FIFA (2017).

  4. The ranking is also used in setting up the World Cup qualifying tournaments and confederation cups in some confederations.

  5. This is explained by the fact that, in our data from October 2005 to October 2013, 1951 of the 8049 games (about 24%) ended in a tie. The estimated model will therefore favor one side winning with higher probability.

  6. Both criteria are estimators of the relative quality of a statistical model for given data, and are typically used for model selection: for further details, see, for example, Hastie et al. (2001).

  7. Although some UEFA countries have great achievements in the history of football (such as Germany and Italy), the confederation as a whole comprises 55 teams, including some that consistently rank among the worst in the world (such as San Marino, Andorra, Malta, and Liechtenstein), which helps explain this result.

  8. In this case, because a higher rating translates into a lower ranking, we set \(w_{r}= c \; (1-e^{-r})\), where \(r\) denotes the rating of a team and \(c\) is a normalizing constant.

  9. Because of the dynamic nature of this score, we did not compute it for the cases of the Reference ranking, the modified FIFA ranking and the World Football Elo ranking.
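
The hierarchical tie-breaking rule described in note 2 can be sketched as a lexicographic sort on points, goal difference, and goals scored. This is a minimal illustration under stated assumptions, not the authors' simulation code: the `Standing` record, the `rank_standings` function, and the sample tallies are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Standing:
    """Final tally of one simulated team (hypothetical record type)."""
    team: str
    points: int
    goals_for: int
    goals_against: int

    @property
    def goal_difference(self) -> int:
        return self.goals_for - self.goals_against


def rank_standings(standings):
    """Sort by points, then goal difference, then goals scored, all descending."""
    return sorted(
        standings,
        key=lambda s: (s.points, s.goal_difference, s.goals_for),
        reverse=True,
    )


# Made-up tallies: A, B and C tie on points; B and C also tie on goal difference.
table = [
    Standing("C", 7, 4, 2),
    Standing("B", 7, 6, 4),
    Standing("D", 3, 1, 6),
    Standing("A", 7, 5, 2),
]
winner = rank_standings(table)[0].team
```

Comparing tuples applies each criterion only when all earlier ones are tied, which is what "hierarchical" means here. Ties that survive all three criteria keep their input order, so a real simulation would need a further rule for that case.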

References

  • Alarcón, F., Durán, G., Guajardo, M., Miranda, J., Muñoz, H., Ramírez, L., et al. (2017). Operations research transforms the scheduling of Chilean soccer leagues and South American World Cup qualifiers. Interfaces, 47(1), 52–69.


  • Anderson, S. P., de Palma, A., & Thisse, J.-F. (1992). Discrete choice theory of product differentiation. Cambridge, MA: MIT Press.


  • Bezanson, J., Edelman, A., Karpinski, S., & Shah, V. B. (2017). Julia: A fresh approach to numerical computing. SIAM Review, 59(1), 65–98.


  • Bradley, R. A., & Terry, M. E. (1952). Rank analysis of incomplete block designs: I. The method of paired comparisons. Biometrika, 39(3/4), 324–345.


  • Coleman, B. J. (2005). Minimizing game score violations in college football rankings. Interfaces, 35(6), 483–496.


  • Davidson, R. R. (1970). On extending the Bradley–Terry model to accommodate ties in paired comparison experiments. Journal of the American Statistical Association, 65(329), 317–328.


  • Dixon, M., & Robinson, M. (1998). A birth process model for association football matches. Journal of the Royal Statistical Society: Series D (The Statistician), 47(3), 523–538.


  • Downward, P., & Jones, M. (2007). Effects of crowd size on referee decisions: Analysis of the FA Cup. Journal of Sports Sciences, 25(14), 1541–1545.


  • Durán, G., Guajardo, M., & Sauré, D. (2017). Scheduling the South American Qualifiers to the 2018 FIFA World Cup by integer programming. European Journal of Operational Research, 262(3), 1109–1115.


  • Dyte, D., & Clarke, S. R. (2000). A ratings based Poisson model for World Cup soccer simulation. The Journal of the Operational Research Society, 51(8), 993–998.


  • FIFA. (2014a). 2014 FIFA World Cup Brazil technical report and statistics.

  • FIFA. (2014b). 2014 FIFA World Cup Brazil television audience report.

  • FIFA. (2017). FIFA ranking. http://www.fifa.com/fifa-world-ranking/. Accessed March 07, 2017.

  • FIFA. (2018). Revision of the FIFA/Coca-Cola World Ranking. https://img.fifa.com/image/upload/edbm045h0udbwkqew35a.pdf. Accessed April 24, 2019.

  • Gurobi Optimizer Reference Manual. http://www.gurobi.com. Accessed April 24, 2019.

  • Guyon, J. (2014). The World Cup draw is unfair. Here’s a better way. The New York Times. https://www.nytimes.com/2014/06/05/upshot/the-world-cup-draw-is-unfair-heres-a-better-way.html. Accessed Feb 05, 2019.

  • Guyon, J. (2015). Rethinking the FIFA World Cup final draw. Journal of Quantitative Analysis in Sports, 11(3), 169–182.


  • Hastie, T., Tibshirani, R., & Friedman, J. (2001). The elements of statistical learning. New York: Springer.


  • Dunning, I., Huchette, J., & Lubin, M. (2017). JuMP: A modeling language for mathematical optimization. SIAM Review, 59(2), 295–320.


  • Karlis, D., & Ntzoufras, I. (2003). Analysis of sports data by using bivariate Poisson models. Journal of the Royal Statistical Society: Series D (The Statistician), 52(3), 381–393.


  • Karlis, D., & Ntzoufras, I. (2008). Bayesian modelling of football outcomes: using the Skellam’s distribution for the goal difference. IMA Journal of Management Mathematics, 20(2), 133–145.


  • Kendall, M. G. (1938). A new measure of rank correlation. Biometrika, 30(1/2), 81–93.


  • Lasek, J., Szlávik, Z., & Bhulai, S. (2013). The predictive power of ranking systems in association football. International Journal of Applied Pattern Recognition, 1(1), 27–46.


  • Lasek, J., Szlávik, Z., Gagolewski, M., & Bhulai, S. (2016). How to improve a team’s position in the FIFA ranking? A simulation study. Journal of Applied Statistics, 43(7), 1349–1368.


  • Lillestøl, J., & Andersson, J. (2011). The Z-Poisson distribution with application to the modelling of soccer score probabilities. Statistical Modelling, 11(6), 507–522.


  • Maher, M. J. (1982). Modelling association football scores. Statistica Neerlandica, 36(3), 109–118.


  • Martinich, J. (2002). College football rankings: Do the computers know best? Interfaces, 32(5), 85–94.


  • McHale, I., & Davies, S. (2007). Statistical analysis of the effectiveness of the FIFA world rankings. In Statistical thinking in sports (pp. 77–89). CRC Press.

  • Pollard, R. (1986). Home advantage in soccer: A retrospective analysis. Journal of Sports Sciences, 4(3), 237–248.


  • Rue, H., & Salvesen, O. (2000). Prediction and retrospective analysis of soccer matches in a league. Journal of the Royal Statistical Society: Series D (The Statistician), 49(3), 399–418.


  • Scarf, P. A., & Yusof, M. M. (2011). A numerical study of tournament structure and seeding policy for the soccer World Cup Finals. Statistica Neerlandica, 65(1), 43–57.


  • Suzuki, K. A., Salasar, B. L. E., Leite, G. J., & Louzada-Neto, F. (2010). A Bayesian approach for predicting match outcomes: The 2006 (Association) Football World Cup. Journal of the Operational Research Society, 61(10), 1530–1539.


  • Van Haaren, J., & Van den Broeck, G. (2015). Relational learning for football-related predictions. In Latest advances in inductive logic programming (pp. 237–244).

  • World Football Elo Ratings. http://www.eloratings.net. Accessed April 24, 2019.


Acknowledgements

We would like to sincerely thank two reviewers for their valuable suggestions that allowed us to considerably improve a preliminary version of this work. We also thank ISCI, Chile (CONICYT PIA FB0816) for its support. The second author was partially financed by ANPCyT PICT Grant 2015-2218 (Argentina) and UBACyT Grant 20020170100495BA (Argentina).

Author information

Corresponding author

Correspondence to Guillermo Durán.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Mixed integer linear programming model

Sets

  • G: groups.

  • C: confederations.

  • I: teams.

  • S: seeded teams (\(S \subset I\)).

  • \(J_c\): teams in confederation c (\(J_c \subset I, c \in C\)).

Parameters

  • \(R_{i}\): ranking of team i (\(i \in I\)).

  • \(L_c\): minimum number of teams of confederation c in each group (\(c \in C\)).

  • \(U_c\): maximum number of teams of confederation c in each group (\(c \in C\)).

  • SR: maximum accepted difference between the highest and lowest group ranking sums.

  • \(\bar{R}\): upper bound on the ranking of any team.

Decision variables

$$\begin{aligned} x_{ig}= & {} \left\{ \begin{array}{ll} 1 & \text {if team }i\text { is assigned to group }g\\ 0 & \text {otherwise} \end{array} \right. \\ x^{\max }_{ig}= & {} \left\{ \begin{array}{ll} 1 & \text {if team }i\text { is assigned to group }g\text { and has the highest ranking value in that group}\\ 0 & \text {otherwise} \end{array} \right. \\ x^{\min }_{ig}= & {} \left\{ \begin{array}{ll} 1 & \text {if team }i\text { is assigned to group }g\text { and has the lowest ranking value in that group}\\ 0 & \text {otherwise} \end{array} \right. \end{aligned}$$

  • \(w_{min}\): ranking sum of the group with the lowest ranking sum value.

  • \(w_{max}\): ranking sum of the group with the highest ranking sum value.

  • \(z_{min}\): range of the group with the lowest range.

  • \(z_{max}\): range of the group with the highest range.
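
As a sketch of what these auxiliary variables measure, the snippet below computes the ranking sum and the range of each group for a fixed assignment and then takes the extremes across groups. The function name, group labels, and ranking values are invented for illustration. In the model itself these quantities are set by the solver through inequality constraints rather than computed directly.

```python
def group_statistics(groups, ranking):
    """Return (w_min, w_max, z_min, z_max) for a fixed assignment.

    groups  -- dict mapping a group label to the list of teams assigned to it
    ranking -- dict mapping a team to its ranking value R_i
    """
    # Ranking sum of each group.
    sums = [sum(ranking[t] for t in teams) for teams in groups.values()]
    # Range of each group: highest minus lowest ranking value.
    ranges = [
        max(ranking[t] for t in teams) - min(ranking[t] for t in teams)
        for teams in groups.values()
    ]
    return min(sums), max(sums), min(ranges), max(ranges)


# Illustrative data only: two groups of four teams with invented rankings.
ranking = {
    "t1": 1, "t2": 2, "t3": 10, "t4": 12,
    "t5": 20, "t6": 25, "t7": 30, "t8": 40,
}
groups = {
    "A": ["t1", "t4", "t5", "t8"],  # sum 73, range 39
    "B": ["t2", "t3", "t6", "t7"],  # sum 67, range 28
}

w_min, w_max, z_min, z_max = group_statistics(groups, ranking)
objective = z_max - z_min  # value of (A.1) for this assignment
```

For this toy assignment the ranking sums differ by 6 and the ranges by 11. The second figure is what the objective function (A.1) would report.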

Objective function

$$\begin{aligned} \min f = z_{max}-z_{min} \end{aligned}$$
(A.1)

Constraints

$$\begin{aligned} \sum _{g \in G}x_{ig}= & {} 1 \ \ \ \ \forall \ i \in \ I \end{aligned}$$
(A.2)
$$\begin{aligned} \sum _{i \in I}x_{ig}= & {} 4 \ \ \ \ \forall \ g \in \ G \end{aligned}$$
(A.3)
$$\begin{aligned} \sum _{i \in S}x_{ig}= & {} 1 \ \ \ \ \forall \ g \in \ G \end{aligned}$$
(A.4)
$$\begin{aligned} \sum _{i \in J_c}x_{ig}\ge & {} L_c \ \ \ \ \forall \ g \in \ G, c \in C \end{aligned}$$
(A.5)
$$\begin{aligned} \sum _{i \in J_c}x_{ig}\le & {} U_c \ \ \ \ \forall \ g \in \ G, c \in C \end{aligned}$$
(A.6)
$$\begin{aligned} w_{min}\le & {} \sum _{i \in I}R_{i}x_{ig} \ \ \ \ \forall \ g \in \ G \end{aligned}$$
(A.7)
$$\begin{aligned} w_{max}\ge & {} \sum _{i \in I}R_{i}x_{ig} \ \ \ \ \forall \ g \in \ G \end{aligned}$$
(A.8)
$$\begin{aligned} w_{max}-w_{min}\le & {} SR \end{aligned}$$
(A.9)
$$\begin{aligned} x^{\min }_{ig} + x^{\max }_{ig}\le & {} x_{ig} \ \ \ \ \forall i \in I, g \in G \end{aligned}$$
(A.10)
$$\begin{aligned} \sum _{j \in I} R_j x^{\max }_{jg}\ge & {} R_i~x_{ig} \ \ \ \ \forall i \in I, g \in G \end{aligned}$$
(A.11)
$$\begin{aligned} \sum _{j \in I} (\bar{R}-R_j)~x^{\min }_{jg}\ge & {} (\bar{R}-R_i)~x_{ig} \ \ \ \ \forall i \in I, g \in G \end{aligned}$$
(A.12)
$$\begin{aligned} \sum _{i \in I}x^{\max }_{ig}= & {} 1 \ \ \ \ \forall \ g \in \ G \end{aligned}$$
(A.13)
$$\begin{aligned} \sum _{i \in I}x^{\min }_{ig}= & {} 1 \ \ \ \ \forall \ g \in \ G \end{aligned}$$
(A.14)
$$\begin{aligned} z_{min}\le & {} \sum _{i \in I}R_{i}(x^{\max }_{ig} - x^{\min }_{ig}) \ \ \ \ \forall \ g \in \ G \end{aligned}$$
(A.15)
$$\begin{aligned} z_{max}\ge & {} \sum _{i \in I}R_{i}(x^{\max }_{ig} - x^{\min }_{ig}) \ \ \ \ \forall \ g \in \ G \end{aligned}$$
(A.16)
$$\begin{aligned} x_{ig},\, x^{\max }_{ig},\, x^{\min }_{ig}\in & {} \ \{0,1\} \quad \forall i \in I, g \in G \end{aligned}$$
(A.17)
$$\begin{aligned} w_{min},\,w_{max},\,z_{min},\, z_{max}\ge & {} 0 \ \ \ \ \end{aligned}$$
(A.18)

Constraints (A.2) ensure that every team is assigned to exactly one group. Constraints (A.3) specify that each group contains exactly four teams, while constraints (A.4) require that exactly one of the four is a seeded team. Constraints (A.5) and (A.6) impose lower and upper bounds, respectively, on the number of teams from a single confederation assigned to a single group (in the 2014 World Cup, at most one team from each confederation was allowed in a group, except for UEFA, for which the limit was two and at least one team per group was required). Constraints (A.7)–(A.8) bound the minimum and maximum group ranking sums, while constraint (A.9) limits the difference between these two values to SR. Constraints (A.10)–(A.14) identify the teams with the highest and lowest ranking values in each group, which together determine the group’s range, and constraints (A.15)–(A.16) bound the minimum and maximum ranges across groups. Constraints (A.17)–(A.18) define the domains of the variables. Finally, the objective function (A.1) minimizes the difference between the maximum and minimum ranges.
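
One way to make the constraint descriptions above concrete is to check a given draw against them directly. The sketch below does this in plain Python for constraints (A.2)–(A.6) and (A.9). It is not the authors' implementation and does not solve the optimization problem; the function name, argument layout, and toy data are assumptions made for illustration.

```python
def is_feasible(groups, seeds, confederation, ranking, lower, upper, sr):
    """Check a fixed assignment against (A.2)-(A.6) and (A.9).

    groups        -- dict: group label -> list of teams
    seeds         -- set of seeded teams (S)
    confederation -- dict: team -> confederation label
    ranking       -- dict: team -> ranking value R_i
    lower, upper  -- dicts: confederation -> per-group bounds L_c and U_c
    sr            -- bound SR on the spread of the group ranking sums
    """
    assigned = [t for teams in groups.values() for t in teams]
    # (A.2): every team appears in exactly one group.
    if sorted(assigned) != sorted(ranking):
        return False
    for teams in groups.values():
        # (A.3): four teams per group; (A.4): exactly one seed per group.
        if len(teams) != 4 or sum(t in seeds for t in teams) != 1:
            return False
        # (A.5)-(A.6): lower and upper bounds per confederation.
        for c in lower:
            count = sum(confederation[t] == c for t in teams)
            if not lower[c] <= count <= upper[c]:
                return False
    # (A.7)-(A.9): spread of the group ranking sums is at most SR.
    sums = [sum(ranking[t] for t in teams) for teams in groups.values()]
    return max(sums) - min(sums) <= sr


# Toy instance with invented teams and rankings (8 teams, 2 groups).
ranking = {
    "t1": 1, "t2": 2, "t3": 10, "t4": 12,
    "t5": 20, "t6": 25, "t7": 30, "t8": 40,
}
seeds = {"t1", "t2"}
confederation = {
    "t1": "UEFA", "t2": "CONMEBOL", "t3": "UEFA", "t4": "UEFA",
    "t5": "CAF", "t6": "AFC", "t7": "UEFA", "t8": "CONCACAF",
}
# Bounds in the spirit of the 2014 rules: one to two UEFA teams per group,
# at most one team from any other confederation.
lower = {"UEFA": 1, "CONMEBOL": 0, "CAF": 0, "AFC": 0, "CONCACAF": 0}
upper = {"UEFA": 2, "CONMEBOL": 1, "CAF": 1, "AFC": 1, "CONCACAF": 1}
groups = {
    "A": ["t1", "t4", "t5", "t8"],
    "B": ["t2", "t3", "t6", "t7"],
}

loose = is_feasible(groups, seeds, confederation, ranking, lower, upper, sr=10)
tight = is_feasible(groups, seeds, confederation, ranking, lower, upper, sr=5)
```

In this toy instance the two group sums are 73 and 67, so the draw passes with SR = 10 and fails with SR = 5. A checker of this kind is useful mainly for validating solver output, since the range-related constraints (A.10)–(A.16) only define quantities for the objective and do not restrict which draws are admissible.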

Appendix B: Estimates for the predictive model of Section 2

See Table 17.

Table 17 Poisson regression outcome for predictive model in Sect. 2


Cite this article

Cea, S., Durán, G., Guajardo, M. et al. An analytics approach to the FIFA ranking procedure and the World Cup final draw. Ann Oper Res 286, 119–146 (2020). https://doi.org/10.1007/s10479-019-03261-8

  • DOI: https://doi.org/10.1007/s10479-019-03261-8
