1 Introduction

Strategy-proofness is a central issue in sports, where all contestants are familiar with the high-stakes decisions involved and behave as strategic actors. Consequently, a tournament design should provide the players with the appropriate incentives to perform (Szymanski 2003).

Although sporting applications of operations research proliferate in the academic world (Wright 2014), the scientific analysis of sports ranking rules from the perspective of incentive compatibility has started only recently. Kendall and Lenten (2017) give probably the first comprehensive review of sports regulations with unexpected consequences. Their examples uncover three possible situations in which a team might prefer losing a game to winning it: (1) when it might gain advantages in the next season; (2) when a lower-ranked team can still qualify and might face a favoured competitor in a later stage of the tournament; (3) when a team is strictly better off by losing due to an ill-constructed tournament design.

The classical example for the first situation arises from the reverse order applied in the traditional set-up of player drafts, which aims to increase competitive balance over time. Hence, once a team is certain to miss the play-offs, it has a perverse incentive to tank in its remaining games (Taylor and Trogdon 2002; Price et al. 2010; Fornwagner 2019).

The second situation occurred, for instance, in Badminton at the 2012 Summer Olympics—Women’s doubles (Kendall and Lenten 2017, Section 3.3.1), and has inspired some works addressing the strategic manipulation problem with game-theoretical tools (Pauly 2014; Vong 2017).

However, in the first case, the rules are deliberately devised to support underdogs, and in the second case, the team gains only in expected terms. The current paper discusses the third and most serious situation, in which the tournament rules allow a team to benefit with certainty from a weaker performance. A design is therefore called strategy-proof in the following if this possibility is excluded. We do not deal with other forms of strategic manipulation such as collusion and shirking.

Probably the first academic paper studying the problem of such misaligned incentives is Dagaev and Sonin (2018). The authors prove that tournament systems, consisting of multiple round-robin and knockout tournaments with noncumulative prizes, are generically incentive incompatible. Recent qualifications for the UEFA (Union of European Football Associations) European Championships have also been shown to be vulnerable to manipulation (Csató 2018, 2020a), including the case when both teams should avoid winning to advance to the next stage (Csató 2020b).

We show that sports tournaments with multiple group stages, in which some (but not all) match results from the previous round are carried over to the next round, suffer from incentive incompatibility. In particular, teams initially play a preliminary round-robin stage and the top teams qualify for a second (main) round-robin stage. In this second stage, some groups are “merged”, that is, two teams qualifying from the same group in the preliminary stage will be in the same main round-robin group. To reduce the number of matches in the tournament, these teams do not play each other again in the main stage; instead, the result of their preliminary-stage match is carried over.

Table 1 Recent handball tournaments with multiple group stages and results that are carried over

This format is widely used in handball as Table 1 demonstrates. All these tournaments contain two group stages, and the number of qualified teams in the main round shows the number of teams that have a chance to win the tournament at the end of this phase (see the last column). Tournaments with multiple group stages and carried over results are also used in other sports, for instance, in basketball (EuroBasket 2013), cricket (2007 Cricket World Cup) (Scarf et al. 2009), and volleyball (2014 FIVB Volleyball Men’s World Championship). Although 1999/2000 UEFA Champions League, as well as the following three seasons of this tournament, included two subsequent group stages, no results were carried over to the second group stage.

First, we present a real-world handball match where a team had an incentive not to win by a high margin. Second, this particular tournament design is verified to violate strategy-proofness in general. Finally, an incentive compatible mechanism is provided, namely, to carry over a monotonic transformation of all results from the previous round, even though some of these matches were played against teams already eliminated from the tournament. According to computer simulations, carrying over half of all points scored in the previous round essentially does not affect the selective ability and the competitive balance of the tournament, while it guarantees strategy-proofness and even reduces the influence of seeding the teams into pots before the draw of the groups. Our suggestion has been discussed in a recent collection of academic work proposing rule change ideas (Lenten and Kendall 2021).

The main contributions can be summarised as follows: (1) the incentive incompatibility of sports tournaments with multiple group stages is proved by a mathematical model; (2) a real-world example is presented to show that the issue is relevant in practice and that a third, innocent team might suffer from the unsportsmanlike act of a team in a match; (3) inspired by a policy applied in some European association football leagues, a viable strategy-proof alternative is suggested. These results can be especially useful for sports administrators.

The mathematical framework for multi-stage tournaments somewhat overlaps with the models of Csató (2020a) and Csató (2020b). But both works investigate the problem of qualification systems where some teams playing in different groups should be compared, and they do not deal with the next stage of the tournament and do not consider the possibility of carrying over some points at all. The type of sports tournaments analysed here is completely new in the literature discussing incentive (in)compatibility. Unfortunately, contrary to Dagaev and Sonin (2018) and Csató (2020a), no straightforward strategy-proof mechanism can be found for the original format of tournaments with multiple group stages and carried over results. Thus alternative incentive compatible solutions should be proposed and investigated by Monte-Carlo simulations. The simulation methodology is imported from Csató (2021a), a paper that aims only to compare certain tournament designs without addressing incentive compatibility.

The rest of the paper proceeds as follows. Section 2 brings an example from handball, where the unfair behaviour of a team could lead to the elimination of a third team. Section 3 builds a theoretical model to prove that a standard tournament with multiple group stages violates strategy-proofness. Section 4 proposes a family of incentive compatible mechanisms for organising these tournaments and explores their characteristics with respect to selective ability and competitive balance via Monte-Carlo simulations. Finally, Sect. 5 concludes.

2 A real-world example of misaligned incentives

The European Men’s Handball Championship is a biennial competition for the senior men’s national handball teams of Europe since 1994, organised by the EHF (European Handball Federation), the umbrella organization for European handball. The 11th European Men’s Handball Championship (EHF Euro 2014) was held in Denmark between 12 and 26 January 2014. In its preliminary round, the sixteen national teams were divided into four groups (A–D) to play in a round-robin format. The top three teams in each group qualified to the main round: teams from Groups A and B composed the first main round group X, while teams from Groups C and D composed the second main round group Y. The main round groups were also organised in a round-robin format, but all matches (consequently, results and points) played in the preliminary round between teams that were in the same main round group, were kept and remained valid for the ranking of the main round. Figure 1 in the Appendix gives an overview of this tournament design.

In the groups of the preliminary and main rounds, two points were awarded for a win, one point for a draw, and zero points for a defeat. Teams were ranked by adding up their number of points. If two or more teams had an equal number of points, the following tie-breaking criteria were used after the completion of all group matches (EHF 2021, Articles 9.11–9.12 and 9.23–9.24):

  (a) Higher number of points obtained in the group matches played amongst the teams in question;
  (b) Superior goal difference from the group matches played amongst the teams in question;
  (c) Higher number of goals scored in the group matches played amongst the teams in question;
  (d) Superior goal difference from all group matches (achieved by subtraction);
  (e) Higher number of goals scored in all group matches.

Table 2 11th European Men’s Handball Championship (EHF Euro 2014), Group C

A strange situation emerged in Group C of the preliminary round, which requires further investigation. On 16 January 2014, each team in the group had one more game to play. Table 2 shows the known results and the preliminary standing.

Consider the possible scenarios from the perspective of Poland. It is certainly eliminated if it does not win against Russia. Poland carries over 0 points, 46 goals for and 48 goals against to the main round if it wins against Russia and Serbia plays at least a draw against France because then Russia will be eliminated as the fourth team of the group. On the other hand, if Poland wins by x goals against Russia and Serbia loses, there will be three teams with 2 points, which obtained 2 points in the group matches played among them. Consequently, further tie-breaking criteria should be applied: Poland, Russia, and Serbia will have head-to-head goal differences of \(x-1\), \(2-x\), and \(-1\), respectively.

\(x-1 > -1\) implies that Poland will qualify. Serbia is eliminated as being the fourth team if \(1 \le x \le 2\). Russia and Serbia have the same head-to-head goal difference if \(x=3\), hence the number of goals scored against the three teams with 2 points breaks the tie. It is 45 for Serbia and at least 27 for Russia, thus Russia qualifies if it scores at least 19 goals against Poland (if Poland vs. Russia is 21–18, then the ranking will depend on the result of Serbia vs. France). If \(x \ge 4\), then Serbia has a better head-to-head goal difference than Russia, thus Serbia qualifies, and Russia is eliminated.
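The case analysis above can be checked with a short script. This is only a sketch under the assumptions stated in the text: Poland beats Russia by x goals, Serbia loses to France, and only tie-breaking criteria (b) and (c) are evaluated. The function name `fourth_placed_team` and its parameters are hypothetical labels introduced for illustration.

```python
def fourth_placed_team(x, russia_goals):
    """Team eliminated from Group C in the three-way tie on 2 points.

    x            -- Poland's winning margin against Russia (at least 1)
    russia_goals -- goals scored by Russia against Poland
    Assumes Serbia loses to France. Returns None when criteria (b) and (c)
    leave Russia and Serbia level, so later criteria would decide.
    """
    if x < 1:
        raise ValueError("Poland must win, so the margin x is at least 1")
    # Criterion (b): head-to-head goal differences quoted in the text.
    h2h_gd = {"Poland": x - 1, "Russia": 2 - x, "Serbia": -1}
    # Criterion (c): head-to-head goals scored (27 is Russia's tally
    # against Serbia, 45 is Serbia's tally against Poland and Russia).
    h2h_goals = {"Russia": 27 + russia_goals, "Serbia": 45}
    # Poland's difference x - 1 >= 0 always exceeds Serbia's -1,
    # so only Russia or Serbia can finish fourth.
    for criterion in (h2h_gd, h2h_goals):
        if criterion["Russia"] != criterion["Serbia"]:
            return min(("Russia", "Serbia"), key=criterion.get)
    return None


for margin, goals in [(2, 22), (3, 19), (3, 18), (4, 22)]:
    print(margin, goals, fourth_placed_team(margin, goals))
```

The unresolved case (a 21–18 win for Poland) returns `None` because the ranking then depends on the result of Serbia vs. France.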

Fig. 1 The tournament format of the 2014 European Men’s Handball Championship

Figure 2 overviews all scenarios from the perspective of Poland. To summarise, if Poland wins, it carries over its result against Russia (2 points) or Serbia (0 points) to the main round, therefore Poland has every incentive to qualify together with Russia. Consequently, it is ex ante unfavourable for Poland to win by more than three goals against Russia because this scenario yields no gain in the main round but may lead to a loss of 2 points if Serbia is defeated by France. However, Russia does not have similar problems with its incentives, for example, it is clearly better off by a smaller defeat compared to a greater one.
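Poland's incentives can be tabulated as a small payoff function. The sketch below assumes only what the text states: Poland carries over 2 points if it qualifies together with Russia and 0 points if it qualifies together with Serbia. The name `poland_carried_points` is hypothetical, and the three-goal margin is returned as a set of two values because the outcome then depends on the goals scored.

```python
def poland_carried_points(margin, serbia_loses):
    """Possible main-round points carried over by Poland after a win by `margin`."""
    if margin < 1:
        raise ValueError("only Polish wins are considered")
    if not serbia_loses:
        return {0}       # Russia is fourth: only defeats are carried over
    if margin <= 2:
        return {2}       # Serbia is fourth: the win against Russia counts
    if margin == 3:
        return {0, 2}    # decided by goals scored, see the case analysis
    return {0}           # Russia is fourth again


# A two-goal win is never worse than any other winning margin.
weakly_dominant = all(
    max(poland_carried_points(m, s)) <= min(poland_carried_points(2, s))
    for m in range(1, 11)
    for s in (True, False)
)
print(weakly_dominant)
```

Under these assumptions, a win by one or two goals is weakly dominant for Poland regardless of the result of Serbia vs. France.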

In fact, Poland vs. Russia was 24–22 and Serbia vs. France was 28–31, hence France, Poland, and Russia qualified for the main round with 4, 2, and 0 points, respectively. The result of Poland vs. Russia was 10–14 after 30 min (half-time), while the match stood at 21–16 in the 48th, 22–17 in the 50th, and 23–18 in the 52nd min (EHF 2014).

These events, perhaps influenced by the misaligned incentives of Poland, led to the elimination of a third, innocent team, Serbia, which makes the example especially worrying. The situation could not have been improved by playing the last group matches simultaneously because Poland’s (weakly) dominant strategy was independent of the result of the game played later. This seems to be a persuasive argument against the rules of the 11th European Men’s Handball Championship (EHF Euro 2014).

3 The model

Now we build a model of a tournament consisting of round-robin preliminary and main rounds, where the matches played in the preliminary round against teams that qualified to the same main round group are carried over. This design will be proved to violate strategy-proofness, that is, it allows for misaligned incentives. Our notations follow Csató (2020a) in certain details since the qualification system discussed there is also based on round-robin groups.

Definition 1

(Round-robin group) The pair (X, R) is a round-robin group where

  • X is a finite set of at least two teams;

  • the ranking method R associates a strict order R(v) on the set X for any function \(v: X \times X \rightarrow \left\{ \left( v_1; v_2 \right) : v_1,v_2 \in {\mathbb {N}} \right\} \cup \{ \text {---} \} \cup \{ \otimes \}\) such that \(v(x,y) = \text {---}\) if \(x=y\) and \(v(x,y) = \otimes\) implies \(v(y,x) = \otimes\).

Function v describes game results with the number of goals scored by the first and second team, respectively. It contains the possibility that some matches between the teams remain to be played, denoted by the symbol \(\otimes\).

Definition 1 can describe a home-and-away round-robin tournament where any two teams play each other once at home and once away. Consider the notation \(v(u,w) = \left( v_1(u,w); v_2(u,w) \right)\) where the first team u plays at home. It is said that team x wins over team y if \(v_1(x,y) > v_2(x,y)\) (home) or \(v_1(y,x) < v_2(y,x)\) (away), team x loses to team y if \(v_1(x,y) < v_2(x,y)\) (home) or \(v_1(y,x) > v_2(y,x)\) (away), and team x draws against team y if \(v_1(x,y) = v_2(x,y)\) or \(v_1(y,x) = v_2(y,x)\).

Let (X, R) be a round-robin group, \(x,y \in X\), \(x \ne y\) be two teams, and v be a set of results. x is ranked higher (lower) than y if and only if x is preferred to y by R(v), that is, \(x \succ _{R(v)} y\) (\(x \prec _{R(v)} y\)).

The ranking is usually based on the number of points scored.

Definition 2

(Number of points) Let (X, R) be a round-robin group, v be a set of results, \(x \in X\) be a team, and \(\alpha > \beta > \gamma\) be three parameters. Denote by \(N_v^w(x)\) the number of wins, by \(N_v^d(x)\) the number of draws, and by \(N_v^\ell (x)\) the number of losses of team x. The number of points of team x is \(s_v(x) = \alpha N_v^w(x) + \beta N_v^d(x) + \gamma N_v^\ell (x)\).

In other words, a win gives \(\alpha\), a draw gives \(\beta\), and a loss gives \(\gamma\) points.

Remark 1

With a slight abuse of notation, it is assumed in the following that the ranking method R determines the values \(\alpha > \beta > \gamma\) for any round-robin group (X, R).

The number of points does not necessarily induce a strict order on the set of teams, hence some tie-breaking rules should be introduced.

Definition 3

(Goal difference) Let (X, R) be a round-robin group, v be a set of results, and \(x \in X\) be a team. The goal difference of team x is

$$\begin{aligned} gd_v(x)= & {} \sum _{y \in X \setminus \{ x \}, \, v(x,y) \ne \otimes } \left[ v_1(x,y) - v_2(x,y) \right] \\&+ \sum _{y \in X \setminus \{ x \}, \, v(x,y) \ne \otimes } \left[ v_2(y,x) - v_1(y,x) \right] . \end{aligned}$$

Goal difference is the number of goals scored by team x minus the number of goals conceded by team x.
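Definitions 2 and 3 translate directly into code. The following minimal sketch assumes a hypothetical representation in which `results` maps an ordered pair (first team, second team) to the goals scored by each, and an unplayed match (the symbol \(\otimes\)) is simply absent from the dictionary. The default values 2, 1, 0 for \(\alpha\), \(\beta\), \(\gamma\) follow the handball scoring of Sect. 2.

```python
def points(results, team, alpha=2, beta=1, gamma=0):
    """s_v(team): alpha per win, beta per draw, gamma per loss."""
    total = 0
    for (first, second), (goals_first, goals_second) in results.items():
        if team not in (first, second):
            continue
        if team == first:
            own, conceded = goals_first, goals_second
        else:
            own, conceded = goals_second, goals_first
        if own > conceded:
            total += alpha
        elif own == conceded:
            total += beta
        else:
            total += gamma
    return total


def goal_difference(results, team):
    """gd_v(team): goals scored minus goals conceded over the played matches."""
    diff = 0
    for (first, second), (goals_first, goals_second) in results.items():
        if team == first:
            diff += goals_first - goals_second
        elif team == second:
            diff += goals_second - goals_first
    return diff


# The two final-day scorelines of Group C quoted in Sect. 2.
results = {("Poland", "Russia"): (24, 22), ("Serbia", "France"): (28, 31)}
print(points(results, "Poland"), goal_difference(results, "Poland"))
```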

Definition 4

(Head-to-head results) Let (X, R) be a round-robin group, v be a set of results, and \(x \in X\) be a team. Denote by \(L \subseteq X \setminus \{ x \}\) a set of teams. The head-to-head number of points of team x with respect to L is

$$\begin{aligned} s_v^L(x) & = \alpha \left( | \left\{ y \in L: v_1(x,y)> v_2(x,y) \right\} | + | \left\{ y \in L: v_1(y,x)< v_2(y,x) \right\} | \right) \\&+ \beta \left( | \left\{ y \in L: v_1(x,y) = v_2(x,y) \right\} | + | \left\{ y \in L: v_1(y,x) = v_2(y,x) \right\} | \right) \\&+ \gamma \left( | \left\{ y \in L: v_1(x,y) < v_2(x,y) \right\} | + | \left\{ y \in L: v_1(y,x) > v_2(y,x) \right\} | \right) . \end{aligned}$$

The head-to-head goal difference of team x with respect to L in (X, v) is

$$\begin{aligned} gd_v^L(x) = \sum _{y \in L} \left[ v_1(x,y) - v_2(x,y) \right] + \sum _{y \in L} \left[ v_2(y,x) - v_1(y,x) \right] . \end{aligned}$$

In accordance with (EHF 2021, Articles 9.11–9.12 and 9.23–9.24), head-to-head results are calculated only for complete round-robin groups where all matches have already been played.

Definition 5

(Head-to-head domination) Let (X, R) be a round-robin group, v be a set of results, and \(x, y \in X\) be two teams such that \(s_v(x) = s_v(y)\). Denote by L the set of teams that have scored the same number of points as teams x and y. Team x head-to-head dominates team y if one of the following holds:

  • \(s_v^L(x) > s_v^L(y)\);

  • \(s_v^L(x) = s_v^L(y)\) and \(gd_v^L(x) > gd_v^L(y)\).

Therefore, if two teams have the same number of points, then one head-to-head dominates the other if: (a) it has scored more points against all teams with the same number of points (the first condition); or (b) it has scored the same number of points against all teams with the same number of points but has a superior goal difference against them (the second condition). See the analogy to (EHF 2021, Articles 9.11–9.12 and 9.23–9.24).
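Definitions 4 and 5 can be sketched as follows. The representation of results (a dictionary keyed by ordered team pairs) and the function names are assumptions made for illustration. The Serbia vs. Poland and Russia vs. Serbia scorelines are reconstructed from the goal tallies quoted in Sect. 2 and should be read as illustrative inputs rather than as an official record.

```python
def head_to_head(results, team, rivals, alpha=2, beta=1, gamma=0):
    """Return (s_v^L(team), gd_v^L(team)) for the set L = rivals."""
    pts = gd = 0
    for (first, second), (goals_first, goals_second) in results.items():
        if team == first and second in rivals:
            own, opp = goals_first, goals_second
        elif team == second and first in rivals:
            own, opp = goals_second, goals_first
        else:
            continue
        pts += alpha if own > opp else beta if own == opp else gamma
        gd += own - opp
    return pts, gd


def dominates(results, points, x, y):
    """True if x head-to-head dominates y in the sense of Definition 5."""
    if points[x] != points[y]:
        raise ValueError("Definition 5 only compares teams level on points")
    tied = {team for team, p in points.items() if p == points[x]}
    # Tuples compare points first and goal difference second,
    # which mirrors the two conditions of Definition 5.
    return head_to_head(results, x, tied - {x}) > head_to_head(results, y, tied - {y})


results = {
    ("Serbia", "Poland"): (20, 19),   # reconstructed, see the lead-in
    ("Russia", "Serbia"): (27, 25),   # reconstructed, see the lead-in
    ("Poland", "Russia"): (24, 22),   # quoted in Sect. 2
}
points = {"France": 6, "Poland": 2, "Russia": 2, "Serbia": 2}
print(dominates(results, points, "Poland", "Russia"))
```

With these inputs the head-to-head goal differences are \(+1\), 0, and \(-1\) for Poland, Russia, and Serbia, matching \(x-1\), \(2-x\), and \(-1\) for \(x=2\).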

Definition 6

(Monotonicity of the ranking in a round-robin group) Let (X, R) be a round-robin group. Its ranking method is called monotonic if for any set of results v and for any teams \(x,y \in X\):

  1. \(s_v(x) > s_v(y)\) implies \(x \succ _{R(v)} y\);
  2. \(s_v(x) = s_v(y)\), \(gd_v(x) > gd_v(y)\), and x head-to-head dominates y imply \(x \succ _{R(v)} y\).

Monotonicity requires that (a) a team should be ranked higher if it has a greater number of points (criterion 1); and (b) a team should be ranked higher compared to any other with the same number of points, an inferior goal difference, and worse head-to-head results against all teams with the same number of points (criterion 2).

Monotonicity does not necessarily result in a strict ranking. The complexity of Definition 6 arises from the need to cover two different tie-breaking concepts: goal difference and head-to-head results. For example, in association football, FIFA currently uses the former, while UEFA applies the latter rule.

Definition 7

(Preliminary round) The preliminary round \({\mathcal {P}}\) consists of k groups of round-robin tournaments \((X^1,R^1)\), \((X^2,R^2)\), ..., \((X^k,R^k)\) such that \(X^i \cap X^h = \emptyset\) for any \(h \ne i\), \(1 \le h,i \le k\).

Definition 8

(Main round) The main round \({\mathcal {M}}\) consists of \(\ell\) groups of round-robin tournaments \((Y^1,S^1)\), \((Y^2,S^2)\), ..., \((Y^\ell ,S^\ell )\) such that \(Y^j \cap Y^h = \emptyset\) for any \(j \ne h\), \(1 \le h,j \le \ell\).

Definition 9

(Qualification rule) Let \({\mathcal {P}}\) be a preliminary round and \({\mathcal {M}}\) be a main round. For any set of results \(V = \left\{ v^1, v^2, \dots , v^k \right\}\) in the preliminary round such that \(v^i(x,y) \ne \otimes\) for all \(x,y \in X^i\) and \(1 \le i \le k\), a qualification rule \({\mathcal {Q}}\) associates the sets \(Y^1, Y^2, \dots Y^\ell\) and the set of results \(W = \left\{ w^1, w^2, \dots , w^\ell \right\}\) in the main round groups.

Thus the qualification rule determines the composition of the groups in the main round and the set of results carried over from the preliminary round on the basis of all results in the preliminary round, that is, after all matches have been played there.

Team \(x \in X^i\) is said to be qualified to the main round if \(x \in \cup _{j=1}^\ell Y^j\).

Definition 10

(Tournament with multiple group stages) A tournament with multiple group stages is a triple \(({\mathcal {P}}, {\mathcal {M}}, {\mathcal {Q}})\) consisting of the preliminary round \({\mathcal {P}}\), the main round \({\mathcal {M}}\), and the qualification rule \({\mathcal {Q}}\).

It is natural to restrict our attention to a reasonable subset of tournaments.

Definition 11

(Regularity of a tournament with multiple group stages) Let \(({\mathcal {P}}, {\mathcal {M}}, {\mathcal {Q}})\) be a tournament with multiple group stages. It is called regular if under any set of results \(V = \left\{ v^1, v^2, \dots , v^k \right\}\) in the preliminary round such that \(v^i(x,y) \ne \otimes\) for all \(x,y \in X^i\) and \(1 \le i \le k\), the following conditions hold:

  (a) \(\cup _{j=1}^\ell Y^j \subseteq \cup _{i=1}^{k} X^i\);
  (b) there exists a common monotonic ranking \(R = R^i\) in each group \((X^i, R^i)\) of the preliminary round \({\mathcal {P}}\) such that \(x \succ _{R(v^i)} y\) and \(y \in \cup _{j=1}^\ell Y^j\) imply \(x \in \cup _{j=1}^\ell Y^j\) for all \(x,y \in X^i\), \(1 \le i \le k\);
  (c) \(x,y \in X^i \cap Y^j\) implies \(w^j(x,y) = v^i(x,y)\), where \(w^j\) is the set of results in the main round group \((Y^j, S^j)\);
  (d) \(x \in X^i\), \(y \in X^h\), \(i \ne h\), and \(x,y \in Y^j\) imply \(w^j(x,y) = \otimes\), where \(w^j\) is the set of results in the main round group \((Y^j, S^j)\);
  (e) there exists a common monotonic ranking \(S = S^j\) in each group \((Y^j, S^j)\) of the main round \({\mathcal {M}}\).

The idea behind a regular tournament with multiple group stages is straightforward. Some top teams from the preliminary round groups qualify for the main round (conditions (a) and (b)), where they are divided into new groups such that the matches already played against teams in the same main round group are carried over (conditions (c) and (d)). Furthermore, within each round, all groups share the same monotonic ranking method (conditions (b) and (e)). Nonetheless, the rankings R and S can be different.

Perhaps these principles have inspired the decision-makers of the EHF.

Definition 12

(Manipulation) Let \(({\mathcal {P}}, {\mathcal {M}}, {\mathcal {Q}})\) be a tournament with multiple group stages. A team \(x \in X^i\) can manipulate the tournament if there exist two sets of results \(V = \left\{ v^1, v^2, \dots , v^i, \dots , v^k \right\}\) and \({\bar{V}} = \left\{ v^{1}, v^2, \dots , {\bar{v}}^i, \dots , v^k \right\}\) in the preliminary round such that \({\bar{v}}_2^i(x,y) \ge v_2^i(x,y)\) and \({\bar{v}}_1^i(y,x) \ge v_1^i(y,x)\) for all \(y \in X^i\), furthermore, \(x \in Y^j\), \(1 \le j \le \ell\) according to both \({\mathcal {Q}}(V)\) and \({\mathcal {Q}}({\bar{V}})\), and either \(s_{{\bar{W}}}(x) > s_W(x)\), or \(s_{{\bar{W}}}(x) = s_{W}(x)\) and \(gd_{{\bar{W}}}(x) > gd_{W}(x)\).

Manipulation means that team x can increase its number of points (\(s_{{\bar{W}}}(x) > s_W(x)\)), or at least improve its goal difference (\(gd_{{\bar{W}}}(x) > gd_{W}(x)\)) while preserving its number of points (\(s_{{\bar{W}}}(x) = s_{W}(x)\)) in the main round by conceding more goals in a match of the preliminary round. The definition of manipulation may seem to be restrictive but: (a) scoring fewer goals is not an option at a given standing of the game; and (b) qualification to another main round group is not necessarily advantageous even if better results are carried over.

Since conceding more goals is in the hands of a team, it can be regarded as its decision variable.

Definition 13

(Strategy-proofness) A tournament with multiple group stages \(({\mathcal {P}}, {\mathcal {M}}, {\mathcal {Q}})\) is called strategy-proof if there exists no set of group results \(V = \left\{ v^1, v^2, \dots ,v^k \right\}\) under which a team \(x \in \cup _{i=1}^k X^i\) can manipulate.

Our central result concerns the strategy-proofness of regular tournaments with multiple group stages: while manipulation certainly worsens a team’s goal difference (and sometimes its number of points, too) in its preliminary round group as the ranking rule applied here is monotonic, this might pay off in the main round, where some matches of the preliminary round are discarded—provided that the team still qualifies.

Proposition 1

Let \(({\mathcal {P}}, {\mathcal {M}}, {\mathcal {Q}})\) be a regular tournament with multiple group stages such that the following conditions hold:

  • there exist \(x,y \in X^i \cap Y^j\) for some \(1 \le i \le k\) and \(1 \le j \le \ell\);

  • there exists \(u \in X^i\) but \(u \notin Y^j\).

Then this tournament with multiple group stages does not satisfy strategy-proofness.

According to the conditions of Proposition 1, the result of at least one match played in the preliminary round (between the teams x and y) is carried over to the main round, and the results of some matches (between the teams x or y, and u) are ignored in the main round.

Proof

The proof works by simplifying the motivating example of Sect. 2.

Example 1

Consider a regular tournament with multiple group stages \(({\mathcal {P}}, {\mathcal {M}}, {\mathcal {Q}})\). Let \((X^1,R)\) be a single round-robin group in the preliminary round with \(X^1 = \{ a,b,c \}\). Therefore, the number of points for each team is between \(2 \gamma\) and \(2 \alpha\).

Assume that there is \(\ell = 1\) group in the main round and \(x \in X^1 \cap Y^1\) if and only if \(\left\{ z \in X^1: x \succ _{R(v^1)} z \right\} \ne \emptyset\), namely, the group winner and the runner-up qualify for the main round from the group \((X^1,R)\).

Table 3 The round-robin group \((X^1,R)\) of Example 1

A possible set of results in the preliminary round group \((X^1,R)\) is shown in Table 3. Team a is the group-winner due to the best (head-to-head) goal difference (see criterion 2 of a monotonic group ranking method). Furthermore, it is considered with \(s_{W}(a) = \gamma\) points in the main round, after discarding its match against team c, the last team in the group by criterion 2 of a monotonic group ranking R (see the last but one row of Table 3).

Let us examine what happens if \({\bar{v}}^1(a,c) = (2; 0)\). Then teams a, b, and c remain with \(\alpha + \gamma\) points, but they have head-to-head goal differences \(+1\), \(-1\), and 0, respectively. Therefore, a is the first and c is the second according to criterion 2 of the monotonic group ranking R, and team a is considered with \(s_{{\bar{W}}}(a) = \alpha > \gamma = s_{W}(a)\) points in the main round as the last row of Table 3 shows.

To conclude, team a has an opportunity to manipulate this regular tournament with multiple group stages under the set of group results V, hence it violates strategy-proofness.

Example 1 contains only three teams, which is minimal under the conditions of Proposition 1. The number of teams can be increased without changing the essence of the counterexample if we add some teams such that all of them have suffered a defeat of 1–0 to teams a, b, and c. Groups can be double round-robin tournaments instead of single ones by copying the game results above. Since a tournament is incentive incompatible if there exists a single group with the threat of manipulation, an arbitrary number of groups can be added to the example. \(\square\)
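The mechanics of Example 1 can be replayed numerically. The sketch below is built on loudly stated assumptions: the scorelines are hypothetical and were chosen only to reproduce the rankings and head-to-head goal differences quoted in the example (the single entry taken from the text is \({\bar{v}}^1(a,c) = (2; 0)\)), and the values 2, 1, 0 stand in for \(\alpha\), \(\beta\), \(\gamma\). Ranking by points and then overall goal difference coincides with criterion 2 here only because all three teams are level on points.

```python
ALPHA, BETA, GAMMA = 2, 1, 0  # hypothetical values with alpha > beta > gamma


def match_points(own, opp):
    return ALPHA if own > opp else BETA if own == opp else GAMMA


def ranking(results):
    """Teams ordered by points, then goal difference (best first)."""
    teams = sorted({team for match in results for team in match})
    pts = dict.fromkeys(teams, 0)
    gd = dict.fromkeys(teams, 0)
    for (first, second), (g1, g2) in results.items():
        pts[first] += match_points(g1, g2)
        pts[second] += match_points(g2, g1)
        gd[first] += g1 - g2
        gd[second] += g2 - g1
    return sorted(teams, key=lambda t: (pts[t], gd[t]), reverse=True)


def carried_points(results, team, n_qualified=2):
    """Points `team` takes to the main round from matches against other qualifiers."""
    qualified = ranking(results)[:n_qualified]
    if team not in qualified:
        return None
    total = 0
    for (first, second), (g1, g2) in results.items():
        if first in qualified and second in qualified:
            if team == first:
                total += match_points(g1, g2)
            elif team == second:
                total += match_points(g2, g1)
    return total


V = {("a", "b"): (0, 1), ("b", "c"): (0, 2), ("a", "c"): (4, 0)}  # hypothetical
V_bar = {**V, ("a", "c"): (2, 0)}  # a beats c by a smaller margin

print(ranking(V), carried_points(V, "a"))
print(ranking(V_bar), carried_points(V_bar, "a"))
```

In this illustration, team a takes \(\gamma\) points into the main round under V (its defeat against b) but \(\alpha\) points under the alternative results (its win against c), which is the gain described in the proof.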

Proposition 1 remains valid if draws are allowed in a tournament with multiple group stages.

Remark 2

The 11th European Men’s Handball Championship (EHF Euro 2014), discussed in Sect. 2, fits into the model presented above. The number of groups in the preliminary round is \(k=4\), the number of groups in the main round is \(\ell =2\), and it is a regular tournament with multiple group stages:

  (a) \(Y^1 \subset X^1 \cup X^2\) and \(Y^2 \subset X^3 \cup X^4\);
  (b) Ranking in the preliminary round groups is monotonic as it is based on the number of points with tie-breaking through head-to-head results, and the top three teams qualify for the main round;
  (c) Matches played in the preliminary round against opponents which qualified to the main round are kept and remain valid for the ranking of the main round;
  (d) In the main round, each team faces three teams that did not participate in its preliminary round group;
  (e) Ranking in the main round groups is monotonic as it is based on the number of points with tie-breaking through head-to-head results.

Example 2

The 11th European Men’s Handball Championship (EHF Euro 2014) is not strategy-proof.

Proof

The scenario presented in Sect. 2 shows that team \(\text {Poland} = x \in X^3\) can manipulate against \(\text {Russia} = y \in X^3\): there exist two sets of group results \(V = \left\{ v^1, v^2, v^3, v^4 \right\}\) and \({\bar{V}} = \left\{ v^1, v^2, {\bar{v}}^3, v^4 \right\}\) such that \({\bar{v}}^3 = v^3\) except for \({\bar{v}}_1^3(x,y) = {\bar{v}}_2^3(y,x) = 26 > 24 = v_1^3(x,y) = v_2^3(y,x)\), furthermore, Poland qualifies for the group \((Y^2,S)\) according to both \({\mathcal {Q}}(V)\) and \({\mathcal {Q}}({\bar{V}})\), whereas \(s_{{\bar{W}}}(x) = 2 > 0 = s_W(x)\).

Proposition 1 can also be applied due to Remark 2. \(\square\)

Now we state a positive result, a counterpart of Proposition 1.

Proposition 2

Let \(({\mathcal {P}}, {\mathcal {M}}, {\mathcal {Q}})\) be a regular tournament with multiple group stages such that one of the following conditions holds:

  • there do not exist two teams \(x,y \in X^i \cap Y^j\) for any \(1 \le i \le k\) and \(1 \le j \le \ell\);

  • \(u,z \in X^i\) and \(u \in Y^j\) imply \(z \in Y^j\) for all \(1 \le i \le k\).

Then this tournament with multiple group stages is strategy-proof.

Proof

If all preliminary round results achieved against other qualified teams are ignored (first condition), or carried over to the main round (second condition), then it makes no sense to perform weaker in the preliminary round because of the monotonicity of the group rankings in both rounds. \(\square\)

Proposition 2 implies that teams qualifying from the same preliminary round group should be drawn into different groups in the main round (which is guaranteed if only one team qualifies from each preliminary round group), or all teams from a given preliminary round group should qualify for the same main round group to avoid incentive incompatibility.
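The structural conditions of Propositions 1 and 2 depend only on how the preliminary groups overlap with the main round groups, so they can be checked for a given realisation of the groups. The sketch below is a hedged illustration: the function names and the team labels (A1, A2, ...) are hypothetical, and each group is represented as a plain Python set.

```python
def proposition_1_applies(prelim_groups, main_groups):
    """Some X^i sends at least two teams to the same Y^j and leaves one out of it."""
    return any(
        len(X & Y) >= 2 and bool(X - Y)
        for X in prelim_groups
        for Y in main_groups
    )


def proposition_2_applies(prelim_groups, main_groups):
    """Either no two group-mates meet again, or group-mates always stay together."""
    no_pair = all(len(X & Y) < 2 for X in prelim_groups for Y in main_groups)
    all_together = all(X <= Y for X in prelim_groups for Y in main_groups if X & Y)
    return no_pair or all_together


prelim = [{f"{g}{i}" for i in range(1, 5)} for g in "ABCD"]

# Format of Sect. 2: top three of A and B form one group, top three of C and D the other.
top_three = [{f"{g}{i}" for g in pair for i in range(1, 4)} for pair in ("AB", "CD")]
# Alternative 1: only group winners advance.
winners = [{"A1", "B1"}, {"C1", "D1"}]
# Alternative 2: every team advances together with its group-mates.
everyone = [prelim[0] | prelim[1], prelim[2] | prelim[3]]

for main in (top_three, winners, everyone):
    print(proposition_1_applies(prelim, main), proposition_2_applies(prelim, main))
```

Only the first configuration satisfies the conditions of Proposition 1, while the two alternatives fall under Proposition 2.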

It is also clear from the match discussed in Sect. 2 that head-to-head results cannot be used to break a tie in the main round between two teams qualified from the same preliminary round group, otherwise there remain some incentives to influence the set of qualified teams.

Our main result is somewhat related to, but entirely independent of, the finding of Vong (2017) that in general multi-stage tournaments, the necessary and sufficient condition of strategy-proofness is to allow only the top-ranked player to qualify from each group. However, in the model of Vong (2017), teams tank in order to meet preferred opponents in the next round, thus they only gain in expected value. In contrast, Definition 13 requires that a team cannot be strictly better off by a lower effort.

4 A family of incentive compatible designs

The theoretical results in Sect. 3 uncover that there is no straightforward way to guarantee the strategy-proofness of tournaments with multiple group stages and results that are carried over, in contrast to tournament systems consisting of multiple round-robin and knockout tournaments (Dagaev and Sonin 2018), or group-based qualification systems (Csató 2020a).

According to Proposition 2, incentive compatibility will be satisfied if either all points scored in the preliminary round are considered in the main round (directly or after an arbitrary monotonic transformation), or all of them are discarded, which is against the essence of these tournaments. Consequently, the only reasonable solution is to carry over all preliminary round results to the main round, perhaps after a monotonic transformation, even though some of these matches were played against teams already eliminated from the tournament.

However, if all results are carried through, then the subsequent phase loses a bit of excitement because there will be greater variation in points at the commencement of this stage, and the teams entering at the bottom will find it much harder to catch up with the teams entering the stage on top.

This effect can be mitigated by carrying over only half of the points from the preliminary round. The idea comes from the Belgian First Division A, the top league competition for association football clubs in Belgium, where the sixteen participants play a double round-robin tournament in the regular season, followed by a championship play-off for the first six teams such that the points obtained during the regular season are halved. A similar policy is applied currently in the top-tier association football leagues in Poland, Romania, and Serbia (Lasek and Gagolewski 2018). However, in contrast to our model in Sect. 3, the teams advancing to the championship play-off play again in a round-robin format.

For tie-breaking purposes, we suggest retaining the number of goals scored and conceded in the preliminary round. Theoretically, they can also be discarded, but it seems to be unfair when there was a match played in the preliminary round against a team from the same main round group. In the case of Belgian First Division A, goal difference is not among the tie-breaking criteria in the championship play-offs.

Therefore, two incentive compatible versions of each tournament with multiple group stages will be considered without changing the set of matches played: (1) carrying over all results from the preliminary round; and (2) carrying over half of the points from the preliminary round. The consequences of these modifications will be explored here as a kind of cost-benefit analysis via simulations, implemented in the framework of Csató (2021a). The latter study attempts to compare four tournament formats of the World Men’s Handball Championships with respect to several sporting criteria such as selection ability, and competitiveness and quality of the final.
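To make the carry-over rules concrete, the following minimal Python sketch computes the points a team brings into the main round under the original design (only results against co-qualified teams count) and under the two incentive compatible variants. The function name, the data layout, and the example results are hypothetical. The scoring of 2 points for a win and 1 for a draw is an assumption of the sketch, and halved points are kept as exact fractions rather than rounded, which is a simplification.

```python
from fractions import Fraction

def carried_points(points_vs, qualified, rule):
    """Points a team carries into the main round (illustrative sketch).

    points_vs: dict opponent -> points the team earned against that opponent
               in the preliminary round (assumed 2 = win, 1 = draw, 0 = loss).
    qualified: set of group opponents that also reached the main round.
    rule:      "qualified_only" (original design), "all", or "half".
    """
    if rule == "qualified_only":
        return Fraction(sum(p for opp, p in points_vs.items() if opp in qualified))
    total = Fraction(sum(points_vs.values()))
    if rule == "all":
        return total
    if rule == "half":
        return total / 2
    raise ValueError(f"unknown rule: {rule}")

# Hypothetical team A: draw with B, win against C, win against D.
team_a = {"B": 1, "C": 2, "D": 2}
```

In this example, under "qualified_only" team A carries 1 point if B advances alongside it but 2 points if C does, so its starting position depends on which rival qualifies. Under "all" and "half" it carries 5 and 5/2 points, respectively, whoever else advances.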

As Table 1 uncovers, the tournament has used three different designs containing multiple group stages. Since the one used in 2003 suffers from various problems and does not seem to be efficacious (Csató 2021a), the following two are studied:

  • Format G66: This design, presented in Fig. 3, was first used in the 2005 World Men’s Handball Championship and was applied again in the 2009, 2011, and 2019 tournaments. The preliminary round (see Fig. 3a) consists of four groups of six teams each such that the top three teams qualify for the main round. The main round consists of two groups of six teams, each created from two preliminary round groups. The top two teams of every main round group advance to the semifinals in the knockout stage (see Fig. 3b). The name of the format comes from the size of the groups in its two rounds.

  • Format G46: This design, presented in Fig. 4, has been used in the 2007 World Men’s Handball Championship. The teams are drawn into six groups of four teams each in the preliminary round (see Fig. 4a) such that the top two teams proceed to the main round. The main round consists of two groups of six teams, each created from three preliminary round groups. The top four teams of every main round group advance to the quarterfinals in the knockout stage (see Fig. 4b) (Footnote 3). Again, the name of the format comes from the size of the groups in its two rounds.

While the knockout stage of both tournament formats is immediately determined by the preceding group stages, the competing teams should be drawn into groups before the start of the tournament, thus the seeding regime may also affect the outcome (Scarf and Yusof 2011). On the other hand, seeding is clearly independent of how the results are carried over from the preliminary round to the main round.

Hence, similarly to Csató (2021a), two variants of each tournament design, called seeded and unseeded, are considered. In the seeded version, the preliminary round groups are drawn such that in the case of k groups (\(k=6\) for G46 and \(k=4\) for G66), the strongest k teams are placed in Pot 1, the next strongest k teams in Pot 2, and so on. Then each group gets one team from each pot. The unseeded version divides the teams into the pots randomly. Therefore, a strong team allocated to a harsh group will have more difficulty qualifying than a “lucky” weak team allocated to an easier group.
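The two draw procedures can be sketched as follows. This is a minimal illustration rather than the implementation used in the simulations: the function name and its parameters are hypothetical, teams are represented by their pre-tournament ranks, and each pot is assumed to hold as many teams as there are groups, so that every group receives exactly one team from each pot.

```python
import random

def draw_groups(ranked_teams, n_groups, seeded, rng):
    """Draw preliminary round groups from pots (illustrative sketch).

    ranked_teams: teams ordered from strongest to weakest.
    n_groups:     number of groups; each pot holds n_groups teams.
    seeded:       if False, teams are assigned to the pots at random.
    rng:          a random.Random instance, passed in for reproducibility.
    """
    teams = list(ranked_teams)
    if len(teams) % n_groups != 0:
        raise ValueError("number of teams must be a multiple of n_groups")
    if not seeded:
        rng.shuffle(teams)  # pots no longer reflect team strength
    pots = [teams[i:i + n_groups] for i in range(0, len(teams), n_groups)]
    groups = [[] for _ in range(n_groups)]
    for pot in pots:
        order = pot[:]
        rng.shuffle(order)  # random assignment of the pot's teams to groups
        for group, team in zip(groups, order):
            group.append(team)
    return groups

# Seeded draw for a G66-like layout: 24 teams, four groups of six.
g66_seeded = draw_groups(range(1, 25), 4, True, random.Random(0))
```

With a seeded draw, each of the four groups contains exactly one of the four strongest teams; with `seeded=False`, two or more of them may end up in the same group.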

Table 4 Tournament designs considered in the simulations

Table 4 summarises the twelve tournament designs to be analysed.

The results of the matches are determined by the a priori fixed winning probabilities, which depend on the pre-tournament ranks of the teams \(1 \le i,j \le 24\), such that a stronger team defeats a weaker team with a higher probability than vice versa.

Further details of the simulation procedure can be found in Csató (2021a). According to the arguments presented there, all simulations have been implemented with one million runs.
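As an illustration of how a match can be simulated from fixed winning probabilities, consider the minimal Python sketch below. The probability function `win_prob` is a hypothetical placeholder chosen only because it satisfies the stated requirement that the stronger team wins more often; it is not the model of Csató (2021a). Draws are ignored for simplicity.

```python
import random

def win_prob(i, j):
    """Hypothetical probability that the team ranked i beats the team ranked j.

    Placeholder only: a lower rank number means a stronger team, and the
    stronger team always has a probability above 1/2.
    """
    return j / (i + j)

def estimate_win_rate(i, j, runs, rng):
    """Monte Carlo estimate of how often team i beats team j over `runs` matches."""
    wins = sum(rng.random() < win_prob(i, j) for _ in range(runs))
    return wins / runs
```

For example, `estimate_win_rate(1, 2, 20000, random.Random(0))` should be close to the placeholder value `win_prob(1, 2)`, that is, 2/3; the estimate becomes more precise as the number of runs grows, which is one reason to use a large number of runs.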

Definition 14

(Tournament metrics) The tournament designs are compared on the basis of some standard metrics:

  • the average pre-tournament rank of the winner, the second-, the third- and the fourth-placed teams;

  • the expected quality of the final: the sum of the finalists’ pre-tournament ranks;

  • the expected competitive balance of the final: the difference between the finalists’ pre-tournament ranks.

For all of these measures, a lower value is preferable.
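The metrics of Definition 14 are straightforward to compute from simulated outcomes. The sketch below is a minimal illustration with hypothetical function names; it assumes that each simulation run is recorded as a pair of pre-tournament ranks (winner, runner-up) and that the "difference" between the finalists' ranks is taken in absolute value.

```python
from statistics import mean

def final_metrics(rank_a, rank_b):
    """Quality (sum of ranks) and competitive balance (absolute rank difference) of one final."""
    return rank_a + rank_b, abs(rank_a - rank_b)

def summarise(runs):
    """Average metrics over simulated tournaments.

    runs: list of (winner_rank, runner_up_rank) tuples, one per simulation run.
    """
    quality, balance = zip(*(final_metrics(w, r) for w, r in runs))
    return {
        "avg_winner_rank": mean(w for w, _ in runs),
        "quality": mean(quality),
        "balance": mean(balance),
    }

# Two hypothetical runs: rank 1 beats rank 2, then rank 1 beats rank 4.
example = summarise([(1, 2), (1, 4)])
```

In this toy example, the average rank of the winner is 1, the expected quality of the final is 4, and its expected competitive balance is 2.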

Fig. 2 Possible scenarios before the last matchday of Group C in the 2014 European Men’s Handball Championship from the perspective of Poland

Figure 5 shows the average pre-tournament rank of the first four teams. If all points are carried over from the preliminary round, then the result of the tournament becomes more predetermined as the expected rank slightly decreases. Preserving only half of these points substantially mitigates this loss of excitement, except in the unseeded variant of format G46. On the other hand, the average rank of the winner is even higher in the case of seeded G46 according to this solution than under the original incentive incompatible design. In addition, carrying over half of all points minimises the effect of the seeding policy, which seems to be desirable because it is a factor not influenced by the competitors.

Fig. 3 Tournament format G66 of the 2011 and the 2019 World Men’s Handball Championships

Figure 6 reinforces these findings by focusing on the final of the tournament: if half of all points scored in the preliminary round are carried over instead of only the results against the teams qualified for the main round, then the final may become a bit more boring but usually involves stronger teams. It decreases the influence of the seeding regime again, especially in the format G66.

Following Scarf et al. (2009), we have made a robustness check by calculating the metrics for more and less competitive tournaments than the baseline version, in the same way as Csató (2021a). The qualitative results of these simulations coincide with the findings from Figs. 5 and 6, hence our observations are independent of the distribution of teams’ strength.

The comparison of Figs. 5a, b, as well as Fig. 6a, b, uncovers that the choice of the tournament format is more important than the effect of how points are carried over to the main round (see the scales on the vertical axis). Since there is no consensus on the former, at least for the Men’s (Women’s) World Handball Championships, it does not make much sense to dispute the use of the suggested incentive compatible variants of tournaments with multiple group stages on the basis of the tournament metrics considered.

To conclude, the price of guaranteeing incentive compatibility seems to be negligible—at least, compared to other features of the design like the particular tournament format or the seeding policy. We propose to carry over half of the points scored in the preliminary round. This solution has another interesting implication: it minimises the role of seeding (the difference between the seeded and unseeded variants is the smallest among all designs), which can be advantageous because the true ranking of the teams is never known, and misaligned classification usually leads to unfairness.

On the other hand, carrying over all results from the preliminary round (even after a monotonic transformation) means that the outcome in the main round is at least partially dependent on the strength of opponents in the preliminary group, which might raise questions about fairness if these groups are different in terms of their strength. Fortunately, some authors have recently made useful proposals to balance the difficulty levels of the groups (Cea et al. 2020; Guyon 2015; Laliena and López 2019).

Nonetheless, perfect balance can never be realistically achieved. Therefore, we have examined a scenario to reveal how the proposed family of incentive compatible designs affects competitive balance (Footnote 4). It is assumed that there are two strong teams, one identified correctly with the pre-tournament rank 1, while the other is thought to be only the 13th. The winning probabilities against all other teams are computed as before, according to the model of Csató (2021a). The unseeded tournament formats are uninteresting since the strongest two teams will play against opponents of the same strength on average. However, compared to the correctly identified strong team, the lower-ranked strong team has to face a top team (ranked between 1 and 4 in G66, or between 1 and 6 in G46) in the preliminary round instead of a middle team (ranked between 13 and 16 in G66, or between 13 and 18 in G46).

Fig. 4 Tournament format G46 of the 2007 World Men’s Handball Championship

Figure 7 presents the effect of our proposals on the winning probability of the top 16 teams. Crucially, the impact is marginal; for instance, the choice of the tournament format (G66 or G46) plays a much greater role. Carrying over all points from the preliminary round somewhat worsens competitive balance in design G66 as it favours only the five best teams, including the one mistakenly ranked 13th. However, the comparison of the two strong teams (1 and 13) reveals that carrying over only half of all points somewhat reduces the inequality in group strength. The impact is even less significant for the tournament format G46. Hence, organisers should not worry about the unfairness caused by this family of incentive compatible mechanisms, which guarantees an important theoretical requirement by excluding any instances where a team might be better off by losing.

5 Discussion

Tournament design is an important topic of economics and operations research (Csató 2021b). We have argued that organisers should not miss analysing incentive compatibility because a sporting contest is supposed to be genuine, and is sold to the public as having full integrity. While the actual probability of misaligned incentives can be relatively small, and the audience does not necessarily recognise the problem, it is not worth risking a potential scandal with enormous financial and reputational costs. According to our simulation model, the price of guaranteeing the incentive compatibility of tournaments with multiple group stages is marginal: the use of a fair mechanism essentially does not affect the selective ability and the competitive balance of these tournaments.

Somewhat surprisingly, we have not found any controversy about the particular handball match presented in Sect. 2. Nonetheless, its detection is non-trivial as compared to the football and basketball matches discussed in Sect. 1 because it was enough to make some mistakes in defence or attack, without the need to score own goals. Reasonably, the EHF remained silent on this issue, and the audience obviously did not study the tie-breaking rules carefully. On the other hand, the Polish coach and players probably knew that they should not make great efforts to win by a higher margin. Hopefully, our paper will contribute to placing this game in the category of the notorious “Nichtangriffspakt (Schande) von Gijón” (Footnote 5) (Kendall and Lenten 2017, Section 3.9.1) in the history of sports. A match played by Australia and West Indies in the 1999 Cricket World Cup might be an example of similar tacit collusion or emerging cooperation, too (Kendall and Lenten 2017, Section 3.7.2). However, in contrast to the scenario presented in Sect. 2, this plan, if there was one, did not work out entirely.

Several directions remain open for future research. First, by the quantification of team strengths and the modelling of match outcomes, the probability of situations susceptible to manipulation can be estimated (Chater et al. 2021; Csató 2022; Guyon 2020). Second, strategy-proofness can be considered as another aspect in the comparison of different league formats (Goossens et al. 2012). Third, it is clear that there are various trade-offs between efficiency and fairness, and sports administrators implicitly seem to accept some minimal level of tanking (Pauly 2014). Thus the final aim may be an extensive axiomatic discussion and comparison of sports ranking rules, which has started recently.