1 Introduction

Incentives are an important tool for motivating people to exert effort. In many environments, from the workplace to sporting contests, incentives are put in place to ensure that invested efforts are optimized to achieve predefined goals (Lazear, 2000; Laffont & Martimort, 2002). Changing incentives directly translates into altered performance or success probability (Rosen, 1985; Prendergast, 1999). In contests where the reward depends solely on the outcome of a single event, incentives are provided directly through the potential rewards. In multi-event contests, the translation of contest rewards into incentives for single events is not directly observable. Moreover, single events may be unequally important for obtaining the final rewards, e.g. performance in an interview is rated higher than the previous assessment center test, or the results of previous events lead to momentum for subsequent events.

As this transmission of multi-event contest rewards into incentives reflects the (personal) expected rewards, incentives vary not only between events but also between participants. Understanding these disparate and possibly asymmetric incentives in multi-event contests is essential, as they could lead participants to allocate their efforts strategically (Preston & Szymanski, 2003). Asymmetric individual incentives may have spillover effects on the outcome probabilities of all other participants, which in turn could lead to unbalanced or unfair contests. Operations Research is concerned with fairness in multi-event contests (Rasmussen & Trick, 2008; Kendall et al., 2010; Van Bulck & Goossens, 2020) and has already considered differences in the relevance of single events (Scarf et al., 2009; Goossens et al., 2012; Buraimo et al., 2022).

In this work, we propose a general statistical framework to quantify the importance, or implicit incentive, of single events in complex multi-event contests for each participant individually – the event importance (EI). The EI measures the capability of an event to change a participant’s (expected) reward for a contest. Our approach is based on two steps. First, we calculate the discrete probability distribution for each contestant to reach certain end-of-contest rewards based on an outcome model determining the outcome probabilities for every single event. Second, the importance of a single event is determined through the changes in the end-of-contest reward probabilities with respect to the possible outcomes of this particular event. If the probabilities to reach the final rewards are changed substantially, the importance of this event for the participant is high, and our methodology returns a high event importance measure.

The proposed framework generalizes previous approaches (Schilling, 1994; Scarf & Shi, 2008; Buraimo et al., 2022) in that it is suitable for any contest design and any number of participants and is not specific to any particular contest environment. Moreover, it allows for participant-specific reward structures, and both the reward structure and the schedule can change dynamically during the contest. Crucial for practical usage is that the proposed statistical procedure can also be used in situations in which the importance of the single event potentially plays a role in the determination of the event’s outcome – this can be accounted for by calculating the specific event importance in an iterative procedure.

Our statistical framework can be applied in a variety of practical use cases, like competing pharmaceutical companies developing a drug for the same medical indication, presidential elections, which are held in a series of local elections, a job or promotion contest among applicants or workers, or sports tournaments.

To showcase our methodology, we use our framework in two applications. In the first, we apply the framework to the US presidential primaries to examine the problem of front-loading: earlier elections are known to have a greater impact on the outcome of the nomination process, which is why several states are pushing for earlier election dates. We analyze the Democrats’ electoral schedule for the 2020 primaries and compare it with two alternative hypothetical schedules, one sorted by the number of delegates and one randomized. In this analysis, we find that the positioning of a state’s election in the schedule substantially affects its impact on the outcome of the nomination, as indicated by higher event importance measures. A comparison of the different schedules shows that the problem of front-loading can be mitigated by arranging the schedule according to the number of delegates in the states and eliminated completely by random scheduling.

In a second application of our framework to the double round-robin tournament structure in football leagues, we provide explicit measures of the EI that express implicit incentives for teams. In this setting, the relevance of a particular match with respect to the team’s expected rewards varies substantially, even though every match is actually awarded the same number of points. Hence, the importance of a match varies over the season and between teams. This leads to pairings between teams with potentially very different incentives that change the presumed probabilities of winning.

We demonstrate the meaningfulness of the derived values by analyzing their relationship to various observable characteristics of the matches. The integration of the EI information into a prediction model improves the accuracy of match outcome forecasts. We show that bookmakers do not fully take into account the team-specific importance of events in their prediction model. Furthermore, a positive interrelation can be drawn between the estimated importance of the match and the public’s interest in certain matches in the form of larger stadium attendance and social media engagement for more important matches. For the in-match activity of the players and the outcomes of the match, we observe a comprehensive pattern suggesting that teams approach more important matches with a more aggressive, direct, and successful playing style.

Both the event importance values and replication code for the applications are publicly available on Harvard Dataverse (Goller & Heiniger, 2022). The rest of the paper is structured as follows. Section 2 discusses related literature. Section 3 explains the proposed statistical method. Section 4 applies the framework to the front-loading in US primaries and the application to double round-robin tournaments appears in Sect. 5. Section 6 concludes.

2 Related literature

This work mainly refers to two types of literature, (a) the importance of specific events and attempts to quantify them and (b) the literature on incentives in contests. The literature investigating the role of (explicit) incentives on performance generally finds that higher incentives increase performance (Ehrenberg & Bognanno, 1990; Prendergast, 1999; Lazear, 2000). However, it is important to distinguish between effort- and skill-based tasks; in the latter, strong incentives can lead to performance decrements, a phenomenon known as choking-under-pressure (Ariely et al., 2009; Harb-Wu & Krumer, 2019; Goller, 2023).

Several studies are concerned with fair and balanced schedules for multi-event contests. Rasmussen & Trick (2008) and Kendall et al. (2010) survey previous OR studies investigating different fairness constraints or concerns in scheduling. The major issues are a balanced distribution of (dis-)advantages among contestants (Della Croce & Oliveri, 2006; Durán et al., 2019), availability constraints (Van Bulck & Goossens, 2020), and the rescheduling of single events (Yi et al., 2020). Directly using a measure for the relevance of single events, Goossens et al. (2012) simulate the attractiveness of different multi-event contest designs to analyze which design to implement in practice.

The importance of specific events in multi-event contests is studied in many fields of literature. The order of action literature finds that the outcome of the contest is influenced by the order of the events, which has been shown theoretically (Krumer et al., 2017, 2020, 2023) and empirically, e.g., in musical contests (Ginsburgh & van Ours, 2003), song contests (de Bruin, 2005), or judicial decisions (Danziger et al., 2011). More specifically, several works focus on potential advantages in the first event in (usually sequential) contests, such as in R&D (Harris & Vickers, 1987), sports (Apesteguia & Palacios-Huerta, 2010; Krumer & Lechner, 2017), or elections (Klumpp & Polborn, 2006).

Research documents the differential importance of sequential elections in US presidential primaries. Knight & Schiff (2010) show that surprising wins in early states led to momentum effects in the 2004 primaries, and they find that voters in early and late elections have an unbalanced influence on the final result. Klumpp & Polborn (2006) model this first-winner advantage for primaries (known as the New Hampshire effect), giving an explanation for the more intense campaigning in early elections. Because early events carry more influence in the nomination process, front-loading, i.e. states moving their elections to earlier dates, is well documented (Mayer & Busch, 2003). Ridout & Rottinghaus (2008) find that candidates pay more attention to states the earlier their elections are held. Moreover, they find that scheduling is more important than the delegate count.

The first approaches to determining an event’s importance use rather simplistic measures (Jennett, 1984) and basic contest structures (Schilling, 1994). Even nowadays, trivial (binary) measures are used in studies, e.g. for the relevance of games in basketball (Di Mattia & Krumer, 2023) or jumps in diving (Goller & Späth, 2023). Most influential was Schilling’s (1994) general idea of defining and calculating event importance in terms of how the probability of reaching a final goal varies across different event outcomes. Recently, more sophisticated approaches have emerged. For instance, Scarf & Shi (2008) simulate probabilities of final contest rewards conditional on two different event outcomes. Lahvička (2015) and Buraimo et al. (2022) build on the ideas of Jennett (1984) and Schilling (1994) but simulate final standings in the ranking in a Monte Carlo simulation to estimate the importance of single events (see Footnote 1). A different objective is followed in Geenens (2014), who investigates the importance of a match with regard to its influence on the final contest outcome. This has an interesting use case for investigating contests from the neutral spectator’s perspective but is conceptually very different from the importance of a match for the specific contestant.

The drawback of all the discussed approaches is that they are specific to a certain type of contest that is prevalent in sports, i.e. a fixed number of event outcomes and one specific reward (e.g., winner-takes-it-all contests). This does not encompass more complex or dynamic contest designs and reward structures, which are common in society. The approach we propose in the following section provides the flexibility to handle a variety of practical applications with a variety of contest and reward structures.

3 The event importance

3.1 Introduction to the general framework

The event importance measures the difference between the discrete contest reward probability distributions induced by the possible outcomes of a single event. If the probabilities for the final rewards vary substantially with the differential outcomes of the examined event, its impact on the tournament reward is large, and a high event importance measure is attributed.

To quantify the importance of a particular event, we hence require the probability distribution of the contest rewards conditional on each possible outcome of the investigated event. To determine the probability distributions, our framework sets the outcome of the examined event accordingly and solves the remainder of the contest by successive evaluation of the outcome model. Subsequent to the examined event, whose outcome is set by the framework, all entities (outcomes, covariates, schedule) are subject to the probabilistic outcome model. The successive application of the outcome model until the end of the contest generates the probability distribution for the contest reward conditional on the initial outcome.

There are six valuable attributes of our approach: First, by evaluating the reward probability from the perspective of every contestant individually, the event importance measure is specific to every participant and not the event itself. Second, we do not impose narrow restrictions on the contest setup. Since the contest reward probability distribution is conditional on each possible event outcome, we only need to assume a finite number of contestants, a finite number of possible outcomes for an event, and a finite schedule for the contest. Moreover, the tournament rewards have to be measurable based on the outcomes of all single events.

Third, the reward structure can be any function of all single event outcomes or a final contest ranking if such exists, e.g. close-by ranks can be grouped together if valued equally. Fourth, the framework is not restricted to a specific schedule. As the probability distributions are calculated through a successive evaluation of all events in the contest, the reward probabilities encompass all the essential features of the schedule as well, e.g. early elimination of participants in the contest or differences in information sets induced by events held in parallel or sequentially. Fifth, the framework is not tied to a specific distance metric to calculate the difference in the reward probability distributions. Sixth, if the outcome model is not known, it can be estimated on training data using any well-suited statistical method.

In the following, the details of how the described framework can be implemented to determine the event importance values in a general case are outlined.

3.2 Technical implementation

3.2.1 Notation

This section defines the notation used to describe the general framework. Table 1 summarises the notation as a quick reference. Upper case letters denote random variables, lower case letters denote their realizations or other exogenous variables, and calligraphic letters are sets. Multi-character names, such as EI or function names, are evident choices.

Table 1 Notation

The contest is held along a finite schedule \({\mathcal {T}}\). Because of the implicit chronological ordering of \({\mathcal {T}}\), we can define the notation \({\mathcal {T}}_{t^-}=\bigcup _{t'\le t}\,t'\) and \({\mathcal {T}}_{t^+}=\bigcup _{t'>t}\,t'\), denoting the sub-schedule up to and after time t. Multiple events \(e_{t,i}\) can be held simultaneously at time t, where the subscript i identifies one particular event out of all events that take place at time t. In this case \({\mathcal {e}}_t= \bigcup _i e_{t,i}\). A finite set of contestants \({\mathcal {K}}\) participates in the contest, of which a subset \({\mathcal {K}}_e\subseteq {\mathcal {K}}\) participates in event e. For each event, information about the contestants, i.e. a set of covariates \(x_e=\bigcup _{k\in {\mathcal {K}}_e} x_{e,k}\), and its outcome \(y_{e}=\bigcup _{k\in {\mathcal {K}}_e} y_{e,k}\) is observed. The outcome \(Y_e\) of event e is a random variable that follows a conditional probabilistic outcome model \(\text {out}_e(x_e)= \bigcup _{\varvec{\mathcal {Y}}_e}P[Y_e=y_e|X_e=x_e]\) which, in case \(\text {out}_e()\) is not known, is approximated by \(\widehat{\text {out}}_e()\). In the description of the general framework, we assume without loss of generality that \(\text {out}_e()\) is known and uniform, i.e. \(\text {out}_e()=\text {out}()\). The cases of an approximated outcome model \(\widehat{\text {out}}_e()\) or event-specific outcome models \(\text {out}_e()\) can both be handled in the general framework.

The chronological feature of the events further allows us to define the sets of covariates \(x_{t^-} = \bigcup _{e,t'\le t} x_{e_{t'}}\) and \(x_{t^+} = \bigcup _{e,t'>t} x_{e_{t'}}\) which combine all information on events and participants taking place either up to or after time t. In time t several events can take place at the same time (using the same information set of previously held events). The analogous operation on the outcomes defines \(y_{t^-}\) and \(y_{t^+}\). In settings where parts of the covariates x and/or the schedule \({\mathcal {T}}\) depend on past outcomes, they are generated at run time based on the previous outcomes by \(\left\{ {\mathcal {T}}_{t^+},x_{t^+}\right\} =\text {gen}\left( {\mathcal {T}}_{t^-},x_{t^-},y_{t^-}\right) \). After the full contest \({\mathcal {T}}\), the probability distribution of the final rewards is determined according to the valuation function \(r_{k,y_e}=\text {rew}_k\left( \bigcup _{\mathcal {T}} y_e\right) \) which can be individually specific for every contestant k. The event importance \(\text {EI}_{e,k}=\text {dist}\left( \bigcup _{\varvec{\mathcal {Y}}_{e}}r_{k,y_e},\text {out}(x_e)\right) \) for contestant k in event e is the difference between the multiple probability distributions of the final rewards measured by any distance measure dist(). The distance function can incorporate the outcome probabilities \(\text {out}_e(x_e)\) of the starting event as weights.

3.2.2 Algorithm

Algorithm 1 describes the computation of the event importance value for a competitor k in event \(e_{t,i}\). Readers who are less familiar with the pseudo-code notation can consult the literal translation of the algorithm in Appendix A.1.

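[Algorithm 1: pseudo-code for the computation of the event importance for competitor k in event \(e_{t,i}\)]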
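As an illustration of the kind of computation Algorithm 1 describes, the following Python sketch evaluates the event importance exactly for a toy contest. It is not the authors' implementation: the function names (`reward_distribution`, `event_importance_binary`), the restriction to binary event outcomes and a binary reward, and the best-of-three example are assumptions made purely for illustration.

```python
from itertools import product


def reward_distribution(fixed, n_events, out, rew):
    """Exact reward distribution given the already fixed outcomes `fixed`.

    out(path) -> P[next outcome = 1 | outcomes so far]  (hypothetical outcome model)
    rew(path) -> reward of the contestant for a complete outcome path
    """
    dist = {}
    # Enumerate every possible continuation of the contest.
    for tail in product((0, 1), repeat=n_events - len(fixed)):
        path, prob = tuple(fixed), 1.0
        for y in tail:
            p1 = out(path)  # successive evaluation of the outcome model
            prob *= p1 if y == 1 else 1.0 - p1
            path += (y,)
        reward = rew(path)
        dist[reward] = dist.get(reward, 0.0) + prob
    return dist


def event_importance_binary(history, n_events, out, rew):
    """EI of the next event: difference in P[reward = 1] between its two outcomes."""
    after_win = reward_distribution(tuple(history) + (1,), n_events, out, rew)
    after_loss = reward_distribution(tuple(history) + (0,), n_events, out, rew)
    return abs(after_win.get(1, 0.0) - after_loss.get(1, 0.0))


# Toy contest (assumed): best of three, each game won with probability 0.5.
fair_game = lambda path: 0.5
series_won = lambda path: 1 if sum(path) >= 2 else 0

ei_first = event_importance_binary((), 3, fair_game, series_won)
ei_decided = event_importance_binary((1, 1), 3, fair_game, series_won)
```

In this toy contest the first game changes the probability of winning the series from 0.25 to 0.75, so its importance is 0.5, whereas a third game played after two wins cannot change the reward and has importance 0.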

3.2.3 Approximation of the probability distributions

By the successive evaluation of the outcome model, the probability distribution of the rewards at the end of the contest can be determined, independent of the contest design. However, a large number of sequential events opens an extremely large number of possible outcome paths, which causes numerical problems if their probabilities are evaluated exactly. For an outcome model that depends on past outcomes, the outcome paths can additionally become very complex. For this reason, it is often appropriate to perform a Monte Carlo simulation to approximate the probability distribution of the final rewards. Each run simulates one path for the remaining contest according to the outcome model and the covariates/schedule information generated at run time. With an adequate number \(N_\text {MC}\) of Monte Carlo runs, the calculated empirical probability distribution and hence the event importance values become sufficiently close to the true values.

In a simulation that estimates the EI values for all events consecutively, a chronologically backward iteration over the events allows for the reuse of already evolved paths as they can be merged with the respective previous outcomes to longer paths and thus reduce the computational complexity. In this case, \(N_\text {MC}\) is an upper bound for the number of actually performed runs in each step and, at the same time, a lower bound for the number of runs the event importance estimate is premised on.
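The Monte Carlo approximation described in this subsection can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: events are binary, the reward is binary, the `momentum` outcome model and all function names are hypothetical, and the path reuse from the backward iteration is not implemented.

```python
import random


def simulate_reward(fixed, n_events, out, rew, rng):
    """Simulate one path for the remaining contest and return its reward."""
    path = tuple(fixed)
    while len(path) < n_events:
        path += (1 if rng.random() < out(path) else 0,)
    return rew(path)


def mc_reward_distribution(fixed, n_events, out, rew, n_mc, rng):
    """Empirical reward distribution from n_mc simulated paths."""
    counts = {}
    for _ in range(n_mc):
        reward = simulate_reward(fixed, n_events, out, rew, rng)
        counts[reward] = counts.get(reward, 0) + 1
    return {reward: c / n_mc for reward, c in counts.items()}


def mc_event_importance(history, n_events, out, rew, n_mc=20000, seed=1):
    """Estimate the EI of the next event as a difference in P[reward = 1]."""
    rng = random.Random(seed)
    win = mc_reward_distribution(tuple(history) + (1,), n_events, out, rew, n_mc, rng)
    loss = mc_reward_distribution(tuple(history) + (0,), n_events, out, rew, n_mc, rng)
    return abs(win.get(1, 0.0) - loss.get(1, 0.0))


def momentum(path):
    """Hypothetical outcome model: the previous winner is favoured next time."""
    if not path:
        return 0.5
    return 0.6 if path[-1] == 1 else 0.4


# Reward 1 if the contestant wins a strict majority of all events.
majority = lambda path: 1 if 2 * sum(path) > len(path) else 0

ei_estimate = mc_event_importance((), 5, momentum, majority)
```

Because the outcome model here depends on past outcomes, each simulated path re-evaluates it event by event, which mirrors the run-time generation of covariates described above.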

3.2.4 Iterative approximation of event importance

In many applications, the event importance is an integral part of the outcome model, e.g. when the importance can be interpreted as an incentive for the contestants to provide effort that, in turn, influences the outcome of the event. Independent of whether the outcome model is known or not, it encompasses the event importance values which are not available beforehand.

To determine the unknown event importance values, Algorithm 1 is first executed with an approximate outcome model that does not feature the event importance in the variable set. This returns an initial approximation of the desired EI values. A subsequent iterative application of Algorithm 1 with the full covariate set, including the preliminary EI variables, updates the event importance estimates, accounting for their own impact through the outcome model. This iterative procedure can be continued until a predefined stopping criterion is reached. The application in Sect. 5 is an example of an outcome model which includes the event importance in the covariates. Algorithm 2 in Appendix A.2 illustrates how the iterative procedure is implemented in the context of the application.
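The iterative idea can be illustrated with a small fixed-point loop. The sketch below is an assumption-laden toy and not Algorithm 2 from the appendix: the contest has three binary events, the reward is obtained only by winning all of them, and the hypothetical outcome model `win_prob` raises the winning probability linearly with the event importance.

```python
from itertools import product

N_EVENTS = 3


def rew(path):
    """Hypothetical reward: 1 only if every event is won."""
    return 1 if all(path) else 0


def win_prob(ei_value):
    """Hypothetical outcome model with the EI as its only covariate."""
    return 0.5 + 0.2 * ei_value


def reward_prob(fixed, ei_table):
    """P[reward = 1 | fixed outcomes]; later events follow the outcome model."""
    total = 0.0
    for tail in product((0, 1), repeat=N_EVENTS - len(fixed)):
        path, prob = fixed, 1.0
        for y in tail:
            p1 = win_prob(ei_table.get(path, 0.0))  # EI of the event after `path`
            prob *= p1 if y else 1.0 - p1
            path += (y,)
        total += prob * rew(path)
    return total


def compute_ei_table(ei_table):
    """One pass of the EI computation for every possible history."""
    new = {}
    for t in range(N_EVENTS):
        for hist in product((0, 1), repeat=t):
            new[hist] = abs(reward_prob(hist + (1,), ei_table)
                            - reward_prob(hist + (0,), ei_table))
    return new


def iterate_ei(tol=1e-9, max_iter=50):
    # Initial pass: an empty table means the EI covariate is ignored (p = 0.5).
    ei = compute_ei_table({})
    for _ in range(max_iter):
        new = compute_ei_table(ei)
        change = max(abs(new[h] - ei[h]) for h in new)
        ei = new
        if change < tol:  # stopping criterion
            break
    return ei


ei_final = iterate_ei()
```

In this toy setting the importance of the first event moves from 0.25 in the initial pass to 0.42 and then to 0.448, where it stabilises; the stopping criterion used here (maximum absolute change below a tolerance) is one possible choice among many.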

3.2.5 Distance functions

To measure the difference between the probability distributions of the contest rewards, an appropriate distance function needs to be chosen. In simple applications with only binary event outcomes and a binary reward scheme, the difference between the contest-winning probabilities conditional on the event outcome is a straightforward choice as the distance function.

As soon as a tournament features either multiple possible event outcomes or multiple rewards, simply taking differences between the calculated probabilities is no longer possible. Such complex cases require a statistical distance function. For most settings, the Jensen–Shannon divergence (JSD) is an appropriate distance function to cope with multiple discrete probability distributions (Lin, 1991). It is a common distance measure (Nielsen, 2020) with desirable properties, as it can be applied to any number and size of probability distributions, and it also allows for a weighting scheme. Compared to the Kullback–Leibler divergence, another common distance measure, the JSD is bounded and symmetric. The Bhattacharyya distance is less widely used than the JSD but offers the same valuable properties. For settings that require extreme flexibility in the distance measure, such as non-overlapping distributions or differing probability spaces, the Wasserstein distance can be an appropriate alternative. However, it is highly uncommon for typical tournaments to require such features. The JSD measures the difference in the Shannon entropy between the probability distributions, which implies that it does not have an intuitive linear interpretation. If such an interpretation is of relevance, other candidates, such as the total variation distance, can be applied.
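For concreteness, the weighted JSD for several discrete distributions can be written as the entropy of the weighted mixture minus the weighted average of the individual entropies (Lin, 1991). The sketch below assumes base-2 logarithms, distributions given as equal-length lists over the same rewards, and hypothetical function names; the numbers in the example are invented for illustration.

```python
import math


def entropy(p):
    """Shannon entropy in bits; zero-probability entries contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def jsd(dists, weights=None):
    """Weighted Jensen-Shannon divergence of several discrete distributions."""
    n = len(dists)
    if weights is None:
        weights = [1.0 / n] * n  # equal weights by default
    mixture = [sum(w * d[j] for w, d in zip(weights, dists))
               for j in range(len(dists[0]))]
    return entropy(mixture) - sum(w * entropy(d) for w, d in zip(weights, dists))


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


# Illustrative reward distributions (title, international slot, nothing)
# conditional on a win, a draw, and a loss; weights play the role of out(x_e).
after_win = [0.30, 0.50, 0.20]
after_draw = [0.20, 0.50, 0.30]
after_loss = [0.10, 0.45, 0.45]
ei = jsd([after_win, after_draw, after_loss], weights=[0.5, 0.25, 0.25])
```

With equal weights and two distributions, this measure equals 0 for identical distributions and 1 bit for distributions with disjoint support, which reflects the boundedness mentioned above.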

4 Application: Front-loading in US primaries

4.1 Introduction

Presidential primary elections in the United States are held by the Democratic and the Republican party to determine the presidential election nominees. Both parties follow a similar procedure where each state, every permanently inhabited US territory, and party members living abroad are attributed a certain number of votes (pledged delegates). For ease of readability, all entities are henceforth labeled as states. In addition to the pledged delegates, selected party officials have additional votes (unpledged delegates) that are not tied to states’ election results. In order to be nominated for the presidential elections, the candidates in the primaries must receive the majority of the delegate votes.

Each state holds its election or caucus on an individually chosen date, and the election results determine how its delegates vote. Several states can vote on the same day, e.g. on “Super Tuesday”, about one third of all delegate votes are determined. Due to the partially sequential nature of the primaries, it may happen that later elections become irrelevant to the outcome of the nomination if a candidate has already received more than half of the total delegate votes. Moreover, the first elections are of greater importance as they reveal voters’ preferences and influence later elections through their results. These two features of the electoral process lead to a long-known and unresolved problem of front-loading (Mayer & Busch, 2003; Ridout & Rottinghaus, 2008).

From a state’s perspective, an early election date can increase its influence in the primaries. If all states are considering moving their elections to earlier dates, a solution must be found to regulate the timing of state elections that takes into account the different importance of the dates. Currently, additional delegates are granted for late election dates, but these do not provide sufficient incentive for states to resolve front-loading. In the following analysis, we compare the US Democrats’ 2020 election schedule with two proposed election schedules, namely randomizing the election dates and arranging the states according to their delegate count. The aim of the analysis is to find out whether this leads to a more balanced distribution of the importance of elections for the individual states that is less driven by the timing of the elections.

4.2 Setup

We utilize the actual schedule and reward structure of the 2020 Democratic Party presidential primary elections. The ordering of the elections and the number of delegates awarded in each election are displayed in Table S1 in Supplemental Material Section S-A. For ease of exposition, we simplify the model of the election process by discarding unpledged delegates, implementing only winner-takes-it-all elections, and considering only two candidates \(i\in \{0,1\}\).

The reward function of the contest is given by winning the primaries, i.e. obtaining the majority of the delegates’ votes. We model the state’s election as a representative voter facing a binary choice model with random utility. The utility \(\psi _{i,s}\) of candidate i for state s as characterized by (1) is composed of four components: a) the fixed reputation \(\{\eta _0,\eta _1\}=\{0.5,0\}\) of the candidates, which yields a small constant advantage for one candidate (see Footnote 2), b) the match between state preferences \(\rho _s\sim N(0,1)\) and the candidates’ positions \(\{\rho _0,\rho _1\}=\{-1,1\}\), c) spillover effects \(\zeta _{i,s}\) triggered by above/below average performance of candidate i in preceding state elections, and d) a standard Type I extreme value error term \(\epsilon _{i,s}\). Under the assumption that the representative voter always chooses the candidate with maximum utility, the setting describes a conditional logit model (McFadden, 1974) with outcome probability \(\pi _{i,s}\) that candidate i wins the election in state s as described by Eq. (2). When winning the election in state s, all the state delegates \(y_s\) are attributed to the victorious candidate i by setting \(y_{i,s}=y_s\) and \(y_{1-i,s}=0\). Spillover effects from the results of early states on future elections as defined in (3) occur if the share of obtained delegates’ votes in prior states, measured by the first term in (3), differs from the expected share solely based on the candidates’ reputation, expressed by the second term in (3). The spillover effects are an example of dynamic covariates in the outcome model that have to be re-evaluated when new election results are determined.

$$\begin{aligned} \psi _{i,s}&=\eta _i-\frac{\left|\rho _i-\rho _s\right|^2}{2} + \zeta _{i,s} + \epsilon _{i,s} \end{aligned}$$
(1)
$$\begin{aligned} \pi _{i,s}&=\frac{\exp \left( \psi _{i,s}\right) }{\exp \left( \psi _{i,s}\right) +\exp \left( \psi _{1-i,s}\right) } \end{aligned}$$
(2)
$$\begin{aligned} \zeta _{i,s}&=\frac{\sum _{s'<s} y_{i,s'}}{\sum _{s'<s} \left( y_{i,s'}+y_{1-i,s'}\right) }-\frac{\exp \left( \eta _{i}\right) }{\exp \left( \eta _i\right) +\exp \left( \eta _{1-i}\right) } \end{aligned}$$
(3)

Based on the model for the election process given by Eqs. (1)–(3), the probability for each candidate to win the primaries conditional on the outcome of a single state’s result is determined. Because the number of states and territories is too large to allow an exact numerical calculation, the winning probabilities are approximated by a Monte Carlo simulation using 5000 simulation runs. As suggested in Sect. 3.2.5, we choose the difference in the winning probabilities as the distance function for this contest with binary reward structure, i.e. to be nominated as a presidential candidate or not. This distance function leads to symmetric EI estimates for both candidates. To eliminate the dependency on a particular set of states’ preferences, we simulate 1000 random draws of state preferences.
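To make the simulation concrete, the following sketch implements one reading of Eqs. (1)–(3) and the Monte Carlo estimate of the event importance. Several points are assumptions rather than details confirmed by the text: the logit probability is evaluated on the deterministic part of the utility (without \(\epsilon _{i,s}\)), the spillover is set to zero before any result exists, states vote strictly one after another, a tie does not count as a majority, and the delegate counts in the example are invented. All function names are hypothetical.

```python
import math
import random

ETA = (0.5, 0.0)   # candidate reputations (from the text)
RHO = (-1.0, 1.0)  # candidate positions (from the text)


def spillover(i, won):
    """Eq. (3): obtained delegate share minus reputation-based expected share."""
    total = won[0] + won[1]
    if total == 0:  # assumption: no spillover before the first result
        return 0.0
    expected = math.exp(ETA[i]) / (math.exp(ETA[0]) + math.exp(ETA[1]))
    return won[i] / total - expected


def win_prob(i, rho_s, won):
    """Eq. (2), evaluated on the deterministic part of Eq. (1) (assumption)."""
    v = [ETA[c] - abs(RHO[c] - rho_s) ** 2 / 2 + spillover(c, won) for c in (0, 1)]
    return math.exp(v[i]) / (math.exp(v[0]) + math.exp(v[1]))


def nomination_prob(delegates, prefs, won, first_winner, n_mc, rng):
    """Estimate P[candidate 0 nominated | first_winner wins the examined state].

    delegates/prefs describe the remaining schedule; entry 0 is the examined state.
    """
    majority = (sum(delegates) + sum(won)) / 2
    wins = 0
    for _ in range(n_mc):
        tally = list(won)
        tally[first_winner] += delegates[0]  # outcome of the examined state is set
        for d, rho_s in zip(delegates[1:], prefs[1:]):
            winner = 0 if rng.random() < win_prob(0, rho_s, tally) else 1
            tally[winner] += d  # winner takes all delegates of the state
        wins += tally[0] > majority
    return wins / n_mc


def event_importance(delegates, prefs, won=(0, 0), n_mc=5000, seed=0):
    """Difference in nomination probabilities between the two possible outcomes."""
    rng = random.Random(seed)
    p_if_0 = nomination_prob(delegates, prefs, won, 0, n_mc, rng)
    p_if_1 = nomination_prob(delegates, prefs, won, 1, n_mc, rng)
    return abs(p_if_0 - p_if_1)


# Invented four-state schedule with state preferences drawn from N(0, 1).
pref_rng = random.Random(42)
example_delegates = [40, 400, 200, 100]
example_prefs = [pref_rng.gauss(0.0, 1.0) for _ in example_delegates]
ei_first_state = event_importance(example_delegates, example_prefs, n_mc=2000)
```

Averaging such estimates over many draws of the state preferences would correspond to the repeated sampling described above; that outer loop is omitted here for brevity.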

The three schedules we evaluate are defined as follows: The regular schedule follows the actual election dates and the allocated number of delegates by state. In the random schedule, we randomly permute the ordering of the states while keeping the framework of the schedule fixed. The rank increase scheme ranks the states in increasing order of their number of delegate votes and applies this ordering to the actual schedule framework.
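One way to construct the two alternative schedules is sketched below, under the assumption that "keeping the framework of the schedule fixed" means preserving the number of states voting on each date while reassigning which states fill those slots. The state labels, delegate counts, and function names are hypothetical.

```python
import random


def refill(schedule, ordered_states):
    """Place the states, in the given order, into the slot sizes of `schedule`."""
    it = iter(ordered_states)
    return [[next(it) for _ in day] for day in schedule]


def random_schedule(schedule, rng):
    """Randomly permute the states while keeping the number of states per date."""
    states = [s for day in schedule for s in day]
    rng.shuffle(states)
    return refill(schedule, states)


def rank_increase_schedule(schedule, delegates):
    """Order the states by increasing delegate count within the same date slots."""
    states = sorted((s for day in schedule for s in day), key=lambda s: delegates[s])
    return refill(schedule, states)


# Invented example: three election dates, the second shared by two states.
regular = [["A"], ["B", "C"], ["D"]]
delegate_count = {"A": 50, "B": 10, "C": 30, "D": 20}

ranked = rank_increase_schedule(regular, delegate_count)
shuffled = random_schedule(regular, random.Random(7))
```

Here `ranked` places the smallest state on the first date and the largest on the last, while both alternative schedules keep one, two, and one states on the three dates.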

4.3 Results

Fig. 1 Average event importance estimates over 1000 samples of states’ preferences, with linear fit

Figure 1 shows the average EI estimates over all samples for the three schedule types. The regular schedule of the 2020 Democratic Party primaries (1a) displays the increased importance of the early elections, as the respective states have a higher average EI estimate than the number of delegates allocated to them would suggest. The randomized schedule (1b) reveals a linear relationship between the ability of a state to change the outcome of the primaries and its number of delegates. Since all states will eventually benefit from an early position in the schedule, the positive spillover effects are spread across all states and territories and balance each other out.

Ordering the states by increasing number of delegates (1c) cannot entirely eliminate the first-winner effect but substantially alleviates it. The increased importance of states due to the early election date can be compensated by a smaller number of delegates, which is an option already considered in the allocation of delegates. Because ex-ante randomization requires many repetitions to balance the positional effects, the ordered schedule seems to be a good compromise between practicability and fairness.

Fig. 2 Estimated event importance values for Iowa and California at different hypothetical positions in the election schedule for 400 random samples of states’ preferences

To illustrate the importance of both the first-winner effect and the state’s size, we show in Fig. 2 the estimated EI values for two states, Iowa (41 delegates) and California (415 delegates), at different hypothetical positions in the election schedule. For the very early positions (1 and 4), both states are of considerable importance for the final nomination of the presidential candidates. From position 5, the “Super Tuesday” on which numerous states hold their elections, the capability of the elections to influence the final nomination decreases considerably.

The importance of elections in small states rests on the spillover effects that arise through the first-winner effect. Because of its substantial number of delegates, California remains considerably important for the nomination result in late stages of the schedule, while Iowa’s low number of delegates becomes irrelevant in many realizations. For Iowa in particular, if the election were held in the middle (20) or at the end (50) of the schedule, its importance would be determined only by the possibility that the state’s delegates act as tiebreakers in the nomination if the election race is close.

5 Application: European football leagues

5.1 Introduction

As with many analyses of contests, sports data provide a suitable and well-structured framework for applications because they feature accurate observational data (Kahn, 2000; Bar-Eli et al., 2020). We apply the EI framework to the seven major European football leagues. Those contests have a non-trivial schedule of multiple events, held sequentially or in parallel, between pairings of the participants, and a non-linear reward structure that can vary individually or change throughout the season, all of which can be handled naturally with the proposed framework. With, for example, postponed games leading to changes in the schedule, or supplementary rewards achieved in national cup tournaments changing the reward structure individually, this application is a good showcase for the flexibility of the framework.

Quantifying EI in this context is interesting for several reasons. Contest designers should avoid match-ups that pit contestants with unequal incentive levels against each other. Such matches are potentially more susceptible to bribery, and a lower engagement of certain participants could give an unfair (dis)advantage to participants not even involved in the event itself (Duggan & Levitt, 2002; Elaad et al., 2018). Other valuable use cases of the EI in football tournaments include (a) selecting intense or interesting matches for prime-time broadcast, (b) improving the prediction of winning probabilities, and (c) detecting or avoiding unfair match schedules.

5.2 Setup

5.2.1 Data

We analyze data from the 2006/07 through 2018/19 seasons of seven major European football leagues, namely the German ‘1.Bundesliga’, the Dutch ‘Eredivisie’, the Spanish ‘La Liga’, the French ‘Ligue 1’, the English ‘Premier League’, the Portuguese ‘Primeira Liga’, and the Italian ‘Serie A’. These leagues were the major leagues in Europe in terms of sporting and financial success throughout the studied period. All leagues are designed as double round-robin tournaments, which means that each team plays each other twice - once at each home venue. The rewards are distributed after the season. With the seven analyzed leagues, we cover a variety of different reward structures. A detailed description of the tournament design, league format, and reward structure of the considered European football leagues can be found in Supplemental Material Section S-B.1.

For each individual match, we record a long list of characteristics. These describe the match setting, such as the time or day of the week; characterize the participating teams, such as their success in recent matches and whether they play in international competitions; and summarize metrics of the squad players, e.g. age, height, estimated market value, and preferred foot. The full set of all 133 covariates is listed in Supplemental Material Section S-B.2.

5.2.2 Specific application framework

In this section, we describe how we implement the general framework from Sect. 3 and elaborate on all generic functions outlined in Algorithm 1. Algorithm 2 in Appendix A.2 presents the pseudo-code of the specific framework tailored to this application. At the end of a football season, rewards are allocated to the teams based on their final rank. The areas in the ranking that denote the championship title, qualification for international competitions, and relegation are defined by strict thresholds. We use those boundaries to group all the ranks between two thresholds as a single reward. More detailed information on the reward structures per league and season appears in Supplemental Material Section S-B.3.1. Individual updating patterns of the reward scheme, e.g. because the UEFA Europa League place allocated to the national cup winner is transferred and included in the league’s season rewards, are explained in Supplemental Material Section S-B.3.2.

In Sect. 3, we have outlined how the general framework can be employed for applications with unknown outcome functions and those incorporating the EI itself. Outcomes of football matches do not follow a deterministic rule and can only approximately be described by a probabilistic model. We follow the approach of Goller et al. (2021), using an ordered choice model with three outcome probabilities estimated by the Ordered Forest (Lechner & Okasa, 2019), hereafter abbreviated as ORF. To restrict the number of covariates in the ORF model, we perform a LASSO-based model selection step. Starting from the second iteration, this set of covariates is extended with the previously estimated EI values (as outlined in Sect. 3.2.4). In addition, we also simulate the exact score of the match, drawn from two independent Poisson distributions, as the goal difference often serves as a tie-breaker in determining the final ranking (see Supplemental Material Section S-B.3.3 for more details on the outcome model). The general framework is not restricted to a specific method such as the ORF, and the choice of the underlying outcome model is of second-order importance (consult Supplemental Material Section S-B.4.4 for detailed results).

The choice of the Jensen–Shannon divergence (JSD) as the distance function, specified in equation (4), follows the argumentation in Sect. 3.2.5 for settings with multiple event outcomes and rewards. The JSD is the difference between the Shannon entropy H() of the weighted average of the probability distributions and the weighted sum of the Shannon entropies of the individual probability distributions \(P_H,P_D,P_A\), where the subscript H stands for home win, D for draw, and A for away win. The Shannon entropy H() of a discrete probability distribution on the rewards is the negative sum, over all rewards \(j\!\in \!\{1,...,m\}\), of the product of the respective probability mass \(p_j\) and its natural logarithm. We use a scaling factor of \(1/\ln (3)\) to constrain the EI to the [0,1] interval and weight the probability distributions \(P_i\) by the match outcome probabilities \(\{\pi _H, \pi _D, \pi _A\}\) to account for the likelihood of the three outcome scenarios.

$$\begin{aligned} \text {JSD}_{\pi _H,\pi _D,\pi _A}(P_H,P_D,P_A)=\frac{1}{\ln (3)}\left( {\textbf{H}}\left( \sum _{i\in \{H,D,A\}} \pi _i P_i\right) - \sum _{i\in \{H,D,A\}} \pi _i {\textbf{H}}(P_i) \right) \\ \textit{where}\ {\textbf{H}}(P)=-\sum _{j=1}^m p_j\ln \left( p_j\right) \nonumber \end{aligned}$$
(4)
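
As an illustration of equation (4), the following minimal sketch computes the scaled, weighted Jensen–Shannon divergence for one team and one match. The function names `shannon_entropy` and `event_importance` and the list-based inputs are assumptions made for this example, not the authors' implementation.

```python
import math


def shannon_entropy(p):
    """Shannon entropy (natural log) of a discrete distribution.

    Terms with zero probability contribute 0, following the usual convention.
    """
    return -sum(x * math.log(x) for x in p if x > 0)


def event_importance(pi, dists):
    """Weighted Jensen-Shannon divergence scaled by 1/ln(3), as in equation (4).

    pi    -- match outcome probabilities (pi_H, pi_D, pi_A), summing to 1
    dists -- reward distributions (P_H, P_D, P_A), each a list of m probabilities
    """
    m = len(dists[0])
    # Weighted average of the three conditional reward distributions.
    mixture = [sum(w * d[j] for w, d in zip(pi, dists)) for j in range(m)]
    weighted_entropies = sum(w * shannon_entropy(d) for w, d in zip(pi, dists))
    return (shannon_entropy(mixture) - weighted_entropies) / math.log(3)
```

If the three conditional reward distributions coincide, the match cannot change the team's reward prospects and the value is 0. If the three outcomes are equally likely and each leads with certainty to a different reward, the value reaches its upper bound of 1.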

5.3 Results

5.3.1 Distribution of the estimated values

Fig. 3

Home and away team’s event importance estimates grouped by match day, enumerated in relation to the last match day, for all seasons and all leagues. Square-root transformation applied to the y-axis

Figure 3 shows the distribution of estimated EI values by match days. For the majority of the season, the estimated EI values are concentrated around a value of about 0.01. In other words, most matches are similarly (un)important for the first parts of the season. Deviations can be observed in pairings between teams that are expected to be close in the final end-of-season standings, as in these matches a positive result implies a negative result for the opponent. This behavior changes towards the end of the season, with non-relevant matches and frequent outliers of particularly important matches. The uncertainty about the outcome of the rest of the season is reduced with fewer unknown future results, and the results of individual matches can become more decisive for the end-of-season rewards. This results in more pronounced values of the EI towards the end of the season. As an illustrative example, we show the estimated EI values for the last match day in the 2017/18 German ‘1.Bundesliga’ season in Supplemental Material Section S-B.4.1.

5.3.2 Predicting match outcomes

To shed light on whether the quantified EI has an impact on outcome prediction, we compare the estimates of a ‘baseline’ ordered forest model that does not use the EI information with a ‘richer’ ORF model that includes the estimated EI of both the home and the away team as additional input.

We fit both ORF models on half of the data and predict the outcome probabilities with the two models on the other half. Based on the outcome probabilities, we construct the expected points (ExpP) measure by awarding points to the outcomes according to modern football rules – 3 points for a win, 1 for a draw, and 0 for a loss. This procedure is repeated with swapped training and prediction samples.
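
The expected points measure can be sketched as follows. The function name `expected_points` and the example probabilities are hypothetical; only the 3/1/0 scoring rule is taken from the text.

```python
def expected_points(p_home, p_draw, p_away):
    """Return (home ExpP, away ExpP) under 3 points for a win, 1 for a draw, 0 for a loss."""
    if abs(p_home + p_draw + p_away - 1.0) > 1e-9:
        raise ValueError("outcome probabilities must sum to 1")
    home = 3.0 * p_home + 1.0 * p_draw
    away = 3.0 * p_away + 1.0 * p_draw
    return home, away


# Hypothetical predictions of the two models for the same match.
base_home, base_away = expected_points(0.50, 0.30, 0.20)
rich_home, rich_away = expected_points(0.56, 0.27, 0.17)

# Quantity of interest: change in the home team's ExpP (rich minus base).
diff_home = rich_home - base_home
```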

Fig. 4

The difference in expected points (ExpP) between the model including EI variables (Rich) and the baseline model (Base) by the difference in event importance (EI) between the team and its opponent. Values are rounded to the nearest grid point. Frequency indicates the number of points on a grid point. The red line denotes a GAM with a 95% confidence interval. Expected points are averaged over 100 estimates with different sample splits. Square-root transformation to x-axis applied

Figure 4 displays the difference in expected points between the rich and the baseline model by the difference in the EI values between the two competing teams. The generalized additive model (GAM) fit on the data confirms that teams with a higher absolute difference in EI are attributed a larger absolute shift in ExpP by the richer model, which verifies that the inclusion of the EI variables is relevant for outcome prediction. The variable importance measures of the EI variables in the rich model are shown in Supplemental Material Section S-B.4.3 and provide evidence for the notable contribution of the EI to the outcome model.

5.3.3 Prediction power improvement

In Sect. 5.3.2, we have shown that the estimated EI values are picked up by an enriched outcome model. This raises the question of whether using EI values in an outcome model improves predictive performance.

We compare seven different prediction models to margin-free betting odds of the online betting platform B365, collected from the website www.football-data.co.uk. To ensure comparability with the model predictions, we linearly scale the odds to remove the bookmaker’s margin. The seven models are: the baseline ORF model as described in Sect. 5.3.2 (ORF); the richer model including the EI values (ORF+EI); the richer model additionally including the difference of the EI estimates (ORF+EI+diff); an ORF model with a binary importance measure (ORF+naive); an ORF model that adds three covariates (ORF+add3); an ordered logit model with the baseline variables (Logit); and an ordered logit model including the EI estimates (Logit+EI). The ORF+add3 model includes travel distance, days since the last match of the home team, and days to the next match of the away team to investigate whether a potential improvement is merely induced by a larger set of covariates.
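
A minimal sketch of the margin removal, assuming that "linearly scale" means dividing the inverse decimal odds by their sum (proportional normalization). The function name and the example odds are hypothetical.

```python
def margin_free_probabilities(odds_home, odds_draw, odds_away):
    """Convert decimal odds to outcome probabilities that sum to 1.

    Assumption: the bookmaker's margin is removed by proportional scaling,
    i.e. each inverse odd is divided by the sum of all inverse odds.
    """
    inverse = [1.0 / o for o in (odds_home, odds_draw, odds_away)]
    overround = sum(inverse)  # exceeds 1 when the bookmaker charges a margin
    return [x / overround for x in inverse]


# Hypothetical decimal odds for home win, draw, and away win.
probs = margin_free_probabilities(2.0, 3.5, 4.0)
```

Other margin-removal schemes exist; the proportional one is used here only because it matches the description of a linear scaling.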

To evaluate the out-of-sample prediction accuracy, we randomly split the data into two samples. To not give the proposed models an unfair advantage over the betting odds, this split is performed on the full-season level. On one half of the seasons, the models are fitted; on the other half, the prediction accuracy of the models is measured by the log-likelihood and the Brier score (results for the Brier score can be found in Supplemental Material Section S-B.4.6). In each repetition, we index the accuracy measures by the results of the benchmark betting-odds model to balance any particular characteristics of the chosen sample.
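
The evaluation step can be sketched as below. All function names are hypothetical, outcomes are coded as indices 0 (home win), 1 (draw), and 2 (away win) by assumption, and the multi-class Brier score shown is one common variant that may differ from the definition used in the Supplemental Material. The indexing by the betting-odds benchmark is not reproduced here.

```python
import math
import random


def split_by_season(seasons, rng):
    """Randomly split whole seasons into a fitting half and an evaluation half."""
    shuffled = list(seasons)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def log_likelihood(probs, outcomes):
    """Sum of the log predicted probabilities of the realized outcomes."""
    return sum(math.log(p[o]) for p, o in zip(probs, outcomes))


def brier_score(probs, outcomes):
    """Mean over matches of the squared error summed over the three outcomes."""
    total = 0.0
    for p, o in zip(probs, outcomes):
        total += sum((p[k] - (1.0 if k == o else 0.0)) ** 2 for k in range(len(p)))
    return total / len(probs)


fit_seasons, eval_seasons = split_by_season(
    ["2006/07", "2007/08", "2008/09", "2009/10"], random.Random(0)
)
```

Splitting whole seasons rather than single matches keeps all matches of a season on the same side of the split, as the text requires.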

Fig. 5

Out-of-sample prediction accuracy of different models over 1000 repetitions. Values are quantified in log-likelihood and indexed in each repetition by the performance of the betting odds

Figure 5 shows the out-of-sample prediction accuracy. For the logit model, the addition of the EI results in only a slight improvement, which is probably due to the linearity constraint. Including EI values in the ORF model substantially increases the predictive power, as the additional information contained in the EI variables can be fully utilized, resulting in better performance than the margin-free betting odds. Recording the event importance in a binary variable does not improve the accuracy of the prediction. The model with three added covariates indicates that the increase in prediction power is not just induced by the larger set of covariates.

Fig. 6

The difference in realized and expected points according to betting odds by the difference in event importance between the team and its opponent. Values are rounded to the nearest grid point. Frequency indicates the number of points on a grid point. The red line denotes a GAM with a 95% confidence interval. Square-root transformation applied to the x-axis

To break down the improvement of the EI information over the betting odds, we present in Fig. 6 the difference between the achieved points and the expected points according to the betting odds in relation to the difference in the EI values between the teams and their opponents. A GAM fit on all data points indicates that, particularly across matches where the EI differences are small, the EI can partly explain the mismatch in the betting odds. For larger EI differences, the EI does not provide additional information to the bookmaker’s model. We deduce that the betting odds already cover the unequal incentives when they are particularly pronounced but do not fully account for more subtle disparities in the importance of a match to the competing teams. This is generally in line with and extends the results of Feddersen et al. (2023), who show that bookmakers are aware of the impact of different incentives on the outcome of matches on the final match days. We provide additional evidence on the complementary informational content of the EI measure to the betting odds in Supplemental Material Section S-B.4.5.

5.3.4 Team performance

Besides the usefulness of the EI measure in predictions, we investigate whether the differences in incentives for teams are reflected in in-match statistics that record a team’s on-field behavior and performance. For this, we investigate in-match statistics (provided by Opta) for the 2010/11 through 2018/19 German ‘1.Bundesliga’ seasons with regard to our EI estimates. The team performance data is collected individually for both teams and pooled for the home and away teams. Outcome variables are totals per match, except for ‘Duel win’ and ‘Tackles win’, which are shares. For ease of interpretation, the EI estimates for the home and away teams are each divided into three groups: ‘zero’ (EI = 0), ‘low’, and ‘high’ EI. The threshold for the ‘high’ group is chosen such that its size approximately matches that of the ‘zero’ group, which accounts for 5% of the data.
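
The grouping rule can be sketched as follows. The function name `ei_categories` and the example values are hypothetical, and ties at the threshold are assigned to the ‘high’ group by assumption, so group sizes match only approximately.

```python
def ei_categories(values):
    """Assign each EI value to 'zero', 'low', or 'high'.

    The 'high' threshold is set so that the number of 'high' observations
    roughly equals the number of observations with EI exactly 0.
    """
    n_zero = sum(1 for v in values if v == 0)
    positive = sorted(v for v in values if v > 0)
    k = min(n_zero, len(positive))
    # The k largest positive values form the 'high' group.
    threshold = positive[-k] if k > 0 else float("inf")
    return [
        "zero" if v == 0 else ("high" if v >= threshold else "low")
        for v in values
    ]


# Hypothetical EI estimates for eight team-match observations.
groups = ei_categories([0, 0.01, 0.02, 0.5, 0, 0.03, 0.9, 0.04])
```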

The analysis follows a two-step procedure: First, we run a fixed-effect regression for every outcome individually using the combinations ‘Team \(\times \) home/away \(\times \) season’ and ‘Opponent \(\times \) home/away \(\times \) season’ as fixed effects. The resulting residuals are then centered and standardized by ‘Team \(\times \) home/away \(\times \) season’. On those scaled residuals, we run a regression on the grid of the EI categories ‘zero’, ‘low’, and ‘high’ for both competing teams. Figure 7 shows the results for four outcomes: duels per game, number of completed passes, number of goals scored, and number of goals conceded. Complementary results using other outcomes are shown in Supplemental Material Section S-B.4.7.
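
The second step can be sketched as follows, taking the first-step residuals as given (the fixed-effect regression itself is not reproduced). The sketch assumes a fully saturated specification on the category grid, in which case the regression coefficients equal the differences between each cell mean and the mean of the baseline cell. The function names and the use of the population standard deviation are assumptions, and confidence intervals are omitted.

```python
from collections import defaultdict
from statistics import mean, pstdev


def standardize_by_group(residuals, groups):
    """Center and scale residuals within each group (e.g. team x home/away x season)."""
    by_group = defaultdict(list)
    for r, g in zip(residuals, groups):
        by_group[g].append(r)
    stats = {g: (mean(v), pstdev(v)) for g, v in by_group.items()}
    scaled = []
    for r, g in zip(residuals, groups):
        mu, sd = stats[g]
        # Groups without variation are mapped to 0 to avoid division by zero.
        scaled.append((r - mu) / sd if sd > 0 else 0.0)
    return scaled


def category_effects(values, own_cat, opp_cat, baseline=("low", "low")):
    """Cell means relative to the baseline cell on the own-by-opponent EI grid."""
    cells = defaultdict(list)
    for v, a, b in zip(values, own_cat, opp_cat):
        cells[(a, b)].append(v)
    base = mean(cells[baseline])
    return {cell: mean(v) - base for cell, v in cells.items()}
```

With a baseline of (‘low’, ‘low’), the returned value for, say, (‘high’, ‘low’) is the average scaled residual of teams with high importance facing an opponent with low importance, relative to matches in which both teams have low importance.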

Fig. 7

Linear regression estimates of the centered and standardized residuals of different outcomes on the event importance (EI) categories. 95% confidence intervals are in parentheses. The baseline is the ‘low’ by ‘low’ category

The results can be summarised as follows: Teams for which a match is particularly important (i) play more aggressively, entering more duels on the pitch, (ii) play more directly towards the goal with fewer passes, fewer touches, and more entries into the final third and penalty area, and (iii) score more goals. In contrast, teams with zero importance exhibit a more passive style of play and concede more goals.

5.3.5 Public perception

Sport is entertainment and thrives on public perception. If the calculated EI can represent the (later realized) public interest in a specific match, it could be useful for several purposes – marketing, ticket pricing, or prime-time broadcasting selection. With this in mind, we relate the EI to stadium attendance as well as social media attention. While attendances are officially reported by clubs, social media attention is captured through Twitter account mentions (e.g. @LFC, the official Twitter account of the English team Liverpool FC) and match hashtags (e.g. #BVBS04, which relates to the match between Borussia Dortmund and Schalke 04) within the 24 h before kickoff. Due to the inconsistent and sparse use of these proxies in the early years and across the leagues, we can only perform this analysis beginning with the 2014/15 season and must exclude the Spanish and Portuguese leagues.

The procedure is analogous to the residual analysis in Sect. 5.3.4. As different clubs have put different emphases on social media, and this has changed over time, we control for the team- and season-specific usage of social media. Figure 8 presents the point estimates and 95% confidence intervals of the linear regressions using the Twitter and attendance data (both in logs) as outcomes. Stadium attendance is modestly associated with the home team’s importance in the match. Here, restrictions on stadium capacity and (pre-sold) season tickets could mitigate the effect. Thus, social media attention might give a clearer picture of realized interest. We find that team account mentions are strongly associated with the respective team’s EI measure. Similarly, the match-tag mentions increase with both teams’ EI. This is consistent with and complementary to Dobson & Goddard (1992) and Lei & Humphreys (2013), who report higher stadium attendance for more important sporting events, and to recent findings by Buraimo et al. (2022) that Premier League television audiences are larger for more important matches.

Fig. 8

Linear regression estimates of the centered and standardized residuals of different outcomes on the event importance (EI) categories. 95% confidence intervals are in parentheses. The baseline is the ‘low’ by ‘low’ category

6 Conclusion

Public perception and academic research analyze incentives in simple situations where there is a direct link between performance and reward. More complex situations with indirect rewards and therefore unclear implicit incentive structures have received little attention.

In this article, we propose a statistical method to quantify the importance of single events in multi-event contests with end-of-contest reward structures. Thanks to its flexibility and generality, the procedure covers a multitude of potential applications and can be valuable for various fields, including sales and marketing, human resources, or operations management.

Our event importance framework can be adapted to different contest structures seen in society and opens a variety of potential research topics, such as behavioral responses involving implicit incentives or operational concerns associated with the scheduling of a contest. These include, for example, different valuations due to the order of actions or asymmetric incentives that lead to distorted probabilities of winning in a contest.

In an application to European football leagues, we show the association of the quantified importance of a match with the in-match behavior and performance of the teams. As discrepancies in the EI can lead to altered outcome probabilities, the quantification of the event importance can help to ensure fair tournaments. The event importance measure is also relevant for other stakeholders in the football industry. As we show that the EI measure is consistent with the public interest in terms of social media and stadium attendance, it can be useful for dynamic ticket pricing or for TV stations that want to broadcast the most attractive match. Lastly, we illustrate the value of the EI measure for predicting match outcomes and point out under which circumstances the bookmakers do not yet account for the event’s importance.

For the application to the US presidential primaries, we quantify the higher relevance of early election dates induced by the first-winner effect. For small states with a low number of delegates, this can substantially boost their influence on the nomination outcome, as otherwise their votes become irrelevant in many of the primaries. We show that the two investigated hypothetical schedules lead to a more equitable distribution of the ratio of event importance values to the number of delegates awarded by the election.