1 Introduction

The development of advanced statistical methods to forecast team and player performance (e.g. Adhikari et al. 2020; Galariotis et al. 2018; Yang et al. 2014) or match results (e.g. Diniz et al. 2019; Friesl et al. 2020; Koopman and Lit 2015; Nikolaidis 2015) has recently attracted the interest of many researchers in statistics and econometrics. Since betting on sport events is based on a guess about the future match outcome, statistical models able to improve decision making are of interest to both bettors and odds setters.

In the context of football, the most popular bet is based on whether one expects that a team will win, lose, or draw the next game. Accordingly, many different methodologies have been developed with the aim of constructing profitable betting strategies on the basis of statistical forecasts for the final match outcome.

The result of a football match depends on the number of goals scored and conceded by the two teams involved. Since the number of goals scored (as well as the number conceded) is commonly assumed to follow a Poisson distribution, the mainstream statistical approach for forecasting the match result is based on Poisson regressions (e.g. Angelini and De Angelis 2017; Koopman and Lit 2015; Maher 1982).

Indeed, Maher (1982) proposed to model the goals scored by two teams by means of two independent Poisson distributions. Therefore, the final match outcome can be predicted by means of the forecasts on the total goals scored by the two opponent teams. The idea of using the model of Maher (1982) for the development of a profitable betting strategy is due to Dixon and Coles (1997) and Dixon and Pope (2004).
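As an illustration of how two independent Poisson goal counts translate into win/draw/lose probabilities, the following is a minimal sketch and not the estimation procedure of Maher (1982). The function names, the truncation at 15 goals and the scoring rates 1.5 and 1.2 are assumptions chosen only for the example.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def outcome_probabilities(lam_home, lam_away, max_goals=15):
    """Home-win, draw and away-win probabilities under two independent
    Poisson goal counts, truncating each count at max_goals (assumption)."""
    home = [poisson_pmf(k, lam_home) for k in range(max_goals + 1)]
    away = [poisson_pmf(k, lam_away) for k in range(max_goals + 1)]
    p_home = p_draw = p_away = 0.0
    for h, ph in enumerate(home):
        for a, pa in enumerate(away):
            joint = ph * pa  # independence of the two goal counts
            if h > a:
                p_home += joint
            elif h == a:
                p_draw += joint
            else:
                p_away += joint
    return p_home, p_draw, p_away

# Illustrative scoring rates (hypothetical values, not estimates).
p_home, p_draw, p_away = outcome_probabilities(1.5, 1.2)
```

In practice the two rates would come from a fitted Poisson regression; here they are fixed by hand only to show the mapping from goal distributions to match outcomes.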

Other authors, such as Goddard and Asimakopoulos (2004), Goddard (2005), Forrest and Simmons (2000) and Forrest et al. (2005), have adopted ordered probit or logit regressions to directly model either the outcomes or the probabilities of future match results.

However, nowadays the sports betting market offers a variety of alternatives to the classical “win/draw/lose” bet. For example, it is possible to bet on the event that more (or fewer) than a certain number of goals will be scored in the next match, without guessing the final result. This bet is called over (under). Further, a bettor can bet on whether or not both teams will score at least one goal within the game (Goal/No Goal). Lastly, it is also possible to bet on the occurrence of a red card within the football match.

Exactly as in the case of the “classical bet”, a bettor is naturally interested in forecasting the probability that these alternative events will occur. Nevertheless, despite their potential relevance, forecasting models for these bets have been little studied, with some very recent exceptions (e.g. Wheatcroft 2020).

Moreover, while some of these events (e.g. Under/Over or Goal/No Goal) can be predicted by means of forecasts of the number of goals that will be scored by each team, other events are binary outcomes by nature (i.e. the occurrence of a red card or of a penalty within the match). Nevertheless, the aforementioned non-binary events, such as Goal/No Goal and Under/Over, can also be seen as binary outcomes from the bettor’s point of view.

Indeed, in these two examples the bettor has only two possible choices: either the event “Goal” will occur or the event “No Goal” will. The same applies to the Under/Over bet.

Therefore, for each team we can construct a binary variable that takes value 1 if both teams scored at least one goal in a given match and 0 otherwise, in the case of the Goal/No Goal bet. Similarly, we can build a binary variable that takes value 1 if at least three goals have been scored in a given match and 0 otherwise, as in the case of the Under/Over 2.5 bet.
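The construction of these binary variables can be sketched as follows. The function names and the four example scorelines are hypothetical and serve only to illustrate the coding of the two bets.

```python
def both_teams_score(home_goals, away_goals):
    """Goal/No Goal: 1 if both teams scored at least one goal, 0 otherwise."""
    return int(home_goals >= 1 and away_goals >= 1)

def over(home_goals, away_goals, threshold=2.5):
    """Under/Over: 1 if the total number of goals exceeds the threshold."""
    return int(home_goals + away_goals > threshold)

# Hypothetical scorelines (home goals, away goals).
matches = [(2, 1), (0, 0), (3, 0), (1, 1)]

goal_no_goal = [both_teams_score(h, a) for h, a in matches]
over_25 = [over(h, a, 2.5) for h, a in matches]
over_15 = [over(h, a, 1.5) for h, a in matches]
```

With a half-integer threshold such as 2.5, "more than 2.5 goals" and "at least three goals" describe the same event, so either wording leads to the same binary series.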

By considering them as binary bets, in this paper we provide a simple framework to obtain accurate forecasts by implementing appropriate statistical models for binary time series.

On the one hand we have the classical generalized autoregressive moving average (GARMA) models (Benjamin et al. 2003), which are generalized linear models for time series data, including binary ones. On the other hand, more recently Creal et al. (2013) developed a new dynamic model called generalized autoregressive score (GAS), which has already been applied successfully in the context of football match forecasting (e.g. Koopman and Lit 2015, 2019). Score-driven models are observation-driven, which means that the likelihood is available in closed form. This allows for fast estimation despite the challenges involved with the use of high-dimensional models.

The dynamic characteristic of the class of score-driven models lies in the fact that the time-varying coefficients are updated by means of an autoregressive component and the score of the conditional observation density.

By assuming that the binary outcomes of football matches are Bernoulli trials, in this paper we show that the GAS model can be successfully applied also for predicting the probability that a binary event will occur in the next match.

To this aim, we provide empirical evidence from two football championships: the English Premier League and the Italian Serie A. For both national leagues, we collect the last 13 seasons of matches and use the last two to assess the predictive ability of the statistical models under consideration.

Once probability forecasts are obtained, we build a profitable betting strategy that can be summarized as follows: if the probability of a given event is higher (lower) than the average probability for both the teams, there is a suggestion of betting on the outcome 1 (outcome 0).

For example, suppose that we aim to forecast the occurrence of a red card in the next match. If the probability of observing a red card is higher (lower) than the average probability of observing a red card for both teams, the bettor should bet on the event “red card: yes” (“red card: no”). In the case of uncertainty, no bet should be placed. This simple betting strategy allows the bettor to avoid betting on matches that are too uncertain, which can considerably improve the returns of a betting strategy (e.g. Angelini and De Angelis 2017). Indeed, the fact that betting advice is not always provided allows the bettor to save money and to increase his/her net profit.

We find that, for the event “red card: yes or no”, the proposed betting strategy with GAS probability forecasts has an accuracy, computed as the proportion of correct bets over the total, equal to 100% for the English data and 94% for the Italian data.

In the case of Goal/No Goal events, instead, the accuracy is on average equal to 93% in both experiments, while for some football teams it is close to or equal to 100%.

Lastly, in the case of Under/Over we consider different thresholds. In particular, bets on the events Under/Over 1.5, Under/Over 2.5 and Under/Over 3.5 are considered. The difference lies in the threshold on the number of goals scored in the match (one in the first case, two in the second and three in the third). The betting accuracy is around 90% on average for the English Premier League for all the considered thresholds. In the case of the Italian Serie A, instead, the betting accuracy is always higher than 80% (94% in the case of Under/Over 1.5) and is always much better than that of the considered alternatives.

Therefore, we would recommend the proposed strategy for betting purposes.

The rest of the paper is organized as follows. In Sect. 2 the statistical methods for modeling binary time series are presented, together with more details about the proposed score-driven approach based on the Bernoulli conditional density. Sect. 3 is devoted to the data used for the empirical experiments with the English Premier League and the Italian Serie A, and the betting strategy is also discussed in more detail. Section 4 presents the results of the empirical experiments. Some final remarks conclude the paper.

2 Forecasting binary time series

As previously discussed, some of the available bets on a soccer match are binary (the occurrence of red cards or penalties are clear examples), while others can be seen and treated as binary. More generally, in many applications in finance or macroeconomics it is common to construct binary variables from observed continuous ones (e.g. Christoffersen and Diebold 2006; Harding and Pagan 2011; Hausman et al. 1992; Nyberg 2010; Startz 2008).

Since a team’s performance depends on the results achieved in recent matches, assuming the independence of matches played by the same team over time is not reasonable (Footnote 1). The existence of this persistence effect implies an autocorrelation structure in the event probabilities for the same football team, thus justifying the use of a time series approach in forecasting football match outcomes (Angelini and De Angelis 2017).

For modeling binary time series, several authors have proposed particular extensions of Generalized Linear Models (e.g. Davis et al. 2003; Benjamin et al. 2003; Moysiadis and Fokianos 2014). One of the most general approaches is the so-called generalized autoregressive moving average (GARMA) model of Benjamin et al. (2003), which can be used to model a variety of time-dependent response variables (e.g. data with Bernoulli, Poisson, negative binomial, or binomial distributions) as well as continuous data. The GARMA model is a direct GLM extension of the usual autoregressive moving average (ARMA) model.

In the GARMA model we assume that the conditional distribution of each observation \(y_t\), with \(t = 1, \dots , T\), belongs to the same exponential family given the previous information set \({\mathcal {F}}_t\):

$$\begin{aligned} f\left( y_{t} \mid {\mathcal {F}}_t\right) =\exp \left\{ y_{t} \vartheta _{t}- b\left( \vartheta _{t}\right) +c_t\right\} , \end{aligned}$$
(1)

where the function \(b(\cdot )\) defines the specific member of the exponential family of interest and \(c_t\) is a sequence of constants. The terms \(E\left( y_{t} \mid {\mathcal {F}}_{t}\right) \) and \({\text {var}}\left( y_{t} \mid {\mathcal {F}}_{t}\right) \) represent the conditional mean and variance of \(y_{t}\) given \({\mathcal {F}}_{t}\). We can express a general model for \(\eta _t = g(\mu _t)\), without exogenous covariates, as follows (Benjamin et al. 2003):

$$\begin{aligned} \eta _t = g\left( \mu _{t}\right) =\sum _{j=1}^{p} \phi _{j} g\left( y_{t-j}\right) + \sum _{j=1}^{q} \theta _{j}\left\{ g\left( y_{t-j}\right) -g\left( \mu _{t-j}\right) \right\} \end{aligned}$$
(2)

where \(g\left( \cdot \right) \) is the link function while the \(\phi _j\)’s and \(\theta _j\)’s are the equivalent of the autoregressive and moving average parameters in the standard ARMA. In this case, the parameters \(\phi =\left( \phi _1, \phi _2, \dots , \phi _p\right) \) and \(\theta =\left( \theta _1, \theta _2, \dots ,\theta _q \right) \) can be estimated by maximum likelihood. For certain functions \(g(\cdot )\) it may be necessary to replace \(y_{t-j}\) with a transformation \(y_{t-j}^{*}\) in (2) in order to avoid the nonexistence of \(g\left( y_{t-j}\right) \) for certain values of \(y_{t-j}\). Clearly, the kind of transformation \(y_{t-j}^{*}\), if needed, depends on the particular function \(g(\cdot )\).

In the case of binary time series, 0 and 1 are the only two possible values that \(y_t\) can take: the value 1 indicates that some event occurs, and 0 that it does not. A common special case of GARMA for binary time series is the so-called Binary Autoregressive Moving Average (BARMA) model, extensively studied by Li (1994) and Wang and Li (2011) and commonly implemented for macroeconomic applications (e.g. Startz 2008; Kauppi and Saikkonen 2008), which can be written as:

$$\begin{aligned} \eta _t= g\left( \mu _{t}\right) =\phi _1 g\left( y_{t-1}\right) + \theta _1 \left\{ g\left( y_{t-1}\right) -g\left( \mu _{t-1}\right) \right\} \end{aligned}$$
(3)

where the function \(g\left( \cdot \right) \) is the logit, to ensure that \(\mu _t\), the conditional probability of a success at time point t, takes only values between 0 and 1. As in the general GARMA case, the parameters \(\phi _1\) and \(\theta _1\) of the BARMA model can be estimated by maximum likelihood, but it is also common to estimate them by means of quasi-maximum likelihood (Zeger and Qaqish 1988).
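To make the recursion in (3) concrete, the sketch below filters a 0/1 series with a BARMA(1,1) update for given parameter values. It is only an illustration under stated assumptions: since the logit is undefined at 0 and 1, each lagged observation is replaced by a transformed value \(y_{t-1}^{*}\) clipped to \([c, 1-c]\), where the constant c, the starting value and the parameter values are hypothetical choices not taken from the paper.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))

def barma_probabilities(y, phi, theta, mu_init=0.5, c=0.1):
    """Run recursion (3) over a 0/1 series.

    probs[k] is the conditional probability of a success for the match
    that follows y[k]; probs[-1] is the one-step-ahead forecast.
    """
    mu_prev = mu_init  # assumed starting value for the recursion
    probs = []
    for y_prev in y:
        # y* transformation: keep g(y) finite (assumption: clip to [c, 1 - c]).
        g_y = logit(min(max(y_prev, c), 1.0 - c))
        eta = phi * g_y + theta * (g_y - logit(mu_prev))
        mu_prev = inv_logit(eta)
        probs.append(mu_prev)
    return probs

# Hypothetical outcomes and parameter values.
barma_probs = barma_probabilities([1, 0, 0, 1, 1], phi=0.5, theta=0.2)
```

In an application, \(\phi _1\) and \(\theta _1\) would be chosen by maximizing the Bernoulli likelihood of the observed series rather than fixed by hand.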

By assuming that the binary outcomes of football matches are Bernoulli trials, an alternative and promising possibility for predicting the conditional probability of success is the generalized autoregressive score (GAS) model of Creal et al. (2013), which has proved useful in predicting football matches (Koopman and Lit 2019).

The GAS model can be formalized as follows. Let us consider a team’s match outcomes over time \((y_t:t=1, \dots , T)\), with \(f_t\) the time-varying parameter vector at time t. Let us also assume that the information set available at time t, called \({\mathcal {F}}_t\), is given by a collection of previous realizations of the parameters \(f_t\) and the outcomes \(y_t\). Finally, let \(\theta \) denote a vector of static parameters.

The starting point of the GAS model is the observation density of \(y_t\):

$$\begin{aligned} y_t \sim p(y_{t} | f_t, {\mathcal {F}}_t; \theta ), \end{aligned}$$
(4)

Then, given two integers \( 0 \le p , q \le T-1\), we can formalize the GAS(p, q) model for the t-th realization \(f_t\) of the time-varying parameter vector as:

$$\begin{aligned} f_t = \omega + \sum _{i=1}^{p} {\mathbf {A}}_i s_{t-i} + \sum _{j=1}^{q} {\mathbf {B}}_j f_{t-j} \end{aligned}$$
(5)

where \(\omega \) is a real vector and the \({\mathbf {A}}_i\)’s and the \({\mathbf {B}}_j\)’s are real matrices of appropriate dimensions. All the scalar parameters in \(\omega , {\mathbf {A}}_1, \dots , {\mathbf {A}}_p, {\mathbf {B}}_1, \dots , {\mathbf {B}}_q\) are collected in the vector \(\theta \) introduced in (4); \(s_t\) is a scaled score of the conditional distribution defined in (4), and it is a function of the data and the parameters, so that:

$$\begin{aligned} s_t = S_t\cdot \nabla _t \end{aligned}$$
(6)

where \(S_t = S_t(f_t, {\mathcal {F}}_t; \theta )\) is a positive definite scaling matrix known at time t and \(\nabla _t (y_t, f_t,{\mathcal {F}}_t; \theta )\) is the score of \(y_t\) evaluated with respect to \(f_t\), i.e.:

$$\begin{aligned} \nabla _t = \frac{\partial \log p(y_{t} | f_t, {\mathcal {F}}_t; \theta )}{\partial f_t} \end{aligned}$$
(7)

Since the score depends on the complete density and not only on some moments of \(y_t\), the GAS(p, q) model uses the full density structure for updating \(f_t\). Different choices of the scaling matrix \(S_t\) lead to different model specifications. Creal et al. (2013) suggested a scaling that depends on the variance of the score. In particular, the authors proposed to set the scaling matrix \(S_t\) equal to the inverse of the information matrix of \(f_t\) raised to a power \(\gamma \ge 0\):

$$\begin{aligned} S_t = {\mathcal {I}}^{-\gamma } \end{aligned}$$
(8)

with \({\mathcal {I}}\):

$$\begin{aligned} {\mathcal {I}} = E \left[ \nabla _t \nabla _t' \right] \end{aligned}$$
(9)

with \(\nabla _t\) defined as in (7). The parameter \(\gamma \) is fixed by the user and usually takes a value in the set \(\{0,\frac{1}{2}, 1\}\). When \(\gamma \) is set equal to zero, \(S_t\) is the identity matrix, \(S_t = {\mathbf {I}}\), and there is no scaling. If instead \(\gamma =1\), the conditional score \(\nabla _t\) is premultiplied by the inverse of its covariance matrix, \({\mathcal {I}}^{-1}(f_t)\), while with \(\gamma =\frac{1}{2}\) it is premultiplied by the square root of that inverse. The GAS(1, 1) model can therefore be written as:

$$\begin{aligned} f_t = \omega + {\mathbf {A}}_1 s_{t-1} + {\mathbf {B}}_1 f_{t-1} \end{aligned}$$
(10)

It is also possible to write Eq. (10) in the case of a vector \(f_t = (f_{1,t}, \dots , f_{K,t})\) by means of:

$$\begin{aligned} \omega = \begin{pmatrix} \omega _1 \\ \vdots \\ \omega _K \end{pmatrix} \text {,} \quad {\mathbf {A}} = \begin{pmatrix} a_1 &{} \dots &{} 0 \\ \vdots &{} \ddots &{} \vdots \\ 0 &{} \cdots &{} a_K \end{pmatrix} \quad \text {and} \quad {\mathbf {B}} = \begin{pmatrix} b_1 &{} \dots &{} 0 \\ \vdots &{} \ddots &{} \vdots \\ 0 &{} \cdots &{} b_K \end{pmatrix} \end{aligned}$$

In the case of football matches, the binary outcomes come from a single trial since a match is played just once within the same match-week. Therefore we can assume that these binary observations \(y_t\) follow a Bernoulli distribution (Lu 2020):

$$\begin{aligned} p\left( y_t | \pi _t \right) = \pi _t^{y_t} \left( 1-\pi _t\right) ^{1-y_t} \end{aligned}$$
(11)

where \(y_t \in \{ 0, 1 \}\) and \(\pi _t\) is the time varying probability. Within the GAS(1, 1) model, following Mesters and Koopman (2012) and Lit and Koopman (2020), we can specify the unobserved time-varying probability \(\pi _t\) as follows:

$$\begin{aligned} \eta _t&= g\left( \pi _{t}\right) ,\nonumber \\ \pi _t&= \omega + a \pi _{t-1} + b s_{t-1} \end{aligned}$$
(12)

where the link function \(g\left( \cdot \right) \) is the logit link, such that \(\pi _t\) takes values between 0 and 1. The driving force behind the updating equation is, exactly as in Eq. (10), the scaled score innovation \(s_t\), where \(\pi _t\) is the only time-varying parameter \(f_t\). In this case, the information set consists of lagged values of \(\pi _t\). The unknown coefficients, i.e. the constant \(\omega \), the autoregressive parameter a and the score updating parameter b, can be estimated by maximum likelihood.
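The following sketch shows one way a Bernoulli GAS(1, 1) filter and its log-likelihood could be coded for given parameter values. It rests on assumptions that are not stated in the text: the recursion is run on the logit scale \(f_t = g(\pi _t)\) so that the implied probability stays in (0, 1), the filter is started at \(\omega /(1-a)\), and the score is scaled with \(\gamma = \frac{1}{2}\) by default. For the logit link the score with respect to \(f_t\) is \(y_t - \pi _t\) and the information is \(\pi _t(1-\pi _t)\). The function name and the example values are hypothetical.

```python
import math

def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))

def gas_bernoulli(y, omega, a, b, gamma=0.5):
    """Filter a 0/1 series with a Bernoulli GAS(1,1) recursion on the logit scale.

    Returns (probs, loglik): probs[t] is the probability used for y[t] and
    probs[len(y)] is the one-step-ahead forecast.
    """
    f = omega / (1.0 - a)  # assumed initialisation; requires |a| < 1
    probs, loglik = [], 0.0
    for y_t in y:
        pi = inv_logit(f)
        probs.append(pi)
        # Bernoulli log-density: y log(pi) + (1 - y) log(1 - pi).
        loglik += y_t * math.log(pi) + (1 - y_t) * math.log(1.0 - pi)
        score = y_t - pi           # derivative of the log-density w.r.t. f
        info = pi * (1.0 - pi)     # information w.r.t. f
        s = score / info ** gamma  # scaled score innovation
        f = omega + a * f + b * s  # autoregressive term plus score update
    probs.append(inv_logit(f))
    return probs, loglik

# Hypothetical outcomes and parameter values.
y = [1, 1, 0, 1, 0, 0, 1, 1]
probs, loglik = gas_bernoulli(y, omega=0.0, a=0.9, b=0.3)
flat_probs, _ = gas_bernoulli(y, omega=0.2, a=0.5, b=0.0)
```

Maximum likelihood estimation would then amount to maximizing `loglik` over \(\omega \), a and b with a numerical optimizer. Setting b to zero switches off the score update, so the filtered probability stays constant.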

3 Data and betting strategy

The empirical experiment aims at predicting the match outcomes in the next round of both the considered football competitions, the English Premier League and the Italian Serie A. The considered binary outcomes are the following events: the occurrence of a red card, whether both teams score at least one goal (Goal/No Goal), and whether the total number of goals scored is below or above a certain threshold (Under/Over).

The decision to bet on whether an event will occur or not is based on probability forecasts, obtained by means of the statistical models presented in the previous section. The probability predictions obtained from the alternative statistical models are compared in terms of the average proportion of correct bets over the total, which we call betting accuracy.

For each competition, the total data set consists of the last 13 seasons of football match results. We have partitioned the data set into the in-sample seasons from 2008–2009 to 2018–2019, which are used for the parameter estimation step, and the last two seasons (2019–2020 and 2020–2021) as the out-of-sample ones, which are used to assess the forecasting performance. In doing so we use a recursive forecasting scheme (Koopman and Lit 2019).

Table 1 Example of the dataset for the English Premier League

After each season, the teams ranked in the last three positions are relegated and three new teams are promoted to the higher competition (from Championship to Premier League in England and from Serie B to Serie A in Italy). Therefore, to keep the number of teams constant, we consider the teams that participated in the last season. The data used in our empirical study can be found at http://www.football-data.co.uk.

Table 1 shows how the dataset for the English Premier League is built.

While one variable (red) is binary by nature, the Goal/No Goal and the Under/Over variables have been transformed into binary events. Indeed, we construct a dummy variable “both teams score” (Goal/No Goal), which takes value 1 if both teams score at least one goal in the match and 0 otherwise. Similarly, we construct the binary variable “Over: yes”, which takes value 1 if the sum of the goals scored by the two teams exceeds the threshold (1.5, 2.5 or 3.5) and 0 otherwise. The descriptive statistics for the Premier League are reported in Table 2.

Table 2 English Premier League: descriptive statistics

Descriptive statistics show that, on average, in the English Premier League the home teams score 1.5 goals and the away teams 1.2. Hence, to a certain extent the home teams are more likely to win a match (probably due to the so-called home bias effect; Goossens et al. 2012; Csató 2020). Moreover, in almost 50% of the matches both teams score and there are no fewer than three total goals. In 75% of the matches more than one goal is scored, while in only 30% of the matches more than three goals are scored. Finally, red cards occurred in only 0.7% of the matches within the sample.

Similarly to the Premier League, we also collected the last 13 seasons of the Italian Serie A. The dataset looks like the one in Table 1. The descriptive statistics of the Italian Serie A matches are shown in the Table 3.

Table 3 Italian Serie A: descriptive statistics

The goal statistics are very similar to those of the English Premier League, so that in the Italian Serie A the home teams are also more likely to win a match. However, Table 3 highlights that red cards are more frequent in the Italian league than in the English one. The other statistics are instead very similar.

In order to produce betting recommendations for all the matches of the current season for the two national leagues considered, we take advantage of the statistical models presented in Sect. 2.

In more detail, by means of equations (3) and (12), we are able to forecast the probability that a certain binary event will occur (event “yes”) in the next match or not (event “no”). We denote this probability for the i-th team by \({\hat{\pi }}_{i, t+1}\).

Before computing the probability forecasts for match \(t+1\), all the parameters in (3) and (12) are estimated using all data up to time t. Therefore, the first forecasts of all matches at time \(t+1\) (i.e. the first match of the season 2019–2020) are based on the parameter estimates from the data of the previous seasons, from 2008–2009 to 2018–2019.

In order to decide on what event to bet, we consider the following simple betting rule:

$$\begin{aligned} \text {if}&\quad {\hat{\pi }}_{i,t+1} \ge {\bar{\pi }}_{i} \quad \text {and} \quad {\hat{\pi }}_{j,t+1} \ge {\bar{\pi }}_{j} \quad \hat{y}_{t+1}=1 \nonumber \\ \text {if}&\quad {\hat{\pi }}_{i,t+1} \le {\bar{\pi }}_{i} \quad \text {and} \quad {\hat{\pi }}_{j,t+1} \le {\bar{\pi }}_{j} \quad \hat{y}_{t+1}=0 \nonumber \\ \text {if}&\quad {\hat{\pi }}_{i,t+1} \ge {\bar{\pi }}_{i} \quad \text {and} \quad {\hat{\pi }}_{j,t+1} < {\bar{\pi }}_{j} \quad \text {no bet} \nonumber \\ \text {if}&\quad {\hat{\pi }}_{i,t+1} \le {\bar{\pi }}_{i} \quad \text {and} \quad {\hat{\pi }}_{j,t+1} > {\bar{\pi }}_{j} \quad \text {no bet} \end{aligned}$$
(13)

In other words, if the forecast probability of an event is greater (lower) than the average probability \({\bar{\pi }}\) for both teams, the bettor should bet on the event “yes” (event “no”). Otherwise, in the case of uncertainty, the bet should be avoided.

Betting avoidance on uncertain events is a useful feature of the proposed strategy. The fact that betting advice is not always provided allows the bettor to save money and to increase his/her net profit. Indeed, to evaluate the net profit of a betting strategy one should consider the difference between the amount of money earned and the expenses faced by the bettor. Clearly, betting avoidance considerably reduces the bettor’s expenses.
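The rule in (13) can be sketched as a small function. The function name and the numerical inputs are hypothetical. Note that the first two conditions in (13) overlap when both forecasts are exactly equal to their averages; the sketch resolves that tie in favour of the event “yes”, which is an assumption.

```python
def betting_rule(pi_i, pibar_i, pi_j, pibar_j):
    """Return 1 (bet on 'yes'), 0 (bet on 'no') or None (no bet), following (13)."""
    if pi_i >= pibar_i and pi_j >= pibar_j:
        return 1  # both forecasts at or above their averages
    if pi_i <= pibar_i and pi_j <= pibar_j:
        return 0  # both forecasts at or below their averages
    return None   # the two teams disagree: uncertain match, no bet

# Hypothetical forecasts and average probabilities for teams i and j.
bet_yes = betting_rule(0.60, 0.50, 0.55, 0.50)
bet_no = betting_rule(0.40, 0.50, 0.30, 0.50)
no_bet = betting_rule(0.60, 0.50, 0.40, 0.50)
```

Returning `None` for uncertain matches keeps them distinguishable from a bet on the event “no”, which matters when the accuracy is later computed only over the bets actually placed.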

Following the score-driven model literature, the average (unconditional) probability \({\bar{\pi }}\) for the team i can be computed by means of:

$$\begin{aligned} {\bar{\pi }}_i = \frac{{\hat{\omega }}}{1-\hat{a}} \end{aligned}$$
(14)

where \({\hat{\omega }}\) and \(\hat{a}\) are the maximum likelihood estimates of the constant and of the autoregressive parameter in equation (12) for the i-th football team; the expression follows from taking expectations in (12), since the scaled score has zero mean. The same applies to the j-th team.

For the next round of football matches \(t+2\) and its probability forecasts \({\hat{\pi }}_{i, t+2}\), we re-estimate both the parameter vector and the average probability after including the football match results of the most recent round in our data set. Hence we have an expanding estimation sample, i.e. we follow a recursive forecasting approach, to ensure that we use as much data as possible in the estimation phase (Koopman and Lit 2019).
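The expanding estimation sample can be sketched as a generator of (estimation sample, next round) pairs. The function name and the toy round labels are hypothetical; in an application each estimation sample would be passed to the model-fitting step before forecasting the next round.

```python
def expanding_window(rounds, first_test_index):
    """Yield (estimation_sample, next_round) pairs with a growing sample."""
    for t in range(first_test_index, len(rounds)):
        # All rounds observed so far are used for estimation;
        # the following round is the one to be forecast.
        yield rounds[:t], rounds[t]

# Toy example: rounds 1-5 are in-sample, rounds 6 and 7 are out-of-sample.
rounds = [1, 2, 3, 4, 5, 6, 7]
splits = list(expanding_window(rounds, first_test_index=5))
```

Each new split adds the most recent round to the estimation sample, so no observation available at the time of the forecast is discarded.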

As a benchmark for (13), which is based on the forecasts of models (3) and (12), we consider a naive betting strategy where the bettor supposes that, looking at the favorite team (Footnote 2), the outcome realized at time t will also be realized at time \(t+1\), so no forecasts are used. In other words, the naive betting strategy can be written as:

$$\begin{aligned} \text {if}&\quad y_{i,t} = 1 \quad \hat{y}_{t+1}=1\nonumber \\ \text {if}&\quad y_{i,t} = 0 \quad \hat{y}_{t+1}=0 \end{aligned}$$
(15)

The naive strategy (15) is also necessary to assess whether statistical models are really useful in predicting binary outcomes. If the naive strategy has a lower accuracy than (13), which is based on statistical forecasts, then we can argue that the models discussed in Sect. 2 should be used for this purpose.

Once a bet indication is obtained by means of either (13) or (15), we can evaluate a loss function to measure the betting accuracy. The betting accuracy is computed ex-post as the proportion of the correct bets over the total:

$$\begin{aligned} \text {ACC}_{i,k}=\frac{\text {Correct Bets}_{i,k}}{\text {Tot. Bets}_{i,k}} \end{aligned}$$
(16)

for the k-th bet (i.e. red card, Goal/No Goal or Under/Over) for the matches played by the i-th team, given that we make a correct bet if:

$$\begin{aligned} \{\hat{y}_{t+1}=1 \quad \text {and} \quad y_{t+1}=1\} \quad \text {or} \quad \{\hat{y}_{t+1}=0 \quad \text {and} \quad y_{t+1}=0\} \end{aligned}$$
(17)

where \(\hat{y}_{t+1}\) is the predicted outcome for the next match \(t+1\) according to (13) and \(y_{t+1}\) is the realized one. Finally, we highlight that the football matches without any bet recommendation are excluded from the computation of (16).
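The accuracy measure in (16) and (17), together with the naive benchmark in (15), can be sketched as follows. The function names and the toy series are hypothetical; matches without a recommendation are coded as `None` and skipped, and the first match of the naive strategy has no previous outcome and therefore no bet (an assumption).

```python
def betting_accuracy(bets, outcomes):
    """Proportion of correct bets among the bets actually placed, as in (16)."""
    placed = [(b, y) for b, y in zip(bets, outcomes) if b is not None]
    if not placed:
        return None  # no bet was placed, accuracy is undefined
    correct = sum(1 for b, y in placed if b == y)  # condition (17)
    return correct / len(placed)

def naive_bets(outcomes):
    """Rule (15): bet that the previous outcome repeats."""
    return [None] + list(outcomes[:-1])

# Toy example with hypothetical outcomes and model recommendations.
outcomes = [1, 0, 0, 1, 1]
model_bets = [1, None, 0, 1, 0]

model_acc = betting_accuracy(model_bets, outcomes)
naive_acc = betting_accuracy(naive_bets(outcomes), outcomes)
```

Because the denominator counts only placed bets, a strategy that abstains often can show a high accuracy on few matches, so the number of bets is worth reporting alongside the accuracy.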

4 Forecasting accuracy

In this section the performance of the betting strategy is discussed. Since the matches of the last two seasons are used for out-of-sample evaluation, we have a total of \(N=1560\) matches for the out-of-sample assessment for both national leagues.

4.1 Red card

First of all, we aim to predict the binary outcome “red card: yes/no”. Since the probability of a red card is generally low, most bookmakers pay a small amount of money for the “red card: yes” bet (Footnote 3). In other words, guessing an extremely rare event generally results in a low payout.

Table 4 shows the betting accuracy of the proposed strategy for the English Premier League matches. The probability predictions needed to implement the strategy (13) are obtained, as explained in Sect. 2, by means of the BARMA and the GAS models based on the Bernoulli distribution. These models are also compared with the naive approach in (15).

Clearly, the out-of-sample performance is positive, with a very high betting accuracy, close to 100% of correct bets on average, for both statistical models. However, the score-driven approach shows a slightly higher accuracy rate, equal to 100%, than the BARMA, whose accuracy is equal to 99.8%. The naive approach performs well in this case, but it is the worst among the considered alternatives.

We also study the betting accuracy of the proposed strategy based on probability forecasts with the Italian Serie A data (see Table 5).

In this second empirical experiment we still observe a very high betting accuracy level, well above 90%. In more detail, the strategy based on the GAS forecasts reaches an accuracy equal to 94%, while the one based on BARMA forecasts achieves an accuracy equal to 97%. The naive strategy has an accuracy of around 95%.

In general, both approaches are very effective in predicting football match outcomes in the case of red cards, even if in the case of the Italian Serie A the benefit seems to be lower.

Moreover, the accuracy of the statistical models varies across football teams. In the case of the GAS model, for example, while for some teams the betting accuracy is really high (e.g. Sassuolo, Napoli and Udinese with 100% of correct bets), for some others the accuracy is lower than 90% (e.g. Genoa with 84% and Parma with 83%). Indeed, the overperformance of the naive and BARMA-based forecasts is mainly due to their lower variability across the teams in the sample.

For example, in the case of Genoa, the GAS-based forecasts yield an accuracy equal to 88% (94% on average), while the BARMA forecasts reach an accuracy greater than 95%. Therefore, it is reasonable to assume that combining both forecasting approaches would in this case make it possible to substantially increase the betting accuracy.

A final comment concerns the teams not included in the samples in Tables 4 and 5. The main reason for the exclusion of a team is the lack of observations needed to generate the probability forecasts. For example, Benevento and Spezia for the Serie A, or Leeds and Sheffield United for the Premier League, have been excluded since they participated only in the current season or on only a few occasions.

Overall, the very high accuracy level of the proposed statistical models ensures an enormous advantage from the point of view of a bettor.

Table 4 Red card betting accuracy—English Premier League
Table 5 Red card betting accuracy—Italian serie A

4.2 Goal and no goal

Let us now consider the model’s predictive ability with respect to the event “both teams score at least a goal”, also called “Goal/No Goal”. To the best of our knowledge, like the red card outcome, “Goal/No Goal” is another bet that has been poorly explored by the sports forecasting literature.

As mentioned previously, summing the number of goals does not return a binary variable. However, bets on this event involve a binary choice: if a bettor believes that both teams will score a goal, whatever the final result will be, she bets on “Goal”; otherwise she bets on “No Goal”.

Hence, we construct a binary variable with the aim of forecasting the probability of the “yes” event defined as \({\hat{\pi }}_{i,t+1}\).

The betting accuracy for the English Premier League is shown in Table 6. Differently from the red card outcome, in the case of Goal/No Goal the superiority of the score-driven approach is evident. Indeed, the average betting accuracy is equal to 92% while in the case of BARMA forecasts the accuracy is less than 50%. The naive approach performs even poorer with 47% of accuracy. Therefore, in this case the usage of statistical models, especially the score-driven one, is very useful.

In the case of score-driven forecasts, for some teams the percentage of accurate bets is even equal to 100% (e.g. Arsenal,, Manchester United and the West Ham) but for all the teams it is greater than 80% (the only exception is given by the West Brom that has an accuracy of 72%).

On the other side, the results for the Italian Serie A are shown in the Table 7. In this case the superiority of the score-driven approach is even more evident with respect the experiment with Premier League data. Indeed, the average of correct bets is greater than 93%, while for the strategy based on BARMA forecasts it is equal to 40%. The naive strategy in this case performs better than BARMA-based forecasts since the average accuracy is equal to 52%.

For the Italian data, the betting accuracy is always greater than 80% for the GAS-based forecasts while is always lower than 50% for the BARMA ones. Moreover, for some teams the BARMA approach shows also an accuracy level lower than 40% (e.g. Bologna and Lazio with 33% or Sassuolo with 27%) with the best result equal to 50% (in the case of Genoa football team). On the side of the GAS-based forecasts, instead, we also observe many teams with an accuracy levels equal to 100% (e.g. Fiorentina, Inter and Napoli).

Finally, as in the previous experiment, the matches without any betting recommendation have been excluded from the computation of the betting accuracy. In the case of the Premier League only Leeds has been excluded, while in the case of Serie A only Spezia. Clearly, this result is due to the short time series available for these teams.
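The accuracy computation with excluded matches can be sketched as follows. This is a minimal illustration that assumes recommendations are coded as "yes", "no" or `None` (no advice); the function name `betting_accuracy` and the example values are hypothetical.

```python
def betting_accuracy(recommendations, outcomes):
    """Share of correct bets among the matches with a recommendation.

    Matches whose recommendation is None (no betting advice) are excluded.
    Returns None when no bet was placed at all.
    """
    placed = [(r, o) for r, o in zip(recommendations, outcomes) if r is not None]
    if not placed:
        return None
    correct = sum(1 for r, o in placed if r == o)
    return correct / len(placed)


# Hypothetical recommendations and realized events (illustrative values only).
recs = ["yes", None, "no", "yes", None]
realized = ["yes", "no", "no", "no", "yes"]

print(betting_accuracy(recs, realized))  # 2 correct bets out of 3 placed
```

A team for which every recommendation is `None` yields no accuracy figure, which corresponds to the situation described for the teams with short time series.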

Table 6 Goal/No Goal betting accuracy—English Premier League
Table 7 Goal/No Goal betting accuracy—Italian Serie A

4.3 Under and over

The last experiment considers the Under/Over bet for both national leagues. Unlike the previous two cases, forecasting the Under/Over event has recently been explored by Wheatcroft (2020), although with a completely different approach.

As in the case of “Goal/No Goal”, the “Under/Over” is not a binary variable. However, we construct three different binary variables that take the value 1 if more than a certain threshold of goals has been scored in a match (Over) and 0 otherwise (Under). The selected thresholds are equal to one (Under/Over 1.5), two (Under/Over 2.5) and three (Under/Over 3.5) goals, respectively.
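The three indicators can be sketched as below, assuming that the bet refers to the total number of goals scored in the match by both teams. The function name `over_indicators` and the goal counts are hypothetical.

```python
def over_indicators(total_goals, thresholds=(1.5, 2.5, 3.5)):
    """For each threshold, return 1 (Over) if total goals exceed it, else 0 (Under)."""
    return {th: [int(g > th) for g in total_goals] for th in thresholds}


# Hypothetical total goals per match (illustrative values only).
totals = [0, 2, 3, 5, 1]

indicators = over_indicators(totals)
print(indicators[1.5])  # [0, 1, 1, 1, 0]
print(indicators[2.5])  # [0, 0, 1, 1, 0]
print(indicators[3.5])  # [0, 0, 0, 1, 0]
```

Since goal counts are integers, "more than 1.5 goals" is the same as "at least two goals", which is why the thresholds one, two and three goals correspond to Under/Over 1.5, 2.5 and 3.5.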

Indeed, betting agencies nowadays offer the possibility of betting on these thresholds. We expect that the predictive power of the proposed strategy for the Under/Over event will be the same regardless of the threshold. For the sake of completeness, we report the results for all three Under/Over thresholds.

The percentage of correct bets for the English Premier League data is shown in Table 8. The score-driven approach ensures a stable level of average accuracy, close to 90% regardless of the threshold. The BARMA-based forecasts, instead, are competitive only in the Under/Over 1.5 scenario, showing an accuracy equal to 80%. Nevertheless, even in this best case their performance is much lower than that of the GAS-based forecasts. This result is confirmed for the other two bets. Indeed, in the case of Under/Over 2.5 the average accuracy of the BARMA model drops dramatically to 42%, versus 89% for the GAS. In the case of Under/Over 3.5 the BARMA has an accuracy equal to 73%, versus 91% for the GAS model. The naive strategy is, instead, competitive only for the Under/Over 1.5 event, with an average accuracy of 70%, but it still remains the worst strategy among the considered alternatives. In the other two scenarios, the naive strategy also performs poorly. Therefore, also in this case the statistical forecasts and the proposed betting strategy are useful.

In more detail, for the GAS forecasts there are some teams whose accuracy is equal to 100% (e.g. Leicester) and many others with an accuracy higher than 90% (e.g. Sheffield and Tottenham with 97%), but there are some exceptions (e.g. Newcastle and Wolves with 77%) that push down the average accuracy for the English Premier League in the case of Under/Over 1.5. In the case of Under/Over 2.5, instead, the number of teams with the maximum accuracy of 100% is even higher (e.g. Everton, Leicester and Tottenham), but some outliers such as West Brom (with a poor 60%) push down the average accuracy. In the case of Under/Over 3.5 the accuracy is higher for most of the considered teams.

For the BARMA-based predictions, instead, the accuracy is rarely above 80% in the case of Under/Over 1.5, while it is dramatically low for all the teams in the case of Under/Over 2.5. A similar pattern holds for Under/Over 3.5. In general, the BARMA model is not competitive at all with the GAS one.

The results for the Italian Serie A are shown in Table 9. Also with the Italian data the superiority of the score-driven approach appears evident. Indeed, in the case of Under/Over 1.5 the GAS shows an average accuracy of 94%, versus 81% for the BARMA and 70% for the naive strategy. For the Under/Over 2.5 bet, instead, the accuracy of the GAS is extremely high with respect to the alternatives, since it is almost double (86% versus 42% for the BARMA and 53% for the naive strategy). The BARMA performs worse than the naive strategy in this case. Finally, for Under/Over 3.5 the BARMA accuracy rises and becomes higher than that of the naive strategy (65% versus 54%), but both alternatives perform worse than the GAS model (82%).

Once again, the statistical models discussed in Sect. 2 are able to predict binary outcomes and should be used for this purpose.

Also in this case a great heterogeneity across football teams can be highlighted. Considering the Under/Over 1.5 bet, the strategy (13) with GAS-based forecasts reaches 100% accuracy for some teams (e.g. Crotone and Sassuolo) and, with one exception (i.e. Torino with 84%), all the teams have an accuracy higher than 90%. For the BARMA, instead, no team reaches 90% accuracy, and the accuracy is lower than 80% in the case of the naive betting strategy, which does not employ forecasting at all.

The Under/Over 2.5 bet highlights some outliers in terms of accuracy also for the GAS-based forecasts (e.g. Sampdoria with 68%), but the accuracy is still much higher than that of the alternatives. In the case of BARMA forecasts, the accuracy for most of the teams is lower than 50%, with some teams (e.g. Juventus) at 33%. Indeed, as noted above, the naive strategy performs better than the BARMA. Therefore, setting the GAS aside, one would consider the use of statistical techniques useless on the basis of these results.

Finally, the Under/Over 3.5 case is peculiar in terms of heterogeneity. The GAS model is still the best one, with a much higher average accuracy, but the heterogeneity is greater than in the other two scenarios. For example, there is a large outlier (i.e. Lazio) with an accuracy equal to 47%, which reduces the average to around 82%. The other two strategies, instead, do not present such outliers, and the percentage of correct bets is similar across the teams.

As in the previous experiments, Leeds and Spezia are the only teams without any bet recommendation. This is due to the short length of the time series associated with these two football teams.

In general, the outperformance of the betting strategy based on GAS forecasts is evident in both experiments also for the Under/Over event.

Table 8 Under/Over betting accuracy—English Premier League
Table 9 Under/Over betting accuracy—Italian Serie A

5 Conclusions

Forecasting football match results is important since such predictions can be used to construct profitable betting strategies. Even if the most popular bets are based on whether one expects that a team will win, lose, or draw in the next game, nowadays it is possible to bet on a variety of other outcomes. While some of these events are binary in nature, others can be recast as binary outcomes. In this paper we aim to provide a simple framework to obtain accurate forecasts for binary outcomes in football matches.

In statistics, a common approach for modeling and forecasting binary time series is the Binary Autoregressive Moving Average (BARMA) model, a special case of the class of generalized linear models designed for time series data, called generalized autoregressive moving average (GARMA) models. In a similar fashion, another useful approach for analyzing binary time series is the generalized autoregressive score (GAS) model.

The GAS model, to which we refer as the score-driven approach, updates the time-varying parameters by means of an autoregressive component and the conditional score. Although it has already been successfully applied to football match forecasting, by predicting the number of goals scored and conceded by a team, the novelty of this paper lies in its application to the prediction of binary outcomes, which are still poorly explored in the sports statistics literature.
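To make the score-driven update concrete, the sketch below filters a binary series with a generic first-order GAS recursion for a Bernoulli variable. It is an illustration under stated assumptions, not the exact specification of Sect. 2: it assumes a logit link, unit scaling of the score (so that the score reduces to the observation minus the current probability), and hypothetical parameter values `omega`, `alpha` and `beta` that would in practice be estimated by maximum likelihood.

```python
import math


def gas_bernoulli_filter(y, omega, alpha, beta, f0=0.0):
    """Generic first-order score-driven filter for a binary series (sketch).

    f is the logit of the success probability. With a logit link and unit
    scaling (both assumptions), the conditional score is y_t - p_t.
    Returns the in-sample probabilities and the one-step-ahead forecast.
    """
    f = f0
    probs = []
    for obs in y:
        p = 1.0 / (1.0 + math.exp(-f))  # current probability of the "yes" event
        probs.append(p)
        score = obs - p  # score of the Bernoulli log-likelihood w.r.t. f
        f = omega + alpha * score + beta * f  # score plus autoregressive update
    p_next = 1.0 / (1.0 + math.exp(-f))
    return probs, p_next


# Hypothetical binary series and parameter values (illustrative only).
y = [1, 1, 0, 1, 1, 1]
in_sample, forecast = gas_bernoulli_filter(y, omega=0.0, alpha=0.8, beta=0.9)
print(round(forecast, 3))
```

Each observed "yes" pushes the next probability up and each "no" pushes it down, while `beta` controls how persistently past information is carried forward.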

In particular, we aim to forecast the future probability that a binary event will occur. On the basis of these probability forecasts, we define a suitable betting strategy that compares the predicted probabilities, estimated by either the GAS or the BARMA model (the latter used as a benchmark), with their average. If, for both teams, the probability that the event “yes” (“no”) occurs is greater than its average value, a betting recommendation on the event “yes” (“no”) is provided. In other words, the proposed strategy recommends betting on either “yes” or “no”, depending on which one is more likely than usual to occur for both teams. Otherwise, there is no betting advice.

The main idea of the proposed betting strategy is to bet only on matches where the probability estimated by the statistical model is larger than its average, meaning that the event is more likely to occur than usual. Hence, a useful feature of the proposed strategy is that betting advice is not always provided, so that the bettor can avoid overly uncertain football matches and also save money.
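The decision rule can be sketched as follows. This is a minimal illustration that assumes the reference average of each team is the mean of its past estimated probabilities, and that the "no" probability is one minus the "yes" probability, so that "no" is more likely than usual exactly when the "yes" forecast falls below its average. The function name `recommend` and all numerical values are hypothetical.

```python
from statistics import mean


def recommend(pi_a, pi_b, past_a, past_b):
    """Return "yes", "no" or None (no advice) for a match between teams A and B.

    pi_a, pi_b: forecast probabilities of the "yes" event for the two teams.
    past_a, past_b: past estimated probabilities, used here to compute the
    reference averages (an assumption of this sketch).
    """
    avg_a, avg_b = mean(past_a), mean(past_b)
    if pi_a > avg_a and pi_b > avg_b:
        return "yes"  # "yes" is more likely than usual for both teams
    if pi_a < avg_a and pi_b < avg_b:
        return "no"  # "no" is more likely than usual for both teams
    return None  # the two teams disagree: too uncertain, no bet


# Hypothetical past probabilities for two teams (illustrative only).
past_a = [0.50, 0.60, 0.40]
past_b = [0.30, 0.50, 0.40]

print(recommend(0.70, 0.60, past_a, past_b))  # yes
print(recommend(0.30, 0.20, past_a, past_b))  # no
print(recommend(0.70, 0.20, past_a, past_b))  # None
```

Returning `None` when the two teams point in different directions is what allows the bettor to skip matches that the forecasts regard as too uncertain.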

Then, in order to assess if statistical models are really useful in predicting binary outcomes, a naive strategy has been implemented as well.

To show the usefulness of the proposed statistical method, two empirical experiments, on the English Premier League and on the Italian Serie A, are provided for predicting red cards, Under/Over (with three different thresholds) and Goal/No Goal events.

As shown in the empirical analysis, the GAS model outperforms the benchmark approaches represented by the BARMA and the naive strategy. Therefore, a first positive finding concerns the usefulness of forecasting binary outcomes for bettors and odds setters.

In more detail, while in the case of red card forecasting the models are competitive with each other (the GAS dominates, but its advantage over the naive strategy is not large), in the other experiments with Goal/No Goal and Under/Over the betting strategy based on score-driven forecasts results in much more accurate bets. Accuracy is measured ex post as the percentage of correct bets over the total number of bets.

Finally, we stress the advantages of a strategy that avoids betting on overly uncertain events. Indeed, the absence of a recommendation, based on probabilistic reasoning, makes it possible to increase the betting accuracy and to reduce the betting costs. Therefore, it can be seen as a strength of the proposed approach.

Understanding the determinants of incorrect bets is difficult a priori. However, a future research direction could be the inclusion of additional match statistics or performance indicators (e.g. team market value as a proxy for team strength) as additional predictors in the GAS model. Considering additional covariates could perhaps further increase the betting accuracy. Nevertheless, we highlight that the basic GAS model specification discussed in the paper, without covariates, already reaches 100% correct bets for some teams, also thanks to the betting avoidance principle.