1 Introduction

It is well known that the number of goals scored by a team in a football game is well approximated by a Poisson distribution (e.g., Chu, 2003). Building on this, many statistical models have been discussed that assume the scores are generated from a Poisson distribution with a particular intensity parameter (Izumi & Konaka, 2016; Saraiva et al., 2016; Koopman & Lit, 2015). The intensity parameter of the Poisson distribution can be regarded as an index of team performance.

Izumi and Konaka (2016) discussed the two-stage championship held in the 2015 and 2016 seasons of the Japan Professional Football League (J.League) Division 1 and proposed several probability models for the number of goals. Saraiva et al. (2016) proposed a model in which team performance and home advantage determine the number of goals, and applied it to the professional football leagues in England and Brazil. However, both studies assumed that teams’ performances do not vary within a season. Koopman and Lit (2015) proposed a model in which the intensity parameter of the next game is determined stochastically from that of the previous game, as in a Markov model. They treated time as discrete.

The varying coefficient models proposed by Hastie and Tibshirani (1993) have been applied in fields such as medical science and social science. For example, Izumi et al. (2015, 2017) proposed a method to visualize the transition of the effects of binary covariates for count data and demonstrated it with the Peace Declaration of Hiroshima and Nagasaki. As these studies show, varying coefficient models can be used to estimate effects that depend on time or spatial position.

Izumi and Obata (2018) discussed introducing the varying coefficient model into the analysis of football data. Based on their discussion, this paper considers the continuous-time transition of team performance by introducing the varying coefficient model. The model can be used to smooth count data, taking the observation time as the x-axis and the observed number of goals gained/lost as the y-axis, to extract a continuous performance curve. We propose visualizing the estimated transition of team performance obtained from semi-parametric regression with linear spline basis functions.

We can also use our model to predict game results. In the middle of the season, our model can estimate the performance transition function up to that time. Based on this estimated performance function, both teams’ offensive/defensive performances in the next game can be predicted. The predicted performances enable us to evaluate the probabilities of the number of goals and of the game outcome. Obata and Izumi (2018) considered using the estimated performances to predict the outcome of a game. This paper proposes a method to predict the outcome probabilities of football games.

This paper is organized as follows. Section 2 reviews the varying coefficient model. In Sect. 3, we propose a model to estimate the time transition of team performances through a season and show an example of its application to the data of J.League Division 2 in the 2018 season. In Sect. 4, our model is applied to estimate the performances up to the middle of the season. We propose a method for predicting the probabilities of the outcomes of the succeeding game using these estimated performances. The overall appropriateness of the prediction results is also discussed. Section 5 concludes this paper.

2 Varying coefficient model

Let us consider response count data \(y(t_1),y(t_2),\dots ,y(t_{n-1}),y(t_n)\) observed at n observation times \(t_1<t_2<\dots<t_{n-1}<t_n\), respectively. We assume that the response data \(y(t_1),\dots ,y(t_n)\) are observed values of a time-varying random variable Y(t) following a Poisson distribution \(\text {Po}(\lambda (t))\), where the intensity parameter \(\lambda (t)\) varies with time t. If the responses depend on p time-varying covariates \(a_1(t),\dots ,a_p(t)\), the Poisson regression model can be written as

$$\begin{aligned} E(Y(t)|a_1(t),\dots ,a_p(t))&= \lambda (t|a_1(t),\dots ,a_p(t)) \nonumber \\&= \exp \left[ \sum _{j=1}^p \beta _j(t) a_j(t) \right] , \end{aligned}$$
(1)

where \(\beta _j(t)\) are the time-varying effects, called varying coefficients. The varying coefficient model was originally proposed by Hastie and Tibshirani (1993).

If the number of observation times n is sufficiently large, linear spline functions with \(r(<n-2)\) knots \(\kappa _1,\dots ,\kappa _r\) can be used as the varying coefficients \(\beta _j(t)\). A linear spline function is a continuous piecewise linear function (a polyline), and its basis is represented as

$$\begin{aligned} x(t)={(1,t,(t-\kappa _1)_+,\dots ,(t-\kappa _r)_+)}^{\!\top }, \end{aligned}$$
(2)

where

$$\begin{aligned} (t-\kappa )_+ = {\left\{ \begin{array}{ll} (t-\kappa ), &{} (t-\kappa > 0), \\ 0, &{} (\text {otherwise}). \end{array}\right. } \end{aligned}$$

The positions of the knots are set so that each interval includes a sufficient number of data points. Izumi et al. (2015) proposed arranging the knots at quantiles of the data so that each interval contains the same number of observations. The varying coefficients \(\beta _j(t)\) are represented as

$$\begin{aligned} \beta _j(t) = {x(t)}^{\!\top } b_j, \quad j=1,\dots ,p, \end{aligned}$$
(3)

where \(b_j\) is the vector of regression coefficients.
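As an illustration of Eqs. (2) and (3), the following minimal Python sketch evaluates the linear spline basis and the resulting coefficient function. The function names, the knot positions, and the coefficient vector are hypothetical and chosen only for illustration.

```python
def linear_spline_basis(t, knots):
    """Return x(t) = (1, t, (t - k_1)_+, ..., (t - k_r)_+) as a list."""
    return [1.0, float(t)] + [max(t - k, 0.0) for k in knots]


def varying_coefficient(t, knots, b):
    """Evaluate beta(t) = x(t)^T b for a coefficient vector b of length r + 2."""
    x = linear_spline_basis(t, knots)
    if len(x) != len(b):
        raise ValueError("b must have len(knots) + 2 entries")
    return sum(xi * bi for xi, bi in zip(x, b))


# Illustrative values only: two knots and an arbitrary coefficient vector.
knots = [2.0, 7.0]
b = [0.5, 0.1, -0.3, 0.4]
print(linear_spline_basis(5.0, knots))  # [1.0, 5.0, 3.0, 0.0]
print(varying_coefficient(5.0, knots, b))
```

Each truncated term \((t-\kappa )_+\) changes the slope of \(\beta _j(t)\) at its knot while keeping the function continuous, which is why the result is a polyline.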

3 The transition of team performance

3.1 Model for estimation

If the number of goals gained \(Y_O^{(k)}\) by team k is assumed to follow a Poisson distribution, there exists an intensity parameter \(\lambda _O^{(k)}\) such that

$$\begin{aligned} Y_O^{(k)} \sim \text {Po}(\lambda _O^{(k)}). \end{aligned}$$

These parameters may differ from team to team. So, the magnitude of \(\lambda _O^{(k)}\) can be interpreted as the index of the offensive performance of team k. In the same manner, there exists \(\lambda _D^{(k)}\) such that

$$\begin{aligned} Y_D^{(k)} \sim \text {Po}(\lambda _D^{(k)}), \end{aligned}$$

where \(Y_D^{(k)}\) denotes the number of goals lost by team k. The magnitude of \(\lambda _D^{(k)}\) can be interpreted as the index of the defensive performance of team k. To be more precise, \(-\lambda _D^{(k)}\), instead of \(\lambda _D^{(k)}\), should be called the defensive index because a small \(\lambda _D^{(k)}\) indicates high defensive performance. However, for simplicity we refer to \(\lambda _D^{(k)}\) itself as the index of defensive performance. Hereafter, we omit the team superscript k unless this causes confusion.

In most professional football leagues, teams play many games over a long season. It is natural to expect that the offensive/defensive performances of a team vary throughout a season. Therefore, it can be supposed that there exist time-varying intensity parameters \(\lambda _O(t)\) and \(\lambda _D(t)\) such that

$$\begin{aligned} Y_u(t) \sim \text {Po}\left( \lambda _u(t)\right) , \quad u = O, D, \end{aligned}$$

where \(Y_O(t)\) and \(Y_D(t)\) denote the numbers of goals gained and lost by a team at the game held at time t, respectively. In this paper, we regard these time-varying offensive index \(\lambda _O(t)\), defensive index \(\lambda _D(t)\), or \(-\lambda _D(t)\), and total index \(\lambda _T(t)=\lambda _O(t)-\lambda _D(t)\) as the transitions of team performances.

To estimate these indices from the observed numbers of goals gained and lost by a team, Poisson regression models with varying coefficients are suitable. We consider the simplest model, in which we set \(p=1\) and \(a_1(t)=1\) in Eqs. (1) and (3); i.e., we fit the observed numbers of goals gained and lost to the following model:

$$\begin{aligned} E(Y_u(t))&= \lambda _u(t) = \exp \left[ \beta _u(t) \right] \nonumber \\&= \exp \left[ {x(t)}^{\!\top } b_u \right] , \quad u = O, D. \end{aligned}$$
(4)

Let n be the number of games that each team plays in a season. Then the numbers of goals gained and lost, \(y_u(t_1),y_u(t_2),\dots ,y_u(t_n)\), in the games held at \(t_1<t_2<\dots <t_n\) are observed. By fitting model (4) to these observed data, we obtain the estimates \(\hat{b}_u\) of the coefficients \(b_u\) and the estimates of the team performance indices

$$\begin{aligned} \hat{\lambda }_u(t) = \exp \left[ \hat{\beta }_u(t)\right] = \exp \left[ {x(t)}^{\!\top } \hat{b}_u\right] \end{aligned}$$

(\(u = O, D\)).
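The fit of model (4) can be sketched in a few lines of Python. The sketch below uses iteratively reweighted least squares (IRLS), a standard way to fit a Poisson regression with a log link. The text does not state which algorithm or software was used, so the choice of IRLS, the function names, and the small data set are assumptions made only for illustration.

```python
import math


def basis(t, knots):
    """Linear spline basis x(t) of Eq. (2)."""
    return [1.0, float(t)] + [max(t - k, 0.0) for k in knots]


def solve(a, rhs):
    """Solve a x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for j in range(c, n + 1):
                m[r][j] -= f * m[c][j]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][j] * x[j] for j in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_poisson_spline(times, counts, knots, n_iter=25):
    """Estimate b in E[Y(t)] = exp(x(t)^T b) by IRLS (assumed fitting method)."""
    X = [basis(t, knots) for t in times]
    d = len(X[0])
    eta = [math.log(y + 0.5) for y in counts]  # starting linear predictor
    b = [0.0] * d
    for _ in range(n_iter):
        mu = [math.exp(e) for e in eta]
        # Working response; the IRLS weights for the log link are mu itself.
        z = [e + (y - m) / m for e, y, m in zip(eta, counts, mu)]
        xtwx = [[sum(w * xi[r] * xi[c] for xi, w in zip(X, mu))
                 for c in range(d)] for r in range(d)]
        xtwz = [sum(w * xi[r] * zi for xi, w, zi in zip(X, mu, z))
                for r in range(d)]
        b = solve(xtwx, xtwz)
        eta = [sum(bj * xj for bj, xj in zip(b, xi)) for xi in X]
    return b


# Hypothetical data: ten games, one knot in the middle of the period.
times = [0, 7, 14, 21, 28, 35, 42, 49, 56, 63]
goals = [1, 0, 2, 1, 3, 2, 1, 0, 1, 2]
knots = [31.5]
b_hat = fit_poisson_spline(times, goals, knots)
lam_hat = [math.exp(sum(bj * xj for bj, xj in zip(b_hat, basis(t, knots))))
           for t in times]
print([round(v, 3) for v in lam_hat])
```

Because the basis contains an intercept, the fitted intensities of a converged Poisson fit sum to the total number of observed goals, which gives a quick sanity check on the estimate.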

By drawing the graphs of these estimates \(\hat{\lambda }_O(t)\), \(-\hat{\lambda }_D(t)\), and \(\hat{\lambda }_T(t)\), the transitions of team performances can be visualized. These three curves run along with the observed numbers of goals gained, the observed numbers of goals lost, and the observed goal differentials, respectively.

3.2 Example

Let us see an example. We apply the model described above to the data of the 2018 season of J.League Division 2 (J2). J2 consists of 22 teams. The league is played as a double round-robin; therefore, each team played 42 games from Feb. 25 (0th day) to Nov. 17 (265th day). We use days from the season’s opening day as the unit of time. Table 1 summarizes the data of the games played by Oita Trinita in the 2018 season. The columns are as follows:

Date: the date when the game was held,

Sec.: the section number,

Days: the number of days since the opening day of the season,

Gained: the number of goals gained,

Lost: the number of goals lost,

Opponent: the name of the opposing team.

Table 1 The number of goals gained and lost by Oita Trinita in the 2018 season

We apply the model described in the previous subsection to these data. The number of games n is 42. All games are numbered as sections (“Sec.” column in Table 1). Ordinarily, sections are numbered in the order in which the games are scheduled. However, games may be held out of section order due to inclement weather or other reasons. The “Days” column corresponds to the series of observation times, \(t_1 = 0, t_2=7, \dots , t_{42}=265\). The observation times are ordered according to the actual passage of time, even if the games are held out of order. The “Gained” column corresponds to the observed numbers of goals gained, that is, \(y_O(t_1) = 4, y_O(t_2) = 2, \dots , y_O(t_{42}) = 1\). The “Lost” column corresponds to the observed numbers of goals lost, that is, \(y_D(t_1) = 2, y_D(t_2) = 2, \dots , y_D(t_{42}) = 1\).

For Eq. (2), we use three knots, \(\kappa _q = \tfrac{q}{4}\cdot 265\) (\(q = 1, 2, 3\)), so that each interval contains around ten games.
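Written out, the three knots divide the 265-day season into four intervals of equal length:

$$\begin{aligned} \kappa _1 = 66.25, \quad \kappa _2 = 132.5, \quad \kappa _3 = 198.75. \end{aligned}$$

With \(n = 42\) games, this gives \(42/4 = 10.5\) games per interval on average, under the assumption that the games are spread roughly evenly over the season.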

Figure 1 shows the estimated transition of performances of Oita Trinita. The top panel illustrates the estimated offensive index function \(\hat{\lambda }_O(t)\) (solid line) with three knots (vertical bars). Solid dots express the observed numbers of goals gained. In the same way, the middle panel illustrates the estimated defensive index function \(\hat{\lambda }_D(t)\) and the observed numbers of goals lost. These are plotted after multiplying by minus one so that a higher value indicates a better performance. The bottom panel illustrates the estimated total index function \(\hat{\lambda }_T(t)\) and the observed goal differentials.

Fig. 1 Estimated transition of performances of Oita Trinita in the 2018 season

4 Prediction of outcome probabilities

4.1 Prediction of outcome probabilities

In the previous section, we obtained the transitions of team performances. In this section, we propose a method to obtain the outcome probabilities of the succeeding game using the proposed model.

Let us consider predicting in advance the outcome (which team will win?) of a game between teams k and l to be held at time \(t^*\).

Suppose that team k has finished \(i^{(k)}\) games held at \(t_1< \dots < t_{i^{(k)}}\) so far. By fitting the Poisson regression model (4) to the observed numbers of goals gained/lost \(y_u^{(k)}(t_1), \dots , y_u^{(k)}(t_{i^{(k)}})\), the transitions of performances \(\lambda _u^{(k)}(t)\) of the team up to the time \(t_{i^{(k)}}\) can be estimated (\(u = O, D\)). Similarly for team l, the transitions of performances \(\lambda _u^{(l)}(t)\) can be estimated (\(u = O, D\)). As we will see later, the allocation of the knots should be considered carefully.

The number of goals gained by team k in the next game can be supposed to follow a Poisson distribution with a certain intensity \(\lambda ^{(k)}\). If the trends in the performances of team k continue, this parameter \(\lambda ^{(k)}\) will depend on the intensity of goals gained by team k at time \(t^*\), \(\lambda _O^{(k)}(t^*)\). However, if the defensive performance of the opponent, team l, is high, team k may not fully show its offensive performance. Conversely, if team l has low defensive performance, team k may be able to score more than expected. Thus, the intensity of goals lost by team l at time \(t^*\), \(\lambda _D^{(l)}(t^*)\), also influences \(\lambda ^{(k)}\). Consequently, it is natural to suppose that the value of \(\lambda ^{(k)}\) can be represented as some kind of average of \(\lambda _O^{(k)}(t^*)\) and \(\lambda _D^{(l)}(t^*)\). Note that a smaller \(\lambda _D^{(l)}(t^*)\) means a higher defensive performance of team l.

So, we assume that the number of goals gained by team k at the next game, \(Y^{(k)}\) (which equals the number of goals lost by team l), follows the Poisson distribution whose intensity parameter equals the geometric mean of \(\lambda _O^{(k)}(t^*)\) and \(\lambda _D^{(l)}(t^*)\) as follows:

$$\begin{aligned} Y^{(k)} \sim \text {Po}(\lambda ^{(k)}), \end{aligned}$$

where

$$\begin{aligned} \lambda ^{(k)} = \sqrt{\lambda _O^{(k)}(t^*) \cdot \lambda _D^{(l)}(t^*)} = \exp \left[ \frac{\beta _O^{(k)}(t^*) + \beta _D^{(l)}(t^*)}{2} \right] . \end{aligned}$$

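We use the geometric mean because of the log-linear structure \(\lambda _u(\cdot ) = \exp \left[ \beta _u(\cdot ) \right] \): the geometric mean of the intensities corresponds to the arithmetic mean of the coefficients \(\beta _u(\cdot )\). In the same way, the number of goals gained by team l at the next game \(Y^{(l)}\) is assumed to satisfy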

$$\begin{aligned} Y^{(l)} \sim \text {Po}(\lambda ^{(l)}), \end{aligned}$$

where

$$\begin{aligned} \lambda ^{(l)} = \sqrt{\lambda _O^{(l)}(t^*) \cdot \lambda _D^{(k)}(t^*)} = \exp \left[ \frac{\beta _O^{(l)}(t^*) + \beta _D^{(k)}(t^*)}{2} \right] . \end{aligned}$$
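The equivalence between the geometric mean of the intensities and the arithmetic mean on the log scale can be checked with a short Python sketch. The function names and the index values below are hypothetical and serve only as an illustration.

```python
import math


def match_intensity(lam_offense, lam_defense):
    """Geometric mean of a team's offensive index and the opponent's defensive index."""
    return math.sqrt(lam_offense * lam_defense)


def match_intensity_from_log(beta_offense, beta_defense):
    """Same quantity on the log scale: exp of the mean of the two coefficients."""
    return math.exp((beta_offense + beta_defense) / 2.0)


# Hypothetical index values, chosen only for illustration.
lam_o, lam_d = 1.8, 0.8
direct = match_intensity(lam_o, lam_d)
via_log = match_intensity_from_log(math.log(lam_o), math.log(lam_d))
print(direct, via_log)  # the two values agree up to floating-point error
```

The geometric mean keeps the predicted intensity on the same scale as the two indices and stays positive whenever both indices are positive.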

If \(Y^{(k)}\) and \(Y^{(l)}\) can be considered independent, the joint distribution of \(\left( Y^{(k)}, Y^{(l)}\right) \) is expressed as

$$\begin{aligned}&\text {Pr}\left\{ Y^{(k)}=i, Y^{(l)}=j\right\} \\&\quad = \text {Pr}\left\{ Y^{(k)}=i\right\} \cdot \text {Pr}\left\{ Y^{(l)}=j\right\} , \quad i, j = 0, 1, 2, \ldots . \end{aligned}$$

Hence, the probabilities of the outcomes of the next game between teams k and l can be predicted as

$$\begin{aligned} \text {Pr}\left\{ \text {team { k} wins at the next game}\right\} = \sum _{i > j} \text {Pr}\left\{ Y^{(k)}=i, Y^{(l)}=j\right\} , \\ \text {Pr}\left\{ \text {ends in a draw at the next game}\right\} = \sum _{i = j} \text {Pr}\left\{ Y^{(k)}=i, Y^{(l)}=j\right\} , \\ \text {Pr}\left\{ \text {team { l} wins at the next game}\right\} = \sum _{i < j} \text {Pr}\left\{ Y^{(k)}=i, Y^{(l)}=j\right\} . \end{aligned}$$

4.2 Example

We demonstrate our prediction method using the 2018 season J2 data again. Recall that each team in the league plays \(n=42\) games.

Estimation of the performances in the middle of the season does not work well if each interval does not contain sufficient data. So, we change the number and the allocation of the knots of the linear spline according to the amount of available data. Let \(i^*\) be the number of games played up to the current time. Similar to Sect. 3.2, with the maximum number of knots set to three, the basis is set as follows:

$$\begin{aligned} x(t)= {\left\{ \begin{array}{ll} {(1,t)}^{\!\top }, &{} (5 \le i^*\le n/4), \\ {(1,t,(t-\kappa _{11})_+)}^{\!\top }, &{} (n/4< i^*\le n/2), \\ {(1,t,(t-\kappa _{21})_+,(t-\kappa _{22})_+)}^{\!\top }, &{} (n/2< i^*\le 3n/4), \\ {(1,t,(t-\kappa _{31})_+,(t-\kappa _{32})_+,(t-\kappa _{33})_+)}^{\!\top }, &{} (3n/4 < i^*\le n), \end{array}\right. } \end{aligned}$$

and the allocation is set as follows:

$$\begin{aligned} {\left\{ \begin{array}{ll} \kappa _{11} = t_{i^*}/2, &{} (n/4< i^*\le n/2), \\ \kappa _{21} = t_{i^*}/3, \quad \kappa _{22} = 2t_{i^*}/3, &{} (n/2< i^*\le 3n/4), \\ \kappa _{31} = t_{i^*}/4, \quad \kappa _{32} = t_{i^*}/2, \quad \kappa _{33} = 3t_{i^*}/4, &{} (3n/4 < i^*\le n), \end{array}\right. } \end{aligned}$$

where \(t_{i^*}\) denotes the time at which the \({i^*}\)th game was held. With this allocation, each interval contains about five or more observations.
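The knot allocation above can be written as a small Python function. This is a minimal sketch: the function name and argument names are hypothetical, and the default season length of 42 games is taken from the J2 example.

```python
def midseason_knots(i_star, t_last, n=42):
    """Return the knot positions for a fit that uses the first i_star games.

    t_last is the time of the i_star-th game; n is the number of games in a season.
    """
    if not 5 <= i_star <= n:
        raise ValueError("the scheme is defined only for 5 <= i_star <= n")
    if i_star <= n / 4:
        r = 0
    elif i_star <= n / 2:
        r = 1
    elif i_star <= 3 * n / 4:
        r = 2
    else:
        r = 3
    # r knots that split [0, t_last] into r + 1 intervals of equal length
    return [q * t_last / (r + 1) for q in range(1, r + 1)]


print(midseason_knots(19, 111))  # [55.5]
print(midseason_knots(42, 265))  # [66.25, 132.5, 198.75]
```

For a complete season (\(i^*=42\), \(t_{i^*}=265\)), the function returns the same three knots as in Sect. 3.2.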

An example of the performance estimation for Oita Trinita up to the 19th game is shown in Fig. 2. The performances are estimated using a linear spline with one knot (\(i^*=19\) and \(t_{i^*}=111\)).

Fig. 2 Estimated transition of performances of Oita Trinita after 19 games finished

For example, we show the prediction process of the outcome of the game of Oita Trinita vs. Avispa Fukuoka held at time \(t^*=118\). It was the 20th game for both. First, the performance index functions of both teams up to the 19th game, \(\hat{\lambda }_O^{(\text {Oita})}(t)\), \(\hat{\lambda }_D^{(\text {Oita})}(t)\), \(\hat{\lambda }_O^{(\text {Fukuoka})}(t)\) and \(\hat{\lambda }_D^{(\text {Fukuoka})}(t)\) are estimated. The values of these functions at \(t^*=118\) are

$$\begin{aligned} \hat{\lambda }_O^{(\text {Oita})}(118)= & {} 2.304, \quad \hat{\lambda }_D^{(\text {Oita})}(118) = 2.160, \\ \hat{\lambda }_O^{(\text {Fukuoka})}(118)= & {} 0.552, \quad \hat{\lambda }_D^{(\text {Fukuoka})}(118) = 0.516. \end{aligned}$$

The numbers of goals gained by both teams, \(Y^{(\text {Oita})}\) and \(Y^{(\text {Fukuoka})}\), are assumed to follow the following distributions, respectively:

$$\begin{aligned} Y^{(\text {Oita})} \sim \text {Po}\left( \sqrt{2.304 \times 0.516} \right) = \text {Po}(1.094), \\ Y^{(\text {Fukuoka})} \sim \text {Po}\left( \sqrt{0.552 \times 2.160} \right) = \text {Po}(1.092). \end{aligned}$$

The joint distribution of \(\left( Y^{(\text {Oita})}, Y^{(\text {Fukuoka})}\right) \) is shown in Table 2. Joint probabilities of the number of goals beyond five (i.e., \(i,j=6,7,\dots \)) are omitted from the table because the values are less than 0.001. Finally, the probabilities of outcomes can be predicted as

$$\begin{aligned}&\text {Pr}\left\{ \text {Oita wins at the 20th game}\right\} \\&\quad = \sum _{i > j} \text {Pr}\left\{ Y^{(\text {Oita})}=i, Y^{(\text {Fukuoka})}=j\right\} = 0.354, \\&\text {Pr}\left\{ \text {ends in a draw at the 20th game}\right\} \\&\quad = \sum _{i = j} \text {Pr}\left\{ Y^{(\text {Oita})}=i, Y^{(\text {Fukuoka})}=j\right\} = 0.292, \\&\text {Pr}\left\{ \text {Fukuoka wins at the 20th game}\right\} \\&\quad = \sum _{i < j} \text {Pr}\left\{ Y^{(\text {Oita})}=i, Y^{(\text {Fukuoka})}=j\right\} = 0.353. \end{aligned}$$
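These three probabilities can be reproduced numerically from the two intensities \(\text {Po}(1.094)\) and \(\text {Po}(1.092)\) given above. The following Python sketch is one way to do so. The function names are hypothetical, and truncating the infinite sums at 20 goals per team is an assumption that leaves only a negligible remainder for intensities of this size.

```python
import math


def poisson_pmf(i, lam):
    """Pr{Y = i} for Y ~ Po(lam)."""
    return math.exp(-lam) * lam ** i / math.factorial(i)


def outcome_probabilities(lam_k, lam_l, max_goals=20):
    """Return (Pr{k wins}, Pr{draw}, Pr{l wins}) for independent Poisson scores.

    The infinite sums are truncated at max_goals goals per team.
    """
    win_k = draw = win_l = 0.0
    for i in range(max_goals + 1):
        for j in range(max_goals + 1):
            p = poisson_pmf(i, lam_k) * poisson_pmf(j, lam_l)
            if i > j:
                win_k += p
            elif i == j:
                draw += p
            else:
                win_l += p
    return win_k, draw, win_l


oita, draw, fukuoka = outcome_probabilities(1.094, 1.092)
print(round(oita, 3), round(draw, 3), round(fukuoka, 3))  # 0.354 0.292 0.353
```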

Thus, our method using the data up to the 19th game predicts that Oita Trinita has a very slightly better chance of winning against Avispa Fukuoka at the 20th game. The actual result of this game was a 1–0 victory for Oita Trinita over Avispa Fukuoka. This case is consistent with our prediction, although the difference between the two predicted winning probabilities (0.354 vs. 0.353) is very small and a single game cannot establish validity. We therefore examined other cases.

Table 2 The joint distribution of the number of goals gained by Oita and Fukuoka

We applied this process to all games after the sixth section in the 2018 season of J2. Table 3 shows a part of the predicted probabilities of outcomes, each computed using the data up to the immediately preceding game. The columns are as follows:

Sec.: the section number,

Home team: the home (H) team,

Away team: the away (A) team,

Prob. H wins: the predicted probability that the home team wins,

Prob. draw: the predicted probability that the game ends in a draw,

Prob. A wins: the predicted probability that the away team wins,

Actual outcome: the actual outcome of the game.

Table 3 Predicted probabilities of outcomes in the 2018 season of J2

In some cases, such as Chiba vs. Kyoto in Section 6, Okayama vs. Omiya in Section 42, and Kumamoto vs. Ehime in Section 42, our prediction was appropriate; in others, it was not.

4.3 Overall appropriateness of our prediction

To see the overall appropriateness of our prediction method, consider betting on the outcome of the next game using our predicted probabilities. There are three alternatives: “home win”, “draw” or “away win”, and one can bet on one of these outcomes.

We introduce the following ten betting strategies:

D1: Bet on the outcome whose probability predicted by our method is the largest.

D2: Always bet on “home win”.

D3: Always bet on “draw”.

D4: Always bet on “away win”.

D5: Bet on the outcome whose probability is the largest based on the recent three games.

D6: Bet on the outcome whose probability is the largest based on the recent five games.

S1: Randomly sample one of the three outcomes with the probabilities predicted by our method and bet on the result.

S2: Randomly sample one of the three outcomes with equal probabilities and bet on the result.

S3: Randomly sample one of the three outcomes with the probabilities based on the recent three games and bet on the result.

S4: Randomly sample one of the three outcomes with the probabilities based on the recent five games and bet on the result.

“Probabilities based on the recent games” used in D5, D6, S3, and S4 are determined by the following procedure:

  1. Calculate the arithmetic means of the numbers of goals gained by both teams, \(\bar{\lambda }_O^\text {H}\) and \(\bar{\lambda }_O^\text {A}\), of the recent three/five games. Similarly, calculate the arithmetic means of the numbers of goals lost by both teams, \(\bar{\lambda }_D^\text {H}\) and \(\bar{\lambda }_D^\text {A}\), of the recent three/five games.

  2. Calculate the geometric mean of the number of goals gained by the home team and the number of goals lost by the away team, \(\bar{\lambda }^{\text {H}} = \sqrt{\bar{\lambda }_O^\text {H} \cdot \bar{\lambda }_D^\text {A}}\). Similarly, calculate the geometric mean of the number of goals gained by the away team and the number of goals lost by the home team, \(\bar{\lambda }^{\text {A}} = \sqrt{\bar{\lambda }_O^\text {A} \cdot \bar{\lambda }_D^\text {H}}\).

  3. Obtain the probabilities of the three outcomes from the joint distribution of independent \(\text {Po}(\bar{\lambda }^{\text {H}})\) and \(\text {Po}(\bar{\lambda }^{\text {A}})\) variables, in the same way as in Sect. 4.1.
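The three steps of this procedure can be sketched in Python as follows. The function names and the recent results used in the example are hypothetical, and the infinite sums are truncated at 20 goals per team, which is an assumption.

```python
import math


def poisson_pmf(i, lam):
    """Pr{Y = i} for Y ~ Po(lam); lam = 0 puts all mass on zero goals."""
    return math.exp(-lam) * lam ** i / math.factorial(i)


def mean(values):
    return sum(values) / len(values)


def recent_form_probabilities(home_gained, home_lost, away_gained, away_lost,
                              max_goals=20):
    """Return (Pr{home win}, Pr{draw}, Pr{away win}) from recent results.

    Each argument lists the goals in the most recent three or five games.
    """
    # Step 1: arithmetic means of goals gained and lost.
    lam_o_h, lam_d_h = mean(home_gained), mean(home_lost)
    lam_o_a, lam_d_a = mean(away_gained), mean(away_lost)
    # Step 2: geometric means pairing each offence with the opposing defence.
    lam_h = math.sqrt(lam_o_h * lam_d_a)
    lam_a = math.sqrt(lam_o_a * lam_d_h)
    # Step 3: outcome probabilities under independent Poisson scores.
    home = draw = away = 0.0
    for i in range(max_goals + 1):
        for j in range(max_goals + 1):
            p = poisson_pmf(i, lam_h) * poisson_pmf(j, lam_a)
            if i > j:
                home += p
            elif i == j:
                draw += p
            else:
                away += p
    return home, draw, away


# Hypothetical results of the recent three games of each team.
probs = recent_form_probabilities([2, 1, 3], [0, 1, 1], [1, 0, 1], [2, 1, 2])
print([round(p, 3) for p in probs])
```

Note that if a team scored no goals in its recent games, the corresponding mean is zero and so is the geometric mean, so this baseline predicts that the team will certainly score no goals. This is a limitation of such a short window.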

Six (D1–D6) are deterministic, and four (S1–S4) are stochastic. The strategies D1 and S1 use our predicted probabilities.

We applied these strategies to all games after the sixth section in the 2018 J2 season previously mentioned.

For the deterministic strategies, Table 4 shows the proportions of correct predictions. Strategy D1, which uses our prediction, seems better than the others, though not significantly.

Table 4 Proportions of hitting the right answers by D1–D6

For the stochastic strategies, we ran 10,000 trials each. Figure 3 shows the distributions of the proportions of correct predictions for the four strategies. The medians are 0.3686, 0.3342, 0.3612 and 0.3612 for S1, S2, S3 and S4, respectively. Strategy S1, which uses our prediction, seems slightly better than the others.
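The evaluation of a deterministic and a stochastic strategy can be sketched as follows, in the style of D1 and S1. The predicted probabilities, the actual outcomes, the function names, and the random seed are all hypothetical, so the numbers produced are not those of Table 4 or Fig. 3.

```python
import random


def bet_deterministic(probs):
    """D1-style bet: index of the outcome with the largest predicted probability."""
    return max(range(3), key=lambda o: probs[o])


def bet_stochastic(probs, rng):
    """S1-style bet: sample one outcome according to the predicted probabilities."""
    return rng.choices(range(3), weights=probs, k=1)[0]


def hit_proportion(bets, actual):
    """Proportion of games in which the bet matches the actual outcome."""
    return sum(b == a for b, a in zip(bets, actual)) / len(actual)


# Hypothetical predictions and results for four games.
# Outcomes are coded as 0 = home win, 1 = draw, 2 = away win.
predicted = [(0.50, 0.30, 0.20), (0.25, 0.30, 0.45),
             (0.35, 0.29, 0.36), (0.40, 0.35, 0.25)]
actual = [0, 2, 1, 0]

print(hit_proportion([bet_deterministic(p) for p in predicted], actual))  # 0.75

rng = random.Random(0)  # fixed seed so that the run is repeatable
trials = sorted(
    hit_proportion([bet_stochastic(p, rng) for p in predicted], actual)
    for _ in range(10_000)
)
print(trials[len(trials) // 2])  # median proportion over the trials
```

A deterministic strategy yields a single proportion, whereas a stochastic strategy yields a distribution of proportions over the trials, which is why the median is reported for the latter.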

Fig. 3 Distributions of the proportion of hitting the right answers by S1–S4

5 Conclusions

If the number of goals in a football game follows a Poisson distribution, we can regard the intensity parameter of the distribution as an indicator of the team’s performance. Moreover, the intensity can be supposed to vary over time.

In this paper, we first proposed a method for estimating the time-varying performance of football teams. We introduced the varying coefficient model into sports data analysis and demonstrated that it can be used to estimate the time transition of team performances. To estimate the semi-parametric varying coefficients, a mixed effect model can be used, as in Izumi et al. (2015, 2017). However, we used a simple generalized linear model here because the estimated coefficients of a mixed effect model require careful interpretation.

Our performance estimation can also be applied to the data up to the middle of the season. We proposed a method to predict the probabilities of the next game’s outcome based on the performance estimated up to the previous game. It was applied to one season of data from the Japanese professional football league, and we examined the appropriateness of our method.

We used the simplest varying coefficient model, which contains no covariates other than time. The effect of home advantage in football has been discussed in several studies (e.g., Goumas, 2014; Koopman & Lit, 2015; Saraiva et al., 2016). Other factors that may influence a football team’s performance, such as the weather or the attendance, can also be considered. Adding these environmental covariates is worth considering.

The combination of players and team formation are also essential factors in the performances of football teams (Hirotsu & Ueda, 2015). Considering these factors may improve the results in the phases of performance estimation or outcome prediction.

In the prediction phase, we assume independence between the number of goals of both teams in a game. However, the appropriateness of this assumption should be discussed more carefully.

Finally, careful investigation of errors and confidence intervals of estimation and prediction remains for future research.