Bayesian GARCH modeling of functional sports data

The use of statistical methods in sports analytics has gained rapidly growing interest over the last decade and is nowadays common practice. In particular, the interest in understanding and predicting an athlete's performance throughout his/her career is motivated by the need to evaluate the efficacy of training programs, anticipate fatigue to prevent injuries, and detect unexpected or disproportionate increases in performance that might be indicative of doping. Moreover, fast-evolving data gathering technologies require up-to-date modelling techniques that adapt to the distinctive features of sports data. In this work, we propose a hierarchical Bayesian model for describing and predicting the evolution of performance over time for shot put athletes. We rely both on a smooth functional contribution and on a linear mixed effect model with heteroskedastic errors to represent the athlete-specific trajectories. The resulting model provides an accurate description of the performance trajectories and helps specify both the intra- and inter-seasonal variability of measurements. Further, the model allows for the prediction of athletes' performance in future sport seasons. We apply our model to an extensive real-world data set on the performance of professional shot put athletes recorded at elite competitions.


Introduction
Shot put is a track and field event involving throwing ("putting") the shot, a metal ball (7.26kg/16lb for men, 4kg/8.8lb for women), with one hand as far as possible from a seven-foot diameter (2.135m) circle. In order for each put to be considered valid, the shot must not drop below the line of the athlete's shoulders and must land inside a designated 35-degree sector. Athletes commonly put four to six times per competition, and their best performance is recorded. Figure 1 displays the results of elite shot put competitions for four athletes with careers of different lengths. Furthermore, several athlete-specific variables (e.g., age, gender, doping history, nationality, etc.) that might be of interest from a sports analytics perspective are naturally gathered during elite competitions or can be easily retrieved retrospectively. Shot put competitions have been part of the modern Olympics since their revival, hence there exist complete and long-established data sets of shot put performances that can be exploited for performance analysis.
Sports competitions might display some sort of seasonality in the results. We underline that here the term "seasonality" is not used to indicate a cyclical behaviour of the results over time, as in the time series literature, but rather a time-dependent gathering of observations. On the one hand, competitions are traditionally concentrated in some months of the year; on the other hand, weather and environmental conditions may affect the performances or even the practicability of the sport itself. Straightforward examples are sports leagues that run on different schedules and host their season openers, playoffs and championships at different times of the year or, even more dramatically, winter sports as opposed to outdoor water sports. Shot put events range over the whole year, with indoor competitions held during winter months and major tournaments, like the Olympics, Diamond League and World Championship, organised during summer. Therefore, it is reasonable to say that seasons coincide with calendar years. In Figure 1, vertical lines represent New Year's days: the time points at which seasons change. To provide an accurate representation of the data, seasonality effects need to be taken into account. Note that, from a modelling perspective, the amplitude of seasons is arbitrary, and ideally any model should be easily adapted to any sport with any type of seasonality.

Figure 1: Each panel displays the performance results (points) of a professional shot put athlete throughout his/her career at elite competitions. Performance is measured in meters (length of the throw) and plotted against the days elapsed since January 1st of each athlete's career starting year. The dark vertical lines represent season changing points, namely New Year's days.
There exists a well-established literature of both frequentist and Bayesian contributions to performance data analysis in various sports. We mention, among others, Malcata et al. (2014) and Mengersen et al. (2016) in triathlon, Costa et al. (2010) and Costa et al. (2013) in swimming, and Koulis et al. (2014) in cricket. Albert (2016) describes batting performance in baseball, Wimmer et al. (2011) performance evolution in decathlon, Yousefi and Swartz (2013) in golf, and Montagna et al. (2020) in tennis. Further, Vaci et al. (2019) rely on a Bayesian latent variable model to investigate age-related changes across the complete lifespan of basketball athletes on the basis of exponential functions. In this work, we are interested in describing the evolution of performances of professional shot put athletes throughout their careers. To this end, we propose a Bayesian hierarchical model where the results of each athlete are represented as error-prone measurements of some underlying unknown function. The distinctive contribution of our approach is the form of such function, which has an additive structure with three components. First, we consider a smooth functional contribution for capturing the overall variability in athletes' performances. For this component, we follow the approach in Montagna et al. (2012), who propose a Bayesian latent factor regression model for detecting the doping status of athletes given their shot put performance results and other covariates.
However, the authors limit the analysis to data collected from 2012, whereas we aim at describing the trajectories in performance over the whole time span available for our data (1996 to 2016).
For this time span, a global smoothness assumption for the trajectories could be too restrictive.
Indeed, data may exhibit jumps localised at fixed time points shared among all athletes. See, for example, athlete 303 in Figure 1, whose results show a jump between the second and fourth season (calendar year) of his/her career. The presence of jumps between seasons is even more striking when yearly average performances are considered. We highlight this behaviour by showing yearly averages via thick horizontal lines in Figure 1. Accordingly, the description of this yearly dependent contribution is delegated to a mixed effect model, which quantifies the seasonal mean for each athlete as a deviation from a grand mean. This component captures the inter-seasonal variability of the data set, whereas the smooth functional component describes the intra-seasonal evolution of performances. Finally, we complete our model specification by accounting for the effect of a selection of time-dependent covariates through a regressive component. We embed our model in a Bayesian framework by proposing suitable prior distributions for all parameters of interest.
We believe the presented model represents a flexible tool to analyse evolution of performances in measurable sports, namely, all those disciplines for which results can be summarised by a unique measure (e.g., distance, time or weight).
The rest of the paper is organised as follows. In Section 2, we describe the motivating case study. In Section 3, we present the proposed model, briefly discuss possible alternative settings, and elicit priors for our Bayesian approach. In Section 4, we outline the algorithm for posterior computation. In Section 5, we discuss posterior estimates, assess the performance of the model and interpret the model's parameters from a sports analytics perspective. Conclusions are presented in Section 6. Finally, the Appendix includes the complete description of the MCMC algorithm.
The World Athletics shot put data set

As shown in Figure 1 for a selection of athletes, data are collected over time. Hereafter we will denote by t_ij the time at which the j-th observation for athlete i is recorded; t_ij corresponds to the time elapsed from January 1st of athlete i's career starting year to the date of the competition. Accordingly, equal time values for different athletes are likely to refer to distinct years, but to the same moment in those athletes' careers. Moreover, different athletes will have observations ranging over different time spans, according to the length of their careers. Having defined seasons as calendar years, athletes will also compete in a different number of seasons. Figure 2 shows the number of athletes per season as well as boxplots of the distribution of their mean performances across the various seasons. Athletes with the longest careers have been competing for 19 years. A general increasing trend in performance can be observed as a function of career length (right panel in Figure 2) or, equivalently, the age of the athlete. In the following, we will discuss two different modelling choices for age: accounting for its time dependence, or considering age as a fixed quantity, namely the age of the athlete at the beginning of his/her career.
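As a concrete illustration, the career-time index t_ij can be computed from the competition date and the career starting year (a minimal sketch; the helper name and the day-count convention are ours):

```python
from datetime import date

def career_time_days(comp_date, career_start_year):
    """Days elapsed since January 1st of the athlete's career starting year
    (hypothetical helper mirroring the definition of t_ij in the text)."""
    return (comp_date - date(career_start_year, 1, 1)).days

# Equal values for different athletes refer to the same moment of their careers,
# possibly in different calendar years.
t1 = career_time_days(date(2000, 1, 15), 2000)
t2 = career_time_days(date(2005, 1, 15), 2005)
assert t1 == t2 == 14
```

In practice these values would then be rescaled to the unit interval, as discussed for the spline basis below.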
Table 1 reports descriptive statistics suggesting that sex, environment and doping have an effect on the average result. We point out that our dataset contains only 18 athletes who tested positive for doping at some point in their career. Information on the date the test was taken (or whether multiple tests were taken) is not available in the data. As expected, performances for men are, on average, higher than for women. Similar effects, although less evident in magnitude, also hold for the variable environment, which takes values indoor and outdoor. Regarding environment, a further remark is in order. We already pointed out that major WA events take place outdoors (27,800 observations) during summer months, whereas less competitive events are held indoors between November and March (13,200 observations). Figure 3 displays results in gray when recorded outdoors, and in black otherwise. We can clearly see that outdoor events gather in summer months, whereas indoor events take place during winter (solid lines).

The model
Let n denote the total number of athletes in the study. We assume that shot put performances for athlete i are given by noisy measurements of an underlying function g_i:

y_ij = g_i(t_ij) + ε_ij,   ε_ij iid ∼ N(0, ψ²),    (1)

with independent errors. Recall t_ij is the time at which the j-th observation for athlete i is collected, for j = 1, ..., n_i, where n_i is the total number of measurements available on athlete i. We further assume an explicit additive form for g_i:

g_i(t_ij) = f_i(t_ij) + μ_is + x_i(t_ij)'β,    (2)

where f_i(t) is a smooth functional component for intra-seasonal variability, μ_is is a season-specific intercept, and x_i(t)'β is an additional multiple regression component. Here s ∈ {1, 2, ..., S_i} indicates the season in which the shot was recorded. Specifically, μ_i(t) = Σ_s μ_is χ(t ∈ s) is an athlete-specific step function taking value μ_is for all time points in season s, delimited by t_i^s and t_i^(s+1). For a complete treatment of the notation used so far, please refer to Table 2.
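A small simulation from this additive structure makes the roles of the three terms concrete (a sketch with illustrative inputs, not the paper's estimation code):

```python
import numpy as np

rng = np.random.default_rng(3)

def simulate_athlete(t, f, mu_s, season_of, x, beta, psi):
    """Simulate noisy performances y_ij = g_i(t_ij) + eps_ij from the additive
    structure of Eq. (2); all inputs are illustrative stand-ins."""
    g = f(t) + mu_s[season_of] + x @ beta          # smooth + seasonal + regression
    return g + rng.normal(0.0, psi, size=len(t))   # iid N(0, psi^2) measurement error

t = np.linspace(0.0, 1.0, 40)
season_of = (t * 2).astype(int).clip(max=1)        # two toy seasons of equal length
y = simulate_athlete(t, lambda u: np.sin(2 * np.pi * u), np.array([18.0, 18.5]),
                     season_of, np.ones((40, 1)), np.array([0.2]), psi=0.3)
assert y.shape == (40,)
```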
We will now discuss each of the three terms in Eq. (2) in more detail.

The functional component
The functional component f_i(t) is meant to capture the subject-specific global evolution of the response variable. It explains the global dependence of the data on time. We require these functions to display a smooth behaviour: the latter is ensured by assuming {f_i(t)}_{i=1}^n are linear combinations of smooth basis functions {b_m(t)}_{m=1}^p. Note that both the nature and the number p of these bases are to be determined according to some properties we wish them to satisfy. In particular, we assume:

f_i(t) = Σ_{m=1}^p θ_im b_m(t),    (3)

where {b_m(t)}_{m=1}^p is the B-spline basis (de Boor, 1978) and {θ_im}_{m=1}^p are subject-specific coefficients.
We briefly recall that the B-spline basis of degree k on [L, U] is a collection of p polynomials defined recursively on a sequence of points, known as knots, denoted L ≡ t_1 ≤ ... ≤ t_{p+k+1} ≡ U. We follow the common approach of choosing k = 3, leading to cubic splines (see, for instance, Marsden, 1974). Moreover, we assume the knot sequence to be equispaced and (k+1)-open. That is, the first and last k+1 knots are identified with the extremes of the definition interval, whereas the remaining p − k − 1 knots divide said interval into subintervals of the same length.
Under these assumptions, each basis function b_j(t) has compact support over k+1 knot spans, precisely [t_j, t_{j+k+1}]. Moreover, together they span the space of piecewise polynomial functions of degree k on [L, U] with breakpoints {t_n}_{n=1}^{p+k+1}. Finally, such functions are twice continuously differentiable at the breakpoints, de facto eliminating any visible type of discontinuity and providing a smooth result.
The number of basis functions is chosen to be large enough for sufficiently many basis functions to have their support completely enclosed in a given season. In particular, rescaling time to [0, 1] (for computational purposes and easier prior definition), seasons have amplitude 0.054 and, if we want at least one local basis to be completely supported in one of them, we need 4 knots to fall within it. Accordingly, we require about 75 internal knots and 80 degrees of freedom.
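Under these choices, the basis can be sketched with standard tools (here scipy; the helper name is ours, and the partition-of-unity check holds for (k+1)-open knot vectors):

```python
import numpy as np
from scipy.interpolate import BSpline

def bspline_basis(p, k=3, L=0.0, U=1.0):
    """Degree-k B-spline basis with a (k+1)-open, equispaced knot sequence
    on [L, U]; returns a function evaluating all p basis functions at t."""
    interior = np.linspace(L, U, p - k + 1)        # p - k - 1 interior knots + endpoints
    knots = np.r_[[L] * k, interior, [U] * k]      # repeat each endpoint k extra times
    def design(t):
        t = np.atleast_1d(t)
        B = np.empty((len(t), p))
        for m in range(p):
            c = np.zeros(p); c[m] = 1.0            # isolate the m-th basis function
            B[:, m] = BSpline(knots, c, k)(t)
        return B
    return design

design = bspline_basis(p=80)                       # 80 degrees of freedom, as in the paper
B = design(np.linspace(0.0, 0.99, 50))
# B-splines form a partition of unity inside the definition interval
assert np.allclose(B.sum(axis=1), 1.0)
```

For p = 80 and k = 3, the construction above yields exactly p − k − 1 = 76 interior knots, in line with the roughly 75 internal knots quoted in the text.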
For the sake of tractability, a low-dimensional representation of the individual curves is of interest. Following the approach by Montagna et al. (2012), we exploit a sparse latent factor model on the basis coefficients:

θ_im = Σ_{l=1}^k λ_ml η_il + ξ_im,    (4)

where λ_ml are the entries of a (p × k) factor loading matrix Λ, and η_i is a vector of k latent factors for subject i. Finally, ξ_i = (ξ_i1, ..., ξ_ip)' is a residual vector, independent of all other variables in the model. We assume η_il iid ∼ N(0, 1), and the error terms ξ_i are assumed to have a normal distribution with diagonal covariance matrix. For the modelling of the factor loading matrix Λ, we follow the approach in Bhattacharya and Dunson (2011). Specifically, we adopt a multiplicative gamma process shrinkage prior, which favours an unknown but small number of factors k. The prior is specified as follows:

λ_ml | φ_ml, τ_l ∼ N(0, φ_ml⁻¹ τ_l⁻¹),   φ_ml ∼ Ga(ν/2, ν/2),   τ_l = Π_{h=1}^l δ_h,
δ_1 ∼ Ga(a_1, 1),   δ_h ∼ Ga(a_2, 1) for h ≥ 2.

Here τ_l is a global, stochastically increasing shrinkage parameter for the l-th column, which favours more shrinkage as the column index increases. Similarly, φ_ml are local shrinkage parameters for the elements in the l-th column. Bhattacharya and Dunson (2011) describe a procedure to select the number of factors k ≪ p adaptively. We follow their lead, thus k is not set a priori but automatically tuned as the Gibbs sampler progresses; refer to Bhattacharya and Dunson (2011) for details. Note that, by combining Eq. (3) and Eq. (4), one gets:

f_i(t) = Σ_{l=1}^k η_il Φ_l(t) + r_i(t),

where Φ_l(t) = Σ_{m=1}^p λ_ml b_m(t) is a new, unknown non-local basis function learnt from the data and r_i(t) is a functional error term. Although the number of pre-specified basis functions p is potentially large, the number of "operative" bases k satisfies k ≪ p and is learnt from the data.
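A draw of the loading matrix under the multiplicative gamma process prior can be sketched as follows (hyperparameter names a1, a2 and ν follow the convention in Bhattacharya and Dunson, 2011; the specific values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_mgp_loadings(p, k, a1=2.0, a2=3.0, nu=3.0):
    """One draw of a (p x k) loading matrix Lambda under the multiplicative
    gamma process shrinkage prior (a sketch; hyperparameter values are
    illustrative, not the paper's choices)."""
    delta = np.r_[rng.gamma(a1, 1.0, 1), rng.gamma(a2, 1.0, k - 1)]
    tau = np.cumprod(delta)                    # global shrinkage, column-wise product
    phi = rng.gamma(nu / 2, 2.0 / nu, (p, k))  # local shrinkage, Ga(nu/2, nu/2)
    Lambda = rng.normal(0.0, 1.0 / np.sqrt(phi * tau), (p, k))
    return Lambda, tau

Lambda, tau = sample_mgp_loadings(p=80, k=20)
assert Lambda.shape == (80, 20) and np.all(tau > 0)
```

With a2 > 1, τ_l is stochastically increasing in l, so the prior variance of later columns shrinks toward zero, which is what allows the effective number of factors to be learnt adaptively.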

The seasonal component
Early graphical displays and straightforward exploratory analysis suggest a significant variability of the average response across seasons, as displayed in Figure 1. Namely, performances prove to be gathered over predetermined time intervals, the seasons (calendar years). However, it is reasonable to expect some degree of dependence of the average performance across seasons. To model such dependence, an autoregressive model for the seasonal intercepts can be proposed. The idea behind this choice is to allow for borrowing of information across seasons, in the sense that the seasonal intercept μ_is at season s is influenced by the intercept at season s − 1 through an autoregressive coefficient ρ_i. Namely,

μ_is = ρ_i μ_{i,s−1} + ε_is.    (7)

However, when we first implemented this model, we noted that the residuals presented a pattern which we would like to capture with a finer model. Therefore, we consider a random intercept model with Normal Generalized Autoregressive Conditional Heteroskedastic (GARCH) errors (Bollerslev, 1986). Specifically,

μ_is = m + ζ_is,   ζ_is = √h_is ε_is,   ε_is iid ∼ N(0, 1),    (8)
h_is = α_0 + α_1 ζ²_{i,s−1} + γ h_{i,s−1},    (9)

where α_0 > 0, α_1 ≥ 0 and γ ≥ 0 ensure a positive conditional variance, ζ_is = μ_is − m, and h_i0 = ζ_i0 := 0 for convenience. The additional assumption of wide-sense stationarity, with unconditional variance α_0/(1 − α_1 − γ), is guaranteed by requiring α_1 + γ < 1, as proven by Bollerslev (1986).
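The seasonal recursion can be sketched as follows, writing γ for the GARCH coefficient on the lagged conditional variance (parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_seasonal_intercepts(S, m, alpha0, alpha1, gamma):
    """Simulate season-specific intercepts mu_s = m + zeta_s with GARCH(1,1)
    errors and h_0 = zeta_0 = 0, as in the text (a sketch with toy values)."""
    assert alpha0 > 0 and alpha1 >= 0 and gamma >= 0
    h, zeta = 0.0, 0.0
    mu = []
    for _ in range(S):
        h = alpha0 + alpha1 * zeta**2 + gamma * h   # conditional variance recursion
        zeta = np.sqrt(h) * rng.standard_normal()   # heteroskedastic error
        mu.append(m + zeta)
    return np.array(mu)

mu = simulate_seasonal_intercepts(S=10, m=18.0, alpha0=0.1, alpha1=0.3, gamma=0.5)
# alpha1 + gamma < 1 gives wide-sense stationarity with
# unconditional variance alpha0 / (1 - alpha1 - gamma)
assert mu.shape == (10,)
```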
Three parameters of the seasonal component require prior specification: the overall mean m and the conditional variance parameters γ and α = (α_0, α_1)'. For the autoregressive and heteroskedastic parameters of the GARCH model, we propose non-informative priors satisfying the positivity constraint. For the overall mean parameter, we rely on a more informative Normal prior centered around the mean suggested by posterior analysis of preliminary versions of the model. Here α = (α_0, α_1)' is a bidimensional vector. We complete the model specification by assuming that the parameters are statistically independent and noticing that the hypotheses needed for wide-sense stationarity do not translate into actual prior conditions on the parameters. Hence, one of the objects of our analysis becomes to test whether the constraint α_1 + γ < 1 holds true.

Covariates
We consider the effect of three covariates, gender, age and environment, and assume conjugate prior choices for the covariate coefficients β.

The Bayesian update

Because of the additive nature of the overall sampling model (1)-(2), we are able to exploit a blocked Gibbs sampler, grouping together the parameters of the three modelling components described in Section 3. Note first that, because of the high dimensionality of the problem, it is computationally convenient to choose conditionally conjugate prior distributions for the parameters. Indeed, conjugacy guarantees analytical tractability of posterior distributions. In some cases, specifically for the conditional variances of the GARCH errors, no conjugate model exists and updates rely on an adaptive version of the Metropolis-Hastings algorithm for posterior sampling.
Algorithm 1 outlines our sampling scheme, while details are presented in Appendix A. As far as the parameters of the functional component θ_i are concerned, we follow Montagna et al. (2012) by choosing conditionally conjugate prior distributions, so that the update proceeds via simple Gibbs sampling steps. Analogously, the update of the regression coefficients β and the error term ψ proceeds straightforwardly by sampling from their full conditional posterior distributions. Conjugate priors for the GARCH parameters m, γ and α are not available, therefore we resort to adaptive Metropolis schemes to draw values from their full conditionals. Specifically, we build an adaptive scale Metropolis such that the covariance matrix of the proposal density adapts at each iteration to achieve an optimal acceptance rate (see Haario et al., 2001).
Further details about the algorithm can be found in Appendix A, whereas code is available at https://github.com/PatricDolmeta/Bayesian-GARCH-Modeling-of-Functional-Sports-Data.

Posterior analysis
The idea of estimating trajectories of athletes' performances is a natural pursuit given the model specification we adopted. Indeed, describing observations as error-prone measurements of an unknown underlying function suggests evaluating such function, once retrieved, on any number of points of interest. In practice, we generate a fine grid of T equispaced time points {t_k}_{k=1}^T, between 0 ≡ t_1 and 1 ≡ t_T, and evaluate the function on this grid.
In particular, we start by evaluating the athlete-specific functional component by exploiting the basis function representation. Let Θ_i be the matrix of individual-specific spline basis coefficients over all iterations g = 1, ..., G, and let B collect all values of a p-dimensional, degree-3 spline basis on a set of T + 1 equispaced knots in the unit interval, evaluated on the grid. The estimated contribution of the functional component to the overall trajectory is, at each iteration,

f̂_i^(g)(t_k) = Σ_{m=1}^p θ_im^(g) b_m(t_k),

where θ_i^(g) = (θ_i1^(g), ..., θ_ip^(g)) corresponds to the i-th row of the coefficient matrix at iteration g.

Algorithm 1: Blocked Gibbs sampler.
Set the required MCMC sample size G, the burn-in period g_0 and the thinning parameter g_s; initialise θ_i^(0), m^(0), γ^(0), α^(0), β^(0), ψ^(0).
For g = 0, ..., G:
  Update functional component: set partial residuals and update θ_i^(g+1) as in Appendix A.1.
  Update seasonal component: set partial residuals and update m^(g+1), γ^(g+1), α^(g+1) as in Appendix A.2.
  Update regressive component: set partial residuals and update β^(g+1) as in Appendix A.3.
  Update error term: update ψ^(g+1).
Retain the draws for g = g_0, g_0 + g_s, g_0 + 2g_s, ..., G.
As for the seasonal linear mixed effect, we modelled it as a piecewise constant function taking individual- and season-specific values. Hence, when retrieving its estimated effect at any point of the time grid, we need to determine which season the point belongs to. As discussed in Section 3, time is rescaled so that equal values across individuals indicate the same day of the year, possibly in different years. Therefore, season changes, which occur at New Year's days, can be easily computed by straightforward proportions. At this point, the season to which t_k belongs is obtained by comparison with the season thresholds. In the following, the indicator variable χ(t ∈ s) determines the season each time point belongs to. Accordingly, the estimated contribution of the seasonal component to the overall trajectory is, at each iteration,

μ̂_i^(g)(t_k) = Σ_{s=1}^{S_i} μ_is^(g) χ(t_k ∈ s).

Lastly, the regressive component has to be taken into account; its estimated contribution to the overall trajectory is, at each iteration, x_i(t_k)'β^(g). Given the three components, the overall estimate of the underlying function is obtained by summing them:

ĝ_i^(g)(t_k) = f̂_i^(g)(t_k) + μ̂_i^(g)(t_k) + x_i(t_k)'β^(g),    (12)

and the estimated mean trajectory is obtained by averaging ĝ_i^(g)(t_k) over the retained MCMC iterations. Similarly, 95% credible intervals can be computed to quantify uncertainty around our point estimate.
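Putting the three pieces together, the posterior trajectory estimate on the grid can be sketched as follows (array names and shapes are illustrative stand-ins for the retained MCMC draws):

```python
import numpy as np

def estimate_trajectory(B_grid, theta_draws, mu_draws, season_of, x_grid, beta_draws):
    """Posterior mean trajectory and 95% credible bounds on a grid of T points.
      B_grid      : (T, p) spline basis evaluated on the grid
      theta_draws : (G, p) basis coefficients, one row per retained iteration
      mu_draws    : (G, S) seasonal intercepts
      season_of   : (T,) index of the season each grid point falls in
      x_grid      : (T, r) covariates on the grid; beta_draws: (G, r)
    """
    f = theta_draws @ B_grid.T        # (G, T) functional component
    seas = mu_draws[:, season_of]     # (G, T) seasonal step function
    reg = beta_draws @ x_grid.T       # (G, T) regressive component
    g = f + seas + reg                # (G, T) draws of the whole trajectory
    return g.mean(axis=0), np.percentile(g, [2.5, 97.5], axis=0)

# toy check: constant components simply add up
T, p, G, S, r = 5, 4, 3, 2, 1
mean, ci = estimate_trajectory(
    np.full((T, p), 0.25), np.ones((G, p)),
    np.full((G, S), 2.0), np.array([0, 0, 0, 1, 1]),
    np.ones((T, r)), np.zeros((G, r)))
assert np.allclose(mean, 3.0)         # 4 * 0.25 + 2.0 + 0.0
```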

Model application
In this Section, we fit different specifications of our model to the data described in Section 2.
In general, we consider the additive structure of the sampling model illustrated in Equation (2). Table 3 reports an overview of the six models we compare. In model M1, the B-spline basis functions have 80 degrees of freedom, the seasonal component has GARCH errors and three regressors are taken into account: sex, age and environment. M2 represents a slight modification of M1 given by the fixed-age implementation. Here we consider the covariate age not as a time-dependent variable, but as a fixed value given by the age at the beginning of each athlete's career.
In model M3, a simpler dependence structure among the seasonal effects is used. Namely, we assume an autoregressive model for μ_is (see Eq. 7). For model M4, we simply consider a larger number of basis functions, i.e. 120, accounting for up to three splines having support in a season and hence meant to better capture the intra-seasonal variability. Finally, models M5 and M6 account for the effect of doping. Priors were chosen as discussed in Section 3, with hyperparameter choices summarized in Table 4. To argue on the choice of the informative prior for the overall mean parameter m, in Table 5 we also report the results under a slight modification of model M1, which employs a vague prior for m.
For all experiments, inference is obtained via posterior samples drawn by the Gibbs sampler introduced in Section 4. In particular, we ran 20,000 iterations with a burn-in period of 60% and a thinning of 5. Performances are compared by means of the logarithm of the pseudo marginal likelihood (LPML) index (Geisser and Eddy, 1979). This estimator of the log marginal likelihood is based on conditional predictive densities and provides an overall comparison of model fit, with higher values denoting better performing models.
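The LPML can be computed from the pointwise log-likelihoods of the retained draws via the conditional predictive ordinates (CPO); a minimal, numerically stable sketch:

```python
import numpy as np
from scipy.special import logsumexp

def lpml(loglik_draws):
    """LPML from a (G, n) matrix of pointwise log-likelihoods over retained
    MCMC draws. CPO_i is the harmonic mean of the likelihood of y_i, so
    log CPO_i = log G - logsumexp(-loglik[:, i]); LPML = sum_i log CPO_i."""
    G = loglik_draws.shape[0]
    log_cpo = np.log(G) - logsumexp(-loglik_draws, axis=0)
    return log_cpo.sum()

# toy check: with a constant likelihood, LPML equals the total log-likelihood
ll = np.full((100, 3), -1.5)
assert np.isclose(lpml(ll), -4.5)
```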
Performances of the different models are fairly similar: as a matter of fact, the model specifications do not differ in a substantial way. Despite a slightly lower LPML than the best performing model, M2, we prefer looking at results for model M1, with 80 degrees of freedom splines, GARCH errors and three regressors with time-dependent age definition, because the regression parameters prove to be significant in this setting. As far as the estimation of trajectories describing the evolution of athletes' performances is concerned, we use the method discussed in Section 5. Figure 4 displays the estimates (with 95% credible bounds) for a random selection of athletes (black) together with one-season-ahead performance predictions (grey). The results are graphically pleasing in terms of model fit, but some comments are in order. First, we acknowledge that the seasonal random intercept captures the majority of the variability in the data. Second, the functional component, which is meant to capture the overall variability in the data set, reduces to capturing the intra-seasonal variability. Interestingly, the number of non-local bases selected by the adaptive procedure in Bhattacharya and Dunson (2011) is exactly equal to the number of seasons in the data set. This effect seems to be consistent with the choice of degrees of freedom, which limits the support of each spline to a unique season. Finally, the effect of covariates is very small in magnitude. Note that, because sex is time-constant, in the estimated trajectory we expect to recognise the contribution of age as a linear trend and the environmental effect as a diversification of summer and winter performances.
In Figure 5 we underline the effect of the three additive components of our functional model.
The first panel (top-left) displays the observed data and the estimated trajectory for athlete 226.
The top-right panel shows the seasonal contribution (i.e., an estimate of μ_is for i = 226 and s = 1, ..., S_226). The third panel (bottom-left) reports the functional contribution (i.e., an estimate of f_226(t)). Finally, the bottom-right panel shows the effect of covariates. As anticipated, within the trajectory estimate of a single athlete, we only recognise a performance drop predicted during winter months (negative effect of indoor environment on performances) in the bottom-right panel.
The tight credible intervals for this component indicate that the estimates, despite being small, are significant, as we will discuss in further detail in the next Section. The Bayesian latent factor methodology was originally developed for very sparse longitudinal data, with the purpose of capturing a global trend in subject-specific trajectories. We balanced the model with the requirement of smoothness by using a B-spline basis system and adding a seasonal random intercept. However, it is evident that the latter explains the majority of the variability in the dataset. Therefore, it might be worth considering a functional basis that better captures the intra-seasonal variability. Further, we observed that the contribution of the regressive component is consistent across the various modelling choices.
Finally, we looked at the effect of doping on results. Despite not being significant, the negative effect seems to suggest that performances of doped athletes are less variable. This is in line with previous literature suggesting that doping is more likely used to enhance performance in periods of decreasing fitness than to consolidate already good performances, which are generally exposed to strict controls. We think this aspect deserves further investigation, considering, for instance, more specific modelling techniques and a less imbalanced data set.
In conclusion, the attempt to extend existing tools of functional statistics to the modelling of (shot put) performance data seems promising because of the adaptability of these methodologies to all sorts of longitudinal performance data in measurable sports.
whereas the terms in the joint sampling distribution depending on μ_i,s are expressed in terms of partial residuals, a convenient formulation, especially from the implementation point of view.

A.2.1 Update m
The posterior distribution of m is based on the GARCH model under the assumption that the conditional variances {h_i,s} are fixed, known and given by Eq. (9). In this case, the likelihood function is a product of Normal densities with season-specific variances h_i,s. It is crucial here that the h_i,s are obtained by means of Eq. (9), using the previous realisations of the GARCH coefficients from the MH sampler.

A.2.2 Update α
To generate samples of α we cannot exploit conjugacy, therefore we would like to rely on a normal proposal distribution. Indeed, symmetry of the proposal distribution significantly eases the computation of acceptance probabilities in Metropolis algorithms. Unfortunately, the parameters are subject to a non-negativity constraint, which forces us to apply a bidimensional transformation from the positive quadrant to the real plane: say (θ_0, θ_1) = (log(α_0), log(α_1)). After transformation of the current parameter value, we propose a new sample employing a random walk sampler.
That is, a proposal of the form θ* = θ + ε, where ε is a multivariate normal random vector with covariance matrix ζ, defined according to the Adaptive Scaling Within Adaptive Metropolis (ASWAM) approach building on Haario et al. (2001). In this approach, the covariance matrix of the proposal is adapted to the covariance matrix of the target density, and the scale parameter is updated to achieve an average acceptance rate of 0.234 (proven to be optimal in various scenarios).
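One step of this scheme can be sketched as follows (a simplified variant: we adapt only a global scale via a Robbins-Monro recursion and keep the proposal covariance fixed, whereas the full algorithm also adapts the covariance; the Jacobian of the log transform is assumed folded into the target):

```python
import numpy as np

rng = np.random.default_rng(2)

def adaptive_rw_step(theta, log_target, cov, log_scale, g, target_rate=0.234):
    """One adaptive-scaling random-walk Metropolis step on the log-transformed
    parameters theta = log(alpha) (a sketch, not the paper's implementation)."""
    prop = theta + np.exp(log_scale / 2) * rng.multivariate_normal(
        np.zeros(len(theta)), cov)
    log_acc = min(0.0, log_target(prop) - log_target(theta))  # symmetric proposal
    if np.log(rng.uniform()) < log_acc:
        theta = prop
    # Robbins-Monro update of the global scale toward the optimal rate
    log_scale += g ** -0.6 * (np.exp(log_acc) - target_rate)
    return theta, log_scale

theta, log_scale = np.zeros(2), 0.0
log_target = lambda th: -0.5 * np.sum(th ** 2)   # toy standard-normal target
for g in range(1, 200):
    theta, log_scale = adaptive_rw_step(theta, log_target, np.eye(2), log_scale, g)
assert np.all(np.isfinite(theta)) and np.isfinite(log_scale)
```

The accepted θ is mapped back to the positive quadrant via α = exp(θ).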
If accepted, the actual parameter is retrieved by applying the inverse transform to θ. As discussed above, the acceptance probability does not depend on the proposal distribution. Therefore:

Figure 2 :
Figure 2: Left: total number of athletes per season. Right: each boxplot shows the distribution of the athletes' mean performances within each season.

Figure 3 :
Figure 3: Performance results displayed according to the variable environment, which takes values indoor (black) and outdoor (gray). Vertical lines corresponding to the label ticks represent season changes, whereas the two lines enclosing each of them indicate the boundaries of the winter months.

Figure 4 :
Figure 4: Performance trajectory estimates for a random selection of athletes. The x-axis denotes the time measured in days from January 1st of the first season of the career, whereas the y-axis reports the length of the throw in meters. Vertical lines represent calendar years (seasons in our notation). The final part of each trajectory (grey), for which no observations are available, represents one-season-ahead performance prediction.

Figure 5 :
Figure 5: Single contributions to the whole additive model as in Equation (12). The first panel is the complete additive model, whereas the second (top-right) displays the estimate of the seasonal random intercept. The third panel (bottom-left) represents the functional contribution, while the bottom-right panel displays the regressive component.

Table 1 :
Performance results conditioned on covariates.

Table 2 :
Mathematical notation.
n_is: number of observations in season s for athlete i
r: number of covariates to be considered
y_ij: response variable at time j for athlete i

Table 3 :
Models name and description.

Table 4 :
Hyperparameter choices. In the first column, we refer to the Equation where the hyperparameter appears.

Table 6 :
Posterior mean estimates of the regression coefficients for model M1 (Table 3), together with the standard deviation of their estimates, effective sample size (ESS) with respect to 1600 retained samples, and 95% posterior credible bounds.
In this work, we proposed a hierarchical Bayesian model for the analysis of athletes' performances in a longitudinal context. Following Montagna and Hopker (2018), we proposed a smooth functional contribution for explaining the overall variability in the data set. The functions are represented by means of a high-dimensional set of pre-specified basis functions, and a factor model on the basis coefficients ensures dimensionality reduction. We enriched the model by allowing for time-dependent covariates to affect estimates through a regressive component. Finally, we addressed the issue of seasonal gathering of sports data by introducing a mixed effect model with GARCH errors, which provides evolving random intercepts over different time intervals in the data set. While the motivation for our work comes from the analysis of shot put performance data, the methodology presented in this work is applicable to the analysis of performance data collected in all measurable sports.

Table 9 :
Posterior mean estimates of the regression coefficients for model M6 (Table 3), together with the standard deviation of their estimates, effective sample size (ESS), and 95% posterior credible bounds.