Bayesian isotonic logistic regression via constrained splines: an application to estimating the serve advantage in professional tennis

Montagna, Silvia; Orani, Vanessa; Argiento, Raffaele

doi:10.1007/s10260-020-00535-5

Bayesian isotonic logistic regression via constrained splines: an application to estimating the serve advantage in professional tennis

Original Paper
Published: 08 July 2020

Volume 30, pages 573–604, (2021)
Cite this article

Statistical Methods & Applications Aims and scope Submit manuscript

340 Accesses
3 Altmetric
Explore all metrics

Abstract

In professional tennis, it is often acknowledged that the server has an initial advantage. Indeed, the majority of points are won by the server, making the serve one of the most important elements in this sport. In this paper, we focus on the role of the serve advantage in winning a point as a function of the rally length. We propose a Bayesian isotonic logistic regression model for the probability of winning a point on serve. In particular, we decompose the logit of the probability of winning via a linear combination of B-splines basis functions, with athlete-specific basis function coefficients. Further, we ensure the serve advantage decreases with rally length by imposing constraints on the spline coefficients. We also consider the rally ability of each player, and study how the different types of court may impact on the player’s rally ability. We apply our methodology to a Grand Slam singles matches dataset.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

The analysis of serve decisions in tennis using Bayesian hierarchical models

Article 14 January 2022

Peter Tea & Tim B. Swartz

A hierarchical approach for evaluating athlete performance with an application in elite basketball

Article Open access 19 January 2024

Thiago de Paula Oliveira & John Newell

Estimation of player aging curves using regression and imputation

Article 13 January 2023

Michael Schuckers, Michael Lopez & Brian Macdonald

Notes

https://github.com/JeffSackmann.

References

Abraham C, Khadraoui K (2015) Bayesian regression with b-splines under combinations of shape constraints and smoothness properties. Stat Neerlandica 69(2):150–170
Article MathSciNet Google Scholar
Baio G, Blangiardo M (2010) Bayesian hierarchical model for the prediction of football results. J Appl Stat 69:150–170
MathSciNet MATH Google Scholar
Barnett T, Clarke SR (2005) Combining player statistics to predict outcomes of tennis matches. IMA J Manag Math 16:113–120
Article MathSciNet Google Scholar
Bradley RA, Terry ME (1952) Rank analysis of incomplete block designs: the method of paired comparisons. Biometrika 39:324–345
MathSciNet MATH Google Scholar
Carpenter B, Gelman A, Hoffman M, Lee D, Goodrich B, Betancourt M, Brubaker M, Guo J, Li P, Riddell A (2017) Stan: a probabilistic programming language. J Stat Soft 76(1):1–32
Article Google Scholar
Christensen R, Johnson W, Branscum A, Hanson TE (2010) Bayesian ideas and data analysis. RCR Press, Baco Raton
Book Google Scholar
de Boor C (2001) A practical guide to splines. Springer, Berlin
MATH Google Scholar
Elphinstone C (1983) A target distribution model for nonparametric density estimation. Commun Stat Theory Methods 12(2):161–198
Article MathSciNet Google Scholar
Geisser S, Eddy WF (1979) A predictive approach to model selection. J Am Stat Assoc 3:153–160
Article MathSciNet Google Scholar
Gelfand AE, Dey DK (1994) Bayesian model choice: asymptotics and exact calculations. J R Stat Soc Ser B (Methodological) 56(3):501–514
MathSciNet MATH Google Scholar
Gelfand AE, Sahu SK (1999) Identifiability, improper priors, and gibbs sampling for generalized linear models. J Am Stat Assoc 94(445):247–253
Article MathSciNet Google Scholar
Gelman A, Goegebeur Y, Tuerlinckx F, Mechelen I (2000) Diagnostic checks for discrete data regression models using posterior predictive simulations. J R Stat Soc Ser C 49:247–268
Article MathSciNet Google Scholar
Gelman A, Hwang J, Vehtari A (2014) Understanding predictive information criteria for bayesian models. Stat Comput 24:997–1016
Article MathSciNet Google Scholar
Gilsdorf KF, Sukhatme VA (2008) Testing rosen’s sequential elimination tournament model: incentives and player performance in professional tennis. J Sport Econ 9:287–303
Article Google Scholar
Glickman ME (1999) Parameter estimation in large dynamic paired comparison experiments. J R Stat Soc Ser C (Appl Stat) 48:377–394
Article Google Scholar
Gomes RV, Coutts AJ, Viveiros L, Aoki MS (2011) Physiological demands of match-play in elite tennis: a case study. Eur J Sport Sci 11(2):105–109
Article Google Scholar
Gorgi P, Koopman SJS, Lit R (2018) The analysis and forecasting of ATP tennis matches using a high-dimensional dynamic model. Tinbergen Institute Discussion Papers 18-009/III, Tinbergen Institute
Heinzmann D (2008) A filtered polynomial approach to density estimation. Comput Stat 23(3):343–360
Article MathSciNet Google Scholar
Irons DJ, Buckley S, Paulden T (2014) Developing an improved tennis ranking system. J Quant Anal Sports 10:2
Google Scholar
Klaassen FJ, Magnus JR (2003) Forecasting the winner of a tennis match. Eur J Oper Res 148:257–267
Article Google Scholar
Kotze J, Mitchell SJ, Rothberg S (2000) The role of the racket in high-speed tennis serves. Sports Eng 3:67–84
Article Google Scholar
Kovalchik S (2016) Searching for the goat of tennis win prediction. J Quant Anal Sports 12:127–138
Google Scholar
Kovalchik S (2018a) deuce: resources for analysis of professional tennis data. R Pack Vers 1:2
Google Scholar
Kovalchik S (2018b) The serve advantage in professional tennis (under Review)
Lees A (2003) Science and the major racket sports: a review. J Sports Sci 2:707–732
Article Google Scholar
Lunn D, Jackson C, Best N, Thomas DA Spiegelhalter (2012) The BUGS boo: a practical introduction to bayesian analysis. Texts in statistical science. Chapman & Hall/CRC, Boca Raton
Book Google Scholar
Manderson AA, Cripps E, Murray K, Turlach BA (2017) Monotone polynomials using bugs and stan. Aust N Zeal J Stat 59(4):353–370
Article MathSciNet Google Scholar
Murray K, Müller S, Turlach BA (2013) Revisiting fitting monotone polynomials to data. Comput Stat 28:1989–2005
Article MathSciNet Google Scholar
Murray K, Müller S, Turlach BA (2016) Fast and flexible methods for monotone polynomial fitting. J Stati Comput Simul 86(15):2946–2966
Article MathSciNet Google Scholar
Newton PK, Keller JB (2005) Probability of winning at tennis i. theory and data. Stud Appl Math 114(3):241–269
Article MathSciNet Google Scholar
O’Donoghue GP, Brown E (2008) The importance of service in grand slam singles tennis. Int J Perf Anal Sport 8(3):70–78
Google Scholar
Orani V (2019) Bayesian isotonic logistic regression via constrained splines: an application to estimating the serve advantage in professional tennis. Master’s thesis, University of Torino
Plummer M, Best N, Cowles K, Vines K (2006) Coda: convergence diagnosis and output analysis for mcmc. R News 6(1):7–11
Google Scholar
Plummer M, Stukalov A, Denwood MJ (2016) rjags: Bayesian Graphical Models Using MCMC
R Core Team (2013) R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria
Robert CP (1996) Intrinsic losses. Theory Decis 40:191–214
Article MathSciNet Google Scholar
Smith S (2013) Effects of playing surface and gender on rally durations in singles grand slam tennis. PhD thesis, Cardiff Metropolitan University
Spiegelhalter DJ, Best NG, Carlin BP, van der Linde A (2002) Bayesian measures of model complexity and fit. J R Stat Soc 64:583–639
Article MathSciNet Google Scholar
Starbuck C, Damm L, Clarke J, Carré M, Capel-Davis J, Miller S, Stiles V, Dixon S (2016) The influence of tennis court surfaces on player perceptions and biomechanical response. J Sports Sci 34(17):1627–1636
Article Google Scholar
Watanabe S (2010) Asymptotic equivalence of bayes cross validation and widely applicable information criterion in singular learning theory. J Mach Learn Res 11:3571–3594
MathSciNet MATH Google Scholar

Download references

Author information

Authors and Affiliations

ESOMAS Department, University of Turin, C.so Unione Sovietica 218/bis, 10134, Turin, Italy
Silvia Montagna
IEIIT, National Research Council of Italy, Via De Marini 6, 16149, Genoa, Italy
Vanessa Orani
Department of Statistical Sciences, Università Cattolica del Sacro Cuore, Largo Gemelli 1, 20123, Milan, Italy
Raffaele Argiento
Collegio Carlo Alberto, Piazza Vincenzo Arbarello 8, 10122, Turin, Italy
Silvia Montagna & Raffaele Argiento

Authors

Silvia Montagna
View author publications
You can also search for this author in PubMed Google Scholar
Vanessa Orani
View author publications
You can also search for this author in PubMed Google Scholar
Raffaele Argiento
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Silvia Montagna.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1: Proof of Proposition 1

Consider the restricted spline function

$$\begin{aligned} f_{L_0,U}(s)= \sum _{m=m_{L_0}-k}^{M}\beta _m b_{m,k}(s), \end{aligned}$$

where k is the order of the B-splines. Following Formula (12) on page 116 and Formula (13) of de Boor (2001), we can compute its derivative as

$$\begin{aligned} f'_{L_0,U}(s)&= (k-1)\sum _{m=m_{L_0}-k+1}^{M}\frac{\beta _m-\beta _{m-1}}{t_{m+k-1}-t_m} b_{m,k-1}(s)\\&\le (k-1) \sup _{m\in \{m_{L_0}-k+1,\dots ,M\}}\left\{ \frac{\beta _m-\beta _{m-1}}{t_{m+k-1}-t_m} \right\} \sum _{m=m_{L_0}-k+1}^{M} b_{m,k-1}(s)\\&= (k-1) \sup _{m\in \{m_{L_0}-k+1,\dots ,M\}}\left\{ \frac{\beta _m-\beta _{m-1}}{t_{m+k-1}-t_m} \right\} \end{aligned}$$

where we used the fact that that $\sum _{m=m_{L_0}-k+1}^{M} b_{m,k-1}(s)=1$. The latter follows from Formula (37) on page 96 of de Boor (2001), and from the fact that $b_{m,k-1}(s)=0$ for $s\in [L_0,U]$ and $m\notin \{m=m_{L_0}-k+1,\dots ,M\}$. It is straightforward to observe that the constant $\sup _{m\in \{m_{L_0}-k+1,\dots ,M\}}\left\{ \frac{\beta _m-\beta _{m-1}}{t_{m+k-1}-t_m} \right\} $ is smaller or equal than zero if and only if

$$\begin{aligned} \beta _m\le \beta _{m+1}\quad \text {for each}\quad m\in \{m_{L_0}+k-1\} \end{aligned}$$

The latter property is equivalent to requiring that the restricted control polygon ${\mathcal {C}}_{[L_0,U]}(s)$ is not increasing on its support, i.e. for $s\in [{\bar{t}}_{m_{L_0}-k},{\bar{t}}_M]$.

Appendix 2: Rally abilities on different types of courts

See Table 5

Table 5 Credible intervals for the rally abilities of the top players for the ATP and the WTA tournaments

Full size table

Appendix 3: Predictive information criteria

In this “Appendix” we provide additional details and equations for the predictive information criteria used in Sect. 4. The interested reader can refer to Gelman et al. (2014) for a more extensive review.

Let us begin with the Log Pseudo Marginal Likelihood (LPML) criterion (Geisser and Eddy 1979). Let ${M_m}$ be a particular model, and suppose data ${\textit{\textbf{y}}} = (y_1,\cdots , y_n)^\top $ arise independently given a vector of model parameters $\varvec{\theta }_m$ so that $p({\textit{\textbf{y}}}\vert \varvec{\theta }_m,{M_m}) = \prod _{i=1}^{n}{p(y_i|\varvec{\theta }_m,M_m)}$. To compare the performance of different models $M_1, \ldots , M_M$ via Bayes factors, we need to compute $p({\textit{\textbf{y}}}|{M_m})$ for the various models under consideration, but this density may be difficult to compute. Geisser and Eddy (1979) propose to replace $p({\textit{\textbf{y}}}|{M_m})$ with the pseudomarginal likelihood:

$$\begin{aligned} p({\textit{\textbf{y}}}\vert {M_m}) = \prod _{i=1}^n p(y_i\vert {\textit{\textbf{y}}}_{-i},{M_m}), \end{aligned}$$

where $p(y_i|{\textit{\textbf{y}}}_{-i},{M_m})$ is the i-th conditional predictive ordinate ($\text {CPO}_i$), that is, the predictive density of the i-th observation given all the other data, ${\textit{\textbf{y}}}_{-i}$. Let us focus on a particular model $M_m$ and thus, for ease of notation, let us omit the conditioning to model $M_m$ and denote ${\varvec{\theta }}_m = {\varvec{\theta }}$ hereafter. Christensen et al. (2010) show that the i-th CPO can be expressed as $\text {CPO}_i^{-1}= \mathbb {E}\left[ \frac{1}{p(y_i \mid \varvec{\theta })} \Big \vert y_1,\ldots , y_n \right] $. By resorting to a Monte Carlo approximation, we can compute the i-th CPO directly from the MCMC output based on fitting the model to all the data (Gelfand and Dey 1994). Specifically, for each i:

$$\begin{aligned} \widehat{\text {CPO}}_i^{-1} = \frac{1}{T}\sum _{t=1}^T \frac{1}{p\left( y_i \mid \varvec{\theta }^{(t)}\right) }, \end{aligned}$$

(14)

where T denotes the total number of iterations, and $\left\{ {\varvec{\theta }}^{(t)}\right\} _{t=1}^T$ are samples drawn from the posterior distribution of the model parameters $(\varvec{\theta } \vert {y_1},\ldots , y_n)$. Geisser and Eddy (1979) summarise a model’s predictive performance via the LPML, given by

$$\begin{aligned} \widehat{\text {LPML} }=\sum _{i=1}^{n} {\log (\widehat{\text {CPO}}_i)}. \end{aligned}$$

(15)

The best model is the one that maximises the LPML coefficient. Similarly, we can use the MCMC posterior samples to obtain an estimate of the expected log-pointwise posterior predictive density ($\widehat{\text {elppd}}_{\text {waic}}$):

$$\begin{aligned} \widehat{\text {elppd}}_{\text {waic}} = \sum _{i=1}^{n} \log \left[ \frac{1}{T}\sum _{t=1}^{T}p\left( y_i|{\varvec{\theta }}^{(t)}\right) \right] - \sum _{i=1}^{n} \text {Var}_{t=1}^T \left[ \log p\left( y_i|{\varvec{\theta }}^{(t)}\right) \right] , \end{aligned}$$

(16)

where $p\left( y_i|{\varvec{\theta }}^{(t)}\right) $ is the posterior predictive distribution for individual i given model parameters ${\varvec{\theta }}^{(t)}$ and $\text {Var}_{t=1}^T \left[ \log p\left( y_i|{\varvec{\theta }}^{(t)}\right) \right] $ represents the sample variance of the log posterior predictive distribution for each data point $y_i$. The expected log predictive density is a common measure of a model’s predictive accuracy due to its relation to the Kullback-Leibler information measure (Robert 1996). Indeed, asymptotically, the model with the lowest Kullback-Leibler information (equivalently, the highest expected log predictive density) will have the highest posterior probability. Therefore, it can be used as a criterion to compare models in terms of their predictive perfomance. The first term in Eq. (16) is the computed log pointwise posterior predictive density on the observed data $y_i$’s, which is typically an over-estimate of the expected log pointwise predictive density for future data (Gelman et al. 2014). The second term in Eq. (16) can be thought of as correction for the effective number of parameters to adjust for overfitting. The Watanabe–Akaike information criterion (WAIC) is then defined as:

$$\begin{aligned} \text {WAIC} =-2 (\widehat{\text {elppd}}_{\text {waic}}) \end{aligned}$$

(17)

The best model is the one that minimises the WAIC. We remark that Eq. 17 does not match the original definition for the WAIC in Watanabe (2010). Eq. 17 corresponds to the definition for the WAIC given in Gelman et al. (2014), who suggest scaling the criterion to put it on the deviance scale, thus making it comparable to the Deviance Information Criterion (DIC) (Spiegelhalter et al. 2002). The DIC is itself the sum of a measure of model fit and a penalty, $p_{DIC}$, representing the effective number of parameters in the model:

$$\begin{aligned} \text {DIC} = -2 \sum _{i=1}^n\log p\left( y_i \mid {\widehat{{\varvec{\theta }}}}\right) + 2 p_{DIC}, \end{aligned}$$

(18)

where ${\widehat{{\varvec{\theta }}}}$ is the posterior mean of ${\varvec{\theta }}$, ${\widehat{{\varvec{\theta }}}} = \mathbb {E}[{\varvec{\theta }}\mid {\textit{\textbf{y}}}]$, and

$$\begin{aligned} p_{DIC} = 2\sum _{i=1}^n\left[ \log p\left( y_i \mid {\widehat{{\varvec{\theta }}}}\right) - \frac{1}{T}\sum _{t=1}^T \log p\left( y_i \mid {\varvec{\theta }}^{(t)}\right) \right] . \end{aligned}$$

The smallest DIC identifies the model with the best predictive performance, in the same spirit of Akaike’s criterion.

Finally, a popular measure of a model’s predictive accuracy (not based on the log predictive density) is the Root Mean Squared Error (RMSE). Denoting with $n_s$ the total number of servers, the RMSE of the estimated posterior probability of winning on serve to the truth is computed as:

$$\begin{aligned} \text {RMSE} = \sqrt{\sum _{i=1}^{n_s}\frac{1}{n_s}\left[ \sum _{x=1}^{15}\frac{1}{15}\left( \sum _{t=1}^T\frac{\left( p_i(x) - {\widehat{p}}^{(t)}_i(x)\right) ^2}{T} \right) \right] }. \end{aligned}$$

(19)

Here $p_i(x)$ is the observed frequency of wins on serve given rally length x for server i, and ${\widehat{p}}^{(t)}_i(x)$ is the estimated probability at iteration t, where logit ${\widehat{p}}^{(t)}_i(x) = \sum _{m=1}^M \beta _{i,m}^{(t)} b_m(x)+ (\alpha _i^{(t)} - \alpha _j^{(t)})$.

Appendix 4: Convergence diagnostics and model checks

Valid inference from sequences of MCMC samples rely on the assumption that samples are drawn from the true posterior distribution of the model parameters. Asymptotic theory guarantees the validity of this assumption as the number of iterations approaches infinity. A significant task is to determine the minimum number of samples required to ensure a reasonable approximation to the target posterior density. After some tests and convergence diagnostics about the best setting for the parameters of our model, a good balance between convergence of the chain and computation time was achieved by running our algorithm for 20000 iterations, with a burn-in of 100 iterations and thinning every twenty to reduce autocorrelation within the chain.

In our application, we are interested in estimating the conditional probability of winning the point on serve as a function of rally length. Thus, here we report convergence diagnostics for our quantity of interest in our final model (Sect. 4). In particular, at each iteration $t = 1,\ldots , T$, we estimate the probability of winning on serve, ${\widehat{p}}_{i}(x)$, given a particular value of rally length $x \in \{1,\ldots ,15\}$ for a selected player i as

$$\begin{aligned} {\widehat{p}}_i(x)^{(t)} = \text {logit}^{-1}\left( \sum _{m=1}^M {\beta _{i m}^{(t)}b_m(x)+ \alpha _i^{(t)}} - {\bar{\alpha }}_j^{(t)}\right) , \end{aligned}$$

where $\{\beta _{i m}^{(t)}\}_{m=1}^M$ and $\alpha _i^{(t)}$ are, respectively, the estimate at iteration t of the basis function coefficients and rally ability for player i, and ${\bar{\alpha }}_j$ is obtained by averaging all the $\alpha _j^{(t)}$, with $j\ne i$. Figure 11 shows traceplots and autocorrelation functions of ${\widehat{p}}_i(x)^{(t)}$ for randomly selected players at randomly selected rally lengths. We also report the effective sample size, which indicates the approximate effective number of independent simulations draws for our estimand of interest. It appears that our chain converged quickly and mixed efficiently.

For a general assessment of model fit, we compare the discrete data with their predicted value by computing the realised residuals $r = y - {\mathbb {E}}[y | x, \beta , \alpha ]$, as described in Gelman et al. (2000). For our model, the realised residuals for player i at rally length x are defined and computed at each iteration t as follows:

$$\begin{aligned} r_{i}^{(t)}(x) = y_{i}(x) - p_{i}(x)^{(t)} = y_i(x) - \text {logit}^{-1}\left( \sum _{m=1}^M {\beta _{i m}^{(t)}b_m(x)+ \alpha _i^{(t)}} - {\bar{\alpha }}_j^{(t)}\right) , \end{aligned}$$

For each value of rally length, we then compute the mean residuals for player i as ${\bar{r}}_{i}(x) = \sum _{t = B + 1}^{T} r_{i}^{(t)}(x)$, where the mean is computed over the post burn-in (B) iterations. Figure 12 shows the mean residuals for each player in the training set. The mean resalised residuals appear to be symmetric around zero and, as the value of rally length increases, we observe larger variance of the residuals. This result is not surprising and is due to data sparsity at larger rally lengths (Fig. 2), thus more uncertainty in the estimates. Consequently, this strengthens the need for an approach that puts some sort of regularity on the estimate of $p_i(x)$. Our monotonicity constraints ensures the estimated probability can not change wildly at large rally lengths and, therefore, ensures the estimate is less affected by outlying observations.

Appendix 5: Bayesian polynomial regression (BPR)

In this “Appendix”, we consider polynomial regression to model the probability of winning a point as a function of rally length. We assume

$$\begin{aligned} Y_{i,j}|p_{i,j}(x) \sim \text {Bernoulli}(p_{i,j}(x)), \quad x\in X \end{aligned}$$

(20)

where $p_{i,j}(x)$ is the probability that server i wins a point against receiver j at rally length x, e.g ${\mathbb {P}}[Y_{i,j}=1\vert x]$. Analogously to the splines model in Sect. 3, we consider two components for the modelling of the logit of $p_{i,j}(x)$, one describing the serve advantage and the second representing the rally ability of the players. Specifically, we model the logit of $p_{i,j}(x)$ as follows:

$$\begin{aligned} \text {logit }{\mathbb {P}}(Y_{i,j}=1 \vert x) = \text {logit} \left[ p_{i,j}(x) \right] = f_i(x) + (\alpha _i -\alpha _j), \quad x \in X \end{aligned}$$

(21)

with

$$\begin{aligned} f_i(s) = \sum _{m=0}^{d}\beta _{i,m} s^{m}, \quad s \in [L, U] \subset {\mathbb {R}} \end{aligned}$$

(22)

for $i= 1,\dots , n_s,$ $j = 1,\dots , n_r$ and $i \ne j$, where d is the degree of the polynomial, $n_s$ is the total number of servers, and$n_r$ the total number of receivers.

We fit four Bayesian polynomial regression models to the Grand slam data (Sect. 2) by varying the degree of the polynomials. In particular, we considered $d = 3, 4, 5, 6$, and we adopted the prior in Eq. (6) for the basis function coefficients $\{\beta _m\}_{m=0}^d$ (no constraints) and prior in Eq. (10) for the rally ability parameters. In Fig. 13 we represent the probability of winning a point as a function of rally length for Andy Murray, with the serve advantage modelled as a polynomial function with varying degrees. While the model fit is reasonable for small values of rally length, with all polynomial regressions capturing a decreasing trend, for intermediate or large values of rally lengths the estimated probability oscillates or even increases. The cubic polynomial model also captures the decreasing behaviour at the end, even though in the interval $\left[ 7,10\right] $ the estimate locally increases. Polynomial models of higher degrees appear to be affected by outliers, with the estimated probability of winning the point showing a strange increasing behaviour for large x.

Appendix 6: Players ID for Figs. 8 and 9

In this “Appendix”, we report the legend associating each number in Figs. 8 and 9 to the corresponding player in the ATP and WTA tournaments, respectively.

ID	ATP	WTA
1	Alexander Bublik	Alison Van Uytvanck
2	Andy Murray	Alize Cornet
3	Benjamin Becker	Angelique Kerber
4	Bernard Tomic	Arina Rodionova
5	Damir Dzumhur	Bethanie Mattek Sands
6	David Ferrer	Camila Giorgi
7	Ernesto Escobedo	Caroline Wozniacki
8	Ernests Gulbis	CoCo Vandeweghe
9	Evgeny Donskoy	Dominika Cibulkova
10	Fernando Verdasco	Ekaterina Alexandrova
11	Frances Tiafoe	Francesca Schiavone
12	Gilles Muller	Heather Watson
13	Guillermo Garcia Lopez	Jelena Jankovic
14	Ivo Karlovic	Johanna Konta
15	Janko Tipsarevic	Kirsten Flipkens
16	Jo Wilfried Tsonga	Kristina Kucova
17	Kei Nishikori	Louisa Chirico
18	Malek Jaziri	Madison Keys
19	Marco Trungelliti	Magda Linette
20	Marin Cilic	Mandy Minella
21	Mischa Zverev	Mirjana Lucic Baroni
22	Nick Kyrgios	Petra Martic
23	Novak Djokovic	Risa Ozaki
24	Pablo Carreno Busta	Saisai Zheng
25	Radek Stepanek	Serena Williams
26	Rafael Nadal	Simona Halep
27	Roger Federer	Sorana Cirstea
28	Rogerio Dutra Silva	Svetlana Kuznetsova
29	Ruben Bemelmans	Tsvetana Pironkova
30	Steve Johnson	Venus Williams
31	Teymuraz Gabashvili	Victoria Azarenka
32	Victor Estrella Burgos	Yaroslava Shvedova
33	–	Zarina Diyas

Rights and permissions

Reprints and permissions

About this article

Cite this article

Montagna, S., Orani, V. & Argiento, R. Bayesian isotonic logistic regression via constrained splines: an application to estimating the serve advantage in professional tennis. Stat Methods Appl 30, 573–604 (2021). https://doi.org/10.1007/s10260-020-00535-5

Download citation

Accepted: 22 June 2020
Published: 08 July 2020
Issue Date: June 2021
DOI: https://doi.org/10.1007/s10260-020-00535-5

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Bayesian isotonic logistic regression via constrained splines: an application to estimating the serve advantage in professional tennis

Abstract

Access this article

Similar content being viewed by others

The analysis of serve decisions in tennis using Bayesian hierarchical models

A hierarchical approach for evaluating athlete performance with an application in elite basketball

Estimation of player aging curves using regression and imputation

Notes

References