Who can receive the pass? A computational model for quantifying availability in soccer

The paper presents a computational approach to the Availability of soccer players. Availability is defined as the probability that a pass reaches the target player without being intercepted by opponents. A computational model for this probability builds on models for ball dynamics, player movements, and the technical skills of the pass giver. Our approach aggregates these quantities over all possible passes to the target player to compute a single Availability value. Empirically, our approach outperforms state-of-the-art competitors on data from 58 professional soccer matches. Moreover, our experiments indicate that the model can even outperform soccer coaches in assessing the availability of soccer players from static images.

and try to create chances for scoring. The quality of passing affects the team's success and represents an important category for evaluating collective and individual match performances (Hughes and Franks 2005). Simple indicators, such as the number of completed passes, are not very meaningful when it comes to assessing passing skills (Mackenzie and Cushion 2012). A simple backward pass to an unmarked teammate has to be assessed differently than a long through ball between the defending lines, which may lead to a scoring opportunity. Therefore, research suggests using more advanced metrics, e.g., the risk of a pass in relation to its potential effort (Goes et al. 2019; Power et al. 2017; Bransen et al. 2019b).
When coaches analyze match performance, they use, explicitly or implicitly, the concept of Availability. For this paper, we define Availability as the probability with which a selected player can receive the ball from a teammate. Availability is related to the risk of a pass but does not refer to a specific pass only. Instead, it asks for the aggregated success probability of all (hypothetical) passes that could be received in a worthwhile zone. This concept is highly useful for understanding the performance of the passing player as well as that of the receiver. The passing player has to choose from several target players (Steiner 2018) and has to adjust the kicking movement precisely with regard to the ball's trajectory and speed. The Availability of the teammates is an important factor for assessing both the quality of the decision for a target player and the motor-technical skills for passing. On the other hand, Availability is also important for evaluating the tactical behavior of potential pass receivers. In elite soccer, players have to move in an unpredictable and dynamic way to create free spaces in which they can receive the ball (Bangsbo and Peitersen 2004). If players fail to create this space, the tactical options in the attacking game are limited and the defending team can be more successful. Against this background, an objective quantification of Availability would be very useful for match analysis in soccer.
This paper presents a probabilistic machine-learning approach for computing Availability based on spatiotemporal tracking data. From a computational perspective, quantifying Availability is a challenging problem, since there are many factors to consider: (i) players are constantly in motion and may receive the ball at many different positions on the pitch, (ii) there are many possible trajectories of the ball to the same position, resulting in different transition times of the ball, (iii) opponents may intercept the ball, (iv) the technical skill level of the pass giver influences the execution of the pass, and (v) the time for controlling the ball influences the successful completion of the pass. Our approach addresses all these aspects: for every moment t, we compute the probabilities with which player r can receive a pass at position x, played with multiple speeds. We then aggregate those probabilities over many x into an overall probability of a successful pass, which we call the Availability of r at moment t. Figure 1 shows an example and the corresponding availability of the receiver, which serves as a running example for the rest of the paper.
The remainder is organized as follows. Section 2 reviews related work. We present our Availability model including sub-models for interception, player skills and reception in Sect. 3, followed by the sub-models for ball and player movement in Sect. 4. The experimental validation in Sect. 5 compares computed Availability scores with observed receptions and interceptions of real passes in 58 matches of the German Bundesliga. Additionally, we compare computed Availabilities to human observer ratings. Section 6 highlights different application areas and Sect. 7 concludes.

Fig. 1 Running example: Scene from a Bundesliga match where the blue player with the black circle denoting the ball passes to the blue player below the orange zone. Left: The receiving player will receive the ball at the location marked with the light black circle. The orange zone marks the area in which the player can receive the ball, where darker orange indicates higher success probability. The number denotes the overall availability of the player. Note that the dark black circle indicating the position of the ball does not align perfectly with the actual position of the ball. This is due to small errors in the provided positional data. Right: The moment of the first touch of the receiving player. Incidentally, the receiver is not able to control the ball and will lose the ball immediately. For better visibility, a larger version of this figure is shown in Appendix C

Related work
Data-driven analyses of sports, and of soccer in particular, are manifold in the literature. Existing approaches cover different aspects of the game, including tactical constructs, estimating the outcome of matches, or quantifying the probability of goal scoring; see Goes et al. (2021) for an overview. For the purpose of this paper, we focus on approaches dealing with passing and the movement of players in a narrow sense.
Estimating the likelihood of successful passes has been investigated in Spearman et al. (2017) and Peralta Alguacil et al. (2020), where the authors propose a physics-based approach to predict the time until a player can reach a certain position. The time component can be computed by solving the player's equation of motion. In Peralta Alguacil et al. (2020), for example, the model is based on Fujimura and Sugihara (2005) for solving the equation of motion and is augmented with an additional logistic distribution model to define an overall reachability. The authors also employ a physics-based ball dynamics model. Overall, the approach is similar to ours, though we show in our experiments that our fully learned player and ball dynamics predict the dynamics and, ultimately, the Availability of players more accurately. Alternative approaches for solving the equation of motion are provided in Taki et al. (1996) and Brefeld et al. (2019). Note that Peralta Alguacil et al. (2020) also identify potential runs of attacking players that maximize a combination of pass probability, pitch impact and pitch control. For an attacking player, based on the chosen combination, an optimal position is computed and compared to the actual observed position.
Other work on the analysis of passes in soccer includes Power et al. (2017), who compare the risk of a pass (the probability of an intercepted pass) with its reward, that is, the likelihood that the attacking team will take a shot at goal within 10 s after the pass. Goes et al. (2019) estimate the effectiveness of passes by measuring how much defensive players have to move and how much their defensive organization is reduced following a pass, while Bransen et al. (2019a) use event data of passes and model the reward by measuring the impact of these passes on the goal scoring probability.
Other publications rate general game states according to different measures and can also be used to rate the effectiveness of passes. Spearman (2018) develops a model that combines scoring probabilities from a certain point on the pitch with a team's control at that point and the probability that the ball will reach the point. Fernández et al. (2019) measure pitch control of teams and players as well as pitch value, which was estimated to correlate with positions that defenders aim to occupy. Recently, many publications use machine learning approaches to predict which player will receive the next pass (Vercruyssen et al. 2016; Fournier-Viger et al. 2018; Hubácek et al. 2018; Dauxais and Gautrais 2019; Li and Zhang 2019; Fernández et al. 2021). While these approaches aim to model tactical decisions of the passing player, the Availability approach presented in this paper instead asks for the success probability of such a pass.
Estimating future positions of soccer players is another aspect that has been widely investigated. A general problem when learning coordinated movements of several agents, like players in a team, is that trajectories come as unordered sets of individuals. When learning from several games incorporating different teams and players, a model has to work without a natural ordering of the agents. Le et al. (2017a, b) learn future positions of players by estimating the roles of players in a given episode and using the role assignments to predict future movements from the then ordered set of players. Other publications also use role assignments to predict future positions of players using variational recurrent neural networks (Zhan et al. 2018, 2019; Felsen et al. 2018). Yeh et al. (2019), on the other hand, study graph representations to model interactions of all agents including the ball. They leverage graph neural networks (GNNs), which are naturally suited to model coordinated behavior because of their invariance to permutations of the input, and propose a graph variational recurrent neural network to predict future positions of soccer and basketball players. Hoshen (2017) and Kipf et al. (2018) deploy graph-related attention mechanisms to learn trajectories for soccer and basketball players, respectively. GNNs have been widely used to model structured or relational data. In cases where data is sequential in nature, graph recurrent neural networks (GRNNs) have been widely used, starting e.g. with Sanchez-Gonzalez et al. (2018), who mix graph representations with recurrent layers such as gated recurrent units (GRUs, Cho et al. 2014).
Due to the complex nature of soccer players' movements, one can expect the distribution of future positions of a player to be multi-modal, which a probabilistic model that predicts future positions should reflect. Thus, (conditional) variational models (CVMs) with Gaussian emission functions are frequently deployed to account for multi-modality in the data (Zhan et al. 2018, 2019; Yeh et al. 2019; Felsen et al. 2018). However, Graves (2013) shows that combining RNNs with mixture density networks (MDNs, Bishop 1994) that use a Gaussian mixture model (GMM) as output distribution yields accurate predictive results for spatiotemporal tasks. In fact, Rudolph et al. (2020) provide empirical evidence that combining GMM emissions with recurrent graph networks works on par with using more complex CVMs.

Preliminaries
Our approach uses spatiotemporal data including the xy positions of the players and the xyz position of the ball. Data is recorded at 25 Hz by a semiautomatic optical tracking system (TRACAB®, ChyronHego), which consists of up to 24 cameras around the pitch. The system uses computer vision methods to detect objects in the video stream. Tracking losses and identity swaps are corrected manually after the matches. We also make use of a manually logged ball status flag, which indicates whether the ball is in play or not. The data is provided by the German Professional Soccer League (Bundesliga) for the purpose of this paper. The reliability and validity of the system for measuring soccer-specific movements is verified in Linke et al. (2020).
Computational analyses that build upon tracking data, that is, sequences of player positions and movement directions, require some kind of formal representation of that data. To not clutter notation unnecessarily, we informally define the state of player $i$ at time $T$ by $S^i_T$. The state $S^i_T$ contains the player's position in $xy$-coordinates and velocity, as well as team and ball possession indicators. Superscript 0 is reserved to index the state of the ball $S^0_T$ with its position, velocity, and additionally its $z$-coordinate at time $T$. We sometimes aggregate the states of all players and the ball at time $T$, denoted by $S^{0:N}_T$, where usually $N = 22$, as well as time windows of interest by $S^i_{T_1:T_2}$. For simplicity, we define $S$ as the entire history of states of the game until the current point in time.
The goal of this paper is to establish a model for Availability. In other words, we aim to devise a model that computes the likelihood that a pass, irrespective of whether it is a footed or a headed pass, played from position $b = [b_x, b_y, b_z]$ with initial direction $a = [a_x, a_y, a_z]$ and speed $\lVert a \rVert$, can be reached by the intended receiver $r$ without being intercepted by any opposing player. However, there are many passes with (slightly) different directions or velocities that may reach the target player. A solution thus needs to compute likelihoods for all possible passes to receiver $r$ and aggregate them into an Availability value.
We assume that the passer chooses the best passing direction vector $a$ for passing to $r$ but may not be able to execute the pass optimally. That is, the actual ball trajectory may differ from the intended one, as defined by the initial direction $\tilde{a} = a + \Delta a$. Instead of working directly with the vector $a$ and the residual $\Delta a$, we make use of horizontal and vertical angles $\alpha$ and $\beta$, respectively, as well as ball speed $v$. The re-parameterization is given by $a(\alpha, \beta, v) = [\cos(\beta)\cos(\alpha), \cos(\beta)\sin(\alpha), \sin(\beta)] \cdot v$ such that $\tilde{a} = a(\alpha + \Delta\alpha, \beta + \Delta\beta, v + \Delta v)$ with residual $\Delta a = [\Delta\alpha, \Delta\beta, \Delta v]$. The re-parameterization allows us to formulate non-optimal executions of passes in terms of differences in horizontal angle $\Delta\alpha$, vertical angle $\Delta\beta$, and initial ball speed $\Delta v$. For ease of notation, we will often use the tupled parameters $\theta = (\alpha, \beta, v)$ and $\Delta\theta = (\Delta\alpha, \Delta\beta, \Delta v)$ and write $a(\theta)$; Figure 2 provides a visualization. Using the re-parameterization, the Availability model consists of two parts: (i) computing likelihoods $p^r_A(\theta)$ of successful passes to target player $r$ along vectors $a(\theta)$, and (ii) aggregating these likelihoods into a single Availability value $A^r(\psi)$, where $\psi$ denotes the skill of the pass giver and determines the expected deviation of the actual pass from the intended one.
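The re-parameterization can be illustrated with a short sketch. The function names `pass_vector` and `pass_parameters` and the numerical values of the intended pass and its deviation are hypothetical and chosen for illustration only; angles are assumed to be given in radians.

```python
import math

def pass_vector(alpha, beta, v):
    """Initial ball velocity a(alpha, beta, v) from horizontal angle,
    vertical angle (both in radians) and speed."""
    return (
        v * math.cos(beta) * math.cos(alpha),
        v * math.cos(beta) * math.sin(alpha),
        v * math.sin(beta),
    )

def pass_parameters(a):
    """Inverse map: recover (alpha, beta, v) from a velocity vector."""
    ax, ay, az = a
    v = math.sqrt(ax * ax + ay * ay + az * az)
    alpha = math.atan2(ay, ax)
    beta = math.asin(az / v) if v > 0 else 0.0
    return alpha, beta, v

# Intended pass and a hypothetical execution error in parameter space.
intended = (math.radians(30.0), math.radians(10.0), 15.0)
deviation = (math.radians(2.0), math.radians(-1.0), 0.5)
actual = tuple(t + d for t, d in zip(intended, deviation))
a_actual = pass_vector(*actual)  # velocity vector of the executed pass
```

Working in $(\alpha, \beta, v)$ rather than in Cartesian components keeps the three sources of execution error separate, which is what the skill parameters introduced later rely on.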
In the next section, we introduce models for ball dynamics that are used to predict ball trajectories from initial ball directions $a(\theta)$. Together with a predictor of whether a player can reach certain positions on the pitch in time, Sect. 3.3 derives a model to estimate the probability that a pass along an initial direction is successful. Finally, Sect. 3.4 aggregates those values over a variety of initial directions into an Availability value. The previously mentioned player reachability model is slightly more involved and is introduced in Sect. 4.

Ball dynamics
Assume that the ball is played with initial movement vector $a(\alpha, \beta, v)$ and moves along a straight line in the $xy$-plane. That is, we ignore curved balls for the moment. Naturally, the velocity of the ball decreases over time due to air and ground friction, and so do its $z$-speed and $z$-position due to gravity and rotation. Physics implies that the deceleration curve of the ball depends strongly on the initial movement vector $a(\alpha, \beta, v)$, in particular on the $z$-angle $\beta$ and the initial speed $v$. The reason behind this is two-fold. First, the ball is either flying (air friction, quadratic in speed) or rolling (mainly ground friction, approximately linear in speed). Second, depending on the intended distance and speed of the ball, the ball is played with more or less (backward) rotation, which changes its dynamics significantly. Note that rotation is not directly observable in tracking data.
The same holds for the acceleration in $z$-direction. While the gravitational force is constant, the observed acceleration varies significantly between passes due to unobservable ball rotation. An Availability model therefore cannot be computed without a model of ball movement. We capture ball dynamics with three distinct models that are also listed in Table 1.

• The first model is denoted by $t(d; \beta, v)$ and estimates the time until the ball reaches a certain distance $d$ after it has been kicked with initial $z$-angle $\beta$ and velocity $v$. Function $t$ is learned by a ridge regression with a polynomial kernel from historic data, where $\beta$ and $v$ are estimated for every pass by taking the difference of the first two frames.

• The second model $u(d; \beta, v)$ estimates the ball velocity at a certain distance $d$ given initial angle $\beta$ and velocity $v$. We learn function $u$ using a ridge regression with a polynomial kernel on historic data.

• The third model informs about the height $z$ of the ball at a given distance. Ignoring air friction and ball rotation, the ball's $z$-coordinate dynamics would be determined by

$$ z(t) = v \sin(\beta)\, t - \tfrac{1}{2} g t^2, $$

where $g = 9.81\,\mathrm{m/s^2}$ is the gravitational acceleration. However, the "observed gravitational force", or in other words the observed acceleration $\hat{g}$, deviates strongly from the frictionless acceleration $g$. In fact, as can be seen in Fig. 3, left, the observed acceleration $\hat{g}$ depends on the initial $z$-angle $\beta$. We therefore learn a probabilistic model of "gravitation" from historic ball data by assuming that $\hat{g}$ follows a Gaussian distribution whose mean $\mu(\beta, v)$ and variance $\sigma(\beta, v)$ are linear functions. We thus have

$$ p(z(t) \mid \beta, v) = \mathcal{N}\big(z(t);\; v \sin(\beta)\, t - \tfrac{1}{2} \mu(\beta, v)\, t^2,\; \tfrac{1}{4} t^4 \sigma(\beta, v)\big) \tag{1} $$

for $z(t) \ge 0$ and $p = 0$ otherwise, and learn the parameters of $\mu$ and $\sigma$ by minimizing the negative log-likelihood of the data.
Figure 3, right, shows sampled simulated ball trajectories for varying initial ball speeds $v$ and $z$-angles $\beta$. Each simulated ball trajectory corresponds to a specific $(v, \beta)$ pair, and the height $z$ at distance $d$ is estimated by first computing the time $t_d = t(d; \beta, v)$ until the ball reaches distance $d$ using the first model and then computing the distribution $p(z(t_d) \mid \beta, v)$ using the third model and Eq. (1).
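The third ball model can be sketched as follows, assuming that the effective vertical acceleration is Gaussian and that the height follows the frictionless projectile form with that acceleration. The function names and the default values for `g_mean` and `g_var` are hypothetical; in the paper these quantities are learned as functions of $\beta$ and $v$. The sketch also ignores the truncation at $z(t) \ge 0$.

```python
import math

G = 9.81  # frictionless gravitational acceleration in m/s^2

def height_distribution(t, beta, v, g_mean=G, g_var=0.0):
    """Mean and variance of the ball height z(t), assuming
    z(t) = v*sin(beta)*t - 0.5*g*t^2 with g ~ N(g_mean, g_var)."""
    mean = v * math.sin(beta) * t - 0.5 * g_mean * t * t
    # z is linear in g with coefficient -0.5*t^2, so the variance scales with its square.
    var = (0.5 * t * t) ** 2 * g_var
    return mean, var

def prob_below(h, t, beta, v, g_mean=G, g_var=0.0):
    """P(z(t) < h) under the Gaussian height model (no truncation at zero)."""
    mean, var = height_distribution(t, beta, v, g_mean, g_var)
    if var == 0.0:
        return 1.0 if mean < h else 0.0
    return 0.5 * (1.0 + math.erf((h - mean) / math.sqrt(2.0 * var)))
```

A call such as `prob_below(2.0, t_d, beta, v, g_mean, g_var)` would then give the probability that the ball is below 2 m when it reaches distance $d$, provided `t_d` comes from a time-to-distance model.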

Quantifying the likelihood of passes
We begin the derivation of Availability by quantifying the likelihood that the intended receiver $r$ reaches the ball when it is played from position $b$ with parameters $\theta = (\alpha, \beta, v)$ along the initial direction $[\cos(\beta)\cos(\alpha), \cos(\beta)\sin(\alpha), \sin(\beta)]$ with initial speed $v$. We denote this likelihood by $p^r_A(\theta, S)$. To proceed, we first derive the probability that the receiver can reach the ball at any point along the line given by $\theta$. We then derive the probability that any defender can intercept the ball before any of those points and, in a last step, aggregate those probabilities into a single value. In the course of this section, we will make use of movement models $p^i_R(m, t; S)$ that quantify the probability that player $i$ reaches position $m$ in time $t$. We will introduce the model properly in Sect. 4 to not clutter this section unnecessarily.

Low passes
Let us assume that $\beta = 0$ before we turn to the general case. We thus focus on low passes starting at the current position of the ball with arbitrary angle $\alpha$ and velocity $v$ while $\beta = 0$. Let $m$ be an arbitrary $xy$-position on the pitch.
Still assuming only straight passes, the probability $p_\alpha(m \mid \theta, S)$ that the ball passes through position $m$ is a point measure that is 1 only if there exists $c > 0$ such that

$$ m = [b_x, b_y] + c\, [\cos(\alpha), \sin(\alpha)]. $$

Furthermore, the probability $p^i_I(m; \theta, S)$ that a low pass can be intercepted at position $m$ by player $i$ equals the likelihood that player $i$ can reach position $m$ before the ball (which is currently at position $b$), that is

$$ p^i_I(m; \theta, S) = p^i_R\big(m, t(\lVert m - [b_x, b_y] \rVert; \beta, v); S\big), $$

where $t$ is the time-to-position function of the ball. Figure 4 shows a visualization. Analogously, the likelihood that player $i$ can intercept the pass anywhere on the passing line before position $m$ is given by

$$ p^i_{CI}(m; \theta, S) = \max_{m' \in [b, m]} p^i_I(m'; \theta, S), $$

where $[b, m]$ denotes the segment of the passing line between the ball and $m$. In other words, the player will attain the position where the interception probability is highest. Figure 5, left, shows interception probabilities for the running example.
Putting everything together, the probability that a low pass to player $r$, starting at position $b$ along trajectory $a(\alpha, \beta = 0, v) = a(\theta)$ and ending in position $m$, is successful is given by (i) the probability that position $m$ lies on the trajectory of the ball, (ii) the probability that player $r$ can intercept the ball exactly at position $m$, and (iii) the probability that no opponent $o$ will intercept the ball before it reaches position $m$,

$$ p^r_s(m; \theta, S) = p_\alpha(m \mid \theta, S) \cdot p^r_I(m; \theta, S) \cdot \prod_{o} \big(1 - p^o_{CI}(m; \theta, S)\big). \tag{4} $$

Figure 5, right, shows examples of those probabilities. The likelihood $p^r_A(\theta, S)$ that a low pass along direction $a(\theta)$ is successful then aggregates $p^r_s(m; \theta, S)$ over the positions $m$ on the passing line.
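The interplay of receiver reachability and cumulative opponent interception along a passing line can be sketched with toy stand-ins. Everything numerical here is an assumption made for illustration: `ball_time` replaces the learned time-to-distance model by a constant ball speed, `reach_prob` replaces the learned reachability model by a logistic function of the distance a player can cover, and opponents are treated as independent, so that the probability of no interception is a product.

```python
import math

def ball_time(d, v):
    """Toy time-to-distance model: constant ball speed (stand-in for t(d; beta, v))."""
    return d / v

def reach_prob(player, m, t, speed=7.0, softness=1.0):
    """Toy reachability: close to 1 if the player can cover the distance to m within t."""
    dist = math.hypot(m[0] - player[0], m[1] - player[1])
    slack = speed * t - dist  # metres to spare
    return 1.0 / (1.0 + math.exp(-slack / softness))

def success_along_line(b, alpha, v, receiver, opponents, max_dist=40.0, step=0.5):
    """Return (distance, success probability) pairs for points on the passing line."""
    results = []
    cum_int = [0.0] * len(opponents)  # running maximum of each opponent's interception probability
    n_steps = int(max_dist / step)
    for k in range(1, n_steps + 1):
        d = k * step
        m = (b[0] + d * math.cos(alpha), b[1] + d * math.sin(alpha))
        t = ball_time(d, v)
        for j, opp in enumerate(opponents):
            cum_int[j] = max(cum_int[j], reach_prob(opp, m, t))
        no_interception = 1.0
        for c in cum_int:
            no_interception *= 1.0 - c
        results.append((d, reach_prob(receiver, m, t) * no_interception))
    return results

free = success_along_line((0.0, 0.0), 0.0, 15.0, (20.0, 0.0), [])
blocked = success_along_line((0.0, 0.0), 0.0, 15.0, (20.0, 0.0), [(10.0, 0.0)])
```

In this toy setting, a defender standing on the passing line removes almost all success probability behind them, because the cumulative interception probability never decreases along the line.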

Generalization to all passes
We now extend the concept to all passes by including high passes for which $\beta > 0$. High passes are slightly more involved since the ball may be too far up to be reachable for a player. We thus make use of the ball model $p(z \mid d; \beta, v)$ in Eq. (1) that estimates the density of the height $z$ of the ball at a given distance $d$ for initial angle $\beta$ and velocity $v$.
The idea is to incorporate the notion of $z$-reachability into the interception probability $p_I$. Let $p_z(z < h \mid d; \beta, v)$ be the cumulative probability that the ball is lower than height $h$ at distance $d$ when a pass was played with initial parameters $\beta$ and $v$. Furthermore, let $h^i_I$ be the maximum height at which a ball can be intercepted by player $i$. It follows that the interception probability of the $i$-th player can be written as a product of the $xy$-reachability given by the movement model of player $i$ and the $z$-reachability,

$$ p^i_I(m; \theta, S) = p^i_R\big(m, t(d; \beta, v); S\big) \cdot p_z\big(z < h^i_I \mid d; \beta, v\big), \qquad d = \lVert m - [b_x, b_y] \rVert. $$

So far, the probability of a successful pass along $a$ does not take into account whether the receiver can control the ball. Consider a pass over 10 m that is played with 30 m/s and reaches the receiver at a height of 1.5 m. This pass is reachable but certainly not controllable. Therefore, we introduce a control-likelihood that is a function of the predicted speed $u$ of the ball when it reaches the receiver and of the likelihood that the ball is below $h_C = 0.5$ m. Putting everything together, the likelihood that a pass with initial parameters $\theta$ can be successfully received by player $r$ at position $m$ combines the factors of the low-pass case, now computed with the height-aware interception probabilities, with this control-likelihood.
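The example of the fast, high pass can be made concrete with a small sketch. The height-aware interception probability follows the product form described above. The control term, in contrast, is purely hypothetical: the logistic penalty on the arrival speed and its parameters `speed_mid` and `speed_scale` are assumptions for illustration, as are the Gaussian height parameters and the maximum interception height of 2 m used below.

```python
import math

def normal_cdf(x, mean, std):
    """Cumulative distribution function of a Gaussian."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def interception_prob(p_reach_xy, z_mean, z_std, h_max):
    """xy-reachability times the probability that the ball is below the
    player's maximum interception height."""
    return p_reach_xy * normal_cdf(h_max, z_mean, z_std)

def control_prob(ball_speed, z_mean, z_std, h_control=0.5,
                 speed_mid=20.0, speed_scale=3.0):
    """Hypothetical control term: logistic penalty on the arrival speed
    times P(z < h_control)."""
    speed_term = 1.0 / (1.0 + math.exp((ball_speed - speed_mid) / speed_scale))
    return speed_term * normal_cdf(h_control, z_mean, z_std)

# Pass arriving at 30 m/s at a height of about 1.5 m: reachable, hardly controllable.
reachable = interception_prob(1.0, 1.5, 0.2, h_max=2.0)
controllable = control_prob(30.0, 1.5, 0.2)
```

Under these assumed parameters, `reachable` is close to one while `controllable` is close to zero, which mirrors the distinction drawn in the text.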

Full availability
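The previous sections showed how to compute likelihoods of successful passes along vectors $a(\theta)$. However, while a player may intend to hit the ball with certain initial parameters $\theta = (\alpha, \beta, v)$, there will generally be a (possibly slight) deviation $\Delta\theta = (\Delta\alpha, \Delta\beta, \Delta v)$ from the intended pass trajectory, determined by individual skill and by circumstances such as pressure on the ball carrier or running speed. We now present our model $A^r(\psi)$ that determines the overall likelihood of a successful pass to $r$ using uncertainty parameters $\psi = (\sigma_\alpha, \sigma_\beta, \sigma_v)$, which can be understood as skill parameters. We assume that deviations are drawn from a normal distribution with mean 0 and a diagonal covariance matrix $\Sigma_\psi$ whose diagonal entries are given by $\sigma_\alpha$, $\sigma_\beta$ and $\sigma_v$. It follows that the expected success for intended initial pass parameters $\theta$ is

$$ \mathbb{E}_{\Delta\theta \sim \mathcal{N}(0, \Sigma_\psi)}\big[p^r_A(\theta + \Delta\theta, S)\big] = \int p^r_A(\theta + \Delta\theta, S)\, \mathcal{N}(\Delta\theta; 0, \Sigma_\psi)\, d\Delta\theta. $$

Using the expected success, we compute the final availability score as

$$ A^r(\psi) = \max_{\theta}\; \mathbb{E}_{\Delta\theta \sim \mathcal{N}(0, \Sigma_\psi)}\big[p^r_A(\theta + \Delta\theta, S)\big]. $$

In other words, the passer chooses the best intended option to pass to player $r$.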
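The aggregation step can be sketched with a Monte Carlo estimate of the expected success and a maximum over a finite grid of intended passes. This is an illustration under several assumptions: the skill parameters are treated as standard deviations, the expectation is approximated by sampling rather than computed exactly, and `corridor` is a hypothetical toy success function that stands in for the pass likelihood.

```python
import random

def expected_success(p_success, theta, psi, n_samples=2000, seed=0):
    """Monte Carlo estimate of E[p_success(theta + dtheta)] with independent
    Gaussian deviations whose standard deviations are given by psi."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        perturbed = tuple(rng.gauss(t, s) for t, s in zip(theta, psi))
        total += p_success(perturbed)
    return total / n_samples

def availability(p_success, candidates, psi):
    """Best expected success over a finite set of intended pass parameters."""
    return max(expected_success(p_success, theta, psi) for theta in candidates)

def corridor(theta):
    """Toy success function: the pass succeeds inside a narrow angular corridor
    and within a band of ball speeds."""
    alpha, beta, v = theta
    return 1.0 if abs(alpha) < 0.1 and 10.0 <= v <= 20.0 else 0.0

candidates = [(a, 0.0, v) for a in (-0.2, 0.0, 0.2) for v in (10.0, 15.0, 20.0)]
precise = availability(corridor, candidates, (0.0, 0.0, 0.0))
sloppy = availability(corridor, candidates, (0.2, 0.05, 2.0))
```

With zero execution noise the availability equals the best achievable success, whereas larger deviations shrink it because part of the probability mass leaves the corridor.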

Modelling players
We now present the player reachability model $p^i_R(m, t; S)$ that determines the likelihood that player $i$ can reach position $m$ in time $t$. The idea is to derive the reachability model from an underlying motion or movement model that estimates the future density of the position of a player, conditioned on the game state $S$. Several different movement models have been proposed in the literature, for example using simplified physics (Taki et al. 1996; Fujimura and Sugihara 2005) or frequency statistics (Brefeld et al. 2019).
However, as we will show below, the reachability model $p^i_R(m, t; S)$ estimates reachability based on a cumulative distribution function of moving to position $m$ in time $t$. Computing cumulative distribution functions can be computationally very expensive if one has to rely on sampling in order to approximate the true cumulative distribution

Player and ball interactions
We use GRNNs to model the interactions between players and ball using a fully connected graph structure. Players and ball correspond to nodes in that graph and edges represent their relations. This part of the model is depicted in Fig. 6, left, and consists of several layers. One layer or block GR of the model is shown in Fig. 6, right. To describe such a layer, or block, of the graph network, recall that state $S^i_T$ contains the position and speed of player/ball $i$ at time $T$, as well as team and ball possession indicators.
The $\ell$-th block of the graph network takes as input the states $S^i_T$ for all $0 \le i \le 22$ as well as the outputs of layer $\ell - 1$ given by feature vectors $h^i_{\ell-1,T}$. Since the graph is fully connected, every player/ball $i$ is connected to every other player/ball $j$ via typed edges $e^{type}(i, j)$ with $type \in \{PP, BP, PB\}$ representing directed edges either between two players ($PP$), between ball and player ($BP$), or between player and ball ($PB$). Edge features $\phi^{type}_e$ are computed via attention functions $\alpha^{type}(\cdot; \theta)$, which again depend on the edge type, and per-node functions $f_v$, which are fully connected subnets.

Movement model
The distribution of future positions $m$ that can be attained in time $t$ is represented as a mixture model with $k$ Gaussian components and is realized by a mixture density network (MDN, Bishop 1994). The MDN takes the outputs $v^i_T$ of the GRNN for player $i$ and the time horizon $t$ as the input to a single-layer fully connected subnet $f_{MDN}$. The categorical mixture distribution $\pi^i$ is computed from the outputs of $f_{MDN}$ with a standard softmax. Gaussian means $\mu^i_k(t, S^i) \in \mathbb{R}^2$ and variances $\sigma^i_k(t, S^i) \in \mathbb{R}^2$ are predicted using linear and exponential activation functions, respectively, such that

$$ p^i_M(m \mid t, S) = \sum_{k} \pi^i_k(t, S^i)\, \mathcal{N}\big(m;\; \mu^i_k(t, S^i),\; \sigma^i_k(t, S^i)\, I\big), $$

where $I$ is the identity matrix, so that every component has a diagonal covariance. Figure 7, left, shows an overview.
The two described building blocks GRNN and MDN form a joint graph recurrent mixture density network and are trained simultaneously.
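How the outputs of such a mixture density head turn into a density over future positions can be sketched without any deep learning library. The sketch assumes that the network emits unnormalized mixture logits, two-dimensional means and log-variances per component; the names `mixture_density` and `expected_position` are hypothetical, and the graph recurrent network that would produce these outputs is not part of the example.

```python
import math

def softmax(logits):
    """Numerically stable softmax for the categorical mixture weights."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mixture_density(m, logits, means, log_vars):
    """Density of a 2D Gaussian mixture with diagonal covariances at position m."""
    weights = softmax(logits)
    density = 0.0
    for w, mu, lv in zip(weights, means, log_vars):
        # The exponential activation keeps the variances positive.
        var_x, var_y = math.exp(lv[0]), math.exp(lv[1])
        norm = 1.0 / (2.0 * math.pi * math.sqrt(var_x * var_y))
        quad = (m[0] - mu[0]) ** 2 / var_x + (m[1] - mu[1]) ** 2 / var_y
        density += w * norm * math.exp(-0.5 * quad)
    return density

def expected_position(logits, means):
    """Mean of the mixture: weighted average of the component means."""
    weights = softmax(logits)
    return (sum(w * mu[0] for w, mu in zip(weights, means)),
            sum(w * mu[1] for w, mu in zip(weights, means)))
```

Training would minimize the negative logarithm of `mixture_density` at the observed future positions, which is the usual objective for mixture density networks.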

Player reachability
Having computed the movement model $p^i_M(m \mid t, S)$, we are now ready to derive the reachability distribution $p^i_R(m, t; S)$. While the movement model describes where the average player will be in time $t$, the reachability model estimates which positions can be reached in time $t$. Reachability is modeled using the movement model by defining a (pseudo) cumulative distribution function (cdf) of positions and using a cdf cutoff parameter that defines which positions are reachable with probability 1. All positions that lie outside that cdf cutoff are reachable with a probability below one. Figure 7, right, shows a visualization of that approach in 2D.
Let the expected position of player $i$, exactly $t$ seconds into the future, be given by

$$ \mu_m = \sum_{k} \pi^i_k(t, S^i)\, \mu^i_k(t, S^i). $$

We define the (pseudo) cumulative distribution function $\mathrm{cdf}^i_M(m \mid t, S)$ at position $m$ as the cdf of the one-dimensional distribution defined on the line that goes through the mean $\mu_m$ and $m$, normalized by a partition function $Z$.
Based on the cdf, we assume that player $i$ can reach position $m$ in time $t$ with probability 1 if $\mathrm{cdf}^i_M(m \mid t, S)$ lies between the cutoff values $c_{co}$ and $1 - c_{co}$. Otherwise, the reachability likelihood is scaled with $1/c_{co}$ to guarantee a smooth reachability surface, as shown below and in Fig. 7, right,

$$ p^i_R(m, t; S) = \begin{cases} \mathrm{cdf}^i_M(m \mid t, S) / c_{co} & \text{if } \mathrm{cdf}^i_M(m \mid t, S) < c_{co}, \\ \big(1 - \mathrm{cdf}^i_M(m \mid t, S)\big) / c_{co} & \text{if } \mathrm{cdf}^i_M(m \mid t, S) > 1 - c_{co}, \\ 1 & \text{otherwise.} \end{cases} $$

The cutoff parameter is tuned on data in order to maximize the observed reachability of pass receivers and minimize the reachability of defenders that did not intercept the ball. That is, we define a binary classification problem such that for each observed pass in the data we create one positive example (the pass receiver intercepts the ball at position $m$) and one negative example for each defender that did not intercept the ball along the ball trajectory.
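The cutoff rule can be written down in a few lines. The sketch assumes that a pseudo-cdf value in $[0, 1]$ has already been computed for the position of interest; the function name `reachability` and the example cutoff of 0.1 are hypothetical.

```python
def reachability(cdf_value, cutoff):
    """Map a pseudo-cdf value to a reachability likelihood.

    Positions whose cdf lies within [cutoff, 1 - cutoff] are reachable with
    probability 1; in the tails the likelihood decays linearly to 0."""
    if not 0.0 < cutoff <= 0.5:
        raise ValueError("cutoff must lie in (0, 0.5]")
    cdf_value = min(max(cdf_value, 0.0), 1.0)  # guard against numerical overshoot
    if cdf_value < cutoff:
        return cdf_value / cutoff
    if cdf_value > 1.0 - cutoff:
        return (1.0 - cdf_value) / cutoff
    return 1.0
```

For a cutoff of 0.1, a position with a cdf value of 0.05 is reachable with likelihood 0.5, while every position with a cdf value between 0.1 and 0.9 is treated as reachable with certainty.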

Experimental evaluation
We evaluate our model on passes extracted from 58 Bundesliga games from the 2017/18 season. The data comes in the form of tracking and event logs. The tracking data is sampled at 25 fps and contains the positions of all players and the ball at each frame/timestamp. Pass information is extracted from the corresponding event data. This comprises the passing player, the time of the pass, the target position of the pass, as well as the receiving player. However, the receiving player could be an opponent who intercepts the pass. In this case, the data does not contain ground truth about the intended receiver. We overcome this problem by identifying the most likely teammate according to the initial direction of the ball at the time of the pass, as described in Appendix A.
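One simple way to pick a likely intended receiver from the initial ball direction is sketched below. This is not the procedure of Appendix A, which is not reproduced here, but an assumed heuristic for illustration: it selects the teammate whose bearing from the ball deviates least from the initial direction of the ball. The function name and the player identifiers are hypothetical.

```python
import math

def likely_receiver(ball_pos, ball_dir, teammates):
    """Return the id of the teammate whose bearing from the ball is closest
    to the initial direction of the ball (assumed heuristic)."""
    ball_angle = math.atan2(ball_dir[1], ball_dir[0])
    best_id, best_diff = None, math.inf
    for player_id, (x, y) in teammates.items():
        bearing = math.atan2(y - ball_pos[1], x - ball_pos[0])
        # Wrap the angular difference into [-pi, pi] before taking its magnitude.
        delta = bearing - ball_angle
        diff = abs(math.atan2(math.sin(delta), math.cos(delta)))
        if diff < best_diff:
            best_id, best_diff = player_id, diff
    return best_id

teammates = {7: (10.0, 1.0), 9: (0.0, 10.0), 11: (-5.0, 0.0)}
receiver = likely_receiver((0.0, 0.0), (1.0, 0.0), teammates)
```

A practical variant would probably also take the distance to the teammate and the ball speed into account, since a purely angular criterion cannot distinguish two teammates standing on the same line.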
The entire data consists of 38,851 passes, with 33,561 successful and 5,290 intercepted passes. This amounts to an average success rate of 0.86. Model selection is conducted via five-fold cross-validation on 54 games. We report average results on an independent test set consisting of the remaining 4 games.

The Baseline Approach

We compare our approach to Peralta Alguacil et al. (2020), which builds on the movement model by Fujimura and Sugihara (2005) and the work by Spearman et al. (2017) and considers players as physical objects whose dynamics are described by an equation of motion with internal and external forces. We also use the proposed logistic distribution to estimate final player movement probabilities. Our approach differs from Peralta Alguacil et al. (2020) in that our model learns the movement distributions from observed player data only instead of using a solely physics-based model based on approximated properties of soccer players. We also test against the ball dynamics model described in the appendix of Peralta Alguacil et al. (2020), which again describes the ball movement based on its approximated physical properties. In contrast, our model learns the ball dynamics from observed ball trajectory data. All parameters of the baseline are set according to the values proposed by the authors in Peralta Alguacil et al. (2020).

Movement Models
We begin the empirical evaluation by comparing our player reachability model with the motion model of the baseline. To do so, we compare the likelihoods that the true receiver of a pass can actually reach the observed receiving position. We take both successful and unsuccessful passes into account, such that the true receiver could be an intercepting opponent. In addition, we compute interception probabilities for all opponents and report the maximum over all opponents. An accurate model assigns higher probabilities to the true receiver and lower probabilities to non-intercepting opponents. Table 2 shows the results. Our approach stands out with much higher average likelihoods for true receivers but also assigns higher likelihoods to uninvolved defenders who did not intercept the ball. The table also shows AUC values for both methods, where we compare the ability to predict whether a pass is going to be successful or not. To this end, we compute the pass success probability $p^r_s(m)$ in Eq. (4) at the observed reception position $m$ and, instead of using a ball model, compute the cumulative interception probabilities $p^o_{CI}(m)$ along the observed ball trajectory. The computed success probabilities are compared to the observed outcomes of the passes (success or interception) to give the AUC of the prediction. The resulting AUCs support the previous outcome: our reachability model is in fact more accurate in predicting whether a player can reach a certain position in time.
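For reference, the AUC used in this comparison can be computed directly from predicted success probabilities and binary outcomes. The sketch below uses the pairwise definition (the probability that a randomly drawn successful pass receives a higher score than a randomly drawn intercepted one); the scores and labels are made up for illustration and are not taken from the experiments.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    labels are 1 for successful and 0 for intercepted passes; ties between a
    positive and a negative score count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

example = auc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0])  # 3 of 4 pairs ranked correctly
```

The quadratic pairwise loop is fine for a sketch; for tens of thousands of passes, a rank-based implementation would be preferable.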

Ball Dynamics
We now compare our ball dynamics model to observed passes and to the baseline model to investigate whether our models predict realistic passes. Table 3 shows experimental results that compare our models for ball dynamics to the baseline. The evaluation is performed on all observed passes in our data. We test model $t(d; \beta, v)$, which predicts the time it takes a ball to reach a certain distance $d$ given the initial $z$-angle $\beta$ and speed $v$. Here, $\beta$ and $v$ are estimated over the first 5 frames of the observed ball trajectories, and the error is computed between the observed and the predicted time at the observed ball reception positions. The first row in Table 3 shows that our model is better than the baseline at predicting the expected time the ball needs to cover a certain distance. The second row shows results for predicting the maximum height of a pass given $\beta$ and $v$, where we took the mean of the distribution $p(z(t) \mid \beta, v)$ in Table 1. Again, our model outperforms the baseline.
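Estimating the initial pass parameters from the first tracking frames can be sketched with a finite difference. The sketch assumes ball positions in metres sampled at 25 Hz and uses the displacement between the first and the last frame of the window; the function name and the default window of five frames are illustrative choices, and the smoothing or outlier handling that real tracking data may need is omitted.

```python
import math

FRAME_RATE = 25.0  # tracking frequency in Hz

def initial_parameters(frames, n_frames=5, frame_rate=FRAME_RATE):
    """Estimate (alpha, beta, v) from the first n_frames ball positions (x, y, z)."""
    if n_frames < 2 or len(frames) < n_frames:
        raise ValueError("need at least n_frames >= 2 positions")
    (x0, y0, z0), (x1, y1, z1) = frames[0], frames[n_frames - 1]
    dt = (n_frames - 1) / frame_rate
    vx, vy, vz = (x1 - x0) / dt, (y1 - y0) / dt, (z1 - z0) / dt
    v = math.sqrt(vx * vx + vy * vy + vz * vz)
    alpha = math.atan2(vy, vx)                  # horizontal angle
    beta = math.atan2(vz, math.hypot(vx, vy))   # vertical (z) angle
    return alpha, beta, v

# A flat pass along the x-axis that covers 0.6 m per frame, i.e. 15 m/s.
flat = [(0.6 * k, 0.0, 0.0) for k in range(5)]
alpha, beta, v = initial_parameters(flat)
```

A longer window averages out positional noise but underestimates the initial speed, since the ball already decelerates during the window.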
Successful Passes
Next, we compare predictions about whether a pass will be successful or not. For every pass, the first four frames are used to estimate the initial direction and speed of the ball. On the basis of these estimates, the models then predict the ball trajectory. In addition to the full baseline (Peralta Alguacil et al. 2020), we also compare against a hybrid model that uses the physics-based reachability model from Peralta Alguacil et al. (2020) but our ball model. We evaluate the three models by comparing their predictions with the true outcome. Again, we use AUC to measure predictive performance. Table 4 shows the results. Our model outperforms both baselines in terms of AUC. However, although Table 3 shows that the ball dynamics of the baseline are clearly inferior to our proposed approach, this result does not imply inferior performance of the baseline when predicting outcomes of passes. Note that the baseline performs, with a predictive accuracy of about 0.880, significantly better in our experimental setup than in Spearman et al. (2017), who report an accuracy of 0.819. Naturally, this effect may come (in part) from using different data. For example, our Bundesliga data has an average rate of successful passes of 0.86, while Spearman et al. (2017) report only 0.789.

Receiving Position
The pass reception probability $p_s^r(m; \theta)$ describes the likelihood that a pass to receiver $r$ can be successfully completed at position $m$. We empirically quantify the estimate by comparing the true positions of passer and receiver to the predicted success rate at those positions. Again, we consider successful as well as intercepted passes. Consider the example in Fig. 1, which serves as our running example in the paper. The black circle in the left part shows the actual reception position of the pass and its corresponding color-coded pass reception likelihood, while the right figure shows the instant of the first touch of the receiver. Our evaluation on all available pass data shows that, for successful passes, the average success likelihood and standard deviation at the end position are 0.789±0.328, whereas the average likelihood for bad passes is 0.169±0.269. In other words, the model is able to predict reliably whether passes to certain positions can be successful. This is also highlighted by a corresponding AUC value of 0.913.
Availability
After having shed light on different aspects of the proposed approach, we now turn to evaluating the main contribution of the paper, the predicted Availability scores $A^r(\psi)$. For every pass in the data, we compare the (expected) Availability of the (intended) receiver to the true outcome of the pass. In our evaluation, the average Availability score and standard deviation of successful and unsuccessful passes are 0.884±0.23 and 0.572±0.24, respectively. Moreover, measuring AUC on Availability and true outcome yields a score of 0.870. We conclude that the computed Availability scores correlate highly with the success of passes. Note that the overall success rate of observed passes of 0.86 is almost matched by an average predicted Availability of 0.84. Figure 8, left, shows average success rates and Availability scores for different pass origins, while the right part of the figure shows those quantities w.r.t. pass distance. Since we do not incorporate the notion of "pressure" on the pass giver into the model, areas closer to the own goal have higher success rates than the estimated ones. On the other hand, the true success rates are smaller than the predictions in areas closer to the opponent's goal. We attribute this finding to the expected pressure on the pass giver at the time of a pass. The closer the passer is to the opponent's goal, the more pressure is exerted by the opponents, and the player has to act in smaller spaces and in shorter time windows to play a pass.
Note that the pressure argument does not translate to the pass distance, as Fig. 8, right, demonstrates. A reason for this can be observed in Fig. 9, left. The figure shows the mean success rates for bins of Availability scores. An optimal scoring function would yield the diagonal green line, where Availability scores and average success rates align perfectly. However, as observed in Fig. 9, left, this holds only for Availability scores above 0.5. The observed success rates for smaller values exceed the expectations significantly. An in-depth analysis of the results shows that this is mainly due to imperfections in the recorded data (cf. Linke et al. (2020) for a quantitative evaluation of tracking data) and the small number of passes with low Availability, as shown in red in the figure. As an example, consider Fig. 10, left, where the recorded ball position (thick black circle) suggests that the ball is located at the player's heels. In reality, however, the ball is positioned half a meter to the left. The computed Availability uses the recorded position and concludes that the player at the bottom of the image can only receive a high pass to his left with probability 0.26. However, the real ball reception is at the light circle, following a low pass. In the right part of the figure, the recorded ball position is slightly below the player's position, whereas the real ball position is about a meter above the recorded position. Because of that, the computed success probabilities $p_s^r(x, \theta)$ are non-zero mainly below the receiver (to the right in playing direction), since according to the data the ball was one meter below the real position. The actual ball reception position is unreachable according to the data.
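The binned comparison behind Fig. 9, left, can be sketched as follows. This is a minimal illustration and not the authors' code; the function name `calibration_bins`, the choice of ten equal-width bins, and the input layout are assumptions. For each bin of Availability scores, the sketch reports the observed mean success rate and the number of passes, so that sparsely populated bins (such as those with low Availability) are easy to spot.

```python
def calibration_bins(availabilities, outcomes, n_bins=10):
    """Mean observed success rate per equal-width bin of Availability.

    availabilities -- predicted Availability per pass, each in [0, 1]
    outcomes       -- 1 for a successful pass, 0 otherwise
    Returns a list of (bin_low, bin_high, mean_success_or_None, count).
    """
    successes = [0] * n_bins
    counts = [0] * n_bins
    for a, y in zip(availabilities, outcomes):
        # Values of exactly 1.0 are placed in the last bin.
        idx = min(int(a * n_bins), n_bins - 1)
        successes[idx] += y
        counts[idx] += 1
    bins = []
    for i in range(n_bins):
        rate = successes[i] / counts[i] if counts[i] else None
        bins.append((i / n_bins, (i + 1) / n_bins, rate, counts[i]))
    return bins
```

A perfectly calibrated score would give a mean success rate close to the bin center in every populated bin, which corresponds to the diagonal in the figure.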

Availability over Time
We now investigate how Availability changes over time prior to a pass. We differentiate between good and bad passes and measure relative Availabilities by subtracting the score at the time of the pass from the respective score s seconds before the pass. Figure 9, right, shows the resulting relative Availabilities over time. For all passes, relative Availabilities decline rapidly before the actual pass. This is explained by passing players turning their bodies into the direction of the pass before the pass is played, which in turn means that opponents can read their intentions from their posture and try to close the passing window. This effect is stronger for bad passes, which partly explains why they are unsuccessful.
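The relative Availability defined above is a simple difference, which the following sketch makes explicit. The data layout is an assumption: each pass is represented by a list of per-frame Availability scores whose last element is the score at the time of the pass, and the lag is given in frames rather than seconds because the frame rate of the tracking data is not stated here. The function name is hypothetical.

```python
def mean_relative_availability(score_series, lag_frames):
    """Average of A(t_p - lag) - A(t_p) over all passes.

    score_series -- list of per-pass lists of Availability scores,
                    ordered in time, last element at pass time t_p
    lag_frames   -- how many frames before the pass to look back
    Passes with too short a history are skipped; returns None if none remain.
    """
    diffs = [
        series[-1 - lag_frames] - series[-1]
        for series in score_series
        if len(series) > lag_frames
    ]
    if not diffs:
        return None
    return sum(diffs) / len(diffs)
```

Under this sign convention, a positive value means that Availability was higher before the pass than at the moment it was played.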

Comparison to Expert Ratings
To evaluate validity against an external criterion, we compare computed Availability scores to those estimated by human observers. Four soccer coaches rated the Availability of players in 60 situations. We included only situations in which the player in possession had full ball control (ball flat, orientation to opponent's goal, at least one second of ball possession, low pressure). To consider different types of tactical constellations, we distributed the 60 situations evenly across build-ups, transitions, and situations in the box. The coaches were presented with an image from the video and rated the Availability of every teammate of the player in possession in that situation with a score in the set {−2, −1, 0, 1, 2}, where −2 corresponds to the lowest Availability and +2 to the highest.
To provide a common understanding of the ratings, the experts were instructed before the experiment. In short, for a rating of +2, a player should be able to receive the ball safely if the passing player does not make a terrible mistake. A rating of +1 represents a situation in which a player has a good chance to receive the ball, but there may be a small interception chance for an opponent. A rating of 0 indicates a 50:50 chance to get the ball, and so on. The experts were encouraged to rate the situations based on their personal understanding of soccer. Since Availability is a probability and naturally ranges between zero and one, we binned the scores evenly in intervals of length 0.2, so that an Availability within [0, 0.2) corresponds to a rating of −2, the interval [0.2, 0.4) is mapped to −1, and so on. Table 5 shows correlations between the expert ratings and the binned Availabilities. The table shows a strong positive correlation between the coaches and our computational model. Differences between the observers indicate that it may not be possible to entirely objectify Availability. This is quite typical for non-trivial tactical concepts and has also been reported for other metrics (Link et al. 2016). However, correlations between experts are generally higher than the ones between expert and model, with Obs1 being the only exception. Though the final outcomes are comparable, this result suggests that our model rates Availabilities slightly differently from the experts.
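The mapping from Availability to the five-point rating scale can be written down directly. The sketch below is illustrative only; the function name is hypothetical, and placing an Availability of exactly 1.0 in the top bin is an assumption, since the text specifies only the half-open intervals. Bin edges are compared explicitly rather than dividing by 0.2, because floating-point division can place values such as 0.6 in the wrong bin.

```python
from bisect import bisect_right

# Upper edges of the four lower bins; everything at or above 0.8 is rated +2.
_EDGES = [0.2, 0.4, 0.6, 0.8]


def availability_to_rating(availability):
    """Map an Availability in [0, 1] to a rating in {-2, -1, 0, 1, 2}.

    [0, 0.2) -> -2, [0.2, 0.4) -> -1, [0.4, 0.6) -> 0,
    [0.6, 0.8) -> 1, [0.8, 1.0] -> 2 (closed top bin is an assumption).
    """
    if not 0.0 <= availability <= 1.0:
        raise ValueError("Availability must lie in [0, 1]")
    return bisect_right(_EDGES, availability) - 2
```

For example, `availability_to_rating(0.26)` yields −1, which is the rating bin of the low-Availability case discussed for Fig. 10.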
Interestingly, Table 6 shows that our algorithm in fact rates Availabilities slightly better than the experts. The table summarizes average completion rates of passes as an indicator. That is, we compare average ratings of successful and non-successful passes as follows. A mean of 0.12 for non-successful passes is comparable to the rating of the experts. For successful passes, however, an average rating of 1.77 is significantly higher than those of the experts. This result also becomes obvious when comparing AUCs, where our algorithm significantly outperforms all experts.

Scenarios for application
From the perspective of performance analysis, our model offers a set of interesting applications. As an example, coaches and other experts can use Availability to characterize the passing tactics of players, that is, does a player only try easy passes or is the proportion of difficult passes noticeably high? The model can also be used to rate the actual passing ability of players. While coaches have a very good understanding of their players' passing capabilities from seeing them in training and matches on a daily basis, quantifying those abilities is still a hard task. While average passing statistics are readily available in a wide variety of sources, the raw pass count and average success rates do not show the full picture. For example, attacking players generally have a lower pass percentage than defensive players, simply because they operate in tighter spaces and have fewer, if any, available passing options. Therefore, it should prove beneficial to compare a player's observed success rate to the expected one in order to quantify a player's passing ability more objectively. Figure 11 and Table 7 show preliminary results for both use cases. In Fig. 11, we compare pass selections of defenders, midfielders, and forwards of Bayern Munich. The results show that the position in which a player generally operates has a significant influence on the kinds of passes a player attempts. Defenders take fewer risks when passing, partly because an unsuccessful pass is more likely to result in a scoring chance for the opponent, but also because they have more available passing options. On the other hand, forwards take more risks, either because the potential reward is higher or because simple passes are not possible due to high pressure from defenders. Table 7 shows the top 10 ranked players in our data w.r.t. their expected versus observed pass percentage.
We would like to note that our evaluation data consists of only 58 games and that several players did not have enough passes in the data to support reliable analyses. We only considered players with at least 350 passes, which left us with 65 players from FC Bayern Munich, Hamburger SV, TSG 1899 Hoffenheim, FC Schalke 04, and SG Eintracht Frankfurt. Still, the top 10 exhibits an impressive overrepresentation of Bayern Munich players. Sebastian Rudy, Joshua Kimmich, Arturo Vidal, David Alaba, Corentin Tolisso, and Niklas Süle all played for Bayern that season, with only 2 Schalke players (Daniel Caligiuri, Benjamin Stambouli) and 2 Hamburg players (Kyriakos Papadopoulos, Gotoku Sakai) in the list.
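A ranking of this kind can be sketched as follows. This is not the procedure used to produce Table 7. The sketch assumes that the expected pass percentage of a player is the mean Availability of the intended receivers of that player's passes, and that players are ranked by the difference between observed and expected percentage. The function name and the input layout are hypothetical; the threshold of 350 passes is taken from the text.

```python
from collections import defaultdict


def rank_passers(passes, min_passes=350):
    """Rank players by observed minus expected pass success.

    passes -- iterable of (player, availability, success) tuples, where
              availability is the predicted Availability of the intended
              receiver and success is 1 or 0 (hypothetical layout)
    Returns rows (player, observed, expected, difference), best first.
    """
    totals = defaultdict(lambda: [0, 0.0, 0])  # count, sum of Availability, successes
    for player, availability, success in passes:
        entry = totals[player]
        entry[0] += 1
        entry[1] += availability
        entry[2] += int(success)

    rows = []
    for player, (count, availability_sum, successes) in totals.items():
        if count < min_passes:
            continue  # too few passes for a reliable estimate
        expected = availability_sum / count
        observed = successes / count
        rows.append((player, observed, expected, observed - expected))
    rows.sort(key=lambda row: row[3], reverse=True)
    return rows
```

A positive difference indicates a player who completes more passes than the Availability of the chosen receivers would suggest.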

Conclusion
We presented and evaluated a data-driven approach to estimating Availability of soccer players. The investigated model leverages graph recurrent neural networks to predict whether players can intercept the ball in a given time to compute the probability of a successful pass along a ball trajectory. By computing all possible ball trajectories using trained models for ball dynamics, we showed how to aggregate those potential passes into a single value that represents the overall Availability of a player. Experimental evaluation showed that this overall model outperforms the state-of-the-art approach on 58 professional soccer matches. Additionally, our experiments indicate that the model can even outperform soccer coaches in assessing the Availability of soccer players.

Fig. 12 Model for predicting intended receivers
Our approach is to learn from data of successful passes, where the intended receiver is also the observed receiver, and simply use that model to predict intended receivers of unsuccessful passes. Let $t_p$ be the recorded time of a successful pass and $y_{t_p}$ the true receiver of that pass. The pass is turned into a training example by computing the initial direction of the pass on the first 6 frames and using the positions and trajectories of players and ball in the subsequent frame as the input. The output is simply the true receiving player, e.g., by a one-hot encoding. Figure 12 shows an overview of the model. Let $\phi_I(v_T^k, v_T^b)$ be a score function of the outputs $h_T^k$ of potential receiver $k$ and $h_T^b$ of the passer of the GRNN model as described in Sect. 4. The model minimizes the cross-entropy loss between real labels and scores and thus outputs a softmax distribution over all possible receiving players.
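The last step, turning per-player scores into a distribution over receivers and measuring the cross-entropy against the true receiver, can be illustrated in isolation. The sketch below is a generic softmax and cross-entropy computation, not the GRNN model itself; the scores are assumed to be already computed by the score function, and both function names are hypothetical.

```python
import math


def softmax(scores):
    """Convert raw receiver scores into a probability distribution."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(scores, true_index):
    """Cross-entropy loss for a one-hot label at position true_index."""
    probabilities = softmax(scores)
    return -math.log(probabilities[true_index])
```

At prediction time, the intended receiver of an unsuccessful pass would then be the player with the highest softmax probability.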

Appendix C Figures 13 and 14
For better visibility, we show larger sized versions of Figures 1 and 10 here.