1 Introduction

Acquiring accurate and high-frequency positional and event data is common in most of the world’s top professional football (soccer) leagues. Manually annotated event data provides information only about the one player carrying the ball at the time of a game-relevant action, whereas so-called positional data can capture highly accurate positions of all 22 players up to 25 times a second.

Every professional football team spends a substantial amount of time analyzing and monitoring strategies of their own and opposing teams, such as counterpressing, a complex team strategy for transition situations. Navarro and Javier (2018) define counterpressing simply as “[..] pressure after losing the ball”. Related to this, Pep Guardiola made the ’five second rule’ for counterpressing famous.Footnote 1 Another coach to experience tremendous success in recent seasons is Liverpool FC manager Jürgen Klopp. He is generally accepted as the originator of the term ’Gegenpressing’, which is well-known in both its German version and English translation.Footnote 2 It is apparent to football experts that Klopp’s counterpressing concept is closely related to Guardiola’s strategy of regaining the ball within the first five seconds.

There are significant differences between teams’ defensive and offensive tactical line-ups (Bialkowski et al. 2014, 2015; Andrienko 2019; Shaw and Mark 2019). The transition phase describes the period following a win or loss of possession in which the team transitions between its offensive and defensive tactical line-ups. When a team is in possession for at least a certain amount of time, we can assume that its tactical formation is generally optimized for offensive play and, consequently, sub-optimal in terms of defending its own goal (Andrienko 2019; Shaw and Mark 2019). Therefore, the first seconds after losing the ball are critical from a defensive perspective. Several studies showed that transition phases are a substantial factor for a team’s overall performance: As early as 1968, Reep and Benjamin (1968) demonstrated in the first known football analytics study that \(30\%\) of all regained possessions lead to shots on goal and \(25\%\) of all goals came from regained possessions in the attacking quarter. Grant et al. (1999) confirmed these findings for the 1998 World Cup. Both outcomes align with Jürgen Klopp’s statement that regaining the ball immediately after losing it, potentially through successful counterpressing, "...is the best playmaker".Footnote 3 Klopp hereby claims that counterpressing can also be seen as an offensive strategy. Recent studies show that regaining the ball in open play is more likely to lead to a goal than a safe build-up from a team’s own half (Vogelbein et al. 2014; Hobbs et al. 2018). Based on tracking data from the English Premier League, Hobbs et al. (2018) detected possession regains close to the opponent’s goal (potential counterpressing situations), highlighting their relevance once more. Even though many coaches and clubs affected the development of this sophisticated strategy, neither an objective proof of its efficiency nor an analysis of its usage in top leagues is presented in the literature.

Hughes and Ian (2015) point out that team sports performance analysis tends to be operationalized on the basis of notation systems, described as a replicable and consistent method of recording sport performance. Recent literature described a framework in which coaches’ decisions are supported by several performance analysis reports from games, teams and players (Travassos et al. 2013) and pointed out that team tactics in football refer both to a priori decisions made before the match and to real-time adaptations during the game (Rein and Daniel 2016). Accordingly, team tactics are described as a complex process resulting from a network of inter-dependent parameters (Kempe et al. 2014). These processes are conducted in a time-critical set-up, especially in the world’s top leagues and competitions, where teams face different opponents several times a week. Although many clubs have extended their match-analysis departments considerably in recent years, the limited amount of time and resources during matches has forced teams to seek ways to automate processes and gain insights faster in order to obtain a competitive edge.

These recent developments (the availability of accurate performance data and the need for a quick, detailed tactical analysis) signify a huge potential for the application of sophisticated machine learning techniques to football data and require an efficient collaboration of computer-science and domain experts (Herold et al. 2019; Goes et al. 2020; Rein and Daniel 2016). Many recent scientific investigations aimed to establish new key performance indicators (KPIs), i.e. metrics quantifying certain aspects of the game: pass evaluation metrics were examined (Steiner et al. 2019; Goes et al. 2019), metrics to quantify controlled space were defined (Kim 2004; Fernandez and Bornn 2018; Brefeld et al. 2019) and several studies evaluated shot metrics (Lucey et al. 2014; Rathke 2017; Fairchild et al. 2018; Anzer and Bauer 2021)Footnote 4 and goal scoring opportunities through possession values (Link et al. 2016; Spearman 2018; Fernandez and Bornn 2018; Decroos et al. 2020). Additionally, there are many approaches for measuring the defensive behavior of teams (Santos et al. 2018; Andrienko 2019; Goes et al. 2019), and even approaches aiming to quantify pressing (Bojinov and Luke 2016; Andrienko 2017; Robberechts 2019). Although pressing and counterpressing are closely related, they are two different phenomena. A conference contribution describes how specific counterpressing situations can be derived from detected general pressing scenes.Footnote 5 Several approaches also showed that analyzing these KPIs or even aggregating simple statistics (e.g. the pass completion rate) over one or several seasons provides a helpful indication to practitioners (Power et al. 2018; Pappalardo et al. 2019). The primary goal of all these approaches is to derive new insights by processing vast amounts of information. Decroos et al. (2018) presented a first approach to detect interesting match-phases based on event data.
To the best of our knowledge, no peer-reviewed study has focused on automating parts of the performance analyst’s daily work by detecting complex tactical patterns based on positional and event data. However, a noteworthy approach aiming to detect counterattacks was presented at an established football analytics conference.Footnote 6

With this practical need for process optimization in mind, the central aim of this study is to detect counterpressing situations without human support and to provide several ad-hoc reports for match analysts in near real-time. The outcome is optimized to fulfill their practical requirements and fit seamlessly into their tool ecosystem. Additionally, the automated detection allows us to analyze large amounts of data that would exceed manual processing capacities. Consequently, our approach enables us to perform an impartial long-term analysis of the German Bundesliga’s latest seasons, investigating the following research questions:

  • Can we differentiate between varying regaining strategies and determine reasons for a short defensive reaction time (definition in Sect. 2.1.2), i.e. to which extent is a fast ball regain actually caused by counterpressing (RQ1)?

  • Can we set objective benchmarks to quantify counterpressing strategies (amount and effectiveness) on a match- and season-level and point out their correlation with a team’s overall success (RQ2)?

  • Do the established rules of thumb agree with the data (e.g. counterpressing is more effective close to the sideline) (RQ3)?

  • To what extent do teams’ counterpressing strategies differ in the German Bundesliga (RQ4)?

Altogether, answering these research questions helps us to define the baseline for a qualitative discussion with experts, and thus allows them to formulate requirements for the practical application (PA) set-up.

The remainder of this paper is structured as follows: Sect. 2 provides a detailed description of the used data, the underlying rules and definitions, the labelling process and the extracted features. The outcomes in Sect. 3 are split into three parts: First in Sect. 3.1, we describe a statistical evaluation of the detection models. Section 3.2 presents a subject-specific evaluation by interpreting our results on six seasons of German Bundesliga. Lastly, in Sect. 3.3, we demonstrate how this approach can be operationalized in the performance analysis process. This application is based on two matches of the German national teams.Footnote 7\(^{,}\)Footnote 8 All parts of this study were developed in close cooperation with the professional match-analysts and coaches (see Acknowledgements).

2 Methods

2.1 Data and definitions

2.1.1 Data collection

The present study uses positional and event data collected in more than six seasons (4118 matches) of the German Bundesliga and \(2{\text {nd}}\) Bundesliga, as well as the above-mentioned two matches of the German national teams. Positional data is captured by optical tracking systemsFootnote 9 and event data consists of manual annotations based on a dedicated official match data catalogue,Footnote 10 defining around 30 events with more than 100 attributes. The event data can be seen as a log of the ball-relevant actions (e.g. passes, shots, tackles or fouls); however, it does not cover complex team-tactical behaviors such as counterpressing.

Since the two data sources are collected independently of each other, they need to be synchronized before they can be processed together. Even though several steps of quality management from independent institutions are conducted on the manually collected event data, the assigned timestamp of a given event can differ significantly from the one in the positional data. The synchronization of positional and event data is conducted by dedicated rules (per event) that extract the exact timestamp and the exact location on the pitch from the tracking data for a given manually tagged event. For example, when synchronizing a pass, the sudden increase in the distance between the passing player and the ball, captured by the optical tracking, can be used to align both location and timestamp of the pass. The positional data is collected at a frequency of 25 Hz and includes the longitudinal, latitudinal, and in the case of the ball, also the altitudinal positions of the players, ball and referees relative to the pitch markings.
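The pass example can be sketched in a few lines. This is only an illustration, not the rule set used in the study: the function name `refine_pass_frame`, the one-second search window around the annotated frame, and the choice of the largest frame-to-frame distance increase as the release point are all assumptions.

```python
import math

FPS = 25  # positional data frequency stated in the text


def refine_pass_frame(player_xy, ball_xy, tagged_frame, window_s=1.0):
    """Return the frame near `tagged_frame` with the largest frame-to-frame
    increase in player-ball distance (hypothetical synchronization rule).

    player_xy / ball_xy: per-frame (x, y) positions of the passer and the ball.
    """
    w = int(window_s * FPS)
    lo = max(1, tagged_frame - w)
    hi = min(len(ball_xy) - 1, tagged_frame + w)
    dist = [math.dist(p, b) for p, b in zip(player_xy, ball_xy)]
    # The ball leaving the foot shows up as a jump in the distance signal.
    return max(range(lo, hi + 1), key=lambda f: dist[f] - dist[f - 1])
```

The ball position at the returned frame can then serve as the synchronized pass location.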

The information about which team is currently in possession of the ball (hereafter referred to as ball possession) and whether the game is running or currently stopped (hereafter referred to as ball status) is crucial for our study. Both values are collected live in the stadium for every frame of the match by a skilled operator focused exclusively on this task.

2.1.2 Definitions

Since there are conflicting definitions of ball possession in the literature (Kempe et al. 2014), we decided to adopt published definitions with expert feedback. The above-mentioned operators, dedicated to acquiring information about ball possession and status, are briefed to mark ball possession for one team from the moment a player of that team touches the ball with ball control until the ball is out of play or an opposing player touches the ball with ball control. Ball control is defined in this context as the ability to conduct a deliberate action with the ball. Whenever a pass is played between two players of one team, the ball possession belongs to that team as long as no opposing player intercepts that pass or wins the ball in an individual duel. Following the definition of Link and Hoernig (2017), we also compute ball possession on a player level (individual ball possession). In the case of an interception, the ball possession change is detected exactly at the time of the first ball touch of the intercepting player. We use the term defensive reaction time (the time it takes to regain ball possession after losing it) as defined in Vogelbein et al. (2014). All situations where either the ball is beyond the pitch markings or the play is stopped by the referee (e.g. because of a foul) are labeled as out-of-play. Hence, if the ball goes out of bounds there must typically be a change in ball possession. Situations in which the touch of the player carrying the ball outside the markings is not declared as a ball possession due to missing control (e.g. a deflected shot), or in which the individual possession model disagrees with the team possession flag, are excluded. For all further investigations, only the effective playing time (also referred to as net playing time), defined as all situations in which the game is running, is considered. Shots, for example, always represent the end of a ball possession phase by definition.
Ball possession phases that end with the halftime-, or final-whistle or a referee ball are excluded from our analysis.
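To make the defensive reaction time concrete, the following sketch derives it from frame-wise ball possession and ball status flags. The input layout (one team label and one in-play flag per frame) and the function name are assumptions made for illustration; phases interrupted by a stoppage are dropped, in line with the exclusions described above.

```python
FPS = 25  # frames per second of the positional data


def defensive_reaction_times(possession, in_play, team):
    """Seconds between each in-play loss of possession by `team` and the
    next regain; phases interrupted by a stoppage are skipped.

    possession: team label per frame; in_play: bool per frame (assumed layout).
    """
    times, loss_frame = [], None
    for f in range(1, len(possession)):
        if not in_play[f]:
            loss_frame = None  # stoppage: discard the open phase
        elif (possession[f - 1] == team and possession[f] != team
              and in_play[f - 1]):
            loss_frame = f  # turnover in open play
        elif loss_frame is not None and possession[f] == team:
            times.append((f - loss_frame) / FPS)
            loss_frame = None
    return times
```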

In addition to these general rules, we developed the following transition-related definitions in consultation with match-analysis experts: A defensive transition phase is defined as the time window in which a team has lost ball possession but is not yet in its ideal defensive formation. Within these defensive transition phases,

a team conducts counterpressing if at least one player exerts (spatial and/or temporal) pressure on the ball carrier, or on the opponents close to the ball.

Note that many different definitions of pressing exist: StatsBombFootnote 11 defines pressing as a defensive player being within a five-yard radius of the ball-carrying opponent.Footnote 12 Very similarly, a more granular and non-binary definition, aggregating the pressure of several defensive players, is presented by Andrienko (2017). Based on these pressing definitions, counterpressing could be defined as situations where pressing is exerted immediately after a ball possession change (Navarro and Javier 2018). Both of these rule-based definitions are used as baseline models for our investigation.
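The five-yard criterion is simple enough to state as code. The sketch below assumes positions given in metres and treats the radius as inclusive; both choices, as well as the function name, are illustrative assumptions.

```python
import math

FIVE_YARDS_M = 5 * 0.9144  # five yards expressed in metres (4.572 m)


def is_pressed(carrier_xy, defenders_xy, radius_m=FIVE_YARDS_M):
    """True if at least one defender is within `radius_m` of the ball carrier.

    Coordinates are assumed to be in metres.
    """
    return any(math.dist(carrier_xy, d) <= radius_m for d in defenders_xy)
```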

However, according to the match-analysts involved in this project, being close to the player in ball possession is not the only way to exert pressure. Attacking or blocking the easiest pass options could, for instance, also be seen as applying pressure.

To quantify the success of counterpressing, we consider a counterpressing situation successful if the ball is regained within five seconds. Shots and goals, scored or conceded, are accredited to the preceding counterpressing phase if they occur within the following 20 seconds.
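These two thresholds translate into a small evaluation routine. In the sketch below the 20-second window is measured from the initial ball loss, which follows the scene selection described in Sect. 2.3.2; the inclusive boundaries, the input layout and the function name are assumptions.

```python
REGAIN_WINDOW_S = 5.0   # regain threshold from the text
CREDIT_WINDOW_S = 20.0  # attribution window from the text


def evaluate_counterpress(loss_t, regain_t, shots):
    """Classify one counterpressing phase and credit the following shots.

    loss_t: time of the ball loss in seconds.
    regain_t: time of the regain in seconds, or None if the ball was not regained.
    shots: list of (time_s, by_pressing_team) tuples (assumed layout).
    Returns (successful, shots_for, shots_against).
    """
    successful = regain_t is not None and regain_t - loss_t <= REGAIN_WINDOW_S
    credited = [by_us for t, by_us in shots
                if 0 <= t - loss_t <= CREDIT_WINDOW_S]
    shots_for = sum(credited)
    return successful, shots_for, len(credited) - shots_for
```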

From here on, the game is split into ball possession phases, which can start and end either with an in-play ball possession change or with a stoppage such as a set-piece. Note that the set-up of the in-play ball possession change, such as the defensive transition, might not be the only factor influencing the defensive reaction time: a fast regain can also occur due to short, uncontrolled ball possessions or risky passes of the opponent. Any ball possession phases that either start with a set-piece or end with a stoppage in play are not considered further. Fig. 1 shows a heatmap displaying the occurrences of transition situations on the pitch. It indicates that most turnovers happen in the opposing half, especially near the sidelines. Ball possession changes due to a ball going out of bounds are added to the area touching the sideline. The high proportion of turnovers in the opponent’s six-yard box is easily identified. This is likely because both saved shots and shots missing the goal wide are counted as a change in possession as soon as the goalkeeper receives the ball.

Fig. 1
figure 1

Overview of where on the pitch turnovers happen most frequently (from the perspective of a team playing from left to right)
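A heatmap of this kind can be produced by counting turnovers per grid cell. The following sketch is an assumption about how such a grid might be built: the pitch size of 105 m by 68 m, the 12 by 8 grid and the bottom-left origin are illustrative choices. Clamping the cell index mirrors the statement that out-of-bounds turnovers are added to the area touching the sideline.

```python
from collections import Counter

PITCH_LENGTH_M, PITCH_WIDTH_M = 105.0, 68.0  # assumed standard pitch size


def turnover_grid(locations, nx=12, ny=8):
    """Count turnovers per grid cell.

    locations: (x, y) in metres, x along the attacking direction, origin at
    the bottom-left corner (hypothetical coordinate convention).
    """
    counts = Counter()
    for x, y in locations:
        # Clamp so that out-of-bounds turnovers fall into the border cells.
        ix = min(nx - 1, max(0, int(x / PITCH_LENGTH_M * nx)))
        iy = min(ny - 1, max(0, int(y / PITCH_WIDTH_M * ny)))
        counts[(ix, iy)] += 1
    return counts
```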

2.2 Supervised machine learning set-up

2.2.1 Hand-crafted labeling of defensive transition situations

Since the rule-based approaches to detect counterpressing that we investigated yielded insufficient accuracy (see Sect. 3.1), we conducted a manual tagging procedure with trained student-analysts. Their task was to label situations with a detectable counterpressing strategy. In total, out of 11,108 relevant defensive turnovers, 3,196 situations were labeled as counterpressing. The labeling was conducted for the first eleven Bundesliga matchdays of the 2018/2019 season from the perspective of the home team. The share of transitions labeled as counterpressing differs significantly between teams. Borussia Mönchengladbach showed the highest value (\(40.07\%\)), whereas only \(21.80\%\) of Hannover 96’s transitions were labeled as counterpressing. The aggregated outcome of the labeling process per team of the German Bundesliga is displayed in Table 7 in Appendix A.

To quantify the inter-labeler reliability, 20 matches were labeled by three different students. To compute the pairwise accuracy for each defensive turnover, we checked whether both students had identified counterpressing in the following two seconds. This yielded a pairwise accuracy of \(82.01\%\), i.e. in \(82.01\%\) of defensive turnovers both students agreed on the nature of the actions following a turnover.
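One way to compute such a pairwise accuracy for three labelers is sketched below. Pooling all labeler pairs into a single agreement rate is an assumption; the text does not state how the three pairs were aggregated.

```python
from itertools import combinations


def pairwise_agreement(labels_by_rater):
    """Share of (rater pair, turnover) combinations with identical labels.

    labels_by_rater: one list of 0/1 labels per rater, aligned by turnover.
    """
    agree = total = 0
    for a, b in combinations(labels_by_rater, 2):
        agree += sum(x == y for x, y in zip(a, b))
        total += len(a)
    return agree / total
```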

As additional information, the experts tried to detect the exact start and end frame of the respective transition situation. The average duration of all transition phases is 9.34 s; counterpressing phases last 9.89 s on average, whereas non-counterpressing turnovers take 9.11 s on average.

2.2.2 Expert-based feature extraction

We defined a list of 134 features that aim to characterize the transition. The features describe the location of a ball possession change, several relevant factors describing both teams’ exact positioning at the time of turnover and their movements in the first two seconds immediately after the ball loss. A time-window longer than two seconds was problematic, because it would cut off too many situations where the ball possession changed within that time.

A team’s decision to conduct counterpressing is heavily influenced by the situation of the ball possession itself. To take this into consideration, all features are also calculated at the moment of the ball possession change. According to football experts, turnovers without the chance to counterpress are often characterized by immediate clearances or aerial duels. Therefore, we included the ball position, the ball height, and the individual ball possession time (Link and Hoernig 2017), which describes the time a player of the ball-possessing team was in direct control of the ball. The involved football experts suggested that counterpressing is often characterized by achieving local compactness close to the ball. We aimed to cover this with several metrics measuring the regaining team’s positioning around the ball. For instance, we use the team’s covered area and global and local stretch indices (Bourbousson and Carole Sève 2010; Santos et al. 2018) as features in our model. A team primarily aiming to defend its own goal after losing possession usually does so by moving at high speed towards its own goal, whereas counterpressing often requires only the players close to the ball to attack their opponents at high speed towards the ball carrier. This is addressed by calculating several speed values and considering each team’s average position, the so-called team-center (Bourbousson and Carole Sève 2010; Andrienko 2017). In contrast to a more conservative transition strategy, counterpressing’s primary objective is not to place many players in a compact unit behind the ball quickly, but rather to defend more aggressively up the pitch. Therefore, we calculate the number of players both in front of and behind the ball, as well as their respective compactness. Although the pressing definition from Andrienko (2017) was not sufficient as a stand-alone rule-based counterpressing detection criterion (see Sect. 3.1), it is incorporated in various features of our model.
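Two of the positional features can be illustrated compactly. The sketch uses the common formulation of the stretch index as the mean distance of the players to their team center; whether the study uses exactly this variant is an assumption, as are the function names and the default of five players for the local index.

```python
import math


def team_center(players_xy):
    """Average position of the given players."""
    n = len(players_xy)
    return (sum(x for x, _ in players_xy) / n,
            sum(y for _, y in players_xy) / n)


def stretch_index(players_xy):
    """Mean distance of the players to their team center."""
    center = team_center(players_xy)
    return sum(math.dist(p, center) for p in players_xy) / len(players_xy)


def local_stretch_index(players_xy, ball_xy, k=5):
    """Stretch index restricted to the k players closest to the ball."""
    nearest = sorted(players_xy, key=lambda p: math.dist(p, ball_xy))[:k]
    return stretch_index(nearest)
```

A low local value indicates the compactness around the ball that the experts associate with counterpressing.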

All features were discussed, consolidated and steadily improved within workshops and based on several steps of evaluation of the detection. A detailed list and description of the features is presented in Table 1, a video describing some of the features can be accessed here.

Table 1 The extracted features that are used for counterpressing detection. Features used in both dimensions of pitch coordinates (horizontally and vertically) and for different time points after the initial ball possession change are listed only once

2.3 Model training

2.3.1 Detection of counterpressing as a supervised machine learning task

We trained several classification algorithms based on the 11,108 labeled defensive turnover situations from 97 matches fulfilling our inclusion criteria (see Fig. 1).

We split the labeled data-set (\(75\%\) training data, \(25\%\) test data) by randomly drawing \(25\%\) of all transitions from every match to avoid over-representing teams, scores, or results.
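A per-match split of this kind might look as follows. The record layout (dictionaries with a `match_id` key), the fixed seed and the rounding of the test share are assumptions for illustration.

```python
import random
from collections import defaultdict


def split_per_match(turnovers, test_share=0.25, seed=0):
    """Draw `test_share` of the turnovers of every match into the test set.

    turnovers: list of dicts with a 'match_id' key (hypothetical schema).
    Returns (train, test).
    """
    by_match = defaultdict(list)
    for t in turnovers:
        by_match[t["match_id"]].append(t)
    rng = random.Random(seed)
    train, test = [], []
    for items in by_match.values():
        items = items[:]  # do not reorder the caller's data
        rng.shuffle(items)
        n_test = round(len(items) * test_share)
        test.extend(items[:n_test])
        train.extend(items[n_test:])
    return train, test
```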

Table 2 Hyperparameter-selection of the XGBoost models

We used the above-defined features (Sect. 2.2.2 or Table 1) and evaluated the best performing models on our set of test data.

Among several base models, we applied extreme gradient boosting (hereafter referred to as XGBoost), a scalable tree boosting system introduced by Chen and Carlos (2016), which has outperformed traditional machine learning algorithms in numerous applications (Li et al. 2019; Liu et al. 2020; Zhang et al. 2020). For our investigation, we want to point out three major advantages of XGBoost: (a) an additional regularization term is added to the loss function, which allows us to make use of our wide set of features while limiting the risk of overfitting; (b) XGBoost is a scalable machine learning model, which can be extended seamlessly as more data or more features become available; and (c) no feature normalization is required.
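For reference, the regularized objective mentioned in (a) has the following form in the original XGBoost formulation. Here \(l\) is a differentiable loss, \(f_k\) the \(k\)-th tree, \(T\) the number of leaves of a tree and \(w\) its vector of leaf weights; \(\gamma\) and \(\lambda\) are the regularization hyperparameters. Which of these were tuned in this study is not stated here.

```latex
\[
\mathcal{L} \;=\; \sum_{i=1}^{n} l\left(y_i, \hat{y}_i\right) \;+\; \sum_{k=1}^{K} \Omega\left(f_k\right),
\qquad
\Omega(f) \;=\; \gamma T + \tfrac{1}{2}\,\lambda\,\lVert w \rVert^{2}
\]
```

The second sum penalizes both the number of leaves and the magnitude of the leaf weights, which is what limits overfitting when many features are available.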

Before training our model, a set of hyperparameters (shown in Table 2) has to be defined. As presented in Wang (2019), we applied Bayesian tree-structured Parzen Estimator hyperparameter optimization approaches to obtain the highest possible accuracy and avoid overfitting. By using tree-structured Parzen Estimators (Bergstra 2011) as the surrogate model in Bayesian optimization (Dewnacker et al. 2016), we reduce the running time of hyperparameter tuning and achieve better scores on the testing set. Note that the hyperparameter nrounds was set to a maximum of 400 iterations.

To further guarantee the stability of our model and avoid overfitting, we applied five-fold cross-validation on the training data.

As described in Sect. 2.1.2, we also implemented two rule-based baseline models that serve as a benchmark for our detection. A naive approach defines counterpressing as follows: whenever one or more players are within a five-yard radius around the ball carrier during the first individual ball possession phase following a turnover, the situation is classified as counterpressing (hereafter referred to as naive rule-based approach).Footnote 13 The second approach (hereafter referred to as Andrienko-approach) classifies a turnover as counterpressing whenever the first player in ball possession receives pressure exceeding a certain threshold according to the pressure definition in Andrienko (2017), whereby the final threshold of 0.74 was obtained by maximizing the \(F_1\)-score on the training set.
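Selecting a threshold by maximizing the \(F_1\)-score can be sketched as a simple scan. The candidate set (all observed pressure values) and the function names are assumptions; the pressure values themselves would come from the cited pressure model, which is not reproduced here.

```python
def f1_score(y_true, y_pred):
    """F1 = 2TP / (2TP + FP + FN); returns 0.0 if there are no true positives."""
    tp = sum(bool(t) and bool(p) for t, p in zip(y_true, y_pred))
    fp = sum((not t) and bool(p) for t, p in zip(y_true, y_pred))
    fn = sum(bool(t) and (not p) for t, p in zip(y_true, y_pred))
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


def best_threshold(pressure, y_true):
    """Scan observed pressure values as candidate thresholds and keep the one
    with the highest F1; a turnover is predicted positive if pressure > th."""
    candidates = sorted(set(pressure))
    return max(candidates,
               key=lambda th: f1_score(y_true, [p > th for p in pressure]))
```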

2.3.2 Effectiveness for counterpressing and fast possession regains

Because football is a low-scoring sport, defining a success metric for a transition phase requires examining the subsequent actions at a finer granularity than merely checking whether they are followed by a goal. For both cases (successful and unsuccessful ball recoveries through counterpressing) we extracted shots taken, expected goalsFootnote 14 and actual goals following a transition phase. To investigate this issue, two definitions had to be made: which ball recovery latency of a counterpressing strategy should be considered successful, and for how long a defensive or an offensive action should be accredited to the previous ball recovery (strategy). A first indicator for a potential threshold for successful counterpressing is given by Pep Guardiola’s five second rule. We queried relevant video scenes with possession regains after 3, 4, 5, 6, 7 and 8 seconds and discussed them with a group of professional match-analysts. The same procedure was conducted to investigate the follow-up goal-scoring opportunities. Here, scenes with shots 10, 15, 17, 20 and 25 seconds after the initial ball loss were discussed. Through this procedure, we finally agreed on the definitions described in Sect. 2.1.2.

3 Results

3.1 Statistical evaluation

3.1.1 Detection of counterpressing

With the above-described supervised machine learning set-up, we are able to detect counterpressing situations with sufficient accuracy for practical applications (see also Sect. 3.3). Table 3 shows a statistical evaluation of the different models, of which XGBoost performed best. Per team and per match, we detect around 20 to 30 counterpressing situations out of around 90 to 200 transition situations.

Table 3 Statistical evaluation of the counterpressing outcome

With the highest overall area under the curve (AUC), the above-presented optimization (Table 3, row 1) is best suited for the long-term analysis of several seasons with the goal of identifying trends and underpinning practitioner rules (RQ1 to RQ4, Sect. 3.2). The XGBoost model with a class balancer (Table 3, row 4) has a higher recall of \(80\%\) with a still acceptable false positive rate. Thus, it can be applied for specific performance analysis of either the own match or several matches of the next opponent (PA, Sect. 3.3), where match-analysts spend a lot of time analyzing video footage either way. The optimal hyperparameters used for both models can be found in Table 2. The two baseline approaches exhibit a very low overall accuracy. The naive rule-based approach (Table 3, row 5) classifies \(72.41\%\) of all turnovers as counterpressing, which leads to a high recall but also a large number of false positives. For the Andrienko-approach, selecting the threshold by optimizing the \(F_1\)-score led to a more realistic percentage of predicted counterpressing situations in the test set (\(25.65\%\)), but the approach is nevertheless significantly outperformed by either machine learning approach.

Another advantage of the XGBoost approach is that the individual influences of our rich feature set can be quantified and interpreted to some extent by analyzing the respective SHAP-values.Footnote 15 The name goes back both to their originator Lloyd Shapley, who introduced them in the context of cooperative game theory (Roth and Thomson 1988), and to Lundberg and Su (2017), who used the concept to interpret the features of machine learning models. In comparison to traditional feature importance methods (e.g. gain or the Saabas method), SHAP-values present a consistent and locally accurate method to identify the individualized feature contribution to machine learning models. This method has been used effectively in different applications (Antipov et al. 2020; Meng et al. 2020; Ibrahim et al. 2020; Anzer and Bauer 2021).
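The game-theoretic definition behind SHAP-values can be illustrated with a brute-force computation of Shapley values for a small set function. This is a didactic sketch only: it enumerates all coalitions and is therefore exponential in the number of features, whereas practical SHAP implementations for tree models rely on much faster algorithms. The function name and the toy value functions in the usage note are assumptions.

```python
from itertools import combinations
from math import factorial


def shapley_values(value, n):
    """Exact Shapley values for a set function `value(frozenset)` on n players.

    Each player's value is the weighted average of its marginal contribution
    value(S + {i}) - value(S) over all coalitions S that do not contain i.
    """
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi
```

For an additive value function each player simply receives its own weight, and for a function that pays out only when players 0 and 1 are both present, these two share the payout equally while player 2 receives nothing.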

Fig. 2
figure 2

Feature influence to the counterpressing prediction based on SHAP-values

Figure 2 displays the most influential features according to the SHAP values in two different representations. In the left plot, each dot represents the contribution of the feature to the model, whereas the color-coding describes the value of that feature. Both the absolute individual ball possession time (IndividualBallPossession (abs)) and the speed of the regaining team two seconds after the change in ball possession (SpeedRegainingTeamPlayer (2s)) have a very strong and linear impact on the predictions. Besides the fact that both features have the highest overall influence on the prediction (widest dispersion of the dots in the left part of Fig. 2), the interpretation of the SHAP-values can be expressed as follows: the higher the absolute amount of individual ball possession time within the first three seconds and the higher the speed of the regaining team two seconds after the turnover, the more likely a defensive turnover is classified as counterpressing. The number of opposing players behind the ball after one second (PlayersOpposingTeamBehindBall (1s)) influences the prediction in a different way: A high number of players behind the ball increases the chances of a counterpressing classification, but the relation is clearly non-linear. To get a better idea of the influence, we look at the right part of Fig. 2. The value per feature is now displayed on the x-axis, whereas the model influence is shown on the y-axis. If fewer than four players are behind the ball, this feature on average decreases the chance of a counterpressing classification. The number of four defenders (almost half of the team) seems to be a decisive threshold. If four or more players are behind the ball, this feature has a positive contribution to the prediction. This not only aligns with the expectation of the practitioners, but also led to a very valuable discussion among the professional analysts.
A more complex relation is shown by the local stretch index of the five players of the opposing team closest to the ball two seconds after the ball possession change (OpposingTeamlocal5 stretch (2s)). After a steep increase starting at 600 cm, the influence of that feature reaches its maximum at roughly 1,000 cm (see right plot in Fig. 2) but decreases afterwards. This indicates that, up to this point, a higher stretch index (lower compactness) of the opposing team after two seconds increases the chances for counterpressing.

Excluding features with little to no influence according to the SHAP-values improved neither the \(F_1\)-score nor the AUC of our prediction on the test data-set.

3.1.2 Effects of counterpressing

Table 4 shows the outcome of counterpressing regarding goals and shots scored or conceded within 20 seconds. If a team is successful, i.e. wins back the ball within five seconds, its chance of scoring increases tremendously; if it is unsuccessful, it is far more likely to concede a goal.

Table 4 For all 4118 considered matches of the German Bundesliga and \(2{\text {nd}}\) Bundesliga, this table shows the probability of shots and goals per team following counterpressing situations

It is no surprise that the chances to either shoot or score are significantly higher when counterpressing was applied successfully, since it implies the crucial attacking advantage of having gained possession of the ball.

While using goals scored versus conceded would theoretically be ideal to measure success, the low-scoring nature of football often prevents us from doing so. Therefore, we use shots to compare teams and coaches. Nevertheless, according to the experts, looking at both the shot balance and the goal (or even expected goal) balance provides a very valuable key performance indicator for counterpressing.

3.2 Subject-specific evaluation of six seasons of German Bundesliga data

In the following section, we use our quantitative results as a baseline for a qualitative, subject-specific evaluation and interpretation with the involved experts.

3.2.1 Lessons learned about defensive transitions (RQ1)

A common procedure when analyzing a team’s transition strategy is looking at the easily acquirable defensive reaction time (Vogelbein et al. 2014). This, however, comes with the drawback that it cannot distinguish between situations with intentional counterpressing behavior and noise. Note that in non-trivial defensive turnover situations a team can typically choose between falling back and conducting counterpressing. For the purpose of this study, however, the distinction between fallback and other non-counterpressing situations was removed for the sake of simplicity. An additional analysis of the expert-based labeling (see Appendix A) showed that around \(62.5\%\) of all ball losses fulfilling the inclusion criteria cannot be assigned to either defensive strategy (counterpressing or fallback). The sheer number of these situations, with a very short defensive reaction time (on average 7.83 s) and without any detectable defensive tactical choice, significantly influences the defensive reaction time when computed over all transitions. Further analysis (see Appendix A) shows that specifically turnovers with very short individual ball possession times fall into this category. Their exclusion is a crucial step towards a better understanding of transition situations.

In general, counterpressing is not always advantageous and needs to be executed well. Although the expectation of some practitioners (invigorated by Jürgen Klopp’s statement "counterpressing is the best playmaker") may be different, it is still intuitive that the team in possession of the ball has a higher chance to perform a successful offensive action (e.g. a shot or goal; see Table 4, row 1). However, this is an average over all teams independent of their skill level and, as we will see later, some teams and coaches were able to apply counterpressing so successfully that they even ended up with a positive shot balance by creating more shots after counterpressing than they conceded. In order to properly assess the risk-versus-reward nature of counterpressing, one would ideally compare it to its strategic counterpart, falling back. But even then, one would need to carefully address potentially confounding variables describing the original situations since, as the feature importance discussion suggests, the situation typically dictates the strategic response. This goes beyond the scope of this study but could be the ground for interesting future work. Additionally, since the non-counterpressing situations comprise a myriad of different circumstances, they do not serve as a reasonable baseline for the effectiveness of counterpressing.

3.2.2 Define and statistically underpin objective Benchmarks (RQ2)

Based on the definitions explained above and the trained prediction models, several quotients and ratios were discussed with the experts. Aggregated on a season level, we analyzed their correlation with a team’s final ranking. Our detection provides several different metrics with a significant correlation to success. These performance indicators can be calculated per match, per match phase (e.g. one half) or even per turnover, which allows practitioners to objectively compare their team’s performance with pre-defined benchmarks. The ratio of successful counterpressing situations to the total number of transitions predicts a team’s final ranking best; its Pearson correlation is negative because a lower rank number denotes a better finish (\(r=-0.44\)). Another metric correlating with the final ranking, which was very valuable to the experts due to its direct interpretability, was the shot and goal balance (\(r=-0.36\) shots, \(r=-0.42\) goals), which describes whether more shots or goals are created after successful counterpressing than are conceded after failed attempts. However, these metrics should be used carefully since they are based on a small sample size and could be confounded with the overall offensive or defensive qualities of a team. Losing the ball increases the probability of an opponent conducting an offensive action. We therefore present an effective way to monitor the outcome of counterpressing strategies: several performance indicators that enable coaches to objectively benchmark a team’s defensive transition behavior.
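The season-level correlation described above can be illustrated with a minimal sketch. The team-season numbers below are invented purely for illustration and are not taken from our data; the helper name `pearson_r` is likewise an assumption of this sketch.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical team-seasons:
# (successful counterpressings, total transitions, final rank)
team_seasons = [
    (310, 2900, 1),
    (250, 3000, 5),
    (220, 3100, 9),
    (240, 3050, 13),
    (180, 3200, 17),
]

success_ratio = [s / t for s, t, _ in team_seasons]
final_rank = [rank for _, _, rank in team_seasons]

# A negative value means: higher success ratio, better (numerically lower) rank.
r = pearson_r(success_ratio, final_rank)
print(f"r = {r:.2f}")
```

With real data, one row per team and season would replace the invented list; the sign convention stays the same because rank 1 is the best finish.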

3.2.3 Approve established rules of thumb (RQ3)

A widespread rule of thumb is that counterpressing is ideal after ball losses close to the sideline or close to the corners in the opponent’s half. Fig. 3 presents two heatmaps that support this statement. Secondly, we examine whether a numerical superiority of players close to the ball increases the chance of a successful counterpress, as assumed by many experts. For that, we examine the 109,852 detected counterpressing situations satisfying the inclusion criteria. Whenever the team out of possession has a numerical superiority within a 10 m radius around the ball at the time of the turnover, it regains ball possession within 5 seconds \(36.2\%\) of the time, compared to only \(30.2\%\) when the other team has more players in that area. This indicates that the rule of thumb has some truth to it, but that numerical superiority is far from the only factor deciding whether a counterpress will be successful.

Fig. 3 Heatmap of ball possession change locations before unsuccessful counterpressing situations (left) and successful situations (right)
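The numerical-superiority check within a 10 m radius around the ball can be sketched as follows. This is only an illustration under simple assumptions: positions are given as (x, y) tuples in metres for a single frame at the time of the turnover, and the function names are hypothetical rather than part of our pipeline.

```python
import math

def players_within(ball, players, radius_m=10.0):
    """Count players whose (x, y) position lies within radius_m of the ball."""
    bx, by = ball
    return sum(1 for (x, y) in players
               if math.hypot(x - bx, y - by) <= radius_m)

def has_numerical_superiority(ball, pressing_team, ball_winning_team,
                              radius_m=10.0):
    """True if the team that just lost the ball has more players near the ball."""
    return (players_within(ball, pressing_team, radius_m)
            > players_within(ball, ball_winning_team, radius_m))

# Illustrative frame at the time of a turnover (coordinates in metres).
ball = (0.0, 0.0)
pressing_team = [(3.0, 4.0), (-6.0, 0.0), (20.0, 5.0)]   # two within 10 m
ball_winning_team = [(1.0, 1.0), (15.0, 0.0)]            # one within 10 m

print(has_numerical_superiority(ball, pressing_team, ball_winning_team))
```

Grouping all counterpressing situations by this flag and comparing the share of regains within 5 seconds per group would yield a comparison of the kind reported above.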

3.2.4 Compare Teams’ and Coaches’ Counterpressing in German Bundesliga (RQ4)

Further investigations highlight the extent to which teams use completely different defensive transition strategies. We investigated the coaches whom the experts expected to show pronounced counterpressing behavior. The respective ranking among all coaches who completed a full season can be found in Appendix C.

Table 5 Comparison of counterpressing-related performance indicators in Bundesliga. The colored arrows display the rankings within the respective groups compared to the average (teams by ranking; all teams by seasons; teams and coaches). For the columns presenting the shot and goal balance we used colored dots for the teams and coaches, showing whether the total outcome per match is positive, neutral or negative

Table 5 also gives a first indication of which further aspects could be considered. Teams finishing in the top 5 of a season perform above average in almost all defined metrics. No consistent tendency over the considered seasons is detectable: games do not get more intense in terms of total in-play transitions per match, nor are there significant changes in teams’ average counterpressing behavior. FC Bayern München has the best goal balance after counterpressing, which might be heavily influenced by their offensive efficiency. Jürgen Klopp and Ralf Rangnick performed extraordinarily well in terms of creating more shots than they conceded after attempting counterpressing. Given that Jürgen Klopp finished only in seventh place in one of his two considered seasons, this should be seen as an outstanding performance. Over the course of a whole season, only nine coaches achieved a non-negative shot balance within 20 s after their own counterpressing; the average final ranking of the respective teams was four. Note that teams playing at home tend to conduct counterpressing slightly more often than away teams. Considering only home teams, our model classifies \(27.24\%\) of all included defensive transitions as counterpressing, which is roughly in line with the labeled training data, for which only home teams were labeled (\(28.77\%\)).
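As an illustration of the shot balance within 20 s after counterpressing, consider the following minimal sketch. It assumes that counterpressing situations and shots are available as plain timestamps in seconds and that the window starts at the counterpressing timestamp; both the data layout and the function name are assumptions made for this example only.

```python
def shot_balance(counterpress_times, shots, team, window_s=20.0):
    """Shots created minus shots conceded within window_s seconds
    after each counterpressing situation of `team`.

    shots is a list of (time_in_seconds, shooting_team) tuples.
    """
    created = 0
    conceded = 0
    for t0 in counterpress_times:
        for shot_time, shooting_team in shots:
            if t0 <= shot_time <= t0 + window_s:
                if shooting_team == team:
                    created += 1
                else:
                    conceded += 1
    return created - conceded

# Illustrative match: team "A" counterpresses three times.
counterpress_times = [100.0, 400.0, 900.0]
shots = [(110.0, "A"), (415.0, "B"), (430.0, "B"), (2000.0, "A")]

balance = shot_balance(counterpress_times, shots, team="A")
print(balance)  # one shot created (110 s), one conceded (415 s)
```

If two counterpressing situations lie less than 20 s apart, this simple version counts a shot for both of them; a production implementation would need a rule for such overlaps.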

Figure 4 shows a shortlist of teams’ counterpressing outcomes. Note that, since on average more shots/goals are conceded than created when counterpressing, all four axes are scaled differently. For both sub-figures, teams on the upper left, including, for example, 1. FC NürnbergFootnote 16, Hannover 96Footnote 17 and SV Darmstadt 98Footnote 18 in the left figure, perform worse. Teams on the bottom right tend to generate more shots/goals and allow fewer while counterpressing. Teams with high values in the top right quadrant, like Borussia MönchengladbachFootnote 19, FC AugsburgFootnote 20 or TSG HoffenheimFootnote 21, seem to employ a risky defensive transition strategy, both creating and conceding many shots after their own counterpressing.

Fig. 4 For both shots (left figure) and goals (right figure) teams are displayed depending on their percentage of counterpressings leading to either shots or goals for (x-axis) or against (y-axis). Team values are aggregated across all considered seasons

As a general recap of the Bundesliga analysis, we would like to point out that teams use significantly different transition strategies (RQ4). The experts’ expectations of which coaches use counterpressing more often and/or more efficiently were underpinned by the results.

3.3 Proof of concept

The central objective of this study is to automate the detection of counterpressing situations. This helps match-analysts in their daily processes by saving them time, but also by providing objective and comparable benchmarks. We first describe the general set-up and then conduct a prototypical application for two exemplary matches of German national teams.Footnote 22\(^{,}\)Footnote 23 Based on the results described above, we are now able to provide match-analysts with two files fully automatically in virtually real-time. First, they receive a list of all detected counterpressing situations. To integrate this efficiently into their ecosystem, the files are produced in different file formats, which can be imported into their video-analysis tool of choice (e.g. Hudl Sportscode,Footnote 24 Stats Edge ViewerFootnote 25). Such tools basically help to handle tags or labels in combination with the video footage. Figure 5 shows how this eliminates the usual process of the match-analysts labeling the videos manually in an exemplary tool (here Hudl’s Sportscode). Usually, match-analysts use these tools to tag important situations live during the match, but also in detail post-match for opponent analysis. Once a match or parts of it are tagged, the tool allows the analyst to output the tags either as a video playlist or as an xml-file containing the category and the time-frame of each tag. Depending on the coaches’ needs, the outcome is presented either as a video playlist or as a quantitative report giving an aggregated overview, which is also typically produced manually. An automatically generated counterpressing-playlist for the U21 match based on our prediction can be viewed here.

Fig. 5 Integration of the counterpressing analysis into the match-analysts’ and coaches’ daily business. The yellow part shows the traditional process; the grey part displays how the automated analysis fits in seamlessly
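To make the idea of an exported tag list concrete, the following sketch writes detected situations, each with a category and a time-frame, to a small XML document. The element names (`tags`, `tag`, `category`, `start`, `end`) are invented for this illustration and do not reproduce the import schema of Hudl Sportscode or any other video-analysis tool; a real export has to follow the schema documented by the respective vendor.

```python
import xml.etree.ElementTree as ET

def tags_to_xml(situations):
    """Serialize detected situations to a generic, hypothetical XML tag list.

    Each situation is a dict with the keys 'category', 'start_s' and 'end_s'
    (times in seconds from the start of the video).
    """
    root = ET.Element("tags")
    for i, s in enumerate(situations, start=1):
        tag = ET.SubElement(root, "tag", id=str(i))
        ET.SubElement(tag, "category").text = s["category"]
        ET.SubElement(tag, "start").text = f"{s['start_s']:.2f}"
        ET.SubElement(tag, "end").text = f"{s['end_s']:.2f}"
    return ET.tostring(root, encoding="unicode")

# Illustrative detections, not taken from a real match.
detected = [
    {"category": "counterpressing successful", "start_s": 312.4, "end_s": 317.0},
    {"category": "counterpressing unsuccessful", "start_s": 1045.0, "end_s": 1065.0},
]

xml_text = tags_to_xml(detected)
print(xml_text)
```

Keeping the detection output in a tool-independent structure such as the list of dicts above makes it straightforward to add one writer per target format.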

Second, we automatically provide coaches and analysts with a counterpressing match report including visualizations after the respective match, or with reports covering an entire season. For the U21 match, an excerpt of the automated match report is presented in Fig. 6. Only two shots occurred within 20 seconds after either team counterpressed. However, these two shots after unsuccessful counterpressing attempts by the German team led to two goals, which were decisive for the match outcome (Germany vs. Belgium, final score 2–3). Since situations leading to goals are analyzed at a highly detailed level by coaches and analysts either way, they arrived at the same conclusion. Nonetheless, this report helps coaching staffs to evaluate whether the German team had a general bias in their defensive transition behavior or whether the two goals happened as a consequence of extraordinary opposing actions or defensive errors by the German team. In this case, bad counterpressing behavior contributed significantly to both goals. Another outcome was that the many successful counterpressing situations (\(1.23\%\) above Bundesliga average) did not result in a single shot.

Fig. 6 Automated match report for the match Germany U21 against Belgium U21 in Freiburg. The pitch visualizations show the positions of ball losses leading to a successful (green) or unsuccessful counterpressing. Whereas the absolute values are presented in the middle block, a benchmark of the central performance indicators against the Bundesliga average is shown in the lower grey box

The practical implementation of this study was prototyped using two recent matches of the German national teams. After each match, both the Sportscode-xml file and the automated match-report sheets were produced and shared with the experts for their post-match-analysis process. Additionally, to validate the report, the matches were manually analyzed to provide a ground truth against which to compare our results. The overall results are shown in Table 6.

Table 6 Statistical expert-evaluation of the counterpressing outcome. For the defensive transition column, the number in brackets displays the number of scenes excluded by our criteria (see Fig. 1)

For the first game, 153 relevant defensive transition situations of the German national team were queried using the rules defined above. Of these, 25 scenes were detected as counterpressing, of which 6 were manually excluded from the final counterpressing playlist. For the second match, 33 of 164 defensive transitions were detected as counterpressing, and the analysts manually ruled out 5. The manually excluded scenes were discussed with all experts, and it turned out that different definitions and interpretations led to different labels. In this case study, ten of the eleven manually excluded situations consisted of only one player exerting pressure. Although this fulfills our definition, some of the experts would classify situations with only one player defending actively towards the ball as fallback (definition in Appendix A). All scenes that were additionally labeled by the respective match-analysts (five in total) contained a clearance shortly after the initial ball loss and were thus not clearly related to the counterpressing strategy.
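The counts above can be turned into rough precision and recall figures, as in the sketch below. It rests on an interpretation that is an assumption of this example: scenes excluded by the analysts are treated as false positives, and the five scenes they added are treated as false negatives. Table 6 remains the authoritative summary.

```python
def precision(detected, excluded_by_analysts):
    """Share of detected scenes that the analysts kept in the playlist."""
    if detected <= 0:
        raise ValueError("detected must be positive")
    return (detected - excluded_by_analysts) / detected

def recall(confirmed, added_by_analysts):
    """Share of analyst-approved scenes that the model had detected."""
    return confirmed / (confirmed + added_by_analysts)

# (detected, manually excluded) per match, as stated in the text.
matches = {"match 1": (25, 6), "match 2": (33, 5)}
per_match_precision = {name: precision(d, e) for name, (d, e) in matches.items()}

# Scenes kept by the analysts across both matches: 19 + 28 = 47.
confirmed_total = sum(d - e for d, e in matches.values())
# The text reports five additionally labeled scenes in total.
pooled_recall = recall(confirmed_total, added_by_analysts=5)

for name, p in per_match_precision.items():
    print(f"{name}: precision = {p:.3f}")
print(f"pooled recall = {pooled_recall:.3f}")
```

Under this reading, roughly three quarters or more of the detected scenes per match were kept, and about nine in ten analyst-approved scenes had been detected automatically.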

Both the Sportscode-xml and the automated match report turned out to be valuable for the coaching staff. Because the inaccuracies are interpretable, the experts trust the outcome for further applications. They deemed the results sufficient for practical usage and noted that the approach saved vast amounts of resources in the pre- and post-match analysis. The automated match report (see Fig. 6) provides an objective comparison with a flexible benchmark (here the Bundesliga average) and therefore a new way to approach complex tactical strategies. The shot and goal balance are very intuitive and allow direct monitoring of the efficiency of the conducted strategies.

4 Discussion

This paper shows that complex tactical strategies, such as counterpressing, can be detected automatically based on synchronized positional and event data. Comparing teams’ counterpressing strategies objectively and over longer periods of time creates insights that could not have been achieved with traditional methods.

The interdisciplinary cooperation turned out to be a very beneficial factor for this study. In our opinion, such a combination of competencies is necessary to obtain relevant results. Machine learning techniques are required to detect complex strategies from spatio-temporal data, but tactical football expertise is equally indispensable to determine definitions, extract features, and evaluate and interpret the resulting outcomes. A key lesson we learned through this study is that both the definitions of complex strategies and their interpretation vary between football experts; this became apparent during an intensive process of expert-supported evaluations. One of our most meaningful key-performance-indicators is the shot balance after counterpressing. Here, shots are used as a proxy for a successful attack. This is a common procedure in football analytics; however, the approach could be extended by using expected goals (e.g. Anzer and Bauer (2021)) or expected possession values (e.g. Spearman (2018)).

Since no comparable approach for detecting counterpressing exists in the literature, we implemented two naive baseline approaches to benchmark our model against (see Table 3). While the approach based on Andrienko (2017) was originally designed to quantify pure pressure rather than counterpressing, we built this rule-based approach on Fernandez and Bornn (2018), who defined counterpressing as immediate pressure after losing the ball. Hence, it may not be the ideal approach, but due to the complete lack of alternatives in the literature, we use it as a benchmark model. Even though our model outperforms different rule-based baseline models (see Table 3) and the prediction accuracy is sufficient for practical application (see section 3.3), the basic limitation to achieving further accuracy is the inter-labeler reliability of \(82.01\%\). After discussing the definitions, no further steps of consolidation between the labelers were conducted, but we would highly recommend such a step, including strict monitoring of the inter-labeler reliability, for similar investigations. However, data labeling is a time-consuming process which cannot be conducted for each occurring philosophy and definition. Furthermore, methodologies that reduce labeling efforts, such as weak supervision, should be implemented on top of general detections such as the one presented here to adjust definitions to specific needs and to improve both the accuracy and the degree of individualization (Ratner et al. 2016, 2017). With an even larger amount of labeled data, one could consider using continuous features, or even the raw positional data of all players, instead of features at discrete time points. The application of labeling-support methods could lead to more individualized and more consistent labels and thus to a better prediction.
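Monitoring inter-labeler reliability, as recommended above, can be as simple as the following sketch. It assumes two labelers who assigned a binary label (1 for counterpressing, 0 otherwise) to the same scenes and computes plain percent agreement; Cohen's kappa is added as a common chance-corrected alternative. Whether either quantity matches the exact computation behind the reported \(82.01\%\) is not implied here, and the labels below are invented.

```python
def percent_agreement(labels_a, labels_b):
    """Share of scenes on which both labelers assigned the same label."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)

def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two labelers."""
    n = len(labels_a)
    p_observed = percent_agreement(labels_a, labels_b)
    categories = set(labels_a) | set(labels_b)
    # Agreement expected if both labelers assigned labels independently.
    p_expected = sum((labels_a.count(c) / n) * (labels_b.count(c) / n)
                     for c in categories)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (p_observed - p_expected) / (1.0 - p_expected)

# Invented labels for ten scenes (1 = counterpressing, 0 = other).
labeler_a = [1, 1, 0, 0, 0, 1, 0, 0, 0, 0]
labeler_b = [1, 0, 0, 0, 0, 1, 0, 1, 0, 0]

agreement = percent_agreement(labeler_a, labeler_b)
kappa = cohens_kappa(labeler_a, labeler_b)
print(f"agreement = {agreement:.2f}, kappa = {kappa:.2f}")
```

Because non-counterpressing scenes dominate, percent agreement can look high even when labelers disagree on many counterpressing scenes, which is why a chance-corrected measure is worth tracking alongside it.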

With an even more accurate model using one of the approaches described above, an improved and team-individual success prediction model for counterpressing could significantly support the reflection on teams’ decision-making processes. Also, adapting the counterpressing detection itself to team-specific definitions provides huge potential for further investigations.

Vogelbein et al. (2014) evaluated 306 matches of the 2010/2011 Bundesliga season and showed that the time it takes to regain the ball also depends on the score at the time. They pointed out that teams with a lead tend to regain the ball more slowly than the ones that are trailing, and that teams finishing their season in the top third of the table regain ball possession significantly faster than the other teams, especially in drawn and losing match states. We found that the defensive reaction time, which serves as a baseline for our success definition, typically includes many noisy situations where no clear strategy is observable. Further lessons learned regarding defensive transitions are described in section 3.2.1 and extended in Appendix A. The high influence of individual ball possession times on the predictions (see Fig. 2) can be attributed to uncontrolled situations without the possibility for either defensive strategy (counterpressing, fallback). This also shows a limitation of our work: some of the model’s most important features focus on the situation itself and less on the strategy conducted thereafter. A possible explanation for this is that, besides filtering out noisy situations, the model found that most often the situation dictates the defensive response.

Nevertheless, a tendency of the opponent to play counterattacks or especially risky passes could lead to many fast ball recoveries independent of the defensive transition strategy. This issue could be addressed either by combining this approach with an equivalent detection of offensive transition strategies, as shown for example in Hobbs et al. (2018), or by including more features of the opposing team or even the raw data of all 22 players and the ball. Practitioners are interested not only in the question of in which situations counterpressing induces a high chance of a possession regain, but also, given a situation where counterpressing is conducted, in how likely it is to take or concede a shot when conducting that strategy. Regaining the ball fast might not be the only objective of counterpressing. Consequently, future investigations should also consider quantifying alternative success definitions, e.g. slowing down the opposing attack, forcing back-passes etc.

Another missing piece which should be investigated further is an accurate selection of fallback situations. Comparing their risk-reward structure to that of counterpressing situations could lead to crucial insights by objectively evaluating a team’s decision to counterpress versus falling back. Since different teams may have their own club-specific definitions, our experimental set-up could be applied to arbitrary counterpressing definitions or even other tactical patterns, as long as sufficient labeled data are gathered. An interesting follow-up study could investigate how many labeled matches would be necessary to achieve a sufficient accuracy depending on the definition. In our case we found 100 labeled matches to be sufficient, but we also stress that a high inter-labeler reliability is necessary.

Note that counterpressing is only one example of a complex tactical pattern that is of interest to match-analysts but not covered in typical event-level data. The needs of match-analysis departments, combined with the growing availability and accuracy of positional and event data, present a huge potential for task-automating approaches.

5 Conclusions

Based on expert-evaluated definitions and hand-crafted labels, we are able to detect counterpressing strategies automatically with sufficient accuracy in a supervised machine learning set-up. By producing both an understandable match report and tagging files suitable for conventional video-analysis software, the integration of the process into a match-analyst’s daily business saves a significant amount of time. The outcome helps to analyze one’s own team’s performance and provides helpful information about the next opponent’s defensive transition behavior (PA).

We can differentiate between intended counterpressing strategies and the many uncontrolled transition situations with short defensive reaction times. This provides not only a better understanding of transitions but also several more granular performance indicators describing defensive transitions (RQ1). From the respective performance indicators, consolidated by statistical evidence and expert opinions, we derived interpretable and intuitive metrics (RQ2), such as the goal or shot balance, which quantify the efficiency of the counterpressing strategy; these metrics were not used before but, according to the experts, seem to have huge potential. Two rules of thumb supported by our data are that counterpressing is more likely to succeed closer to the sidelines and that a numerical superiority close to the ball increases the chance of winning it back (RQ3). By analyzing different facets over several seasons, we are also able to quantify trends over a long period of time: teams within the German Bundesliga follow appreciably different transition strategies (RQ4). Furthermore, successful teams, measured by their final ranking, tend to use the counterpressing strategy more efficiently, giving credence to the notion of declaring it an offensive strategy (RQ2).