Football tracking data: a copula-based hidden Markov model for classification of tactics in football

  • Original Research
  • Annals of Operations Research

Abstract

Driven by recent advances in technology, tracking devices now make it possible to collect high-frequency data on player positions in (association) football matches and in many other sports. Although such data sets are available to every professional team, most teams still rely on time-consuming video analysis when analysing future opponents, for example with regard to how goals were scored or a team’s general style of play. In this contribution, we provide a data-driven approach for the automated classification of tactics in football. For that purpose, we use hidden Markov models (HMMs) to analyse high-frequency tracking data, where the underlying states serve as proxies for a team’s tactics. In particular, as space control in football has been considered a major driver of success, we focus on the effective playing space, which is the convex hull formed by a team’s players, excluding the goalkeeper. This quantity relates to both playing style and team behaviour. Using copula-based HMMs, we jointly model the effective playing space of both teams to account for the competitive nature of the game. Our model thus provides an estimate of a team’s playing style at each time point, which can be beneficial for team managers and is also of great interest to football fans.
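
To make the effective playing space concrete, the following is a minimal sketch of how it could be computed for a single frame of tracking data. It assumes that the quantity of interest is the area of the convex hull and that positions are given as (x, y) pitch coordinates in metres. The function names `convex_hull` and `effective_playing_space` and the example frame are hypothetical and are not taken from the article.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def convex_hull(points: List[Point]) -> List[Point]:
    """Return the hull vertices in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o: Point, a: Point, b: Point) -> float:
        # Positive if o -> a -> b makes a counter-clockwise turn.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]


def effective_playing_space(outfield_positions: List[Point]) -> float:
    """Area of the convex hull of the outfield players (goalkeeper excluded by the caller)."""
    hull = convex_hull(outfield_positions)
    if len(hull) < 3:
        return 0.0  # degenerate case: all players on one line
    twice_area = 0.0
    for i in range(len(hull)):
        x1, y1 = hull[i]
        x2, y2 = hull[(i + 1) % len(hull)]
        twice_area += x1 * y2 - x2 * y1  # shoelace formula
    return abs(twice_area) / 2.0


# Hypothetical frame: ten outfield players of one team, coordinates in metres.
frame = [(0, 0), (40, 0), (40, 30), (0, 30), (10, 10),
         (20, 15), (30, 20), (5, 25), (35, 5), (20, 28)]
eps = effective_playing_space(frame)
```

Applying such a function to every frame for both teams would yield a bivariate time series of the kind the copula-based HMM is described as modelling jointly.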

Acknowledgements

The authors would like to thank the Associate Editor and the referees for comments that helped improve the paper. Marius Ötting gratefully acknowledges financial support from the German Research Foundation (DFG), grant number 431536450.

Author information

Corresponding author

Correspondence to Dimitris Karlis.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

About this article

Cite this article

Ötting, M., Karlis, D. Football tracking data: a copula-based hidden Markov model for classification of tactics in football. Ann Oper Res 325, 167–183 (2023). https://doi.org/10.1007/s10479-022-04660-0

  • DOI: https://doi.org/10.1007/s10479-022-04660-0
