Predicting Tennis Match Outcomes with Network Analysis and Machine Learning

  • Conference paper
SOFSEM 2021: Theory and Practice of Computer Science (SOFSEM 2021)

Abstract

Singles tennis is one of the most popular individual sports in the world. Researchers have modeled tennis matches with a wide range of approaches, including probabilistic models and machine learning models that predict match outcomes. In this paper, we propose a novel approach based on network analysis that infers a surface-specific, time-varying score for professional tennis players, and we use this score together with players’ statistics from previous matches to represent tennis match data. Using the resulting features, we apply advanced machine learning paradigms such as Multi-Output Regression and Learning Using Privileged Information, and compare the results with standard machine learning approaches. The models are trained and tested on more than 83,000 men’s singles tennis matches played between 1991 and 2020. The evaluation shows that the proposed methods predict tennis match outcomes more accurately than classical approaches and outperform both the existing methods in the literature and the current state-of-the-art models in tennis.
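
To make the idea of a surface-specific, time-varying network score concrete, the following is a minimal sketch and not the paper's actual method, which this abstract does not specify. It assumes a PageRank-style score on a directed match graph in which every match adds an edge from the loser to the winner, restricted to one surface and weighted by an exponential time decay. The tuple layout of `matches`, the function name `surface_scores`, and the `half_life` and `damping` values are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def surface_scores(matches, surface, as_of_year,
                   half_life=2.0, damping=0.85, iters=100):
    """PageRank-style score per player on one surface as of a given year.

    matches: iterable of (winner, loser, surface, year) tuples
             (a hypothetical schema, not the paper's data format).
    Each match adds a loser -> winner edge whose weight halves every
    `half_life` years, so recent results count more (an assumption).
    """
    weights = defaultdict(float)  # (loser, winner) -> decayed edge weight
    players = set()
    for winner, loser, surf, year in matches:
        if surf != surface or year > as_of_year:
            continue  # ignore other surfaces and matches from the future
        players.update((winner, loser))
        weights[(loser, winner)] += 0.5 ** ((as_of_year - year) / half_life)
    if not players:
        return {}

    out_total = defaultdict(float)  # total outgoing weight per player
    for (loser, _winner), w in weights.items():
        out_total[loser] += w

    n = len(players)
    score = {p: 1.0 / n for p in players}
    for _ in range(iters):
        # Players with no recorded losses have no outgoing edges;
        # their score is redistributed uniformly, as in standard PageRank.
        dangling = sum(score[p] for p in players if out_total[p] == 0.0)
        nxt = {p: (1.0 - damping) / n + damping * dangling / n
               for p in players}
        for (loser, winner), w in weights.items():
            nxt[winner] += damping * score[loser] * w / out_total[loser]
        score = nxt
    return score


# Toy data, invented for illustration only.
matches = [
    ("A", "B", "clay", 2019),
    ("A", "C", "clay", 2019),
    ("B", "C", "clay", 2018),
    ("C", "A", "grass", 2019),
]
clay = surface_scores(matches, "clay", as_of_year=2019)
grass = surface_scores(matches, "grass", as_of_year=2019)
```

On this toy data, player A ranks highest on clay and player C ranks highest on grass, which is the kind of surface dependence such a score is meant to capture. Recomputing the score with a later `as_of_year` gives the time-varying aspect; in a prediction setting the score would be computed only from matches played before the match being predicted.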

Notes

  1. https://www.dltm.it/

Author information

Corresponding author

Correspondence to Firas Bayram.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Bayram, F., Garbarino, D., Barla, A. (2021). Predicting Tennis Match Outcomes with Network Analysis and Machine Learning. In: Bureš, T., et al. SOFSEM 2021: Theory and Practice of Computer Science. SOFSEM 2021. Lecture Notes in Computer Science, vol 12607. Springer, Cham. https://doi.org/10.1007/978-3-030-67731-2_37

  • DOI: https://doi.org/10.1007/978-3-030-67731-2_37

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-67730-5

  • Online ISBN: 978-3-030-67731-2

  • eBook Packages: Computer Science, Computer Science (R0)
