Time Series Regression in Professional Road Cycling

Part of the Lecture Notes in Computer Science book series (LNAI, volume 12323)

Abstract

With the recent rapid developments in sensing capabilities and ubiquitous computing in road cycling, large quantities of detailed performance data are becoming available. In this paper, we demonstrate that these rich cycling data pose several non-trivial data science challenges. The primary task we focus on is a regression task: given a specific rider's results in previous races, predict the performance in a future race based solely on the characteristics of that rider and the stage profile. To make these predictions, we have developed a predictive pipeline that consists of three consecutive rider-specific models. First, we transform the distance-altitude profile into a time profile, using a climb-descent model that describes the relationship between the speed of the cyclist and the slope of the terrain. Second, we introduce an effective profile that incorporates the rider-specific physiological capabilities. Third, we predict the performance from the characteristics of the effective profile, using a model constructed from the historical records of the cyclist. To demonstrate the relevance of this work, we show that our modeling approach yields information that a professional cycling team can use when making tactical decisions.
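The first step of the pipeline, converting a distance-altitude profile into a time profile, can be illustrated with a minimal sketch. The abstract does not state the functional form of the climb-descent model, so the exponential speed-slope relation below, the function names `speed_for_slope` and `to_time_profile`, and all parameter values are assumptions chosen for illustration only, not the authors' fitted model.

```python
import math


def speed_for_slope(slope, flat_speed_kmh=40.0, sensitivity=8.0,
                    min_kmh=8.0, max_kmh=80.0):
    """Hypothetical climb-descent model (an assumption, not the paper's).

    Speed decays exponentially as the slope (rise over run) increases and
    is clamped to a plausible range for steep climbs and descents.
    """
    speed = flat_speed_kmh * math.exp(-sensitivity * slope)
    return max(min_kmh, min(max_kmh, speed))


def to_time_profile(distance_km, altitude_m, **model_params):
    """Convert a distance-altitude profile into (elapsed_hours, altitude_m) pairs."""
    if len(distance_km) != len(altitude_m) or not distance_km:
        raise ValueError("profiles must be non-empty and of equal length")
    elapsed = 0.0
    profile = [(elapsed, altitude_m[0])]
    for i in range(1, len(distance_km)):
        segment_km = distance_km[i] - distance_km[i - 1]
        if segment_km <= 0:
            raise ValueError("distances must be strictly increasing")
        # Slope as a fraction: metres climbed per metre travelled.
        slope = (altitude_m[i] - altitude_m[i - 1]) / (segment_km * 1000.0)
        # Time on the segment is its length divided by the modeled speed.
        elapsed += segment_km / speed_for_slope(slope, **model_params)
        profile.append((elapsed, altitude_m[i]))
    return profile


if __name__ == "__main__":
    # Toy stage: flat start, a 10 km climb at 6%, then a 10 km descent.
    distance = [0.0, 20.0, 30.0, 40.0]
    altitude = [100.0, 100.0, 700.0, 100.0]
    for hours, alt in to_time_profile(distance, altitude):
        print(f"{hours * 60:6.1f} min  {alt:6.0f} m")
```

In a rider-specific setting, parameters such as `flat_speed_kmh` and `sensitivity` would presumably be fitted per rider from historical race data; here they are fixed placeholders.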

Keywords

  • Temporal data mining
  • Time series regression
  • Predictive modeling

Notes

  1. The most important stage races: Tour de France, Giro d’Italia and Vuelta a España.

  2. This specific information is only available at the end of the race, which makes our analysis a post-hoc one.

Author information

Corresponding author

Correspondence to Arie-Willem de Leeuw.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

de Leeuw, AW., Heijboer, M., Hofmijster, M., van der Zwaard, S., Knobbe, A. (2020). Time Series Regression in Professional Road Cycling. In: Appice, A., Tsoumakas, G., Manolopoulos, Y., Matwin, S. (eds) Discovery Science. DS 2020. Lecture Notes in Computer Science, vol 12323. Springer, Cham. https://doi.org/10.1007/978-3-030-61527-7_45

  • DOI: https://doi.org/10.1007/978-3-030-61527-7_45

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-61526-0

  • Online ISBN: 978-3-030-61527-7

  • eBook Packages: Computer Science, Computer Science (R0)