Current State of Data and Analytics Research in Baseball

  • Injuries in Overhead Athletes (J Dines and C Camp, Section Editors)
Current Reviews in Musculoskeletal Medicine

Abstract

Purpose of Review

Baseball has become one of the most data-driven sports. In this review, we describe how big data and sabermetrics began to transform baseball, summarize current methods for data collection and analysis, and look ahead to emerging technologies.

Recent Findings

Machine learning (ML), artificial intelligence (AI), and modern motion-analysis techniques have shown promise in predicting player performance and preventing injury. Since the introduction of the Health and Injury Tracking System (HITS), numerous studies have described the epidemiology and performance implications of specific injuries. Wearable technologies allow prospective collection of kinematic data that can be used to improve pitching mechanics and prevent injury.

Summary

Data and analytics research has transcended baseball over time, and the future of this field remains bright.

Author information

Correspondence to Peter Chalmers.

Ethics declarations

Conflict of Interest

The authors declare no competing interests.

Human and Animal Rights and Informed Consent

All reported studies/experiments with human or animal subjects performed by the authors have been previously published and complied with all applicable ethical standards (including the Helsinki declaration and its amendments, institutional/national research committee standards, and international/national/institutional guidelines).

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the Topical Collection on Injuries in Overhead Athletes

About this article

Cite this article

Mizels, J., Erickson, B. & Chalmers, P. Current State of Data and Analytics Research in Baseball. Curr Rev Musculoskelet Med 15, 283–290 (2022). https://doi.org/10.1007/s12178-022-09763-6

  • DOI: https://doi.org/10.1007/s12178-022-09763-6