Baseball Pitch Type Recognition Based on Broadcast Videos

Chen, Reed; Siegler, Dylan; Fasko, Michael; Yang, Shunkun; Luo, Xiong; Zhao, Wenbing

doi:10.1007/978-981-15-1925-3_24

Conference paper
First Online: 06 December 2019

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1138)

Abstract

In this paper, we report our work on baseball pitch type recognition from broadcast videos using a two-stream inflated 3D convolutional neural network (I3D). To improve on the state of the art, we developed our own high-quality dataset and trained and tuned the I3D model extensively, focusing on combating overfitting while still improving final validation accuracy. We achieve an accuracy of 53.43% ± 3.04% with oversampling and 57.10% ± 2.99% without oversampling, a significant improvement over the best published accuracy of 36.4% on the same six pitch type classes.
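The abstract compares results with and without oversampling but does not say how the oversampling was done. As a rough illustration only, the sketch below shows one common form, random oversampling, in which minority-class examples are duplicated until every class matches the largest one. The function name `oversample`, the clip identifiers, and the pitch labels are all hypothetical and are not taken from the paper.

```python
import random
from collections import Counter, defaultdict


def oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until all classes
    have as many samples as the largest class."""
    rng = random.Random(seed)

    # Group samples by their class label.
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)

    # Every class is grown to the size of the largest class.
    target = max(len(items) for items in by_class.values())

    out_samples, out_labels = [], []
    for label, items in by_class.items():
        # Draw the missing samples with replacement from the same class.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for sample in items + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Hypothetical, imbalanced toy data: 10 clips across 3 made-up classes.
clips = ["clip_%02d" % i for i in range(10)]
pitches = ["fastball"] * 6 + ["slider"] * 3 + ["curveball"] * 1

balanced_clips, balanced_pitches = oversample(clips, pitches)
print(Counter(balanced_pitches))
```

In a setup like this, oversampling would normally be applied to the training split only, so that duplicated clips do not leak into the validation data used to report accuracy.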

Acknowledgment

This work is partially supported by the Undergraduate Summer Research Award program at Cleveland State University.

Author information

Authors and Affiliations

Duke University, Durham, NC, 27708, USA
Reed Chen
Georgia Institute of Technology, North Avenue, Atlanta, GA, 30332, USA
Dylan Siegler
Department of Electrical Engineering and Computer Science, Cleveland State University, Cleveland, OH, 44115, USA
Michael Fasko Jr. & Wenbing Zhao
School of Reliability and Systems Engineering, Beihang University, 37 Xueyuan Road, Beijing, 100191, China
Shunkun Yang
School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, 100083, China
Xiong Luo

Corresponding author

Correspondence to Wenbing Zhao.

Editor information

Editors and Affiliations

University of Science and Technology, Beijing, China
Huansheng Ning

About this paper

Cite this paper

Chen, R., Siegler, D., Fasko, M., Yang, S., Luo, X., Zhao, W. (2019). Baseball Pitch Type Recognition Based on Broadcast Videos. In: Ning, H. (ed.) Cyberspace Data and Intelligence, and Cyber-Living, Syndrome, and Health. CyberDI 2019, CyberLife 2019. Communications in Computer and Information Science, vol 1138. Springer, Singapore. https://doi.org/10.1007/978-981-15-1925-3_24

DOI: https://doi.org/10.1007/978-981-15-1925-3_24
Published: 06 December 2019
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-1924-6
Online ISBN: 978-981-15-1925-3
eBook Packages: Computer Science, Computer Science (R0)
