Spatiotemporal based table tennis stroke-type assessment

Aktas, Kadir; Demirel, Mehmet; Moor, Marilin; Olesk, Johanna; Ozcinar, Cagri; Anbarjafari, Gholamreza

doi:10.1007/s11760-021-01893-7

Original Paper
Published: 30 March 2021

Volume 15, pages 1593–1600, (2021)

Signal, Image and Video Processing

Kadir Aktas¹,⁴,
Mehmet Demirel²,
Marilin Moor¹,
Johanna Olesk¹,
Cagri Ozcinar¹ &
…
Gholamreza Anbarjafari¹,³,⁴ (ORCID: orcid.org/0000-0001-8460-5717)

Abstract

This paper presents a multistage deep neural network pipeline for sports action recognition. The pipeline classifies table tennis stroke types using spatiotemporal features. At each stage, the network predicts a different aspect of the final class, and the outcomes of the stages are then fused to obtain the final prediction. We use four different methods in each stage, namely RGB image-based, optical flow-based, pose-based, and region-of-interest-based methods. We conducted our experiments on the TTSTROKE-21 dataset, which was introduced in the MediaEval Challenge 2020. Experimental results show that the proposed methodology achieves 90.7% test accuracy when the RGB image-based and optical flow-based methods are combined.
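The abstract does not say how the stage outcomes are fused. As a rough illustration only, the sketch below assumes late fusion by a weighted average of per-method class probabilities. The function names, the equal default weights, and the three-class probability values are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Optional, Sequence


def fuse_predictions(
    stream_probs: Dict[str, Sequence[float]],
    weights: Optional[Dict[str, float]] = None,
) -> List[float]:
    """Fuse per-method class probabilities by a weighted average.

    Assumption: averaging is used here for illustration; the paper's
    actual fusion rule is not stated in the abstract.
    """
    if not stream_probs:
        raise ValueError("at least one method is required")
    names = list(stream_probs)
    n_classes = len(stream_probs[names[0]])
    if any(len(stream_probs[n]) != n_classes for n in names):
        raise ValueError("all methods must score the same number of classes")
    # Equal weights by default (hypothetical choice).
    w = {n: 1.0 if weights is None else weights[n] for n in names}
    total = sum(w.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return [
        sum(w[n] * stream_probs[n][c] for n in names) / total
        for c in range(n_classes)
    ]


def argmax(values: Sequence[float]) -> int:
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


# Hypothetical probabilities over three stroke classes from two methods.
fused = fuse_predictions({
    "rgb": [0.6, 0.3, 0.1],
    "optical_flow": [0.2, 0.7, 0.1],
})
predicted_class = argmax(fused)
```

In this toy case the optical flow scores outweigh the RGB scores for the second class, so the fused prediction differs from what the RGB method alone would choose.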

Author information

Authors and Affiliations

iCV Research Lab, Institute of Technology, University of Tartu, 50411, Tartu, Estonia
Kadir Aktas, Marilin Moor, Johanna Olesk, Cagri Ozcinar & Gholamreza Anbarjafari
University of Manchester, Manchester, UK
Mehmet Demirel
PwC Advisory, Helsinki, Finland
Gholamreza Anbarjafari
iVCV, Tartu, 51011, Estonia
Kadir Aktas & Gholamreza Anbarjafari

Corresponding author

Correspondence to Gholamreza Anbarjafari.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work has been partially supported by the Estonian Centre of Excellence in IT (EXCITE) funded by the European Regional Development Fund. The authors also gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan X Pascal GPU.

About this article

Cite this article

Aktas, K., Demirel, M., Moor, M. et al. Spatiotemporal based table tennis stroke-type assessment. SIViP 15, 1593–1600 (2021). https://doi.org/10.1007/s11760-021-01893-7

Received: 29 December 2020
Revised: 10 March 2021
Accepted: 15 March 2021
Published: 30 March 2021
Issue Date: October 2021
DOI: https://doi.org/10.1007/s11760-021-01893-7
