Spatiotemporal based table tennis stroke-type assessment

  • Original Paper
  • Signal, Image and Video Processing

Abstract

This paper presents a multistage deep neural network pipeline for sports action recognition, applied to the classification of table tennis stroke types using spatiotemporal features. Each stage of the network predicts the stroke class from a different aspect of the input, and the outcomes of the stages are then fused to obtain the final prediction. The stages draw on four methods: RGB image-based, optical flow-based, pose-based, and region-of-interest-based. We conducted our experiments on the TTSTROKE-21 dataset, which was introduced in the MediaEval Challenge 2020. Experimental results show that the proposed methodology obtains 90.7% test accuracy when the RGB image-based and optical flow-based methods are combined.
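
The fusion step described above can be illustrated with a minimal sketch. The abstract does not state which fusion rule is used, so the weighted averaging of per-stage class probabilities below is an assumption, not the authors' method. The names `fuse_predictions`, `predict_class`, and `stage_outputs`, as well as the three-class probabilities, are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional, Sequence


def fuse_predictions(
    stage_probs: Dict[str, Sequence[float]],
    weights: Optional[Dict[str, float]] = None,
) -> List[float]:
    """Weighted average of per-stage class probabilities (assumed fusion rule)."""
    if not stage_probs:
        raise ValueError("at least one stage is required")
    if weights is None:
        # Default: every stage contributes equally.
        weights = {name: 1.0 for name in stage_probs}
    num_classes = len(next(iter(stage_probs.values())))
    total = sum(weights[name] for name in stage_probs)
    fused = [0.0] * num_classes
    for name, probs in stage_probs.items():
        if len(probs) != num_classes:
            raise ValueError(f"stage {name!r} has a different number of classes")
        for i, p in enumerate(probs):
            fused[i] += weights[name] * p / total
    return fused


def predict_class(fused: Sequence[float]) -> int:
    """Index of the class with the highest fused score."""
    return max(range(len(fused)), key=lambda i: fused[i])


# Hypothetical outputs of two stages over three stroke classes.
stage_outputs = {
    "rgb": [0.6, 0.3, 0.1],
    "optical_flow": [0.2, 0.7, 0.1],
}
fused = fuse_predictions(stage_outputs)
print(predict_class(fused))
```

In this toy case the two stages disagree, and the fused scores favor the class that the optical flow stage is more confident about. Other fusion rules, such as majority voting or a learned combiner, would fit the same interface.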

Author information

Corresponding author

Correspondence to Gholamreza Anbarjafari.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work has been partially supported by the Estonian Centre of Excellence in IT (EXCITE) funded by the European Regional Development Fund. The authors also gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan X Pascal GPU.

About this article

Cite this article

Aktas, K., Demirel, M., Moor, M. et al. Spatiotemporal based table tennis stroke-type assessment. SIViP 15, 1593–1600 (2021). https://doi.org/10.1007/s11760-021-01893-7

  • DOI: https://doi.org/10.1007/s11760-021-01893-7
