Abstract
The ability to recognize human actions in uncontrolled environments is important for Human–Computer Interaction (HCI), particularly in sports, where it offers athletes, coaches, and analysts valuable insight into movement techniques and helps referees make well-informed decisions about sports movements. Notably, recognizing human actions in basketball remains difficult because of intricate backgrounds, occluded actions, and inconsistent lighting. Accordingly, this paper proposes a method that combines YOLO with a deep fuzzy LSTM network: YOLO detects the players in each frame, and the combination of an LSTM and a fuzzy layer performs the final classification. Fuzzy logic is paired with the LSTM to compensate for the LSTM's inability to cope with uncertainty, yielding a more transparent, interpretable, and accurate predictive system. The proposed model was validated on the SpaceJam and Basketball-51 datasets. The empirical results show that it outperformed all baseline models on both datasets, confirming the advantage of our combined model for basketball action recognition.
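The pipeline described above (detection followed by a fuzzy-augmented recurrent classifier) can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the dimensions, the Gaussian membership functions, and the random per-frame features (standing in for YOLO player crops) are all assumptions made for the example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_forward(x_seq, W, U, b, hidden):
    """Run a single-layer LSTM over a sequence of per-frame feature vectors."""
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    for x in x_seq:
        z = W @ x + U @ h + b                 # all four gate pre-activations stacked
        i = sigmoid(z[:hidden])               # input gate
        f = sigmoid(z[hidden:2 * hidden])     # forget gate
        o = sigmoid(z[2 * hidden:3 * hidden]) # output gate
        g = np.tanh(z[3 * hidden:])           # candidate cell state
        c = f * c + i * g
        h = o * np.tanh(c)
    return h

def fuzzy_layer(h, centers, widths):
    """Gaussian membership degree of the hidden state for each fuzzy rule."""
    return np.exp(-((h[None, :] - centers) ** 2) / (2 * widths ** 2)).mean(axis=1)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Toy dimensions (hypothetical): 16-d frame features, 8 hidden units,
# 5 fuzzy rules, 4 action classes, 10 frames.
feat, hidden, n_fuzzy, n_classes, T = 16, 8, 5, 4, 10
rng = np.random.default_rng(0)
W = rng.normal(0, 0.1, (4 * hidden, feat))
U = rng.normal(0, 0.1, (4 * hidden, hidden))
b = np.zeros(4 * hidden)
centers = rng.normal(0, 1, (n_fuzzy, hidden))
widths = np.full((n_fuzzy, hidden), 1.0)
W_out = rng.normal(0, 0.1, (n_classes, n_fuzzy))

# In the full pipeline, each frame's features would be extracted from the
# player regions that YOLO detects; here they are random placeholders.
x_seq = rng.normal(0, 1, (T, feat))
h = lstm_forward(x_seq, W, U, b, hidden)
memberships = fuzzy_layer(h, centers, widths)
probs = softmax(W_out @ memberships)
print(probs)
```

The fuzzy layer maps the crisp LSTM state to graded rule memberships, which is what gives the classifier an interpretable, uncertainty-aware intermediate representation before the final softmax.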
Funding
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
Author information
Contributions
SB and MRY conceived of the presented idea. SB developed the theory and performed the computations. MRY conceived the study and was in charge of overall direction and planning. SK verified the analytical methods and obtained results. All authors discussed the results and contributed to the final manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Ethical Approval
This study involved no human or animal subjects; ethical approval is not applicable.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Khobdeh, S.B., Yamaghani, M.R. & Sareshkeh, S.K. Basketball action recognition based on the combination of YOLO and a deep fuzzy LSTM network. J Supercomput 80, 3528–3553 (2024). https://doi.org/10.1007/s11227-023-05611-7