Improving action quality assessment with across-staged temporal reasoning on imbalanced data

Lian, Pu-Xiang; Shao, Zhi-Gang

doi:10.1007/s10489-023-05166-3

Published: 18 November 2023

Volume 53, pages 30443–30454, (2023)

Applied Intelligence

Abstract

Action quality assessment is a significant research domain in computer vision, aimed at evaluating the accuracy of human movement and providing feedback and guidance for training and rehabilitation. However, the approaches generally used in this field do not account for data imbalance, which significantly affects labels with fewer samples. To address this issue, we propose using kernel density estimation (KDE) to recalculate the label density and weighting the loss function by the reciprocal of the square root of each label density. Additionally, we divide the entire motion into three sub-stages (for diving: takeoff, aerial movement, and entry) and connect the three stages using an across-staged temporal reasoning module (ASTRM). Our approach achieves a Spearman correlation coefficient (\(\rho\)) of 0.9222 and a relative \(\ell_2\)-distance (R-\(\ell_2\), \(\times 100\)) of 0.3304 on the FineDiving dataset, which is competitive with other methods. Furthermore, comprehensive ablation experiments validate the effectiveness of the methods and modules we adopted.
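The loss re-weighting described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of SciPy's Gaussian KDE with its default bandwidth, the normalisation of the weights to mean 1, the weighted squared-error loss, and the toy scores are all assumptions made for illustration.

```python
import numpy as np
from scipy.stats import gaussian_kde


def inverse_sqrt_density_weights(labels, bw_method=None):
    """Per-sample weights proportional to 1 / sqrt(KDE density of the label).

    Hypothetical helper: the bandwidth choice and the normalisation to
    mean 1 are assumptions, not details taken from the paper.
    """
    labels = np.asarray(labels, dtype=float)
    kde = gaussian_kde(labels, bw_method=bw_method)  # smooth label density
    density = kde(labels)                            # density at each label
    weights = 1.0 / np.sqrt(density)                 # rare labels get more weight
    return weights / weights.mean()                  # keep the loss scale stable


def weighted_mse(pred, target, weights):
    """Squared error averaged with per-sample weights (assumed loss form)."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.mean(weights * (pred - target) ** 2))


# Toy, made-up scores: 50 common scores near 70 plus three rare ones.
scores = np.concatenate([70.0 + np.linspace(-5.0, 5.0, 50), [20.0, 25.0, 95.0]])
weights = inverse_sqrt_density_weights(scores)
```

In this toy setting the three rare scores receive larger weights than any of the common scores, so their errors contribute more to the weighted loss.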

Data Availability

FineDiving dataset can be downloaded upon request at https://github.com/xujinglin/FineDiving.

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant No. 52072132).

Author information

Authors and Affiliations

School of Electronics Information Engineering, South China Normal University, Foshan, 528225, China
Pu-Xiang Lian
School of Physics and Telecommunication Engineering, South China Normal University, Guangzhou, 510006, China
Zhi-Gang Shao
Guangdong Provincial Key Laboratory of Quantum Engineering and Quantum Materials, South China Normal University, Guangzhou, 510006, China
Zhi-Gang Shao
Guangdong-Hong Kong Joint Laboratory of Quantum Matter, Frontier Research Institute for Physics, South China Normal University, Guangzhou, 510006, China
Zhi-Gang Shao

Contributions

Pu-Xiang Lian: Conceptualization, Methodology, Validation, Formal analysis, Writing-original draft, Writing-review & editing, Visualization. Zhi-Gang Shao: Methodology, Formal analysis, Validation, Writing-review & editing.

Corresponding author

Correspondence to Zhi-Gang Shao.

Ethics declarations

Competing interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Lian, PX., Shao, ZG. Improving action quality assessment with across-staged temporal reasoning on imbalanced data. Appl Intell 53, 30443–30454 (2023). https://doi.org/10.1007/s10489-023-05166-3

Accepted: 06 November 2023
Published: 18 November 2023
Issue Date: December 2023
DOI: https://doi.org/10.1007/s10489-023-05166-3
