
Judging the Normativity of PAF Based on TFN and NAN

Journal of Shanghai Jiaotong University (Science)

Abstract

The normativity of workers’ actions during production strongly affects both product quality and the safety of the operation process. Previous studies mainly judged the normativity of each single producing action rather than that of the continuous producing actions performed over a complete operation process, which this paper defines as the producing action flow (PAF). To address this issue, a normativity judging method based on a two-LSTM fusion network (TFN) and a normativity-aware attention network (NAN) is proposed. First, TFN detects and recognizes producing actions from the skeleton sequences of a worker over the complete operation process, yielding PAF data in sequential form. Then, NAN allocates a different level of attention to each producing action within the PAF sequence, enabling efficient normativity judging. The combustor surface cleaning (CSC) process of a rocket engine is taken as the experimental case, and the CSC-Action2D dataset is established for evaluation. Experimental results show the high performance of TFN and NAN, demonstrating the effectiveness of the proposed method for PAF normativity judging.
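The attention step in NAN, which weights each recognized producing action before the sequence-level judgment, can be illustrated with a minimal sketch. The paper's implementation is not published here, so the shapes, the function name `attention_pool`, and the additive-attention form below (in the style of hierarchical attention networks) are assumptions for illustration only:

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention_pool(action_feats, W, b, u):
    """Weight each producing-action feature and pool the PAF sequence.

    action_feats: (T, d) array, one feature vector per recognized action.
    W: (d, k), b: (k,), u: (k,) play the role of learnable attention parameters.
    Returns the attention weights (T,) and the pooled representation (d,).
    """
    h = np.tanh(action_feats @ W + b)   # (T, k) hidden attention space
    scores = h @ u                      # (T,) one relevance score per action
    alpha = softmax(scores)             # normalized attention over the actions
    context = alpha @ action_feats      # (d,) attention-weighted sequence summary
    return alpha, context

# Toy PAF of 5 recognized actions with 8-dimensional features
rng = np.random.default_rng(0)
feats = rng.normal(size=(5, 8))
W, b, u = rng.normal(size=(8, 4)), np.zeros(4), rng.normal(size=4)
alpha, context = attention_pool(feats, W, b, u)
```

The pooled `context` vector would then feed a classifier that outputs the normativity judgment for the whole PAF; actions that matter more for normativity receive larger `alpha` weights.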



Author information


Corresponding author

Correspondence to Jinsong Bao  (鲍劲松).

Additional information

Foundation item: the National Natural Science Foundation of China (No. 51475301)

About this article

Cite this article

Li, Z., Bao, J., Liu, T. et al. Judging the Normativity of PAF Based on TFN and NAN. J. Shanghai Jiaotong Univ. (Sci.) 25, 569–577 (2020). https://doi.org/10.1007/s12204-020-2177-0

