Dual-branch deep learning architecture enabling miner behavior recognition

Published in: Multimedia Tools and Applications

Abstract

Nonstandard miner behavior can adversely affect safe coal mine production, so accurately capturing miner behavior in complex environments is particularly important. In intelligent mine monitoring systems, detecting miner behavior through visual perception is challenging due to high inter-class behavioral similarity and complex temporal relationships. This paper proposes a new deep learning framework that builds a coal miner behavior recognition model from a spatio-temporal dual-branch structure and a transposed attention representation mechanism. The spatio-temporal dual-branch structure extracts rich spatial semantic information from video sequences captured by intrinsically safe video sensors while effectively capturing rapidly changing human motion. To better discriminate between similar miner behaviors, a merged transposed weighted representation (TWR) mechanism is introduced to guide the model toward feature information more strongly related to the classification target, thereby improving its ability to separate highly similar behaviors. Experiments on UCF101, HMDB51, and a self-built miner behavior dataset show significant improvements over other state-of-the-art methods. This collaborative structure yields a more discriminative behavior detection model and contributes to the reliability of miner behavior detection.
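The abstract does not give the exact layer definitions, but the two core ideas it names, a dual branch that separates spatial appearance from temporal motion, and "transposed" attention that attends over channels rather than spatial tokens, can be illustrated with a minimal NumPy sketch. Everything below (the additive fusion, the frame-difference motion branch, the shared Q/K/V) is an illustrative assumption, not the paper's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def transposed_attention(x):
    """Channel-wise ("transposed") attention: the attention map is
    C x C over channels instead of N x N over spatial tokens, so its
    cost grows linearly with the number of positions N.
    x: (C, N) feature matrix."""
    q, k, v = x, x, x  # separate learned projections omitted for brevity
    # L2-normalise each channel before the channel-channel product
    qn = q / (np.linalg.norm(q, axis=1, keepdims=True) + 1e-6)
    kn = k / (np.linalg.norm(k, axis=1, keepdims=True) + 1e-6)
    attn = qn @ kn.T                                   # (C, C) channel affinity
    attn = np.exp(attn) / np.exp(attn).sum(axis=1, keepdims=True)  # row softmax
    return attn @ v                                    # re-weighted channels, (C, N)

def dual_branch_forward(clip):
    """clip: (T, C, N) video features (T frames, C channels, N positions)."""
    spatial = clip.mean(axis=0)                        # appearance: frame average
    temporal = np.abs(np.diff(clip, axis=0)).mean(axis=0)  # motion: frame deltas
    fused = spatial + temporal                         # simple additive fusion
    return transposed_attention(fused)

clip = rng.standard_normal((8, 16, 32))  # 8 frames, 16 channels, 32 positions
out = dual_branch_forward(clip)
print(out.shape)  # (16, 32)
```

The key property the sketch demonstrates is the transposed attention map's shape: it is C x C regardless of spatial resolution, which is what makes this style of attention cheap on high-resolution video features.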


Data availability

Data sharing not applicable to this article as no datasets were generated during the current study.


Acknowledgements

This study was supported by the National Natural Science Foundation of China (51804249), the Shaanxi Province Qin Chuang yuan "Scientists + Engineers" Team Construction program (2022KXJ-38), and the Natural Science Basic Research Program of Shaanxi (Grant No. 2021JQ-574).

Author information


Corresponding author

Correspondence to Zheng Wang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Wang, Z., Liu, Y., Yang, Y. et al. Dual-branch deep learning architecture enabling miner behavior recognition. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-024-19164-1

