Heterogeneous multi-task smoking behavior recognition model combined with attention

  • S.I.: Evolutionary Computation based Methods and Applications for Data Processing
  • Neural Computing and Applications

Abstract

Traditional behavior recognition models cannot capture the internal relationship between similar behaviors (such as smoking, holding a pen, or propping the chin) and the objects held between the fingers, which limits the practical deployment of fine-grained, complex behavior recognition such as smoking recognition. To address this problem, this paper proposes the heterogeneous algorithm HMMA-NET (Heterogeneous Multi-task smoking behavior recognition Model combined with Attention), which consists of two modules, behavior prior and local detection, and aims to establish the relationship between a behavior and its associated objects. Both modules use a CNN combined with a channel attention mechanism. The behavior prior module uses sign language semantic features to form a primary prior of the behavior from the obtained behavior affinity vector field, while the local detection module applies network optimizations such as fast Edgebox to obtain candidate regions, so as to transfer component information and achieve fast fine-grained detection. Finally, the two modules use SaaS mode to complete association recognition. Experiments show that the algorithm recognizes complex actions effectively, with accuracy equal to or better than that of a single model: on smoking behavior scenes, the detection accuracy is 96.10% and the false detection rate is 3.6%. The algorithm has been commercialized and applied to real-world monitoring of petrochemical scenes, where the running results show that it maintains good real-time performance and generalization ability.
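
The abstract does not say which channel attention design the two modules use. As a rough illustration only, the sketch below assumes a squeeze-and-excitation style gate: each channel is averaged to a single value, passed through a small bottleneck, and the resulting per-channel weight rescales the feature map. The function name `channel_attention`, the weight matrices `w1` and `w2`, and all sizes are hypothetical and are not taken from the paper.

```python
import math
import random


def channel_attention(feature_map, w1, w2):
    """Reweight the channels of a C x H x W feature map (nested lists).

    Assumed squeeze-and-excitation style gate, not the paper's exact design.
    w1: hidden x C weights (squeeze to a bottleneck)
    w2: C x hidden weights (expand back to one gate per channel)
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation: bottleneck layer with ReLU, then a sigmoid gate per channel.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates


# Hypothetical toy input: 4 channels of 3 x 3 activations, bottleneck of 2 units.
random.seed(0)
channels, height, width, hidden_units = 4, 3, 3, 2
features = [[[random.random() for _ in range(width)] for _ in range(height)]
            for _ in range(channels)]
w1 = [[random.uniform(-1, 1) for _ in range(channels)] for _ in range(hidden_units)]
w2 = [[random.uniform(-1, 1) for _ in range(hidden_units)] for _ in range(channels)]

scaled, gates = channel_attention(features, w1, w2)
print([round(g, 3) for g in gates])  # one gate in (0, 1) per channel
```

In a trained network the weights would be learned rather than random, so that channels carrying cues relevant to the task receive gates closer to 1 and uninformative channels are suppressed.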

Data availability statement

The data for this paper can be obtained by emailing the authors.

Funding

This research was supported by the Beijing Municipal Natural Science Foundation [4202028]; General Project of the National Language Committee [YB145-25]; National Natural Science Foundation of China [62036001]; Support Plan for Beijing Municipal University Faculty Construction—High-Level Scientific Research and Innovation Team Project [BPHR20220121]; Premium Funding Project for Academic Human Resources Development in Beijing Union University [BPHR2019CZ05]; Jiangsu Province Key R&D Program (Industry Prospects and Key Core Technologies) [BE2020047]; and the characteristic-disciplines oriented research project in Beijing Union University [KYDE40201702].

Author information

Corresponding author

Correspondence to Dengfeng Yao.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest regarding the publication of this work.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Qiu, X., Kang, X., Zhang, Y. et al. Heterogeneous multi-task smoking behavior recognition model combined with attention. Neural Comput & Applic 35, 25175–25187 (2023). https://doi.org/10.1007/s00521-023-08616-8

  • DOI: https://doi.org/10.1007/s00521-023-08616-8
