
STCNet: spatiotemporal cross network for industrial smoke detection


Abstract

Industrial smoke emissions pose a serious threat to natural ecosystems and human health. Prior work has shown that identifying smoke with computer vision techniques is a low-cost and convenient approach. However, detecting translucent smoke remains challenging because of its irregular contours and complex motion. To overcome these problems, we propose a novel spatiotemporal cross network (STCNet) to recognize industrial smoke emissions. STCNet comprises a spatial pathway that extracts appearance features and a temporal pathway that captures smoke motion information, making it well suited to translucent, nonrigid smoke objects. The spatial path readily recognizes obvious nonsmoke objects such as trees and buildings, while the temporal path highlights the faint motion traces of smoke. STCNet achieves mutual guidance between multilevel spatiotemporal features through bidirectional fusion of the multilevel feature maps. Extensive experiments on public datasets show that STCNet improves on the best competitors by 6.2%. We also perform in-depth ablation studies to explore the impact of different feature fusion methods on the entire model. The code will be available at https://github.com/Caoyichao/STCNet.
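To make the two-pathway design concrete, the sketch below shows a minimal PyTorch network with a spatial pathway, a temporal pathway, and bidirectional fusion of multilevel feature maps. It illustrates the general idea only: the backbone depth, channel widths, the frame-difference temporal input, and the 1x1-convolution cross fusion are assumptions made for this sketch, not the authors' STCNet implementation (see the repository linked above for their code).

```python
# Hypothetical sketch of a two-pathway network with bidirectional multilevel
# feature fusion, in the spirit of STCNet. All layer sizes and the fusion
# details are illustrative assumptions, not the published architecture.
import torch
import torch.nn as nn


def conv_block(in_ch, out_ch, stride=2):
    """3x3 conv -> BN -> ReLU, downsampling by `stride`."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )


class CrossFusion(nn.Module):
    """Bidirectional fusion: each pathway receives a 1x1-projected copy
    of the other pathway's feature map at the same level."""
    def __init__(self, ch):
        super().__init__()
        self.s2t = nn.Conv2d(ch, ch, 1)
        self.t2s = nn.Conv2d(ch, ch, 1)

    def forward(self, spat, temp):
        return spat + self.t2s(temp), temp + self.s2t(spat)


class TwoPathwaySmokeNet(nn.Module):
    def __init__(self, num_diff_frames=5, channels=(32, 64, 128)):
        super().__init__()
        chs = [3] + list(channels)                 # spatial input: RGB keyframe
        cht = [num_diff_frames] + list(channels)   # temporal input: stacked frame differences
        self.spatial = nn.ModuleList(conv_block(chs[i], chs[i + 1]) for i in range(3))
        self.temporal = nn.ModuleList(conv_block(cht[i], cht[i + 1]) for i in range(3))
        self.fuse = nn.ModuleList(CrossFusion(c) for c in channels)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.head = nn.Linear(2 * channels[-1], 2)  # smoke / no smoke

    def forward(self, frame, diffs):
        s, t = frame, diffs
        for spat_blk, temp_blk, fuse in zip(self.spatial, self.temporal, self.fuse):
            s, t = spat_blk(s), temp_blk(t)
            s, t = fuse(s, t)                      # mutual guidance at every level
        feat = torch.cat([self.pool(s).flatten(1), self.pool(t).flatten(1)], dim=1)
        return self.head(feat)


# Example: one 224x224 keyframe plus 5 stacked frame differences.
model = TwoPathwaySmokeNet()
logits = model(torch.randn(1, 3, 224, 224), torch.randn(1, 5, 224, 224))
print(logits.shape)  # torch.Size([1, 2])
```

Here the temporal input is assumed to be a stack of frame differences; the published network may use a different motion representation, but the cross-fusion pattern — exchanging projected features between the two pathways at every level — is the point of the sketch.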




Acknowledgements

This work was supported by the National Natural Science Foundation of China (No. 61871123), the Key Research and Development Program in Jiangsu Province (No. BE2016739), and the Priority Academic Program Development of Jiangsu Higher Education Institutions. We thank the Big Data Center of Southeast University for providing facility support for the numerical calculations in this paper.


Corresponding author

Correspondence to Xiaobo Lu.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Cite this article

Cao, Y., Tang, Q. & Lu, X. STCNet: spatiotemporal cross network for industrial smoke detection. Multimed Tools Appl 81, 10261–10277 (2022). https://doi.org/10.1007/s11042-021-11766-3

