Action temporal detection method based on confidence curve analysis

Song, Hanjian; Tian, Lihua; Li, Chen

doi:10.1007/s11042-020-08771-3

Published: 12 March 2020

Volume 79, pages 34471–34488, (2020)
Multimedia Tools and Applications

Abstract

Action temporal detection is a task derived from action recognition that requires predicting the temporal intervals and categories of actions in untrimmed videos. To address the excessive number of proposal segments and the insufficient filtering in multi-stage networks, we propose an action temporal detection method that uses confidence curve analysis to generate proposal segments. Candidate segments are generated by sliding a fixed-step window over the video, and we adjust the training mode of the segment network. Proposal segments are generated by analyzing the confidence curve of the candidate segments; the proposal segments are then fed into the localization network, which classifies them and adjusts their confidence levels. Extensive experiments on the THUMOS2014 benchmark show that the proposed method performs significantly better than the original multi-stage convolutional network: mAP increases from 19.0% to 26.4%, with a 252% acceleration.
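
The abstract does not spell out how the confidence curve is analyzed, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes that each fixed-step window already has a scalar action confidence from some segment network, and it uses a simple rule (merge consecutive windows whose confidence stays at or above a threshold) as a stand-in for the curve analysis. The function names, the threshold, and the window and step sizes are all hypothetical.

```python
from typing import List, Tuple

Window = Tuple[int, int]            # (start_frame, end_frame), end exclusive
Proposal = Tuple[int, int, float]   # (start_frame, end_frame, mean confidence)


def sliding_windows(num_frames: int, window: int, step: int) -> List[Window]:
    """Generate fixed-step candidate segments that fit inside the video."""
    return [(s, s + window) for s in range(0, num_frames - window + 1, step)]


def proposals_from_curve(
    windows: List[Window], scores: List[float], threshold: float = 0.5
) -> List[Proposal]:
    """Merge runs of consecutive windows whose confidence is >= threshold.

    This thresholding rule is an illustrative assumption; the paper's actual
    confidence curve analysis may differ.
    """
    proposals: List[Proposal] = []
    run_start, run_end, run_scores = None, None, []
    for (start, end), score in zip(windows, scores):
        if score >= threshold:
            if run_start is None:
                run_start = start           # open a new run
            run_end = end                   # extend the current run
            run_scores.append(score)
        elif run_start is not None:
            # The curve dropped below the threshold: close the run.
            proposals.append((run_start, run_end, sum(run_scores) / len(run_scores)))
            run_start, run_end, run_scores = None, None, []
    if run_start is not None:
        # Close a run that reaches the end of the video.
        proposals.append((run_start, run_end, sum(run_scores) / len(run_scores)))
    return proposals


if __name__ == "__main__":
    wins = sliding_windows(num_frames=64, window=16, step=8)
    # Hypothetical per-window confidences, one per candidate segment.
    conf = [0.1, 0.7, 0.9, 0.8, 0.2, 0.6, 0.1]
    for start, end, mean_conf in proposals_from_curve(wins, conf):
        print(f"proposal frames [{start}, {end}) mean confidence {mean_conf:.2f}")
```

In this sketch, merging adjacent high-confidence windows yields far fewer proposals than passing every candidate window to the next stage, which is the kind of reduction the abstract is aiming for.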

Acknowledgments

This work is supported by the National Natural Science Foundation of China under Grant No. 61901356 and the HPC Platform of Xi’an Jiaotong University.

Author information

Authors and Affiliations

School of Software Engineering, Xi’an Jiaotong University, Xi’an, 710049, China
Hanjian Song, Lihua Tian & Chen Li

Corresponding author

Correspondence to Lihua Tian.

About this article

Cite this article

Song, H., Tian, L. & Li, C. Action temporal detection method based on confidence curve analysis. Multimed Tools Appl 79, 34471–34488 (2020). https://doi.org/10.1007/s11042-020-08771-3

Received: 04 June 2019
Revised: 02 January 2020
Accepted: 17 February 2020
Published: 12 March 2020
Issue Date: December 2020
DOI: https://doi.org/10.1007/s11042-020-08771-3