Abstract
Video summarization aims to select the most valuable clips so that videos can be browsed efficiently. Previous approaches typically focus on aggregating temporal features while overlooking the role that visual representations themselves can play in summarization. In this paper, we present a global difference-aware network (GDANet) that exploits frame-level and video-level feature differences as guidance to enhance visual features. First, a difference optimization module (DOM) is devised to enhance the discriminability of visual features, which in turn improves the accuracy of temporal-cue aggregation. Second, a dual-scale attention module (DSAM) is introduced to capture informative contextual information. Finally, we design an adaptive feature fusion module (AFFM) that lets the network adaptively learn context representations and fuse features effectively. Experiments on benchmark datasets demonstrate the effectiveness of the proposed framework.
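The core idea behind the DOM — gating each frame feature by how much it deviates from the global (video-level) feature — can be illustrated with a minimal NumPy sketch. The function name, the sigmoid gating form, and the gain parameter `alpha` are illustrative assumptions for exposition, not the paper's actual implementation.

```python
import numpy as np

def difference_guided_enhancement(frames, alpha=0.5):
    """Enhance per-frame features using their difference from the
    video-level mean feature (a loose sketch of the DOM idea).
    `alpha` is a hypothetical gain, not a parameter from the paper."""
    global_feat = frames.mean(axis=0, keepdims=True)   # video-level context
    diff = frames - global_feat                        # frame-vs-video difference
    # Sigmoid of the difference magnitude acts as a per-frame gate:
    # frames far from the global context are amplified more.
    gate = 1.0 / (1.0 + np.exp(-np.linalg.norm(diff, axis=1, keepdims=True)))
    return frames + alpha * gate * diff                # difference-gated residual

rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 16))                       # 8 frames, 16-dim features
enhanced = difference_guided_enhancement(feats)
print(enhanced.shape)                                  # (8, 16)
```

The residual form keeps the original feature intact while the gated difference term sharpens frames that stand out from the video-level context, which is what makes the subsequent temporal aggregation more discriminative.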
Ethics declarations
Conflicts of interest
The authors declare no conflict of interest.
Additional information
This work has been supported by the National Natural Science Foundation of China (Nos. 61702347 and 62027801), the Natural Science Foundation of Hebei Province (Nos. F2022210007 and F2017210161), the Science and Technology Project of Hebei Education Department (Nos. ZD2022100 and QN2017132), and the Central Guidance on Local Science and Technology Development Fund (No. 226Z0501G).
Cite this article
Zhang, Y., Liu, Y. Video summarization via global feature difference optimization. Optoelectron. Lett. 19, 570–576 (2023). https://doi.org/10.1007/s11801-023-2212-0