Abstract
Automatic discovery of topical objects in video clips is a typical cognition-related task and is essential for understanding and summarizing video content. In this paper, we propose a novel framework based on structured dictionary learning for this task. Unlike existing work, which uses multiple segmentations to coarsely obtain object regions, we adopt a recently developed objectness operator to extract candidate objects, which allows the object of interest to be segmented more reliably. A structured dictionary learning method is then proposed to discover the topical objects of a video clip; the optimization model exploits the temporal relations between video frames and therefore achieves better performance. Furthermore, a globally convergent algorithm is developed to solve the structured dictionary learning problem. Extensive experiments on 10 Web video clips show that the proposed method outperforms state-of-the-art methods.
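The core idea sketched in the abstract (sparse reconstruction of per-frame features over a shared dictionary, with a coupling term that ties the codes of neighboring frames together) can be illustrated with a minimal alternating proximal scheme. This is an illustrative assumption, not the authors' algorithm: the objective weights, step sizes, and the function `structured_dict_learn` below are hypothetical, and the paper's globally convergent solver (in the spirit of proximal alternating linearized minimization [29, 30]) differs in its details.

```python
import numpy as np

def soft_threshold(Z, t):
    # Proximal operator of the l1 norm (entrywise shrinkage).
    return np.sign(Z) * np.maximum(np.abs(Z) - t, 0.0)

def structured_dict_learn(X, n_atoms=8, lam=0.1, mu=0.5, n_iter=50, seed=0):
    """Illustrative sketch of structured dictionary learning:
        min_{D,A}  0.5 ||X - D A||_F^2 + lam ||A||_1
                   + 0.5 mu sum_t ||a_t - a_{t-1}||^2
    Columns of X are frame-level features; the mu term couples the
    sparse codes of temporally adjacent frames.
    """
    rng = np.random.default_rng(seed)
    d, T = X.shape
    D = rng.standard_normal((d, n_atoms))
    D /= np.linalg.norm(D, axis=0, keepdims=True)
    A = np.zeros((n_atoms, T))
    for _ in range(n_iter):
        # --- code update: one ISTA step on the smooth part ---
        grad = D.T @ (D @ A - X)
        diff = np.diff(A, axis=1)            # a_{t+1} - a_t
        grad_t = np.zeros_like(A)            # gradient of the coupling term
        grad_t[:, :-1] -= diff
        grad_t[:, 1:] += diff
        # Lipschitz bound: ||D||_2^2 plus the path-graph Laplacian norm (<= 4).
        L = np.linalg.norm(D, 2) ** 2 + 4.0 * mu + 1e-8
        A = soft_threshold(A - (grad + mu * grad_t) / L, lam / L)
        # --- dictionary update: majorized gradient step, then renormalize ---
        Dg = (D @ A - X) @ A.T
        Ld = np.linalg.norm(A @ A.T, 2) + 1e-8
        D -= Dg / Ld
        D /= np.maximum(np.linalg.norm(D, axis=0, keepdims=True), 1e-8)
    return D, A
```

In practice each column of `X` would hold the descriptor of a candidate object window produced by the objectness operator; atoms that many frames reuse with smoothly varying codes then correspond to the topical object.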
Notes
For the experiments, we used the code of [19] available online without any modifications or tuning. It takes only about 0.01 s to compute candidate windows for one image.
References
Zhao G, Yuan J, Xu J, Wu Y. Discovering the thematic object in commercial videos. IEEE Multimed. 2011;18(3):56–65.
Liu H, Liu Y, Yu Y, Sun F. Diversified key-frame selection using structured L2,1 optimization. IEEE Trans Ind Inform. 2014;10(3):1736–45.
Liu H, Liu Y, Sun F. Video key-frame extraction for smart phones. Multimed Tools Appl. In press.
Navarretta C. The automatic identification of the producers of co-occurring communicative behaviours. Cogn Comput. 2014;6(4):689–98.
Chen Y, Zhou Q, Luo W, Du J. Classification of Chinese texts based on recognition of semantic topics. Cogn Comput. In press.
Yuan Y, Sun F. Data fusion-based resilient control system under DoS attacks: a game theoretic approach. Int J Control Autom Syst. 2015;13(3):513–20.
Sivic J, Russell B, Efros A, Zisserman A, Freeman W. Discovering objects and their location in images. In: Proceedings of international conference on computer vision (ICCV), 2005, pp. 370–377.
Russell B, Freeman W, Efros A, Sivic J, Zisserman A. Using multiple segmentations to discover objects and their extent in image collections. In: Proceedings of IEEE conference on computer vision and pattern recognition (CVPR), 2006, pp. 1605–1614.
Tunnermann J, Mertsching B. Region-based artificial visual attention in space and time. Cogn Comput. 2014;6(1):125–43.
Zhao G, Yuan J, Hua G. Topical video object discovery from key frames by modeling word co-occurrence prior. In: Proceedings of IEEE conference on computer vision and pattern recognition (CVPR), 2013; pp. 1602–1609.
Tang J, Lewis P. Non-negative matrix factorization for object class discovery and image auto-annotation. In: Proceedings of international conference on content-based image and video retrieval, 2008; pp. 105–112.
Liu T, Yuan Z, Sun J, Wang J, Zheng N, Tang X, Shum HY. Learning to detect a salient object. IEEE Trans Pattern Anal Mach Intell. 2011;33(2):353–67.
Zhu J, Wu J, Wei Y, Chang E, Tu Z. Unsupervised object class discovery via saliency-guided multiple class learning. In: Proceedings of IEEE conference on computer vision and pattern recognition (CVPR), 2012, pp. 3218–3225.
Liu D, Chen T. A topic-motion model for unsupervised video object discovery. In: Proceedings of IEEE conference on computer vision and pattern recognition (CVPR), 2014, pp. 1–8.
Zhao G, Yuan J. Discovering thematic patterns in videos via cohesive sub-graph mining. In: Proceedings of international conference on data mining (ICDM), pp. 1260–1265.
Zhao J, Sun S, Liu X, Sun J, Yang A. A novel biologically inspired visual saliency model. Cogn Comput. 2014;6(4):841–8.
Tu Z, Zheng A, Yang E, Luo B, Hussain A. A biologically inspired vision-based approach for detecting multiple moving objects in complex outdoor scenes. Cogn Comput. 2015;7(2):539–51.
Alexe B, Deselaers T, Ferrari V. What is an object? In: Proceedings of IEEE conference on computer vision and pattern recognition (CVPR), 2010, pp. 73–80.
Cheng M, Zhang Z, Lin W, Torr P. BING: Binarized normed gradients for objectness estimation at 300fps. In: Proceedings of IEEE conference on computer vision and pattern recognition (CVPR), 2014, pp. 1–8.
Cheng H, Liu Z, Hou L, Yang J. Sparsity induced similarity measure and its applications. IEEE Trans Circuits Syst Video Technol. In press. doi:10.1109/TCSVT.2012.2225911.
Wang J, Su G, Xiong Y, Chen J, Shang Y, Liu J, Ren X. Sparse representation for face recognition based on constraint sampling and face alignment. Tsinghua Sci Technol. 2013;1:62–7.
Zheng Y, Sheng H, Zhang B, Zhang J, Xiong Z. Weight-based sparse coding for multi-shot person re-identification. Sci China Inf Sci. 2015;58:100104(15).
Bolte J, Sabach S, Teboulle M. Proximal alternating linearized minimization for nonconvex and nonsmooth problems. Math Program. 2013;146:1–36.
Bao C, Ji H, Quan Y, Shen Z. \(l_0\) Norm based dictionary learning by proximal methods with global convergence. In: Proceedings of IEEE conference on computer vision and pattern recognition (CVPR), 2014, pp. 1–8.
Yuan J, Zhao G, Fu Y, Li Z, Katsaggelos A, Wu Y. Discovering thematic objects in image collections and videos. IEEE Trans Image Process. 2012;21(4):2207–19.
Acknowledgments
This study was jointly supported by National Key Project for Basic Research of China under Grant 2013CB329403, National Natural Science Foundation of China under Grant 61327809, and National High-tech Research and Development Plan under Grant 2015AA042306.
Ethics declarations
Conflict of Interest
Huaping Liu and Fuchun Sun declare that they have no conflict of interest.
Informed Consent
All procedures followed were in accordance with the ethical standards of the responsible committee on human experimentation (institutional and national) and with the Helsinki Declaration of 1975, as revised in 2008 (5). Additional informed consent was obtained from all patients for which identifying information is included in this article.
Human and Animal Rights
This article does not contain any studies with human or animal subjects performed by any of the authors.
About this article
Cite this article
Liu, H., Sun, F. Discovery of Topical Objects from Video: A Structured Dictionary Learning Approach. Cogn Comput 8, 519–528 (2016). https://doi.org/10.1007/s12559-015-9381-5