A novel feature fusion based framework for efficient shot indexing to massive web videos

Dong, Yuan; Wang, Lezi; Lian, Shiguo; Cen, Shusheng; Liu, Wei

doi:10.1007/s11235-014-9945-9

Published: 13 December 2014

Volume 59, pages 401–413, (2015)

Telecommunication Systems

Abstract

This study presents an automatic approach to analyzing the structure of large-scale web videos based on visual and acoustic information. In our approach, video streams are macro-segmented by mining duplicate sequences. Both acoustic and visual information are used for mining so as to avoid missing true positives. Unlike TV data, where duplicate clips are quite similar, web videos contain severe visual and acoustic distortions. To handle these distortions, we present novel visual-acoustic feature schemes. A shot-based indexing algorithm and several temporal constraints are presented to mine the duplicate sequences: weak geometric verification is combined with direct hashing to achieve high efficiency and superior performance in image-based duplicate sequence detection, and dynamic programming is introduced to recall missed true positives in the audio-based stage. Experiments on a dataset of 500 h of content-unknown videos show that the F-measure of duplicate sequence mining for web videos reaches 95%, and that the proposed algorithm outperforms state-of-the-art approaches in both efficiency and detection performance.
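To make the idea of hashing shots and then applying a temporal constraint more concrete, here is a minimal Python sketch. It is not the authors' algorithm: the abstract does not specify the features, the hash, or the constraints, so the function name `mine_duplicate_sequences`, the `min_len` parameter, and the assumption that each shot has already been reduced to a single hashable signature are all hypothetical. The sketch only illustrates the general pattern of indexing shots by signature and keeping matches that stay consecutive in time.

```python
from collections import defaultdict


def mine_duplicate_sequences(shot_keys, min_len=2):
    """Find repeated runs of shots in one stream.

    shot_keys: list of hashable per-shot signatures (assumed to be
    precomputed; how they are derived is outside this sketch).
    Returns a list of (start_a, start_b, length) tuples, meaning shots
    start_a..start_a+length-1 repeat at start_b..start_b+length-1.
    """
    # Direct hashing: map each signature to the positions where it occurs.
    index = defaultdict(list)
    for pos, key in enumerate(shot_keys):
        index[key].append(pos)

    # Every pair of positions sharing a signature is a candidate match.
    matches = set()
    for positions in index.values():
        for i, a in enumerate(positions):
            for b in positions[i + 1:]:
                matches.add((a, b))

    # Temporal constraint (illustrative): keep only matches that extend
    # along a diagonal, i.e. consecutive shots matching consecutive shots.
    runs = []
    for a, b in sorted(matches):
        if (a - 1, b - 1) in matches:
            continue  # this match is inside a run that started earlier
        length = 1
        while (a + length, b + length) in matches:
            length += 1
        if length >= min_len:
            runs.append((a, b, length))
    return runs


# Hypothetical stream: a three-shot clip ("ad1".."ad3") appears twice.
stream = ["n1", "ad1", "ad2", "ad3", "n2", "n3", "ad1", "ad2", "ad3", "n4"]
print(mine_duplicate_sequences(stream))
```

Exact-match signatures are a simplification. Handling the distortions described in the abstract would need a robust signature or an additional verification step, and the sketch does not attempt either.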

Acknowledgments

This work is sponsored by the collaborative research project (SEV01100474) between Beijing University of Posts and Telecommunications and France Telecom – Orange Lab Beijing, the National High Technology Research and Development Program of China (863 Program, No. 2012AA012505), and the National Natural Science Foundation of China (61372169).

Author information

Authors and Affiliations

Beijing University of Posts and Telecommunications, Beijing, 100876, People’s Republic of China
Yuan Dong & Shusheng Cen
France Telecom – Orange Lab Beijing, Beijing, 100190, People’s Republic of China
Yuan Dong, Lezi Wang & Wei Liu
Huawei Central Research Institute, Beijing, 100086, People’s Republic of China
Shiguo Lian

Corresponding author

Correspondence to Yuan Dong.

About this article

Cite this article

Dong, Y., Wang, L., Lian, S. et al. A novel feature fusion based framework for efficient shot indexing to massive web videos. Telecommun Syst 59, 401–413 (2015). https://doi.org/10.1007/s11235-014-9945-9

Issue Date: July 2015
DOI: https://doi.org/10.1007/s11235-014-9945-9
