Multiscale video sequence matching for near-duplicate detection and retrieval
Abstract
As one of the key technologies in content-based near-duplicate detection and video retrieval, video sequence matching determines whether two videos contain duplicate or near-duplicate segments. Despite considerable research effort in recent years, precisely and efficiently matching sequences among videos in a large-scale database, where the videos may have undergone complex audio-visual transformations, remains a challenging task. To address this problem, this paper proposes a multiscale video sequence matching (MS-VSM) method, which progressively detects and locates similar segments between videos from coarse to fine scales. At the coarse scale, it uses the Maximum Weight Matching (MWM) algorithm to rapidly select several candidate reference videos from the database for a given query. At the middle scale, the Constrained Longest Ascending Matching Subsequence (CLAMS) algorithm finds, for each candidate video, the segment most similar to the query; this segment is then used to decide whether the candidate contains a near-duplicate. If it does, the precise locations of the near-duplicate segments in both the query and the reference video are determined at the fine scale by bi-directional scanning, which checks the matching similarity at the segments' boundaries. In this way, the MS-VSM method achieves high near-duplicate detection accuracy and localization precision with very high processing efficiency. Extensive experiments show that it substantially outperforms several state-of-the-art methods on several benchmarks.
Keywords
Near-duplicate video detection; Video sequence matching; Multiscale matching; Constrained longest ascending matching subsequence; Bi-directional scanning
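The middle-scale step named in the abstract can be illustrated with a small sketch. It is not the paper's CLAMS algorithm, whose details are not given here; it only shows the general idea of a longest ascending matching subsequence under a constraint. It assumes that frame-level matches are already available as (query index, reference index) pairs, and the function name `longest_ascending_matches` and the `max_gap` constraint are hypothetical choices made for illustration.

```python
from typing import List, Tuple

# A match pairs a query frame index with a reference frame index.
Match = Tuple[int, int]


def longest_ascending_matches(matches: List[Match], max_gap: int) -> List[Match]:
    """Return the longest chain of matches in which both the query and the
    reference indices strictly increase, and consecutive matches are at most
    `max_gap` frames apart on each axis.

    `max_gap` is an assumed constraint chosen for illustration only.
    """
    ordered = sorted(set(matches))
    n = len(ordered)
    if n == 0:
        return []

    best_len = [1] * n   # length of the best chain ending at match i
    prev = [-1] * n      # back-pointer to the previous match in that chain

    # Quadratic dynamic programming over all earlier matches.
    for i in range(n):
        qi, ri = ordered[i]
        for j in range(i):
            qj, rj = ordered[j]
            ascending = qj < qi and rj < ri
            close = (qi - qj) <= max_gap and (ri - rj) <= max_gap
            if ascending and close and best_len[j] + 1 > best_len[i]:
                best_len[i] = best_len[j] + 1
                prev[i] = j

    # Walk back from the end of the longest chain.
    i = max(range(n), key=best_len.__getitem__)
    chain = []
    while i != -1:
        chain.append(ordered[i])
        i = prev[i]
    return chain[::-1]


if __name__ == "__main__":
    pairs = [(0, 10), (1, 11), (2, 5), (3, 13), (4, 14), (9, 40)]
    # (2, 5) breaks the ascending order; (9, 40) is too far from the rest.
    print(longest_ascending_matches(pairs, max_gap=3))
    # prints [(0, 10), (1, 11), (3, 13), (4, 14)]
```

The first and last matches of the returned chain give a rough segment in each video. A fine-scale pass such as the bi-directional scanning described in the abstract would then adjust those boundaries.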
Acknowledgments
This work is partially supported by grants from the National Key R&D Program of China under grant 2017YFB1002401 and the National Natural Science Foundation of China under contracts No. U1611461, No. 61390515, and No. 61425025.