Multimedia Tools and Applications, Volume 76, Issue 22, pp 24143–24163

A robust and low-cost video fingerprint extraction method for copy detection

  • Zobeida Jezabel Guzman-Zavaleta
  • Claudia Feregrino-Uribe
  • Miguel Morales-Sandoval
  • Alejandra Menendez-Ortiz
Article

Abstract

Video fingerprinting for content-based video identification is useful for managing and monetizing the distribution of copyrighted content. The main challenges for monitoring and copy detection systems are (a) the effective identification of highly transformed videos (robustness) and (b) computational efficiency, which may be relevant for some applications. Most video fingerprinting methods focus on robustness and leave computational efficiency aside. However, real-time applications, such as monitoring illegal content in video streaming distributions, require detection methods with low computational cost. Therefore, in this paper we propose a low-cost and effective video fingerprint extraction method based on a combination of content-based features that uses both the acoustic and the visual components of a video. Our method detects video copies using computationally efficient fingerprints while maintaining robustness against quality degradation and content-preserving distortions, which are frequent but severe attacks.

Keywords

Content-based video copy detection, ORB, Spectrogram saliency maps, TIRI, Video fingerprinting

Notes

Acknowledgments

This work was partially supported by CONACyT Mexico through the PhD grant No. 204554 and project PDCPN2013-01-216689.

Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  1. Departamento de Ciencias Computacionales, Instituto Nacional de Astrofísica, Óptica y Electrónica (INAOE), Sta. Ma. Tonanzintla, México
  2. Laboratorio de Tecnologías de Información, CINVESTAV-IPN, Tamaulipas, México
