Practical Application of Near Duplicate Detection for Image Database

  • Conference paper
Multimedia Communications, Services and Security (MCSS 2014)

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 429))

Abstract

Traditional program guides, TV applications, and online portals alone are no longer sufficient to expose all available content, let alone to offer consumers the content they want at the times they are most likely to want it. DEEP (Data Enrichment and Engagement Platform) by Orca Interactive is a comprehensive new content discovery solution that combines search, recommendation, and second-screen devices into a single immersive experience that invites exploration. A key value of DEEP is the automated generation, from internet sources, of digital magazines for movies, TV shows, cast members, and topics. Unfortunately, using the internet as a source of pictures can result in the acquisition of so-called “Near Duplicate” (ND) images: similar images from a specific display context, for example multiple red carpet images showing an actor from very similar angles or at similar degrees of zoom. Therefore, in this paper we present a practical application of ND detection for image databases. The algorithm is based on the MPEG-7 Colour Structure descriptor. On images provided by the developers of the DEEP software, the algorithm performs very well, and the results are almost identical to those obtained during the training phase.
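To make the idea of descriptor-based ND detection concrete, the following is a minimal sketch and not the paper's implementation. It assumes a simplified colour-structure histogram: uniform RGB quantisation stands in for the HMMD colour space and non-uniform quantisation that the MPEG-7 standard specifies. The L1 distance, the 0.2 threshold, and all function names are hypothetical choices made for illustration only.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]  # rows of (r, g, b) pixels, values 0..255


def quantize(pixel: Pixel, levels: int = 4) -> int:
    """Map an RGB pixel to one of levels**3 colour bins (assumed scheme)."""
    r, g, b = pixel
    step = 256 // levels
    return (r // step) * levels * levels + (g // step) * levels + (b // step)


def colour_structure(image: Image, window: int = 8, levels: int = 4) -> List[float]:
    """Simplified colour-structure histogram.

    A square structuring element slides over the image; each colour bin
    is incremented once per window position in which that colour occurs
    at least once, regardless of how many pixels carry it.
    """
    height, width = len(image), len(image[0])
    window = min(window, height, width)  # shrink the window for tiny images
    quantized = [[quantize(p, levels) for p in row] for row in image]
    bins = [0] * (levels ** 3)
    positions = 0
    for top in range(height - window + 1):
        for left in range(width - window + 1):
            present = {
                quantized[y][x]
                for y in range(top, top + window)
                for x in range(left, left + window)
            }
            for colour in present:
                bins[colour] += 1
            positions += 1
    # Normalise by the number of window positions so image size cancels out.
    return [count / positions for count in bins]


def l1_distance(a: List[float], b: List[float]) -> float:
    """Sum of absolute bin differences (taxicab distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def is_near_duplicate(img_a: Image, img_b: Image, threshold: float = 0.2) -> bool:
    """Flag a pair as ND when descriptor distance is at most the threshold.

    The threshold is a hypothetical value; in practice it would be tuned
    on annotated training pairs.
    """
    return l1_distance(colour_structure(img_a), colour_structure(img_b)) <= threshold


# Usage: two slightly different red images versus a blue one.
red = [[(255, 0, 0)] * 16 for _ in range(16)]
darker_red = [[(250, 0, 0)] * 16 for _ in range(16)]
blue = [[(0, 0, 255)] * 16 for _ in range(16)]
print(is_near_duplicate(red, darker_red))  # True
print(is_near_duplicate(red, blue))        # False
```

Because each bin counts window positions rather than pixels, the descriptor captures some of the spatial layout of colours, not only their overall proportions. That property is what distinguishes a colour-structure histogram from a plain colour histogram.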

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Eshkol, A., Grega, M., Leszczuk, M., Weintraub, O. (2014). Practical Application of Near Duplicate Detection for Image Database. In: Dziech, A., Czyżewski, A. (eds) Multimedia Communications, Services and Security. MCSS 2014. Communications in Computer and Information Science, vol 429. Springer, Cham. https://doi.org/10.1007/978-3-319-07569-3_6

  • DOI: https://doi.org/10.1007/978-3-319-07569-3_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-07568-6

  • Online ISBN: 978-3-319-07569-3

  • eBook Packages: Computer Science, Computer Science (R0)
