Practical Application of Near Duplicate Detection for Image Database

Eshkol, Adi; Grega, Michał; Leszczuk, Mikołaj; Weintraub, Ofer

doi:10.1007/978-3-319-07569-3_6

Part of the book series: Communications in Computer and Information Science (CCIS, volume 429)

Included in the following conference series:

International Conference on Multimedia Communications, Services and Security

Abstract

Traditional program guides, TV applications, and online portals alone are no longer sufficient to expose all available content, let alone to offer consumers the content they want at the times they are most likely to want it. DEEP (Data Enrichment and Engagement Platform) by Orca Interactive is a comprehensive new content discovery solution that combines search, recommendation, and second-screen devices into a single immersive experience that invites exploration. A key value of DEEP is the automated generation, from internet sources, of digital magazines for movies, TV shows, cast members, and topics. Unfortunately, using the internet as a source of pictures can result in the acquisition of so-called “Near Duplicate” (ND) images: similar images from a specific display context, for example multiple red carpet images showing an actor from very similar angles or at very similar zoom levels. Therefore, in this paper we present a practical application of ND detection for image databases. The algorithm used is based on the MPEG-7 Colour Structure descriptor. For images provided by the developers of the DEEP software, the algorithm performs very well, and the results are almost identical to those obtained during the training phase.
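
The abstract names only the descriptor family, so the following is a minimal illustrative sketch of descriptor-based ND detection, not the authors' implementation. It rests on several assumptions: a plain quantised RGB histogram stands in for the MPEG-7 Colour Structure descriptor (which it is not), descriptors are compared with the L1 distance, and the function names and the threshold value are hypothetical.

```python
from itertools import combinations


def colour_histogram(pixels, bins_per_channel=4):
    """Normalised histogram of quantised RGB colours.

    Assumption: a simple stand-in for a colour descriptor. This is NOT
    the MPEG-7 Colour Structure descriptor named in the abstract.
    """
    hist = [0.0] * (bins_per_channel ** 3)
    step = 256 // bins_per_channel
    for r, g, b in pixels:
        # Map each 8-bit channel to a bin, then flatten the 3-D index.
        idx = ((r // step) * bins_per_channel + g // step) * bins_per_channel + b // step
        hist[idx] += 1.0
    total = sum(hist) or 1.0
    return [h / total for h in hist]


def l1_distance(a, b):
    """L1 distance between two descriptors of equal length."""
    return sum(abs(x - y) for x, y in zip(a, b))


def near_duplicate_pairs(images, threshold=0.2):
    """Return name pairs whose descriptor distance is at most `threshold`.

    `images` maps a name to a list of (r, g, b) tuples. The threshold is
    a hypothetical value; in practice it would be tuned on labelled data.
    """
    descriptors = {name: colour_histogram(px) for name, px in images.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(descriptors), 2)
        if l1_distance(descriptors[a], descriptors[b]) <= threshold
    ]


if __name__ == "__main__":
    # Two synthetic "red carpet" shots with nearly the same colours,
    # plus one unrelated, mostly blue image.
    images = {
        "a": [(200, 30, 30)] * 90 + [(20, 20, 20)] * 10,
        "b": [(210, 40, 35)] * 88 + [(25, 25, 25)] * 12,
        "c": [(30, 60, 220)] * 100,
    }
    print(near_duplicate_pairs(images))
```

On the synthetic data above, only the pair ("a", "b") falls under the threshold; the unrelated image "c" is far from both. Comparing every pair is quadratic in the number of images, which is acceptable for the small per-topic image sets described here but would need indexing for large collections.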

Author information

Authors and Affiliations

Orca Interactive, 22 Zarhin Street, Ra’anana, 43662, Israel
Adi Eshkol & Ofer Weintraub
AGH University of Science and Technology, al. Mickiewicza 30, PL-30059, Krakow, Poland
Michał Grega & Mikołaj Leszczuk

Editor information

Editors and Affiliations

AGH University of Science and Technology, al. Mickiewicza 30, 30-059, Kraków, Poland
Andrzej Dziech
Multimedia Systems Department, Faculty of Electronics, Telecommunications and Informatics, Gdansk University of Technology, Narutowicza 11/12, 80-233, Gdansk, Poland
Andrzej Czyżewski

About this paper

Cite this paper

Eshkol, A., Grega, M., Leszczuk, M., Weintraub, O. (2014). Practical Application of Near Duplicate Detection for Image Database. In: Dziech, A., Czyżewski, A. (eds) Multimedia Communications, Services and Security. MCSS 2014. Communications in Computer and Information Science, vol 429. Springer, Cham. https://doi.org/10.1007/978-3-319-07569-3_6

DOI: https://doi.org/10.1007/978-3-319-07569-3_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-07568-6
Online ISBN: 978-3-319-07569-3
eBook Packages: Computer Science (R0)
