Clustering of near duplicate images using bundled features

Cluster Computing

Abstract

Image clustering generally relies on the visual features of the images, so selecting relevant features is the most essential task. This paper presents a clustering approach based on bundled features. Bundling affine scale invariant feature transform (ASIFT) features helps to cluster near duplicates, and combining local features with the ASIFT features increases clustering efficiency. Clustering the results returned by web image search engines is essential for helping users narrow their search. We applied our bundled-feature clustering to Google Image search results. The results show that the presented approach outperforms clustering performed with local features alone.

Author information

Corresponding author

Correspondence to G. Kalaiarasi.

About this article

Cite this article

Kalaiarasi, G., Thyagharajan, K.K. Clustering of near duplicate images using bundled features. Cluster Comput 22 (Suppl 5), 11997–12007 (2019). https://doi.org/10.1007/s10586-017-1539-3

  • DOI: https://doi.org/10.1007/s10586-017-1539-3
