Combining global and local matching of multiple features for precise item image retrieval

  • Regular Paper
Multimedia Systems

Abstract

With the rapid growth of online shopping services, millions or even billions of commercial item images are available on the Internet. Effectively leveraging visual search to find the items that interest users is an important yet challenging task. Besides global appearance (e.g., color, shape or pattern), users often pay more attention to the local styles of certain products; an ideal visual item search engine should therefore support detailed and precise search of similar images, which is beyond the capabilities of current search systems. In this paper, we propose a novel system named iSearch, which combines global and local matching of local features to retrieve item images precisely in an interactive manner. We extract multiple local features, including scale-invariant feature transform (SIFT) descriptors, regional color moments and object contour fragments, to sufficiently represent the visual appearance of items, and we support both global and local matching over large-scale image datasets. To this end, we develop an effective method for encoding and indexing contour fragments. Meanwhile, to improve the matching robustness of local features, we encode their spatial context with grid representations and propose a simple but effective verification approach that uses triangle-relation constraints for spatial consistency filtering. Experimental evaluations show promising results for our approach and system.
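As a rough illustration of one of the features named above, the following sketch computes color moments for a single image region in the common textbook form: the mean, standard deviation and signed cube root of the third central moment of each color channel. It is not taken from the paper. The function name `color_moments`, the representation of a region as a list of three-channel pixel tuples, and the choice of three moments per channel are all assumptions made for illustration; the abstract does not state which moments, color space or region segmentation iSearch uses.

```python
import math


def color_moments(pixels):
    """Return [mean, std, skewness] for each channel of one region.

    pixels: non-empty list of (c1, c2, c3) tuples, one per pixel in the region.
    The region format and the three moments per channel are illustrative
    assumptions, not the paper's specification.
    """
    if not pixels:
        raise ValueError("region has no pixels")
    n = len(pixels)
    moments = []
    for channel in zip(*pixels):  # iterate over the three channels
        mean = sum(channel) / n
        var = sum((v - mean) ** 2 for v in channel) / n
        third = sum((v - mean) ** 3 for v in channel) / n
        std = math.sqrt(var)
        # Signed cube root of the third central moment.
        skew = math.copysign(abs(third) ** (1.0 / 3.0), third)
        moments.extend([mean, std, skew])
    return moments


# Hypothetical four-pixel region with channel values in [0, 1].
region = [(0.0, 0.5, 1.0), (1.0, 0.5, 1.0), (0.0, 0.5, 0.0), (0.0, 0.5, 1.0)]
descriptor = color_moments(region)
print(len(descriptor))
```

Each region yields a nine-dimensional vector under these assumptions, which is compact enough to compare with a simple distance measure when matching regions between a query and database images.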

Acknowledgments

The authors would like to thank the anonymous reviewers for their constructive comments, which helped improve this manuscript. This work was supported by the National Natural Science Funds of China (61033012, 61173104 and 61103059).

Author information

Corresponding author

Correspondence to Haojie Li.

About this article

Cite this article

Li, H., Wang, X., Tang, J. et al. Combining global and local matching of multiple features for precise item image retrieval. Multimedia Systems 19, 37–49 (2013). https://doi.org/10.1007/s00530-012-0265-1

  • DOI: https://doi.org/10.1007/s00530-012-0265-1
