Spatial consistency of dense features within interest regions for efficient landmark recognition

  • Original Article
The Visual Computer

Abstract

Recently, feature grouping has been proposed as a method for improving retrieval results for logos and web images. It relies on the idea that a group of features matching over a local region of an image is more discriminative than a single feature match. In this paper, we develop this concept further and apply it to the more challenging task of landmark recognition. We propose a novel combination of densely sampled SIFT features with interest regions, so that the more salient parts of the image are represented in greater detail. In place of the conventional dense sampling used in category recognition, which computes features on a regular grid at a number of fixed scales, we allow the sampling density and scale to vary with the scale of the interest region. We develop new techniques for exploring stronger geometric constraints inside the feature groups and for computing the match score. The spatial information is stored efficiently in an inverted index structure. The proposed approach performs part-based matching of interest regions instead of matching entire images using a bag-of-words histogram. This helps reduce the influence of background clutter and works better under occlusion. Experiments reveal that directing more attention to the salient regions of the image and applying the proposed geometric constraints vastly improve recognition rates for reasonable vocabulary sizes.
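The idea of letting sampling density and descriptor scale follow the scale of an interest region can be illustrated with a minimal Python sketch. The abstract does not give the actual sampling rule, so everything here is an assumption made for illustration: the function name `dense_samples_in_region`, the square region of half-width `region_scale`, and the parameters `points_per_side` and `descriptor_scale_ratio` are all hypothetical and are not taken from the paper.

```python
from typing import List, Tuple


def dense_samples_in_region(
    cx: float,
    cy: float,
    region_scale: float,
    points_per_side: int = 4,             # hypothetical: grid resolution per region
    descriptor_scale_ratio: float = 0.25  # hypothetical: descriptor size vs. region size
) -> List[Tuple[float, float, float]]:
    """Return (x, y, descriptor_scale) samples on a grid covering a square
    region of half-width `region_scale` centred at (cx, cy).

    The grid step and the descriptor scale both grow with the region scale,
    so small regions are sampled finely and large regions coarsely. This is
    an illustrative assumption, not the rule used in the paper.
    """
    step = 2.0 * region_scale / points_per_side
    descriptor_scale = descriptor_scale_ratio * region_scale
    samples = []
    for i in range(points_per_side):
        for j in range(points_per_side):
            # Place each sample at the centre of its grid cell.
            x = cx - region_scale + (i + 0.5) * step
            y = cy - region_scale + (j + 0.5) * step
            samples.append((x, y, descriptor_scale))
    return samples


# Two regions at the same location but with different scales.
small_region = dense_samples_in_region(0.0, 0.0, region_scale=8.0)
large_region = dense_samples_in_region(0.0, 0.0, region_scale=32.0)
```

Both regions receive the same number of samples, but the spacing between samples and the descriptor scale are four times larger in the larger region. A fixed-grid dense sampler would instead use the same spacing and the same set of scales everywhere in the image.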

Acknowledgements

We are thankful to the Natural Sciences and Engineering Research Council of Canada and Alberta Innovates Technology Futures for their continued support of this research.

Author information

Corresponding author

Correspondence to Priyadarshi Bhattacharya.

About this article

Cite this article

Bhattacharya, P., Gavrilova, M.L. Spatial consistency of dense features within interest regions for efficient landmark recognition. Vis Comput 29, 491–499 (2013). https://doi.org/10.1007/s00371-013-0813-5

  • DOI: https://doi.org/10.1007/s00371-013-0813-5
