
Landmark Recognition: From Small-Scale to Large-Scale Retrieval

  • Federico Magliani
  • Tomaso Fontanini
  • Andrea Prati
Chapter
Part of the Studies in Computational Intelligence book series (SCI, volume 804)

Abstract

In recent years, the problem of landmark recognition has been addressed in many different ways. Landmark recognition consists of finding the images most similar to a query image in a dataset of buildings or places. This chapter explains the most widely used techniques for solving the problem, with a specific focus on those based on deep learning. First, it covers the classical approaches to creating descriptors for the content-based image retrieval task. Second, it presents the deep learning approach, which has shown overwhelming improvements in many computer vision tasks. Particular attention is paid to two major recent breakthroughs in Content-Based Image Retrieval (CBIR): transfer learning, which improves the feature representation and therefore the accuracy of the retrieval system, and fine-tuning, which greatly improves the performance of the retrieval system. Finally, the chapter presents techniques for large-scale retrieval, in which datasets contain at least a million images.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Federico Magliani (1)
  • Tomaso Fontanini (1)
  • Andrea Prati (1)
  1. IMPLab, Department of Engineering and Architecture, University of Parma, Parma, Italy
