Mobile Visual Search for Digital Heritage Applications

  • Rohit Girdhar
  • Jayaguru Panda
  • C. V. Jawahar


In this chapter, we demonstrate a complete pipeline for multimedia retrieval on a mobile device. We target the use case of a tourist at a heritage site who wishes to guide herself by photographing an interesting structure to get information about it. This requires efficient mobile-based instance retrieval over a dataset of thousands of images. Performing such a task on a mobile device requires a significant reduction in the visual index size. To achieve this, we describe a set of strategies that reduce the size of the visual index structure compared to a standard instance retrieval implementation found on desktops or servers. While our proposed reduction steps affect the overall mean Average Precision (mAP), they maintain a good Precision for the top K results (\(P_K\)). We argue that for such an offline application, maintaining a good \(P_K\) is sufficient. Such an instance retrieval framework depends on a well-annotated dataset of images to retrieve from. Photos from tourist and heritage sites can often be described with detailed, part-wise annotations. Manually annotating a large community photo collection is a costly and redundant process, as similar images share the same annotations. Hence, we also demonstrate an interactive web-based annotation tool that allows multiple users to add, view, edit and suggest rich annotations for images in community photo collections. Since the distinct annotations may be few, we provide an easy and efficient batch annotation approach using an image similarity graph, pre-computed with instance retrieval and matching. This helps in seamlessly propagating annotations of the same objects or similar images across the entire dataset.
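
To make the distinction between average precision and \(P_K\) concrete, the following minimal sketch computes both for a single ranked result list. It assumes binary relevance labels and the common textbook definitions: \(P_K\) is the fraction of relevant images among the first K results, and average precision is the sum of the precision values at each relevant rank divided by the total number of relevant images in the dataset. The function names and the toy ranking are illustrative assumptions and are not taken from the chapter.

```python
from typing import Sequence


def precision_at_k(relevant: Sequence[bool], k: int) -> float:
    """Fraction of the first k results that are relevant.

    `relevant[i]` is True when the result at rank i+1 is a correct match.
    Ranks beyond the end of the list count as non-relevant.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(relevant[:k]) / k


def average_precision(relevant: Sequence[bool], num_relevant: int) -> float:
    """Average precision of one ranked list.

    `num_relevant` is the total number of relevant images in the dataset,
    including any that the ranked list failed to return.
    """
    if num_relevant <= 0:
        return 0.0
    hits = 0
    total = 0.0
    for rank, is_relevant in enumerate(relevant, start=1):
        if is_relevant:
            hits += 1
            total += hits / rank  # precision at this relevant rank
    return total / num_relevant


# Hypothetical ranking: the top three results are correct, one more correct
# match appears at rank 8, and two relevant images are never retrieved.
ranking = [True, True, True, False, False, False, False, True, False, False]
p_at_3 = precision_at_k(ranking, 3)
ap = average_precision(ranking, num_relevant=6)
```

In this toy case `p_at_3` is 1.0 while `ap` is roughly 0.58: the few results a tourist actually inspects are all correct even though the full ranking is far from perfect, which is the situation the abstract argues is acceptable for an offline mobile application. Averaging `ap` over many queries would give mAP.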


Mobile vision, Instance retrieval, Digital heritage, Image annotation



The authors would like to thank DST and the India Digital Heritage Project for their financial support and for introducing us to the exciting set of problems in this space.



Copyright information

© Springer Nature Singapore Pte Ltd. 2017

Authors and Affiliations

  1. CVIT, IIIT Hyderabad, Hyderabad, India
