Multimedia Tools and Applications, Volume 51, Issue 1, pp 77–98

Research and applications on georeferenced multimedia: a survey

Abstract

In recent years, the emergence of georeferenced media on the Internet, such as geotagged photos, has opened up a new world of possibilities for geography-related research and applications. Despite its short history, georeferenced media has attracted attention from several major research communities: Computer Vision, Multimedia, Digital Libraries and KDD. This paper provides a comprehensive survey of recent research and applications involving online georeferenced media. Specifically, the survey focuses on four aspects: (1) organizing and browsing georeferenced media resources, (2) mining semantic and social knowledge from georeferenced media, (3) learning landmarks in the world, and (4) estimating the geographic location of a photo. Furthermore, based on current technical achievements, open research issues and challenges are identified, and directions that can lead to compelling applications are suggested.

Keywords

Georeferenced media, Geotagged photo, Survey

Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  1. Institute for Infocomm Research, Singapore, Singapore
  2. Department of Computer Science, National University of Singapore, Singapore, Singapore
