Where the Photos Were Taken: Location Prediction by Learning from Flickr Photos

  • Li-Jia Li
  • Rahul Kumar Jha
  • Bart Thomee
  • David Ayman Shamma
  • Liangliang Cao
  • Yang Wang
Part of the Advances in Computer Vision and Pattern Recognition book series (ACVPR)


In this chapter, we explore the characteristics of geographically tagged Internet photos and estimate where each photo was taken from its visual content. We develop a principled machine learning model that estimates the geographical location of a photo by modeling the relationship between location and photo content. Building reliable geographical estimators requires finding distinguishable geographical clusters in the world. These clusters cover general geographical regions and are not limited to landmarks. Geographical clusters provide more training samples and hence lead to better recognition accuracy. We develop a framework for geographical cluster estimation that treats the clusters as latent variables, and we propose an efficient solver to find these latent clusters. We present detailed qualitative results on beach photos taken on different continents. In addition, using the Flickr beach dataset for validation, we show significantly improved quantitative results over other approaches for recognizing different beaches.


Keywords: Cluster center; Training image; Visual content; Geographical cluster; Landmark recognition



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Li-Jia Li (1)
  • Rahul Kumar Jha (2)
  • Bart Thomee (3)
  • David Ayman Shamma (3)
  • Liangliang Cao (4)
  • Yang Wang (5)

  1. Yahoo! Research, Sunnyvale, USA
  2. University of Michigan, Ann Arbor, USA
  3. Yahoo! Research, San Francisco, USA
  4. IBM Watson Research, New York, USA
  5. University of Manitoba, Winnipeg, Canada
