Spatial Natural Language Generation for Location Description in Photo Captions

  • Mark M. Hall
  • Christopher B. Jones
  • Philip Smart
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9368)


We present a spatial natural language generation system that creates captions describing the geographical context of geo-referenced photos. An analysis of existing photo captions was used to design templates representing typical caption language patterns, while the results of human subject experiments were used to create field-based spatial models of the applicability of some commonly used spatial prepositions. The language templates are instantiated with geo-data retrieved from the vicinity of the photo locations. A human subject evaluation was used to validate and improve the spatial language generation procedure, and examples of the generated captions are presented in the paper.
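The idea of instantiating a caption template with nearby geo-data, and letting an applicability field choose the preposition, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the template wording, the function names, and in particular the two applicability curves, whose shapes and constants are invented here. The paper derives its field-based models from human subject experiments, which this sketch does not reproduce.

```python
import math

# Hypothetical applicability fields: each maps a distance in metres
# (photo location to landmark) to a score in [0, 1]. The curve shapes
# and constants are invented for illustration only; they are NOT the
# experimentally derived fields described in the paper.
def applicability_at(distance_m):
    return math.exp(-distance_m / 50.0)

def applicability_near(distance_m):
    # Peaks some way from the landmark, then decays with distance.
    return math.exp(-((distance_m - 300.0) / 400.0) ** 2)

PREPOSITION_FIELDS = {"at": applicability_at, "near": applicability_near}

# Hypothetical caption template with slots filled from geo-data.
TEMPLATE = "Photo taken {preposition} {landmark} in {region}"

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    radius_m = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius_m * math.asin(math.sqrt(a))

def best_preposition(distance_m):
    """Return the preposition whose field scores highest at this distance."""
    return max(PREPOSITION_FIELDS,
               key=lambda prep: PREPOSITION_FIELDS[prep](distance_m))

def caption(photo_latlon, landmark, region):
    """Fill the template for one photo and one candidate landmark."""
    distance_m = haversine_m(photo_latlon[0], photo_latlon[1],
                             landmark["lat"], landmark["lon"])
    return TEMPLATE.format(preposition=best_preposition(distance_m),
                           landmark=landmark["name"],
                           region=region)
```

Calling `caption((51.4816, -3.1813), {"name": "Cardiff Castle", "lat": 51.4816, "lon": -3.1813}, "Cardiff")` fills the template for a photo taken at the landmark's own coordinates. The coordinates are illustrative values, not data from the paper.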


Keywords: Vague spatial language · Natural language processing · Human-subject experiments · Spatial prepositions · Field-based spatial models · Locative expressions



This work was supported by the EC TRIPOD project (FP6 045335).



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Mark M. Hall (1)
  • Christopher B. Jones (2, corresponding author)
  • Philip Smart (2)
  1. University of Edgehill, Ormskirk, UK
  2. Cardiff University, Cardiff, UK