Creating a Corpus of Geospatial Natural Language

  • Conference paper
Spatial Information Theory (COSIT 2013)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8116)

Abstract

The description of location using natural language is of interest for a number of research activities including the automated interpretation and generation of natural language to ease interaction with geographic information systems. For such activities, examples of geospatial natural language are usually collected from the personal knowledge of researchers, or in small scale collection activities specific to the project concerned. This paper describes the process used to develop a more generic corpus of geospatial natural language.

The paper discusses the development and evaluation of four methods for semi-automated harvesting of geospatial natural language clauses from text to create a corpus of geospatial natural language. The most successful method uses a set of geospatial syntactic templates, which describe common patterns of grammatical geospatial word categories, and achieves a precision of 0.66. Particular challenges were posed by the range of English dialects included, as well as by metaphoric and sporting references.
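As a rough illustration of template-based harvesting and of how precision is computed, the following minimal sketch matches clauses against sequences of word categories. It is not the authors' implementation: the word lists, the category names (`SPREP`, `GEON`, `DET`), the single template, and the sample clauses are all hypothetical and chosen only to show the idea, including how a metaphoric clause can be harvested in error.

```python
import re

# Hypothetical word lists for illustration only; not taken from the paper.
SPATIAL_PREPOSITIONS = {"near", "in", "beside", "under", "across", "along"}
GEO_NOUNS = {"river", "church", "station", "park", "bridge", "road"}
DETERMINERS = {"the", "a", "an"}

# Hypothetical syntactic template: a sequence of word categories.
TEMPLATES = [("SPREP", "DET", "GEON")]


def categorise(token):
    """Map a token to a coarse (assumed) geospatial word category."""
    word = token.lower()
    if word in SPATIAL_PREPOSITIONS:
        return "SPREP"
    if word in GEO_NOUNS:
        return "GEON"
    if word in DETERMINERS:
        return "DET"
    return "O"  # any other word


def matches_template(clause):
    """Return True if any template occurs as a contiguous category run."""
    categories = [categorise(t) for t in re.findall(r"[A-Za-z]+", clause)]
    for template in TEMPLATES:
        n = len(template)
        for i in range(len(categories) - n + 1):
            if tuple(categories[i:i + n]) == template:
                return True
    return False


def harvest(clauses):
    """Keep only the clauses that match at least one template."""
    return [c for c in clauses if matches_template(c)]


def precision(harvested, labels):
    """Fraction of harvested clauses that a human judged geospatial."""
    if not harvested:
        return 0.0
    return sum(1 for c in harvested if labels[c]) / len(harvested)


# Hypothetical clauses with manual judgements (True = genuinely geospatial).
LABELS = {
    "The church is near the river": True,
    "The station is beside the park": True,
    "That is water under the bridge": False,  # metaphor, matched in error
    "She sang a song": False,
    "He is in the zone": False,
}

harvested = harvest(list(LABELS))
score = precision(harvested, LABELS)
```

In this toy data, the metaphoric clause matches the template even though it does not describe a location, which lowers precision; this mirrors the kind of difficulty the abstract attributes to metaphoric references.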

Copyright information

© 2013 Springer International Publishing Switzerland

About this paper

Cite this paper

Stock, K., Pasley, R.C., Gardner, Z., Brindley, P., Morley, J., Cialone, C. (2013). Creating a Corpus of Geospatial Natural Language. In: Tenbrink, T., Stell, J., Galton, A., Wood, Z. (eds) Spatial Information Theory. COSIT 2013. Lecture Notes in Computer Science, vol 8116. Springer, Cham. https://doi.org/10.1007/978-3-319-01790-7_16

  • DOI: https://doi.org/10.1007/978-3-319-01790-7_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-01789-1

  • Online ISBN: 978-3-319-01790-7

  • eBook Packages: Computer Science, Computer Science (R0)
