SpatialML: annotation scheme, resources, and evaluation
SpatialML is an annotation scheme for marking up references to places in natural language. It covers both named and nominal references to places, grounding them where possible with geo-coordinates, and characterizes relationships among places in terms of a region calculus. A freely available annotation editor has been developed for SpatialML, along with several annotated corpora. Inter-annotator agreement on SpatialML extents is 91.3 F-measure on a corpus of SpatialML-annotated ACE documents released by the Linguistic Data Consortium. Disambiguation agreement on geo-coordinates on ACE is 87.93 F-measure. An automatic tagger for SpatialML extents scores 86.9 F on ACE, while a disambiguator scores 93.0 F on it. Results are also presented for two other corpora. In adapting the extent tagger to new domains, merging the training data from the ACE corpus with annotated data in the new domain provides the best performance.
Keywords: Annotation, Guidelines, Spatial language, Geography, Information extraction, Evaluation, Adaptation
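The agreement and tagger scores above are reported as F-measure, the harmonic mean of precision and recall. As an illustration only, the following minimal sketch computes an F-measure over two sets of annotated extents. It assumes that extents are represented as character-offset pairs and that only exact matches count; both the representation and the matching criterion are assumptions made for this example, not the scoring procedure used for the reported figures, and the names `Extent` and `extent_f_measure` are hypothetical.

```python
from typing import Set, Tuple

# Hypothetical representation: an extent is a (start, end) character-offset pair.
Extent = Tuple[int, int]


def extent_f_measure(
    reference: Set[Extent], hypothesis: Set[Extent]
) -> Tuple[float, float, float]:
    """Return (precision, recall, F-measure) under exact-match scoring.

    Assumption: an extent counts as correct only if both offsets match
    exactly; partial overlaps get no credit.
    """
    matched = len(reference & hypothesis)
    precision = matched / len(hypothesis) if hypothesis else 0.0
    recall = matched / len(reference) if reference else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Toy data, invented for illustration: two annotators marking place extents.
annotator_a = {(0, 6), (15, 21), (40, 52), (60, 66)}
annotator_b = {(0, 6), (15, 21), (41, 52)}

p, r, f = extent_f_measure(annotator_a, annotator_b)
print(f"precision={p:.3f} recall={r:.3f} F={f:.3f}")
```

With one annotator treated as the reference and the other as the hypothesis, swapping the two swaps precision and recall but leaves the F-measure unchanged, which is why it is a convenient single number for inter-annotator agreement.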