
Geocoding Textual Documents Through a Hierarchy of Linear Classifiers

  • Conference paper
Progress in Artificial Intelligence (EPIA 2015)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9273)


Abstract

In this paper, we empirically evaluate an automated technique for assigning geospatial coordinates to previously unseen documents, using only the raw text as input evidence. The technique is based on a hierarchical representation of the Earth’s surface and leverages linear classifiers. We measured the results obtained with models based on Support Vector Machines over collections of geo-referenced Wikipedia articles in four languages: English, German, Spanish and Portuguese. The best-performing models obtained state-of-the-art results, corresponding to an average prediction error of 83 kilometers and a median error of just 9 kilometers on the English Wikipedia collection.
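The abstract reports results as an average and a median prediction error in kilometers. As a rough illustration of how such figures could be computed, the sketch below measures the distance between predicted and true coordinates and summarizes the errors. It assumes a spherical Earth and the haversine great-circle formula, which may differ from the distance computation actually used in the paper; the function names `haversine_km` and `error_summary` are hypothetical and introduced only for this example.

```python
import math
from statistics import mean, median

# Mean Earth radius in kilometers (spherical approximation; an assumption of this sketch).
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def error_summary(predicted, actual):
    """Return (mean_km, median_km) for paired lists of (lat, lon) tuples."""
    if not predicted or len(predicted) != len(actual):
        raise ValueError("predicted and actual must be non-empty and of equal length")
    errors = [haversine_km(p[0], p[1], a[0], a[1])
              for p, a in zip(predicted, actual)]
    return mean(errors), median(errors)


if __name__ == "__main__":
    # Toy coordinates, invented for illustration only (not data from the paper).
    predicted = [(38.72, -9.14), (40.42, -3.70), (52.52, 13.40)]
    actual = [(38.74, -9.15), (41.39, 2.17), (52.50, 13.42)]
    mean_km, median_km = error_summary(predicted, actual)
    print(f"mean error: {mean_km:.1f} km, median error: {median_km:.1f} km")
```

A mean far above the median, as in the figures reported for the English collection, suggests that a small number of documents with very large errors dominate the average while most predictions land close to the true location.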



Author information

Corresponding author

Correspondence to Bruno Martins.


Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Melo, F., Martins, B. (2015). Geocoding Textual Documents Through a Hierarchy of Linear Classifiers. In: Pereira, F., Machado, P., Costa, E., Cardoso, A. (eds) Progress in Artificial Intelligence. EPIA 2015. Lecture Notes in Computer Science, vol 9273. Springer, Cham. https://doi.org/10.1007/978-3-319-23485-4_59


  • DOI: https://doi.org/10.1007/978-3-319-23485-4_59


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-23484-7

  • Online ISBN: 978-3-319-23485-4

  • eBook Packages: Computer Science, Computer Science (R0)
