Named Entity Linking in a Complex Domain: Case Second World War History

  • Erkki Heino
  • Minna Tamper
  • Eetu Mäkelä
  • Petri Leskinen
  • Esko Ikkala
  • Jouni Tuominen
  • Mikko Koho
  • Eero Hyvönen
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10318)

Abstract

This paper discusses the challenges of applying named entity linking in a rich, complex domain – specifically, the linking of (1) military units, (2) places and (3) people in the context of interlinked Second World War data. Multiple sub-scenarios are discussed in detail through concrete evaluations, analyzing the problems faced, and the solutions developed. A key contribution of this work is to highlight the heterogeneity of problems and approaches needed even inside a single domain, depending on both the source data as well as the target authority.

References

  1. 1.
    Bunescu, R.C., Pasca, M.: Using encyclopedic knowledge for named entity disambiguation. In: EACL, vol. 6, pp. 9–16 (2006)Google Scholar
  2. 2.
    Cucerzan, S.: Large-scale named entity disambiguation based on Wikipedia data. In: EMNLP-CoNLL, vol. 7, pp. 708–716 (2007)Google Scholar
  3. 3.
    Doerr, M.: The CIDOC CRM - an ontological approach to semantic interoperability of metadata. AI Mag. 24(3), 75–92 (2003)Google Scholar
  4. 4.
    Godoy, J., Atkinson, J., Rodriguez, A.: Geo-referencing with semi-automatic gazetteer expansion using lexico-syntactical patterns and co-reference analysis. Int. J. Geogr. Inf. Sci. 25(1), 149–170 (2011). http://dx.doi.org/10.1080/13658816.2010.513981 CrossRefGoogle Scholar
  5. 5.
    Gracia, J., Mena, E.: Multiontology semantic disambiguation in unstructured web contexts. In: Proceedings of the 2009 K-CAP Workshop on Collective Knowledge Capturing and Representation, pp. 1–9 (2009)Google Scholar
  6. 6.
    Grishman, R., Sundheim, B.: Message understanding conference-6: a brief history. In: Coling, vol. 96, pp. 466–471 (1996)Google Scholar
  7. 7.
    Grover, C., Tobin, R., Byrne, K., Woollard, M., Reid, J., Dunn, S., Ball, J.: Use of the edinburgh geoparser for georeferencing digitized historical collections. Philos. Trans. R. Soc. Lond. A Math. Phys. Eng. Sci. 368(1925), 3875–3889 (2010). http://rsta.royalsocietypublishing.org/content/368/1925/3875 Google Scholar
  8. 8.
    Hachey, B., Radford, W., Nothman, J., Honnibal, M., Curran, J.R.: Evaluating entity linking with Wikipedia. Artif. Intell. 194, 130–150 (2013). http://dx.doi.org/10.1016/j.artint.2012.04.005 MathSciNetCrossRefMATHGoogle Scholar
  9. 9.
    Hoffart, J., Yosef, M.A., Bordino, I., Fürstenau, H., Pinkal, M., Spaniol, M., Taneva, B., Thater, S., Weikum, G.: Robust disambiguation of named entities in text. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, EMNLP 2011, Association for Computational Linguistics, Stroudsburg, PA, USA, pp. 782–792 (2011). http://dl.acm.org/citation.cfm?id=2145432.2145521
  10. 10.
    Hu, Y., Janowicz, K., Prasad, S.: Improving Wikipedia-based place name disambiguation in short texts using structured data from DBpedia. In: Proceedings of the 8th Workshop on Geographic Information Retrieval, GIR 2014, NY, USA, pp. 8:1–8:8 (2014). http://doi.acm.org/10.1145/2675354.2675356
  11. 11.
    Hyvönen, E., Heino, E., Leskinen, P., Ikkala, E., Koho, M., Tamper, M., Tuominen, J., Mäkelä, E.: WarSampo Data Service and Semantic Portal for Publishing Linked Open Data About the Second World War History. In: Sack, H., Blomqvist, E., d’Aquin, M., Ghidini, C., Ponzetto, S.P., Lange, C. (eds.) ESWC 2016. LNCS, vol. 9678, pp. 758–773. Springer, Cham (2016). doi:10.1007/978-3-319-34129-3_46 CrossRefGoogle Scholar
  12. 12.
    Hyvönen, E., Tuominen, J., Kauppinen, T., Väätäinen, J.: Representing and utilizing changing historical places as an ontology time series. In: Ashish, N., Sheth, A. (eds.) Geospatial Semantics and Semantic Web: Foundations, Algorithms, and Applications. Springer, New York (2011)Google Scholar
  13. 13.
    Kettunen, K., Mäkelä, E., Kuokkala, J., Ruokolainen, T., Niemi, J.: Modern tools for old content - in search of named entities in a finnish ocred historical newspaper collection 1771–1910. In: Proceedings of LWDA 2016, September 2016Google Scholar
  14. 14.
    Koho, M., Hyvönen, E., Heino, E., Tuominen, J., Leskinen, P., Mäkelä, E.: Linked death - representing, publishing, and using second world war death records as linked open data. In: Sack, H., Rizzo, G., Steinmetz, N., Mladenić, D., Auer, S., Lange, C. (eds.) The Semantic Web: ESWC 2016 Satellite Events. Springer, Heidelberg (2016)Google Scholar
  15. 15.
    Löfberg, L., Archer, D., Piao, S., Rayson, P., McEnery, T., Varantola, K., Juntunen, J.P.: Porting an English semantic tagger to the finnish language. In: Proceedings of the Corpus Linguistics 2003 conference, pp. 457–464 (2003)Google Scholar
  16. 16.
    Mendes, P.N., Jakob, M., García-Silva, A., Bizer, C.: Dbpedia spotlight: shedding light on the web of documents. In: Proceedings of the 7th International Conference on Semantic Systems, pp. 1–8. ACM (2011)Google Scholar
  17. 17.
    Mäkelä, E.: Combining a REST Lexical Analysis Web Service with SPARQL for Mashup Semantic Annotation from Text. In: Presutti, V., Blomqvist, E., Troncy, R., Sack, H., Papadakis, I., Tordai, A. (eds.) ESWC 2014. LNCS, vol. 8798, pp. 424–428. Springer, Cham (2014). doi:10.1007/978-3-319-11955-7_60 Google Scholar
  18. 18.
    Mäkelä, E.: LAS: an integrated language analysis tool for multiple languages. J. Open Source Softw. 1(6), 2 (2016). http://dx.doi.org/10.21105/joss.00035 CrossRefGoogle Scholar
  19. 19.
    Nadeau, D., Sekine, S.: A survey of named entity recognition and classification. Lingvisticae Invest. 30(1), 3–26 (2007)CrossRefGoogle Scholar
  20. 20.
    Shen, W., Wang, J., Han, J.: Entity linking with a knowledge base: issues, techniques, and solutions. IEEE Trans. Knowl. Data Eng. 27(2), 443–460 (2015)CrossRefGoogle Scholar
  21. 21.
    The Association for Military History in Finland: Kansa taisteli lehdet 1957–1986 (2014). http://www.sshs.fi/sitenews/view/-/nid/92/ngid/1
  22. 22.
    Wentland, W., Knopp, J., Silberer, C., Hartung, M.: Building a multilingual lexical resource for named entity disambiguation, translation and transliteration. In: Proceedings of the Sixth International Conference on Language Resources and Evaluation, LREC 2008, European Language Resources Association (ELRA), Marrakech, Morocco, May 2008. http://www.lrec-conf.org/proceedings/lrec2008/

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Erkki Heino
    • 1
    • 2
  • Minna Tamper
    • 1
    • 2
  • Eetu Mäkelä
    • 1
    • 2
  • Petri Leskinen
    • 1
    • 2
  • Esko Ikkala
    • 1
    • 2
  • Jouni Tuominen
    • 1
    • 2
  • Mikko Koho
    • 1
    • 2
  • Eero Hyvönen
    • 1
    • 2
  1. 1.Semantic Computing Research Group (SeCo)Aalto UniversityEspooFinland
  2. 2.HELDIG – Helsinki Centre for Digital HumanitiesUniversity of HelsinkiHelsinkiFinland

Personalised recommendations