Exploring Linked Data for the Automatic Enrichment of Historical Archives

  • Gary Munnelly
  • Harshvardhan J. Pandit
  • Séamus Lawless
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11155)


With the increasing scale of online cultural heritage collections, manually annotating their contents has become a challenging and costly endeavour. Entity Linking is a process used to apply such annotations automatically to a text-based collection, where the quality and coverage of the linking depend heavily on the knowledge base that informs it. In this paper, we present our ongoing efforts to annotate a corpus of 17th-century Irish witness statements using Entity Linking methods that utilise Semantic Web techniques. We discuss the problems faced in this process and our attempts to remedy them.


Entity linking, Ontology creation, Automatic enrichment



The ADAPT Centre for Digital Content Technology is funded under the SFI Research Centres Programme (Grant 13/RC/2106) and is co-funded under the European Regional Development Fund.



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Gary Munnelly (1, corresponding author)
  • Harshvardhan J. Pandit (1)
  • Séamus Lawless (1)
  1. ADAPT Centre, Trinity College Dublin, Dublin, Ireland
