Exploring Linked Data for the Automatic Enrichment of Historical Archives

Part of the Lecture Notes in Computer Science book series (LNISA, volume 11155)

Abstract

As online cultural heritage collections grow in scale, manually annotating their contents becomes a challenging and costly endeavour. Entity Linking is a process used to apply such annotations automatically to a text-based collection; the quality and coverage of the linking process are highly dependent on the knowledge base that informs it. In this paper, we present our ongoing efforts to annotate a corpus of 17th-century Irish witness statements using Entity Linking methods that utilise Semantic Web techniques. We discuss the problems encountered in this process and our attempts to remedy them.
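To make the dependence on the knowledge base concrete, the following is a minimal sketch of dictionary-based linking. It is not the method used in the paper. The knowledge base contents, the `example.org` URIs, the spelling variant, and the function names are all invented for illustration.

```python
# Hypothetical toy knowledge base: URI -> known surface forms.
# Every entry here is an assumption made up for this sketch.
TOY_KB = {
    "http://example.org/kb/Dublin": ["Dublin", "Dublyn"],
    "http://example.org/kb/Cork": ["Cork", "Corke"],
}


def build_index(kb):
    """Map each lower-cased surface form to the URIs that list it."""
    index = {}
    for uri, forms in kb.items():
        for form in forms:
            index.setdefault(form.lower(), []).append(uri)
    return index


def link_mentions(mentions, kb):
    """Return (mention, uri) pairs; uri is None (NIL) when no candidate exists."""
    index = build_index(kb)
    results = []
    for mention in mentions:
        candidates = index.get(mention.lower(), [])
        # Naive choice: take the first candidate. A real system would rank
        # candidates using the surrounding context.
        results.append((mention, candidates[0] if candidates else None))
    return results


if __name__ == "__main__":
    for mention, uri in link_mentions(["Dublyn", "Corke", "Kilkenny"], TOY_KB):
        print(mention, "->", uri or "NIL")
```

In this sketch, a mention can only be linked if the knowledge base lists a matching surface form. "Kilkenny" is absent from the toy knowledge base and therefore stays unlinked, however good the linker itself might be.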

Keywords

  • Entity linking
  • Ontology creation
  • Automatic enrichment

Notes

  1. https://www.europeana.eu/portal/en

  2. https://dp.la/

  3. http://wiki.dbpedia.org/

  4. https://www.mpi-inf.mpg.de/departments/databases-and-information-systems/research/yago-naga/yago/#c10444

  5. http://1641.tcd.ie/deposition.php?depID=815351r406

  6. http://downsurvey.tcd.ie/index.html

  7. http://www.oxforddnb.com

  8. http://dib.cambridge.org/

Acknowledgements

The ADAPT Centre for Digital Content Technology is funded under the SFI Research Centres Programme (Grant 13/RC/2106) and is co-funded under the European Regional Development Fund.

Author information

Corresponding author

Correspondence to Gary Munnelly.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Munnelly, G., Pandit, H.J., Lawless, S. (2018). Exploring Linked Data for the Automatic Enrichment of Historical Archives. In: , et al. The Semantic Web: ESWC 2018 Satellite Events. ESWC 2018. Lecture Notes in Computer Science, vol 11155. Springer, Cham. https://doi.org/10.1007/978-3-319-98192-5_57

  • DOI: https://doi.org/10.1007/978-3-319-98192-5_57

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-98191-8

  • Online ISBN: 978-3-319-98192-5

  • eBook Packages: Computer Science, Computer Science (R0)