BiographySampo – Publishing and Enriching Biographies on the Semantic Web for Digital Humanities Research

  • Eero Hyvönen
  • Petri Leskinen
  • Minna Tamper
  • Heikki Rantala
  • Esko Ikkala
  • Jouni Tuominen
  • Kirsi Keravuori
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11503)


This paper argues for a paradigm shift in publishing and using biographical dictionaries on the web, based on Linked Data. The idea is to provide the user with an enhanced reading experience of biographies by enriching their contents with data linking and reasoning. In addition, versatile tooling is provided for (1) biographical research on individual persons and (2) prosopographical research on groups of people. To demonstrate and evaluate the new possibilities, we present the semantic portal “BiographySampo – Finnish Biographies on the Semantic Web”. The system is based on a knowledge graph extracted automatically from a collection of 13 100 textual biographies, enriched by linking the data to 16 external data sources and by harvesting external collection data from libraries, museums, and archives. The portal was released in September 2018 for free public use at



Thanks to Business Finland for financial support and CSC – IT Center for Science, Finland, for computational resources.



Copyright information

© Springer Nature Switzerland AG 2019

Open Access. This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

Authors and Affiliations

  1. Semantic Computing Research Group (SeCo), Aalto University, Espoo, Finland
  2. HELDIG – Helsinki Centre for Digital Humanities, University of Helsinki, Helsinki, Finland
  3. Finnish Literature Society (SKS), Helsinki, Finland
