NotaryPedia: A Knowledge Graph of Historical Notarial Manuscripts

  • Charlene Ellul
  • Joel Azzopardi
  • Charlie Abela
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11877)


The Notarial Archives in Valletta, the capital city of Malta, house a rich and valuable collection of around twenty thousand notarial manuscripts dating back to the 15th century. The Archives want to make the contents of this collection easily accessible and searchable to researchers and the general public. Knowledge Graphs have been successfully used to represent similar historical content. Nevertheless, building a Knowledge Graph for the Archives is challenging because these documents are written in medieval Latin and there is currently a lack of information extraction tools that recognise this language. This is further compounded by a lack of medieval Latin corpora to train and evaluate machine learning algorithms, as well as the lack of an ontological representation for the contents of notarial manuscripts. In this paper, we present NotaryPedia, a Knowledge Graph for the Notarial Archives. We extend our previous work on entity and keyphrase extraction with relation extraction to populate the Knowledge Graph using an ontological vocabulary for notarial deeds. Furthermore, we perform Knowledge Graph completion using link prediction and inference. Our work was evaluated using different translational distance and semantic matching models to predict relations amongst literals by promoting them to entities and to infer new knowledge from existing entities. Using TransE, we achieved a relation prediction accuracy of 49%.
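The idea behind relation prediction with a translational distance model such as TransE can be illustrated with a minimal sketch. This is not the authors' implementation: the entity names, relation names, and two-dimensional embeddings below are hypothetical and hand-picked rather than learned. TransE treats a relation as a translation vector, so a triple (head, relation, tail) is considered plausible when head + relation lies close to tail. The date value stands in for a literal that has been promoted to an entity so that it can take part in link prediction.

```python
import math

# Hypothetical embeddings chosen by hand for illustration only
# (a real model would learn these from the Knowledge Graph triples).
entities = {
    "deed_1": [0.0, 0.0],
    "notary_A": [1.0, 0.0],
    "date_1486": [0.0, 1.0],  # a literal promoted to an entity node
}
relations = {
    "hasNotary": [1.0, 0.0],
    "hasDate": [0.0, 1.0],
}


def transe_distance(h, r, t):
    """L2 norm of h + r - t; a smaller value means a more plausible triple."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def predict_relation(head, tail):
    """Return the relation whose translation best maps head onto tail."""
    h, t = entities[head], entities[tail]
    return min(relations, key=lambda name: transe_distance(h, relations[name], t))


if __name__ == "__main__":
    print(predict_relation("deed_1", "notary_A"))
    print(predict_relation("deed_1", "date_1486"))
```

With these toy vectors, each (head, tail) pair is matched to the relation with the smallest translation distance. In practice the embeddings would be trained on the extracted triples, and accuracy would be measured on held-out relations.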


Knowledge Graph · Medieval Latin text · Notarial Ontology · Relation extraction · Link prediction



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. University of Malta, Msida, Malta
