Abstract
The Notarial Archives in Valletta, the capital city of Malta, houses a rich and valuable collection of around twenty thousand notarial manuscripts dating back to the 15th century. The Archive wants to make the contents of this collection easily accessible and searchable to researchers and the general public. Knowledge Graphs have been successfully used to represent similar historical content. Nevertheless, building a Knowledge Graph for the archives is challenging, as these documents are written in medieval Latin and there is currently a lack of information extraction tools that recognise this language. This is further compounded by a lack of medieval Latin corpora to train and evaluate machine learning algorithms, as well as the lack of an ontological representation for the contents of notarial manuscripts. In this paper, we present NotaryPedia, a Knowledge Graph for the Notarial Archives. We extend our previous work on entity and keyphrase extraction with relation extraction to populate the Knowledge Graph using an ontological vocabulary for notarial deeds. Furthermore, we perform Knowledge Graph completion using link prediction and inference. Our work was evaluated using different translational distance and semantic matching models to predict relations amongst literals by promoting them to entities and to infer new knowledge from existing entities. A relation prediction accuracy of 49% was achieved using TransE.
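As a rough illustration of the translational distance idea behind TransE [Bordes et al., 2013], a triple (head, relation, tail) is considered plausible when the head embedding plus the relation embedding lies close to the tail embedding. The sketch below is not the authors' implementation: the entity names, relation names, and two-dimensional vectors are invented for illustration, and a real model would learn its embeddings from the Knowledge Graph.

```python
import math


def transe_score(head, relation, tail):
    """Return the TransE distance ||h + r - t||_2; lower means more plausible."""
    return math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))


def rank_relations(head, tail, relations):
    """Sort candidate relation names by ascending TransE distance for (head, ?, tail)."""
    return sorted(relations, key=lambda name: transe_score(head, relations[name], tail))


# Hypothetical toy embeddings (assumptions, not taken from NotaryPedia).
entities = {
    "notary": [0.0, 0.0],
    "deed": [1.0, 0.0],
}
relations = {
    "drewUp": [1.0, 0.0],
    "witnessed": [0.0, 1.0],
}

# Relation prediction: which relation best links "notary" to "deed"?
ranking = rank_relations(entities["notary"], entities["deed"], relations)
print(ranking)
```

Under these toy vectors the candidate with the smallest distance is ranked first; relation prediction accuracy would then be the fraction of held-out triples whose true relation is ranked first.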
This work is partially funded by project E-18LO28-01 as part of the collaboration between the Notarial Archives in Valletta and the University of Malta.
Notes
- 8. The prototype can be accessed from: https://notarypedia.opendatamalta.com/.
- 11. The current Knowledge Graph is found here: https://notarypedia.opendatamalta.com/graph/notarypedia.ttl.
- 12. The smallest unit of a description in an archival collection, for example a report [1].
- 13. An organized unit of documents grouped together either for current use by the creator or in the process of archival arrangement. In our case this is a register [1].
References
ISAD(G): General International Standard Archival Description, 2nd edn. (2000)
Ahonen, E., Hyvönen, E.: Publishing Historical Texts on the Semantic Web – A Case Study, pp. 167–173. IEEE (2009)
Bordes, A., Usunier, N., Garcia-Duran, A., Weston, J., Yakhnenko, O.: Translating Embeddings for Modeling Multi-relational Data, pp. 2787–2795 (2013)
Debruyne, C., Beyan, O.D., Grant, R., Collins, S., Decker, S., Harrower, N.: A semantic architecture for preserving and interpreting the information contained in Irish historical vital records. Int. J. Digit. Libr. 17(3), 159–174 (2016)
Efremova, J., Montes García, A., Calders, T.: Classification of historical notary acts with noisy labels. In: Hanbury, A., Kazai, G., Rauber, A., Fuhr, N. (eds.) ECIR 2015. LNCS, vol. 9022, pp. 49–54. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-16354-3_6
Efremova, J., García, A.M., Iriondo, A.B., Calders, T.: Who are my ancestors? Retrieving family relationships from historical texts. In: Braslavski, P., et al. (eds.) RuSSIR 2015. CCIS, vol. 573, pp. 121–129. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-41718-9_6
Efremova, J., Montes Garcia, A., Calders, T., Zhang, J.: Towards population reconstruction: extraction of family relationships from historical documents (2015)
Efremova, J., et al.: Multi-source entity resolution for genealogical data. In: Bloothooft, G., Christen, P., Mandemakers, K., Schraagen, M. (eds.) Population Reconstruction, pp. 129–154. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-19884-2_7
Ehrlinger, L., Wöß, W.: Towards a Definition of Knowledge Graphs (2016)
Ellul, C., Abela, C., Azzopardi, J.: Extracting Information from Medieval Notarial deeds, pp. 25–28. EKAW (2018)
Erdmann, A., et al.: Challenges and solutions for Latin named entity recognition. In: The COLING 2016 Organizing Committee, pp. 85–93 (2016)
Feeney, K.C., O’Sullivan, D., Tai, W., Brennan, R.: Improving curated web-data quality with structured harvesting and assessment. Int. J. Semant. Web Inf. Syst. 10(2), 35–62 (2014)
Fiorini, S.: Documentary Sources of Maltese History Part I Notarial Documents No 1 Notary Giacomo Zabbara. University of Malta, 1st edn. (1996)
Gonzalez, E.: Unsupervised Relation Extraction by Massive Clustering (2009)
Han, X., et al.: OpenKE: an open toolkit for knowledge embedding. In: Proceedings of EMNLP (2018)
Monti, M., et al.: Construction of enterprise knowledge graphs. In: Pan, J.Z., Vetere, G., Gomez-Perez, J.M., Wu, H. (eds.) Exploiting Linked Data and Knowledge Graphs in Large Organisations. Springer, Cham (2017). Chap. 8
Paulheim, H.: Knowledge graph refinement: a survey of approaches and evaluation methods. Semant. Web 8(3), 489–508 (2016)
Pawar, S., Palshikar, G., Bhattacharyya, P.: Relation Extraction: A Survey (2017)
Ruddock, B.: Linked data and the LOCAH project. Bus. Inf. Rev. 28(2), 105–111 (2011)
Siddiqui, T., Aalam, P.: Short text clustering; challenges & solutions: a literature review. Int. J. Math. Comput. Res. 3(6), 1025–1031 (2015)
Srinivas, V.: Link Prediction in Social Networks, 1st edn. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-28922-9
Villazon-Terrazas, B., Garcia-Santa, N., Ren, Y., Srinivas, K., Rodriguez-Muro, M., Alexopoulos, P., Pan, J.Z.: Construction of enterprise knowledge graphs (I). Exploiting Linked Data and Knowledge Graphs in Large Organisations, pp. 87–116. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-45654-6_4
Wang, Q., Mao, Z., Wang, B., Guo, L.: Knowledge graph embedding: a survey of approaches and applications. IEEE Trans. Knowl. Data Eng. 29(12), 2724–2743 (2017)
Winkler, W.: String comparator metrics and enhanced decision rules in the Fellegi-Sunter model of record linkage. In: Proceedings of the Section on Survey Research Methods (1990)
Yang, Y., Lichtenwalter, R.N., Chawla, N.V.: Evaluating link prediction methods. Knowl. Inf. Syst. 45(3), 751–782 (2014)
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Ellul, C., Azzopardi, J., Abela, C. (2019). NotaryPedia: A Knowledge Graph of Historical Notarial Manuscripts. In: Panetto, H., Debruyne, C., Hepp, M., Lewis, D., Ardagna, C., Meersman, R. (eds) On the Move to Meaningful Internet Systems: OTM 2019 Conferences. OTM 2019. Lecture Notes in Computer Science, vol 11877. Springer, Cham. https://doi.org/10.1007/978-3-030-33246-4_39
DOI: https://doi.org/10.1007/978-3-030-33246-4_39
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-33245-7
Online ISBN: 978-3-030-33246-4
eBook Packages: Computer Science, Computer Science (R0)