Automatic Identification of Relations in Quebec Heritage Data

  • François Ferry
  • Amal Zouaq
  • Michel Gagnon
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11196)


Heritage data is often represented in unstructured formats, especially as text. In this paper, our objective is to extract instances of predefined relations between persons and real estate properties from historical notices in French. Using several vector-based representations and supervised learning algorithms, we build classifiers that achieve an F-measure between 75% and 85% for relation detection. Our results show that performance is highly dependent on the type of relation and on the evaluation metric used. Our best results are obtained using a TF-IDF vector representation with a support vector machine classifier, or Word2Vec vectors combined with a multilayer perceptron classifier.
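Since the reported scores are F-measures and vary by relation type, a minimal sketch of a per-relation precision, recall and F1 computation may help make the metric concrete. The label names (`owner`, `architect`, `none`) and the toy predictions are invented for illustration and are not taken from the paper's dataset; the function name `precision_recall_f1` is likewise hypothetical.

```python
from collections import Counter


def precision_recall_f1(gold, predicted, positive):
    """Precision, recall and F1 for one relation label treated as the positive class."""
    counts = Counter()
    for g, p in zip(gold, predicted):
        if p == positive and g == positive:
            counts["tp"] += 1  # correctly detected relation
        elif p == positive:
            counts["fp"] += 1  # relation predicted but not present
        elif g == positive:
            counts["fn"] += 1  # relation present but missed
    tp, fp, fn = counts["tp"], counts["fp"], counts["fn"]
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Toy labels, assumed for illustration only (not the paper's data).
gold = ["owner", "owner", "architect", "none", "architect", "owner"]
pred = ["owner", "none", "architect", "none", "owner", "owner"]

for label in ("owner", "architect"):
    p, r, f = precision_recall_f1(gold, pred, label)
    print(f"{label}: P={p:.2f} R={r:.2f} F1={f:.2f}")
```

Scoring each relation separately, as here, is one way to see how results can differ across relation types even when the overall F-measure looks stable.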


Relation extraction, Heritage data, Supervised learning, Word2Vec, TF-IDF



This work has been funded by the Quebec Ministry of Culture and Communication.



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Ecole Polytechnique de Montréal, Montreal, Canada
  2. University of Ottawa, Ottawa, Canada
