A Framework for Automatic Population of Ontology-Based Digital Libraries

  • Laura Pandolfo
  • Luca Pulina
  • Giovanni Adorni
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10037)


Keeping ontology-based digital libraries up to date raises two main issues. First, documents are often unstructured and stored in heterogeneous data formats, which makes it difficult to extract information from them and to search them. Second, manual ontology population is time-consuming, so automatic methods to support this process are needed.

In this paper, we present an ontology-based framework for ontology population. In particular, we propose an approach to triplet extraction from heterogeneous, unstructured documents that is used to automatically populate ontology-based digital libraries. Finally, we evaluate the proposed framework on a real-world case study.


Keywords: Ontology population, Ontology-based digital library, Information extraction



The authors would like to thank Dr. Anastasia Di Nunzio for helpful discussions on linguistic typology studies.



Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  1. DIBRIS, Università di Genova, Genova, Italy
  2. Polcoming, Università di Sassari, Sassari, Italy
