Ontology-Driven Information Extraction from Research Publications

  • Vayianos Pertsas
  • Panos Constantopoulos
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11057)


Extraction of information from a research article, association with other sources and inference of new knowledge is a challenging task that has not yet been entirely addressed. We present Research Spotlight, a system that leverages existing information from DBpedia, retrieves articles from repositories, extracts and interrelates various kinds of named and non-named entities by exploiting article metadata, the structure of text, and syntactic, lexical and semantic constraints, and populates a knowledge base in the form of RDF triples. An ontology designed to represent scholarly practices drives the whole process. The system is evaluated through two experiments that measure overall accuracy in terms of token- and entity-based precision, recall and F1 scores, as well as entity boundary detection, with promising results.
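The difference between the token-based and entity-based scores mentioned above can be illustrated with a minimal sketch. It assumes that entities are represented as token spans `(start, end)` with an exclusive end, that an entity-level match requires identical boundaries, and that entity types are ignored. These conventions, the function names and the sample spans are illustrative assumptions; they are not taken from the paper's evaluation protocol.

```python
from typing import Set, Tuple

# A span is (start_token, end_token_exclusive); this encoding is an assumption.
Span = Tuple[int, int]


def prf(tp: int, n_pred: int, n_gold: int) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) from raw counts, guarding against zero division."""
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def entity_scores(gold: Set[Span], pred: Set[Span]) -> Tuple[float, float, float]:
    """Entity-based scores: a prediction counts only if its boundaries match exactly."""
    return prf(len(gold & pred), len(pred), len(gold))


def token_scores(gold: Set[Span], pred: Set[Span]) -> Tuple[float, float, float]:
    """Token-based scores: every token covered by a span is scored individually."""
    gold_tokens = {i for start, end in gold for i in range(start, end)}
    pred_tokens = {i for start, end in pred for i in range(start, end)}
    return prf(len(gold_tokens & pred_tokens), len(pred_tokens), len(gold_tokens))


# Hypothetical annotations: one exact match, one boundary error, one spurious span.
GOLD = {(0, 3), (5, 7)}
PRED = {(0, 3), (5, 6), (9, 10)}

if __name__ == "__main__":
    print("entity-based P/R/F1:", entity_scores(GOLD, PRED))
    print("token-based  P/R/F1:", token_scores(GOLD, PRED))
```

In this example the span `(5, 6)` misses one token of the gold entity `(5, 7)`. Under the exact-match assumption it earns no credit at entity level, yet it still contributes one correct token at token level, so the token-based scores come out higher. Comparing the two views is one way to expose boundary detection errors.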


Keywords: Information extraction from text; Ontology population; Linked data; Knowledge base creation



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Department of Informatics, Athens University of Economics and Business, Athens, Greece
  2. Digital Curation Unit, Athena Research Centre, Athens, Greece
