Supporting Description of Research Data: Evaluation and Comparison of Term and Concept Extraction Approaches

  • Cláudio Monteiro
  • Carla Teixeira Lopes
  • João Rocha Silva
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11057)


The importance of research data management is widely recognized. Dendro is an ontology-based platform that allows researchers to describe datasets using generic and domain-specific descriptors from ontologies. Selecting or building the right ontologies for each research domain or group requires meetings between curators and researchers to capture the main concepts of their research. Envisioning a tool that assists curators by automatically extracting key concepts from research documents, we propose two concept extraction methods and compare them with a term extraction method. To compare the three approaches, we use as ground truth an ontology previously created by human curators.


Term extraction, Ontology learning, Research data management



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Faculty of Engineering, University of Porto, Porto, Portugal
  2. INESC TEC, Faculty of Engineering, University of Porto, Porto, Portugal
