Supporting Description of Research Data: Evaluation and Comparison of Term and Concept Extraction Approaches

  • Cláudio Monteiro
  • Carla Teixeira Lopes
  • João Rocha Silva
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11057)

Abstract

The importance of research data management is widely recognized. Dendro is an ontology-based platform that allows researchers to describe datasets using generic and domain-specific descriptors from ontologies. Selecting or building the right ontologies for each research domain or group requires meetings between curators and researchers to capture the main concepts of their research. Envisioning a tool that assists curators by automatically extracting key concepts from research documents, we propose two concept extraction methods and compare them with a term extraction method. To compare the three approaches, we use as ground truth an ontology previously created by human curators.
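The comparison against a curator-built ontology can be pictured as a set-overlap evaluation. The following is a minimal sketch only: the abstract does not state which metrics the authors used, so precision, recall, and F1 are assumptions here, and every function name, method name, and term in the example data is hypothetical.

```python
def normalize(term: str) -> str:
    """Lowercase and collapse whitespace so trivially different spellings match."""
    return " ".join(term.lower().split())


def precision_recall_f1(extracted, ground_truth):
    """Score extracted terms against reference concept labels (assumed metrics)."""
    predicted = {normalize(t) for t in extracted}
    reference = {normalize(t) for t in ground_truth}
    hits = predicted & reference
    precision = len(hits) / len(predicted) if predicted else 0.0
    recall = len(hits) / len(reference) if reference else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


# Hypothetical labels standing in for a curator-built ontology.
ontology_labels = ["Vehicle mass", "Drag coefficient", "Tire pressure", "Road gradient"]

# Hypothetical outputs of two extraction approaches.
method_outputs = {
    "term_extraction": ["vehicle mass", "simulation run", "tire  pressure"],
    "concept_extraction_a": ["Drag coefficient", "vehicle mass", "road gradient", "engine"],
}

scores = {
    name: precision_recall_f1(terms, ontology_labels)
    for name, terms in method_outputs.items()
}

for name, (p, r, f) in scores.items():
    print(f"{name}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Exact matching after normalization is the simplest possible choice; a real evaluation might also need stemming or partial matching of multi-word terms, which this sketch does not attempt.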

Keywords

Term extraction, Ontology learning, Research data management

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Faculty of Engineering, University of Porto, Porto, Portugal
  2. INESC TEC, Faculty of Engineering, University of Porto, Porto, Portugal