Performance Evaluation of Knowledge Extraction Methods

  • Juan M. Rodríguez
  • Hernán D. Merlino
  • Patricia Pesado
  • Ramón García-Martínez
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9799)


This paper reports the precision, recall, and F-measure of three knowledge extraction methods under the Open Information Extraction paradigm: ReVerb, OLLIE, and ClausIE. To obtain these three measures, a subset of 55 newswires was taken from the Reuters-21578 text categorization test collection. Relations were extracted manually from each of these newswires.
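As an illustration of the three measures named above, the following minimal sketch computes precision, recall, and F-measure for one document. It assumes that relations are represented as (argument, relation, argument) triples and that an extracted triple counts as correct only when it exactly matches a manually extracted one; the matching criterion actually used in the paper is not stated here, and the function name and sample triples are hypothetical.

```python
def precision_recall_f1(extracted, reference):
    """Compute precision, recall, and F-measure for one set of extractions.

    Assumption: a triple is correct only if it exactly matches a triple
    in the manually built reference set.
    """
    extracted = set(extracted)
    reference = set(reference)
    true_positives = len(extracted & reference)

    # Precision: fraction of extracted triples that are correct.
    precision = true_positives / len(extracted) if extracted else 0.0
    # Recall: fraction of reference triples that were found.
    recall = true_positives / len(reference) if reference else 0.0
    # F-measure: harmonic mean of precision and recall.
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical triples, not taken from the Reuters-21578 subset.
reference = [
    ("company", "acquired", "rival"),
    ("bank", "raised", "rates"),
]
extracted = [
    ("company", "acquired", "rival"),
    ("bank", "lowered", "rates"),
]

p, r, f = precision_recall_f1(extracted, reference)
print(p, r, f)
```

In this sketch one of the two extracted triples matches the reference set, so precision, recall, and F-measure all equal 0.5. A looser matching rule (for example, partial overlap of arguments) would change the counts but not the formulas.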



The research reported in this paper was partially funded by Projects UNLa-33A205 and UNLa-33B177 of the National University of Lanús (Argentina). The authors wish to thank the senior students in our courses in the Information Engineering Bachelor's Degree at the Engineering School, University of Buenos Aires, for their help during the experiment.



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Juan M. Rodríguez (1, 2, 3)
  • Hernán D. Merlino (2, 3)
  • Patricia Pesado (4)
  • Ramón García-Martínez (3)

  1. PhD Program on Computer Science, National University of La Plata, La Plata, Argentina
  2. Intelligent Systems Group, University of Buenos Aires, Buenos Aires, Argentina
  3. Information Systems Research Group, National University of Lanús, Lanús, Argentina
  4. III-LIDI, Computer Science School, National University of La Plata – CIC Bs As, La Plata, Argentina
