Semantic Web Evaluation Challenge

Semantic Web Evaluation Challenges pp 93-104

Extracting Contextual Information from Scientific Literature Using CERMINE System

Conference paper

DOI: 10.1007/978-3-319-25518-7_8

Part of the Communications in Computer and Information Science book series (CCIS, volume 548)
Cite this paper as:
Tkaczyk D., Bolikowski Ł. (2015) Extracting Contextual Information from Scientific Literature Using CERMINE System. In: Gandon F., Cabrio E., Stankovic M., Zimmermann A. (eds) Semantic Web Evaluation Challenges. Communications in Computer and Information Science, vol 548. Springer, Cham

Abstract

CERMINE is a comprehensive open-source system for extracting structured metadata and references from born-digital scientific literature. Among other things, the system extracts information about the context in which an article was written, such as the authors and their affiliations, the relations between them, and references to other articles. The extracted information is presented in a structured, machine-readable form. CERMINE is based on a modular workflow whose loosely coupled architecture allows individual components to be evaluated and adjusted, makes it easy to improve or replace independent parts of the algorithm, and facilitates future expansion of the architecture. The workflow is implemented mostly with supervised and unsupervised machine-learning techniques, which simplifies adapting the system to new document layouts and styles. In this paper we outline the overall workflow architecture, describe key aspects of the system implementation, provide details about the training and adjustment of individual algorithms, and finally report how CERMINE was used to extract contextual information from scientific articles in PDF format in the context of the ESWC 2015 Semantic Publishing Challenge. The CERMINE system is available under an open-source licence and can be accessed at http://cermine.ceon.pl.
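
The idea of a modular, loosely coupled workflow described in the abstract can be illustrated with a minimal Python sketch. It is not CERMINE's actual code or API: the `Workflow` class, the step names, and the toy title heuristics are all hypothetical and exist only to show how one step might be swapped without touching the others.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical types and names throughout; this is not CERMINE's API.
Document = Dict[str, object]
Step = Callable[[Document], Document]


@dataclass
class Workflow:
    """An ordered list of named steps, each replaceable on its own."""
    steps: List[Tuple[str, Step]] = field(default_factory=list)

    def add(self, name: str, step: Step) -> "Workflow":
        self.steps.append((name, step))
        return self

    def replace(self, name: str, step: Step) -> None:
        # Swap one step in place; the other steps are left untouched.
        for i, (existing, _) in enumerate(self.steps):
            if existing == name:
                self.steps[i] = (name, step)
                return
        raise KeyError(name)

    def run(self, doc: Document) -> Document:
        # Each step receives the output of the previous one.
        for _, step in self.steps:
            doc = step(doc)
        return doc


def split_lines(doc: Document) -> Document:
    # Toy "segmentation": keep the non-empty lines of the raw text.
    lines = [ln.strip() for ln in str(doc["text"]).splitlines() if ln.strip()]
    return {**doc, "lines": lines}


def first_line_title(doc: Document) -> Document:
    # Toy heuristic: treat the first line as the title.
    return {**doc, "title": doc["lines"][0]}


def longest_line_title(doc: Document) -> Document:
    # Alternative toy heuristic: treat the longest line as the title.
    return {**doc, "title": max(doc["lines"], key=len)}


sample = {"text": "Short header\nA much longer candidate title line\n"}
workflow = Workflow().add("segment", split_lines).add("title", first_line_title)
result = workflow.run(sample)
```

Calling `workflow.replace("title", longest_line_title)` and running the workflow again changes only the title step, which is the kind of independent adjustment that a loosely coupled design is meant to allow.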

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Interdisciplinary Centre for Mathematical and Computational Modelling, University of Warsaw, Warsaw, Poland
