Mining Semantic Descriptions of Bioinformatics Web Resources from the Literature
Cite this paper as: Afzal H., Stevens R., Nenadic G. (2009) Mining Semantic Descriptions of Bioinformatics Web Resources from the Literature. In: Aroyo L. et al. (eds) The Semantic Web: Research and Applications. ESWC 2009. Lecture Notes in Computer Science, vol 5554. Springer, Berlin, Heidelberg
Several projects (e.g. myGrid, BioMOBY) have recently been initiated to organise emerging bioinformatics Web services and provide their semantic descriptions. They typically rely on manual curation. In this paper we focus on a semi-automated approach to mining semantic descriptions from the bioinformatics literature. The method combines terminological processing and dependency parsing of journal articles, and applies information extraction techniques to profile Web services using informative textual passages, related ontological annotations and service descriptors. Service descriptors are terminological phrases reflecting related concepts (e.g. tasks, approaches, data) and/or specific roles (e.g. input/output parameters) of the associated resource classes (e.g. algorithms, databases). They can facilitate subsequent manual description of services, and can also provide a semantic synopsis of a service that can be used to locate related services. We present a case study involving full-text articles from the BMC Bioinformatics journal. We illustrate the potential of natural language processing not only for mining descriptions of known services, but also for discovering new services that have been described in the literature.
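To make the notion of a service descriptor concrete, the following is a minimal sketch of how role-bearing phrases (input and output parameters) might be collected from sentences that mention a service. It is not the method of the paper: the paper relies on terminological processing and dependency parsing, whereas this sketch substitutes simple regular-expression cue patterns. The pattern set, the function name `extract_descriptors` and the service name `ExampleAligner` are all hypothetical and chosen only for illustration.

```python
import re
from collections import defaultdict

# Hypothetical cue patterns standing in for dependency-based extraction.
# Each pattern captures the phrase that fills a given role.
ROLE_PATTERNS = {
    "input": re.compile(
        r"\b(?:takes|accepts|requires)\s+(?:an?\s+|the\s+)?"
        r"(?P<term>[a-z][a-z\- ]*?)\s+as\s+input\b",
        re.I,
    ),
    "output": re.compile(
        r"\b(?:returns|produces|outputs)\s+(?:an?\s+|the\s+)?"
        r"(?P<term>[a-z][a-z\- ]*?)(?=[.,;]|\s+(?:for|from|in|as)\b|$)",
        re.I,
    ),
}


def extract_descriptors(service, sentences):
    """Collect role -> descriptor phrases from sentences mentioning `service`."""
    profile = defaultdict(list)
    for sentence in sentences:
        # Only sentences that name the service contribute to its profile.
        if service.lower() not in sentence.lower():
            continue
        for role, pattern in ROLE_PATTERNS.items():
            for match in pattern.finditer(sentence):
                term = match.group("term").strip().lower()
                if term and term not in profile[role]:
                    profile[role].append(term)
    return dict(profile)


if __name__ == "__main__":
    # Invented example sentences, not taken from any corpus.
    sentences = [
        "ExampleAligner takes a protein sequence as input and returns a multiple alignment.",
        "Other tools are described elsewhere.",
    ]
    print(extract_descriptors("ExampleAligner", sentences))
```

A real pipeline would replace the cue patterns with relations read off a dependency parse and would normalise the extracted phrases against a term list or ontology; the sketch only shows the shape of the resulting profile, a mapping from roles to descriptor phrases.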