Mining Semantic Descriptions of Bioinformatics Web Resources from the Literature

  • Hammad Afzal
  • Robert Stevens
  • Goran Nenadic
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5554)


A number of projects (myGrid, BioMOBY, etc.) have recently been initiated to organise emerging bioinformatics Web services and provide their semantic descriptions. These projects typically rely on manual curation. In this paper we focus on a semi-automated approach to mining semantic descriptions from the bioinformatics literature. The method combines terminological processing and dependency parsing of journal articles, and applies information extraction techniques to profile Web services using informative textual passages, related ontological annotations and service descriptors. Service descriptors are terminological phrases reflecting related concepts (e.g. tasks, approaches, data) and/or specific roles (e.g. input/output parameters) of the associated resource classes (e.g. algorithms, databases). They can facilitate subsequent manual description of services, and they also provide a semantic synopsis of a service that can be used to locate related services. We present a case study involving full-text articles from the BMC Bioinformatics journal. We illustrate the potential of natural language processing not only for mining descriptions of known services, but also for discovering new services that have been described in the literature.
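To make the notion of a role-bearing service descriptor concrete, the following is a minimal sketch and not the authors' pipeline. It swaps dependency parsing for two hand-written regular expressions, and every name in it (`ROLE_PATTERNS`, `extract_descriptors`, the trigger verbs, the example sentences) is a hypothetical chosen for illustration. It only shows how phrases in input and output roles might be collected into a small service profile.

```python
import re
from collections import defaultdict

# Hypothetical lexical patterns standing in for dependency-based rules.
# Each regex has one capture group holding the candidate descriptor phrase.
ROLE_PATTERNS = {
    "input": re.compile(
        r"\b(?:takes|accepts|requires)\s+(?:an?\s+|the\s+)?"
        r"([a-z][a-z \-]*?)\s+as\s+input\b",
        re.I,
    ),
    "output": re.compile(
        r"\b(?:returns|produces|outputs)\s+(?:an?\s+|the\s+)?"
        r"([a-z][a-z \-]*?)(?=[.,;]|$)",
        re.I,
    ),
}


def extract_descriptors(sentences):
    """Collect candidate descriptors per role from a list of sentences."""
    profile = defaultdict(list)
    for sentence in sentences:
        for role, pattern in ROLE_PATTERNS.items():
            for match in pattern.finditer(sentence):
                phrase = match.group(1).strip().lower()
                # Keep each phrase once per role, in order of first appearance.
                if phrase and phrase not in profile[role]:
                    profile[role].append(phrase)
    return dict(profile)


# Invented example sentences, not taken from the paper's corpus.
sentences = [
    "The service takes a protein sequence as input.",
    "It returns a multiple sequence alignment.",
]
profile = extract_descriptors(sentences)
print(profile)
```

A real system along the lines described in the abstract would replace the regular expressions with rules over dependency relations and would match phrases against a domain terminology; the sketch only illustrates the shape of the resulting profile.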


Keywords: Service Descriptor; Semantic Description; Semantic Class; Semantic Synopsis; Dependency Parsing



Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Hammad Afzal (1)
  • Robert Stevens (1)
  • Goran Nenadic (1)
  1. School of Computer Science, University of Manchester, Manchester, UK
