Extracting Protein-Protein Interaction Sentences by Applying Rough Set Data Analysis

  • Filip Ginter
  • Tapio Pahikkala
  • Sampo Pyysalo
  • Jorma Boberg
  • Jouni Järvinen
  • Tapio Salakoski
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3066)

Abstract

In this paper, we introduce a way to apply rough set data analysis to the problem of extracting protein-protein interaction sentences from biomedical literature. Our approach builds on decision rules based on protein names, interaction words, and their mutual positions in sentences. To broaden the set of potential interaction words, we develop a morphological model that generates spelling and inflection variants of the interaction words. We evaluate the proposed method on a hand-tagged dataset of 1894 sentences and show a precision-recall break-even performance of 79.8% using leave-one-out cross-validation.
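To make the notion of a decision rule more concrete, the following minimal sketch computes the strength, certainty, and coverage factors that rough set data analysis commonly attaches to a rule "condition → decision". It is not the authors' implementation: the function name `rule_factors`, the single condition attribute (the position of an interaction word relative to two protein names), and the toy table are assumptions made purely for illustration.

```python
from collections import Counter

def rule_factors(rows):
    """Compute rule factors from a decision table.

    rows: list of (condition_tuple, decision) pairs, one per object.
    Returns a dict mapping each observed rule (condition, decision)
    to its strength, certainty, and coverage factors.
    """
    n = len(rows)
    support = Counter(rows)                    # objects matching condition AND decision
    cond_count = Counter(c for c, _ in rows)   # objects matching the condition
    dec_count = Counter(d for _, d in rows)    # objects with the decision value
    return {
        (c, d): {
            "strength": s / n,
            "certainty": s / cond_count[c],
            "coverage": s / dec_count[d],
        }
        for (c, d), s in support.items()
    }

# Hypothetical toy table: one condition attribute per sentence,
# and a decision saying whether the sentence describes an interaction.
table = [
    (("between",), "interaction"),
    (("between",), "interaction"),
    (("between",), "none"),
    (("after",), "none"),
]
factors = rule_factors(table)
```

In this toy table the rule "between → interaction" is supported by two of the three sentences matching its condition, and it accounts for both sentences labelled as interactions. A rule with high certainty but low coverage is reliable yet explains few of the positive sentences, which is why both factors are usually inspected together.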

Keywords

Decision Rule; Coverage Factor; Decision Table; Decision Attribute; Morphological Model
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Filip Ginter (1, 2)
  • Tapio Pahikkala (1, 2)
  • Sampo Pyysalo (1, 2)
  • Jorma Boberg (1, 2)
  • Jouni Järvinen (1, 2)
  • Tapio Salakoski (1, 2)
  1. Turku Centre for Computer Science (TUCS), Finland
  2. Department of Information Technology, University of Turku, Turku, Finland
