Extracting Protein-Protein Interaction Sentences by Applying Rough Set Data Analysis
In this paper, we introduce a way to apply rough set data analysis to the problem of extracting protein-protein interaction sentences from biomedical literature. Our approach builds on decision rules over protein names, interaction words, and their mutual positions in sentences. To broaden the set of potential interaction words, we develop a morphological model that generates spelling and inflection variants of the interaction words. We evaluate the proposed method on a hand-tagged dataset of 1894 sentences and obtain a precision-recall break-even point of 79.8% using leave-one-out cross-validation.
Keywords: Decision Rule, Coverage Factor, Decision Table, Decision Attribute, Morphological Model
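As a rough illustration of the decision-rule idea in the abstract, the sketch below builds a toy decision table and computes the certainty and coverage factors of each rule, following the standard rough set definitions (certainty = support of the rule divided by support of its condition; coverage = support of the rule divided by support of its decision). The attribute, its values, the rows, and the function name `rule_factors` are all hypothetical and are not taken from the paper; the actual condition attributes used by the authors may differ.

```python
from collections import Counter

# Toy decision table (hypothetical data, not from the paper).
# Condition attribute: position of the interaction word relative to
# the two protein names. Decision attribute: is it an interaction sentence?
ROWS = [
    ("between", "yes"),
    ("between", "yes"),
    ("between", "no"),
    ("before", "yes"),
    ("before", "no"),
    ("after", "no"),
    ("none", "no"),
    ("none", "no"),
]


def rule_factors(rows):
    """Return {(condition, decision): (certainty, coverage)} for each rule."""
    cond_counts = Counter(cond for cond, _ in rows)
    dec_counts = Counter(dec for _, dec in rows)
    pair_counts = Counter(rows)
    factors = {}
    for (cond, dec), support in pair_counts.items():
        certainty = support / cond_counts[cond]  # how reliable the rule is
        coverage = support / dec_counts[dec]     # how much of the decision class it explains
        factors[(cond, dec)] = (certainty, coverage)
    return factors


if __name__ == "__main__":
    for (cond, dec), (cer, cov) in sorted(rule_factors(ROWS).items()):
        print(f"position={cond} -> interaction={dec}: "
              f"certainty={cer:.2f}, coverage={cov:.2f}")
```

A classifier in this spirit could rank a sentence by the certainty of the rules its attributes match, and sweeping a threshold over that score is one way to trace out a precision-recall curve; whether the paper does exactly this is not stated in the abstract.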