Exploring Protein Functional Relationships Using Genomic Information and Data Mining Techniques
An approach that uses both supervised and unsupervised learning methods for exploring protein functional relationships is reported; we refer to it as the Maximum Contrast (MC) tree. The tree is constructed by a hierarchical decomposition of the feature space; this step is performed regardless of the complex nature of protein functions, i.e. the decomposition is carried out without knowledge of the protein functional class labels. To test our algorithm, we constructed a library of protein phylogenetic profiles for the proteins of the yeast Saccharomyces cerevisiae using 60 species. The results show that our algorithm compares favorably with other classification algorithms, such as the decision tree algorithms C4.5 and C5 and support vector machines.
Keywords: Support Vector Machine, Feature Space, Leaf Node, Class Label, Test Instance
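The abstract does not give the details of the MC tree, so the following is only a minimal sketch of the general idea it describes: a tree is grown over binary phylogenetic profiles without looking at class labels, and labels are used only afterwards to name the leaf nodes. The splitting rule (choose the most balanced presence/absence split), the function names `build_tree`, `label_leaves` and `predict`, and the toy profiles and functional classes are all assumptions made for illustration; they are not the authors' algorithm or data.

```python
from collections import Counter


def build_tree(profiles, ids, depth=0, max_depth=2, min_size=1):
    """Recursively split proteins by presence/absence in one genome.

    Class labels are never consulted here (unsupervised step).
    The balance-based splitting rule is an illustrative assumption.
    """
    n = len(ids)
    if depth >= max_depth or n < 2 * min_size:
        return {"leaf": ids}
    n_features = len(profiles[ids[0]])
    best, best_score = None, 0
    for f in range(n_features):
        ones = sum(profiles[i][f] for i in ids)
        score = min(ones, n - ones)  # size of the smaller side of the split
        if score > best_score:
            best, best_score = f, score
    if best is None or best_score < min_size:
        return {"leaf": ids}
    absent = [i for i in ids if profiles[i][best] == 0]
    present = [i for i in ids if profiles[i][best] == 1]
    return {
        "feature": best,
        "absent": build_tree(profiles, absent, depth + 1, max_depth, min_size),
        "present": build_tree(profiles, present, depth + 1, max_depth, min_size),
    }


def label_leaves(tree, labels):
    """Supervised step: give each leaf the majority class of its members."""
    if "leaf" in tree:
        counts = Counter(labels[i] for i in tree["leaf"])
        tree["label"] = counts.most_common(1)[0][0]
        return tree
    label_leaves(tree["absent"], labels)
    label_leaves(tree["present"], labels)
    return tree


def predict(tree, profile):
    """Route a test instance down the tree and return its leaf label."""
    while "leaf" not in tree:
        tree = tree["present"] if profile[tree["feature"]] else tree["absent"]
    return tree["label"]


# Hypothetical toy data: 6 proteins, presence (1) or absence (0) in 4 genomes.
profiles = {
    "p1": (1, 1, 0, 0),
    "p2": (1, 1, 0, 1),
    "p3": (1, 0, 0, 0),
    "p4": (0, 0, 1, 1),
    "p5": (0, 0, 1, 0),
    "p6": (0, 1, 1, 1),
}
labels = {
    "p1": "metabolism", "p2": "metabolism", "p3": "metabolism",
    "p4": "transport", "p5": "transport", "p6": "transport",
}

tree = label_leaves(build_tree(profiles, list(profiles)), labels)
print(predict(tree, (1, 0, 0, 1)))
```

Keeping the decomposition and the labelling in separate functions mirrors the separation stated in the abstract: the tree structure depends only on the profiles, so the same tree could be relabelled for a different functional classification without being rebuilt.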