Prediction of Protein-Protein Interactions Using Local Description of Amino Acid Sequence
Abstract
Protein-protein interactions (PPIs) are essential to most biological processes. Although high-throughput technologies have generated a large amount of PPI data for a variety of organisms, the interactome is still far from complete. Many computational methods based on machine learning have therefore been used to predict PPIs. However, a major drawback of most existing methods is that they require prior information about the protein pairs, such as protein homology information. In this paper, we present an approach for PPI prediction that uses only protein sequence information. The approach is developed by combining a novel representation of local protein sequence descriptors with a support vector machine (SVM). Local descriptors account for the interactions between sequentially distant but spatially close amino acid residues, so the method can adequately capture multiple overlapping continuous and discontinuous binding patterns within a protein sequence.
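To make the idea of a sequence-only descriptor concrete, the following is a minimal sketch, not the exact procedure of this paper. It assumes a seven-class grouping of the amino acids and computes two common descriptor types (class composition and class-to-class transition frequency) for one sequence region; the grouping, the function names, and the concatenation of the two proteins' vectors are all illustrative assumptions. A local variant would apply `region_descriptor` to several sub-regions of each sequence and concatenate the results.

```python
from itertools import combinations

# ASSUMPTION: a seven-class grouping of the 20 amino acids; the abstract
# does not specify the grouping used by the authors.
GROUPS = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
CLASS_OF = {aa: i for i, group in enumerate(GROUPS) for aa in group}
PAIRS = list(combinations(range(len(GROUPS)), 2))  # 21 unordered class pairs


def encode(seq):
    """Map residues to class indices, skipping unknown symbols."""
    return [CLASS_OF[aa] for aa in seq.upper() if aa in CLASS_OF]


def composition(classes):
    """Fraction of residues falling in each of the seven classes."""
    n = len(classes)
    return [classes.count(c) / n if n else 0.0 for c in range(len(GROUPS))]


def transition(classes):
    """Frequency of adjacent residues that switch between two classes."""
    n = len(classes) - 1
    counts = {pair: 0 for pair in PAIRS}
    for a, b in zip(classes, classes[1:]):
        if a != b:
            counts[(min(a, b), max(a, b))] += 1
    return [counts[pair] / n if n > 0 else 0.0 for pair in PAIRS]


def region_descriptor(seq):
    """28 values for one region: 7 composition + 21 transition."""
    classes = encode(seq)
    return composition(classes) + transition(classes)


def pair_features(seq_a, seq_b):
    """Hypothetical pair encoding: concatenate both proteins' descriptors."""
    return region_descriptor(seq_a) + region_descriptor(seq_b)
```

Feature vectors built this way for known interacting and non-interacting pairs could then be passed to any SVM implementation, for example `sklearn.svm.SVC`, for training and prediction.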
Keywords
Protein-protein interactions, Protein sequence, Local descriptors, SVM