Use of Machine Learning Features to Detect Protein-Protein Interaction Sites at the Molecular Level
Protein-protein interactions (PPIs) play pivotal roles in many biological processes, such as hormone-receptor binding, and their disruption can lead to inherited diseases. Predicting PPIs is therefore an important but challenging task, and machine learning has proved to be an appropriate tool for it. Machine learning features generated from a set of protein hetero-complex structures were found to be good predictors of PPIs. These features were used as training examples to develop PPI prediction tools based on Support Vector Machines (SVM) and Random Forests (RF). Among the important features, sequence-based features related to sequence conservation and structure-based features such as solvent accessibility had the greatest predictive capability, as measured by the area under the Receiver Operating Characteristic (ROC) curve (AUC). For this training set, the RF-based predictor performed better than the SVM-based predictor.
Keywords: PPI, Machine learning, SVM, RF, Features, ROC
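The abstract ranks individual features by the AUC of their ROC curves. A minimal sketch of that idea follows, assuming binary labels (1 for an interface residue, 0 otherwise) and that a higher feature value indicates the positive class. The function names `roc_auc` and `rank_features`, the feature names, and all numeric values are hypothetical toy inputs for illustration; they are not the author's data, results, or implementation.

```python
from typing import Dict, List, Sequence, Tuple


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC computed as the fraction of (positive, negative) pairs in which
    the positive example has the higher score; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def rank_features(
    features: Dict[str, Sequence[float]], labels: Sequence[int]
) -> List[Tuple[str, float]]:
    """Score each feature on its own and sort by AUC, best first."""
    scored = [(name, roc_auc(values, labels)) for name, values in features.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy labels and feature values (invented, not taken from the paper).
labels = [1, 1, 1, 0, 0, 0]
features = {
    "conservation_score": [0.9, 0.8, 0.4, 0.5, 0.2, 0.1],
    "relative_accessibility": [0.7, 0.6, 0.9, 0.3, 0.2, 0.1],
    "noise": [0.5, 0.1, 0.9, 0.6, 0.2, 0.8],
}

for name, auc in rank_features(features, labels):
    print(f"{name}: AUC = {auc:.3f}")
```

An AUC near 0.5 means the feature separates the two classes no better than chance, while values approaching 1.0 indicate strong separation. The same metric can be applied to the output scores of a trained SVM or RF classifier to compare the two predictors.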
The author is grateful to the BIF Center, Department of Biochemistry and Biophysics, University of Kalyani, for providing the workstation used to carry out the experiments, and acknowledges the ongoing DST-PURSE program (2012–2015) for infrastructural support.