Use of Machine Learning Features to Detect Protein-Protein Interaction Sites at the Molecular Level

Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 340)

Abstract

Protein-protein interactions (PPIs) play pivotal roles in many biological processes, such as hormone-receptor binding, and their disruption can lead to inherited diseases. Predicting PPIs is therefore an important but challenging task. Machine learning has been found to be an appropriate tool for predicting PPIs. Features generated from a set of protein hetero-complex structures were found to be good predictors of PPIs. These features were used as training examples to develop PPI prediction tools based on Support Vector Machines (SVM) and Random Forests (RF). Among the important features, sequence-based features related to sequence conservation and structure-based features such as solvent accessibility had the greatest predictive capability, as measured by the area under the Receiver Operating Characteristic (ROC) curve (AUC). The RF-based predictor performed better than the SVM-based predictor on this training set.
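As a rough illustration of ranking individual features by AUC, the following minimal sketch computes the AUC of a single feature as the probability that a randomly chosen interface residue scores higher than a randomly chosen non-interface residue. It is not the author's pipeline. The function names, the feature names, and all numeric values are hypothetical and were made up for this example.

```python
from typing import Dict, List, Sequence, Tuple


def auc_score(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation of ROC AUC).
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def rank_features(
    features: Dict[str, List[float]], labels: Sequence[int]
) -> List[Tuple[str, float]]:
    """Return (name, AUC) pairs sorted from highest to lowest AUC."""
    scored = [(name, auc_score(values, labels)) for name, values in features.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy, made-up values for six residues (1 = interface, 0 = non-interface).
# These numbers are illustrative only and are not taken from the paper.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_features = {
    "conservation": [0.9, 0.8, 0.7, 0.4, 0.3, 0.2],
    "solvent_accessibility": [0.8, 0.6, 0.3, 0.5, 0.2, 0.1],
    "hydrophobicity": [0.5, 0.1, 0.6, 0.4, 0.7, 0.2],
}

ranking = rank_features(toy_features, toy_labels)
for name, auc in ranking:
    print(f"{name}: AUC = {auc:.2f}")
```

An AUC near 0.5 suggests that a feature alone does not separate the two classes, while a value well below 0.5 suggests that the feature may be informative with its sign reversed. The pairwise loop is quadratic in the number of residues, so a rank-based implementation would be preferable for realistic dataset sizes.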

Keywords

PPI, Machine learning, SVM, RF, Features, ROC

Notes

Acknowledgments

The author is grateful to the BIF Center, Department of Biochemistry and Biophysics, University of Kalyani, for providing a workstation to carry out the experiments. The author would also like to acknowledge the ongoing DST-PURSE program (2012–2015) for infrastructural support.

Copyright information

© Springer India 2015

Authors and Affiliations

  1. Department of Biochemistry and Biophysics, University of Kalyani, Kalyani, Nadia, India
