Computational Prediction of Protein-Protein Interactions

  • Tobias Ehrenberger
  • Lewis C. Cantley
  • Michael B. Yaffe
Part of the Methods in Molecular Biology book series (MIMB, volume 1278)


Predicting protein-protein interactions and kinase-specific phosphorylation sites on individual proteins is critical for correctly placing proteins within signaling pathways and networks. This type of annotation grows in importance as genomic and proteomic data continue to expand rapidly, particularly with emerging data that categorize posttranslational modifications on a large scale. A variety of computational tools are available for this purpose. In this chapter, we review the general methodologies for these computational predictions and present a detailed, user-focused tutorial on one such tool, Scansite, which is freely available online to the entire scientific community.

Key words

Scansite; Protein-protein interaction prediction; Sequence motif; PSSM; Binding motif; Phosphorylation sites; Bioinformatics



Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  • Tobias Ehrenberger (1, 2)
  • Lewis C. Cantley (3, 4, 6)
  • Michael B. Yaffe (1, 2, 5, 7)

  1. Department of Biology, Massachusetts Institute of Technology, Cambridge, USA
  2. Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, USA
  3. Division of Signal Transduction, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, USA
  4. Department of Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, USA
  5. Department of Surgery, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, USA
  6. Weill Cornell Cancer Center, Weill Cornell Medical College, New York, USA
  7. Koch Institute for Integrative Cancer Biology, Massachusetts Institute of Technology, Cambridge, USA
