Journal of Molecular Modeling, Volume 13, Issue 11, pp 1157–1167

Statistical analysis of physical-chemical properties and prediction of protein-protein interfaces

  • Surendra S. Negi
  • Werner Braun
Original Paper


We have developed a fully automated method, InterProSurf, to predict interacting amino acid residues on protein surfaces of monomeric 3D structures. Potential interacting residues are predicted based on solvent accessible surface areas, a new scale for interface propensities, and a cluster algorithm to locate surface exposed areas with high interface propensities. Previous studies have shown the importance of hydrophobic residues and specific charge distributions as characteristics of interfaces. Here we show differences between interface and surface regions in all physical-chemical properties of residues, as represented by five quantitative descriptors. In the current study, a set of 72 protein complexes with known 3D structures was analyzed to obtain interface propensities of residues and to find differences in the distribution of the five quantitative descriptors for amino acid residues. We also investigated spatial pair correlations of solvent accessible residues in interface and surface areas, and compared log-odds ratios for interface and surface areas. A new scoring method to predict potential functional sites on the protein surface was developed and tested on a new dataset of 21 protein complexes that were not included in the original training dataset. Empirically, we found that the algorithm achieves a good balance between precision and sensitivity when the top eight highest scoring clusters are selected as interface regions. The performance of the method is illustrated for a dimeric ATPase of the hyperthermophile Methanococcus jannaschii and the capsid protein of human hepatitis B virus. An automated version of the method can be accessed from our web server.
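To make the terms "interface propensity", "precision" and "sensitivity" concrete, the following minimal sketch shows one common way such quantities can be computed. It is an illustration under stated assumptions, not the InterProSurf implementation: the abstract does not give the propensity formula, so a simple count-based log-odds ratio is assumed here, whereas the method described above works with solvent accessible surface areas. The function names `interface_propensity` and `precision_sensitivity` are hypothetical.

```python
import math
from collections import Counter


def interface_propensity(interface_residues, surface_residues):
    """Count-based log-odds of a residue type being in an interface.

    ASSUMPTION: propensity = ln(fraction in interface / fraction on surface),
    using residue counts. The published scale may be area-weighted instead.
    """
    n_int = len(interface_residues)
    n_surf = len(surface_residues)
    int_counts = Counter(interface_residues)
    surf_counts = Counter(surface_residues)

    scores = {}
    for res, c_surf in surf_counts.items():
        c_int = int_counts.get(res, 0)
        if c_int == 0:
            continue  # log-odds undefined for types never seen in an interface
        scores[res] = math.log((c_int / n_int) / (c_surf / n_surf))
    return scores


def precision_sensitivity(predicted, observed):
    """Standard precision and sensitivity (recall) over sets of residue ids."""
    predicted, observed = set(predicted), set(observed)
    true_pos = len(predicted & observed)
    precision = true_pos / len(predicted) if predicted else 0.0
    sensitivity = true_pos / len(observed) if observed else 0.0
    return precision, sensitivity


if __name__ == "__main__":
    # Toy data (hypothetical): residue types on the whole surface and
    # the subset of them that lies in the observed interface.
    surface = ["LEU", "LEU", "LYS", "LYS", "LYS", "ASP"]
    interface = ["LEU", "LEU", "LYS"]
    print(interface_propensity(interface, surface))

    # Predicted vs. experimentally observed interface residue numbers.
    print(precision_sensitivity([1, 2, 3, 4], [3, 4, 5]))
```

In this sketch a positive score marks a residue type that is enriched in interfaces relative to the surface as a whole, and a negative score marks one that is depleted. Precision is the fraction of predicted residues that are truly in the interface, and sensitivity is the fraction of true interface residues that were predicted, which is the trade-off the choice of the number of top-scoring clusters has to balance.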


Comparison of the experimentally observed interface (blue) of the dimeric ATPase of the hyperthermophile Methanococcus jannaschii with the prediction of InterProSurf using a cluster (red) and patch (yellow) algorithm.


Keywords: Hot spots; Molecular recognition; Physical chemical properties of interface residues; Protein-protein interface



This work was supported by National Institutes of Health Grants R21 AI055746 and R01 AI064913.

Supplementary material

894_2007_237_MOESM1_ESM.doc (254 kb)



Copyright information

© Springer-Verlag 2007

Authors and Affiliations

  1. Department of Biochemistry and Molecular Biology, Sealy Center for Structural Biology and Molecular Biophysics, University of Texas Medical Branch, Galveston, USA
