
Identification of Surface Residues Involved in Protein-Protein Interaction — A Support Vector Machine Approach

  • Conference paper
Intelligent Systems Design and Applications

Part of the book series: Advances in Soft Computing ((AINSC,volume 23))

Summary

We describe a machine learning approach for sequence-based prediction of protein-protein interaction sites. A support vector machine (SVM) classifier was trained to predict whether a surface residue is an interface residue (i.e., is located in the protein-protein interaction surface) based on the identity of the target residue and its 10 sequence neighbors. Separate classifiers were trained on proteins from two categories of complexes, antibody-antigen and protease-inhibitor. The effectiveness of each classifier was evaluated using leave-one-out (jack-knife) cross-validation. Interface and non-interface residues were classified with relatively high sensitivity (82.3% and 78.5%) and specificity (81.0% and 77.6%) for proteins in the antibody-antigen and protease-inhibitor complexes, respectively. The correlation coefficients between predicted and actual labels were 0.430 and 0.462, respectively, indicating that the method performs substantially better than chance (zero correlation). Combined with recently developed methods for identifying surface residues from sequence information, this offers a promising approach to predicting residues involved in protein-protein interaction from sequence information alone.
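The input representation and the reported metrics can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions that the summary does not state: that the 10 neighbors are 5 on each side of the target residue, that residues are one-hot encoded, that sensitivity and specificity follow the common definitions TP/(TP+FN) and TN/(TN+FP), and that the correlation is the Matthews correlation coefficient. All function names are hypothetical, and the SVM training step itself is omitted.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
HALF_WINDOW = 5  # assumption: 5 neighbors on each side (10 in total)


def encode_window(sequence, index, half_window=HALF_WINDOW):
    """One-hot encode the residue at `index` plus its sequence neighbors.

    Window positions that fall off either end of the sequence, and
    non-standard residue letters, are left as all-zero vectors.
    """
    features = []
    for pos in range(index - half_window, index + half_window + 1):
        one_hot = [0] * len(AMINO_ACIDS)
        if 0 <= pos < len(sequence):
            aa_index = AMINO_ACIDS.find(sequence[pos])
            if aa_index >= 0:
                one_hot[aa_index] = 1
        features.extend(one_hot)
    return features


def confusion_counts(actual, predicted):
    """Return (tp, tn, fp, fn) for binary labels, 1 = interface residue."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    return tp, tn, fp, fn


def sensitivity(tp, fn):
    """Fraction of true interface residues that were predicted as such."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def specificity(tn, fp):
    """Fraction of non-interface residues predicted as non-interface
    (assumed definition; the paper may define specificity differently)."""
    return tn / (tn + fp) if (tn + fp) else 0.0


def correlation(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0 corresponds to chance level."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

Under these assumptions each surface residue becomes a vector of 11 × 20 = 220 binary features, which a standard SVM implementation could take as input. In a leave-one-out evaluation, the predictions made for each held-out item would be pooled and passed to `confusion_counts` before computing the three metrics.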




Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Yan, C., Dobbs, D., Honavar, V. (2003). Identification of Surface Residues Involved in Protein-Protein Interaction — A Support Vector Machine Approach. In: Abraham, A., Franke, K., Köppen, M. (eds) Intelligent Systems Design and Applications. Advances in Soft Computing, vol 23. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-44999-7_6

  • DOI: https://doi.org/10.1007/978-3-540-44999-7_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40426-2

  • Online ISBN: 978-3-540-44999-7

  • eBook Packages: Springer Book Archive
