Identification of Surface Residues Involved in Protein-Protein Interaction — A Support Vector Machine Approach

Yan, Changhui; Dobbs, Drena; Honavar, Vasant

doi:10.1007/978-3-540-44999-7_6

Part of the book series: Advances in Soft Computing ((AINSC,volume 23))

Summary

We describe a machine learning approach for sequence-based prediction of protein-protein interaction sites. A support vector machine (SVM) classifier was trained to predict whether a surface residue is an interface residue (i.e., is located in the protein-protein interaction surface) based on the identity of the target residue and its 10 sequence neighbors. Separate classifiers were trained on proteins from two categories of complexes, antibody-antigen and protease-inhibitor. The effectiveness of each classifier was evaluated using leave-one-out (jackknife) cross-validation. Interface and non-interface residues were classified with relatively high sensitivity (82.3% and 78.5%) and specificity (81.0% and 77.6%) for proteins in the antibody-antigen and protease-inhibitor complexes, respectively. The correlation between predicted and actual labels was 0.430 and 0.462, indicating that the method performs substantially better than chance (zero correlation). Combined with recently developed methods for identifying surface residues from sequence information, this offers a promising approach to predicting residues involved in protein-protein interaction from sequence information alone.
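
The input representation and evaluation measures mentioned in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions that the summary does not state: that the 10 sequence neighbors are 5 on each side of the target residue, that each residue is one-hot encoded over the 20 standard amino acids, that positions beyond a chain terminus are zero-padded, that sensitivity and specificity follow the standard definitions TP/(TP+FN) and TN/(TN+FP), and that the reported correlation is the Matthews correlation coefficient. The names `encode_window` and `evaluate` are hypothetical.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
HALF_WINDOW = 5  # assumption: 5 neighbors on each side of the target


def encode_window(sequence, position, half_window=HALF_WINDOW):
    """One-hot encode a target residue and its sequence neighbors.

    Positions that fall outside the sequence, and residues that are not
    among the 20 standard amino acids, are encoded as all-zero vectors
    (a padding choice made for this sketch only).
    """
    features = []
    for i in range(position - half_window, position + half_window + 1):
        one_hot = [0] * len(AMINO_ACIDS)
        if 0 <= i < len(sequence):
            idx = AMINO_ACIDS.find(sequence[i])
            if idx >= 0:
                one_hot[idx] = 1
        features.extend(one_hot)
    return features


def evaluate(actual, predicted):
    """Return (sensitivity, specificity, correlation) for 0/1 labels.

    Label 1 means interface residue, label 0 means non-interface residue.
    The correlation is computed as the Matthews correlation coefficient.
    """
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)

    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    correlation = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, correlation
```

Under these assumptions each surface residue becomes a 220-dimensional binary vector (11 positions times 20 residue types) that could be passed to any SVM implementation. In a leave-one-out evaluation, the predictions collected across held-out cases would then be scored with `evaluate`; a correlation of zero corresponds to chance-level prediction.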

Author information

Authors and Affiliations

Artificial Intelligence Research Laboratory, Iowa State University, Atanasoff Hall 226, Ames, IA 50011-1040, USA
Changhui Yan & Vasant Honavar
Department of Computer Science, Iowa State University, Atanasoff Hall 226, Ames, IA 50011-1040, USA
Changhui Yan & Vasant Honavar
Department of Genetics, Development and Cell Biology, Iowa State University, Atanasoff Hall 226, Ames, IA 50011-1040, USA
Drena Dobbs
Laurence H Baker Center for Bioinformatics and Biological Statistics, Iowa State University, Atanasoff Hall 226, Ames, IA 50011-1040, USA
Drena Dobbs & Vasant Honavar
Bioinformatics and Computational Biology Graduate Program, Iowa State University, Atanasoff Hall 226, Ames, IA 50011-1040, USA
Changhui Yan, Drena Dobbs & Vasant Honavar

Editor information

Editors and Affiliations

Oklahoma State University, Tulsa, OK, USA
Ajith Abraham
Fraunhofer Institute for Production Systems and Design Technology, Berlin, Germany
Katrin Franke & Mario Köppen

Cite this paper

Yan, C., Dobbs, D., Honavar, V. (2003). Identification of Surface Residues Involved in Protein-Protein Interaction — A Support Vector Machine Approach. In: Abraham, A., Franke, K., Köppen, M. (eds) Intelligent Systems Design and Applications. Advances in Soft Computing, vol 23. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-44999-7_6

DOI: https://doi.org/10.1007/978-3-540-44999-7_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40426-2
Online ISBN: 978-3-540-44999-7
eBook Packages: Springer Book Archive
