Summary
We describe a machine learning approach for sequence-based prediction of protein-protein interaction sites. A support vector machine (SVM) classifier was trained to predict whether or not a surface residue is an interface residue (i.e., is located in the protein-protein interaction surface) based on the identity of the target residue and its 10 sequence neighbors. Separate classifiers were trained on proteins from two categories of complexes, antibody-antigen and protease-inhibitor. The effectiveness of each classifier was evaluated using leave-one-out (jack-knife) cross-validation. Interface and non-interface residues were classified with relatively high sensitivity (82.3% and 78.5%) and specificity (81.0% and 77.6%) for proteins in the antibody-antigen and protease-inhibitor complexes, respectively. The correlation coefficients between predicted and actual labels were 0.430 and 0.462, indicating that the method performs substantially better than chance (zero correlation). Combined with recently developed methods for identification of surface residues from sequence information, this offers a promising approach to prediction of residues involved in protein-protein interaction from sequence information alone.
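The input representation and the reported evaluation measures can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions because the summary does not specify them: the 10 sequence neighbors are taken to be 5 on each side of the target residue, each residue is one-hot encoded over the 20 standard amino acids, sensitivity and specificity follow their conventional definitions (the chapter may define them differently), and the correlation is computed as the Matthews correlation coefficient. The names `encode_window` and `evaluate` are hypothetical.

```python
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
HALF_WINDOW = 5  # assumption: 10 neighbors = 5 on each side of the target


def encode_window(sequence, position, half_window=HALF_WINDOW):
    """One-hot encode the target residue and its sequence neighbors.

    Window slots that fall off either end of the sequence, or that hold a
    non-standard residue, are left as all zeros.
    """
    features = []
    for i in range(position - half_window, position + half_window + 1):
        slot = [0] * len(AMINO_ACIDS)
        if 0 <= i < len(sequence) and sequence[i] in AMINO_ACIDS:
            slot[AMINO_ACIDS.index(sequence[i])] = 1
        features.extend(slot)
    return features


def confusion_counts(actual, predicted):
    """Count TP, TN, FP, FN for labels where 1 = interface, 0 = non-interface."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    return tp, tn, fp, fn


def evaluate(actual, predicted):
    """Return sensitivity, specificity, and Matthews correlation.

    Assumes both classes occur in `actual`; otherwise a ratio is undefined.
    """
    tp, tn, fp, fn = confusion_counts(actual, predicted)
    denom = sqrt((tp + fn) * (tp + fp) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        # Zero when predictions are no better than chance.
        "correlation": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


if __name__ == "__main__":
    # Toy sequence and labels, made up purely for illustration.
    vector = encode_window("ACDEFGHIKLM", position=5)
    print(len(vector))  # 11 window slots x 20 residue types
    print(evaluate([1, 1, 1, 0, 0, 0, 0, 1], [1, 1, 0, 0, 0, 1, 0, 1]))
```

Each encoded window would serve as one training example for the SVM. Under leave-one-out cross-validation, predictions made on the held-out data would be pooled and passed to a function like `evaluate`.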
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yan, C., Dobbs, D., Honavar, V. (2003). Identification of Surface Residues Involved in Protein-Protein Interaction — A Support Vector Machine Approach. In: Abraham, A., Franke, K., Köppen, M. (eds) Intelligent Systems Design and Applications. Advances in Soft Computing, vol 23. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-44999-7_6
DOI: https://doi.org/10.1007/978-3-540-44999-7_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40426-2
Online ISBN: 978-3-540-44999-7
eBook Packages: Springer Book Archive