Abstract
In this paper, a machine learning approach, the support vector machine (SVM), is employed to predict the distance between an antibody’s interface residues and the antigen in an antigen–antibody complex. The heavy chains, light chains and corresponding antigens of 37 antibodies are extracted from antibody–antigen complexes in the Protein Data Bank. A number of computational experiments, varying the distance range, sequence patch size and antigen class, are conducted to describe the distance between the antibody’s interface residues and the antigen using antibody sequence information. The high prediction accuracy in both self-consistent and cross-validation tests indicates that the sequential information discovered from antibody structure contributes substantially to predicting this distance. Furthermore, the antigen class is predicted by SVM from the residue composition information belonging to different distance ranges, which shows some potential significance.
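As a rough illustration of how sequence patches and distance ranges could be turned into SVM training data, the sketch below builds one patch per residue, one-hot encodes it, and labels each residue by whether its distance to the antigen falls inside a chosen range. The abstract does not specify the encoding, so the padding symbol, the one-hot scheme, the function names, the toy sequence and the toy distances are all assumptions made for this example, not the authors' procedure.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PAD = "-"  # hypothetical padding symbol for positions beyond the chain ends


def extract_patches(sequence, patch_size):
    """Return one patch of length patch_size centred on each residue."""
    if patch_size % 2 == 0:
        raise ValueError("patch_size must be odd so each patch has a centre residue")
    half = patch_size // 2
    padded = PAD * half + sequence + PAD * half
    return [padded[i:i + patch_size] for i in range(len(sequence))]


def one_hot(patch):
    """Encode a patch as a flat 0/1 vector with 20 entries per position.

    Padding positions are encoded as all zeros (an assumption of this sketch).
    """
    vector = []
    for residue in patch:
        vector.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return vector


def label_by_distance(distances, lower, upper):
    """Label a residue +1 if lower <= distance < upper, otherwise -1."""
    return [1 if lower <= d < upper else -1 for d in distances]


# Toy input: a made-up fragment and made-up residue-to-antigen distances (in Å).
sequence = "QVQLKESG"
distances = [12.4, 9.8, 6.1, 3.9, 4.2, 7.5, 10.3, 14.0]

patches = extract_patches(sequence, patch_size=5)
features = [one_hot(p) for p in patches]
labels = label_by_distance(distances, lower=0.0, upper=8.0)
```

The resulting `features` and `labels` lists are in the shape that a typical binary SVM trainer expects; repeating the labelling step with different `lower` and `upper` bounds, or the patch step with a different `patch_size`, gives one dataset per experimental setting.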
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00521-006-0076-4/MediaObjects/521_2006_76_Fig1_HTML.jpg)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00521-006-0076-4/MediaObjects/521_2006_76_Fig2_HTML.gif)
Acknowledgments
This research has been partially supported by a 973 Project grant (2004CB720103) from the Ministry of Science and Technology, China, and by grants (70531040, 70472074) from the National Natural Science Foundation, China. We would like to express our thanks to Mrs. Li Zhang, Northeastern University, USA, and Mr. Gang Kou, University of Nebraska, USA, for their constructive comments.
About this article
Cite this article
Shi, Y., Zhang, X., Wan, J. et al. Predicting the distance between antibody’s interface residue and antigen to recognize antigen types by support vector machine. Neural Comput & Applic 16, 481–490 (2007). https://doi.org/10.1007/s00521-006-0076-4
DOI: https://doi.org/10.1007/s00521-006-0076-4