Abstract
Protein–protein interactions occur when two or more proteins bind together, often to carry out their biological function. A small fraction of the interface residues on a protein surface contribute most of the binding free energy; these residues are referred to as hot spots. Identifying hot spots is important for examining the actions and properties occurring around the binding sites. However, experimental studies require significant effort, and computational methods still have limitations in prediction performance and feature interpretation.
In this paper we describe a method for predicting hot spot residues that provides a significant improvement over existing methods. Logistic regression is used to derive a prediction model that combines 8 features derived from accessibility, sequence conservation, inter-residue potentials, computational alanine scanning, small-world structure characteristics, phi-psi interaction, and contact number. To demonstrate its effectiveness, the proposed method is applied to ASEdb. Our prediction model achieves an accuracy of 0.819 and an F1 score of 0.743. Experimental results show that the additional features improve the prediction performance; in particular, phi-psi is found to make an important contribution. We then perform an exhaustive comparison of our method with various machine-learning-based methods and with previously published prediction models. Empirical studies show that our method yields significantly better prediction performance.
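The general idea of fitting a logistic regression model on an 8-dimensional feature vector and scoring it by accuracy and F1 can be sketched as follows. This is only an illustration, not the authors' implementation: the feature values are synthetic random numbers rather than ASEdb data, and the function names, learning rate, epoch count, and 0.5 decision threshold are all assumptions introduced here.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(weights, bias, x):
    # Estimated probability that a residue with feature vector x is a hot spot.
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def predict_label(weights, bias, x, threshold=0.5):
    # The 0.5 threshold is an assumption, not a value taken from the paper.
    return 1 if predict_proba(weights, bias, x) >= threshold else 0


def fit_logistic(X, y, lr=0.1, epochs=300):
    """Fit weights by batch gradient descent on the mean log-loss."""
    n, d = len(X), len(X[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, label in zip(X, y):
            err = predict_proba(weights, bias, x) - label
            grad_b += err
            for j in range(d):
                grad_w[j] += err * x[j]
        bias -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return weights, bias


def accuracy_and_f1(y_true, y_pred):
    # Accuracy = (TP + TN) / N; F1 = 2TP / (2TP + FP + FN).
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return (tp + tn) / len(pairs), (2 * tp / denom if denom else 0.0)


def make_synthetic(n, d=8, seed=0):
    # Hypothetical stand-in for the 8 residue features; NOT real ASEdb data.
    rng = random.Random(seed)
    true_w = [rng.uniform(-2, 2) for _ in range(d)]
    X, y = [], []
    for _ in range(n):
        x = [rng.gauss(0, 1) for _ in range(d)]
        p = sigmoid(sum(w * xi for w, xi in zip(true_w, x)))
        X.append(x)
        y.append(1 if rng.random() < p else 0)
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic(200)
    weights, bias = fit_logistic(X, y)
    preds = [predict_label(weights, bias, x) for x in X]
    acc, f1 = accuracy_and_f1(y, preds)
    print(f"accuracy={acc:.3f} F1={f1:.3f}")
```

The scores printed by this sketch describe the synthetic data only and are unrelated to the 0.819 accuracy and 0.743 F1 reported above. In practice the features would be standardized and the model evaluated on held-out residues rather than on the training set.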
Acknowledgments
This work was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT & Future Planning (No. 2013R1A2A2A01068923) and by the ITRC (Information Technology Research Center) support program (NIPA-2014-H0301-14-1002).
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Li, P., Ryu, K.H. (2015). A Logistic Regression Approach for Identifying Hot Spots in Protein Interfaces. In: Renda, M., Bursa, M., Holzinger, A., Khuri, S. (eds) Information Technology in Bio- and Medical Informatics. ITBAM 2015. Lecture Notes in Computer Science, vol 9267. Springer, Cham. https://doi.org/10.1007/978-3-319-22741-2_4
DOI: https://doi.org/10.1007/978-3-319-22741-2_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-22740-5
Online ISBN: 978-3-319-22741-2
eBook Packages: Computer Science, Computer Science (R0)