A Logistic Regression Approach for Identifying Hot Spots in Protein Interfaces

  • Conference paper
Information Technology in Bio- and Medical Informatics (ITBAM 2015)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9267)

Abstract

Protein–protein interactions occur when two or more proteins bind together, often to carry out their biological function. A small fraction of the interface residues on a protein surface contribute most of the binding free energy; these residues are referred to as hot spots. Identifying hot spots is important for examining the actions and properties occurring around binding sites. However, experimental studies require significant effort, and computational methods still have limitations in prediction performance and feature interpretation.

In this paper we describe a hot spot residue prediction method that provides a significant improvement over existing methods. Logistic regression is used to derive a prediction model from 8 features based on accessibility, sequence conservation, inter-residue potentials, computational alanine scanning, small-world structure characteristics, phi-psi interaction, and contact number. To demonstrate its effectiveness, the proposed method is applied to ASEdb, where the model achieves an accuracy of 0.819 and an F1 score of 0.743. Experimental results show that the additional features improve prediction performance; the phi-psi feature in particular is found to make an important contribution. We then perform an exhaustive comparison of our method with various machine-learning-based methods and with previously published prediction models. Empirical studies show that our method yields significantly better prediction performance.
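To make the modelling step concrete, the following is a minimal sketch of a binary logistic regression classifier over 8 numeric features, scored with accuracy and F1. It is not the authors' implementation: the feature values are synthetic stand-ins generated by the hypothetical helper `make_synthetic`, no ASEdb data is used, and the learning rate, epoch count, and 0.5 decision threshold are illustrative assumptions.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(w, b, x):
    # Linear predictor: w . x + b
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(score(w, b, xi)) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(X, w, b, threshold=0.5):
    # Label a residue as a hot spot (1) when the predicted probability
    # reaches the threshold; 0.5 is an assumed default.
    return [1 if sigmoid(score(w, b, xi)) >= threshold else 0 for xi in X]


def accuracy_and_f1(y_true, y_pred):
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    acc = sum(1 for t, p in pairs if t == p) / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return acc, f1


def make_synthetic(n=200, d=8, seed=0):
    # Hypothetical data: d standard-normal features per residue and a
    # label given by the sign of a random linear score (not real data).
    rng = random.Random(seed)
    true_w = [rng.uniform(-2, 2) for _ in range(d)]
    X = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(n)]
    y = [1 if score(true_w, 0.0, xi) > 0 else 0 for xi in X]
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic()
    w, b = fit_logistic(X, y)
    acc, f1 = accuracy_and_f1(y, predict(X, w, b))
    print(f"training accuracy={acc:.3f}, F1={f1:.3f}")
```

Because the fitted model is linear in the features, each weight in `w` can be read as the change in log-odds of a residue being a hot spot per unit change in that feature, which is one reason logistic regression is a natural fit when feature interpretation matters.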

Acknowledgments

This work was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT & Future Planning (No. 2013R1A2A2A01068923) and by the ITRC (Information Technology Research Center) support program (NIPA-2014-H0301-14-1002).

Author information

Corresponding author

Correspondence to Keun Ho Ryu.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Li, P., Ryu, K.H. (2015). A Logistic Regression Approach for Identifying Hot Spots in Protein Interfaces. In: Renda, M., Bursa, M., Holzinger, A., Khuri, S. (eds) Information Technology in Bio- and Medical Informatics. ITBAM 2015. Lecture Notes in Computer Science, vol 9267. Springer, Cham. https://doi.org/10.1007/978-3-319-22741-2_4

  • DOI: https://doi.org/10.1007/978-3-319-22741-2_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-22740-5

  • Online ISBN: 978-3-319-22741-2

  • eBook Packages: Computer Science, Computer Science (R0)
