Hot Spots & Hot Regions Detection Using Classification Algorithms in BMPs Complexes at the Protein-Protein Interface with the Ground-State Energy Feature

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13264)

Abstract

We present the results of applying several machine learning algorithms to predict hot-spot and hot-region residues at the protein-protein interface between the polypeptide chains of protein complexes. The dataset consisted of twenty-nine bone morphogenetic proteins (BMPs) obtained from the Protein Data Bank (PDB). The training features were selected from biochemical and biophysical properties of each amino acid at these interfaces, such as B-factor, hydrophobicity index, prevalence score, accessible surface area (ASA), conservation score, and ground-state energy computed with Density Functional Theory (DFT). We also used parallel CPU/GPU hardware acceleration during preprocessing to shorten the execution times of the ASA and DFT calculations. We evaluated the performance of the classifiers with several metrics. The random forest classifier performed best, correctly classifying an average of 90% of residues in both the true negative rate and the true positive rate.
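The two rates named in the abstract can be illustrated with a minimal Python sketch. It assumes a binary encoding in which 1 marks a hot-spot residue and 0 a non-hot-spot residue. The function name `tpr_tnr` and the example labels are hypothetical, chosen only for illustration; they are not the authors' code or data.

```python
from typing import Sequence, Tuple


def tpr_tnr(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (true positive rate, true negative rate) for binary labels.

    Assumed encoding: 1 = hot-spot residue, 0 = non-hot-spot residue.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # hot spots found
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # hot spots missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-hot spots rejected
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical labels for ten interface residues (illustrative only).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
tpr, tnr = tpr_tnr(y_true, y_pred)
print(f"TPR = {tpr:.3f}, TNR = {tnr:.3f}")
```

Reporting both rates matters when hot-spot residues are a minority class, since a classifier that labels every residue as a non-hot spot would reach a high overall accuracy while having a true positive rate of zero.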

Acknowledgments

This study was supported by the “Programa de desarrollo tecnológico e innovación para alumnos del IPN. México 2021” and by CONACYT (Consejo Nacional de Ciencia y Tecnología).

Author information

Corresponding author

Correspondence to O. Chaparro-Amaro.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Chaparro-Amaro, O., Martínez-Felipe, M., Martínez-Castro, J. (2022). Hot Spots & Hot Regions Detection Using Classification Algorithms in BMPs Complexes at the Protein-Protein Interface with the Ground-State Energy Feature. In: Vergara-Villegas, O.O., Cruz-Sánchez, V.G., Sossa-Azuela, J.H., Carrasco-Ochoa, J.A., Martínez-Trinidad, J.F., Olvera-López, J.A. (eds) Pattern Recognition. MCPR 2022. Lecture Notes in Computer Science, vol 13264. Springer, Cham. https://doi.org/10.1007/978-3-031-07750-0_1

  • DOI: https://doi.org/10.1007/978-3-031-07750-0_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-07749-4

  • Online ISBN: 978-3-031-07750-0

  • eBook Packages: Computer Science, Computer Science (R0)
