Inter-helical Residue Contact Prediction in \(\alpha \)-Helical Transmembrane Proteins Using Structural Features

  • Conference paper
Bioinformatics and Biomedical Engineering (IWBBIO 2023)

Abstract

Residue contact maps offer a 2-d, reduced representation of 3-d protein structures and serve as structural constraints and scaffolds in structural modeling. Precise residue contact maps are helpful not only as an intermediate step towards generating effective 3-d protein models, but also in their own right for identifying binding sites and hence providing insights into a protein’s functions. Many computational methods have been developed to predict residue contacts using a variety of features based on sequence, physicochemical properties, and co-evolutionary information. In this work, we set out to explore the use of structural information for predicting inter-helical residue contacts in transmembrane proteins. Specifically, we extract structural information from a neighborhood around a residue pair of interest and train a classifier to determine whether the residue pair is a contact point. To make the task practical, we avoid using the 3-d coordinates directly; instead, we extract features such as relative distances and angles. Further, we exclude any structural information of the residue pair of interest from the input feature set when training and testing the classifier. We compare our method to a state-of-the-art method that uses non-structural information on a benchmark data set. The results from experiments on held-out datasets show that our method achieves above 90% precision for the top L/2 and L inter-helical contacts, significantly outperforming the state-of-the-art method; it may therefore serve as an upper bound on the performance achievable using non-structural information. Further, we evaluate the robustness of our method by injecting Gaussian noise into PDB coordinates and hence into our derived features. We find that our model’s performance is robust to high noise levels.
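The evaluation described in the abstract (precision over the top-ranked inter-helical pairs, and Gaussian perturbation of coordinates) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The 8 Å cutoff, the use of one coordinate per residue, the helper names, the toy data, and treating L as the number of residues are all assumptions made here for illustration; the abstract does not specify them.

```python
import math
import random
from itertools import combinations

CONTACT_CUTOFF = 8.0  # assumed distance cutoff in angstroms (not given in the abstract)


def is_contact(a, b, cutoff=CONTACT_CUTOFF):
    """True if two residue coordinates lie closer than the cutoff."""
    return math.dist(a, b) < cutoff


def inter_helical_pairs(helix_id):
    """Pairs (i, j), i < j, whose residues sit in different helices.

    helix_id[i] is a helix label, or None for a residue outside any helix.
    """
    return [
        (i, j)
        for i, j in combinations(range(len(helix_id)), 2)
        if helix_id[i] is not None
        and helix_id[j] is not None
        and helix_id[i] != helix_id[j]
    ]


def top_k_precision(scores, coords, helix_id, k):
    """Fraction of true contacts among the k highest-scoring inter-helical pairs."""
    ranked = sorted(inter_helical_pairs(helix_id), key=lambda p: scores[p], reverse=True)[:k]
    if not ranked:
        return 0.0
    hits = sum(is_contact(coords[i], coords[j]) for i, j in ranked)
    return hits / len(ranked)


def add_gaussian_noise(coords, sigma, rng):
    """Return a copy of coords with zero-mean Gaussian noise added to every axis."""
    return [tuple(x + rng.gauss(0.0, sigma) for x in xyz) for xyz in coords]


# Toy example: four residues, two helices, hypothetical classifier scores.
coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 5.0, 0.0), (20.0, 0.0, 0.0)]
helix_id = [0, 0, 1, 1]
scores = {(0, 2): 0.9, (0, 3): 0.1, (1, 2): 0.7, (1, 3): 0.8}

L = len(coords)  # assumption: L is the number of residues
print("top L/2 precision:", top_k_precision(scores, coords, helix_id, L // 2))
print("top L precision:", top_k_precision(scores, coords, helix_id, L))

noisy = add_gaussian_noise(coords, sigma=1.0, rng=random.Random(0))
print("perturbed coordinates:", noisy)
```

In a robustness experiment of the kind the abstract mentions, features would be recomputed from the perturbed coordinates while the contact labels stay tied to the unperturbed structure. The sketch only shows the perturbation step.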

Acknowledgment

Support from the University of Delaware CBCB Bioinformatics Core Facility and use of the BIOMIX compute cluster was made possible through funding from Delaware INBRE (NIH NIGMS P20 GM103446), the State of Delaware, and the Delaware Biotechnology Institute. The authors would also like to thank the National Science Foundation (NSF-MCB1820103), which partly supported this research.

Author information

Corresponding author

Correspondence to Aman Sawhney.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Sawhney, A., Li, J., Liao, L. (2023). Inter-helical Residue Contact Prediction in \(\alpha \)-Helical Transmembrane Proteins Using Structural Features. In: Rojas, I., Valenzuela, O., Rojas Ruiz, F., Herrera, L.J., Ortuño, F. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2023. Lecture Notes in Computer Science, vol 13920. Springer, Cham. https://doi.org/10.1007/978-3-031-34960-7_25

  • DOI: https://doi.org/10.1007/978-3-031-34960-7_25

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-34959-1

  • Online ISBN: 978-3-031-34960-7

  • eBook Packages: Computer Science, Computer Science (R0)