Sequence-Based Prediction of Hot Spots in Protein-RNA Complexes Using an Ensemble Approach

  • Conference paper
Intelligent Computing Theories and Application (ICIC 2019)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11643)

Abstract

RNA-binding hot spots are the residues that contribute most to the binding free energy of protein-RNA interfaces. Because experimental methods for identifying hot spots are expensive and time-consuming, efficient computational approaches are needed to predict hot spots on a large scale. In this work, we propose a sequence-based machine learning method to predict hot spots in protein-RNA complexes. We extracted 83 relatively independent physicochemical features from the 544 properties in AAindex1. Each physicochemical feature was combined with the predicted relative accessible surface area (RASA) and a substitution probability feature from the Blocks Substitution Matrix (BLOSUM), and the combined features were used to train models with a support vector machine (SVM) and the k-nearest neighbor algorithm (k-NN). Combinations of the 166 individual models were explored, and the 33 top-performing models were selected to construct the final ensemble classifier by majority voting. The ensemble classifier outperformed state-of-the-art computational methods, yielding an F1 score of 0.742 and an AUC of 0.824 on the independent test set.
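
The majority-voting step and the F1 metric named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `majority_vote` and `f1_score`, the toy votes, and the tie-breaking rule are assumptions made for illustration, and the reading of 166 as 83 features times 2 algorithms is inferred from the abstract rather than stated in it. With an odd number of binary voters such as 33, ties cannot occur.

```python
from typing import List, Sequence

# Figures from the abstract: 83 features, two algorithms (SVM and k-NN).
# Reading 166 as 83 * 2 is an inference, not something the abstract states.
N_FEATURES = 83
N_ALGORITHMS = 2
assert N_FEATURES * N_ALGORITHMS == 166


def majority_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine binary labels (1 = hot spot, 0 = non-hot spot).

    predictions[m][i] is the label that model m assigns to residue i.
    A residue is called a hot spot only when strictly more than half of
    the models vote 1 (assumed rule; a tie therefore yields 0).
    """
    if not predictions:
        raise ValueError("need at least one model")
    n_models = len(predictions)
    n_residues = len(predictions[0])
    if any(len(p) != n_residues for p in predictions):
        raise ValueError("all models must label the same residues")
    return [
        int(2 * sum(p[i] for p in predictions) > n_models)
        for i in range(n_residues)
    ]


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (hypothetical): three models voting on four residues.
toy_votes = [
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
]
toy_labels = [1, 0, 0, 1]

ensemble = majority_vote(toy_votes)
print(ensemble)                        # [1, 0, 1, 1]
print(f1_score(toy_labels, ensemble))  # approximately 0.8
```

In the toy case the ensemble makes one false positive and no false negatives, so precision is 2/3, recall is 1, and F1 is 0.8. The reported 0.742 comes from the paper's independent test set, not from anything reproduced here.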

Acknowledgement

This work was supported by the National Natural Science Foundation of China (61672037, 21601001, and 11835014), the Anhui Provincial Outstanding Young Talent Support Plan (gxyqZD2017005), the Young Wanjiang Scholar Program of Anhui Province, the Recruitment Program for Leading Talent Team of Anhui Province (2019-16), the China Postdoctoral Science Foundation Grant (2018M630699) and the Anhui Provincial Postdoctoral Science Foundation Grant (2017B325).

Author information

Corresponding author

Correspondence to Sijia Zhang.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Zhao, L., Zhang, S., Xia, J. (2019). Sequence-Based Prediction of Hot Spots in Protein-RNA Complexes Using an Ensemble Approach. In: Huang, DS., Bevilacqua, V., Premaratne, P. (eds) Intelligent Computing Theories and Application. ICIC 2019. Lecture Notes in Computer Science, vol 11643. Springer, Cham. https://doi.org/10.1007/978-3-030-26763-6_55

  • DOI: https://doi.org/10.1007/978-3-030-26763-6_55

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-26762-9

  • Online ISBN: 978-3-030-26763-6

  • eBook Packages: Computer Science, Computer Science (R0)
