IRIS: A method for predicting in vivo RNA secondary structures using PARIS data

Abstract

Background

RNA secondary structures play a pivotal role in post-transcriptional regulation and in the functions of non-coding RNAs, yet in vivo RNA secondary structures remain enigmatic. PARIS (Psoralen Analysis of RNA Interactions and Structures) is a recently developed high-throughput sequencing-based approach that enables direct capture of RNA duplex structures in vivo. However, incompatible and fuzzy pairing information in PARIS data obstructs its integration with existing tools for reconstructing RNA secondary structure models at single-base resolution.

Methods

We introduce IRIS, a method for predicting RNA secondary structure ensembles based on PARIS data. IRIS generates a large set of candidate RNA secondary structure models under the guidance of redistributed PARIS reads and then uses a Bayesian model to identify the optimal ensemble, according to both thermodynamic principles and PARIS data.
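To make the idea of combining thermodynamics with PARIS evidence concrete, the following is a minimal toy sketch and not the IRIS implementation. It assumes that each candidate structure is given in dot-bracket notation with a free energy in kcal/mol, that each PARIS read group is summarized as two arm intervals plus a read count, and that a read group counts as "explained" when at least one base pair of the structure spans its two arms. The function names, the read representation, and the `p_hit` parameter are all hypothetical choices made for illustration.

```python
import math

RT = 0.6163  # kcal/mol at 37 °C; energies are assumed to be in kcal/mol


def parse_pairs(dot_bracket):
    """Return the set of base pairs (i, j) of a pseudoknot-free dot-bracket string."""
    stack, pairs = [], set()
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.add((stack.pop(), i))
    return pairs


def read_is_supported(pairs, left_arm, right_arm):
    """True if some pair (i, j) has i in the left arm and j in the right arm."""
    (l0, l1), (r0, r1) = left_arm, right_arm
    return any(l0 <= i <= l1 and r0 <= j <= r1 for i, j in pairs)


def log_posterior(dot_bracket, energy, reads, p_hit=0.9):
    """Unnormalized log posterior: Boltzmann log prior plus a toy read log likelihood.

    p_hit is a hypothetical probability assigned to reads the structure explains;
    unexplained reads get probability 1 - p_hit.
    """
    pairs = parse_pairs(dot_bracket)
    score = -energy / RT  # thermodynamic prior (lower energy, higher weight)
    for left_arm, right_arm, count in reads:
        supported = read_is_supported(pairs, left_arm, right_arm)
        score += count * math.log(p_hit if supported else 1.0 - p_hit)
    return score


def ensemble_weights(candidates, reads):
    """Normalize the posteriors over all candidates (log-sum-exp for stability)."""
    logs = [log_posterior(s, e, reads) for s, e in candidates]
    top = max(logs)
    unnorm = [math.exp(x - top) for x in logs]
    total = sum(unnorm)
    return [u / total for u in unnorm]


# Hypothetical example: two alternative hairpins on a 16-nt RNA.
candidates = [("((((....))))....", -3.0), ("....((((....))))", -2.5)]
# Each read group: (left arm interval, right arm interval, read count), 0-based inclusive.
reads = [((0, 3), (8, 11), 2), ((4, 7), (12, 15), 20)]
weights = ensemble_weights(candidates, reads)
```

In this toy setting the second hairpin has the less favorable free energy but is supported by far more reads, so it receives most of the weight. The sketch assigns a weight to every candidate independently; how IRIS redistributes reads and selects the optimal ensemble is described in the full article and is not reproduced here.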

Results

The RNA structure ensembles predicted by IRIS were validated using evolutionary conservation information and by their consistency with other experimental RNA structural data. IRIS is implemented in Python and is freely available at http://iris.zhanglab.net.

Conclusion

IRIS capitalizes on PARIS data to improve the prediction of in vivo RNA secondary structure ensembles. We expect that IRIS will broaden the application of the PARIS technology and provide more insight into in vivo RNA secondary structures.

Abbreviations

PARIS:

psoralen analysis of RNA interactions and structures

icSHAPE:

in vivo click selective 2′-hydroxyl acylation and profiling experiment

MFE:

minimum free energy

NRDS:

non-redundant sampling algorithm

LASSO:

least absolute shrinkage and selection operator

KL distance:

Kullback-Leibler distance

NP-hard:

non-deterministic polynomial-time hard

Acknowledgements

This work was supported by the Chinese Ministry of Science and Technology (No. 2018YFA0107603 to Q.C.Z.), the National Natural Science Foundation of China (Nos. 91740204 and 31761163007 to Q.C.Z.), the National Natural Science Foundation of China (No. 61772197 to T.J.) and the National Key Research and Development Program of China (No. 2018YFC0910404 to T.J.). Q.C.Z. acknowledges support from the Beijing Advanced Innovation Center for Structural Biology and the Tsinghua-Peking Joint Center for Life Sciences.

Author information

Contributions

Q.C.Z. conceived the project. T.J. and Q.C.Z. supervised the entire project. J.Z. and T.J. designed the IRIS algorithms. P.L. assisted in data collection and pre-processing and made many critical suggestions on the methods. Q.C.Z. and T.J. proposed the evaluation benchmarks. W.M. and Z.L. made many useful suggestions. W.M. provided computational resources. W.Z. and R.J. carried out a preliminary exploration of the project. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Qiangfeng Cliff Zhang or Tao Jiang.

Additional information

Compliance with Ethics Guidelines

Jianyu Zhou, Pan Li, Wanwen Zeng, Wenxiu Ma, Zhipeng Lu, Rui Jiang, Qiangfeng Cliff Zhang and Tao Jiang declare that they have no conflict of interest.

This article does not contain any studies with human or animal subjects performed by any of the authors.

Author summary

Decoding RNA secondary structures in living cells is still a thorny problem in bioinformatics. The recently developed PARIS method enables the direct capture of in vivo RNA duplex structures by high-throughput sequencing. However, PARIS yields only low-resolution information about a mixture of alternative RNA structures. A computational method for constructing the high-resolution structure ensemble is the key to exploiting the full power of the PARIS technology. Here we present IRIS, a method for predicting in vivo RNA secondary structure ensembles based on PARIS data. We expect that IRIS will help provide more insight into in vivo RNA secondary structures.

About this article

Cite this article

Zhou, J., Li, P., Zeng, W. et al. IRIS: A method for predicting in vivo RNA secondary structures using PARIS data. Quant Biol 8, 369–381 (2020). https://doi.org/10.1007/s40484-020-0223-4

  • DOI: https://doi.org/10.1007/s40484-020-0223-4

Keywords

  • RNA secondary structure
  • PARIS data
  • in vivo
  • structure ensembles
  • incompatible reads