Abstract
Background
RNA secondary structures play a pivotal role in posttranscriptional regulation and the functions of non-coding RNAs, yet in vivo RNA secondary structures remain enigmatic. PARIS (Psoralen Analysis of RNA Interactions and Structures) is a recently developed high-throughput sequencing-based approach that enables direct capture of RNA duplex structures in vivo. However, incompatible and fuzzy pairing information in PARIS data obstructs its integration with existing tools for reconstructing RNA secondary structure models at single-base resolution.
Methods
We introduce IRIS, a method for predicting RNA secondary structure ensembles based on PARIS data. IRIS generates a large set of candidate RNA secondary structure models under the guidance of redistributed PARIS reads and then uses a Bayesian model to identify the optimal ensemble, according to both thermodynamic principles and PARIS data.
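As a rough illustration of the two-step idea described above (generate candidates, then rank them by thermodynamics and PARIS support together), the following minimal Python sketch scores candidate structures given in dot-bracket notation. It is not the IRIS Bayesian model, which is not specified in this abstract. The scoring function, the pseudocount, the toy read counts, and the names `base_pairs`, `log_score`, and `rank_candidates` are all assumptions made for illustration.

```python
import math

# Thermal energy in kcal/mol at about 37 °C (assumed constant for this sketch).
RT = 0.616


def base_pairs(dot_bracket):
    """Return the set of 0-based (i, j) pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for j, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(j)
        elif ch == ")":
            pairs.add((stack.pop(), j))
    return pairs


def log_score(structure, energy, read_support, pseudocount=1.0):
    """Hypothetical log-score: Boltzmann log-prior plus a read-support term.

    `energy` is a free energy in kcal/mol; `read_support` maps a base pair
    (i, j) to the number of PARIS reads assumed to support it.
    """
    log_prior = -energy / RT
    log_support = sum(
        math.log(read_support.get(pair, 0) + pseudocount)
        for pair in base_pairs(structure)
    )
    return log_prior + log_support


def rank_candidates(candidates, read_support):
    """Sort (structure, energy) candidates from best to worst log-score."""
    return sorted(
        candidates,
        key=lambda c: log_score(c[0], c[1], read_support),
        reverse=True,
    )


# Toy input: three candidate structures and made-up read counts.
candidates = [("(....)", -0.5), ("((..))", -1.0), ("......", 0.0)]
read_support = {(0, 5): 10, (1, 4): 10}
ranked = rank_candidates(candidates, read_support)
```

In this toy setting the candidate whose pairs are both low in energy and supported by reads ranks first. A real ensemble method would also have to weight several structures jointly rather than score each one in isolation.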
Results
The RNA structure ensembles predicted by IRIS have been verified using evolutionary conservation information and by their consistency with other experimental RNA structural data. IRIS is implemented in Python and is freely available at http://iris.zhanglab.net.
Conclusion
IRIS capitalizes on PARIS data to improve the prediction of in vivo RNA secondary structure ensembles. We expect that IRIS will enhance the application of the PARIS technology and provide more insight into in vivo RNA secondary structures.
Abbreviations
- PARIS: psoralen analysis of RNA interactions and structures
- icSHAPE: in vivo click selective 2′-hydroxyl acylation and profiling experiment
- MFE: minimum free energy
- NRDS: non-redundant sampling algorithm
- LASSO: least absolute shrinkage and selection operator
- KL distance: Kullback-Leibler distance
- NP-hard: non-deterministic polynomial-time hard
Acknowledgements
This work was supported by the Chinese Ministry of Science and Technology (No. 2018YFA0107603 to Q.C.Z.), the National Natural Science Foundation of China (Nos. 91740204 and 31761163007 to Q.C.Z.), the National Natural Science Foundation of China (No. 61772197 to T.J.) and the National Key Research and Development Program of China (No. 2018YFC0910404 to T.J.). Q.C.Z. acknowledges support from the Beijing Advanced Innovation Center for Structural Biology and the Tsinghua-Peking Joint Center for Life Sciences.
Author information
Contributions
Q.C.Z. conceived the project. T.J. and Q.C.Z. supervised the entire project. J.Z. and T.J. designed the IRIS algorithms. P.L. assisted in data collection and pre-processing and offered many critical suggestions on the methods. Q.C.Z. and T.J. proposed the evaluation benchmarks. W.M. and Z.L. offered many useful suggestions. W.M. provided computational resources. W.Z. and R.J. carried out a preliminary exploration of the project. All authors read and approved the final manuscript.
Additional information
Compliance with Ethics Guidelines
Jianyu Zhou, Pan Li, Wanwen Zeng, Wenxiu Ma, Zhipeng Lu, Rui Jiang, Qiangfeng Cliff Zhang and Tao Jiang declare that they have no conflict of interest.
This article does not contain any studies with human or animal subjects performed by any of the authors.
Author summary
Decoding RNA secondary structures in living cells remains a thorny problem in bioinformatics. The recently developed PARIS method enables direct, high-throughput sequencing-based capture of in vivo RNA duplex structures. However, PARIS yields only low-resolution information on a mixture of alternative RNA structures. A computational method for constructing high-resolution structure ensembles is therefore key to exploiting the full power of the PARIS technology. Here we present IRIS, a method for predicting in vivo RNA secondary structure ensembles based on PARIS data. We expect that IRIS will help provide more insight into in vivo RNA secondary structures.
Cite this article
Zhou, J., Li, P., Zeng, W. et al. IRIS: A method for predicting in vivo RNA secondary structures using PARIS data. Quant Biol 8, 369–381 (2020). https://doi.org/10.1007/s40484-020-0223-4
DOI: https://doi.org/10.1007/s40484-020-0223-4