The Trouble with Long-Range Base Pairs in RNA Folding

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8213)


RNA secondary structure prediction has long struggled with long-range base pairs, since prediction accuracy decreases with base pair span. Here we analyze the empirical distribution of base pair spans in a large collection of experimentally known RNA structures. Surprisingly, we find that long-range base pairs are overrepresented in these data. In particular, there is no evidence that thermodynamic predictions systematically overpredict long-range base pairs relative to short-range interactions. This casts doubt on a recent suggestion that kinetic effects cause the length-dependent decrease in predictability. Instead of modifying the energy model, we advocate modifying the expected accuracy model for RNA secondary structures. We demonstrate that including a span-dependent penalty leads to improved maximum expected accuracy (MEA) structure predictions compared to both the standard MEA model and a modified folding algorithm with an energy penalty function. The prevalence of long-range base pairs provides further evidence that RNA structures in general do not have the so-called polymer zeta property. This has consequences for the asymptotic performance of a large class of sparsified RNA folding algorithms.
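To make the two central notions concrete, the following is a minimal sketch and not the authors' implementation. It assumes structures are given in dot-bracket notation, that base pair probabilities are available as an upper-triangular matrix `P` (for instance from a partition function computation), and that the span-dependent penalty enters as a multiplicative weight on the pair term of an MEA objective of the kind used by Lu et al. The function names `pair_spans` and `mea_fold`, the `weight` callable, and the simple Nussinov-style recursion are all illustrative assumptions; the abstract does not specify the functional form of the penalty.

```python
from typing import Callable, List, Optional, Sequence


def pair_spans(dotbracket: str) -> List[int]:
    """Return the sorted spans j - i of all base pairs (i, j) in a dot-bracket string."""
    stack: List[int] = []
    spans: List[int] = []
    for j, ch in enumerate(dotbracket):
        if ch == "(":
            stack.append(j)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            spans.append(j - stack.pop())
    if stack:
        raise ValueError("unbalanced structure")
    return sorted(spans)


def mea_fold(
    P: Sequence[Sequence[float]],
    gamma: float = 1.0,
    weight: Optional[Callable[[int], float]] = None,  # hypothetical span weight
    min_loop: int = 3,
) -> str:
    """MEA structure for pair probabilities P[i][j] (i < j), with an optional
    span-dependent weight applied to each pair's contribution (an assumption)."""
    n = len(P)
    w = weight or (lambda span: 1.0)
    # Probability that position i is unpaired.
    q = [1.0 - sum(P[min(i, j)][max(i, j)] for j in range(n) if j != i)
         for i in range(n)]
    # M[i][j]: best score on the half-open interval [i, j).
    M = [[0.0] * (n + 1) for _ in range(n + 1)]
    choice: List[List[Optional[int]]] = [[None] * (n + 1) for _ in range(n + 1)]
    for length in range(1, n + 1):
        for i in range(n - length + 1):
            j = i + length
            last = j - 1
            best, arg = M[i][last] + q[last], None  # last position unpaired
            for k in range(i, last - min_loop):     # last position pairs with k
                s = M[i][k] + 2.0 * gamma * w(last - k) * P[k][last] + M[k + 1][last]
                if s > best:
                    best, arg = s, k
            M[i][j], choice[i][j] = best, arg
    # Traceback to a dot-bracket string.
    struct = ["."] * n
    todo = [(0, n)]
    while todo:
        i, j = todo.pop()
        if j <= i:
            continue
        k = choice[i][j]
        if k is None:
            todo.append((i, j - 1))
        else:
            struct[k], struct[j - 1] = "(", ")"
            todo.append((i, k))
            todo.append((k + 1, j - 1))
    return "".join(struct)
```

With a toy matrix in which only `P[0][7] = 0.9` and `P[1][6] = 0.8` are nonzero, `mea_fold(P)` keeps both pairs, while a purely illustrative hard cutoff such as `weight=lambda s: 0.0 if s >= 7 else 1.0` drops the longer one. Any realistic penalty would need to be fitted to data, which this sketch does not attempt.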


Keywords: RNA folding, long-range base pair, prediction accuracy, polymer zeta property





Copyright information

© Springer International Publishing Switzerland 2013

Authors and Affiliations

  1. Dept. Computer Science, and Interdisciplinary Center for Bioinformatics, Univ. Leipzig, Leipzig, Germany
  2. LIFE, Leipzig Research Center for Civilization Diseases, University Leipzig, Leipzig, Germany
  3. Dept. Theoretical Chemistry, Univ. Vienna, Wien, Austria
  4. MPI Mathematics in the Sciences, Leipzig, Germany
  5. RTH, Univ. Copenhagen, Denmark
  6. FHI Cell Therapy and Immunology, Leipzig, Germany
  7. Santa Fe Institute, Santa Fe, USA
  8. Bioinformatics and Computational Biology research group, University of Vienna, Vienna, Austria
