Efficient sampling of RNA secondary structures from the Boltzmann ensemble of low-energy: The boustrophedon method

Journal of Mathematical Biology

Abstract

We adapt a surprising technique, the boustrophedon method, to speed up the sampling of RNA secondary structures from the Boltzmann low-energy ensemble. The technique is simple and its implementation is straightforward, as it only requires permuting the order of some operations already performed in the stochastic traceback stage of these sampling algorithms. It nevertheless greatly improves their worst-case complexity from \({\mathcal{O}}({n^2})\) to \({\mathcal{O}}({n\log(n)})\), where n is the length of the original sequence. Moreover, the average-case complexity of the generation is shown to improve from \({\mathcal{O}}({n\sqrt{n}})\) to \({\mathcal{O}}({n\log(n)})\) in a Boltzmann-weighted homopolymer model based on the Nussinov–Jacobson free-energy model. These results are extended to the more realistic Turner free-energy model through experiments performed on both structured (Drosophila melanogaster mRNA 5S) and hybrid (Staphylococcus aureus RNAIII) RNA sequences, using a boustrophedon-modified version of the popular software UnaFold. This improvement allows larger and more significant sets of structures to be sampled in a given time.
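The reordering described in the abstract can be illustrated on a toy model. The following is a minimal sketch, not the article's implementation and not UnaFold code: it assumes a homopolymer in which any two bases may pair, a single integer weight `w` per base pair, and a minimum hairpin loop size `theta`. The function names `partition`, `boustrophedon` and `sample` are hypothetical. The only change with respect to a sequential traceback is the order in which candidate pairing partners are examined: alternately from both ends of the interval rather than from left to right.

```python
import random


def partition(n, theta=1, w=1):
    """Z[m] = weighted number of structures on m bases (toy homopolymer model)."""
    Z = [1] * (n + 1)
    for m in range(1, n + 1):
        # First base unpaired, or paired around an interior of k >= theta bases.
        Z[m] = Z[m - 1] + w * sum(Z[k] * Z[m - 2 - k] for k in range(theta, m - 1))
    return Z


def boustrophedon(lo, hi):
    """Yield lo, hi, lo+1, hi-1, ... visiting each integer of [lo, hi] once."""
    while lo <= hi:
        yield lo
        if lo != hi:
            yield hi
        lo, hi = lo + 1, hi - 1


def sample(n, Z, theta=1, w=1, rng=random):
    """Draw one dot-bracket structure with probability proportional to its weight."""
    out = ["."] * n
    stack = [(0, n)]  # (start, length) of intervals still to be filled
    while stack:
        i, m = stack.pop()
        while m > 0:
            r = rng.randrange(Z[m])
            if r < Z[m - 1]:
                i, m = i + 1, m - 1  # base i stays unpaired
                continue
            r -= Z[m - 1]
            # Same terms as a sequential scan, visited in boustrophedon order.
            for k in boustrophedon(theta, m - 2):
                term = w * Z[k] * Z[m - 2 - k]
                if r < term:
                    break
                r -= term
            out[i], out[i + k + 1] = "(", ")"
            stack.append((i + 1, k))         # interior of the new pair
            i, m = i + k + 2, m - 2 - k      # continue to the right of the pair
    return "".join(out)


if __name__ == "__main__":
    Z = partition(30)
    print(sample(30, Z, rng=random.Random(1)))
```

In this sketch, the number of terms inspected before a partner is chosen is at most about twice the smaller of the two sub-interval lengths, instead of growing with the position of the partner as in a left-to-right scan. That is the property behind the improved worst-case bound quoted in the abstract; the sampled distribution is unchanged because the same terms are summed, only in a different order.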

Author information

Corresponding author

Correspondence to Yann Ponty.

About this article

Cite this article

Ponty, Y. Efficient sampling of RNA secondary structures from the Boltzmann ensemble of low-energy. J. Math. Biol. 56, 107–127 (2008). https://doi.org/10.1007/s00285-007-0137-z

  • DOI: https://doi.org/10.1007/s00285-007-0137-z
