Journal of Mathematical Biology

Volume 56, Issue 1–2, pp 107–127

Efficient sampling of RNA secondary structures from the Boltzmann ensemble of low-energy: The boustrophedon method


We adapt here a surprising technique, the boustrophedon method, to speed up the sampling of RNA secondary structures from the Boltzmann low-energy ensemble. This technique is simple and its implementation straightforward, as it only requires permuting the order of some operations already performed in the stochastic traceback stage of these algorithms. It nevertheless greatly improves their worst-case complexity from \({\mathcal{O}}({n^2})\) to \({\mathcal{O}}({n\log(n)})\), for n the size of the original sequence. Moreover, the average-case complexity of the generation is shown to be improved from \({\mathcal{O}}({n\sqrt{n}})\) to \({\mathcal{O}}({n\log(n)})\) in a Boltzmann-weighted homopolymer model based on the Nussinov–Jacobson free-energy model. These results are extended to the more realistic Turner free-energy model through experiments performed on both structured (Drosophila melanogaster mRNA 5S) and hybrid (Staphylococcus aureus RNAIII) RNA sequences, using a boustrophedon-modified version of the popular software UnaFold. This improvement allows larger and more significant sets of structures to be sampled in a given time.
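The reordering described above can be illustrated with a minimal Python sketch. It is not the UnaFold modification and does not use the Turner model: it assumes a toy homopolymer in which any two bases may pair, with a hypothetical integer weight `W` per base pair and a hypothetical minimum hairpin size `THETA`. All function names are illustrative. The traceback picks the partner k of the first base by scanning candidate positions and subtracting their weights from a random draw; the only difference between the two variants is the order in which candidates are visited (1, 2, 3, ... versus first, last, second, second-to-last, ...).

```python
import random

THETA = 1  # assumed minimum number of bases enclosed by a base pair
W = 2      # assumed integer Boltzmann-like weight per base pair


def partition(n):
    """z[m] = total weight of all structures on a segment of m bases."""
    z = [1] * (n + 1)  # segments too short to pair have only the empty structure
    for m in range(THETA + 2, n + 1):
        # first base unpaired, or paired with position k (1-indexed in segment)
        z[m] = z[m - 1] + sum(
            W * z[k - 2] * z[m - k] for k in range(THETA + 2, m + 1)
        )
    return z


def sequential_order(lo, hi):
    """Visit candidates lo, lo+1, ..., hi."""
    yield from range(lo, hi + 1)


def boustrophedon_order(lo, hi):
    """Visit candidates lo, hi, lo+1, hi-1, ... alternating between both ends."""
    while lo <= hi:
        yield lo
        if lo != hi:
            yield hi
        lo, hi = lo + 1, hi - 1


def sample(n, z, order, rng):
    """Stochastic traceback; returns (dot-bracket string, terms examined)."""
    out = ["."] * n
    steps = 0
    stack = [(0, n)]  # (start index, segment length)
    while stack:
        start, m = stack.pop()
        if m < THETA + 2:
            continue  # no base pair fits in this segment
        r = rng.randrange(z[m])
        steps += 1
        if r < z[m - 1]:  # first base stays unpaired
            stack.append((start + 1, m - 1))
            continue
        r -= z[m - 1]
        for k in order(THETA + 2, m):
            steps += 1
            term = W * z[k - 2] * z[m - k]
            if r < term:  # first base pairs with position k
                out[start] = "("
                out[start + k - 1] = ")"
                stack.append((start + 1, k - 2))  # enclosed segment
                stack.append((start + k, m - k))  # remaining segment
                break
            r -= term
    return "".join(out), steps


if __name__ == "__main__":
    n = 400
    z = partition(n)
    for order in (sequential_order, boustrophedon_order):
        rng = random.Random(1)
        total = sum(sample(n, z, order, rng)[1] for _ in range(50))
        print(order.__name__, total / 50)
```

Both orders visit the same set of weighted terms, so they draw from the same distribution; only the number of terms examined changes. With the alternating order, finding a split at position k costs on the order of min(k, m − k) steps rather than k, which is the source of the n log(n) worst-case bound for this style of recursive generation.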


Statistical sampling Boltzmann free-energy ensemble RNA structure MFE folding 

Mathematics Subject Classification (2000)

92E10 92C40 05A16 68Q25 82B41 





Copyright information

© Springer-Verlag 2007

Authors and Affiliations

  1. Biology Department, Higgins Hall 577, Boston College, Chestnut Hill, USA
