
Journal of Mathematical Biology, Volume 68, Issue 1–2, pp 341–375

Combinatorics of locally optimal RNA secondary structures

  • Éric Fusy
  • Peter Clote

Abstract

It is a classical result of Stein and Waterman that the asymptotic number of RNA secondary structures is \(1.104366 \cdot n^{-3/2} \cdot 2.618034^n\). Motivated by the kinetics of RNA secondary structure formation, we are interested in determining the asymptotic number of secondary structures that are locally optimal with respect to a particular energy model. In the Nussinov energy model, where each base pair contributes \(-1\) toward the energy of the structure, locally optimal structures are exactly the saturated structures; we have previously shown that, asymptotically, there are \(1.07427\cdot n^{-3/2} \cdot 2.35467^n\) saturated structures for a sequence of length \(n\). In this paper, we consider the base stacking energy model, a mild variant of the Nussinov model, where each stacked base pair contributes \(-1\) toward the energy of the structure. Locally optimal structures with respect to the base stacking energy model are exactly those secondary structures whose stems cannot be extended. Such structures were first considered by Evers and Giegerich, who described a dynamic programming algorithm to enumerate all locally optimal structures. Here, we apply methods from enumerative combinatorics to compute the asymptotic number of such structures. Additionally, we consider analogous combinatorial problems for secondary structures with annotated single-stranded, stacking nucleotides (dangles).
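
As a numerical illustration of the Stein and Waterman asymptotic quoted above, the following minimal sketch counts secondary structures exactly and compares the count with the estimate \(1.104366 \cdot n^{-3/2} \cdot 2.618034^n\). It assumes the standard counting model, which the abstract does not spell out: every hairpin loop contains at least one unpaired base, and base-pairing compatibility of nucleotides is ignored. The function names are hypothetical and chosen only for this example.

```python
def count_secondary_structures(n_max):
    """Return [S(0), ..., S(n_max)], where S(n) is the number of secondary
    structures on n positions.

    Assumed recurrence (standard model, at least one unpaired base in
    every hairpin loop):
        S(0) = S(1) = S(2) = 1
        S(n) = S(n-1) + sum_{k=1}^{n-2} S(k-1) * S(n-k-1)
    The first term covers position n unpaired; the sum covers position n
    paired with position k.
    """
    s = [1] * (n_max + 1)
    for n in range(3, n_max + 1):
        total = s[n - 1]                      # position n is unpaired
        for k in range(1, n - 1):             # position n pairs with k, k = 1..n-2
            total += s[k - 1] * s[n - k - 1]  # left part times enclosed part
        s[n] = total
    return s


def stein_waterman_estimate(n):
    """Asymptotic estimate quoted in the abstract."""
    return 1.104366 * n ** -1.5 * 2.618034 ** n


if __name__ == "__main__":
    counts = count_secondary_structures(200)
    for n in (10, 50, 100, 200):
        ratio = counts[n] / stein_waterman_estimate(n)
        print(f"n={n:4d}  exact/estimate = {ratio:.4f}")
```

Under these assumptions, the ratio of the exact count to the estimate should approach 1 as \(n\) grows, with the gap shrinking roughly like \(1/n\). The sketch covers only the classical count of all secondary structures; it does not enumerate the saturated or locally optimal structures studied in the paper.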

Mathematics Subject Classification

05A16 92B05 82B30 

Notes

Acknowledgments

Figure 1 was created by W.A. Lorenz and H. Jabbari. We would like to thank the anonymous referees for their helpful comments. É. Fusy is supported by the European project ExploreMaps (ERC StG 208471). P. Clote is supported by the National Science Foundation under grants DBI-0543506 and DMS-0817971, and by Digiteo Foundation project RNAomics. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

Supplementary material

285_2012_631_MOESM1_ESM.pdf (84 kb)
ESM 1 (PDF 85KB)
285_2012_631_MOESM2_ESM.pdf (82 kb)
ESM 2 (PDF 83KB)

References

  1. Bender EA (1973) Central and local limit theorem applied to asymptotic enumeration. J Combin Theory Ser A 15:91–111
  2. Clote P (2005) An efficient algorithm to compute the landscape of locally optimal RNA secondary structures with respect to the Nussinov-Jacobson energy model. J Comput Biol 12(1):83–101
  3. Clote P (2006) Combinatorics of saturated secondary structures of RNA. J Comput Biol 13(9):1640–1657
  4. Clote P, Kranakis E, Krizanc D, Salvy B (2009) Asymptotics of canonical and saturated RNA secondary structures. J Bioinform Comput Biol 7(5):869–893
  5. Clote P, Dobrev S, Dotu I, Kranakis E, Krizanc D, Urrutia J (2012) On the page number of RNA secondary structures with pseudoknots. J Math Biol 65:1337–1357
  6. Dill KA, Bromberg S (2002) Molecular driving forces: statistical thermodynamics in chemistry and biology. Garland Publishing Inc., New York
  7. Drmota M (1997) Systems of functional equations. Random Struct Algorithms 10(1–2):103–124
  8. Drmota M, Fusy É, Jué J, Kang M, Kraus V (2011) Asymptotic study of subcritical graph classes. SIAM J Discrete Math 25(4):1615–1651
  9. Evers DJ, Giegerich R (2001) Reducing the conformation space in RNA structure prediction. In: German conference on bioinformatics (GCB’01), pp 1–6
  10. Flajolet P, Odlyzko A (1990) Singularity analysis of generating functions. SIAM J Discrete Math 3(2):216–240
  11. Flajolet P, Sedgewick R (2009) Analytic combinatorics. Cambridge University Press, Cambridge
  12. Harer J, Zagier D (1986) The Euler characteristic of the moduli space of curves. Invent Math 85(3):457–485
  13. Haslinger C, Stadler PF (1999) RNA structures with pseudo-knots: graph-theoretical, combinatorial, and statistical properties. Bull Math Biol 61(3):437–467
  14. Henrici P (1991) Applied and computational complex analysis, vol 2. Wiley Classics Library, Wiley, New York
  15. Hofacker I (2003) Vienna RNA secondary structure server. Nucleic Acids Res 31(13):3429–3431
  16. Hofacker IL, Fontana W, Stadler PF, Bonhoeffer LS, Tacker M, Schuster P (1994) Fast folding and comparison of RNA secondary structures. Monatsh Chem 125:167–188
  17. Hofacker IL, Schuster P, Stadler PF (1998) Combinatorics of RNA secondary structures. Discr Appl Math 88:207–237
  18. Knudsen B, Hein J (2003) Pfold: RNA secondary structure prediction using stochastic context-free grammars. Nucleic Acids Res 31(13):3423–3428
  19. Li TJ, Reidys CM (2012) Combinatorics of RNA-RNA interaction. J Math Biol 64(3):529–556
  20. Lorenz WA, Ponty Y, Clote P (2008) Asymptotics of RNA shapes. J Comput Biol 15(1):31–63
  21. Lyngso RB, Pedersen CN (2000) RNA pseudoknot prediction in energy-based models. J Comput Biol 7(3–4):409–427
  22. Markham NR (2006) Algorithms and software for nucleic acid sequences. PhD, Rensselaer Polytechnic Institute, under the direction of M. Zuker
  23. Markham NR, Zuker M (2008) UNAFold: software for nucleic acid folding and hybridization. Methods Mol Biol 453:3–31
  24. Mathews DH, Sabina J, Zuker M, Turner DH (1999) Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure. J Mol Biol 288:911–940
  25. McCaskill JS (1990) The equilibrium partition function and base pair binding probabilities for RNA secondary structure. Biopolymers 29:1105–1119
  26. Meir A, Moon JW (1989) On an asymptotic method in enumeration. J Combin Theory Ser A 51:77–89
  27. Nussinov R, Jacobson AB (1980) Fast algorithm for predicting the secondary structure of single stranded RNA. Proc Natl Acad Sci USA 77(11):6309–6313
  28. Pemantle R, Wilson MC (2008) Twenty combinatorial examples of asymptotics derived from multivariate generating functions. SIAM Rev 50(2):199–272
  29. Rodland EA (2006) Pseudoknots in RNA secondary structures: representation, enumeration, and prevalence. J Comput Biol 13(6):1197–1213
  30. Saule C, Regnier M, Steyaert JM, Denise A (2011) Counting RNA pseudoknotted structures. J Comput Biol 18(10):1339–1351
  31. Sheikh S, Backofen R, Ponty Y (2012) Impact of the energy model on the complexity of RNA folding with pseudoknots. In: Lecture notes in computer science, vol 7354. 23rd Annual symposium on combinatorial pattern matching, CPM 2012, Helsinki, Finland, pp 321–333
  32. Steffen P, Giegerich R (2005) Versatile and declarative dynamic programming using pair algebras. BMC Bioinformatics 6:224
  33. Steffen P, Voss B, Rehmsmeier M, Reeder J, Giegerich R (2006) RNAshapes: an integrated RNA analysis package based on abstract shapes. Bioinformatics 22(4):500–503
  34. Stein PR, Waterman MS (1978) On some new sequences generalizing the Catalan and Motzkin numbers. Discrete Math 26:261–272
  35. Tabaska JE, Cary RE, Gabow HN, Stormo GD (1998) An RNA folding method capable of identifying pseudoknots and base triples. Bioinformatics 14:691–699
  36. Vernizzi G, Orland H, Zee A (2005) Enumeration of RNA structures by matrix models. Phys Rev Lett 94(16):168103
  37. Voss B, Giegerich R, Rehmsmeier M (2006) Complete probabilistic analysis of RNA shapes. BMC Biol 4(1):5–27
  38. Waterman MS (1978) Secondary structure of single-stranded nucleic acids. Stud Found Combin: Adv Math Supplem Stud 1:167–212
  39. Xia T, SantaLucia J Jr, Burkard ME, Kierzek R, Schroeder SJ, Jiao X, Cox C, Turner DH (1998) Thermodynamic parameters for an expanded nearest-neighbor model for formation of RNA duplexes with Watson-Crick base pairs. Biochemistry 37:14719–14735
  40. Yoffe AM, Prinsen P, Gelbart WM, Ben-Shaul A (2011) The ends of a large RNA molecule are necessarily close. Nucleic Acids Res 39(1):292–299
  41. Zuker M (1986) RNA folding prediction: the continued need for interaction between biologists and mathematicians. Lect Math Life Sci 17:87–124
  42. Zuker M (2003) Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res 31(13):3406–3415
  43. Zuker M, Stiegler P (1981) Optimal computer folding of large RNA sequences using thermodynamics and auxiliary information. Nucleic Acids Res 9:133–148
  44. Zuker M, Mathews DH, Turner DH (1999) Algorithms and thermodynamics for RNA secondary structure prediction: a practical guide. In: Barciszewski J, Clark BFC (eds) RNA biochemistry and biotechnology. NATO ASI series. Kluwer, Dordrecht, pp 11–43

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  1. Laboratoire d’Informatique (LIX), Ecole Polytechnique, Palaiseau, France
  2. Department of Biology, Boston College, Chestnut Hill, USA
