Abstract
In this paper we analyze the length spectrum of blocks in \(\gamma \)-structures, a class of RNA pseudoknot structures that plays a key role in polynomial-time RNA folding. A \(\gamma \)-structure is constructed by nesting and concatenating building components of topological genus at most \(\gamma \). A block is a substructure enclosed by crossing arcs that are maximal with respect to the partial order induced by nesting. We show that, in uniformly generated \(\gamma \)-structures of length \(n\), there is a significant gap in this length spectrum: asymptotically almost surely there exists a unique longest block of length at least \(n-O(n^{1/2})\), and with high probability every other block has finite length. For fixed \(\gamma \), we prove that the length of the complement of the longest block converges to a discrete limit law, and that the distribution of short blocks of given length tends to a negative binomial distribution in the limit of long sequences. We refine this analysis to the length spectrum of blocks of specific pseudoknot types, such as H-types and kissing hairpins. Our results generalize the rainbow spectrum of secondary structures obtained by the first and third authors, and we put them into the context of the structural prediction of long non-coding RNAs.
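The block definition above can be made concrete with a small sketch. It is not the authors' method, only one reading of the definition under stated assumptions: a structure is given as a list of arcs \((i,j)\) with distinct endpoints, maximal arcs that cross (directly or through a chain of crossings) are grouped into one block, and the length of a block is taken to be the number of backbone positions it spans. The names `maximal_arcs` and `block_spectrum` are hypothetical.

```python
from typing import List, Tuple

Arc = Tuple[int, int]  # (i, j) with i < j; all endpoints assumed distinct


def maximal_arcs(arcs: List[Arc]) -> List[Arc]:
    """Arcs that are not nested inside any other arc, sorted by left endpoint."""
    return sorted(
        a for a in arcs
        if not any(b[0] < a[0] and a[1] < b[1] for b in arcs)
    )


def block_spectrum(arcs: List[Arc]) -> List[Tuple[int, int, int]]:
    """Return (start, end, length) for each block, ordered along the backbone.

    Assumption: length = end - start + 1, the number of positions spanned.
    """
    blocks: List[List[int]] = []
    for i, j in maximal_arcs(arcs):
        if blocks and i < blocks[-1][1]:
            # Two maximal arcs cannot nest, so an arc that starts inside
            # the current block must cross an arc of that block.
            blocks[-1][1] = max(blocks[-1][1], j)
        else:
            # The arc starts after the current block ends: a new block.
            blocks.append([i, j])
    return [(s, e, e - s + 1) for s, e in blocks]


# Two crossing stacks (an H-type-like pattern) followed by a hairpin arc.
example = [(1, 6), (2, 5), (3, 10), (4, 9), (12, 15)]
print(block_spectrum(example))
```

For a secondary structure (no crossings) every maximal arc forms its own block, so under these assumptions the sketch reduces to listing rainbow lengths.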
References
Andersen JE, Huang FWD, Penner RC, Reidys CM (2012) Topology of RNA-interaction structures. J Comput Biol 19:928–943
Andersen JE, Penner RC, Reidys CM, Waterman MS (2013) Topological classification and enumeration of RNA structures by genus. J Math Biol 67(5):1261–1278
Backofen R, Tsur D, Zakov S, Ziv-Ukelson M (2011) Sparse RNA folding: time and space efficient algorithms. J Discrete Algorithms 9(1):12–31
Bon M, Vernizzi G, Orland H, Zee A (2008) Topological classification of RNA structures. J Mol Biol 379:900–911
Chen J, Blasco M, Greider C (2000) Secondary structure of vertebrate telomerase RNA. Cell 100(5):503–514
Clote P, Ponty Y, Steyaert JM (2012) Expected distance between terminal nucleotides of RNA secondary structures. J Math Biol 65(3):581–599
Eddy SR (2001) Non-coding RNA genes and the modern RNA world. Nat Rev Genet 2(12):919–929
Flajolet P, Sedgewick R (2009) Analytic combinatorics. Cambridge University Press, New York
Graham RL, Knuth DE, Patashnik O (1994) Concrete mathematics: a foundation for computer science, 2nd edn. Addison-Wesley Professional, Reading
Han HSW, Reidys CM (2012) The \(5^{\prime }\)–\(3^{\prime }\) distance of RNA secondary structures. J Comput Biol 19(7):868–878
Han HSW, Li TJX, Reidys CM (2014) Combinatorics of \(\gamma \)-structures. J Comput Biol 21:591–608
Harer J, Zagier D (1986) The Euler characteristic of the moduli space of curves. Invent Math 85:457–485
Howell J, Smith T, Waterman M (1980) Computation of generating functions for biological molecules. SIAM J Appl Math 39(1):119–133
Huang FWD, Reidys CM (2015) Shapes of topological RNA structures. Math Biosci 270(Part 4):57–65
Hunter C, Sanders J (1990) The nature of \(\pi \)–\(\pi \) interactions. J Am Chem Soc 112(14):5525–5534
Iyer MK, Niknafs YS, Malik R et al (2015) The landscape of long noncoding RNAs in the human transcriptome. Nat Genet 47(3):199–208
Konings DA, Gutell RR (1995) A comparison of thermodynamic foldings with comparatively derived structures of 16S and 16S-like rRNAs. RNA 1(6):559–574
Kruger K, Grabowski PJ, Zaug AJ, Sands J, Gottschling DE, Cech TR (1982) Self-splicing RNA: autoexcision and autocyclization of the ribosomal RNA intervening sequence of Tetrahymena. Cell 31(1):147–157
Li TJX (2014) Combinatorics of shapes, topological RNA structures and RNA–RNA interactions. Ph.D. Thesis, University of Southern Denmark
Li TJX, Reidys CM (2011) Combinatorial analysis of interacting RNA molecules. Math Biosci 233:47–58
Li TJX, Reidys CM (2013) The topological filtration of \(\gamma \)-structures. Math Biosci 241:24–33
Li TJX, Reidys CM (2017) Statistics of topological RNA structures. J Math Biol 74:1793–1821
Li TJX, Reidys CM (2018) The rainbow spectrum of RNA secondary structures. Bull Math Biol 80(6):1514–1538
Loebl M, Moffatt I (2008) The chromatic polynomial of fatgraphs and its categorification. Adv Math 217:1558–1587
Loria A, Pan T (1996) Domain structure of the ribozyme from eubacterial ribonuclease P. RNA 2(6):551–563
Massey W (1967) Algebraic topology: an introduction. Springer-Verlag, New York
McCarthy BJ, Holland JJ (1965) Denatured DNA as a direct template for in vitro protein synthesis. Proc Natl Acad Sci USA 54(3):880–886
Möhl M, Salari R, Will S, Backofen R, Sahinalp SC (2010) Sparsification of RNA structure prediction including pseudoknots. Algorithms Mol Biol 5:39
Orland H, Zee A (2002) RNA folding and large \(n\) matrix theory. Nucl Phys B 620:456–476
Penner R (2004) Cell decomposition and compactification of Riemann’s moduli space in decorated Teichmüller theory. In: Tongring N, Penner R (eds) Woods hole mathematics-perspectives in math and physics. World Scientific, Singapore, pp 263–301
Penner R, Waterman M (1993) Spaces of RNA secondary structures. Adv Math 101:31–49
Penner RC, Knudsen M, Wiuf C, Andersen JE (2010) Fatgraph models of proteins. Commun Pure Appl Math 63(10):1249–1297
Reeder J, Steffen P, Giegerich R (2007) pknotsRG: RNA pseudoknot folding including near-optimal structures and sliding windows. Nucleic Acids Res 35(Web Server issue):W320–W324
Reidys CM, Wang RR, Zhao AYY (2010) Modular, k-noncrossing diagrams. Electron J Comb 17(1):R76
Reidys CM, Huang FWD, Andersen JE, Penner RC, Stadler PF, Nebel ME (2011) Topology and prediction of RNA pseudoknots. Bioinformatics 27:1076–1085
Salari R, Möhl M, Will S, Sahinalp SC, Backofen R (2010) Time and space efficient RNA–RNA interaction prediction via sparse folding. In: Berger B (ed) Research in computational molecular biology, no. 6044 lecture notes in computer science. Springer, Berlin, pp 473–490
Schmitt W, Waterman M (1994) Linear trees and RNA secondary structure. Discrete Appl Math 51:317–323
Smith TF, Waterman MS (1978) RNA secondary structure. Math Biosci 42:257–266
Šponer J, Leszczynski J, Hobza P (2001) Electronic properties, hydrogen bonding, stacking, and cation binding of DNA and RNA bases. Biopolymers 61(1):3–31
Šponer J, Šponer JE, Mládek A, Jurečka P, Banáš P, Otyepka M (2013) Nature and magnitude of aromatic base stacking in DNA and RNA: quantum chemistry, molecular mechanics, and experiment. Biopolymers 99(12):978–988
Staple DW, Butcher SE (2005) Pseudoknots: RNA structures with diverse functions. PLOS Biol 3(6):e213
Stein P, Waterman M (1979) On some new sequences generalizing the Catalan and Motzkin numbers. Discrete Math 26(3):261–272
Tsukiji S, Pattnaik SB, Suga H (2003) An alcohol dehydrogenase ribozyme. Nat Struct Biol 10(9):713–717
Tuerk C, MacDougal S, Gold L (1992) RNA pseudoknots that inhibit human immunodeficiency virus type 1 reverse transcriptase. Proc Natl Acad Sci USA 89(15):6988–6992
Vernizzi G, Orland H, Zee A (2005) Enumeration of RNA structures by matrix models. Phys Rev Lett 94(16):168103
Waterman M (1978) Secondary structure of single-stranded nucleic acids. In: Rota GC (ed) Studies on foundations and combinatorics, advances in mathematics supplementary studies, vol 1. Academic Press, Cambridge, pp 167–212
Waterman M (1979) Combinatorics of RNA hairpins and cloverleaves. Stud Appl Math 60(2):91–98
Westhof E, Jaeger L (1992) RNA pseudoknots. Curr Opin Struct Biol 2:327–333
Wexler Y, Zilberstein C, Ziv-Ukelson M (2007) A study of accessible motifs and RNA folding complexity. J Comput Biol 14(6):856–872
Yoffe AM, Prinsen P, Gelbart WM, Ben-Shaul A (2011) The ends of a large RNA molecule are necessarily close. Nucleic Acids Res 39(1):292–299
Acknowledgements
We would like to thank the reviewers for their comments and suggestions. We also thank Dr. Christopher Barrett, Executive Director of the Biocomplexity Institute and Initiative, for stimulating discussions. Christian M. Reidys is a Thermo Fisher Scientific Fellow in Advanced Systems for Information Biology and acknowledges their support of this work.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Li, T.J.X., Burris, C.S. & Reidys, C.M. The block spectrum of RNA pseudoknot structures. J. Math. Biol. 79, 791–822 (2019). https://doi.org/10.1007/s00285-019-01379-8
DOI: https://doi.org/10.1007/s00285-019-01379-8