Abstract
The classical algorithm for RNA single strand folding requires O(n Z) time and O(n 2) space, where n denotes the length of the input sequence and Z is a sparsity parameter that satisfies n ≤ Z ≤ n 2. We show how to reduce the space complexity of this algorithm. The space reduction is based on the observation that some solutions for subproblems are not examined after a certain stage of the algorithm, and may be discarded from memory. This yields an O(nZ) time and O(Z) space algorithm, that outputs both the cardinality of the optimal folding as well as a corresponding secondary structure. The space-efficient approach also extends to the related RNA simultaneous alignment with folding problem, and can be applied to reduce the space complexity of the fastest algorithm for this problem from O(n 2 m 2) down to \(O(nm^2 + \tilde{Z})\), where n and m denote the lengths of the input sequences to be aligned, and \(\tilde{Z}\) is a sparsity parameter that satisfies n m ≤ \(\tilde{Z}\) ≤ n 2 m 2.
In addition, we also show how to speed up the base-pairing maximization variant of RNA single strand folding. The speed up is achieved by combining two independent existing techniques, which restrict the number of expressions that need to be examined in bottleneck computations of these algorithms. This yields an O(LZ) time and O(Z) space algorithm, where L denotes the maximum cardinality of a folding of the input sequence.
Additional online supporting material may be found at:
http://www.cs.bgu.ac.il/zakovs/RNAfold/CPM09_supporting_material.pdf
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
References
Consortium, A.F.B., Backofen, R., Bernhart, S.H., Flamm, C., Fried, C., Fritzsch, G., Hackermuller, J., Hertel, J., Hofacker, I.L., Missal, K., Mosig, A., Prohaska, S.J., Rose, D., Stadler, P.F., Tanzer, A., Washietl, S., Will, S.: RNAs everywhere: genome-wide annotation of structured RNAs. Journal of Experimental Zoology Part B: Molecular and Developmental Evolution 308(1), 1–25 (2007)
Zuker, M.: Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Research (13), 3406–3415 (2003)
Hofacker, I.L.: Vienna RNA secondary structure server. Nucleic Acids Research (13), 3429–3431 (2003)
Zuker, M.: Computer prediction of RNA structure. Methods Enzymol. 180, 262–288 (1989)
Tinoco, I., Borer, P., Dengler, B., Levine, M., Uhlenbeck, O., Crothers, D., Gralla, J.: Improved estimation of secondary structure in ribonucleic acids. Nature New Biology 246, 40–41 (1973)
Waterman, M., Smith, T.: RNA secondary structure: a complete mathematical analysis. Mathematical Biosciences 42, 257–266 (1978)
Nussinov, R., Jacobson, A.B.: Fast algorithm for predicting the secondary structure of single-stranded RNA. PNAS 77(11), 6309–6313 (1980)
Zuker, M., Stiegler, P.: Optimal computer folding of large RNA sequences using thermodynamics and auxiliary information. Nucleic Acids Research 9(1), 133–148 (1981)
Akutsu, T.: Approximation and exact algorithms for RNA secondary structure prediction and recognition of stochastic context-free languages. Journal of Combinatorial Optimization 3, 321–336 (1999)
Wexler, Y., Zilberstein, C., Ziv-Ukelson, M.: A study of accessible motifs and RNA folding complexity. Journal of Computational Biology 14(6), 856–872 (2007)
Chan, T.M.: More algorithms for all-pairs shortest paths in weighted graphs. In: Proc. 39th Symposium on the Theory of Computing (STOC), pp. 590–598 (2007)
Sankoff, D.: Simultaneous solution of the RNA folding, alignment and protosequence problems. SIAM Journal on Applied Mathematics 45(5), 810–825 (1985)
Mathews, D.H., Turner, D.H.: Dynalign: an algorithm for finding the secondary structure common to two RNA sequences. Journal of Molecular Biology 317(2), 191–203 (2002)
Havgaard, J., Lyngso, R., Stormo, G., Gorodkin, J.: Pairwise local structural alignment of RNA sequences with sequence similarity less than 40%. Bioinformatics 21(9), 1815–1824 (2005)
Ziv-Ukelson, M., Gat-Viks, I., Wexler, Y., Shamir, R.: A faster algorithm for RNA co-folding, pp. 174–185 (2008)
Will, S., Reiche, K., Hofacker, I.L., Stadler, P.F., Backofen, R.: Inferring non-coding RNA families and classes by means of genome-scale structure-based clustering. PLOS Computational Biology 3(4), e65 (2007)
Gardner, P.P., Giegerich, R.: A comprehensive comparison of comparative RNA structure prediction approaches. BMC Bioinformatics 5, 140 (2004)
Jansson, J., Ng, S.K., Sung, W.K., Willy, H.: A faster and more space-efficient algorithm for inferring arc-annotations of RNA sequences through alignment. Algorithmica 46(2), 223–245 (2006)
Durbin, R., Eddy, S., Krogh, A., Mitchison, G.: Biological sequence analysis: Probabilistic models of proteins and nucleic acids. Cambridge University Press, Cambridge (1998)
Hirschberg, D.S.: A linear space algorithm for computing maximal common subsequences. Communications of the ACM 18(6), 341–343 (1975)
Hirschberg, D.S.: Algorithms for the longest common subsequence problem. JACM 24, 664–675 (1977)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Backofen, R., Tsur, D., Zakov, S., Ziv-Ukelson, M. (2009). Sparse RNA Folding: Time and Space Efficient Algorithms. In: Kucherov, G., Ukkonen, E. (eds) Combinatorial Pattern Matching. CPM 2009. Lecture Notes in Computer Science, vol 5577. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02441-2_22
Download citation
DOI: https://doi.org/10.1007/978-3-642-02441-2_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02440-5
Online ISBN: 978-3-642-02441-2
eBook Packages: Computer ScienceComputer Science (R0)