Sparse RNA Folding: Time and Space Efficient Algorithms

Backofen, Rolf; Tsur, Dekel; Zakov, Shay; Ziv-Ukelson, Michal

doi:10.1007/978-3-642-02441-2_22

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5577)

Included in the following conference series:

Annual Symposium on Combinatorial Pattern Matching

Abstract

The currently fastest algorithm for RNA single strand folding requires O(nZ) time and O(n²) space, where n denotes the length of the input sequence and Z is a sparsity parameter that satisfies n ≤ Z ≤ n². We show how to reduce the space complexity of this algorithm. The space reduction is based on the observation that some solutions for subproblems are not examined after a certain stage of the algorithm and may be discarded from memory. This yields an O(nZ) time and O(Z) space algorithm that outputs both the cardinality of the optimal folding and a corresponding secondary structure. The space-efficient approach also extends to the related RNA simultaneous alignment with folding problem, where it reduces the space complexity of the fastest algorithm for this problem from O(n²m²) down to \(O(nm^2 + \tilde{Z})\), where n and m denote the lengths of the input sequences to be aligned and \(\tilde{Z}\) is a sparsity parameter that satisfies nm ≤ \(\tilde{Z}\) ≤ n²m².
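
To make the idea of discarding subproblem solutions concrete, here is a minimal sketch. It is not the paper's algorithm. It assumes the simplest scoring model (each Watson-Crick or G-U pair counts 1) and computes only the score, without the traceback that recovers a structure. The names `sparse_max_pairs`, `PAIRS`, and `min_loop` are hypothetical. The sketch keeps two rows of the dynamic programming table plus one stored value per "candidate" pair, meaning a pair (k, j) whose region k..j scores strictly better when k and j are paired with each other than under any other decomposition.

```python
from typing import List, Tuple

# Assumed scoring model: Watson-Crick and G-U wobble pairs, each worth 1.
PAIRS = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C"), ("G", "U"), ("U", "G")}


def sparse_max_pairs(seq: str, min_loop: int = 3) -> Tuple[int, int]:
    """Return (maximum number of base pairs, number of stored candidates)."""
    n = len(seq)
    # cand[j] holds (k, N(k, j)) for every candidate pair (k, j) found so far,
    # where N(i, j) is the best score for the region seq[i..j].
    cand: List[List[Tuple[int, int]]] = [[] for _ in range(n)]
    # Rows are shifted by one: row[j + 1] = N(i, j), and row[i] = 0 stands
    # for the empty region. prev is row i + 1, cur is row i.
    prev = [0] * (n + 1)
    for i in range(n - 1, -1, -1):
        cur = [0] * (n + 1)
        for j in range(i, n):
            # Option 1: position j is unpaired, giving N(i, j - 1).
            best_open = cur[j]
            # Option 2: split at a candidate k > i: N(i, k - 1) + N(k, j).
            for k, value in cand[j]:
                best_open = max(best_open, cur[k] + value)
            # Option 3: pair i with j around N(i + 1, j - 1).
            closed = -1
            if j - i > min_loop and (seq[i], seq[j]) in PAIRS:
                closed = prev[j] + 1
            if closed > best_open:
                # Pairing i with j is strictly better: (i, j) is a candidate
                # and its value must be kept for later rows.
                cur[j + 1] = closed
                cand[j].append((i, closed))
            else:
                cur[j + 1] = best_open
        # Row i + 1 is no longer needed once row i is complete.
        prev = cur
    return prev[n], sum(len(c) for c in cand)
```

For example, the first component of `sparse_max_pairs("GGGAAAUCC")` is 3. Memory use is two rows of length n + 1 plus the candidate lists, and the inner loop visits only candidates. Treating the candidate count as a stand-in for the sparsity parameter Z is an assumption of this sketch.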

In addition, we show how to speed up the base-pairing maximization variant of RNA single strand folding. The speedup is achieved by combining two independent existing techniques, each of which restricts the number of expressions that need to be examined in the bottleneck computations of these algorithms. This yields an O(LZ) time and O(Z) space algorithm, where L denotes the maximum cardinality of a folding of the input sequence.
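
As a quick check on how the O(LZ) bound relates to O(nZ) (a remark under the assumption that the cardinality of a folding is its number of base pairs, not a statement taken from the paper): each base takes part in at most one pair, so

\[
L \le \lfloor n/2 \rfloor \quad\Longrightarrow\quad LZ \le \tfrac{1}{2}\, nZ .
\]

The O(LZ) bound is therefore never asymptotically worse than O(nZ), and it is strictly better whenever L grows more slowly than n.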

Additional online supporting material may be found at:

http://www.cs.bgu.ac.il/zakovs/RNAfold/CPM09_supporting_material.pdf

Author information

Authors and Affiliations

Albert Ludwigs University, Freiburg, Germany
Rolf Backofen
Department of Computer Science, Ben-Gurion University of the Negev, Israel
Dekel Tsur, Shay Zakov & Michal Ziv-Ukelson

Editor information

Editors and Affiliations

LIFL - Bâtiment M3, 59655 Villeneuve d’Ascq Cédex, France
Gregory Kucherov
Department of Computer Science, University of Helsinki, Gustaf Hällströmin katu 2b, P.O. Box 68, FI-00014, Finland
Esko Ukkonen

About this paper

Cite this paper

Backofen, R., Tsur, D., Zakov, S., Ziv-Ukelson, M. (2009). Sparse RNA Folding: Time and Space Efficient Algorithms. In: Kucherov, G., Ukkonen, E. (eds) Combinatorial Pattern Matching. CPM 2009. Lecture Notes in Computer Science, vol 5577. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02441-2_22

DOI: https://doi.org/10.1007/978-3-642-02441-2_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02440-5
Online ISBN: 978-3-642-02441-2
eBook Packages: Computer Science, Computer Science (R0)