A Faster Algorithm for RNA Co-folding

  • Michal Ziv-Ukelson
  • Irit Gat-Viks
  • Ydo Wexler
  • Ron Shamir
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5251)

Abstract

The current pairwise RNA (secondary) structural alignment algorithms are based on Sankoff’s dynamic programming algorithm from 1985. Sankoff’s algorithm requires O(N^6) time and O(N^4) space, where N denotes the length of the compared sequences, and thus its applicability is very limited. The current literature offers many heuristics for speeding up Sankoff’s alignment process, some making restrictive assumptions on the length or the shape of the RNA substructures. We show how to speed up Sankoff’s algorithm in practice via non-heuristic methods, without compromising optimality. Our analysis shows that the expected time complexity of the new algorithm is O(N^4 ζ(N)), where ζ(N) converges to O(N), assuming a standard polymer folding model which was supported by experimental analysis. Hence our algorithm speeds up Sankoff’s algorithm by a linear factor on average. In simulations, our algorithm speeds up computation by a factor of 3-12 for sequences of length 25-250.
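
The "linear factor" claim can be checked with simple arithmetic on the growth terms quoted above. The following is only a rough sketch: the function names are hypothetical, all constant factors are ignored, and the value passed as zeta is an assumption supplied by the caller rather than anything computed by the authors' algorithm.

```python
def sankoff_cost(n: int) -> tuple[int, int]:
    """Return the (time, space) growth terms N^6 and N^4, constants ignored."""
    return n ** 6, n ** 4


def sparsified_time(n: int, zeta: float) -> float:
    """Return the N^4 * zeta(N) growth term for a caller-supplied zeta value."""
    return n ** 4 * zeta


def asymptotic_speedup(n: int, zeta: float) -> float:
    """Ratio of the two time terms: N^6 / (N^4 * zeta) = N^2 / zeta."""
    time, _space = sankoff_cost(n)
    return time / sparsified_time(n, zeta)


if __name__ == "__main__":
    # Assumption for illustration only: zeta(N) = N, i.e. a unit constant.
    for n in (25, 100, 250):
        print(n, asymptotic_speedup(n, float(n)))
```

Under the assumption zeta(N) = c·N the ratio is N/c, which grows linearly in N. The simulated speedups of 3-12 reported in the abstract are much smaller than N itself, so constant factors clearly matter and this sketch should not be read as a runtime prediction.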

Availability: Code and data sets are available upon request.

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Michal Ziv-Ukelson (1)
  • Irit Gat-Viks (2)
  • Ydo Wexler (3)
  • Ron Shamir (4)

  1. Computer Science Department, Ben Gurion University of the Negev, Beer-Sheva
  2. Computational Molecular Biology Department, Max Planck Institute for Molecular Genetics, Berlin, Germany
  3. Microsoft Research, Microsoft Corporation, Redmond
  4. School of Computer Science, Tel Aviv University
