Consensus Folding of Unaligned RNA Sequences Revisited
As one of the earliest problems in computational biology, RNA secondary structure prediction (sometimes referred to as “RNA folding”) has attracted renewed attention thanks to the recent discovery of many novel non-coding RNA molecules. The two common approaches to this problem are de novo prediction of RNA secondary structure based on energy minimization and “consensus folding” (computing the common secondary structure for a set of unaligned RNA sequences). Consensus folding algorithms work well when the correct seed alignment is part of the input. However, computing a seed alignment is itself a challenging problem for diverged RNA families.
In this paper, we propose a novel framework to predict the common secondary structure for unaligned RNA sequences. By matching putative stacks in RNA sequences, we use primary sequence information and thermodynamic stability simultaneously for prediction. We show that our method can predict the correct common RNA secondary structures even when given only a limited number of unaligned RNA sequences, and that it outperforms current algorithms in sensitivity and accuracy.
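To make the notion of a putative stack concrete, the following is a minimal sketch that enumerates maximal runs of consecutive complementary base pairs in a single sequence. It is an illustration only, not the procedure used in the paper: the function name `putative_stacks`, the parameters `min_len` and `min_loop`, and the choice of allowed pairs (Watson-Crick plus G-U wobble) are all assumptions, and the sketch neither scores thermodynamic stability nor matches stacks across sequences.

```python
# Illustrative sketch only; names and thresholds are assumptions, not the paper's.

# Allowed base pairs: Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def putative_stacks(seq, min_len=3, min_loop=3):
    """Return maximal stacks as (i, j, length), 0-based.

    A stack pairs positions i, i+1, ..., i+length-1 with
    j, j-1, ..., j-length+1, leaving at least `min_loop`
    unpaired bases between the innermost pair.
    """
    n = len(seq)
    stacks = []
    for i in range(n):
        for j in range(i + min_loop + 1, n):
            if (seq[i], seq[j]) not in PAIRS:
                continue
            # Skip pairs that extend outward: they belong to a longer
            # stack that starts at (i-1, j+1).
            if i > 0 and j + 1 < n and (seq[i - 1], seq[j + 1]) in PAIRS:
                continue
            # Extend inward while bases pair and the loop stays long enough.
            length = 0
            while (i + length < j - length - min_loop
                   and (seq[i + length], seq[j - length]) in PAIRS):
                length += 1
            if length >= min_len:
                stacks.append((i, j, length))
    return stacks


if __name__ == "__main__":
    # A small hairpin: three G-C pairs closing a four-base loop.
    print(putative_stacks("GGGAAAACCC"))
```

A consensus method in the spirit of the abstract would then compare such candidate stacks across the input sequences; how that matching and the stability scoring are done is not specified here.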
Keywords: Secondary Structure, Consensus Structure, Rfam Database, Unpaired Basis