Abstract
Multiple RNA structure alignment is particularly challenging because covarying mutations make sequence information alone insufficient. Many existing tools for multiple RNA alignment first generate pairwise RNA structure alignments and then build the multiple alignment using only the sequence information. Here we present PMFastR, an algorithm that iteratively uses a sequence-structure alignment procedure to build a multiple RNA structure alignment. PMFastR has low memory consumption, allowing for the alignment of large sequences such as 16S and 23S rRNA. The algorithm also provides a method to utilize a multi-core environment. Finally, we present results on benchmark data sets from BRAliBase, which show that PMFastR outperforms other state-of-the-art programs. Furthermore, we regenerate 607 Rfam seed alignments and show that our automated process creates multiple alignments similar to the manually curated Rfam seed alignments.
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
DeBlasio, D., Bruand, J., Zhang, S. (2009). PMFastR: A New Approach to Multiple RNA Structure Alignment. In: Salzberg, S.L., Warnow, T. (eds) Algorithms in Bioinformatics. WABI 2009. Lecture Notes in Computer Science, vol 5724. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04241-6_5
DOI: https://doi.org/10.1007/978-3-642-04241-6_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04240-9
Online ISBN: 978-3-642-04241-6
eBook Packages: Computer Science (R0)