Abstract
In recent years, new functions of RNA have been discovered, renewing the need for computational tools for RNA analysis. In this respect, multiple sequence alignment is an essential step in finding structurally conserved regions in related RNA sequences. In contrast to proteins, many classes of functionally related RNA molecules show rather weak sequence conservation but a fairly well-conserved secondary structure. Hence, any method that relates RNA sequences in the form of multiple alignments should take structural features into account, as recent studies have verified.
Progress has been made in developing new structural alignment algorithms; however, current methods are either computationally costly or not accurate enough to serve as everyday tools. In this paper we present a fast, practical, and accurate method for computing multiple structural RNA alignments. The method combines a new pairwise structural alignment method with the popular program T-Coffee. Our pairwise method is based on an integer linear programming (ILP) formulation resulting from a graph-theoretic reformulation of the structural alignment problem. We find provably optimal or near-optimal solutions of the ILP with a Lagrangian approach. Tests on a recently published benchmark set show that our Lagrangian approach outperforms current programs in alignment quality and in the length of the sequences it can align.
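The phrase "provably optimal or near-optimal" reflects a general property of Lagrangian relaxation: the relaxed problem yields an upper bound, any feasible solution yields a lower bound, and the gap between them certifies solution quality. The sketch below illustrates this on a toy 0-1 knapsack problem, not on the structural alignment ILP, whose formulation is not given in the abstract. The function name `lagrangian_bounds`, the step-size rule, and the toy data are all assumptions made for illustration.

```python
def lagrangian_bounds(c, a, b, iterations=200, step0=1.0):
    """Bound  max c.x  s.t.  a.x <= b,  x in {0,1}^n  (toy problem, assumed).

    The constraint a.x <= b is moved into the objective with a multiplier
    lam >= 0. Each value of the relaxed problem is an upper bound on the
    optimum; each feasible x gives a lower bound.
    """
    lam = 0.0
    best_upper = float("inf")
    best_lower, best_x = 0.0, [0] * len(c)  # x = 0 is feasible when b >= 0

    for k in range(1, iterations + 1):
        # The relaxed problem separates per variable:
        # set x_i = 1 exactly when its reduced profit is positive.
        x = [1 if ci - lam * ai > 0 else 0 for ci, ai in zip(c, a)]
        relaxed = sum((ci - lam * ai) * xi for ci, ai, xi in zip(c, a, x)) + lam * b
        best_upper = min(best_upper, relaxed)

        # If the relaxed solution happens to be feasible, it is a lower bound.
        usage = sum(ai * xi for ai, xi in zip(a, x))
        if usage <= b:
            primal = sum(ci * xi for ci, xi in zip(c, x))
            if primal > best_lower:
                best_lower, best_x = primal, x

        # Subgradient step on the dual (a minimization over lam >= 0),
        # with a diminishing step size step0 / k.
        lam = max(0.0, lam - (step0 / k) * (b - usage))

    return best_upper, best_lower, best_x


# Toy data (hypothetical): profits, weights, capacity.
upper, lower, x = lagrangian_bounds([10, 7, 4], [5, 4, 3], 8)
print(f"lower bound {lower}, upper bound {upper:.3f}, solution {x}")
```

When the two bounds coincide, the feasible solution is provably optimal; otherwise the gap limits how far it can be from the optimum.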
Supported by the DFG Research Center Matheon “Mathematics for key technologies” in Berlin, the German Federal Ministry of Education and Research (grant no. 0312705A, ‘Berlin Center for Genome Based Bioinformatics’), and the IMPRS for Computational Biology and Scientific Computing.
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bauer, M., Klau, G.W., Reinert, K. (2005). Fast and Accurate Structural RNA Alignment by Progressive Lagrangian Optimization. In: Berthold, M.R., Glen, R.C., Diederichs, K., Kohlbacher, O., Fischer, I. (eds) Computational Life Sciences. CompLife 2005. Lecture Notes in Computer Science, vol 3695. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11560500_20
DOI: https://doi.org/10.1007/11560500_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29104-6
Online ISBN: 978-3-540-31726-5
eBook Packages: Computer Science, Computer Science (R0)