Plant Systematics and Evolution, Volume 282, Issue 3–4, pp 127–149

A framework for phylogenetic sequence alignment

Review

Abstract

A phylogenetic alignment differs from other forms of multiple sequence alignment because it must align homologous features. Therefore, the goal of the alignment procedure should be to identify the events associated with the homologies, so that the aligned sequences accurately reflect those events. That is, an alignment is a set of hypotheses about historical events rather than about residues, and any alignment algorithm must be designed to identify and align such events. Some events (e.g., substitution) involve single residues, and our current algorithms can successfully align those events when sequence similarity is great enough. However, the other common events (such as duplication, translocation, deletion, insertion and inversion) can create complex sequence patterns that defeat such algorithms. There is therefore currently no computerized algorithm that can successfully align molecular sequences for phylogenetic analysis, except under restricted circumstances. Manual re-alignment of a preliminary alignment is thus the only feasible contemporary methodology, although it should be possible to automate such a procedure.
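The contrast drawn above between single-residue events and multi-residue events can be illustrated with a small sketch. It is not taken from the article: the function names (`nw_score`, `reverse_complement`), the toy sequence, and the scoring values are all assumptions chosen for illustration. The sketch scores a standard global (Needleman-Wunsch) alignment, which models only substitutions and gaps, and compares a sequence carrying one substitution with a sequence carrying one inversion.

```python
def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment score (Needleman-Wunsch) with a linear gap penalty.

    The scoring values are illustrative assumptions, not recommendations.
    """
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def reverse_complement(seq):
    """Reverse-complement a DNA string over the alphabet A, C, G, T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Hypothetical ancestral sequence, invented for this example.
ancestor = "ACGTTGCAAGGCTTAC"

# Event 1: a single substitution (G -> A at index 5).
substituted = ancestor[:5] + "A" + ancestor[6:]

# Event 2: an inversion of the segment at indices 4..11.
inverted = ancestor[:4] + reverse_complement(ancestor[4:12]) + ancestor[12:]

score_substitution = nw_score(ancestor, substituted)
score_inversion = nw_score(ancestor, inverted)

# Treating the inversion as one event (undoing it before aligning)
# recovers the ancestral sequence exactly.
restored = inverted[:4] + reverse_complement(inverted[4:12]) + inverted[12:]
score_restored = nw_score(ancestor, restored)

print(score_substitution, score_inversion, score_restored)
```

Both derived sequences differ from the ancestor by a single historical event, yet the residue-by-residue score penalises the inversion far more heavily than the substitution. Only when the inversion is recognised and reversed as a unit does the alignment reflect the one event that actually occurred, which is the kind of step that currently falls to manual re-alignment.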

Keywords

Molecular sequences, Sequence alignment, Phylogenetic analysis

Copyright information

© Springer-Verlag 2008

Authors and Affiliations

  1. Department of Parasitology (SWEPAR), National Veterinary Institute and Swedish University of Agricultural Sciences, Uppsala, Sweden
