A framework for phylogenetic sequence alignment

  • Review
  • Plant Systematics and Evolution

Abstract

A phylogenetic alignment differs from other forms of multiple sequence alignment because it must align homologous features. Therefore, the goal of the alignment procedure should be to identify the events associated with the homologies, so that the aligned sequences accurately reflect those events. That is, an alignment is a set of hypotheses about historical events rather than about residues, and any alignment algorithm must be designed to identify and align such events. Some events (e.g., substitution) involve single residues, and our current algorithms can successfully align those events when sequence similarity is great enough. However, the other common events (such as duplication, translocation, deletion, insertion and inversion) can create complex sequence patterns that defeat such algorithms. There is therefore currently no computerized algorithm that can successfully align molecular sequences for phylogenetic analysis, except under restricted circumstances. Manual re-alignment of a preliminary alignment is thus the only feasible contemporary methodology, although it should be possible to automate such a procedure.
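The point about multi-residue events can be illustrated with a small sketch. It is not the author's method: the sequences, the inversion coordinates, and the scoring values (match, mismatch, gap) are hypothetical, and the scorer is a plain Needleman-Wunsch global alignment used only as a stand-in for residue-by-residue algorithms. It shows that a single inversion event lowers the residue-level score, while treating the inversion as one event (undoing it) restores the ancestral sequence exactly.

```python
def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment score (Needleman-Wunsch, linear gap cost).

    The scoring values are illustrative assumptions, not taken from the article.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def reverse_complement(seq):
    """Reverse complement of a DNA string over the alphabet A, C, G, T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def invert(seq, start, end):
    """Apply one inversion event to seq[start:end] (hypothetical helper)."""
    return seq[:start] + reverse_complement(seq[start:end]) + seq[end:]


# Hypothetical ancestor and a descendant that differs by ONE event.
ancestor = "ACGTTGCAAGGCTTACCGAT"
descendant = invert(ancestor, 5, 15)

# Residue-level view: the inversion is scored as many independent changes.
residue_score = nw_score(ancestor, descendant)
identity_score = nw_score(ancestor, ancestor)

# Event-level view: undoing the single inversion recovers the ancestor.
restored = invert(descendant, 5, 15)

print(residue_score, identity_score, restored == ancestor)
```

The residue-level score falls below the score for identical sequences even though only one historical event separates the two sequences, which is the kind of pattern the abstract describes as defeating algorithms built around single-residue events.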


Author information

Corresponding author

Correspondence to David A. Morrison.

About this article

Cite this article

Morrison, D.A. A framework for phylogenetic sequence alignment. Plant Syst Evol 282, 127–149 (2009). https://doi.org/10.1007/s00606-008-0072-5

  • DOI: https://doi.org/10.1007/s00606-008-0072-5
