Homology in coding and non-coding DNA sequences: a parsimony perspective

  • Original Article
Plant Systematics and Evolution

Abstract

Putative synapomorphy assessment (primary homology assessment) differs between DNA strings that have a codon structure (hereafter, coding DNA) and those that lack it (hereafter, non-coding DNA). In coding DNA it requires identifying a reading frame and, usually, a small number of in-frame insertions and deletions. In non-coding DNA, where length variation is much more common, putative synapomorphy assessment is considerably less straightforward and depends heavily on the alignment method. Given the existence of evolutionary constraints, alignments that consider patterns associated with specific putative evolutionary events are favored. Once the sequences have been aligned, the postulated putative evolutionary events need to be coded as an additional step. For the alignments and the alignment coding to be falsifiable, both should be carried out using justified and explicitly formulated criteria. Alternative coding methods for the most common patterns present in alignments of non-coding DNA are discussed here. Simpler putative synapomorphy assessment will not always yield more reliable phylogenetic information, because simplicity does not necessarily correlate with the degree of homoplasy. The use of non-coding DNA can result in more laborious coding, but at the same time in more corroborated hypotheses, mirroring their accuracy for phylogenetic inference.
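To make the coding step concrete, the following is a minimal sketch of one presence/absence scheme for gaps in an already aligned matrix, along the lines of what is often called simple indel coding. It is an illustration under stated assumptions, not necessarily any of the coding methods compared in the article: the function names `gap_runs` and `code_indels` and the toy alignment are hypothetical, every gap with distinct start and end columns becomes one character, a sequence whose longer gap spans that character is scored as inapplicable (`?`), and terminal gaps are treated like any other gap for brevity.

```python
from typing import List, Tuple

Gap = Tuple[int, int]  # (first column, last column), inclusive, 0-based


def gap_runs(seq: str) -> List[Gap]:
    """Return every maximal run of '-' in one aligned sequence."""
    runs: List[Gap] = []
    start = None
    for i, ch in enumerate(seq):
        if ch == "-":
            if start is None:
                start = i
        elif start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:  # gap reaching the end of the sequence
        runs.append((start, len(seq) - 1))
    return runs


def code_indels(alignment: List[str]) -> Tuple[List[Gap], List[str]]:
    """Code each distinct gap as a presence/absence character.

    '1' = the sequence has exactly this gap,
    '?' = a longer gap in the sequence spans it (scored inapplicable),
    '0' = otherwise.
    """
    per_seq = [gap_runs(s) for s in alignment]
    characters = sorted({run for runs in per_seq for run in runs})
    rows = []
    for runs in per_seq:
        row = []
        for cs, ce in characters:
            if (cs, ce) in runs:
                row.append("1")
            elif any(s <= cs and ce <= e for s, e in runs):
                row.append("?")
            else:
                row.append("0")
        rows.append("".join(row))
    return characters, rows


# Hypothetical toy alignment, for illustration only.
toy = ["ACGTACGT", "AC--ACGT", "AC----GT", "ACGTA-GT"]
chars, matrix = code_indels(toy)
for seq, row in zip(toy, matrix):
    print(seq, row)
```

The resulting columns would be appended to the nucleotide matrix as additional characters. Writing the rule down this explicitly is what makes the coding open to criticism: anyone can see which gaps were treated as the same putative event and which were scored as inapplicable.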

Acknowledgments

I would like to thank Thomas Borsch for the invitation to present these ideas at the 17th International Symposium on Biodiversity and Evolutionary Biology, and especially for the thorough revision and important comments that greatly improved the manuscript. I am grateful to the German Academic Exchange Service (DAAD) for the financial support that made it possible for me to participate in the symposium in Bonn. I greatly appreciate the comments and suggestions by Donovan Bailey, Mark Simmons, David A. Morrison and the two reviewers, Jerry Davis and Kai Müller.

Author information

Correspondence to Helga Ochoterena.

About this article

Ochoterena, H. Homology in coding and non-coding DNA sequences: a parsimony perspective. Plant Syst Evol 282, 151–168 (2009). https://doi.org/10.1007/s00606-008-0095-y

  • DOI: https://doi.org/10.1007/s00606-008-0095-y
