Network models for sequence evolution

Journal of Molecular Evolution

Abstract

We introduce a general class of models for sequence evolution that includes network phylogenies. Networks, a generalization of strictly treelike phylogenies, are proposed to model situations where multiple lineages contribute to the observed sequences. An algorithm to compute the probability distribution of binary character-state configurations is presented, and statistical inference for this model is developed in a likelihood framework. A stepwise procedure based on likelihood ratios is used to explore the space of models. Starting with a star phylogeny, new splits (nontrivial bipartitions of the sequence set) are successively added to the model until no significant change in the likelihood is observed. A novel feature of our approach is that the new splits are not necessarily constrained to be consistent with a treelike mode of evolution. The fraction of invariable sites is estimated by maximum likelihood simultaneously with the other model parameters and is essential to obtain a good fit to the data. The effect of finite sequence length on the inference methods is discussed. Finally, we provide an illustrative example using aligned VP1 genes from foot-and-mouth disease viruses (FMDV). The different serotypes of FMDV exhibit a range of treelike and network evolutionary relationships.
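The stepwise search described in the abstract can be pictured with a minimal sketch. It is not the authors' implementation: the function names, the greedy "add the best split" rule, and the fixed threshold of 3.84 (the 95% point of a chi-square distribution with one degree of freedom) are all assumptions made for illustration, and the log-likelihood function is a placeholder that a real analysis would replace with the model's actual likelihood. The sketch enumerates the nontrivial splits of a taxon set, checks whether two splits are compatible with a common tree, and adds splits to a star phylogeny until the likelihood-ratio statistic no longer exceeds the threshold.

```python
from itertools import combinations


def nontrivial_splits(taxa):
    """List every bipartition A|B of `taxa` with at least two taxa per side."""
    taxa = list(taxa)
    first, rest = taxa[0], taxa[1:]
    everything = frozenset(taxa)
    splits = []
    # Keep the first taxon on side A so each bipartition is listed only once.
    for k in range(1, len(rest)):
        for extra in combinations(rest, k):
            side_a = frozenset((first,) + extra)
            side_b = everything - side_a
            if len(side_b) >= 2:
                splits.append((side_a, side_b))
    return splits


def compatible(s, t):
    """Two splits fit on one tree iff some pair of their sides is disjoint."""
    (a, b), (c, d) = s, t
    return any(not (x & y) for x in (a, b) for y in (c, d))


def stepwise_add(candidates, loglik, threshold=3.84):
    """Greedy forward selection of splits by likelihood ratio.

    `loglik` is a hypothetical callable mapping a list of splits to a
    maximized log-likelihood; `threshold` is an assumed cutoff for
    2 * (gain in log-likelihood), not a value taken from the paper.
    """
    model = []                      # star phylogeny: no nontrivial splits
    current = loglik(model)
    remaining = list(candidates)
    while remaining:
        best = max(remaining, key=lambda s: loglik(model + [s]))
        best_ll = loglik(model + [best])
        if 2.0 * (best_ll - current) <= threshold:
            break                   # no significant improvement: stop
        model.append(best)
        remaining.remove(best)
        current = best_ll
    return model, current
```

Because candidate splits are not filtered by `compatible`, the selected set may contain pairwise incompatible splits, which is the situation a network rather than a tree would represent. For four taxa, for example, the splits ab|cd and ac|bd can both be selected even though no single tree displays both.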

Correspondence to: A. von Haeseler

Cite this article

von Haeseler, A., Churchill, G.A. Network models for sequence evolution. J Mol Evol 37, 77–85 (1993). https://doi.org/10.1007/BF00170465

  • DOI: https://doi.org/10.1007/BF00170465
