Mixture Tree Construction and Its Applications

  • Grace S. C. Chen
  • Mingze Li
  • Michael Rosenberg
  • Bruce Lindsay

Abstract

Chen and Lindsay (Biometrika 93(4):843–860, 2006) developed a new method for building a gene tree from Single Nucleotide Polymorphism (SNP) data. The resulting tree, called the mixture tree, is based on an ancestral mixture model in which the sieve parameter plays the role of time in the evolutionary tree of the sequences. By varying the sieve parameter, one can create a hierarchical tree that estimates the population structure at each fixed point backward in time. In this chapter, we review the model and then apply it to the clustering of mitochondrial sequences to show that the approach performs well. We also introduce a simulator that mimics real SNP sequences with unknown ancestral history. Using the simulator, we compare the mixture trees with the true trees to evaluate how well the mixture tree method performs. Finally, we compare the mixture tree with existing methods, including the neighbor-joining and maximum parsimony methods.
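The abstract does not say which metric is used to compare an estimated tree with the true tree. One common choice, given here only as an illustrative sketch, is a Robinson and Foulds (1981) style distance that counts the clades present in one rooted tree but not in the other. The nested-tuple tree encoding and the function names `collect_clades` and `rf_distance` are assumptions made for this example and do not come from the chapter.

```python
def collect_clades(tree, out):
    """Return the leaf set under `tree` and record every internal clade in `out`.

    A tree is a nested tuple of children; anything that is not a tuple is a leaf
    label. (Hypothetical encoding chosen for this sketch.)
    """
    if not isinstance(tree, tuple):
        return frozenset([tree])
    leaves = frozenset().union(*[collect_clades(child, out) for child in tree])
    out.add(leaves)
    return leaves


def rf_distance(tree_a, tree_b):
    """Count clades found in exactly one of two rooted trees on the same leaves."""
    clades_a, clades_b = set(), set()
    leaves_a = collect_clades(tree_a, clades_a)
    leaves_b = collect_clades(tree_b, clades_b)
    if leaves_a != leaves_b:
        raise ValueError("trees must share the same leaf labels")
    # The root clade (all leaves) is in both trees and carries no information.
    clades_a.discard(leaves_a)
    clades_b.discard(leaves_b)
    return len(clades_a ^ clades_b)


true_tree = ((("a", "b"), "c"), ("d", "e"))
estimated_tree = ((("a", "c"), "b"), ("d", "e"))
distance = rf_distance(true_tree, estimated_tree)
```

A distance of zero means the two trees have the same topology. Larger values mean more clades disagree, so averaging the distance over many simulated data sets gives a rough accuracy score for a tree-building method.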

References

  1. Chen, S. C., & Lindsay, B. (2006). Building mixture trees from binary sequence data. Biometrika, 93(4), 843–860.
  2. Czelusniak, J., Goodman, M., Moncrief, N. D., & Kehoe, S. M. (1990). Maximum parsimony approach to construction of evolutionary trees from aligned homologous sequences. Methods in Enzymology, 183, 601–615.
  3. Edwards, A. W. F., & Cavalli-Sforza, L. L. (1963). The reconstruction of evolution. Annals of Human Genetics, 27, 105–106. (Also published in Heredity, 18, 553.)
  4. Edwards, A. W. F., & Cavalli-Sforza, L. L. (1964). Reconstruction of evolutionary trees. In V. H. Heywood & J. McNeill (Eds.), Phenetic and phylogenetic classification (Vol. 6, pp. 67–76). London: Systematics Association Publ.
  5. Felsenstein, J. (1981). Evolutionary trees from DNA sequences: A maximum likelihood approach. Journal of Molecular Evolution, 17, 368–376.
  6. Fisher, R. A. (1912). On an absolute criterion for fitting frequency curves. Messenger of Mathematics, 41, 155–160.
  7. Fisher, R. A. (1921). On the “probable error” of a coefficient of correlation deduced from a small sample. Metron, 1, 3–32.
  8. Fisher, R. A. (1922). On the mathematical foundations of theoretical statistics. Philosophical Transactions of the Royal Society of London A, 222, 309–368.
  9. Hudson, R. R. (2002). Generating samples under a Wright-Fisher neutral model of genetic variation. Bioinformatics, 18, 337–338.
  10. Huelsenbeck, J. P., & Ronquist, F. (2005). Bayesian analysis of molecular evolution using MrBayes. In R. Nielsen (Ed.), Statistical methods in molecular evolution. New York: Springer.
  11. Lindsay, B., Markatou, M., Ray, S., Kang, K., & Chen, S. C. (2008). Quadratic distances on probabilities: A unified foundation. The Annals of Statistics, 36(2), 983–1006.
  12. Nei, M., & Kumar, S. (2000). Molecular evolution and phylogenetics. New York: Oxford University Press.
  13. Nei, M., Kumar, S., & Takahashi, K. (1998). The optimization principle in phylogenetic analysis tends to give incorrect topologies when the number of nucleotides or amino acids used is small. Proceedings of the National Academy of Sciences of the United States of America, 95, 12390–12397.
  14. Penny, D., & Hendy, M. D. (1985). The use of tree comparison metrics. Systematic Zoology, 34, 75–82.
  15. Robinson, D. F., & Foulds, L. R. (1981). Comparison of phylogenetic trees. Mathematical Biosciences, 53, 131–147.
  16. Rzhetsky, A., & Nei, M. (1992). A simple method for estimating and testing minimum-evolution trees. Molecular Biology and Evolution, 9, 945–967.
  17. Saitou, N., & Nei, M. (1987). The neighbor-joining method: A new method for reconstructing phylogenetic trees. Molecular Biology and Evolution, 4, 406–425.
  18. Takahashi, K., & Nei, M. (2000). Efficiencies of fast algorithms of phylogenetic inference under the criteria of maximum parsimony, minimum evolution, and maximum likelihood when a large number of sequences are used. Molecular Biology and Evolution, 17, 1251–1258.
  19. Tamura, K., Dudley, J., Nei, M., & Kumar, S. (2007). MEGA4: Molecular evolutionary genetics analysis (MEGA) software version 4.0. Molecular Biology and Evolution, 24, 1596–1599.
  20. Wen, B., Li, H., Gao, S., et al. (2004). Genetic structure of Hmong-Mien speaking populations in East Asia as revealed by mtDNA lineages. Molecular Biology and Evolution, 22(3), 725–734.

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Grace S. C. Chen (1)
  • Mingze Li (1)
  • Michael Rosenberg (2)
  • Bruce Lindsay (3)

  1. School of Math & Stat, Arizona State University, Tempe, U.S.A.
  2. School of Life Sciences, Arizona State University, Tempe, U.S.A.
  3. Department of Statistics, Penn State University, University Park, U.S.A.
