# Dimensions of Group-Based Phylogenetic Mixtures

## Abstract

Mixtures of group-based Markov models of evolution correspond to joins of toric varieties. In this paper, we establish a large number of cases in which these phylogenetic join varieties attain their expected dimension, meaning that they are nondefective. Nondefectiveness is interesting from a geometric point of view, and it has also been used to establish combinatorial identifiability for several classes of phylogenetic mixture models. Our focus is on group-based models where the equivalence classes of identified parameters are orbits of a subgroup of the automorphism group of the abelian group defining the model. In particular, we show that for these group-based models, the variety corresponding to the mixture of *r* trees with *n* leaves is nondefective when \(n \ge 2r+5\). We also give improved bounds for claw trees and give computational evidence that 2-tree and 3-tree mixtures are nondefective for small *n*.
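For orientation, the usual notion of expected dimension for a join can be written as follows. This is the standard projective convention, stated here as an assumption; the symbols \(X_1, \dots, X_r\) and \(N\) are introduced only for illustration and do not appear in the abstract. For projective varieties \(X_1, \dots, X_r \subseteq \mathbb{P}^N\) with join \(X_1 * \cdots * X_r\),

\[
\operatorname{expdim}(X_1 * \cdots * X_r) \;=\; \min\Bigl\{\, \sum_{i=1}^{r} \dim X_i + (r-1),\; N \Bigr\},
\]

and the join is nondefective when its actual dimension equals this value. As a quick reading of the bound \(n \ge 2r+5\): it covers 2-tree mixtures once \(n \ge 9\) and 3-tree mixtures once \(n \ge 11\).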

## Notes

### Acknowledgements

This work began at the 2016 AMS Mathematics Research Community on “Algebraic Statistics,” which was supported by the National Science Foundation under Grant number DMS-1321794. RD was supported by NSF DMS-1401591. EG was supported by NSF DMS-1620109. RW was supported by an NSF GRF under Grant number PGF-031543, NSF RTG Grant 0943832, and a Ford Foundation Dissertation Fellowship. HB was supported in part by a research assistantship funded by National Institutes of Health Grant R01 GM117590. PEH was partially supported by NSF Grant DMS-1620202.
