Abstract
Principal component analysis is a widely used method for the dimensionality reduction of a given data set in a high-dimensional Euclidean space. Here we define and analyze two analogues of principal component analysis in the setting of tropical geometry. In one approach, we study the Stiefel tropical linear space of fixed dimension closest to the data points in the tropical projective torus; in the other approach, we consider the tropical polytope with a fixed number of vertices closest to the data points. We then give approximate algorithms for both approaches and apply them to phylogenetics, testing the methods on simulated phylogenetic data and on an empirical dataset of Apicomplexa genomes.
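To make "closest to the data points" concrete for the second approach, here is a minimal sketch under stated assumptions, not the authors' implementation. It assumes the max-plus convention, the standard tropical metric d_tr(u, v) = max_i(u_i - v_i) - min_i(u_i - v_i) on the tropical projective torus, and the usual closed-form projection onto a tropical polytope. All function names are hypothetical and are not taken from the software linked in the Notes.

```python
from typing import List, Sequence


def tropical_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Tropical metric: max_i(u_i - v_i) - min_i(u_i - v_i).

    Invariant under adding a constant to all coordinates of u or v,
    so it is well defined on the tropical projective torus.
    """
    diffs = [a - b for a, b in zip(u, v)]
    return max(diffs) - min(diffs)


def project_onto_tropical_polytope(
    u: Sequence[float], vertices: Sequence[Sequence[float]]
) -> List[float]:
    """Project u onto the max-plus tropical convex hull of `vertices`.

    Assumed convention (max-plus): for each vertex v_k take
    lambda_k = min_i(u_i - v_k[i]); the projection is the coordinatewise
    maximum over k of (lambda_k + v_k).
    """
    lambdas = [min(a - b for a, b in zip(u, v)) for v in vertices]
    return [
        max(lam + v[i] for lam, v in zip(lambdas, vertices))
        for i in range(len(u))
    ]


def sum_of_distances(
    points: Sequence[Sequence[float]], vertices: Sequence[Sequence[float]]
) -> float:
    """Objective for a candidate polytope: total tropical distance from
    each data point to its projection onto the polytope."""
    return sum(
        tropical_distance(p, project_onto_tropical_polytope(p, vertices))
        for p in points
    )


if __name__ == "__main__":
    # Illustrative values only: a tropical segment with two vertices.
    segment = [[0.0, 0.0, 0.0], [0.0, 3.0, 1.0]]
    data = [[0.0, 1.0, 4.0], [0.0, 1.0, 0.0]]
    for point in data:
        print(point, "->", project_onto_tropical_polytope(point, segment))
    print("objective:", sum_of_distances(data, segment))
```

A search over candidate vertex sets would then aim to minimize `sum_of_distances`; how that search is carried out is the subject of the paper's algorithms and is not reproduced here.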
Notes
Our software for all computations can be downloaded at http://polytopes.net/computations/tropicalPCA/.
A more detailed tutorial on how to use Mesquite to generate data can be accessed at https://www.youtube.com/watch?v=94tXmA4ods4&t=238s. We followed the same simulation steps and parameter settings as in this video.
In keeping with the format of Mesquite (Maddison and Maddison 2017), leaves of the species tree are rendered as capital letters. Tree topologies of all projected points can be found in the supplement at http://polytopes.net/computations/tropicalPCA/.
References
Akian M, Gaubert S, Niţică V, Singer I (2011) Best approximation in max-plus semimodules. Linear Algebra Appl 435:3261–3296
Billera L, Holmes S, Vogtmann K (2001) Geometry of the space of phylogenetic trees. Adv Appl Math 27:733–767
Butkovic P (2010) Max-linear systems: theory and algorithms. Springer monographs in mathematics. Springer, London
Burkard R, Dell’Amico M, Martello S (2009) Assignment problems. Society for Industrial and Applied Mathematics, Philadelphia
Cohen G, Gaubert S, Quadrat J (2004) Duality and separation theorems in idempotent semimodules. Linear Algebra Appl 379:395–422
Depersin J, Gaubert S, Joswig M (2017) A tropical isoperimetric inequality. Sémin Lothar Combin 78B:12
Develin M, Sturmfels B (2004) Tropical convexity. Doc Math 9:1–27
Feragen A, Owen M, Petersen J, Wille MMW, Thomsen LH, Dirksen A, de Bruijne M (2012) Tree-space statistics and approximations for large-scale analysis of anatomical trees. In: IPMI 2013: information processing in medical imaging
Fink A, Rincón F (2015) Stiefel tropical linear spaces. J Combin Theory A 135:291–331
Griva I, Nash SG, Sofer A (2009) Linear and nonlinear optimization, 2nd edn. Society for Industrial and Applied Mathematics, Philadelphia
Joswig M (2017) Essentials of tropical combinatorics (in preparation). http://page.math.tu-berlin.de/~joswig/etc/index.html
Joswig M, Sturmfels B, Yu J (2007) Affine buildings and tropical convexity. Alban J Math 1:187–211
Kuo C, Wares JP, Kissinger JC (2008) The apicomplexan whole-genome phylogeny: an analysis of incongruence among gene trees. Mol Biol Evol 25:2689–2698
Lenstra HW (1983) Integer programming with a fixed number of variables. Math Oper Res 8:538–548
Lin B, Sturmfels B, Tang X, Yoshida R (2017) Convexity in tree spaces. SIAM J Discrete Math 31(3):2015–2038
Lin B, Yoshida R (2018) Tropical Fermat–Weber points. SIAM J Discrete Math. arXiv:1604.04674
Maclagan D, Sturmfels B (2015) Introduction to tropical geometry. Graduate studies in mathematics, vol 161. American Mathematical Society, Providence
Maddison WP, Maddison D (2017) Mesquite: a modular system for evolutionary analysis. Version 3.31. http://mesquiteproject.org
Nye T, Tang X, Weyenberg G, Yoshida R (2017) Principal component analysis and the locus of the Fréchet mean in the space of phylogenetic trees. Biometrika 104(4):901–922
Richter-Gebert J, Sturmfels B, Theobald T (2005) First steps in tropical geometry. In: Litvinov GL, Maslov VP (eds) Idempotent mathematics and mathematical physics, vol 377. American Mathematical Society, Providence, pp 289–308
Weyenberg G, Yoshida R, Howe D (2016) Normalizing kernels in the Billera–Holmes–Vogtmann treespace. IEEE/ACM Trans Comput Biol Bioinform. https://doi.org/10.1109/TCBB.2016.2565475
Zhao J, Yoshida R, Cheung SS, Haws D (2013) Approximate techniques in solving optimal camera placement problems. Int J Distrib Sens Netw 241913:15. https://doi.org/10.1155/2013/241913
Acknowledgements
R. Y. was supported by Research Initiation Proposals from the Naval Postgraduate School and NSF Division of Mathematical Sciences 1622369. L. Z. was supported by an NSF Graduate Research Fellowship. X. Z. was supported by travel funding from the Department of Statistics at the University of Kentucky. The authors thank Bernd Sturmfels (UC Berkeley and MPI Leipzig) for many helpful conversations. The authors also thank Daniel Howe (University of Kentucky) for his input on Apicomplexa tree topologies.
Cite this article
Yoshida, R., Zhang, L. & Zhang, X. Tropical Principal Component Analysis and Its Application to Phylogenetics. Bull Math Biol 81, 568–597 (2019). https://doi.org/10.1007/s11538-018-0493-4