Abstract
Two genes are xenologs in the sense of Fitch if they are separated by at least one horizontal gene transfer event. Horizontal gene transfer is asymmetric in the sense that the transferred copy is distinguished from the one that remains within the ancestral lineage. Hence xenology is more precisely thought of as a non-symmetric relation: y is xenologous to x if y has been horizontally transferred at least once since it diverged from the least common ancestor of x and y. We show that xenology relations are characterized by a small set of forbidden induced subgraphs on three vertices. Furthermore, each xenology relation can be derived from a unique least-resolved edge-labeled phylogenetic tree. We provide a linear-time algorithm for the recognition of xenology relations and for the construction of the corresponding least-resolved edge-labeled phylogenetic tree. Finally, the fact that being a xenology relation is a heritable graph property has far-reaching consequences for approximation problems associated with xenology relations.
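The non-symmetric definition above can be illustrated with a minimal sketch. It is not the linear-time algorithm of the paper; it is a naive computation of the relation from a given edge-labeled tree, assuming a hypothetical encoding in which the tree is a child-to-parent dictionary and the horizontal transfer edges are a set of (parent, child) pairs. The function name and the example tree are illustrative assumptions only.

```python
from itertools import permutations


def xenology_relation(parent, transfer_edges):
    """Return all ordered pairs (x, y) of leaves with y xenologous to x.

    parent: dict mapping every non-root node to its parent
            (hypothetical encoding chosen for this sketch).
    transfer_edges: set of (parent, child) edges labeled as
            horizontal gene transfers.
    """
    inner_nodes = set(parent.values())
    leaves = [v for v in parent if v not in inner_nodes]

    def ancestors(v):
        # v together with every node on the path from v up to the root
        path = {v}
        while v in parent:
            v = parent[v]
            path.add(v)
        return path

    relation = set()
    for x, y in permutations(leaves, 2):
        above_x = ancestors(x)
        # Walk upward from y until the least common ancestor of x and y
        # is reached, recording whether a transfer edge was crossed.
        node, transferred = y, False
        while node not in above_x:
            if (parent[node], node) in transfer_edges:
                transferred = True
            node = parent[node]
        if transferred:
            relation.add((x, y))
    return relation


# Illustrative tree: root r with children a and u; u has children b and c.
# The edge (r, u) is assumed to be a horizontal transfer.
example_parent = {"a": "r", "u": "r", "b": "u", "c": "u"}
example_transfers = {("r", "u")}
relation = xenology_relation(example_parent, example_transfers)
```

In this example b and c are both xenologous to a, because the transfer edge lies on their paths from the least common ancestor r, while a is xenologous to neither of them. This shows why the relation is not symmetric.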
Acknowledgements
We thank Maribel Hernández Rosales and her team for stimulating discussions. This work was funded in part by the BMBF-funded project “Center for RNA-Bioinformatics” (031A538A, de.NBI-RBC) and a travel grant from DAAD PROALMEX (Proj. No. 278966).
About this article
Cite this article
Geiß, M., Anders, J., Stadler, P.F. et al. Reconstructing gene trees from Fitch’s xenology relation. J. Math. Biol. 77, 1459–1491 (2018). https://doi.org/10.1007/s00285-018-1260-8
DOI: https://doi.org/10.1007/s00285-018-1260-8
Keywords
- Fitch xenology
- Phylogenetic tree
- Least-resolved tree
- Rooted triples
- Informative triple sets
- Di-cograph
- Heritable graph property
- Forbidden induced subgraphs
- Recognition algorithm
- Fixed parameter tractable