Journal of Mathematical Biology, Volume 77, Issue 5, pp 1459–1491

Reconstructing gene trees from Fitch’s xenology relation

  • Manuela Geiß
  • John Anders
  • Peter F. Stadler
  • Nicolas Wieseke
  • Marc Hellmuth (corresponding author)


Two genes are xenologs in the sense of Fitch if they are separated by at least one horizontal gene transfer event. Horizontal gene transfer is asymmetric in the sense that the transferred copy is distinguished from the one that remains within the ancestral lineage. Hence xenology is more precisely thought of as a non-symmetric relation: y is xenologous to x if y has been horizontally transferred at least once since it diverged from the least common ancestor of x and y. We show that xenology relations are characterized by a small set of forbidden induced subgraphs on three vertices. Furthermore, each xenology relation can be derived from a unique least-resolved edge-labeled phylogenetic tree. We provide a linear-time algorithm for the recognition of xenology relations and for the construction of their least-resolved edge-labeled phylogenetic trees. Finally, the fact that being a xenology relation is a heritable graph property has far-reaching consequences for approximation problems associated with xenology relations.
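The definition of the relation can be illustrated with a small sketch. It is not the linear-time recognition algorithm of the paper; it only evaluates the definition directly, pair by pair, on a rooted tree whose edges are marked as transfer or non-transfer. The function name `fitch_relation`, the `parent` dictionary encoding, and the example tree are assumptions made for illustration.

```python
from itertools import permutations


def fitch_relation(parent, transfer_edges, leaves):
    """Return all ordered pairs (x, y) such that y is xenologous to x.

    parent         -- dict mapping every non-root node to its parent (assumed encoding)
    transfer_edges -- set of (parent, child) edges labeled as horizontal transfers
    leaves         -- iterable of leaf names (the genes)
    """
    def ancestors(v):
        # v itself followed by all of its ancestors up to the root
        chain = [v]
        while v in parent:
            v = parent[v]
            chain.append(v)
        return set(chain)

    relation = set()
    for x, y in permutations(leaves, 2):
        anc_x = ancestors(x)
        # Walk from y towards the root; the first ancestor of x we meet is lca(x, y).
        v, transferred = y, False
        while v not in anc_x:
            if (parent[v], v) in transfer_edges:
                transferred = True
            v = parent[v]
        if transferred:
            relation.add((x, y))
    return relation


# Hypothetical example: root r with children u and c; u has children a and b.
# The edge (u, b) is the only transfer edge.
parent = {"u": "r", "c": "r", "a": "u", "b": "u"}
relation = fitch_relation(parent, {("u", "b")}, ["a", "b", "c"])
print(sorted(relation))  # [('a', 'b'), ('c', 'b')]
```

In this example b is xenologous to both a and c, but neither a nor c is xenologous to b, which shows why the relation is non-symmetric.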


Fitch xenology; Phylogenetic tree; Least-resolved tree; Rooted triples; Informative triple sets; Di-cograph; Heritable graph property; Forbidden induced subgraphs; Recognition algorithm; Fixed parameter tractable

Mathematics Subject Classification

05C05 05C85 68R05 68R10 



We thank Maribel Hernández Rosales and her team for stimulating discussions. This work was funded in part by the BMBF-funded project “Center for RNA-Bioinformatics” (031A538A, de.NBI-RBC) and a travel grant from DAAD PROALMEX (Proj. No. 278966).



Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  • Manuela Geiß (1, 2)
  • John Anders (1, 2)
  • Peter F. Stadler (1, 2, 3, 4, 5, 6, 7, 8)
  • Nicolas Wieseke (9)
  • Marc Hellmuth (10, 11)

  1. Bioinformatics Group, Department of Computer Science, Leipzig University, Leipzig, Germany
  2. Interdisciplinary Center of Bioinformatics, Leipzig University, Leipzig, Germany
  3. German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Competence Center for Scalable Data Services and Solutions, Leipzig University, Leipzig, Germany
  4. Leipzig Research Center for Civilization Diseases, Leipzig University, Leipzig, Germany
  5. Department of Diagnostics, Fraunhofer Institute for Cell Therapy and Immunology – IZI, Leipzig, Germany
  6. Max-Planck-Institute for Mathematics in the Sciences, Leipzig, Germany
  7. Inst. f. Theoretical Chemistry, University of Vienna, Wien, Austria
  8. Santa Fe Institute, Santa Fe, USA
  9. Swarm Intelligence and Complex Systems Group, Department of Computer Science, Leipzig University, Leipzig, Germany
  10. Institute of Mathematics and Computer Science, University of Greifswald, Greifswald, Germany
  11. Center for Bioinformatics, Saarland University, Saarbrücken, Germany
