Partial Homology Relations - Satisfiability in Terms of Di-Cographs

  • Nikolai Nøjgaard
  • Nadia El-Mabrouk
  • Daniel Merkle
  • Nicolas Wieseke
  • Marc Hellmuth
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10976)


Directed cographs (di-cographs) play a crucial role in the reconstruction of evolutionary histories of genes based on homology relations, which are binary relations between genes. A variety of methods based on pairwise sequence comparisons can be used to infer such homology relations (e.g., orthology, paralogy, xenology). These relations are satisfiable if they can be explained by an event-labeled gene tree, i.e., if they can co-exist in a single evolutionary history of the underlying genes. Every gene tree can equivalently be interpreted as a so-called cotree that entirely encodes the structure of a di-cograph. Thus, satisfiable homology relations must necessarily form a di-cograph. The inferred homology relations might not cover every pair of genes and thus provide only partial knowledge of the full set of homology relations. Moreover, for particular pairs of genes, it might be known with a high degree of certainty that they are not orthologs (resp. paralogs, xenologs), which yields forbidden pairs of genes. Motivated by this observation, we characterize (partial) satisfiable homology relations with or without forbidden gene pairs, and provide a quadratic-time algorithm for their recognition and for the computation of a cotree that explains the given relations.
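
To illustrate how a cotree encodes a di-cograph, the following minimal Python sketch derives the arc set from a labeled tree and checks a partial set of required and forbidden arcs against it. It is not the quadratic-time recognition algorithm of the paper: it only verifies a cotree that is already given. The tree encoding, the label names `series`, `parallel`, and `order` (the usual cotree operations for directed cographs), and the function names `leaves`, `arcs_of_cotree`, and `explains` are assumptions made for this example.

```python
from itertools import combinations

# Hypothetical encoding: a cotree is either a leaf (a gene name, str) or a
# tuple (label, children) with label in {"series", "parallel", "order"}.

def leaves(tree):
    """Return the leaves (genes) below a cotree node, left to right."""
    if isinstance(tree, str):
        return [tree]
    _, children = tree
    return [leaf for child in children for leaf in leaves(child)]

def arcs_of_cotree(tree):
    """Return the arc set of the di-cograph encoded by the cotree."""
    if isinstance(tree, str):
        return set()
    label, children = tree
    arcs = set()
    for child in children:
        arcs |= arcs_of_cotree(child)
    child_leaves = [leaves(child) for child in children]
    # Leaves in different children have this node as their last common ancestor.
    for i, j in combinations(range(len(children)), 2):
        for x in child_leaves[i]:
            for y in child_leaves[j]:
                if label == "series":      # arcs in both directions
                    arcs.add((x, y))
                    arcs.add((y, x))
                elif label == "order":     # arc from earlier to later child only
                    arcs.add((x, y))
                # "parallel": no arc between x and y
    return arcs

def explains(tree, required, forbidden):
    """Check that all required arcs and no forbidden arcs occur in the di-cograph."""
    arcs = arcs_of_cotree(tree)
    return required <= arcs and not (forbidden & arcs)

example = ("series", [("order", ["a", "b"]), "c"])
print(explains(example, {("a", "b")}, {("b", "a")}))  # True
print(explains(example, {("a", "c")}, {("c", "b")}))  # False
```

Pairs that appear in neither `required` nor `forbidden` are left unconstrained, which mirrors the partial knowledge described above in a simplified form.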


Keywords: Directed cographs, Partial relations, Forbidden relations, Recognition algorithm, Homology, Orthology, Paralogy, Xenology



This contribution is supported in part by the Independent Research Fund Denmark, Natural Sciences, grant DFF-7014-00041.



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Nikolai Nøjgaard (1, 2)
  • Nadia El-Mabrouk (3)
  • Daniel Merkle (2)
  • Nicolas Wieseke (4)
  • Marc Hellmuth (1, 5)

  1. Institute of Mathematics and Computer Science, University of Greifswald, Greifswald, Germany
  2. Department of Mathematics and Computer Science, University of Southern Denmark, Odense M, Denmark
  3. Department of Computer Science and Operational Research, University of Montreal, Montreal, Canada
  4. Swarm Intelligence and Complex Systems Group, Department of Computer Science, University of Leipzig, Leipzig, Germany
  5. Center for Bioinformatics, Saarland University, Saarbrücken, Germany
