Correction of Weighted Orthology and Paralogy Relations - Complexity and Algorithmic Results

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9838)


A relation graph for a gene family is a graph whose vertices represent the genes, whose edges connect pairs of orthologous genes, and whose “missing” edges represent paralogs. While a gene tree directly leads to a set of orthology and paralogy relations, the converse is not always true. Indeed, a relation graph cannot necessarily be derived from a tree, and even if it is “satisfiable” by a tree, this tree is not necessarily “consistent”, i.e., it does not necessarily reflect a valid history for the genes in agreement with a species tree. Here, we consider the problems of minimally correcting a relation graph for satisfiability and consistency when a degree of confidence is assigned to each orthology or paralogy relation, leading to a weighted relation graph. We provide complexity and algorithmic results for minimizing corrections on a weighted graph, and also for the maximization variant of the problems for unweighted graphs.
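
As an illustration of the satisfiability notion, the following minimal sketch relies on a characterization from the literature on orthology relations: a relation graph is satisfiable by a tree exactly when it is a cograph, that is, when it contains no induced path on four vertices (P4). The function names `has_induced_p4` and `is_satisfiable` are hypothetical and chosen only for this example. The check is brute force and meant for small gene families; it tests satisfiability only and does not address consistency with a species tree or the weighted correction problems.

```python
from itertools import combinations, permutations


def has_induced_p4(genes, orthology_edges):
    """Return True if the relation graph contains an induced path a-b-c-d.

    genes: iterable of gene identifiers (vertices).
    orthology_edges: iterable of 2-tuples of orthologous genes (edges);
    every pair of genes not listed is treated as a paralogy relation.
    """
    edge_set = {frozenset(e) for e in orthology_edges}

    def orthologous(x, y):
        return frozenset((x, y)) in edge_set

    # Brute force: try every ordering of every 4-subset of genes.
    for quad in combinations(list(genes), 4):
        for a, b, c, d in permutations(quad):
            path_edges = orthologous(a, b) and orthologous(b, c) and orthologous(c, d)
            no_chords = not (orthologous(a, c) or orthologous(a, d) or orthologous(b, d))
            if path_edges and no_chords:
                return True
    return False


def is_satisfiable(genes, orthology_edges):
    """A relation graph is satisfiable iff it has no induced P4 (assumed characterization)."""
    return not has_induced_p4(genes, orthology_edges)


# Hypothetical toy gene family with four genes.
genes = ["g1", "g2", "g3", "g4"]
path_graph = [("g1", "g2"), ("g2", "g3"), ("g3", "g4")]  # itself a P4
triangle_plus_pendant = [("g1", "g2"), ("g2", "g3"), ("g1", "g3"), ("g3", "g4")]
```

With these toy inputs, `is_satisfiable(genes, path_graph)` is false, since the graph is itself a P4, while `is_satisfiable(genes, triangle_plus_pendant)` is true. A correction in the sense of the abstract would modify relations until such a check passes.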



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Riccardo Dondi (1)
  • Nadia El-Mabrouk (2)
  • Manuel Lafond (2)
  1. Dipartimento di Lettere, Filosofia e Comunicazione, Università degli Studi di Bergamo, Bergamo, Italy
  2. Department of Computer Science, Université de Montréal, Montréal, Canada
