Orthology relations, symbolic ultrametrics, and cographs
Orthology detection is an important problem in comparative and evolutionary genomics and, consequently, a variety of orthology detection methods have been devised in recent years. Although many of these methods depend on generating gene and/or species trees, it has been shown that orthology can be estimated at acceptable levels of accuracy without having to infer gene trees and/or reconcile gene trees with species trees. Thus, it is of interest to understand how much information about the gene tree, the species tree, and their reconciliation is already contained in the orthology relation on the underlying set of genes. Here we show that a result by Böcker and Dress concerning symbolic ultrametrics, and subsequent algorithmic results by Semple and Steel for processing these structures, can shed considerable light on this problem. More specifically, building upon these authors' results, we present some new characterizations of symbolic ultrametrics and new algorithms for recovering the associated trees, with an emphasis on how these algorithms could potentially be extended to deal with arbitrary orthology relations. In so doing we also show that, somewhat surprisingly, symbolic ultrametrics are very closely related to cographs, that is, graphs that do not contain an induced path on four vertices. We conclude with a discussion of how our results might be applied in practice to orthology detection.
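To make the cograph condition concrete, the following is a minimal brute-force sketch in Python, not the algorithms presented in the paper. It assumes, purely for illustration, that an orthology relation is given as an undirected graph whose vertices are genes and whose edges join pairs estimated to be orthologous; the function name `is_cograph` and the input format are hypothetical.

```python
from itertools import combinations, permutations


def is_cograph(vertices, edges):
    """Return True if the graph contains no induced path on four vertices.

    Brute force over all 4-subsets, so only suitable for small examples.
    """
    # Build an adjacency map for the undirected graph.
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    def induces_p4(quad):
        # Try every ordering a-b-c-d: the three path edges must be present
        # and the three remaining pairs must be non-adjacent.
        for a, b, c, d in permutations(quad):
            if (b in adj[a] and c in adj[b] and d in adj[c]
                    and c not in adj[a]
                    and d not in adj[a]
                    and d not in adj[b]):
                return True
        return False

    return not any(induces_p4(q) for q in combinations(vertices, 4))


# Hypothetical gene labels; the path a-b-c-d is the forbidden subgraph.
genes = ["a", "b", "c", "d"]
path = [("a", "b"), ("b", "c"), ("c", "d")]
cycle = path + [("d", "a")]
print(is_cograph(genes, path))
print(is_cograph(genes, cycle))
```

The path on four genes fails the test, while closing it into a 4-cycle removes the induced path, so the cycle passes. In this illustrative setting, a relation that fails the test could not arise from a symbolic ultrametric, which is why such a check is a natural first sanity test on an estimated orthology relation.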
Keywords: Orthology, Symbolic ultrametric, Cograph, Cotree, Rooted triples
Mathematics Subject Classification: 05C05, 92D15, 68R10
- Altenhoff AM, Dessimoz C (2009) Phylogenetic and functional assessment of orthologs inference projects and methods. PLoS Comput Biol 5:e1000262
- Brandstädt A, Le VB, Spinrad JP (1999) Graph classes: a survey. SIAM monographs on discrete mathematics and applications. Soc Ind Appl Math, Philadelphia
- Falls C, Powell B, Snœyink J (2008) Computing high-stringency COGs using Turán-type graphs. Technical report. http://www.cs.unc.edu/~snoeyink/comp145/cogs.pdf
- Hubbard TJ, Aken BL, Beal K, Ballester B, Caccamo M, Chen Y, Clarke L, Coates G, Cunningham F, Cutts T, Down T, Dyer SC, Fitzgerald S, Fernandez-Banet J, Graf S, Haider S, Hammond M, Herrero J, Holland R, Howe K, Howe K, Johnson N, Kahari A, Keefe D, Kokocinski F, Kulesha E, Lawson D, Longden I, Melsopp C, Megy K, Meidl P, Ouverdin B, Parker A, Prlic A, Rice S, Rios D, Schuster M, Sealy I, Severin J, Slater G, Smedley D, Spudich G, Trevanion S, Vilella A, Vogel J, White S, Wood M, Cox T, Curwen V, Durbin R, Fernandez-Suarez XM, Flicek P, Kasprzyk A, Proctor G, Searle S, Smith J, Ureta-Vidal A, Birney E (2007) Ensembl 2007. Nucleic Acids Res 35:D610–D617
- Liu Y, Wang J, Guo J, Chen J (2011) Cographs editing: complexity and parametrized algorithms. In: Fu B, Du DZ (eds) COCOON 2011. Lecture notes in computer science, vol 6842. Springer, Berlin, pp 110–121
- Semple C, Steel M (2003) Phylogenetics. Oxford lecture series in mathematics and its applications, vol 24. Oxford University Press, Oxford
- Wheeler DL, Barrett T, Benson DA, Bryant SH, Canese K, Chetvernin V, Church DM, Dicuccio M, Edgar R, Federhen S, Feolo M, Geer LY, Helmberg W, Kapustin Y, Khovayko O, Landsman D, Lipman DJ, Madden TL, Maglott DR, Miller V, Ostell J, Pruitt KD, Schuler GD, Shumway M, Sequeira E, Sherry ST, Sirotkin K, Souvorov A, Starchenko G, Tatusov RL, Tatusova TA, Wagner L, Yaschenko E (2008) Database resources of the National Center for Biotechnology Information. Nucleic Acids Res 36:D13–D21