On the DCJ Median Problem

  • Mingfu Shao
  • Bernard M. E. Moret
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8486)

Abstract

As many whole genomes are sequenced, comparative genomics is moving from pairwise comparisons to multiway comparisons framed within a phylogenetic tree. A central problem in this process is the inference of data for internal nodes of the tree from data given at the leaves. When phrased as an optimization problem, this problem reduces to computing a median of three genomes under the operations (evolutionary changes) of interest. We focus on the universal rearrangement operation known as double-cut-and join (DCJ) and present three contributions to the DCJ median problem. First, we describe a new strategy to find so-called adequate subgraphs in the multiple breakpoint graph, using a seed genome. We show how to compute adequate subgraphs w.r.t. this seed genome using a network flow formulation. Second, we prove that the upper bound of the median distance computed from the triangle inequality is tight. Finally, we study the question of whether the median distance can reach its lower and upper bounds. We derive a necessary and sufficient condition for the median distance to reach its lower bound and a necessary condition for it to reach its upper bound and design algorithms to test for these conditions.

Keywords

genomic rearrangement network flow dynamic programming 

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. 1.
    Fertin, G., Labarre, A., Rusu, I., Tannier, E., Vialette, S.: Combinatorics of Genome Rearrangements. MIT Press (2009)Google Scholar
  2. 2.
    Bergeron, A., Mixtacki, J., Stoye, J.: A unifying view of genome rearrangements. In: Bücher, P., Moret, B.M.E. (eds.) WABI 2006. LNCS (LNBI), vol. 4175, pp. 163–173. Springer, Heidelberg (2006)CrossRefGoogle Scholar
  3. 3.
    Yancopoulos, S., Attie, O., Friedberg, R.: Efficient sorting of genomic permutations by translocation, inversion and block interchange. Bioinformatics 21(16), 3340–3346 (2005)CrossRefGoogle Scholar
  4. 4.
    Alekseyev, M.A., Pevzner, P.A.: Whole genome duplications, multi-break rearrangements, and genome halving problem. In: Proc. 18th ACM-SIAM Symp. Discrete Algs. SODA 2007, pp. 665–679. SIAM Press (2007)Google Scholar
  5. 5.
    Braga, M.D., Willing, E., Stoye, J.: Double cut and join with insertions and deletions. J. Comput. Biol. 18(9), 1167–1184 (2011)CrossRefMathSciNetGoogle Scholar
  6. 6.
    Chen, X., Sun, R., Yu, J.: Approximating the double-cut-and-join distance between unsigned genomes. In: Proc. 9th RECOMB Workshop Compar. Genomics RECOMB-CG 2011, BMC Bioinformatics 12(S.9), S17 (2011)Google Scholar
  7. 7.
    Shao, M., Lin, Y.: Approximating the edit distance for genomes with duplicate genes under DCJ, insertion and deletion. In: Proc. 10th RECOMB Workshop Compar. Genomics RECOMB-CG 2012, BMC Bioinformatics 13(S. 19), S13 (2012)Google Scholar
  8. 8.
    Shao, M., Lin, Y., Moret, B.M.E.: Sorting genomes with rearrangements and segmental duplications through trajectory graphs. In: Proc. 11th RECOMB Workshop Compar. Genomics RECOMB-CG 2013, BMC Bioinformatics 14(S. 15), S9 (2013)Google Scholar
  9. 9.
    Moret, B.M.E., Lin, Y., Tang, J.: Rearrangements in phylogenetic inference: Compare, model, or encode? In: Chauve, C., et al. (eds.) Models and Algorithms for Genome Evolution, Computational Biology, vol. 19, pp. 147–172. Springer (2013)Google Scholar
  10. 10.
    Hannenhalli, S., Pevzner, P.A.: Transforming cabbage into turnip (polynomial algorithm for sorting signed permutations by reversals). In: Proc. 27th ACM Symp. Theory of Computing STOC 1995, pp. 178–189. ACM Press (1995)Google Scholar
  11. 11.
    Bader, D.A., Moret, B.M.E., Yan, M.: A fast linear-time algorithm for inversion distance with an experimental comparison. J. Comput. Biol. 8(5), 483–491 (2001)CrossRefGoogle Scholar
  12. 12.
    Caprara, A.: The reversal median problem. INFORMS J. Comput. 15, 93–113 (2003)CrossRefMATHMathSciNetGoogle Scholar
  13. 13.
    Tannier, E., Zheng, C., Sankoff, D.: Multichromosomal genome median and halving problems. In: Crandall, K.A., Lagergren, J. (eds.) WABI 2008. LNCS (LNBI), vol. 5251, pp. 1–13. Springer, Heidelberg (2008)CrossRefGoogle Scholar
  14. 14.
    Siepel, A.C., Moret, B.M.E.: Finding an optimal inversion median: experimental results. In: Gascuel, O., Moret, B.M.E. (eds.) WABI 2001. LNCS, vol. 2149, pp. 189–203. Springer, Heidelberg (2001)CrossRefGoogle Scholar
  15. 15.
    Moret, B.M.E., Siepel, A.C., Tang, J., Liu, T.: Inversion medians outperform breakpoint medians in phylogeny reconstruction from gene-order data. In: Guigó, R., Gusfield, D. (eds.) WABI 2002. LNCS, vol. 2452, pp. 521–536. Springer, Heidelberg (2002)CrossRefGoogle Scholar
  16. 16.
    Arndt, W., Tang, J.: Improving reversal median computation using commuting reversals and cycle information. J. Comput. Biol. 15(8), 1079–1092 (2008)CrossRefMathSciNetGoogle Scholar
  17. 17.
    Rajan, V., Xu, A.W., Lin, Y., Swenson, K.M., Moret, B.M.E.: Heuristics for the inversion median problem. In: Proc. 8th Asia-Pacific Bioinf. Conf. APBC 2010, BMC Bioinformatics 11(S. 1), S30 (2010)Google Scholar
  18. 18.
    Zhang, M., Arndt, W., Tang, J.: An exact solver for the DCJ median problem. In: Proc. 14th Pacific Symp. Biocomputing PSB 2009, pp. 138–149 (2009)Google Scholar
  19. 19.
    Xu, A.W., Sankoff, D.: Decompositions of multiple breakpoint graphs and rapid exact solutions to the median problem. In: Crandall, K.A., Lagergren, J. (eds.) WABI 2008. LNCS (LNBI), vol. 5251, pp. 25–37. Springer, Heidelberg (2008)CrossRefGoogle Scholar
  20. 20.
    Alekseyev, M.A., Pevzner, P.A.: Breakpoint graphs and ancestral genome reconstructions. Genome Research 19(5), 943–957 (2009)CrossRefGoogle Scholar
  21. 21.
    Xu, A.W.: A fast and exact algorithm for the median of three problem: A graph decomposition approach. J. Comput. Biol. 16(10), 1369–1381 (2009)CrossRefMathSciNetGoogle Scholar
  22. 22.
    Aganezov, S., Alekseyev, M.A.: On pairwise distances and median score of three genomes under DCJ. In: Proc. 10th RECOMB Workshop Compar. Genomics RECOMB-CG 2012, BMC Bioinformatics 13(S.19), S1 (2012)Google Scholar

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Mingfu Shao
    • 1
  • Bernard M. E. Moret
    • 1
  1. 1.Laboratory for Computational Biology and BioinformaticsEPFLSwitzerland

Personalised recommendations