A Unified ILP Framework for Genome Median, Halving, and Aliquoting Problems Under DCJ

  • Pavel Avdeyev
  • Nikita Alexeev
  • Yongwu Rong
  • Max A. Alekseyev
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10562)

Abstract

One of the key computational problems in comparative genomics is the reconstruction of genomes of ancestral species based on genomes of extant species. Since the most dramatic changes in genomic architectures are caused by genome rearrangements, this problem is often posed as minimization of the number of genome rearrangements between extant and ancestral genomes. The basic case of three given genomes is known as the genome median problem. Whole genome duplications (WGDs) represent yet another type of dramatic evolutionary event and inspire the reconstruction of pre-duplicated ancestral genomes, referred to as the genome halving problem. Generalization of WGDs to whole genome multiplication events leads to the genome aliquoting problem.

In the present study, we provide polynomial-size integer linear programming formulations for the aforementioned problems. We further obtain such formulations for the restricted versions of the median and halving problems, which were recently introduced to improve biological relevance.
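
For orientation, the three problems named above can be written as optimization problems over a rearrangement distance. The following is a minimal sketch in notation assumed here for illustration, not taken from the paper: d denotes the DCJ distance, G_1, G_2, G_3 are the given genomes, D is a given duplicated genome, A is a given genome with m copies of each gene, and mR stands for a genome obtained from R by a whole genome multiplication into m copies (2R for a WGD).

```latex
% Assumed notation (not from the source): d = DCJ distance,
% M = median genome, R = pre-duplication (or pre-multiplication) genome.
\begin{align*}
\text{Median:}     \quad & \min_{M} \; \bigl( d(M, G_1) + d(M, G_2) + d(M, G_3) \bigr) \\
\text{Halving:}    \quad & \min_{R} \; d(2R, D) \\
\text{Aliquoting:} \quad & \min_{R} \; d(mR, A), \qquad m \ge 2
\end{align*}
```

In this reading, halving is the special case m = 2 of aliquoting.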

Acknowledgements

The work of PA and MAA is supported by the National Science Foundation under grant No. IIS-1462107. The work of NA and YR is partially supported by the National Science Foundation under grant No. DMS-1406984.

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Pavel Avdeyev (1)
  • Nikita Alexeev (1)
  • Yongwu Rong (1)
  • Max A. Alekseyev (1)

  1. The George Washington University, Washington, D.C., USA