Genome Rearrangement Analysis: Cut and Join Genome Rearrangements and Gene Cluster Preserving Approaches

  • Tom Hartmann
  • Martin Middendorf
  • Matthias Bernt
Part of the Methods in Molecular Biology book series (MIMB, volume 1704)


Genome rearrangements are mutations that change the gene content of a genome or the arrangement of the genes on a genome. Several years of research on genome rearrangements have established different algorithmic approaches for solving some fundamental problems in comparative genomics based on gene order information. This review summarizes the literature on genome rearrangement analysis along two lines of research. The first line considers rearrangement models that are particularly well suited for a theoretical analysis. These models use rearrangement operations that cut chromosomes into fragments and then join the fragments into new chromosomes. The second line works with rearrangement models that reflect several biologically motivated constraints, e.g., the constraint that gene clusters have to be preserved. In this chapter, the border between algorithmically “easy” and “hard” rearrangement problems is sketched and a brief review is given on the available software tools for genome rearrangement analysis.
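
As a minimal illustration of the cut-and-join idea (a sketch, not code from the chapter), a reversal on a linear chromosome can be viewed as cutting the chromosome at two points and rejoining the fragments with the middle fragment inverted. The signed-integer gene encoding and the function name `reversal` are assumptions made only for this example.

```python
def reversal(chromosome, i, j):
    """Cut a linear chromosome before index i and after index j,
    then rejoin the three fragments with the middle one inverted.

    Genes are signed integers (assumed encoding): the sign is the
    orientation, so inverting a fragment reverses its order and
    flips every sign.
    """
    left = chromosome[:i]
    middle = chromosome[i:j + 1]
    right = chromosome[j + 1:]
    inverted = [-gene for gene in reversed(middle)]
    return left + inverted + right


genome = [1, 2, 3, 4, 5]
rearranged = reversal(genome, 1, 3)
print(rearranged)  # [1, -4, -3, -2, 5]
```

Applying the same reversal a second time restores the original gene order, which reflects the fact that each such operation can be undone by a single operation of the same kind.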

Key words

Gene order analysis; Genome rearrangements; Cut and join; Gene cluster



Copyright information

© Springer Science+Business Media LLC 2018

Authors and Affiliations

  • Tom Hartmann (1)
  • Martin Middendorf (1)
  • Matthias Bernt (1)
  1. Swarm Intelligence and Complex Systems Group, Institute of Computer Science, University Leipzig, Leipzig, Germany
