Genome rearrangement problems have been studied extensively for more than two decades, with the aim of understanding the evolutionary relationships among species in terms of long-range genetic mutations at the genome level. While most earlier studies focus on simplified genomes that ignore gene duplicates, thousands of whole-genome sequencing projects reveal that a genome typically carries multiple gene duplicates distributed in various ways along the genome. Given a source genome and a target genome such that one is a re-ordering of the genes in the other, we measure the evolutionary distance by the minimum number of reversals that must be applied to the source genome to recover all the gene adjacencies in the target genome. We define this optimization problem as sorting by reversals to recover all adjacencies, or SBR2RA for short. We show that SBR2RA is APX-hard and uncover some similarities to and differences from its classic counterpart, the sorting by reversals problem. From the approximability perspective, we present a \(2 \alpha \)-approximation algorithm, where \(\alpha \in [1, 2]\) is the best approximation ratio for a related optimization problem which is suspected to be NP-hard.
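To make the SBR2RA objective concrete, the following is a minimal brute-force sketch for tiny instances only. It is not the approximation algorithm from the paper. It rests on assumptions that the abstract does not spell out: genes are unsigned symbols, an adjacency is an unordered pair of consecutive genes, and "recovering all adjacencies" means that the multiset of adjacencies of the rearranged genome equals that of the target. The function names `adjacencies`, `reverse`, and `min_reversals` are hypothetical.

```python
from collections import Counter, deque


def adjacencies(genome):
    """Multiset of unordered pairs of consecutive genes (assumed definition)."""
    return Counter(tuple(sorted(pair)) for pair in zip(genome, genome[1:]))


def reverse(genome, i, j):
    """Return a copy of `genome` (a tuple) with the segment i..j (inclusive) reversed."""
    return genome[:i] + genome[i:j + 1][::-1] + genome[j + 1:]


def min_reversals(source, target):
    """Breadth-first search for the fewest reversals after which the source
    genome has the same adjacency multiset as the target genome.

    Exponential in the genome length; intended only to illustrate the objective.
    """
    source, target = tuple(source), tuple(target)
    if Counter(source) != Counter(target):
        raise ValueError("genomes must contain the same genes with the same multiplicities")

    goal = adjacencies(target)
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        genome, dist = queue.popleft()
        if adjacencies(genome) == goal:
            return dist
        n = len(genome)
        for i in range(n):
            for j in range(i + 1, n):
                nxt = reverse(genome, i, j)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, dist + 1))
    return None  # not reached: the target itself is always reachable by reversals
```

Under these assumptions, `min_reversals("abab", "aabb")` returns 1, since reversing the middle segment `ba` yields `aabb` directly. Because the goal is stated on adjacencies rather than on the exact gene order, `min_reversals("abc", "cba")` returns 0: reversing a whole genome leaves its unordered adjacencies unchanged.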
Keywords: Genome rearrangement; Sorting by reversals; Gene adjacency; Maximum matching; Alternating cycle
PZ is partially supported by the NNSF China Grant 61672323 and the NSF of Shandong Province Grant ZR2016AM28; DZ is partially supported by the NNSF China Grants 61472222, 61732009, and 61761136017; WT is partially supported by the funds from the Office of the Vice President for Research and Economic Development at Georgia Southern University; GL is supported by the NSERC.