# The Rooted SCJ Median with Single Gene Duplications

## Abstract

The median problem is a classical problem in genome rearrangements. It aims to compute a gene order that minimizes the sum of the genomic distances to \(k\ge 3\) given gene orders. This problem is intractable except in the related Single-Cut-or-Join and breakpoint rearrangement models. Here we consider the rooted median problem, where we assume one of the given genomes to be ancestral to the median, which is itself ancestral to the other genomes. We show that in the Single-Cut-or-Join model with single gene duplications, the rooted median problem is NP-hard. We also describe an Integer Linear Program for solving this problem, which we apply to simulated data, showing high accuracy of the reconstructed medians.
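To make the objective concrete, the following minimal sketch covers the simpler, duplication-free SCJ setting, not the rooted model with single gene duplications studied here. It assumes a genome is represented as a set of adjacencies between gene extremities, so that the SCJ distance is the size of the symmetric difference of two adjacency sets. The names `adjacencies`, `scj_distance`, `median_score`, and `majority_median` are illustrative and do not come from the paper.

```python
from collections import Counter


def adjacencies(chromosome):
    """Adjacency set of one linear chromosome given as a list of signed integers.

    Gene g has a tail 'gt' and a head 'gh'; +g is read tail to head,
    -g is read head to tail. Assumes each gene occurs at most once.
    """
    ends = []
    for g in chromosome:
        name = str(abs(g))
        ends.append((name + "t", name + "h") if g > 0 else (name + "h", name + "t"))
    # An adjacency joins the right end of one gene to the left end of the next.
    return {frozenset((a[1], b[0])) for a, b in zip(ends, ends[1:])}


def scj_distance(a, b):
    """Duplication-free SCJ distance: cut adjacencies only in a, join those only in b."""
    return len(a ^ b)


def median_score(candidate, genomes):
    """Median objective: sum of distances from the candidate to every input genome."""
    return sum(scj_distance(candidate, g) for g in genomes)


def majority_median(genomes):
    """Candidate median: adjacencies present in more than half of the genomes.

    This majority rule applies to the duplication-free SCJ model only.
    """
    counts = Counter(adj for g in genomes for adj in g)
    return {adj for adj, c in counts.items() if 2 * c > len(genomes)}


genomes = [adjacencies([1, 2, 3, 4]),
           adjacencies([1, 2, -3, 4]),
           adjacencies([1, 2, 4, 3])]
median = majority_median(genomes)
print(median_score(median, genomes))       # 6
print(median_score(genomes[0], genomes))   # 8
```

In this toy instance only the adjacency between genes 1 and 2 is shared by a majority, and keeping it alone scores better than choosing any input genome as the median. With duplicated genes this simple set-based rule no longer applies directly, which is consistent with the hardness result stated above.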

## Notes

### Acknowledgments

CC is supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grant RGPIN-2017-03986. CC and PF are supported by CIHR/Genome Canada Bioinformatics and Computational Biology grant B/CB 11106. Most computations were done on the Cedar system of Compute Canada through a resource allocation to CC.
