Sorting Genomes with Insertions, Deletions and Duplications by DCJ
We extend the DCJ paradigm to perform genome rearrangements on pairs of genomes having unequal gene content and/or multiple gene copies, by permitting genes in one genome that are completely or partially unmatched in the other. The existence of unmatched gene ends introduces new kinds of paths in the adjacency graph, since some paths can now terminate internal to a chromosome rather than on telomeres. We introduce “ghost adjacencies” to supply the missing gene ends in the genome not containing them. Ghosts enable us to close paths that arise from incomplete matching, just as null points enable us to close even paths terminating in telomeres. We define generalized DCJ operations on the generalized adjacency graph, and give a prescription for calculating the DCJ distance for the expanded repertoire of operations, which includes insertions, deletions and duplications.
Keywords: Triangle Inequality, Null Point, Target Genome, Adjacency Graph, Circular Chromosome
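For orientation, the adjacency graph mentioned in the abstract can be sketched for the classical case only, where both genomes contain exactly the same genes, each once. The sketch below is not the generalized prescription of this paper (ghost adjacencies are not modeled, since the abstract does not spell them out); it uses the well-known equal-content formula d = N - (C + I/2), with N genes, C cycles and I odd paths. The genome encoding (a list of `(signed_gene_list, is_circular)` pairs) and the function names `adjacencies` and `dcj_distance` are assumptions made for illustration.

```python
def adjacencies(genome):
    """Map each gene extremity to its neighbour in the genome (None at a telomere).

    genome: list of (genes, circular) pairs; genes is a non-empty list of
    signed integers, circular is a bool. (Hypothetical encoding.)
    """
    partner = {}
    for genes, circular in genome:
        # Each signed gene contributes (left, right) extremities in reading order.
        ends = []
        for g in genes:
            tail, head = (abs(g), "t"), (abs(g), "h")
            ends.append((tail, head) if g > 0 else (head, tail))
        for i in range(len(ends) - 1):
            a, b = ends[i][1], ends[i + 1][0]
            partner[a], partner[b] = b, a
        first, last = ends[0][0], ends[-1][1]
        if circular:
            partner[first], partner[last] = last, first
        else:
            partner[first] = partner[last] = None
    return partner


def dcj_distance(genome_a, genome_b):
    """Classical DCJ distance N - (C + I/2) for genomes with identical gene content."""
    partners = (adjacencies(genome_a), adjacencies(genome_b))
    pa, pb = partners
    if pa.keys() != pb.keys():
        raise ValueError("this sketch requires identical gene content")
    seen = set()

    def walk_path(e, side):
        """Walk from a telomere of genome `side`; return the genome where the path ends."""
        while True:
            seen.add(e)
            side = 1 - side            # cross the edge labelled e to the other genome
            nxt = partners[side][e]
            if nxt is None:            # reached a telomere: the path ends here
                return side
            seen.add(e)
            e = nxt

    odd_paths = 0
    for e, p in pa.items():            # paths starting at a telomere of A
        if p is None and e not in seen and walk_path(e, 0) == 1:
            odd_paths += 1             # A-telomere to B-telomere: odd number of edges
    for e, p in pb.items():            # remaining paths run B-telomere to B-telomere
        if p is None and e not in seen:
            walk_path(e, 1)

    cycles = 0
    for e in pa:                       # everything unvisited lies on a cycle
        if e in seen:
            continue
        cycles += 1
        cur = e
        while cur not in seen:
            seen.add(cur)
            nxt = pa[cur]              # other extremity of the adjacency in A
            seen.add(nxt)
            cur = pb[nxt]              # other extremity of the adjacency in B

    n_genes = len(pa) // 2
    return n_genes - cycles - odd_paths // 2
```

As a usage example under the same assumptions, `dcj_distance([([1, 2, 3], False)], [([1, -2, 3], False)])` compares two linear chromosomes that differ by a single inversion of gene 2. Genomes with unequal gene content are rejected here, which is exactly the gap the ghost adjacencies of the abstract are meant to fill.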