Ordering Partially Assembled Genomes Using Gene Arrangements


Abstract

Several mammalian genomes will only be sequenced at 2X coverage, which makes it impossible to assemble their contigs into chromosomes. We introduce the problem of ordering the contigs of two partially assembled genomes so as to maximize the similarity (measured in terms of genome rearrangements) between the assembled genomes. We give a linear-time algorithm for the Block Ordering Problem (BOP): given two signed permutations (genomes) that have been broken into blocks (contigs), order and orient each set of blocks in such a way that the number of cycles in the breakpoint graph of the resulting permutations is maximized. We illustrate our algorithm on a set of 90 markers from the human and mouse X chromosomes and show how the size of the contigs and the rearrangement distance between the two genomes affect the accuracy of the predicted assemblies. The appendix and an implementation are available at www.mcb.mcgill.ca/~egaul/recomb2006.
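To make the objective concrete, the following is a minimal sketch of how the cycles of a breakpoint graph can be counted and how block orderings can be scored by that count. It is not the linear-time algorithm announced in the abstract: it uses an exponential brute-force search, and it simplifies the problem by ordering the blocks of one genome against a reference that is assumed to be fully assembled and relabeled as the identity permutation 1..n. The function names `breakpoint_graph_cycles`, `flip`, and `best_block_order` are hypothetical and chosen for illustration only.

```python
from itertools import permutations, product


def breakpoint_graph_cycles(perm):
    """Count cycles in the breakpoint graph of a signed permutation.

    Assumption: the second genome is the identity 1..n, so `perm`
    must contain each of 1..n exactly once, with a sign.
    """
    n = len(perm)
    # Replace each signed gene x by its two extremities and frame the
    # sequence with 0 and 2n+1: +x -> (2x-1, 2x), -x -> (2x, 2x-1).
    ext = [0]
    for x in perm:
        a = abs(x)
        ext += [2 * a - 1, 2 * a] if x > 0 else [2 * a, 2 * a - 1]
    ext.append(2 * n + 1)

    # Black edges join extremities that are adjacent in `perm`.
    black = {}
    for i in range(0, 2 * n + 2, 2):
        u, v = ext[i], ext[i + 1]
        black[u], black[v] = v, u

    # Gray edges join 2k and 2k+1 (adjacent in the identity), i.e. v ^ 1.
    seen = set()
    cycles = 0
    for start in range(2 * n + 2):
        if start in seen:
            continue
        cycles += 1
        v = start
        while v not in seen:
            seen.add(v)
            w = black[v]   # follow the black edge
            seen.add(w)
            v = w ^ 1      # then follow the gray edge
    return cycles


def flip(block):
    """Reverse a block and negate its signs (opposite orientation)."""
    return [-x for x in reversed(block)]


def best_block_order(blocks):
    """Brute force: try every order and orientation of the blocks.

    Exponential in the number of blocks; for illustration only.
    Returns (cycle_count, assembled_permutation).
    """
    best = None
    for order in permutations(blocks):
        for flips in product([False, True], repeat=len(blocks)):
            genome = []
            for block, flipped in zip(order, flips):
                genome += flip(block) if flipped else list(block)
            c = breakpoint_graph_cycles(genome)
            if best is None or c > best[0]:
                best = (c, genome)
    return best


print(breakpoint_graph_cycles([1, -2, 3]))   # 3
print(best_block_order([[3], [-2, -1]]))     # (4, [1, 2, 3])
```

A permutation on n genes has at most n + 1 cycles, reached only when it equals the reference, so in the second call the search recovers the identity by flipping the block [-2, -1] and placing it before [3].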