Efficient Algorithms for Analyzing Segmental Duplications, Deletions, and Inversions in Genomes

  • Crystal L. Kahn
  • Shay Mozes
  • Benjamin J. Raphael
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5724)


Segmental duplications, or low-copy repeats, are common in mammalian genomes. In the human genome, most segmental duplications are mosaics consisting of pieces of multiple other segmental duplications. This complex genomic organization complicates analysis of the evolutionary history of these sequences. Earlier, we introduced a genomic distance, called duplication distance, that measures the most parsimonious way to build a target string by repeatedly copying substrings of a source string. We also showed how to use this distance to describe the formation of segmental duplications according to a two-step model that has been proposed to explain human segmental duplications. Here we describe polynomial-time exact algorithms for several extensions of duplication distance, including models that allow certain types of substring deletions and inversions. These extensions will permit more biologically realistic analyses of segmental duplications in genomes.
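To make the definition of duplication distance concrete, the following is a minimal brute-force sketch, not the polynomial-time algorithms described in this paper. It assumes that each operation copies one contiguous substring of the source and inserts it at any position of the string built so far, starting from the empty string. That insertion model, the function names, and the breadth-first search strategy are illustrative assumptions. The search is exponential in the worst case and is only suitable for very short strings.

```python
from collections import deque


def is_subsequence(s, t):
    """Return True if s can be obtained from t by deleting characters."""
    it = iter(t)
    return all(ch in it for ch in s)


def duplication_distance_bruteforce(source, target):
    """Minimum number of copy-and-insert operations that build target.

    Assumed model (hypothetical, for illustration): every operation copies
    a non-empty contiguous substring of `source` and inserts it at any
    position of the current string. No deletions or inversions are allowed.
    Returns None if target cannot be built.
    """
    if not target:
        return 0
    # All non-empty contiguous substrings of the source.
    pieces = {source[i:j]
              for i in range(len(source))
              for j in range(i + 1, len(source) + 1)}
    seen = {""}
    queue = deque([("", 0)])
    while queue:
        current, steps = queue.popleft()
        for piece in pieces:
            if len(current) + len(piece) > len(target):
                continue
            for pos in range(len(current) + 1):
                candidate = current[:pos] + piece + current[pos:]
                if candidate in seen:
                    continue
                # Without deletions, every intermediate string must be a
                # subsequence of the final target, so prune all others.
                if not is_subsequence(candidate, target):
                    continue
                if candidate == target:
                    return steps + 1
                seen.add(candidate)
                queue.append((candidate, steps + 1))
    return None
```

For example, with source `abc`, the target `aabcbc` can be built in two operations under this model: copy `abc`, then insert a second copy of `abc` after the first `a`.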





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Crystal L. Kahn (1)
  • Shay Mozes (1)
  • Benjamin J. Raphael (1, 2)
  1. Department of Computer Science, Brown University, USA
  2. Center for Computational Molecular Biology, Brown University, Providence, USA
