Gaps and Runs in Syntenic Alignments

  • Zhe Yu
  • Chunfang Zheng
  • David Sankoff
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12099)


Gene loss is the obverse of novel gene acquisition by a genome through a variety of evolutionary processes. It serves a number of functional and structural roles, compensating for the energy and material costs of gene complement expansion.

A type of gene loss widespread in the lineages of plant genomes is “fractionation” after whole genome doubling or tripling, where one of a pair or triplet of paralogous genes in parallel syntenic contexts is discarded.

The detailed syntenic mechanisms of gene loss, especially in fractionation, remain controversial.

We focus on the frequency distribution of gap lengths (measured in deleted genes, not nucleotides) within syntenic blocks calculated during the comparison of chromosomes from two genomes. We mathematically characterize a simple model in some detail and show that it adequately describes neither the Coffea arabica subgenomes nor its two progenitor genomes.
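As an illustration of the quantity being studied, the following minimal sketch tabulates gap lengths in a single syntenic block. It assumes the block has already been reduced to a presence/absence vector over ancestral gene positions; that representation and the function name `gap_length_counts` are hypothetical conveniences and are not taken from the paper.

```python
from collections import Counter
from itertools import groupby


def gap_length_counts(retained):
    """Count runs of consecutive deleted genes in one syntenic block.

    `retained` holds one boolean per ancestral gene position:
    True if the gene is still present, False if it has been deleted.
    Returns a Counter mapping gap length (in genes) to its frequency.
    """
    counts = Counter()
    # groupby splits the sequence into maximal runs of equal values.
    for is_retained, run in groupby(retained):
        if not is_retained:
            counts[len(list(run))] += 1
    return counts


# Toy block with gaps of length 2, 1 and 3 (made-up data).
block = [True, False, False, True, False, True, True,
         False, False, False, True]
print(sorted(gap_length_counts(block).items()))  # [(1, 1), (2, 1), (3, 1)]
```

Summing such counters over all blocks in a genome comparison would give an empirical gap-length distribution of the kind the abstract refers to.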

We find that a mixture of two models fits well: a random, one-gene-at-a-time deletion model and an excision model that removes a variable number of genes, with geometrically distributed length.
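A rough simulation can make the idea of such a mixture concrete. The sketch below is not the authors' model or fitting procedure. It assumes that each deletion event is a single-gene loss with probability `p_single` and otherwise an excision whose length is geometric with parameter `q`, and that events hitting already-deleted positions simply leave them deleted. All names and parameters are hypothetical.

```python
import random


def simulate_fractionation(n_genes, n_events, p_single, q, seed=0):
    """Apply random deletion events to a block of `n_genes` genes.

    Each event starts at a uniformly chosen position. With probability
    `p_single` it removes one gene; otherwise it removes a run whose
    length is geometric on 1, 2, 3, ... with success probability `q`.
    Overlap handling (deleted genes stay deleted) is an assumption.
    Returns a list of booleans: True = retained, False = deleted.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    rng = random.Random(seed)
    retained = [True] * n_genes
    for _ in range(n_events):
        start = rng.randrange(n_genes)
        length = 1
        if rng.random() >= p_single:
            # Geometric draw: keep extending until the first "success".
            while rng.random() >= q:
                length += 1
        # Excisions are truncated at the end of the block.
        for i in range(start, min(start + length, n_genes)):
            retained[i] = False
    return retained


block = simulate_fractionation(n_genes=500, n_events=100,
                               p_single=0.5, q=0.4, seed=1)
print(block.count(False), "of", len(block), "genes deleted")
```

Because later events can merge or extend earlier gaps, the observed gap lengths in the output need not follow the geometric distribution used to draw individual excisions.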


Gene loss · Tetraploidy · Fractionation · Plant genomes · Coffee · Run length



Research supported in part by grants from the Natural Sciences and Engineering Research Council of Canada. DS holds the Canada Research Chair in Mathematical Genomics.



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. University of Ottawa, Ottawa, Canada
