Bounds on Identification of Genome Evolution Pacemakers

  • Sagi Snir
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10847)


Several works have pointed out that the tight correlation between the evolutionary rates of genes is better explained by a model called the Universal Pacemaker (UPM) than by the simple rate constancy posited by the classical Molecular Clock (MC) hypothesis. Under UPM, the relative evolutionary rates of all genes remain nearly constant, whereas the absolute rates can change arbitrarily according to the pacemaker ticks. This evolutionary framework was recently adapted to model epigenetic aging, where methylated sites are the analogs of evolving genes.
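The UPM assumption described above can be illustrated with a minimal simulation sketch. It is not the paper's method: the multiplicative form (edge length = gene rate × pacemaker tick × noise), the log-normal distributions, and the names `simulate_upm`, `pearson`, and `log_edge_correlation` are all assumptions made here for illustration. The sketch shows that when every gene follows the same pacemaker, the log edge lengths of any two genes are strongly correlated across tree edges, even though the ticks themselves vary arbitrarily.

```python
import math
import random


def simulate_upm(n_genes, n_edges, noise_sd, seed=0):
    """Simulate edge lengths under one shared pacemaker (illustrative assumption).

    lengths[g][j] = rate[g] * tick[j] * exp(noise), with log-normal rates and
    ticks and Gaussian noise on the log scale.
    """
    rng = random.Random(seed)
    ticks = [rng.lognormvariate(0.0, 1.0) for _ in range(n_edges)]  # pacemaker ticks, one per edge
    rates = [rng.lognormvariate(0.0, 1.0) for _ in range(n_genes)]  # intrinsic rate, one per gene
    return [
        [r * t * math.exp(rng.gauss(0.0, noise_sd)) for t in ticks]
        for r in rates
    ]


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def log_edge_correlation(lengths, g1, g2):
    """Correlation between the log edge lengths of two genes across all edges."""
    log1 = [math.log(v) for v in lengths[g1]]
    log2 = [math.log(v) for v in lengths[g2]]
    return pearson(log1, log2)


if __name__ == "__main__":
    lengths = simulate_upm(n_genes=5, n_edges=200, noise_sd=0.1)
    print(log_edge_correlation(lengths, 0, 1))
```

Raising `noise_sd` relative to the spread of the ticks weakens the correlation, which is one informal way to see why the variances matter when deciding which genes share a pacemaker.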

A question that follows from this finding is how many such pacemakers exist and which gene adheres to which pacemaker. This, however, turns out to be a nontrivial task that is affected by the number of variables, their random noise, and the amount of available information. To this end, a clustering heuristic was devised that exploits the correlation between corresponding edge lengths across thousands of gene trees. Nevertheless, no theoretical study of the relationship between these parameters has been done.

Here we study this question by providing theoretical bounds, expressed in terms of the system parameters, on the probabilities of positive and negative results. We corroborate these results with a simulation study that reveals the critical role of the variances.


Keywords: Phylogenetics, Universal Pacemaker, Gene partitioning, Probabilistic geometrical clustering



We would like to thank Eugene Koonin and Yuri Wolf for inspiring the question, and Ilan Newman and Nick Harvey for helpful discussions. We also thank the anonymous reviewers for their helpful and meticulous comments, which were used to clarify the exposition. Part of this work was done while the author was visiting the NIH, USA, supported by Intramural funds of the US Department of Health and Human Services.



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Evolutionary Biology, University of Haifa, Haifa, Israel
