Pacemaker Partition Identification
The universally observed conservation of the distribution of evolutionary rates across the complete sets of orthologous genes in pairs of related genomes can be explained by the Universal Pacemaker (UPM) model of genome evolution. Under UPM, the relative evolutionary rates of all genes remain nearly constant, whereas the absolute rates can change arbitrarily. It has been shown for several groups of taxa spanning the entire tree of life that the UPM model describes the evolutionary process better than the traditional molecular clock model. Here we extend this analysis and ask: how many pacemakers are there, and which genes are affected by which pacemaker? The answer to this question induces a partition of the gene set such that all the genes in one part are affected by the same pacemaker. The input to the problem carries an arbitrary amount of statistical noise, which makes the problem harder still. In this work we devise a novel heuristic procedure, relying on statistical and geometrical tools, to solve the pacemaker partition identification problem, and we demonstrate by simulation that this approach copes satisfactorily with considerable noise and realistic problem sizes. We applied this procedure to a set of over 2000 genes in 100 prokaryotes and found statistically significant support for the existence of two pacemakers.
Keywords: Molecular Evolution, Genome Evolution, Pacemaker, Deming regression, Partition Distance, Gap Statistics
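The keywords name Deming regression, an errors-in-variables line fit that treats both coordinates as noisy. For orientation, the following is a minimal sketch of the textbook closed-form estimator. It is not the authors' procedure: the function name `deming_fit`, the `delta` parameter (the assumed ratio of the error variance in y to that in x) and the toy data are illustrative assumptions.

```python
import math


def deming_fit(x, y, delta=1.0):
    """Fit y = intercept + slope * x when both x and y carry noise.

    delta is the assumed ratio var(errors in y) / var(errors in x);
    delta = 1.0 gives orthogonal regression. Hypothetical helper,
    written for illustration only.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")

    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sample variances and covariance.
    sxx = sum((a - mean_x) ** 2 for a in x) / (n - 1)
    syy = sum((b - mean_y) ** 2 for b in y) / (n - 1)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    if sxy == 0:
        raise ValueError("slope is undefined when the covariance is zero")

    # Closed-form Deming slope.
    d = syy - delta * sxx
    slope = (d + math.sqrt(d * d + 4.0 * delta * sxy * sxy)) / (2.0 * sxy)
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    # Toy data: two noisy series that are roughly proportional.
    xs = [1.0, 2.0, 3.0, 4.0, 5.0]
    ys = [2.1, 3.9, 6.2, 7.8, 10.1]
    print(deming_fit(xs, ys))
```

With `delta=1.0` the fit is symmetric: swapping the two series yields the reciprocal slope, which ordinary least squares does not guarantee. This property matters when neither series is a natural "independent" variable.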