Journal of Mathematical Biology, Volume 53, Issue 5, pp 759–770

A model for the evolution of paralog families in genomes

  • Ryszard Rudnicki
  • Jerzy Tiuryn
  • Damian Wójtowicz


We introduce and analyse a simple probabilistic model of genome evolution. It is based on three fundamental evolutionary events: gene loss, duplication and accumulated change. The model is motivated by previous work that fitted the available genomic data to so-called paralog distributions. The formalism is described by an infinite system of linear equations. We show that this system generates a semigroup of linear operators on the space ℓ¹. We prove that the size distribution of paralogous gene families in a genome converges to the equilibrium as time goes to infinity. Moreover, we show that when the probabilities of gene removal and duplication are close to each other, the resulting distribution is close to a logarithmic distribution. Some empirical results for yeast genomes are presented.
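
For readers unfamiliar with the logarithmic distribution named above, the following is a minimal sketch of its standard probability mass function, p_k = -θ^k / (k ln(1 - θ)) for family sizes k = 1, 2, ... and a parameter θ in (0, 1). The abstract does not state how θ relates to the model's removal and duplication probabilities, so θ is treated here as a free parameter, and the function names are illustrative only, not taken from the paper.

```python
import math


def logarithmic_pmf(k: int, theta: float) -> float:
    """Standard logarithmic (log-series) pmf: -theta**k / (k * ln(1 - theta)).

    theta is a free shape parameter in (0, 1); its link to the model's
    gene removal and duplication probabilities is NOT given in the abstract.
    """
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must lie strictly between 0 and 1")
    if k < 1:
        return 0.0  # the distribution is supported on k = 1, 2, 3, ...
    # log1p(-theta) computes ln(1 - theta) accurately for small theta
    return -(theta ** k) / (k * math.log1p(-theta))


def family_size_probabilities(theta: float, max_size: int) -> list[float]:
    """Probabilities of family sizes 1..max_size (hypothetical helper)."""
    return [logarithmic_pmf(k, theta) for k in range(1, max_size + 1)]


if __name__ == "__main__":
    # Print the first few family-size probabilities for an example theta.
    for size, p in enumerate(family_size_probabilities(0.9, 5), start=1):
        print(size, round(p, 4))
```

The probabilities decrease with family size, so single-gene families are always the most likely class under this distribution; values of θ near 1 give a heavier tail of large families.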


Keywords: Genome evolution, Paralogous genes, Markov semigroups, Asymptotic stability




Copyright information

© Springer-Verlag 2006

Authors and Affiliations

  • Ryszard Rudnicki (1, corresponding author)
  • Jerzy Tiuryn (2)
  • Damian Wójtowicz (2)

  1. Institute of Mathematics, Polish Academy of Sciences and Institute of Mathematics, Silesian University, Katowice, Poland
  2. Institute of Informatics, Warsaw University, Warsaw, Poland
