On Genome Evolution with Accumulated Change and Innovation

  • Damian Wójtowicz
  • Jerzy Tiuryn
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4205)


We introduce and analyse a simple discrete probabilistic model of genome evolution. It is based on four fundamental evolutionary events: gene duplication, loss, change and innovation, and is called the DLCI model. This is the first such model to be rigorously analysed. The paper focuses on the size distribution of gene families. We derive formulas for the equilibrium gene family sizes and show that they follow a logarithmic distribution. We also consider a disjoint union of DLCI models and present the results of this study. Some empirical results for microbial genomes are presented.
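To make the "logarithmic distribution" concrete, the following is a minimal sketch of the standard logarithmic (log-series) probability mass function, P(K = k) = -theta^k / (k ln(1 - theta)) for k = 1, 2, .... The parameter name `theta` and both function names are assumptions chosen for illustration; the abstract does not say how the distribution's parameter depends on the duplication, loss, change and innovation rates, so no such relation is encoded here.

```python
import math


def log_series_pmf(k: int, theta: float) -> float:
    """Probability that a gene family has size k under a logarithmic
    (log-series) distribution with parameter theta, 0 < theta < 1.

    theta is a hypothetical placeholder, not a parameter of the DLCI model.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must lie strictly between 0 and 1")
    # log1p(-theta) computes ln(1 - theta) accurately for small theta.
    return -(theta ** k) / (k * math.log1p(-theta))


def log_series_mean(theta: float) -> float:
    """Mean family size under the same distribution."""
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must lie strictly between 0 and 1")
    return -theta / ((1.0 - theta) * math.log1p(-theta))


if __name__ == "__main__":
    theta = 0.5  # illustrative value only
    for size in range(1, 6):
        print(size, log_series_pmf(size, theta))
```

With `theta = 0.5`, families of size 1 have probability 1 / (2 ln 2), roughly 0.7213, and the probabilities decay roughly geometrically with an extra 1/k factor, which is the qualitative shape the abstract attributes to equilibrium family sizes.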


Gene Duplication, Family Size, Innovation Process, Genome Evolution, Paralogous Gene
These keywords were added by machine and not by the authors.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Damian Wójtowicz (1)
  • Jerzy Tiuryn (1)
  1. Institute of Informatics, Warsaw University, Warsaw, Poland
