Genetic Paralog Analysis and Simulations

  • Stanisław Cebrat
  • Jan P. Radomski
  • Dietrich Stauffer
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3039)

Abstract

Using Monte Carlo methods, we simulated how bias in the generation and elimination of paralogs affects the size distribution of paralog groups. The function describing the decay of the number of paralog groups with their size was found to depend on the ratio between the probability of gene duplication and the probability of gene deletion, a ratio that corresponds to different selection pressures on genome size. Slightly different slopes of the curves describing this decay were also observed when the threshold of homology between paralogous sequences was changed.
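The abstract does not spell out the model, so the following is only a minimal sketch of the general idea, not the authors' simulation. It assumes a toy process in which a randomly chosen gene is either duplicated (with probability `p_dup`) or deleted (otherwise), and it then counts how many paralog groups of each size remain. The function name, its parameters, and the choice of singleton groups as the starting state are all hypothetical, and the homology threshold mentioned in the abstract is not modelled at all.

```python
import random
from collections import Counter


def simulate_paralog_groups(n_genes=1000, steps=20000, p_dup=0.5, seed=0):
    """Toy duplication/deletion process (an assumption, not the paper's model).

    Returns a dict mapping paralog group size -> number of groups of that size.
    """
    rng = random.Random(seed)
    # genes[i] holds the id of the paralog group that gene i belongs to;
    # every gene starts in its own group (assumed initial condition).
    genes = list(range(n_genes))
    for _ in range(steps):
        if not genes:
            break  # the whole genome has been deleted
        i = rng.randrange(len(genes))
        if rng.random() < p_dup:
            # Duplication: the new copy joins the group of the chosen gene.
            genes.append(genes[i])
        else:
            # Deletion: overwrite the chosen gene with the last one, then pop.
            genes[i] = genes[-1]
            genes.pop()
    group_sizes = Counter(genes)                 # group id -> group size
    return dict(Counter(group_sizes.values()))   # group size -> number of groups


if __name__ == "__main__":
    # Compare a deletion-biased and a duplication-biased genome.
    for p in (0.45, 0.55):
        hist = simulate_paralog_groups(p_dup=p)
        print(p, sorted(hist.items())[:5])
```

Running the sketch for several values of `p_dup` and plotting the number of groups against group size on logarithmic axes is one way to see how the shape of the decay responds to the duplication-to-deletion ratio in this simplified setting.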

Keywords

  • Genome Size
  • Down Syndrome
  • Paralogous Sequence
  • Paralog Group
  • Paralog Pair

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Stanisław Cebrat (1)
  • Jan P. Radomski (2)
  • Dietrich Stauffer (3)
  1. Institute of Genetics and Microbiology, University of Wroclaw, Wroclaw, Poland
  2. Interdisciplinary Center for Computational and Mathematical Modeling, Warsaw University, Warsaw, Poland
  3. Institute for Theoretical Physics, Cologne University, Köln, Euroland
