Statistics of the Numbers of Transcripts and Protein Sequences Encoded in the Genome

Copyright information

© 2003 Kluwer Academic Publishers

About this chapter

Cite this chapter

Kuznetsov, V.A. (2003). Statistics of the Numbers of Transcripts and Protein Sequences Encoded in the Genome. In: Zhang, W., Shmulevich, I. (eds) Computational and Statistical Approaches to Genomics. Springer, Boston, MA. https://doi.org/10.1007/0-306-47825-0_9

  • DOI: https://doi.org/10.1007/0-306-47825-0_9

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4020-7023-5

  • Online ISBN: 978-0-306-47825-3

  • eBook Packages: Springer Book Archive
