
Statistical Methods in Serial Analysis of Gene Expression (Sage)

Computational and Statistical Approaches to Genomics

6. Conclusions

In this chapter we aimed to give a guide to the state of the art in statistical methods for SAGE analysis. We only touched on some issues in order to stay focused on the problem of detecting differential expression, but we hope that the main ideas will help readers navigate the original literature. We saw that estimating a tag's abundance need not be as simple as dividing its observed count by the total number of sequenced tags; it can instead receive more sophisticated treatment, such as multinomial estimation, correction of potential sequencing errors, and incorporation of a priori knowledge. Given an (assumed) error-corrected data set, one can search for tags that are differentially expressed among conditions. Several methods for this were mentioned, but we stress the importance of experimental designs with biological replication in order to capture information that holds in general. Finally, we want to point out that only the accumulation of experimental data in public databases, with biological replication, together with the use of sound statistics, can improve the general usefulness of SAGE, MPSS, or EST counting data and help answer basic and applied questions about gene expression.
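As an illustration of the contrast between the simplest abundance estimate and a slightly more sophisticated one, the following minimal Python sketch compares the observed proportion (count divided by sequenced total) with the posterior mean under a symmetric Dirichlet prior on the multinomial tag proportions. This is not the chapter's own method: the tag names, the counts, and the pseudo-count value of 0.5 are hypothetical choices made only for the example, and the prior is placed over the listed tags alone, which is a simplifying assumption.

```python
def naive_abundance(counts):
    """Observed proportion: tag count divided by the sequenced total."""
    total = sum(counts.values())
    return {tag: c / total for tag, c in counts.items()}


def dirichlet_posterior_mean(counts, pseudo_count=0.5):
    """Posterior mean of multinomial proportions under a symmetric
    Dirichlet(pseudo_count) prior over the listed tags (an assumption)."""
    total = sum(counts.values())
    n_tags = len(counts)
    denom = total + n_tags * pseudo_count
    return {tag: (c + pseudo_count) / denom for tag, c in counts.items()}


# Hypothetical library: tag identifiers and counts are made up.
library = {"TAG_A": 90, "TAG_B": 9, "TAG_C": 1, "TAG_D": 0}

naive = naive_abundance(library)
smoothed = dirichlet_posterior_mean(library)

for tag in library:
    print(f"{tag}: naive={naive[tag]:.4f}  smoothed={smoothed[tag]:.4f}")
```

In this sketch the unobserved tag receives an estimate of exactly zero from the naive rule but a small positive estimate from the posterior mean, which is one simple way prior knowledge can enter the estimate. Error correction and replicate-aware modeling, also mentioned above, are outside what this example covers.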




Copyright information

© 2006 Springer Science+Business Media, Inc.

About this chapter

Cite this chapter

Vêncio, R.Z.N., Brentani, H. (2006). Statistical Methods in Serial Analysis of Gene Expression (Sage). In: Zhang, W., Shmulevich, I. (eds) Computational and Statistical Approaches to Genomics. Springer, Boston, MA. https://doi.org/10.1007/0-387-26288-1_11
