
Genomic Outlier Detection in High-Throughput Data Analysis

  • Protocol
Statistical Methods for Microarray Data Analysis

Part of the book series: Methods in Molecular Biology (MIMB, volume 972)

Abstract

In the analysis of high-throughput data, a very common goal is the detection of genes with differential expression between two groups or classes. A recent finding from the prostate cancer literature demonstrates that searching for a different pattern of differential expression may reveal new candidate oncogenes. In this chapter, we describe the resulting statistical problem, termed oncogene outlier detection, and review a variety of proposed solutions. A statistical model for the multiclass situation is described, and links with multiple testing concepts are established. Some new nonparametric procedures are described and compared to existing methods using simulation studies.
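
To make the "different pattern of differential expression" concrete, the following is a minimal sketch of a percentile-based outlier score, in the spirit of cancer outlier profile analysis. It is not the chapter's own procedure: the function name `outlier_scores`, the percentile `q`, and the simulated data are all assumptions chosen for illustration. Each gene is centered by its median and scaled by its median absolute deviation (MAD) across all samples, and the score is a high percentile of the standardized values in the tumor group, so a gene overexpressed in only a subset of tumors can still score highly.

```python
import numpy as np

def outlier_scores(normal, tumor, q=90):
    """Hypothetical helper: per-gene outlier score.

    normal, tumor: arrays of shape (genes, samples).
    Returns the q-th percentile of the tumor values after
    median/MAD standardization computed over all samples.
    """
    pooled = np.concatenate([normal, tumor], axis=1)
    med = np.median(pooled, axis=1, keepdims=True)
    mad = np.median(np.abs(pooled - med), axis=1, keepdims=True)
    mad = np.where(mad == 0, 1.0, mad)  # avoid division by zero for flat genes
    z = (tumor - med) / mad
    return np.percentile(z, q, axis=1)

# Simulated data (an assumption, not from the chapter):
# 200 genes, 20 normal and 20 tumor samples, all standard normal,
# except gene 0, which is shifted upward in only 5 of the 20 tumors.
rng = np.random.default_rng(0)
normal = rng.normal(size=(200, 20))
tumor = rng.normal(size=(200, 20))
tumor[0, :5] += 8.0

scores = outlier_scores(normal, tumor, q=90)
```

Under these assumptions, gene 0 should receive a clearly elevated score even though most of its tumor samples look normal, which is the kind of signal a comparison of group means tends to dilute.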


Acknowledgments

The author would like to acknowledge the support of the Huck Institutes of Life Sciences at Penn State University and NIH Grant R01GM72007.

Author information

Corresponding author

Correspondence to Debashis Ghosh.

Copyright information

© 2013 Springer Science+Business Media New York

About this protocol

Cite this protocol

Ghosh, D. (2013). Genomic Outlier Detection in High-Throughput Data Analysis. In: Yakovlev, A., Klebanov, L., Gaile, D. (eds) Statistical Methods for Microarray Data Analysis. Methods in Molecular Biology, vol 972. Humana Press, New York, NY. https://doi.org/10.1007/978-1-60327-337-4_9

  • DOI: https://doi.org/10.1007/978-1-60327-337-4_9

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-60327-336-7

  • Online ISBN: 978-1-60327-337-4

  • eBook Packages: Springer Protocols
