Genomic Outlier Detection in High-Throughput Data Analysis

Ghosh, Debashis

doi:10.1007/978-1-60327-337-4_9

Part of the book series: Methods in Molecular Biology (MIMB, volume 972)

Abstract

In the analysis of high-throughput data, a very common goal is the detection of genes that show differential expression between two groups or classes. A recent finding from the prostate cancer literature demonstrates that by searching for a different pattern of differential expression, new candidate oncogenes might be found. In this chapter, we describe the resulting statistical problem, termed oncogene outlier detection, and review a variety of proposals for addressing it. A statistical model for the multiclass situation is described, and links with multiple testing concepts are established. Some new nonparametric procedures are described and compared to existing methods using simulation studies.

Acknowledgments

The author would like to acknowledge the support of the Huck Institutes of Life Sciences at Penn State University and NIH Grant R01GM72007.

Author information

Authors and Affiliations

Departments of Statistics and Public Health Sciences, Penn State University, DuBois, PA, USA
Debashis Ghosh

Corresponding author

Correspondence to Debashis Ghosh.

Editor information

Editors and Affiliations

School of Medicine & Dentistry, Dept. Biostatistics & Computational, University of Rochester, Elmwood Ave. 601, Rochester, 14642, New York, USA
Andrei Y. Yakovlev
Department of Probability and Statistics, Charles University, Sokolovska 83, Prague, 18675, Czech Republic
Lev Klebanov
State University of New York at Buffalo, Main St - 706 Kimball Tower 3435, Buffalo, 14214, New York, USA
Daniel Gaile

Cite this protocol

Ghosh, D. (2013). Genomic Outlier Detection in High-Throughput Data Analysis. In: Yakovlev, A., Klebanov, L., Gaile, D. (eds) Statistical Methods for Microarray Data Analysis. Methods in Molecular Biology, vol 972. Humana Press, New York, NY. https://doi.org/10.1007/978-1-60327-337-4_9

DOI: https://doi.org/10.1007/978-1-60327-337-4_9
Published: 03 January 2013
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-60327-336-7
Online ISBN: 978-1-60327-337-4
eBook Packages: Springer Protocols
