Abstract
In the analysis of high-throughput data, a very common goal is the detection of genes with differential expression between two groups or classes. A recent finding in the prostate cancer literature demonstrates that searching for a different pattern of differential expression may reveal new candidate oncogenes. In this chapter, we describe the resulting statistical problem, termed oncogene outlier detection, and review a variety of proposed solutions. A statistical model for the multiclass situation is described, and links with multiple testing concepts are established. Some new nonparametric procedures are described and compared with existing methods using simulation studies.
Acknowledgments
The author would like to acknowledge the support of the Huck Institutes of Life Sciences at Penn State University and NIH Grant R01GM72007.
Copyright information
© 2013 Springer Science+Business Media New York
Cite this protocol
Ghosh, D. (2013). Genomic Outlier Detection in High-Throughput Data Analysis. In: Yakovlev, A., Klebanov, L., Gaile, D. (eds) Statistical Methods for Microarray Data Analysis. Methods in Molecular Biology, vol 972. Humana Press, New York, NY. https://doi.org/10.1007/978-1-60327-337-4_9
DOI: https://doi.org/10.1007/978-1-60327-337-4_9
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-60327-336-7
Online ISBN: 978-1-60327-337-4
eBook Packages: Springer Protocols