Analysis of Array Data and Clinical Validation of Array-Based Assays

Chapter in: Microarrays in Diagnostics and Biomarker Development

Abstract

High-throughput array-based assays have been widely used for diagnostics and biomarker discovery. However, developing these assays requires both the analysis of high-dimensional genomic data and the clinical validation of the resulting models, and both remain challenging. In this chapter, we describe the steps of array-based data analysis, from data quality control and normalization to higher-level analyses such as clustering, dimensionality reduction, and predictive modeling, with special emphasis on the pitfalls and dangers of such analyses. We then tackle the problems related to clinical validation of array-based biomarkers and predictive assays, which include the reproducibility and portability of the initial discovery and its translation into clinical practice. As array-based data, and data generated by next-generation sequencing technologies, become less expensive to produce and more widely available, the growing number of patients for whom we have genomic data will open new opportunities for the development of robust and reliable genomic biomarkers, provided we apply the lessons learned from the last decade of array-based studies.

Notes

  1. http://www.transcriptome.ens.fr/AffyGCQC/.

  2. http://statistics.byu.edu/johnson/ComBat/.

  3. Although it may be helpful to think of the hazard as the instantaneous probability of an event at time t, this quantity is not a probability and may be greater than 1. This is due to the division by Δt. Although the hazard has no upper bound, it cannot be smaller than 0.

  4. http://www.r-project.org/.

  5. http://www.latex-project.org/.

  6. http://www.broadinstitute.org/cancer/software/genepattern/.
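
The point made in Note 3 can be checked numerically. The following is a minimal sketch, not taken from the chapter: it assumes an exponentially distributed event time with a hypothetical constant rate of 2 events per unit time, and the function names are illustrative. It approximates the hazard as the conditional probability of an event in [t, t + Δt), given survival to t, divided by Δt. The conditional probability always stays between 0 and 1, while the ratio approaches the rate, which here exceeds 1.

```python
import math


def exponential_survival(t: float, rate: float) -> float:
    """S(t) = P(T > t) for an exponential event time (assumed model)."""
    return math.exp(-rate * t)


def conditional_event_probability(t: float, dt: float, rate: float) -> float:
    """P(t <= T < t + dt | T >= t): a genuine probability, always in [0, 1]."""
    s_t = exponential_survival(t, rate)
    s_next = exponential_survival(t + dt, rate)
    return (s_t - s_next) / s_t


def interval_hazard(t: float, dt: float, rate: float) -> float:
    """Finite-difference hazard: the conditional probability divided by dt."""
    return conditional_event_probability(t, dt, rate) / dt


if __name__ == "__main__":
    rate = 2.0  # hypothetical rate: two events per unit time
    t = 0.5     # arbitrary time point; the exponential hazard does not depend on t
    for dt in (1.0, 0.1, 0.001):
        prob = conditional_event_probability(t, dt, rate)
        hazard = interval_hazard(t, dt, rate)
        print(f"dt={dt:<6} conditional probability={prob:.4f} hazard estimate={hazard:.4f}")
```

As Δt shrinks, the conditional probability shrinks with it, but the ratio does not; this is why the hazard is better read as a rate (events per unit time) than as a probability.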

References

  • Affymetrix (2004) GeneChip expression analysis: data analysis fundamentals, vol 2447, pp 1–42. doi:10.1002/jnr.10268

  • Alizadeh AA, Eisen MB, Davis RE, Ma C, Lossos IS, Rosenwald A, Boldrick JC, Sabet H, Tran T, Yu X, Powell JI, Yang L, Marti GE, Moore T, Hudson J Jr, Lu L, Lewis DB, Tibshirani R, Sherlock G, Chan WC, Greiner TC, Weisenburger DD, Armitage JO, Warnke R, Levy R, Wilson W, Grever MR, Byrd JC, Botstein D, Brown PO, Staudt LM (2000) Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature 403(6769):503–511

    Article  PubMed  CAS  Google Scholar 

  • Allison PD, Inc. SI (eds) (1995) Survival analysis using SAS: a practical guide. SAS Institute Inc., Cary, NC

    Google Scholar 

  • Alon U, Barkai N, Notterman DA, Gish K, Ybarra S, Mack D, Levine AJ (1999) Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Proc Natl Acad Sci USA 96(12):6745–6750

    Article  PubMed  CAS  Google Scholar 

  • Alter O, Brown PO, Botstein D (2000) Singular value decomposition for genome-wide expression data processing and modeling. Proc Natl Acad Sci USA 97(18):10101–10106. doi:97/18/10101 [pii]

    Article  PubMed  CAS  Google Scholar 

  • Ambroise C, McLachlan GJ (2002) Selection bias in gene extraction on the basis of microarray gene-expression data. Proc Natl Acad Sci USA 99(10):6562–6566. doi:10.1073/pnas.102102699

    Article  PubMed  CAS  Google Scholar 

  • Bach FR, Jordan MI (2003) Kernel independent component analysis. J Mach Learn Res 3:1–48

    Google Scholar 

  • Bair E, Tibshirani R (2004) Semi-supervised methods to predict patient survival from gene expression data. PLoS Biol 2(4):511–522

    Article  CAS  Google Scholar 

  • Barrett T, Suzek TO, Troup DB, Wilhite SE, Ngau WC, Ledoux P, Rudnev D, Lash AE, Fujibuchi W, Edgar R (2005) NCBI GEO: mining millions of expression profiles—database and tool. Nucleic Acids Res 33:D562

    Article  PubMed  CAS  Google Scholar 

  • Beer DG, Kardia SL, Huang CC, Giordano TJ, Levin AM, Misek DE, Lin L, Chen G, Gharib TG, Thomas DG, Lizyness ML, Kuick R, Hayasaka S, Taylor JM, Iannettoni MD, Orringer MB, Hanash S (2002) Gene-expression profiles predict survival of patients with lung adenocarcinoma. Nat Med 8(8):816–824

    PubMed  CAS  Google Scholar 

  • Ben-Hur A, Elisseeff A, Guyon I (2002) A stability based method for discovering structure in clustered data. Proc Pac Symp Biocomput 7:6–17

    Google Scholar 

  • Benito M, Parker J, Du Q, Wu J, Xiang D, Perou CM, Marron JS (2004) Adjustment of systematic microarray data biases. Bioinformatics 20(1):105–114

    Article  PubMed  CAS  Google Scholar 

  • Berrer DP, Dubitzky W, Granzow M (2002) A practical approach to microarray data analysis, 1st edn. Springer, New York

    Google Scholar 

  • Bhattacharjee A, Richards WG, Staunton J, Li C, Monti S, Vasa P, Ladd C, Beheshti J, Bueno R, Gillette M, Loda M, Weber G, Mark EJ, Lander ES, Wong W, Johnson BE, Golub TR, Sugarbaker DJ, Meyerson M (2001) Classification of human lung carcinomas by mRNA expression profiling reveals distinct adenocarcinoma subclasses. Proc Natl Acad Sci USA 98(24):13790–13795

    Article  PubMed  CAS  Google Scholar 

  • Bishop CM, Jordan M, Kleinberg J, Scholkopf B (eds) (2006) Pattern recognition and machine learning information science and statistics. Springer, New York

    Google Scholar 

  • Bloom G, Yang IV, Boulware D, Kwong KY, Coppola D, Eschrich S, Quackenbush J, Yeatman TJ (2004) Multi-platform, multi-site, microarray-based human tumor classification. Am J Pathol 164(1):9–16

    Article  PubMed  CAS  Google Scholar 

  • Bolstad BM (2004) Low-level analysis of high-density oligonucleotide array data: background normalization and summarization. University of California, Berkeley

    Google Scholar 

  • Bolstad BM, Irizarry RA, Astrand M, Speed TP (2003) A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics 19(2):185–193

    Article  PubMed  CAS  Google Scholar 

  • Boulesteix AL, Porzelius C, Daumer M (2008) Microarray-based classification and clinical predictors: on combined classifiers and additional predictive value. Bioinformatics 24(15):1698–1706. doi:btn262 [pii]10.1093/bioinformatics/btn262

    Article  PubMed  CAS  Google Scholar 

  • Breiman L, Friedman JH, Olshen RA, Stone CJ (1984) Classification and regression trees. Chapman and Hall, New York

    Google Scholar 

  • Bylesjo M, Eriksson D, Sjodin A, Jansson S, Moritz T, Trygg J (2007) Orthogonal projections to latent structures as a strategy for microarray data normalization. BMC Bioinformatics 8:207. doi:1471-2105-8-207 [pii]10.1186/1471-2105-8-207

    Article  PubMed  CAS  Google Scholar 

  • Cardoso F, Piccart-Gebhart M, Van’t Veer L, Rutgers E (2007) The MINDACT trial: the first prospective clinical validation of a genomic tool. Mol Oncol 1(3):246–251. doi:S1574-7891(07)00077-4 [pii]10.1016/j.molonc.2007.10.004

    Article  PubMed  Google Scholar 

  • Caruana R, Niculescu-Mizil A (2004) Data mining in metric space: an empirical analysis of supervised learning performance criteria. Paper presented at the ACM SIGKDD international conference on Knowledge discovery and data mining, New York

    Google Scholar 

  • Chen C, Grennan K, Badner J, Zhang D, Gershon E, Jin L, Liu C (2011) Removing batch effects in analysis of expression microarray data: an evaluation of six batch adjustment methods. PLoS One 6(2):e17238. doi:10.1371/journal.pone.0017238

    Article  PubMed  CAS  Google Scholar 

  • Cheng Y, Church GM (2000) Biclustering of expression data. Proc Int Conf Intell Syst Mol Biol 8:93–103

    PubMed  CAS  Google Scholar 

  • Cobleigh MA, Tabesh B, Bitterman P, Baker J, Cronin M, Liu ML, Borchik R, Mosquera JM, Walker MG, Shak S (2005) Tumor gene expression and prognosis in breast cancer patients with 10 or more positive lymph nodes. Clin Cancer Res 11(24 Pt 1):8623–8631. doi:11/24/8623 [pii]10.1158/1078-0432.CCR-05-0735

    Article  PubMed  CAS  Google Scholar 

  • Collobert R, Bengio S (2001) SVMTorch: support vector machines for large-scale regression problems. J Mach Learn Res 1:143–160

    Google Scholar 

  • Contopoulos-Ioannidis DG, Alexiou GA, Gouvias TC, Ioannidis JP (2008) Medicine. Life cycle of translational research for medical interventions. Science 321(5894):1298–1299. doi:321/5894/1298 [pii]10.1126/science.1160622

    Article  PubMed  CAS  Google Scholar 

  • Cox DR (1972) Regression models and life tables. J R Stat Soc Ser B 34:187–220

    Google Scholar 

  • Cristianini N, Press CCU, Shawe-Taylor J (eds) (2000) An introduction to support vector machines and other kernel-based learning methods. Cambridge University Press, Cambridge

    Google Scholar 

  • Dasarathy BV (ed) (1990) Nearest neighbor: pattern classification techniques. IEEE Computer Society Press, New York

    Google Scholar 

  • Davis CA, Gerick F, Hintermair V, Friedel CC, Fundel K, Kuffner R, Zimmer R (2006) Reliable gene signatures for microarray classification: assessment of stability and performance. Bioinformatics 22(19):2356–2363. doi:10.1093/bioinformatics/btl400

    Article  PubMed  CAS  Google Scholar 

  • De Smet F, Mathys J, Marchal K, Thijs G, De Moor B, Moreau Y (2002) Adaptive quality-based clustering of gene expression profiles. Bioinformatics 18(5):735–746

    Article  PubMed  Google Scholar 

  • de Souto M, Costa I, de Araujo D, Ludermir T, Schliep A (2008) Clustering cancer gene expression data: a comparative study. BMC Bioinformatics 9(1):497. doi:10.1186/1471-2105-9-497

    Article  PubMed  CAS  Google Scholar 

  • DeRisi J, Penland L, Brown PO, Bittner ML, Meltzer PS, Ray M, Chen Y, Su YA, Trent JM (1996) Use of a cDNA microarray to analyse gene expression patterns in human cancer. Nat Genet 14(4):457–460

    Article  PubMed  CAS  Google Scholar 

  • Desmedt C, Piette F, Loi SM, Wang Y, Lallemand F, Haibe-Kains B, Viale G, Delorenzi M, Zhang Y, d’Assignies MS, Bergh J, Lidereau R, Ellis P, Harris AL, Klijn JGM, Foekens JA, Cardoso F, Piccart MJ, Buyse M, Sotiriou C, Consortium T (2007) Strong time dependence of the 76-gene prognostic signature for node-negative breast cancer patients in the TRANSBIG multicenter independent validation series. Clin Cancer Res 13(11):3207–3214

    Article  PubMed  CAS  Google Scholar 

  • Desmedt C, Haibe-Kains B, Wirapati P, Buyse M, Larsimont D, Bontempi G, Delorenzi M, Piccart M, Sotiriou C (2008) Biological processes associated with breast cancer clinical outcome depend on the molecular subtypes. Clin Cancer Res 14(16):5158–5165. doi:10.1158/1078-0432.CCR-07-4756

    Article  PubMed  CAS  Google Scholar 

  • Duda RO, Hart PR, Stork DG (2001) Pattern classification. Wiley, New York

    Google Scholar 

  • Dudoit S, Fridlyand J, Speed TP (2002) Comparison of discrimination methods for the classification of tumors using gene expression data. J Am Stat Assoc 97(457):77–87

    Article  CAS  Google Scholar 

  • Dupuy A, Simon RM (2007) Critical review of published microarray studies for cancer outcome and guidelines on statistical analysis and reporting. J Natl Cancer Inst 99(2):147–157. doi:10.1093/jnci/djk018

    Article  PubMed  Google Scholar 

  • Efron B, Hastie T, Johnstone I, Tibshirani R (2004) Least angle regression. Ann Stat 32(2):407–499

    Article  Google Scholar 

  • Eisen M, Spellman P, Brown P, Botstein D (1998) Cluster analysis and display of genome-wide expression patterns. PNAS 95:14863–14868

    Article  PubMed  CAS  Google Scholar 

  • Eng-Wong J, Zujewski JA (2008) Current NCI-sponsored cooperative group trials of endocrine therapies in breast cancer. Cancer 112(3 Suppl):723–729. doi:10.1002/cncr.23188

    Article  PubMed  CAS  Google Scholar 

  • Finak G, Bertos N, Pepin F, Sadekova S, Souleimanova M, Zhao H, Chen H, Omeroglu G, Meterissian S, Omeroglu A, Hallett M, Park M (2008) Stromal gene expression predicts clinical outcome in breast cancer. Nat Med 14(5):518–527. doi:10.1038/nm1764

    Article  PubMed  CAS  Google Scholar 

  • Fisher RA (2011) The use of multiple measurements in taxonomic problems. Ann Eugen 7:179–188

    Article  Google Scholar 

  • Fraley C, Raftery AE (2002) Model-based clustering, discriminant analysis, and density estimation. J Am Stat Assoc 97(458):611–631. doi:10.1198/016214502760047131

    Article  Google Scholar 

  • Furey TS, Cristianini N, Duffy N, Bednarski DW, Schummer M, Haussler D (2000) Support vector machine classification and validation of cancer tissue samples using microarray expression data. Bioinformatics 16(10):906–914

    Article  PubMed  CAS  Google Scholar 

  • Gamberger D, Lavrac N (2004) Avoiding data overfitting in scientific discovery: experiments in functional genomics. Paper presented at the ECAI, 22–27 Aug 2004, Valencia, Spain

    Google Scholar 

  • Gentleman R (2005) Reproducible research: a bioinformatics case study. Stat Appl Genet Mol Biol 4(1)

    Google Scholar 

  • Gentleman R, Huber W, Carey VJ, Irizarry RA, Dudoit S (2005) Bioinformatics and computational biology solutions using R and bioconductor. Springer, New York

    Book  Google Scholar 

  • Golub TR, Slonim DK, Tamayo P, Huard C, Gaasenbeek M, Mesirov JP, Coller H, Loh ML, Downing JR, Caligiuri MA, Bloomfield CD, Lander ES (1999) Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 286(5439):531–537

    Article  PubMed  CAS  Google Scholar 

  • Gui J, Li H (2005) Penalized Cox regression analysis in the high-dimensional and low-sample size settings, with applications to microarray gene expression data. Bioinformatics 21(13):3001–3008. doi:10.1093/bioinformatics/bti422

    Article  PubMed  CAS  Google Scholar 

  • Guyon I, Elisseeff A (2003) An introduction to variable and feature selection. J Mach Learn Res 3:1157–1182

    Google Scholar 

  • Habel LA, Shak S, Jacobs MK, Capra A, Alexander C, Pho M, Baker J, Walker M, Watson D, Hackett J, Blick NT, Greenberg D, Fehrenbacher L, Langholz B, Quesenberry CP (2006) A population-based study of tumor gene expression and risk of breast cancer death among lymph node-negative patients. Breast Cancer Res 8(3):R25. doi:bcr1412 [pii]10.1186/bcr1412

    Article  PubMed  CAS  Google Scholar 

  • Haibe-Kains B, Desmedt C, Sotiriou C, Bontempi G (2008) A comparative study of survival models for breast cancer prognostication based on microarray data: does a single gene beat them all? Bioinformatics 24(19):2200–2208. doi:10.1093/bioinformatics/btn374

    Article  PubMed  CAS  Google Scholar 

  • Haibe-Kains B, Desmedt C, Loi SM, Culhane AC, Bontempi G, Quackenbush J, Sotiriou C (2012) A three-gene model to robustly identify breast cancer molecular subtypes. J Natl Cancer Inst 104(4):311–325. doi:10.1093/jnci/djr545

    Article  PubMed  CAS  Google Scholar 

  • Harr B, Schlotterer C (2006) Comparison of algorithms for the analysis of Affymetrix microarray data as evaluated by co-expression of genes in known operons. Nucleic Acids Res 34(2):8

    Article  CAS  Google Scholar 

  • Harrell FJ, Lee K, Mark D (1996) Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med 15(4):361–387. doi:10.1002/(SICI)1097-0258(19960229)15:4<361::AID-SIM168>3.0.CO;2-4

    Article  PubMed  Google Scholar 

  • Hastie T, Tibshirani R (1990) Generalized additive models. Chapman and Hall, London

    Google Scholar 

  • Hastie T, Bickel P, Tibshirani R, Diggle P, Friedman J, Fienberg S, Gather U, Otkin I, Zeger S (eds) (2001) The elements of statistical learning statistics. Springer, New York

    Google Scholar 

  • Heagerty PJ, Lumley T, Pepe MS (2000) Time-dependent ROC curves for censored survival data and a diagnostic marker. Biometrics 56:337–344

    Article  PubMed  CAS  Google Scholar 

  • Hu H, Li J-Y, Wang H, Daggard G, Wang L-Z (2008) Robustness analysis of diversified ensemble decision tree algorithms for Microarray data classification. Paper presented at the 2008 International Conference on Machine Learning and Cybernetics (ICMLC), Kunming, 12–15 Jul 2008

    Google Scholar 

  • Huber W, von Heydebreck A, Sultman H, Poustka A, Vingron M (2002) Variance stabilization applied to microarray data calibration and to the quantification of differential expression. Bioinformatics 18(1):S96–S104

    Article  PubMed  Google Scholar 

  • Irizarry RA, Boldstad BM, Collin F, Cope LM, Hobbs B, Speed TR (2003a) Summaries of affymetrix GeneChip probe level data. Nucleic Acids Res 31(4)

    Google Scholar 

  • Irizarry RA, Hobbs B, Collin F, Beazer-Barclay YD, Antonellis KJ, Scherf U, Speed TP (2003b) Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics (Oxford, England) 4(2):249–264. doi:10.1093/biostatistics/4.2.249

    Google Scholar 

  • Jin R, Si L, Chan C (2008) A Bayesian framework for knowledge driven regression model in micro-array data analysis. Int J Data Min Bioinform 2(3):250–267

    Article  PubMed  Google Scholar 

  • Johnson WE, Li C, Rabinovic A (2007) Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics 8(1):118–127. doi:10.1093/biostatistics/kxj037

    Article  PubMed  Google Scholar 

  • Jolliffe IT, Jolliffe IT (eds) (2002) Principal component analysis. Springer series in statistics. Springer, New York

    Google Scholar 

  • Kaplan EL, Meier P (1958) Nonparametric estimation from incomplete observations. J Am Stat Assoc 53:451–457

    Google Scholar 

  • Kelemen A, Zhou H, Lawhead P, Liang Y (2003) Naive Bayesian classifier for microarray data. In: 2003 International joint conference on neural networks, vol 3, pp 1769–1773. Paper presented at the 2003 international joint conference on neural networks, IEEE. doi:10.1109/IJCNN.2003.1223675

  • Kelley RK, Wang G, Venook AP (2011) Biomarker use in colorectal cancer therapy. J Natl Compr Canc Netw 9(11):1293–1302. doi:9/11/1293 [pii]

    PubMed  CAS  Google Scholar 

  • Khan J, Simon R, Bittner M, Chen Y, Leighton SB, Pohida T, Smith PD, Jiang Y, Gooden GC, Trent JM, Meltzer PS (1998) Gene expression profiling of alveolar rhabdomyosarcoma with cDNA microarrays. Cancer Res 58(22):5009–5013

    PubMed  CAS  Google Scholar 

  • Kohavi R, John GH (1997) Wrappers for feature subset selection. Artif Intell 97(1–2):273–324

    Article  Google Scholar 

  • Kohli-Laven N, Bourret P, Keating P, Cambrosio A (2011) Cancer clinical trials in the era of genomic signatures: biomedical innovation, clinical utility, and regulatory-scientific hybrids. Soc Stud Sci 41(4):487–513

    Article  PubMed  Google Scholar 

  • Lee JW, Lee JB, Park M, Song SH (2005) An extensive comparison of recent classification tools applied to microarray data. Computational Statistics & Data Analysis 48(4):869–885. doi: 10.1016/j.csda.2004.03.017

    Google Scholar 

  • Lehmann EL, Caselia G (1998) Theory of point estimation, 2nd edn. Springer, New York

    Google Scholar 

  • Leisch F (2002) Sweave. Dynamic generation of statistical reports using literate data analysis. In: Computational statistics, vol 69, pp 575–580. Presented at the computational statistics, SFB adaptive information systems and modelling in economics and management science, WU Vienna University of Economics and Business. http://www.google.ca/url?sa=t&rct=j&q=sweave.%20dynamic%20generation%20of%20statistical%20reports%20using%20literate%20data%20analysis&source=web&cd=1&ved=0CDQQFjAA&url=http%3A%2F%2Fepub.wu.ac.at%2F1788%2F1%2Fdocument.pdf&ei=qiVNT7TPLevTiALGwp2wDw&usg=AFQjCNGZ5hg-vOqrB2j6hU7HGhQkhiBrRg&sig2=dmMu57Xag5ci-fANUqxnAA

  • Li C, Wong WH (2001) Model-based analysis of oligonucleotide arrays: model validation, design issues and standard error application. Genome Biol 2(8):1–11

    Google Scholar 

  • Lipshutz RJ, Morris D, Chee M, Hubbell E, Kozal MJ, Shah N, Shen N, Yang R, Fodor SP (1995) Using oligonucleotide probe arrays to access genetic diversity. Biotechniques 19(3):442–447

    PubMed  CAS  Google Scholar 

  • Loi SM, Haibe-Kains B, Desmedt C, Wirapati P, Lallemand F, Tutt AM, Gillet C et al (2008) Predicting prognosis using molecular profiling in estrogen receptor-positive breast cancer treated with tamoxifen. BMC Genomics 9:239. doi:10.1186/1471-2164-9-239

    Article  PubMed  CAS  Google Scholar 

  • Loi SM, Haibe-Kains B, Desmedt C, Wirapati P, Lallemand F, Tutt AM, Gillet C, Ellis P, Ryder K, Reid JF, Daidone MG, Pierotti MA, Berns EM, Jansen MP, Foekens JA, Delorenzi M, Bontempi G, Piccart MJ, Sotiriou C (2008) Predicting prognosis using molecular profiling in estrogen receptor-positive breast cancer treated with tamoxifen. BMC Genomics 9:239. doi:10.1186/1471-2164-9-239

    Article  PubMed  CAS  Google Scholar 

  • Loi SM, Haibe-Kains B, Majjaj S, Lallemand F, Durbecq V, Larsimont D, Gonzalez-Angulo AM, Pusztai L, Symmans WF, Bardelli A, Ellis P, Tutt ANJ, Gillett CE, Hennessy BT, Mills GB, Phillips WA, Piccart MJ, Speed TP, McArthur GA, Sotiriou C (2010) PIK3CA mutations associated with gene signature of low mTORC1 signaling and better outcomes in estrogen receptor-positive breast cancer. Proc Natl Acad Sci USA 107(22):10208–10213. doi:10.1073/pnas.0907011107

    Article  PubMed  CAS  Google Scholar 

  • Luo J, Schumacher M, Scherer A, Sanoudou D, Megherbi D, Davison T, Shi T, Tong W, Shi L, Hong H, Zhao C, Elloumi F, Shi W, Thomas R, Lin S, Tillinghast G, Liu G, Zhou Y, Herman D, Li Y, Deng Y, Fang H, Bushel P, Woods M, Zhang J (2010) A comparison of batch effect removal methods for enhancement of prediction performance using MAQC-II microarray gene expression data. Pharmacogenomics J 10(4):278–291. doi:tpj201057 [pii]10.1038/tpj.2010.57

    Article  PubMed  CAS  Google Scholar 

  • Mamounas E, Budd GT, Miller K (2008) Incorporating the oncotype DX breast cancer assay into community practice: an expert Q and A and case study sampling. Clin Adv Hematol Oncol 6(2):s1–s8

    PubMed  Google Scholar 

  • Manilich EA, Ozsoyoglu ZM, Trubachev V, Radivoyevitch T (2011) Classification of large microarray datasets using fast random forest construction. J Bioinform Comput Biol 9(2):251–267. doi: [pii]S021972001100546X

    Article  PubMed  Google Scholar 

  • Marchionni L, Wilson RF, Marinopoulos SS, Wolff AC, Parmigiani G, Bass EB, Goodman SN (2007) Impact of gene expression profiling tests on breast cancer outcomes. Evid Rep Technol Assess (Full Rep) 160:1–105

    Google Scholar 

  • Mason SJ, Graham NE (2002) Areas beneath the relative operating characteristics (ROC) and relative operating levels (ROL) curves: statistical significance and interpretation. Q J R Meteorol Soc 128(584):2145–2166. doi:10.1256/003590002320603584

    Article  Google Scholar 

  • Matthews BW (1975) Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochim Biophys Acta 405(2):442–451

    PubMed  CAS  Google Scholar 

  • McCall MN, Bolstad BM, Irizarry RA (2010) Frozen robust multiarray analysis (fRMA). Biostatistics (Oxford, England) 11(2):242–253. doi:10.1093/biostatistics/kxp059

    Google Scholar 

  • McCall MN, Murakami PN, Lukk M, Huber W, Irizarry RA (2011a) Assessing affymetrix GeneChip microarray quality. BMC Bioinformatics 12:137. doi:1471-2105-12-137 [pii]10.1186/1471-2105-12-137

    Article  PubMed  Google Scholar 

  • McCall MN, Uppal K, Jaffee HA, Zilliox MJ, Irizarry RA (2011b) The gene expression barcode: leveraging public data repositories to begin cataloging the human and murine transcriptomes. Nucleic Acids Res 39(Database issue):D1011–D1015. doi:gkq1259 [pii]

    Article  PubMed  Google Scholar 

  • Mesirov JP (2010) Computer science accessible reproducible research. Science 327(5964):415–416. doi:327/5964/415 [pii]10.1126/science.1179653

    Article  PubMed  CAS  Google Scholar 

  • Michiels S, Koscielny S, Hill C (2005) Prediction of cancer outcome with microarrays: a multiple random validation strategy. Lancet 365:488–492

    Article  PubMed  CAS  Google Scholar 

  • Moch H, Schraml P, Bubendorf L, Mirlacher M, Kononen J, Gasser T, Mihatsch MJ, Kallioniemi OP, Sauter G (1999) Identification of prognostic parameters for renal cell carcinoma by cDNA arrays and cell chips. Verh Dtsch Ges Pathol 83:225–232

    PubMed  CAS  Google Scholar 

  • Mook S, van’t Veer LJ, Rutgers EJ, Piccart-Gebhart MJ, Cardoso F (2007) Individualization of therapy using mammaprint: from development to the MINDACT Trial. Cancer Genomics Proteomics 4(3):147–155

    PubMed  CAS  Google Scholar 

  • Natsoulis G, El Ghaoui L, Lanckriet GRG, Tolley AM, Leroy F, Dunlea S, Eynon BP, Pearson CI, Tugendreich S, Jarnagin K (2005) Classification of a large microarray data set: algorithm comparison and analysis of drug signatures. Genome Res 15(5):724–736. doi:10.1101/gr.2807605

    Article  PubMed  CAS  Google Scholar 

  • Nepomuceno-Chamorro I, Azuaje F, Devaux Y, Nazarov PV, Muller A, Aguilar-Ruiz JS, Wagner DR (2011) Prognostic transcriptional association networks: a new supervised approach based on regression trees. Bioinformatics 27(2):252–258. doi:btq645 [pii]10.1093/bioinformatics/btq645

    Article  PubMed  CAS  Google Scholar 

  • Onitilo AA, Engel JM, Greenlee RT, Mukesh BN (2009) Breast cancer subtypes based on ER/PR and Her2 expression: comparison of clinicopathologic features and survival. Clin Med Res 7(1–2):4–13. doi:10.3121/cmr.2009.825

    Article  PubMed  CAS  Google Scholar 

  • Osorio YFJ, Prina E, Lang T, Milon G, Davory C, Coppee JY, Regnault B (2008) AffyGCQC: a web-based interface to detect outlying genechips with extreme studentized deviate tests. J Bioinform Comput Biol 6(2):317–334. doi:S0219720008003400 [pii]

    Article  Google Scholar 

  • Paik S, Shak S, Tang G, Kim C, Bakker J, Cronin M, Baehner FL, Walker MG, Watson D, Park T, Hiller W, Fisher ER, Wickerham DL, Bryant J, Wolmark N (2004) A multigene assay to predict recurrence of tamoxifen-treated, node-negative breast cancer. N Engl J Med 351(27):2817–2826

    Article  PubMed  CAS  Google Scholar 

  • Pang S, Havukkala I, Hu Y, Kasabov N (2007) Classification consistency analysis for bootstrapping gene selection. Neural Comput Appl 18(6):527–539

    Google Scholar 

  • Park MY, Hastie T (2007) L1 regularization path algorithm for generalized linear models. J R Stat Soc 69:659–677

    Article  Google Scholar 

  • Parkinson H, Sarkans U, Shojatalab M, Contrino S, Coulson R, Farne A, Lara GG, Holloway E, Kapushesky M, Lilja P, Mukherjee G, Oezcimen A, Rayner T, Rocca-Serra P, Sharma A, Sansone S, Brazma A (2005) ArrayExpress: a public repository for microarray gene expression data at the EBI. Nucleic Acids Res 33:D553–D555

    Article  PubMed  CAS  Google Scholar 

  • Parry RM, Jones W, Stokes TH, Phan JH, Moffitt RA, Fang H, Shi L, Oberthuer A, Fischer M, Tong W, Wang MD (2010) k-Nearest neighbor models for microarray gene expression analysis and clinical outcome prediction. Pharmacogenomics J 10(4):292–309. doi:10.1038/tpj.2010.56

    Article  PubMed  CAS  Google Scholar 

  • Perou CM, Jeffrey SS, van de Rijn M, Rees CA, Eisen MB, Ross DT, Pergamenschikov A, Williams CF, Zhu SX, Lee JC, Lashkari D, Shalon D, Brown PO, Botstein D (1999) Distinctive gene expression patterns in human mammary epithelial cells and breast cancers. Proc Natl Acad Sci USA 96(16):9212–9217

    Article  PubMed  CAS  Google Scholar 

  • Perou CM, Sorlie T, Eisen MB, van de Rijn M, Jeffrey SS, Rees CA, Pollack JR, Ross DT, Johnsen H, Akslen LA, Fluge O, Pergamenschikov A, Williams C, Zhu SX, Lonning PE, Borresen-Dale A-L, Brown PO, Botstein D (2000) Molecular portraits of human breast tumours. Nature 406(6797):747–752. doi:10.1038/35021093

    Article  PubMed  CAS  Google Scholar 

  • Phuong TM, Lee D, Lee KH (2004) Regression trees for regulatory element identification. Bioinformatics 20(5):750–757. doi:10.1093/bioinformatics/btg480 btg480 [pii]

    Article  PubMed  CAS  Google Scholar 

  • Ploner A, Miller LD, Hall P, Bergh J, Pawitan Y (2005) Correlation test to assess low-level processing of high-density oligonucletide microarray data. BMC Bioinformatics 6(80):1–20

    Google Scholar 

  • Pomeroy SL, Tamayo P, Gaasenbeek M, Sturla LM, Angelo M, McLaughlin ME, Kim JY, Goumnerova LC, Black PM, Lau C, Allen JC, Zagzag D, Olson JM, Curran T, Wetmore C, Biegel JA, Poggio T, Mukherjee S, Rifkin R, Califano A, Stolovitzky G, Louis DN, Mesirov JP, Lander ES, Golub TR (2002) Prediction of central nervous system embryonal tumour outcome based on gene expression. Nature 415(6870):436–442

    Article  PubMed  CAS  Google Scholar 

  • Ramaswamy S, Tamayo P, Rifkin R, Mukherjee S, Yeang CH, Angelo M, Ladd C, Reich M, Latulippe E, Mesirov JP, Poggio T, Gerald W, Loda M, Lander ES, Golub TR (2001) Multiclass cancer diagnosis using tumor gene expression signatures. Proc Natl Acad Sci USA 98(26):15149–15154

    Article  PubMed  CAS  Google Scholar 

  • Reich M, Liefeld T, Gould J, Lerner J, Tamayo P, Mesirov JP (2006) GenePattern 2.0. Nat Genet 38(5):500–501. doi:10.1038/ng0506-500

    Article  PubMed  CAS  Google Scholar 

  • Rifkin R, Klautau A (2004) In defense of One-Vs-All classification. J Mach Learn Res 5(1):101–141

    Google Scholar 

  • Ross DT, Scherf U, Eisen MB, Perou CM, Rees C, Spellman P, Iyer V, Jeffrey SS, Van de Rijn M, Waltham M, Pergamenschikov A, Lee JC, Lashkari D, Shalon D, Myers TG, Weinstein JN, Botstein D, Brown PO (2000) Systematic variation in gene expression patterns in human cancer cell lines. Nat Genet 24(3):227–235

    Article  PubMed  CAS  Google Scholar 

  • Ross JS, Hatzis C, Symmans WF, Pusztai L, Hortobagyi GN (2008) Commercialized multigene predictors of clinical outcome for breast cancer. Oncologist 13(5):477–493. doi:13/5/477 [pii]10.1634/theoncologist.2007-0248

    Article  PubMed  Google Scholar 

  • Royston P, Sauerbrei W (2004) A new measure of prognostic separation in survival data. Stat Med 23(5):723–748. doi:10.1002/sim.1621

    Article  PubMed  Google Scholar 

  • Sarder P, Schierding W, Cobb JP, Nehorai A (2010) Estimating sparse gene regulatory networks using a bayesian linear regression. IEEE Trans Nanobioscience 9(2):121–131. doi:10.1109/TNB.2010.2043444

    Article  PubMed  Google Scholar 

  • Schena M, Shalon D, Davis RW, Brown PO (1995) Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science 270(5235):467–470

    Article  PubMed  CAS  Google Scholar 

  • Schumacher M, Binder H, Gerds TA (2007) Assessment of survival prediction models based on microarray data. Bioinformatics 23(14):1768–1774

    Article  PubMed  CAS  Google Scholar 

  • Sheng Q, Moreau Y, De Moor B (2003) Biclustering microarray data by Gibbs sampling. Bioinformatics 19(Suppl 2):ii196–ii205. doi:10.1093/bioinformatics/btg1078

    Article  PubMed  Google Scholar 

  • Shi L, Reid LH, Jones WD, Shippy R, Warrington JA, Baker SC, Collins PJ, de Longueville F, Kawasaki ES, Lee KY, Luo Y, Sun YA, Willey JC, Setterquist RA, Fischer GM, Tong W, Dragan YP, Dix DJ, Frueh FW, Goodsaid FM, Herman D, Jensen RV, Johnson CD, Lobenhofer EK, Puri RK, Schrf U, Thierry-Mieg J, Wang C, Wilson M, Wolber PK, Zhang L, Amur S, Bao W, Barbacioru CC, Lucas AB, Bertholet V, Boysen C, Bromley B, Brown D, Brunner A, Canales R, Cao XM, Cebula TA, Chen JJ, Cheng J, Chu TM, Chudin E, Corson J, Corton JC, Croner LJ, Davies C, Davison TS, Delenstarr G, Deng X, Dorris D, Eklund AC, Fan XH, Fang H, Fulmer-Smentek S, Fuscoe JC, Gallagher K, Ge W, Guo L, Guo X, Hager J, Haje PK, Han J, Han T, Harbottle HC, Harris SC, Hatchwell E, Hauser CA, Hester S, Hong H, Hurban P, Jackson SA, Ji H, Knight CR, Kuo WP, LeClerc JE, Levy S, Li QZ, Liu C, Liu Y, Lombardi MJ, Ma Y, Magnuson SR, Maqsodi B, McDaniel T, Mei N, Myklebost O, Ning B, Novoradovskaya N, Orr MS, Osborn TW, Papallo A, Patterson TA, Perkins RG, Peters EH, Peterson R, Philips KL, Pine PS, Pusztai L, Qian F, Ren H, Rosen M, Rosenzweig BA, Samaha RR, Schena M, Schroth GP, Shchegrova S, Smith DD, Staedtler F, Su Z, Sun H, Szallasi Z, Tezak Z, Thierry-Mieg D, Thompson KL, Tikhonova I, Turpaz Y, Vallanat B, Van C, Walker SJ, Wang SJ, Wang Y, Wolfinger R, Wong A, Wu J, Xiao C, Xie Q, Xu J, Yang W, Zhang L, Zhong S, Zong Y, Slikker W Jr (2006) The MicroArray quality control (MAQC) project shows inter- and intraplatform reproducibility of gene expression measurements. Nat Biotechnol 24(9):1151–1161

  • Shi L, Campbell G, Jones WD, Campagne F, Wen Z, Walker SJ, Su Z, Chu TM, Goodsaid FM, Pusztai L, Shaughnessy JD Jr, Oberthuer A, Thomas RS, Paules RS, Fielden M, Barlogie B, Chen W, Du P, Fischer M, Furlanello C, Gallas BD, Ge X, Megherbi DB, Symmans WF, Wang MD, Zhang J, Bitter H, Brors B, Bushel PR, Bylesjo M, Chen M, Cheng J, Chou J, Davison TS, Delorenzi M, Deng Y, Devanarayan V, Dix DJ, Dopazo J, Dorff KC, Elloumi F, Fan J, Fan S, Fan X, Fang H, Gonzaludo N, Hess KR, Hong H, Huan J, Irizarry RA, Judson R, Juraeva D, Lababidi S, Lambert CG, Li L, Li Y, Li Z, Lin SM, Liu G, Lobenhofer EK, Luo J, Luo W, McCall MN, Nikolsky Y, Pennello GA, Perkins RG, Philip R, Popovici V, Price ND, Qian F, Scherer A, Shi T, Shi W, Sung J, Thierry-Mieg D, Thierry-Mieg J, Thodima V, Trygg J, Vishnuvajjala L, Wang SJ, Wu J, Wu Y, Xie Q, Yousef WA, Zhang L, Zhang X, Zhong S, Zhou Y, Zhu S, Arasappan D, Bao W, Lucas AB, Berthold F, Brennan RJ, Buness A, Catalano JG, Chang C, Chen R, Cheng Y, Cui J, Czika W, Demichelis F, Deng X, Dosymbekov D, Eils R, Feng Y, Fostel J, Fulmer-Smentek S, Fuscoe JC, Gatto L, Ge W, Goldstein DR, Guo L, Halbert DN, Han J, Harris SC, Hatzis C, Herman D, Huang J, Jensen RV, Jiang R, Johnson CD, Jurman G, Kahlert Y, Khuder SA, Kohl M, Li J, Li M, Li QZ, Li S, Liu J, Liu Y, Liu Z, Meng L, Madera M, Martinez-Murillo F, Medina I, Meehan J, Miclaus K, Moffitt RA, Montaner D, Mukherjee P, Mulligan GJ, Neville P, Nikolskaya T, Ning B, Page GP, Parker J, Parry RM, Peng X, Peterson RL, Phan JH, Quanz B, Ren Y, Riccadonna S, Roter AH, Samuelson FW, Schumacher MM, Shambaugh JD, Shi Q, Shippy R, Si S, Smalter A, Sotiriou C, Soukup M, Staedtler F, Steiner G, Stokes TH, Sun Q, Tan PY, Tang R, Tezak Z, Thorn B, Tsyganova M, Turpaz Y, Vega SC, Visintainer R, von Frese J, Wang C, Wang E, Wang J, Wang W, Westermann F, Willey JC, Woods M, Wu S, Xiao N, Xu J, Xu L, Yang L, Zeng X, Zhang M, Zhao C, Puri RK, Scherf U, Tong W, Wolfinger RD, MAQC Consortium (2010) The MicroArray quality control (MAQC)-II study of common practices for the development and validation of microarray-based predictive models. Nat Biotechnol 28(8):827–838. doi:10.1038/nbt.1665

  • Simon R (2003) Diagnostic and prognostic prediction using gene expression profiles in high-dimensional microarray data. Br J Cancer 89:1599–1604

  • Slodkowska EA, Ross JS (2009) MammaPrint 70-gene signature: another milestone in personalized medical care for breast cancer patients. Expert Rev Mol Diagn 9(5):417–422. doi:10.1586/erm.09.32

  • Sorlie T, Perou CM, Tibshirani R, Aas T, Geisler S, Johnsen H, Hastie T, Eisen MB, van de Rijn M, Jeffrey SS, Thorsen T, Quist H, Matese JC, Brown PO, Botstein D, Lonning PE, Borresen-Dale AL (2001) Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc Natl Acad Sci USA 98(19):10869–10874

  • Sorlie T, Tibshirani R, Parker J, Hastie T, Marron JS, Nobel A, Deng S, Johnsen H, Pesich R, Geisler S, Demeter J, Perou C, Lonning PE, Brown PO, Borresen-Dale A-L, Botstein D (2003) Repeated observation of breast tumor subtypes in independent gene expression data sets. Proc Natl Acad Sci USA 100(14):8418–8423

  • Sotiriou C, Wirapati P, Loi S, Harris A, Fox S, Smeds J, Nordgren H, Farmer P, Praz V, Haibe-Kains B, Desmedt C, Larsimont D, Cardoso F, Peterse H, Nuyten D, Buyse M, Van de Vijver MJ, Bergh J, Piccart M, Delorenzi M (2006) Gene expression profiling in breast cancer: understanding the molecular basis of histologic grade to improve prognosis. J Natl Cancer Inst 98(4):262–272

  • Steel RGD, Torrie JH (1980) Principles and procedures of statistics. McGraw Hill, New York

  • Straver ME, Glas AM, Hannemann J, Wesseling J, van de Vijver MJ, Rutgers EJ, Vrancken Peeters MJ, van Tinteren H, van’t Veer LJ, Rodenhuis S (2010) The 70-gene signature as a response predictor for neoadjuvant chemotherapy in breast cancer. Breast Cancer Res Treat 119(3):551–558. doi:10.1007/s10549-009-0333-1

  • Sugar C (1998) Techniques for clustering and classification with applications to medical problems. Doctoral Thesis, Stanford University

  • Suzuki K (ed) (2011) Artificial neural networks - methodological advances and biomedical applications. InTech, Croatia

  • Swets JA (1988) Measuring the accuracy of diagnostic systems. Science 240(4857):1285–1293

  • Tamayo P, Slonim D, Mesirov J, Zhu Q, Kitareewan S, Dmitrovsky E, Lander ES, Golub TR (1999) Interpreting patterns of gene expression with self-organizing maps: methods and application to hematopoietic differentiation. Proc Natl Acad Sci USA 96(6):2907–2912

  • Shawe-Taylor J, Cristianini N (2004) Kernel methods for pattern analysis. Cambridge University Press, New York

  • The Cancer Letter (2011) Duke accepts Potti resignation; retraction process initiated with Nature Medicine. http://www.cancerletter.com/articles/20101123_1

  • Therneau TM, Gail M, Grambsch PM, Krickeberg K, Samet JM, Tsiatis A, Wong W (eds) (2000) Modeling survival data: extending the Cox model. Statistics for biology and health. Springer, New York. doi:10.1002/sim.956

  • Tibshirani R (1997) The lasso method for variable selection in the Cox model. Stat Med 16(4):385–395. doi:10.1002/(SICI)1097-0258(19970228)16:4<385::AID-SIM380>3.0.CO;2-3

  • Tibshirani R (2001) Regression shrinkage and selection via the lasso. J R Stat Soc Ser B 58(1):267–288

  • Tibshirani R, Walther G (2005) Cluster validation by prediction strength. J Comput Graph Stat 14(3):511–528

  • Tibshirani R, Walther G, Hastie T (2001) Estimating the number of clusters in a data set via the gap statistic. J R Stat Soc Ser B Stat Methodol 63(2):411–423. doi:10.1111/1467-9868.00293

  • Tibshirani R, Hastie T, Narasimhan B, Chu G (2002) Diagnosis of multiple cancer types by shrunken centroids of gene expression. Proc Natl Acad Sci USA 99(10):6567–6572. doi:10.1073/pnas.082099299

  • Tseng GC, Wong WH (2005) Tight clustering: a resampling-based approach for identifying stable and tight patterns in data. Biometrics 61(1):10–16. doi:10.1111/j.0006-341X.2005.031032.x

  • Ishwaran H, Kogalur U, Blackstone E, Lauer M (2008) Random survival forests. Ann Appl Stat 2(3):841–860

  • Uno H, Cai T, Pencina MJ, D’Agostino RB, Wei LJ (2011) On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat Med 30(10):1105–1117. doi:10.1002/sim.4154

  • Van Belle V, Pelckmans K, Van Huffel S, Suykens JA (2011a) Improved performance on high-dimensional survival data by application of survival-SVM. Bioinformatics 27(1):87–94. doi:10.1093/bioinformatics/btq617

  • Van Belle V, Pelckmans K, Van Huffel S, Suykens JA (2011b) Support vector methods for survival analysis: a comparison between ranking and regression approaches. Artif Intell Med 53(2):107–118. doi:10.1016/j.artmed.2011.06.006

  • van de Vijver MJ, He YD, van’t Veer LJ, Dai H, Hart AA, Voskuil DW, Schreiber GJ, Peterse JL, Roberts C, Marton MJ, Parrish M, Atsma D, Witteveen A, Glas A, Delahaye L, van der Velde T, Bartelink H, Rodenhuis S, Rutgers ET, Friend SH, Bernards R (2002) A gene-expression signature as a predictor of survival in breast cancer. N Engl J Med 347(25):1999–2009

  • van der Laan MJ, Pollard KS, Bryan J (2003) A new partitioning around medoids algorithm. J Stat Comput Simulat 73(8):575–584

  • van Houwelingen H, Bruinsma T, Hart AA, van’t Veer LJ, Wessels LFA (2006) Cross-validated Cox regression on microarray gene expression data. Stat Med 25:3201–3216

  • van Rijsbergen C (1979) Information retrieval, 2nd edn. Butterworths, London

  • van’t Veer LJ, Dai H, van de Vijver MJ, He YD, Hart AA, Mao M, Peterse HL, van der Kooy K, Marton MJ, Witteveen AT, Schreiber GJ, Kerkhoven RM, Roberts C, Linsley PS, Bernards R, Friend SH (2002) Gene expression profiling predicts clinical outcome of breast cancer. Nature 415(6871):530–536

  • Verweij PJM, van Houwelingen JC (1993) Cross-validation in survival analysis. Stat Med 12:2305–2314

  • Wang Y, Klijn JGM, Zhang Y, Sieuwerts AM, Look MP, Yang F, Talantov D, Timmermans M, Meijer-van Gelder ME, Yu J, Jatkoe T, Berns EMJJ, Atkins D, Foekens JA (2005) Gene-expression profiles to predict distant metastasis of lymph-node-negative primary breast cancer. Lancet 365(9460):671–679. doi:10.1016/S0140-6736(05)17947-1

  • Webb A (2003) Statistical pattern recognition, 2nd edn. Wiley, New York

  • Wei JS, Greer BT, Westermann F, Steinberg SM, Son CG, Chen QR, Whiteford CC, Bilke S, Krasnoselsky AL, Cenacchi N, Catchpoole D, Berthold F, Schwab M, Khan J (2004) Prediction of clinical outcome using gene expression profiling and artificial neural networks for patients with neuroblastoma. Cancer Res 64(19):6883–6891. doi:10.1158/0008-5472.CAN-04-0695

  • Weiss SM, Kulikowski CA (1991) Computer systems that learn. Morgan Kaufmann, San Mateo

  • Welford SM, Gregg J, Chen E, Garrison D, Sorensen PH, Denny CT, Nelson SF (1998) Detection of differentially expressed genes in primary tumor tissues using representational differences analysis coupled to microarray hybridization. Nucleic Acids Res 26(12):3059–3065

  • Wilson CL, Miller CJ (2005) Simpleaffy: a BioConductor package for Affymetrix quality control and data analysis. Bioinformatics 21(18):3683–3685

  • Wu Z, Irizarry RA (2004) Preprocessing of oligonucleotide array data. Nat Biotechnol 22:656–658

  • Yeung KY, Bumgarner RE (2003) Multiclass classification of microarray data with repeated measurements: application to cancer. Genome Biol 4(12):R83. doi:10.1186/gb-2003-4-12-r83

  • Youden WJ (1950) Index for rating diagnostic tests. Cancer 3(1):32–35

  • Zhu J, Hastie T (2004) Classification of gene microarrays by penalized logistic regression. Biostatistics 5(3):427–443. doi:10.1093/biostatistics/5.3.427

  • Zilliox MJ, Irizarry RA (2007) A gene expression bar code for microarray data. Nat Methods 4(11):911–913. doi:10.1038/nmeth1102

  • Zou H, Hastie T (2005) Regularization and variable selection via the elastic net. J R Stat Soc Ser B Stat Methodol 67(2):301–320. doi:10.1111/j.1467-9868.2005.00503.x

Author information

Corresponding author

Correspondence to Benjamin Haibe-Kains.

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Haibe-Kains, B., Quackenbush, J. (2012). Analysis of Array Data and Clinical Validation of Array-Based Assays. In: Jordan, B. (ed.) Microarrays in Diagnostics and Biomarker Development. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28203-4_11
