The Analysis of Gene Expression Data: An Overview of Methods and Software

Part of the book series: Statistics for Biology and Health (SBH)

Abstract

This chapter is a rough map of the book. It provides a concise overview of data-analytic tasks associated with microarray studies, pointers to chapters that can help perform these tasks, and connections with selected data-analytic tools not covered in any of the chapters. We wish to give a general orientation before moving to the detailed discussion provided by individual chapters. A comprehensive review of microarray data analysis methods is beyond the scope of this introduction.

Copyright information

© 2003 Springer-Verlag New York, Inc.

About this chapter

Cite this chapter

Parmigiani, G., Garrett, E.S., Irizarry, R.A., Zeger, S.L. (2003). The Analysis of Gene Expression Data: An Overview of Methods and Software. In: Parmigiani, G., Garrett, E.S., Irizarry, R.A., Zeger, S.L. (eds) The Analysis of Gene Expression Data. Statistics for Biology and Health. Springer, New York, NY. https://doi.org/10.1007/0-387-21679-0_1

  • DOI: https://doi.org/10.1007/0-387-21679-0_1

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-0-387-95577-3

  • Online ISBN: 978-0-387-21679-9

  • eBook Packages: Springer Book Archive
