Abstract
Identifying differentially expressed (DE) genes across conditions or treatments is a typical problem in microarray experiments. In time course microarray experiments (under two or more conditions/treatments), it is sometimes of interest to identify two classes of DE genes: those with no time-condition interactions (called parallel DE genes, or PDE), and those with time-condition interactions (nonparallel DE genes, or NPDE). Although many methods have been proposed for identifying DE genes in time course experiments, methods for discerning NPDE genes from general DE genes are still lacking. We propose a functional ANOVA mixed-effect model for time course gene expression observations. The fixed effect (the mean curve) of the model decomposes bivariate functions of time and treatment (or experimental condition) as in the classic ANOVA method and provides the associated notions of main effects and interactions. Random effects capture time-dependent correlation structures. In this model, identifying NPDE genes is equivalent to testing the significance of the time-condition interaction, for which an approximate F-test is suggested. We examined the performance of the proposed method on simulated datasets in comparison with some existing methods, and applied the method to a study of the human response to endotoxin stimulation, as well as to a cell cycle expression data set.
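As a reading aid, the ANOVA-style decomposition of the mean curve and the resulting test can be sketched as follows. The symbols are illustrative assumptions, not notation taken from the abstract: t denotes time, c indexes the condition, f is the mean expression curve of a single gene, and the components are assumed to satisfy the usual side conditions that make the decomposition unique.

```latex
% Assumed notation: t = time, c = condition index, f = mean curve of one gene.
\begin{align*}
f(t, c) &= \mu + f_{1}(t) + f_{2}(c) + f_{12}(t, c)
  && \text{(constant, time effect, condition effect, interaction)} \\
\text{PDE:}\quad & f_{2} \not\equiv 0, \quad f_{12} \equiv 0
  && \text{(curves differ across conditions but stay parallel)} \\
\text{NPDE:}\quad & f_{12} \not\equiv 0
  && \text{(curves differ in shape across conditions)} \\
H_{0}:\ & f_{12} \equiv 0 \quad \text{versus} \quad H_{1}:\ f_{12} \not\equiv 0
  && \text{(hypothesis addressed by the approximate F-test)}
\end{align*}
```

Under this reading, rejecting the null hypothesis for a gene flags it as NPDE, while a DE gene for which the interaction is not significant is treated as PDE.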
Cite this article
Ma, P., Zhong, W. & Liu, J.S. Identifying Differentially Expressed Genes in Time Course Microarray Data. Stat Biosci 1, 144–159 (2009). https://doi.org/10.1007/s12561-009-9014-1
DOI: https://doi.org/10.1007/s12561-009-9014-1