Abstract
One of the essential issues in microarray data analysis is to identify differentially expressed genes (DEGs) under different experimental treatments. In this article, a statistical procedure is proposed to identify DEGs in gene expression data, with or without missing observations, from microarray experiments with one or two treatment factors. An F statistic based on Henderson method III was constructed to test the significance of differential expression for each gene across treatment levels. The cutoff P value was adjusted to control the experiment-wise false discovery rate. A human acute leukemia dataset collected from 38 leukemia patients was reanalyzed by the proposed method. Comparison with the results from significance analysis of microarrays (SAM) and microarray analysis of variance (MAANOVA) indicated that the proposed method performs similarly to MAANOVA for data with one treatment factor, but MAANOVA cannot directly handle missing data. In addition, a mouse brain dataset collected from six brain regions of two inbred strains (two treatment factors) was reanalyzed to identify genes with distinct region-specific expression patterns. The results showed that the proposed method could identify more distinct region-specific expression patterns than the previous analysis of the same dataset. Moreover, a computer program was developed and incorporated in the software QTModel, which is freely available at http://ibi.zju.edu.cn/software/qtmodel.
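The general workflow summarized above (a per-gene F test followed by a false discovery rate adjustment) can be illustrated with a minimal sketch. It is not the QTModel implementation: it assumes a single treatment factor, uses an ordinary one-way ANOVA F statistic rather than one derived from Henderson method III, obtains P values by label permutation, and swaps in the Benjamini-Hochberg adjustment as one common way to control the false discovery rate. The function names `f_statistic`, `permutation_p_value`, and `benjamini_hochberg` are hypothetical, and missing observations are simply skipped.

```python
import math
import random


def f_statistic(values, groups):
    """One-way ANOVA F statistic for one gene; None/NaN values are skipped."""
    by_group = {}
    for value, group in zip(values, groups):
        if value is None or (isinstance(value, float) and math.isnan(value)):
            continue  # missing observation: drop it instead of imputing
        by_group.setdefault(group, []).append(value)
    n = sum(len(obs) for obs in by_group.values())
    k = len(by_group)
    if k < 2 or n <= k:
        return float("nan")  # not enough data to test this gene
    grand_mean = sum(sum(obs) for obs in by_group.values()) / n
    ss_between = 0.0
    ss_within = 0.0
    for obs in by_group.values():
        mean = sum(obs) / len(obs)
        ss_between += len(obs) * (mean - grand_mean) ** 2
        ss_within += sum((v - mean) ** 2 for v in obs)
    if ss_within == 0.0:
        return float("inf") if ss_between > 0.0 else float("nan")
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def permutation_p_value(values, groups, n_perm=999, seed=0):
    """P value from shuffling treatment labels (an assumption of this sketch)."""
    observed = f_statistic(values, groups)
    if math.isnan(observed):
        return float("nan")
    rng = random.Random(seed)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if f_statistic(values, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Toy data: two genes, two treatment levels, one missing observation.
    groups = ["ALL", "ALL", "ALL", "ALL", "AML", "AML", "AML"]
    genes = {
        "gene_1": [1.0, 2.0, 3.0, None, 4.0, 5.0, 6.0],
        "gene_2": [2.1, 1.9, 2.0, 2.2, 2.0, 2.1, 1.9],
    }
    names = list(genes)
    p_values = [permutation_p_value(genes[g], groups) for g in names]
    for name, p, q in zip(names, p_values, benjamini_hochberg(p_values)):
        print(name, round(p, 3), round(q, 3))
```

Genes whose adjusted value falls below a chosen threshold would be reported as DEGs. With very few samples per treatment level the permutation P values are coarse, so this sketch is suited to illustration rather than analysis of real data.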
Cite this article
Yang, J., Zou, Y. & Zhu, J. Identifying differentially expressed genes in human acute leukemia and mouse brain microarray datasets utilizing QTModel. Funct Integr Genomics 9, 59–66 (2009). https://doi.org/10.1007/s10142-008-0096-5
DOI: https://doi.org/10.1007/s10142-008-0096-5