
Identification of significant gene-sets differentially expressed in a specific disease by co-expressed functional gene modules generation

  • Original Research
BioChip Journal

Abstract

In recent microarray studies, gene-set analysis has become one of the most popular computational approaches for finding gene-sets that are differentially expressed between case and control groups of samples. As a first step, it draws on a variety of biological resources, such as pathway databases, gene ontology, and the literature, to generate candidate functional gene-sets. From these candidates, the most significant gene-sets are then identified as those whose expression difference between the case and control groups is statistically significant. The significance of each gene-set is usually evaluated from a representative score computed from the expression profiles of its constituent genes. In practice, however, this representative score may fail to capture the overall expression pattern of the constituent genes. For example, some genes in a functional gene-set may show expression patterns very different from those of the majority of genes in the same gene-set. Such genes weaken the representative score of the gene-set and thereby hinder the estimation of its statistical significance. To handle this problem, we propose using gene modules, groups of genes that not only share a specific function but are also strongly correlated with each other, as the candidate functional gene-sets for gene-set analysis. Specifically, by generating co-expressed functional gene modules from each functional gene-set, we filter out the “bad” genes, that is, those whose expression patterns are not strongly correlated with those of the majority of genes in the same gene-set. The significance of these gene modules is evaluated with the nonparametric Wilcoxon rank-sum test. Our experiments show that generating co-expressed functional modules for gene-set analysis can greatly improve the identification of significant gene-sets differentially expressed in a specific disease.
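The workflow described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the modules are generated or how the representative score is computed, so everything below is an assumption made for illustration: the majority rule with a hypothetical `min_corr` threshold on Pearson correlation, the per-sample mean as the representative score, the normal approximation (without tie correction) for the rank-sum test, and the toy expression values. It is not the authors' implementation.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 0.0 if sxx == 0 or syy == 0 else sxy / math.sqrt(sxx * syy)

def coexpressed_module(expr, min_corr=0.5):
    """Keep genes correlated (>= min_corr) with a majority of the other genes.

    Assumed filtering rule; `min_corr` is a hypothetical parameter.
    """
    genes = list(expr)
    module = []
    for g in genes:
        others = [h for h in genes if h != g]
        strong = sum(pearson(expr[g], expr[h]) >= min_corr for h in others)
        if others and strong > len(others) / 2:
            module.append(g)
    return module

def module_scores(expr, module):
    """Assumed representative score: mean expression of module genes per sample."""
    n_samples = len(expr[module[0]])
    return [sum(expr[g][j] for g in module) / len(module) for j in range(n_samples)]

def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test via the normal approximation.

    No tie or continuity correction is applied; returns (z, p).
    """
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    w = sum(ranks[:n1])                      # rank sum of the first group
    mu = n1 * (n1 + n2 + 1) / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# Toy data (invented): samples 0-2 are controls, samples 3-5 are cases.
expr = {
    "g1": [1.0, 1.2, 0.9, 3.0, 3.1, 2.9],
    "g2": [2.0, 2.1, 1.9, 4.2, 4.0, 4.1],
    "g3": [0.5, 0.6, 0.4, 2.4, 2.6, 2.5],
    "g4": [3.0, 3.1, 2.9, 1.0, 1.1, 0.9],   # pattern opposite to the majority
}
module = coexpressed_module(expr)            # g4 is filtered out
scores = module_scores(expr, module)
z, p = rank_sum_test(scores[3:], scores[:3]) # case scores vs. control scores
print(module, round(z, 3), round(p, 4))
```

With only three samples per group the normal approximation is crude; an exact rank-sum test would be more appropriate for real data of this size.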



Author information

Corresponding author

Correspondence to Miyoung Shin.

About this article

Cite this article

Kim, J., Shin, M. Identification of significant gene-sets differentially expressed in a specific disease by co-expressed functional gene modules generation. BioChip J 4, 204–209 (2010). https://doi.org/10.1007/s13206-010-4307-5

  • DOI: https://doi.org/10.1007/s13206-010-4307-5
