Statistical Tools for Gene Expression Analysis and Systems Biology and Related Web Resources

Abstract

During the past decade, advanced genomic technologies have revolutionized the life sciences and medical research. Large-scale application of these technologies is making it possible to complete the genome sequences of an ever-growing number of organisms across the animal, plant, and prokaryote kingdoms. The next major task is to understand the function of genes and their interactions.

In this chapter we briefly introduce the major statistical techniques proposed for gene expression data analysis and for data integration, and then describe the most widely used, freely accessible web tools and software dedicated to genomic data analysis and Systems Biology.

Keywords

Boolean networks; Asterias; Visualization; Data integration

Key References

  1. Yang, Y. H., Dudoit, S., Luu, P., Lin, D. M., Peng, V., Ngai, J., and Speed, T. P. (2002) Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation, Nucleic Acids Res. 30, e15.
  2. Jeffery, I. B., Higgins, D. G., and Culhane, A. C. (2006) Comparison and evaluation of methods for generating differentially expressed gene lists from microarray data, BMC Bioinformatics 7, 359.
  3. Bolstad, B. M., Irizarry, R. A., Astrand, M., and Speed, T. P. (2003) A comparison of normalization methods for high density oligonucleotide array data based on variance and bias, Bioinformatics 19, 185–193.
  4. Ideker, T., Galitski, T., and Hood, L. (2001) A new approach to decoding life: Systems Biology, Annu. Rev. Genomics Hum. Genet. 2, 343–372.
  5. Neerincx, P. B. and Leunissen, J. A. (2005) Evolution of web services in bioinformatics, Brief. Bioinform. 6, 178–188.
  6. Lee, T. I., Rinaldi, N. J., Robert, F., Odom, D. T., Bar-Joseph, Z., Gerber, G. K., Hannett, N. M., Harbison, C. T., Thompson, C. M., Simon, I., Zeitlinger, J., Jennings, E. G., Murray, H. L., Gordon, D. B., Ren, B., Wyrick, J. J., Tagne, J.-B., Volkert, T. L., Fraenkel, E., Gifford, D. K., and Young, R. A. (2002) Transcriptional regulatory networks in Saccharomyces cerevisiae, Science 298, 799–804.

Suggested Reading: Background

  1. Takahashi, M., Rhodes, D. R., Furge, K. A., Kanayama, H., Kagawa, S., Haab, B. B., and Teh, B. T. (2001) Gene expression profiling of clear cell renal cell carcinoma: gene identification and prognostic classification, Proc. Natl. Acad. Sci. USA 98, 9754–9759.
  2. Nevins, J. R. and Potti, A. (2007) Mining gene expression profiles: expression signatures as cancer phenotypes, Nat. Rev. Genet. 8, 601–609.
  3. Rhodes, D. R. and Chinnaiyan, A. M. (2005) Integrative analysis of the cancer transcriptome, Nat. Genet. 37, S31–S37.
  4. Ideker, T., Galitski, T., and Hood, L. (2001) A new approach to decoding life: Systems Biology, Annu. Rev. Genomics Hum. Genet. 2, 343–372.
  5. Galperin, M. Y. (2007) The Molecular Biology Database Collection: 2007 update, Nucleic Acids Res. 36, D2–D4.
  6. Gentleman, R. C., Carey, V. J., Bates, D. M., Bolstad, B., Dettling, M., Dudoit, S., Ellis, B., Gautier, L., Ge, Y., Gentry, J., Hornik, K., Hothorn, T., Huber, W., Iacus, S., Irizarry, R., Leisch, F., Li, C., Maechler, M., Rossini, A. J., Sawitzki, G., Smith, C., Smyth, G., Tierney, L., Yang, J. Y., and Zhang, J. (2004) Bioconductor: open software development for computational biology and bioinformatics, Genome Biol. 5, R80.
  7. Neerincx, P. B. and Leunissen, J. A. (2005) Evolution of web services in bioinformatics, Brief. Bioinform. 6, 178–188.

Gene Expression Data Normalization

  8. Yang, Y. H., Dudoit, S., Luu, P., Lin, D. M., Peng, V., Ngai, J., and Speed, T. P. (2002) Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation, Nucleic Acids Res. 30, e15.
  9. Durbin, B. P., Hardin, J. S., Hawkins, D. M., and Rocke, D. M. (2002) A variance-stabilizing transformation for gene-expression microarray data, Bioinformatics 18, S105–S110.
  10. Huber, W., Von Heydebreck, A., Sültmann, H., Poustka, A., and Vingron, M. (2002) Variance stabilization applied to microarray data calibration and to the quantification of differential expression, Bioinformatics 18, S96–S104.
  11. Rocke, D. M. and Durbin, B. P. (2003) Approximate variance stabilizing transformations for gene-expression microarray data, Bioinformatics 19, 966–972.
  12. Bolstad, B. M., Irizarry, R. A., Astrand, M., and Speed, T. P. (2003) A comparison of normalization methods for high density oligonucleotide array data based on variance and bias, Bioinformatics 19, 185–193.
  13. Amaratunga, D. and Cabrera, J. (2001) Analysis of data from viral DNA microchips, J. Am. Stat. Ass. 96, 1161–1170.
  14. Sidorov, I. A., Hosack, D. A., Gee, D., Yang, J., Cam, M. C., Lempicki, R. A., and Dimitrov, D. S. (2002) Oligonucleotide microarray data distribution and normalization, Inf. Sci. 146, 67–73.
  15. Workman, C., Jensen, L. J., Jarmer, H., Berka, R., Gautier, L., Nielsen, H. B., Saxild, H.-H., Nielsen, C., Brunak, S., and Knudsen, S. (2002) A new non-linear normalization method for reducing variability in DNA microarray experiments, Genome Biol. 3, R0048.
  16. Schadt, E. E., Li, C., Ellis, B., and Wong, W. H. (2001) Feature extraction and normalization algorithms for high-density oligonucleotide gene expression array data, J. Cell Biochem. Suppl. 37, 120–125.
  17. Li, C. and Wong, W. H. (2001) Model-based analysis of oligonucleotide arrays: expression index computation and outlier detection, Proc. Natl. Acad. Sci. USA 98, 31–36.
  18. Li, C. and Wong, W. H. (2001) Model-based analysis of oligonucleotide arrays: model validation, design issues and standard error application, Genome Biol. 2, R0032.

Inferential Statistics for the Identification of Differentially Expressed Genes

  19. Welch, B. L. (1947) The generalization of “Student’s” problem when several different population variances are involved, Biometrika 34, 28–35.
  20. Baldi, P. and Long, A. D. (2001) A Bayesian framework for the analysis of microarray expression data: regularized t-test and statistical inferences of gene changes, Bioinformatics 17, 509–519.
  21. Tusher, V. G., Tibshirani, R., and Chu, G. (2001) Significance analysis of microarrays applied to the ionizing radiation response, Proc. Natl. Acad. Sci. USA 98, 5116–5121.
  22. Broberg, P. (2003) Statistical methods for ranking differentially expressed genes, Genome Biol. 4, R41.
  23. Efron, B. and Tibshirani, R. (2002) Empirical Bayes methods and false discovery rates for microarrays, Genet. Epidemiol. 23, 70–86.
  24. Pan, W. (2002) A comparative review of statistical methods for discovering differentially expressed genes in replicated microarray experiments, Bioinformatics 18, 546–554.
  25. Jeffery, I. B., Higgins, D. G., and Culhane, A. C. (2006) Comparison and evaluation of methods for generating differentially expressed gene lists from microarray data, BMC Bioinformatics 7, 359.
  26. Benjamini, Y. and Hochberg, Y. (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing, J. R. Statist. Soc. B 57, 289–300.

Meta-Analysis of Gene Expression Data

  27. Wang, J., Coombes, K. R., Highsmith, W. E., Keating, M. J., and Abruzzo, L. V. (2004) Differences in gene expression between B-cell chronic lymphocytic leukemia and normal B cells: a meta-analysis of three microarray studies, Bioinformatics 20, 3166–3178.
  28. Choi, J. K., Yu, U., Kim, S., and Yoo, O. J. (2003) Combining multiple microarray studies and modeling inter-study variation, Bioinformatics 19, i84–i90.
  29. Stevens, J. R. and Doerge, R. W. (2005) Combining Affymetrix microarray results, BMC Bioinformatics 6, 57.
  30. Hu, P., Greenwood, C. M. T., and Beyene, J. (2005) Integrative analysis of multiple gene expression profiles with quality-adjusted effect size models, BMC Bioinformatics 6, 128.
  31. Park, T., Yi, S. G., Shin, Y. K., and Lee, S. (2006) Combining multiple microarrays in the presence of controlling variables, Bioinformatics 22, 1682–1689.
  32. Rhodes, D. R., Yu, J., Shanker, K., Deshpande, N., Varambally, R., Ghosh, D., Barrette, T., Pandey, A., and Chinnaiyan, A. M. (2004) Large-scale meta-analysis of cancer microarray data identifies common transcriptional profiles of neoplastic transformation and progression, Proc. Natl. Acad. Sci. USA 101, 9309–9314.
  33. Romualdi, C., De Pitta, C., Tombolan, L., Bortoluzzi, S., Sartori, F., Rosolen, A., and Lanfranchi, G. (2006) Defining the gene expression signature of rhabdomyosarcoma by meta-analysis, BMC Genomics 7, 287.
  34. Parmigiani, G., Garrett-Mayer, E. S., Anbazhagan, R., and Gabrielson, E. (2004) A cross-study comparison of gene expression studies for the molecular classification of lung cancer, Clin. Cancer Res. 10, 2922–2927.
  35. Conlon, E. M., Song, J. J., and Liu, A. (2007) Bayesian meta-analysis models for microarray data: a comparative study, BMC Bioinformatics 8, 80.

Network Analysis

  36. Lee, T. I., Rinaldi, N. J., Robert, F., Odom, D. T., Bar-Joseph, Z., Gerber, G. K., Hannett, N. M., Harbison, C. T., Thompson, C. M., Simon, I., Zeitlinger, J., Jennings, E. G., Murray, H. L., Gordon, D. B., Ren, B., Wyrick, J. J., Tagne, J.-B., Volkert, T. L., Fraenkel, E., Gifford, D. K., and Young, R. A. (2002) Transcriptional regulatory networks in Saccharomyces cerevisiae, Science 298, 799–804.
  37. Gardner, T., diBernardo, D., Lorenz, D., and Collins, J. J. (2003) Inferring genetic networks and identifying compound mode of action via expression profiling, Science 301, 102–105.
  38. Bar-Joseph, Z., Gerber, G., Lee, T. I., Rinaldi, N., Yoo, J., Robert, F., Gordon, D., Fraenkel, E., Jaakkola, T., Young, R. A., and Gifford, D. K. (2003) Computational discovery of gene modules and regulatory networks, Nat. Biotechnol. 21, 1337–1342.
  39. Haugen, A. C., Kelley, R., Collins, J. B., Tucker, C. J., Deng, C., Afshari, C. A., Brown, J. M., Ideker, T., and Van Houten, B. (2004) Integrating phenotypic and expression profiles to map arsenic response networks, Genome Biol. 5, R95.
  40. Said, M. R., Begley, T. J., Oppenheim, A. V., Lauffenburger, D. A., and Samson, L. D. (2004) Global network analysis of phenotypic effects: protein networks and toxicity modulation in Saccharomyces cerevisiae, Proc. Natl. Acad. Sci. USA 101, 18006–18011.
  41. McAdams, H. H. and Arkin, A. (1997) Stochastic mechanisms in gene expression, Proc. Natl. Acad. Sci. USA 94, 814–819.
  42. Somogyi, R. and Sniegoski, C. A. (1996) Modeling the complexity of genetic networks: understanding multigenic and pleiotropic regulation, Complexity 1, 45–63.

Web Resources and Statistical Tools: Expression Data Analysis

  43. Romualdi, C., Vitulo, N., Del Favero, M., and Lanfranchi, G. (2005) MIDAW: a web tool for statistical analysis of microarray data, Nucleic Acids Res. 33, W644–W649.
  44. Tibshirani, R., Hastie, T., Narasimhan, B., and Chu, G. (2002) Diagnosis of multiple cancer types by shrunken centroids of gene expression, Proc. Natl. Acad. Sci. USA 99, 6567–6572.
  45. Vaquerizas, J. M., Conde, L., Yankilevich, P., Cabezon, A., Minguez, P., Diaz-Uriarte, R., Al-Shahrour, F., Herrero, J., and Dopazo, J. (2005) GEPAS, an experiment-oriented pipeline for the analysis of microarray gene expression data, Nucleic Acids Res. 33, W616–W620.
  46. Diaz-Uriarte, R., Alibes, A., Morrissey, E. R., Canada, A., Rueda, O. M., and Neves, M. L. (2007) Asterias: integrated analysis of expression and aCGH data using an open-source, web-based, parallelized software suite, Nucleic Acids Res. 35, W75–W80.
  47. Rainer, J., Sanchez-Cabo, F., Stocker, G., Sturn, A., and Trajanoski, Z. (2006) CARMAweb: comprehensive R- and bioconductor-based web service for microarray data analysis, Nucleic Acids Res. 34, W498–W503.
  48. Psarros, M., Heber, S., Sick, M., Thoppae, G., Harshman, K., and Sick, B. (2005) RACE: Remote Analysis Computation for gene Expression data, Nucleic Acids Res. 33, W638–W643.
  49. Saeed, A. I., Sharov, V., White, J., Li, J., Liang, W., Bhagabati, N., Braisted, J., Klapa, M., Currier, T., Thiagarajan, M., Sturn, A., Snuffin, M., Rezantsev, A., Popov, D., Ryltsov, A., Kostukovich, E., Borisovsky, I., Liu, Z., Vinsavich, A., Trush, V., and Quackenbush, J. (2003) TM4: a free, open-source system for microarray data management and analysis, Biotechniques 34, 374–378.
  50. Newman, J. C. and Weiner, A. M. (2005) L2L: a simple tool for discovering the hidden significance in microarray expression data, Genome Biol. 6, R81.
  51. Cahan, P., Ahmad, A. M., Burke, H., Fu, S., Lai, Y., Florea, L., Dharker, N., Kobrinski, T., Kale, P., and McCaffrey, T. A. (2005) List of lists-annotated (LOLA): a database for annotation and comparison of published microarray gene lists, Gene 360, 78–82.
  52. Yi, Y., Li, C., Miller, C., and George, A. L. Jr. (2007) Strategy for encoding and comparison of gene expression signatures, Genome Biol. 8, R133.

Web Resources and Statistical Tools: Annotation and Functional Class Enrichment

  53. Huang, D. W., Sherman, B. T., Tan, Q., Kir, J., Liu, D., Bryant, D., Guo, Y., Stephens, R., Baseler, M. W., Lane, H. C., and Lempicki, R. A. (2007) DAVID Bioinformatics Resources: expanded annotation database and novel algorithms to better extract biology from large gene lists, Nucleic Acids Res. 35, W169–W175.
  54. Al-Shahrour, F., Minguez, P., Tarraga, J., Montaner, D., Alloza, E., Vaquerizas, J. M., Conde, L., Blaschke, C., Vera, J., and Dopazo, J. (2006) BABELOMICS: a Systems Biology perspective in the functional annotation of genome-scale experiments, Nucleic Acids Res. 34, W472–W476.
  55. Al-Shahrour, F., Díaz-Uriarte, R., and Dopazo, J. (2004) FatiGO: a web tool for finding significant associations of Gene Ontology terms to groups of genes, Bioinformatics 20, 578–580.

Web Resources and Statistical Tools: Web Services and Networks Visualization

  56. Hull, D., Wolstencroft, K., Stevens, R., Goble, C., Pocock, M. R., Li, P., and Oinn, T. (2006) Taverna: a tool for building and running workflows of services, Nucleic Acids Res. 34, W729–W732.
  57. Stevens, R. D., Robinson, A. J., and Goble, C. A. (2003) MyGrid: personalised bioinformatics on the information grid, Bioinformatics 19, i302–i304.
  58. Breitkreutz, B. J., Stark, C., and Tyers, M. (2003) Osprey: a network visualization system, Genome Biol. 4, R22.
  59. Shannon, P., Markiel, A., Ozier, O., Baliga, N. S., Wang, J. T., Ramage, D., Amin, N., Schwikowski, B., and Ideker, T. (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks, Genome Res. 13, 2498–2504.

Copyright information

© Humana Press, a part of Springer Science+Business Media, LLC 2009

Authors and Affiliations

  1. CRIBI Biotechnology Centre and Department of Biology, University of Padova, Padova, Italy
  2. University of Padova, Padova, Italy
