Abstract
DNA microarray technology is an innovative means of obtaining information on gene function. Because it is a high-throughput method, computational tools for data analysis and mining are essential to extract knowledge from experimental results. Filtering procedures and statistical approaches are frequently combined to identify differentially expressed genes. However, a list of differentially expressed genes is only a starting point: the differential expression profiles must then be integrated into a biological context, which is an active topic in data mining. This chapter describes an integrated approach of filtering and statistical validation for selecting reliable differentially expressed genes, together with a brief introduction to data mining that focuses on classifying co-regulated genes by their biological function.
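As a rough illustration of how a filtering step and a statistical step can be combined, the sketch below first discards genes whose mean log2 fold change is small and then applies an exact permutation test to the remaining ones. It is not the chapter's procedure. The function names, the thresholds (`min_abs_log2_fc`, `alpha`), the choice of a permutation test, and the toy values are all assumptions made for the example, and no multiple-testing correction is applied.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation p-value for a difference in group means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    total = 0
    extreme = 0
    # Enumerate every way of relabelling the pooled replicates into two groups.
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in idx]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance so the observed labelling always counts as extreme.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


def select_genes(log2_expr, min_abs_log2_fc=1.0, alpha=0.05):
    """Return (gene, log2 fold change, p-value) for genes passing both steps.

    log2_expr maps a gene name to (control replicates, treated replicates),
    with all intensities already on a log2 scale (an assumption of this sketch).
    """
    selected = []
    for gene, (control, treated) in log2_expr.items():
        log2_fc = mean(treated) - mean(control)
        if abs(log2_fc) < min_abs_log2_fc:
            continue  # filtering step: change too small to consider
        p = permutation_p_value(control, treated)
        if p <= alpha:  # statistical validation step
            selected.append((gene, log2_fc, p))
    return selected


# Hypothetical toy data: four control and four treated replicates per gene.
toy_data = {
    "gene_up": ([5.0, 5.2, 4.9, 5.1], [7.0, 7.3, 6.9, 7.1]),
    "gene_flat": ([6.0, 6.1, 5.9, 6.0], [6.1, 6.0, 6.0, 5.9]),
    "gene_noisy": ([4.0, 8.0, 5.0, 7.0], [9.0, 5.5, 6.5, 9.5]),
}
selected = select_genes(toy_data)
for gene, log2_fc, p in selected:
    print(f"{gene}: log2 fold change = {log2_fc:.2f}, p = {p:.3f}")
```

With four replicates per group there are only 70 possible relabellings, so the smallest attainable two-sided p-value is 2/70. This is one reason small replicate numbers limit what a purely statistical selection can show, and why it is often paired with a filter.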
Copyright information
© 2004 Humana Press Inc.
About this protocol
Cite this protocol
Saviozzi, S., Iazzetti, G., Caserta, E., Guffanti, A., Calogero, R.A. (2004). Microarray Data Analysis and Mining. In: Decker, J., Reischl, U. (eds) Molecular Diagnosis of Infectious Diseases. Methods in Molecular Medicine™, vol 94. Humana Press. https://doi.org/10.1385/1-59259-679-7:67
DOI: https://doi.org/10.1385/1-59259-679-7:67
Publisher Name: Humana Press
Print ISBN: 978-1-58829-221-6
Online ISBN: 978-1-59259-679-9
eBook Packages: Springer Protocols