Abstract
There are many methodologies for profiling gene expression at the transcript level, and through their use scientists have generated vast amounts of experimental data. Turning raw experimental data into meaningful biological observations requires a number of processing steps: removing noise, identifying the “true” expression value, normalizing the data, comparing it to reference data, and extracting patterns or otherwise gaining insight into the underlying biology of the samples being measured. In this chapter we give a brief overview of how the raw data is processed, provide details on several data-mining methods, and discuss the future direction of expression informatics.
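As a rough illustration of the processing steps listed in the abstract, the following minimal sketch chains background subtraction, log transformation, median centering, and comparison against a reference sample. The function name `process_intensities`, the `floor` parameter, and the choice of median centering are assumptions made for illustration only; the chapter itself may use different methods.

```python
import math
from statistics import median


def process_intensities(raw, background, reference, floor=1.0):
    """Return per-gene log2 ratios of a sample against a reference.

    All names and the specific choices here (clamping at `floor`,
    median centering) are illustrative assumptions, not the chapter's
    prescribed method.
    """
    # Noise removal: subtract local background and clamp at `floor`
    # so the logarithm below is always defined.
    signal = [max(r - b, floor) for r, b in zip(raw, background)]

    # Log-transform so that fold changes become additive differences.
    log_signal = [math.log2(s) for s in signal]

    # Normalization: center the sample on its own median.
    sample_median = median(log_signal)
    normalized = [x - sample_median for x in log_signal]

    # Apply the same transform and centering to the reference sample.
    log_ref = [math.log2(max(v, floor)) for v in reference]
    ref_median = median(log_ref)
    centered_ref = [x - ref_median for x in log_ref]

    # Comparison: log2 ratio of sample to reference for each gene.
    return [s - r for s, r in zip(normalized, centered_ref)]
```

For example, calling `process_intensities([110, 210, 410], [10, 10, 10], [100, 100, 100])` yields values close to -1, 0, and 1, meaning the first gene sits about twofold below and the third about twofold above the sample median relative to a flat reference. The resulting ratios could then be passed to a pattern-extraction step such as clustering.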
Copyright information
© 2004 Humana Press Inc., Totowa, NJ
About this protocol
Cite this protocol
Leach, M. (2004). Gene Expression Informatics. In: Shimkets, R.A. (eds) Gene Expression Profiling. Methods in Molecular Biology, vol 258. Humana Press. https://doi.org/10.1385/1-59259-751-3:153
DOI: https://doi.org/10.1385/1-59259-751-3:153
Publisher Name: Humana Press
Print ISBN: 978-1-58829-220-9
Online ISBN: 978-1-59259-751-2
eBook Packages: Springer Protocols