Abstract
The development of DNA microarray technology has made it possible to monitor the mRNA abundance of all genes simultaneously (the transcriptome) under a variety of cellular conditions. In addition, microarray-based genomewide measurements of promoter occupancy (the occupome) are now available for an increasing number of transcription factors. With these data and the complete genome sequences of many important organisms, it is becoming possible to model quantitatively the molecular computation performed at each promoter, which takes as input the nuclear concentrations of the active forms of various regulatory proteins (the regulome) and produces as output a transcription rate, which in turn determines mRNA abundance. In this chapter, we describe how our group has used multivariate linear regression methods to: (i) discover cis-regulatory elements in upstream regulatory regions in an unbiased manner; (ii) infer a regulatory activity profile across conditions for each transcription factor; and (iii) determine, through integrated modeling of expression and promoter occupancy data, whether the mRNA expression level of a gene whose promoter is occupied by a particular transcription factor is truly regulated by that factor. Together, these results show that model-based analysis of functional genomics data is a versatile conceptual and practical framework for elucidating regulatory circuitry, and a powerful alternative to the currently prevalent clustering-based methods.
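The regression idea in the abstract can be illustrated with a minimal sketch. It is not the chapter's implementation: it assumes that the log expression ratio of each gene is modeled as an intercept plus a weighted sum of motif counts in its upstream region, and it fits the weights by ordinary least squares. The function name `fit_motif_activities`, the three synthetic motifs, and all numerical values are hypothetical and chosen only for illustration.

```python
import numpy as np


def fit_motif_activities(counts, expression):
    """Ordinary least-squares fit of expression ~ intercept + counts @ activities.

    counts:     (n_genes, n_motifs) motif occurrences per upstream region
    expression: (n_genes,) log expression ratios for one condition
    Returns the intercept, one fitted coefficient per motif, and R^2.
    """
    counts = np.asarray(counts, dtype=float)
    expression = np.asarray(expression, dtype=float)

    # Prepend a column of ones so the model includes an intercept.
    design = np.column_stack([np.ones(counts.shape[0]), counts])
    coef, _, _, _ = np.linalg.lstsq(design, expression, rcond=None)

    # Fraction of expression variance explained by the fitted model.
    fitted = design @ coef
    ss_res = np.sum((expression - fitted) ** 2)
    ss_tot = np.sum((expression - expression.mean()) ** 2)
    r_squared = 1.0 - ss_res / ss_tot
    return coef[0], coef[1:], r_squared


# Synthetic data (assumption): 500 genes, 3 hypothetical motifs.
rng = np.random.default_rng(0)
n_genes = 500
counts = rng.poisson(0.5, size=(n_genes, 3))
true_activities = np.array([0.8, -0.5, 0.0])  # the third motif has no effect
expression = 0.1 + counts @ true_activities + rng.normal(0.0, 0.2, n_genes)

intercept, activities, r_squared = fit_motif_activities(counts, expression)
print("fitted activities:", np.round(activities, 2))
print("R^2:", round(float(r_squared), 2))
```

In this sketch, a coefficient close to zero (as for the third motif) suggests that the motif does not contribute to expression in the condition examined. Repeating the fit for each condition would give one coefficient per motif per condition, which is the kind of activity profile mentioned in item (ii).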
Keywords
- Molecular Computation
- Nuclear Concentration
- Transcriptional Regulatory Mechanism
- Functional Genomic Data
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Copyright information
© 2006 Landes Bioscience and Springer Science+Business Media
About this chapter
Cite this chapter
Bussemaker, H.J. (2006). Model-Based Inference of Transcriptional Regulatory Mechanisms from DNA Microarray Data. In: Discovering Biomolecular Mechanisms with Computational Biology. Molecular Biology Intelligence Unit. Springer, Boston, MA. https://doi.org/10.1007/0-387-36747-0_7
DOI: https://doi.org/10.1007/0-387-36747-0_7
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-34527-7
Online ISBN: 978-0-387-36747-7
eBook Packages: Biomedical and Life Sciences (R0)
