Yeast Systems Biology pp 367-379
A Computational Method to Search for DNA Structural Motifs in Functional Genomic Elements
Abstract
The rapidly increasing availability of DNA sequence data from modern high-throughput experimental techniques has created the need for computational algorithms to aid in motif discovery in genomic DNA. Such algorithms are typically used to find a statistical representation of the nucleotide sequence of the target site of a DNA-binding protein within a collection of DNA sequences that are thought to contain segments to which the protein is bound. A major assumption of these algorithms is that the protein recognizes the primary order of nucleotides in the sequence. However, proteins can also recognize the three-dimensional shape and structure of DNA. To account for this, we developed a computational method to predict the local structural profiles of any set of DNA sequences and then to search within these profiles for common DNA structural motifs. Here we describe the details of this method and use it to find a DNA structural motif in the Saccharomyces cerevisiae yeast genome that is associated with binding of the transcription factor RLM1, a component of the protein kinase C-mediated MAP kinase pathway.
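The search step described above can be illustrated with a minimal sketch. This is not the authors' implementation (the acknowledgments indicate their tooling is written in Perl). It assumes that each sequence has already been converted to a numeric structural profile, for example a list of predicted hydroxyl radical cleavage intensities. The function names, the squared-distance score, and the `temperature` parameter are hypothetical choices made for illustration only.

```python
import math
import random


def window_distance(window, model):
    """Sum of squared differences between a profile window and the motif model."""
    return sum((w - m) ** 2 for w, m in zip(window, model))


def mean_model(profiles, starts, width, skip):
    """Average the currently selected windows of every profile except `skip`."""
    rows = [profiles[i][s:s + width]
            for i, s in enumerate(starts) if i != skip]
    return [sum(col) / len(rows) for col in zip(*rows)]


def gibbs_structural_motif(profiles, width, iterations=200,
                           temperature=0.05, seed=0):
    """Gibbs-style search for one shared window per numeric profile.

    Assumption: at least two profiles, each at least `width` long.
    Returns the start index chosen in each profile.
    """
    rng = random.Random(seed)
    # Start from a random window in every profile.
    starts = [rng.randrange(len(p) - width + 1) for p in profiles]
    for _ in range(iterations):
        for i, profile in enumerate(profiles):
            # Build the motif model from all other profiles (leave one out).
            model = mean_model(profiles, starts, width, skip=i)
            n_positions = len(profile) - width + 1
            dists = [window_distance(profile[s:s + width], model)
                     for s in range(n_positions)]
            # Sample a new start, favoring windows close to the model.
            # Subtracting the best distance keeps the largest weight at 1.
            best = min(dists)
            weights = [math.exp(-(d - best) / temperature) for d in dists]
            starts[i] = rng.choices(range(n_positions), weights=weights)[0]
    return starts
```

Because each window is sampled rather than chosen greedily, the search can escape poor local alignments. In practice one would run several seeds and compare the resulting alignments.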
Key words
Motif discovery, hydroxyl radical, DNA structure, Gibbs sampling, transcription factor, RLM1
Notes
Acknowledgments
We thank Eric Bishop for providing the Perl module that is used to predict hydroxyl radical cleavage patterns for any DNA sequence. SCJP was the recipient of a National Academies Ford Foundation Dissertation Fellowship. This work was supported by an ENCODE Technology Development Grant from the National Human Genome Research Institute of the National Institutes of Health to TDT (HG003541).