A Computational Method to Search for DNA Structural Motifs in Functional Genomic Elements
The rapidly increasing availability of DNA sequence data from modern high-throughput experimental techniques has created the need for computational algorithms to aid in motif discovery in genomic DNA. Such algorithms are typically used to find a statistical representation of the nucleotide sequence of the target site of a DNA-binding protein within a collection of DNA sequences that are thought to contain segments to which the protein binds. A major assumption of these algorithms is that the protein recognizes the primary order of nucleotides in the sequence. However, proteins can also recognize the three-dimensional shape and structure of DNA. To account for this, we developed a computational method to predict the local structural profiles of any set of DNA sequences and then to search within these profiles for common DNA structural motifs. Here we describe the details of this method and use it to find a DNA structural motif in the genome of the yeast Saccharomyces cerevisiae that is associated with binding of the transcription factor RLM1, a component of the protein kinase C-mediated MAP kinase pathway.
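To make the idea of searching structural profiles for a shared motif concrete, the following is a minimal sketch of a Gibbs-style sampler over numeric profiles. It is an illustration under stated assumptions, not the implementation described in this work: each profile is assumed to be a plain list of numbers standing in for predicted structural values (for example, hydroxyl radical cleavage intensities), and the function names, the squared-distance score, and the `temperature` parameter are all hypothetical choices made for this example.

```python
import math
import random


def mean_window(profiles, starts, width, skip):
    """Average the currently selected windows, leaving out profile `skip`."""
    rows = [
        p[s:s + width]
        for i, (p, s) in enumerate(zip(profiles, starts))
        if i != skip
    ]
    return [sum(col) / len(rows) for col in zip(*rows)]


def squared_distance(a, b):
    """Sum of squared differences between two equal-length windows."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def gibbs_structural_motif(profiles, width, iterations=200,
                           temperature=1.0, seed=0):
    """Sample one window start per profile so that the windows look alike.

    Assumption: similarity is scored by squared distance to the mean of the
    other windows. This scoring is illustrative only. Needs >= 2 profiles.
    """
    rng = random.Random(seed)
    # Start from a random window in every profile.
    starts = [rng.randrange(len(p) - width + 1) for p in profiles]
    for _ in range(iterations):
        for i, p in enumerate(profiles):
            # Build the motif model from all profiles except the held-out one.
            model = mean_window(profiles, starts, width, skip=i)
            dists = [
                squared_distance(p[s:s + width], model)
                for s in range(len(p) - width + 1)
            ]
            # Closer windows get higher sampling weight; subtracting the
            # minimum keeps the largest weight at exactly 1.
            best = min(dists)
            weights = [math.exp(-(d - best) / temperature) for d in dists]
            starts[i] = rng.choices(range(len(weights)), weights=weights)[0]
    return starts


if __name__ == "__main__":
    # Toy profiles with the same bump (1, 2, 1) placed at different offsets.
    toy = [
        [0.0, 0.0, 1.0, 2.0, 1.0, 0.0],
        [1.0, 2.0, 1.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 1.0, 2.0, 1.0],
    ]
    print(gibbs_structural_motif(toy, width=3, temperature=0.05))
```

Because the sampler is stochastic, the returned start positions depend on the seed and on `temperature`; a lower temperature makes each update closer to a greedy choice of the best-matching window.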
Key words: Motif discovery, hydroxyl radical, DNA structure, Gibbs sampling, transcription factor, RLM1
We thank Eric Bishop for providing the Perl module that is used to predict hydroxyl radical cleavage patterns for any DNA sequence. SCJP was the recipient of a National Academies Ford Foundation Dissertation Fellowship. This work was supported by an ENCODE Technology Development Grant from the National Human Genome Research Institute of the National Institutes of Health to TDT (HG003541).