How does DNA sequence motif discovery work?

  • Primer

From Nature Biotechnology

How can we computationally extract an unknown motif from a set of target sequences? What are the principles behind the major motif discovery algorithms? Which of these should we use, and how do we know we've found a 'real' motif?

Figure 1: Starting from a single site, expectation maximization algorithms such as MEME (ref. 4) alternate between assigning sites to a motif (left) and updating the motif model (right).
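
The alternation described in the Figure 1 caption can be illustrated with a minimal sketch. It is not MEME's actual implementation: it assumes exactly one site per sequence, a uniform background over A, C, G and T, a single strand, and a fixed number of iterations. The names `em_motif` and `consensus`, the parameters `seed_site`, `iters` and `pseudo`, and the toy sequences are all hypothetical, chosen only for illustration.

```python
ALPHABET = "ACGT"


def em_motif(seqs, width, seed_site, iters=20, pseudo=0.1):
    """Toy EM motif finder (assumes one site per sequence, uniform background)."""
    # Start from a single seed site: a position weight matrix (PWM)
    # that favours the seed's base in every column.
    pwm = []
    for j in range(width):
        col = {b: pseudo for b in ALPHABET}
        col[seed_site[j]] += 1.0
        total = sum(col.values())
        pwm.append({b: col[b] / total for b in ALPHABET})

    posteriors = []
    for _ in range(iters):
        counts = [{b: pseudo for b in ALPHABET} for _ in range(width)]
        posteriors = []
        for s in seqs:
            # E-step: softly assign each window of the sequence to the motif,
            # using the likelihood ratio of the PWM versus the background.
            scores = []
            for i in range(len(s) - width + 1):
                score = 1.0
                for j in range(width):
                    score *= pwm[j][s[i + j]] / 0.25
                scores.append(score)
            total = sum(scores)
            post = [x / total for x in scores]
            posteriors.append(post)
            # Accumulate expected base counts, weighted by the assignment.
            for i, p in enumerate(post):
                for j in range(width):
                    counts[j][s[i + j]] += p
        # M-step: update the motif model from the expected counts.
        pwm = [{b: col[b] / sum(col.values()) for b in ALPHABET} for col in counts]
    return pwm, posteriors


def consensus(pwm):
    """Most probable base in each PWM column."""
    return "".join(max(col, key=col.get) for col in pwm)


# Toy data (made up): each sequence carries one planted copy of GATTACA.
toy_seqs = ["CCGGATTACACCGG", "GATTACAGGCCGGC", "CGCGCCGGATTACA", "GGCCGATTACAGCC"]
toy_pwm, toy_post = em_motif(toy_seqs, width=7, seed_site="GATTACA")
print(consensus(toy_pwm))
```

The pseudocount keeps every PWM entry above zero, so no window is ever ruled out completely; a real tool would also model the background composition, the reverse strand, and sequences without any site.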

References

  1. D'haeseleer, P. What are DNA sequence motifs? Nat. Biotechnol. 24, 423–425 (2006).

  2. Sinha, S. & Tompa, M. YMF: a program for discovery of novel transcription factor binding sites by statistical overrepresentation. Nucleic Acids Res. 31, 3586–3588 (2003).

  3. Pavesi, G. et al. Weeder Web: discovery of transcription factor binding sites in a set of sequences from co-regulated genes. Nucleic Acids Res. 32 (Web Server Issue), W199–W203 (2004).

  4. Bailey, T.L. & Elkan, C. Fitting a mixture model by expectation maximization to discover motifs in biopolymers. Proc. Int. Conf. Intell. Syst. Mol. Biol. 2, 28–36 (1994).

  5. Tompa, M. et al. Assessing computational tools for the discovery of transcription factor binding sites. Nat. Biotechnol. 23, 137–144 (2005).

  6. Li, N. & Tompa, M. Analysis of computational approaches for motif discovery. Alg. Mol. Biol. 1, 8 (2006).

  7. Hu, J., Li, B. & Kihara, D. Limitations and potentials of current motif discovery algorithms. Nucleic Acids Res. 33, 4899–4913 (2005).

  8. Thijs, G. et al. A Gibbs sampling method to detect overrepresented motifs in the upstream regions of coexpressed genes. J. Comp. Biol. 9, 447–464 (2002).

  9. Huber, B.R. & Bulyk, M.L. Meta-analysis discovery of tissue-specific DNA sequence motifs from mammalian gene expression data. BMC Bioinformatics 7, 229 (2006).

  10. Hughes, J.D. et al. Computational identification of cis-regulatory elements associated with groups of functionally related genes in Saccharomyces cerevisiae. J. Mol. Biol. 296, 1205–1214 (2000).

  11. McGuire, A.M., Hughes, J.D. & Church, G.M. Conservation of DNA regulatory motifs and discovery of new motifs in microbial genomes. Genome Res. 10, 744–757 (2000).

  12. Huang, H.-D. et al. Identifying transcriptional regulatory sites in the human genome using an integrated system. Nucleic Acids Res. 32, 1948–1956 (2004).

Cite this article

D'haeseleer, P. How does DNA sequence motif discovery work? Nat. Biotechnol. 24, 959–961 (2006). https://doi.org/10.1038/nbt0806-959

  • DOI: https://doi.org/10.1038/nbt0806-959

  • Springer Nature America, Inc.
