Journal of Biosciences

Volume 32, Supplement 1, pp 863–870

Parsing regulatory DNA: General tasks, techniques, and the PhyloGibbs approach

  • Rahul Siddharthan


In this review, we discuss the general problem of understanding transcriptional regulation from DNA sequence and prior information. The main tasks we discuss are predicting cis-regulatory modules (CRMs), that is, local regions of DNA that contain binding sites for transcription factors (TFs), and predicting individual binding sites. We review various existing methods, and then describe the approach taken by PhyloGibbs, a recent motif-finding algorithm that we developed to predict TF binding sites, and by PhyloGibbs-MP, an extension to PhyloGibbs that tackles other tasks in regulatory genomics, particularly the prediction of CRMs.


Keywords: PhyloGibbs; regulatory DNA; transcription factors

Abbreviations used


CRMs: cis-regulatory modules

MCMC: Markov chain Monte Carlo

PWMs: position weight matrices

TFs: transcription factors





Copyright information

© Indian Academy of Sciences 2007

Authors and Affiliations

  1. The Institute of Mathematical Sciences, Chennai, India
