BayCis: A Bayesian Hierarchical HMM for Cis-Regulatory Module Decoding in Metazoan Genomes

  • Conference paper
Research in Computational Molecular Biology (RECOMB 2008)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 4955)

Abstract

The transcriptional regulatory sequences in metazoan genomes often consist of multiple cis-regulatory modules (CRMs). Each CRM contains locally enriched occurrences of binding sites (motifs) for a certain array of regulatory proteins, capable of integrating, amplifying or attenuating multiple regulatory signals via combinatorial interaction with these proteins. The architecture of CRM organization is reminiscent of the grammatical rules underlying a natural language, and presents a particular challenge to computational motif and CRM identification in metazoan genomes. In this paper, we present BayCis, a Bayesian hierarchical HMM that attempts to capture the stochastic syntactic rules of CRM organization. Under the BayCis model, all candidate sites are evaluated based on a posterior probability measure that takes into consideration their similarity to known binding sites (BSs), their contrasts against local genomic context, their first-order dependencies on upstream sequence elements, as well as priors reflecting general knowledge of CRM structure. We compare our approach to five existing methods for the discovery of CRMs, and demonstrate competitive or superior prediction results evaluated against experimentally based annotations on a comprehensive selection of Drosophila regulatory regions. The software, database and Supplementary Materials will be available at http://www.sailing.cs.cmu.edu/baycis.
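As a rough illustration of the kind of posterior-based site scoring the abstract describes, the sketch below runs the standard forward-backward algorithm on a flat two-state HMM (background versus a binding-site-like state). It is not the BayCis model: the hierarchy, the Bayesian priors and the dependencies on upstream elements are all omitted, and the states, probabilities, example sequence and function names are hypothetical choices made only for this example.

```python
import numpy as np

# Hypothetical toy parameters, not taken from the paper.
# State 0 = background, state 1 = binding-site-like (A-rich).
START = np.array([0.9, 0.1])
TRANS = np.array([[0.9, 0.1],
                  [0.2, 0.8]])
EMIT = np.array([[0.25, 0.25, 0.25, 0.25],   # background: uniform over A, C, G, T
                 [0.85, 0.05, 0.05, 0.05]])  # site state: strongly prefers A
SEQ = "CGTAAAAAGCT"


def encode(seq):
    """Map a DNA string to integer symbols (A=0, C=1, G=2, T=3)."""
    index = {"A": 0, "C": 1, "G": 2, "T": 3}
    return [index[base] for base in seq]


def posterior_decode(obs, start, trans, emit):
    """Return P(state at t | whole sequence) via scaled forward-backward."""
    n_pos, n_states = len(obs), len(start)
    fwd = np.zeros((n_pos, n_states))
    bwd = np.ones((n_pos, n_states))
    scale = np.zeros(n_pos)

    # Forward pass, rescaled at each position to avoid numerical underflow.
    fwd[0] = start * emit[:, obs[0]]
    scale[0] = fwd[0].sum()
    fwd[0] /= scale[0]
    for t in range(1, n_pos):
        fwd[t] = (fwd[t - 1] @ trans) * emit[:, obs[t]]
        scale[t] = fwd[t].sum()
        fwd[t] /= scale[t]

    # Backward pass, reusing the forward scaling factors.
    for t in range(n_pos - 2, -1, -1):
        bwd[t] = trans @ (emit[:, obs[t + 1]] * bwd[t + 1]) / scale[t + 1]

    # Combine and normalize so each row is a distribution over states.
    post = fwd * bwd
    return post / post.sum(axis=1, keepdims=True)


if __name__ == "__main__":
    post = posterior_decode(encode(SEQ), START, TRANS, EMIT)
    for base, p_site in zip(SEQ, post[:, 1]):
        print(f"{base}  P(site) = {p_site:.3f}")
```

In this toy setting, positions inside the run of A's receive a high posterior for the site state, while the flanking positions stay with the background state. Thresholding such per-position posteriors is one simple way to call candidate sites; the paper's own decoding procedure may differ.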

Editor information

Martin Vingron, Limsoon Wong

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lin, Th., Ray, P., Sandve, G.K., Uguroglu, S., Xing, E.P. (2008). BayCis: A Bayesian Hierarchical HMM for Cis-Regulatory Module Decoding in Metazoan Genomes. In: Vingron, M., Wong, L. (eds) Research in Computational Molecular Biology. RECOMB 2008. Lecture Notes in Computer Science, vol 4955. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78839-3_7

  • DOI: https://doi.org/10.1007/978-3-540-78839-3_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-78838-6

  • Online ISBN: 978-3-540-78839-3

  • eBook Packages: Computer Science, Computer Science (R0)
