Distinguishing between Genomic Regions Bound by Paralogous Transcription Factors

Munteanu, Alina; Gordân, Raluca

doi:10.1007/978-3-642-37195-0_12

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 7821)

Included in the following conference series:

Annual International Conference on Research in Computational Molecular Biology

Abstract

Transcription factors (TFs) regulate gene expression by binding to specific DNA sites in cis regulatory regions of genes. Most eukaryotic TFs are members of protein families that share a common DNA binding domain and often recognize highly similar DNA sequences. Currently, it is not well understood why closely related TFs are able to bind different genomic regions in vivo, despite having the potential to interact with the same DNA sites. Here, we use the Myc/Max/Mad family as a model system to investigate whether interactions with additional proteins (co-factors) can explain why paralogous TFs with highly similar DNA binding preferences interact with different genomic sites in vivo. We use a classification approach to distinguish between targets of c-Myc versus Mad2, using features that reflect the DNA binding specificities of putative co-factors. When applied to c-Myc/Mad2 DNA binding data, our algorithm can distinguish between genomic regions bound uniquely by c-Myc versus Mad2 with 87% accuracy.
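The classification idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the classifier or the feature construction, so this example assumes that each feature is the best log-odds match of a putative co-factor motif within a region, and it swaps in a simple nearest-centroid rule as a stand-in classifier. The two co-factor motifs, the training sequences, and all function names (`pwm_from_sites`, `best_score`, `featurize`, `predict`) are hypothetical and invented purely for illustration.

```python
import math

BASES = "ACGT"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def pwm_from_sites(sites, pseudocount=1.0):
    """Build a log-odds PWM (uniform background) from aligned example sites."""
    pwm = []
    for i in range(len(sites[0])):
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[i]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2((counts[b] / total) / 0.25) for b in BASES})
    return pwm


def best_score(seq, pwm):
    """Maximum log-odds score over all windows on both strands."""
    width = len(pwm)
    best = -math.inf
    for strand in (seq, reverse_complement(seq)):
        for start in range(len(strand) - width + 1):
            score = sum(pwm[i][strand[start + i]] for i in range(width))
            best = max(best, score)
    return best


def featurize(seq, pwms):
    """One feature per putative co-factor: its best match in the region."""
    return [best_score(seq, pwm) for pwm in pwms]


def centroid(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def classify(features, centroids):
    """Return the label whose centroid is nearest (squared Euclidean)."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: dist(features, centroids[label]))


# Hypothetical co-factor motifs (toy aligned sites, not real data).
COFACTOR_PWMS = [
    pwm_from_sites(["GGGCGG", "GGGCGG", "GGGCGT"]),  # made-up co-factor A
    pwm_from_sites(["TTATCA", "TTATCA", "TGATCA"]),  # made-up co-factor B
]

# Made-up training regions; every region carries the shared CACGTG site,
# so only the co-factor features can separate the two classes.
TRAINING = {
    "c-Myc": ["ACACGTGTTGGGCGGAT", "TTGGGCGGCACGTGAAC"],
    "Mad2": ["ACACGTGTTTTATCAAT", "CATTATCACACGTGCTA"],
}

CENTROIDS = {
    label: centroid([featurize(s, COFACTOR_PWMS) for s in seqs])
    for label, seqs in TRAINING.items()
}


def predict(seq):
    return classify(featurize(seq, COFACTOR_PWMS), CENTROIDS)


if __name__ == "__main__":
    print(predict("AACACGTGAAGGGCGGTT"))
    print(predict("GGTTATCATTCACGTGAA"))
```

The point of the sketch is the design choice rather than the numbers: because both paralogs can recognize the same core site, the features deliberately ignore it and describe only the neighbouring co-factor motifs. In practice, a classifier trained on genome-wide binding data would replace the nearest-centroid rule used here.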

Author information

Authors and Affiliations

Faculty of Computer Science, Alexandru I. Cuza University, Iasi, Romania
Alina Munteanu
Institute for Genome Sciences and Policy, Departments of Biostatistics & Bioinformatics, Computer Science, and Molecular Genetics and Microbiology, Duke University, Durham, NC, 27708, USA
Raluca Gordân

Editor information

Editors and Affiliations

School of Mathematics, Peking University, Beijing, P.R. China
Minghua Deng
Bioinformatics Division, TNLIST/Department of Automation, Tsinghua University, 100084, Beijing, P.R. China
Rui Jiang
Molecular and Computational Biology Program, University of Southern California, Los Angeles, California, USA
Fengzhu Sun
Department of Automation, Tsinghua University, P.O. Box, 100084, Beijing, China
Xuegong Zhang

Cite this paper

Munteanu, A., Gordân, R. (2013). Distinguishing between Genomic Regions Bound by Paralogous Transcription Factors. In: Deng, M., Jiang, R., Sun, F., Zhang, X. (eds) Research in Computational Molecular Biology. RECOMB 2013. Lecture Notes in Computer Science, vol 7821. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37195-0_12

DOI: https://doi.org/10.1007/978-3-642-37195-0_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37194-3
Online ISBN: 978-3-642-37195-0
eBook Packages: Computer Science, Computer Science (R0)