Deep Learning of CTCF-Mediated Chromatin Loops in 3D Genome Organization

  • Conference paper
Computational Advances in Bio and Medical Sciences (ICCABS 2019)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 12029)

Abstract

The three-dimensional organization of the human genome is crucial for gene regulation. Results from high-throughput chromosome conformation capture techniques show that the CCCTC-binding factor (CTCF) plays an important role in chromatin interactions, and that CTCF-mediated chromatin loops mostly occur between convergent CTCF-binding sites. However, it is still unclear whether, and which, sequence patterns other than the convergent CTCF motifs contribute to the formation of chromatin loops. To discover the complex sequence patterns underlying chromatin loop formation, we have developed a deep learning model, called DeepCTCFLoop, that predicts whether a chromatin loop can form between a pair of convergent CTCF motifs using only the DNA sequences of the motifs and their flanking regions. Our results suggest that DeepCTCFLoop can accurately distinguish the convergent CTCF motif pairs that form chromatin loops from those that do not. It significantly outperforms CTCF-MP, a machine learning model based on word2vec and boosted trees, when only DNA sequences are used as input. Moreover, we show that the binding motifs of ASCL1, SP2 and ZNF384 may facilitate the formation of chromatin loops in addition to convergent CTCF motifs. To our knowledge, this is the first published study to use deep learning techniques to discover the sequence motif patterns underlying CTCF-mediated chromatin loop formation. Our results provide useful information for understanding the mechanism of 3D genome organization. The source code and datasets used in this study for model construction are freely available at https://github.com/BioDataLearning/DeepCTCFLoop.
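
To make the phrase "using only the DNA sequences of the motifs and their flanking regions" concrete, the sketch below shows one common way sequence-only models represent their input: one-hot encoding. The abstract does not say how DeepCTCFLoop encodes its input, so the encoding, the function names `one_hot` and `encode_pair`, and the toy sequences are all assumptions made for illustration. They are not taken from the paper or its repository.

```python
# Hypothetical illustration only: the abstract does not specify the input
# encoding. One-hot encoding is assumed here because it is a common choice
# for sequence-based deep learning models.

ALPHABET = "ACGT"  # column order of the one-hot rows (an assumption)


def one_hot(seq):
    """Encode a DNA string as a list of 4-element rows, one row per base.

    Bases outside A/C/G/T (for example N) become an all-zero row.
    """
    rows = []
    for base in seq.upper():
        row = [0, 0, 0, 0]
        idx = ALPHABET.find(base)
        if idx >= 0:
            row[idx] = 1
        rows.append(row)
    return rows


def encode_pair(left_seq, right_seq):
    """Concatenate the encodings of two motif-plus-flank sequences.

    The result is a single (len(left) + len(right)) x 4 matrix that a
    classifier could take as input for one candidate motif pair.
    """
    return one_hot(left_seq) + one_hot(right_seq)


if __name__ == "__main__":
    # Toy sequences, not real CTCF motifs.
    matrix = encode_pair("ACGT", "TTGNA")
    for row in matrix:
        print(row)
```

A binary classifier would then be trained on such matrices, with looping pairs labeled positive and non-looping pairs labeled negative. The actual architecture and preprocessing are described in the paper and the linked repository.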

Author information

Corresponding author

Correspondence to Liangjiang Wang.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Kuang, S., Wang, L. (2020). Deep Learning of CTCF-Mediated Chromatin Loops in 3D Genome Organization. In: Măndoiu, I., Murali, T., Narasimhan, G., Rajasekaran, S., Skums, P., Zelikovsky, A. (eds) Computational Advances in Bio and Medical Sciences. ICCABS 2019. Lecture Notes in Computer Science, vol 12029. Springer, Cham. https://doi.org/10.1007/978-3-030-46165-2_7

  • DOI: https://doi.org/10.1007/978-3-030-46165-2_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-46164-5

  • Online ISBN: 978-3-030-46165-2

  • eBook Packages: Computer Science, Computer Science (R0)
