Computer analysis of colocalization of the TFs’ binding sites in the genome according to the ChIP-seq data

Russian Journal of Genetics: Applied Research

Abstract

A computer program was developed for identifying clusters of binding sites of different transcription factors (TFs) from the genomic coordinates of ChIP-seq (chromatin immunoprecipitation followed by sequencing) profile peaks. The statistical features of the distribution of transcription factor binding sites (TFBSs) in the mouse genome, obtained from ChIP-seq experiments in embryonic stem cells, are considered. Clusters containing at least four binding sites of different TFs were identified in the mouse genome, and their localization relative to the regulatory regions of genes is described. Two types of site colocalization are confirmed: clusters containing binding sites of Oct4, Nanog, and Sox2, located in distal regions, and clusters containing n-Myc and c-Myc binding sites, located mainly in the promoter regions of mouse genes. Analysis of new ChIP-seq data on the binding of the TFs Nr5a2, Tbx3, Cep, SRF, and USF1 in the same cell type confirmed the division of TFBS clusters into two types: those containing binding sites of the pluripotency regulators (Oct4, Nanog, and Sox2) and those lacking them. A computer program was also developed for the statistical processing of data on site location relative to genes; it uses experimental site localization data obtained by ChIP-seq in the mouse and human genomes. Using this program, localization patterns of the binding sites of various TFs were detected. The distances between the closest binding sites of the Oct4, Nanog, and Sox2 group and the binding sites of other factors within site clusters were calculated; these distances serve as a basis for analyzing the joint binding of protein complexes to DNA. The fraction of ChIP-seq genomic regions containing known TFBS nucleotide motifs was calculated, and the weight matrices for these motifs were recalculated. A correlation between motif presence and ChIP-seq binding intensity is shown. The programs implementing these computational methods for assessing the clustering of binding sites of various TFs in new ChIP-seq data are available from the authors upon request.
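The clustering and distance calculations summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' program, which is available only on request. The `Peak` record, the function names, the 500 bp merging window, and the reading of "at least four" as four distinct TFs are all assumptions introduced here for illustration.

```python
from collections import namedtuple

# Hypothetical record for one ChIP-seq peak summit (not the authors' data format).
Peak = namedtuple("Peak", ["chrom", "pos", "tf"])


def _distinct_tfs(group):
    """Number of different TFs represented in a group of peaks."""
    return len({p.tf for p in group})


def find_clusters(peaks, window=500, min_tfs=4):
    """Chain neighbouring peaks into clusters.

    Peaks are sorted by chromosome and position; a peak joins the current
    group if it lies on the same chromosome and within `window` bp of the
    previous peak. Groups with fewer than `min_tfs` distinct TFs are dropped.
    Both thresholds are illustrative assumptions.
    """
    clusters, current = [], []
    for p in sorted(peaks, key=lambda q: (q.chrom, q.pos)):
        gap_too_large = current and (
            p.chrom != current[-1].chrom or p.pos - current[-1].pos > window
        )
        if gap_too_large:
            if _distinct_tfs(current) >= min_tfs:
                clusters.append(current)
            current = []
        current.append(p)
    if current and _distinct_tfs(current) >= min_tfs:
        clusters.append(current)
    return clusters


def nearest_core_distance(cluster, core=("Oct4", "Nanog", "Sox2")):
    """Smallest distance (bp) between a core-factor site and any other site.

    Returns None when the cluster lacks either core or non-core sites.
    """
    core_pos = [p.pos for p in cluster if p.tf in core]
    other_pos = [p.pos for p in cluster if p.tf not in core]
    if not core_pos or not other_pos:
        return None
    return min(abs(a - b) for a in core_pos for b in other_pos)
```

Splitting clusters by whether `nearest_core_distance` returns `None` corresponds to the two cluster types described above: those containing Oct4, Nanog, or Sox2 sites and those lacking them. Real input would typically be read from peak files with chromosome and summit coordinates.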

Author information

Corresponding author

Correspondence to A. I. Dergilev.

Additional information

Original Russian Text © A.I. Dergilev, A.M. Spitsina, I.V. Chadaeva, A.V. Svichkarev, F.M. Naumenko, E.V. Kulakova, E.R. Galieva, E.E. Vityaev, M. Chen, Yu.L. Orlov, 2016, published in Vavilovskii Zhurnal Genetiki i Selektsii, 2016, Vol. 20, No. 6, pp. 770–778.

Cite this article

Dergilev, A.I., Spitsina, A.M., Chadaeva, I.V. et al. Computer analysis of colocalization of the TFs’ binding sites in the genome according to the ChIP-seq data. Russ J Genet Appl Res 7, 513–522 (2017). https://doi.org/10.1134/S2079059717050057

  • DOI: https://doi.org/10.1134/S2079059717050057
