Abstract
DNA sequences within clusters of transcription start sites identified by Cap Analysis of Gene Expression (CAGE) have several distinctive features. These clusters are enriched in cytosine and guanine, and their GC skew is consistent with selection of the coding strand, in which the G content exceeds the C content. However, on the coding strand, tracts of the avoided nucleotide, cytosine, occur significantly more often than tracts of the preferred nucleotide, guanine, when each frequency is normalized to the expectation calculated from the local content of that nucleotide in the cluster. Similarly, the C-rich variant of the binding site for the transcription factor Sp1 is statistically more significant on the coding strand than the G-rich variant. It is nevertheless unlikely that the choice of Sp1 site variant is driven by coding strand selection. More likely, both variants are roughly equiprobable, and functional Sp1 binding acts as a selection factor that counteracts the mutations responsible for the GC skew.
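The two quantities the abstract compares, GC skew and tract frequency normalized to a local expectation, can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so this sketch rests on assumptions: GC skew is taken in its conventional form (G − C)/(G + C), and the expected number of tracts is computed under an independent-positions model using the nucleotide's frequency in the same sequence. The function names `gc_skew` and `tract_enrichment` and the tract length `k` are hypothetical and are not taken from the paper.

```python
def gc_skew(seq: str) -> float:
    """Conventional GC skew, (G - C) / (G + C); returns 0.0 if there is no G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def tract_enrichment(seq: str, base: str, k: int = 3) -> float:
    """Observed / expected number of length-k windows consisting only of `base`.

    Assumption (not from the paper): the expectation treats positions as
    independent, each being `base` with its local frequency in `seq`.
    """
    seq = seq.upper()
    n = len(seq)
    windows = n - k + 1
    if windows <= 0:
        return float("nan")
    p = seq.count(base) / n          # local content of the nucleotide
    expected = windows * p ** k      # expected number of pure-`base` windows
    if expected == 0:
        return float("nan")
    # Count overlapping windows that are a run of `base` of length k.
    observed = sum(seq[i:i + k] == base * k for i in range(windows))
    return observed / expected


if __name__ == "__main__":
    cluster = "CCCGAGAG"  # toy sequence, not real CAGE data
    print(gc_skew(cluster))
    print(tract_enrichment(cluster, "C"), tract_enrichment(cluster, "G"))
```

With this normalization, a ratio above 1 means tracts of a nucleotide are more frequent than its local content alone would predict, which is the sense in which an "avoided" nucleotide can still have the more enriched tracts.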
Additional information
Original Russian Text © Yu.A. Medvedeva, I.V. Kulakovskii, N.Yu. Oparina, A.V. Favorov, V.Yu. Makeev, 2010, published in Biofizika, 2010, Vol. 55, No. 6, pp. 976–985.
About this article
Cite this article
Medvedeva, Y.A., Kulakovskii, I.V., Oparina, N.Y. et al. The GC skew near Pol II start sites and its association with SP1-binding site variants. BIOPHYSICS 55, 901–907 (2010). https://doi.org/10.1134/S0006350910060023
DOI: https://doi.org/10.1134/S0006350910060023