Abstract
AT-rich DNA is mostly associated with condensed chromatin, whereas GC-rich sequences are preferentially located in dispersed chromatin. AT-rich genes tend to be tissue-specific (silenced in most tissues), while GC-rich genes tend to be housekeeping (expressed in many tissues). This paper reports another important property of DNA base composition, one that can affect the repertoire of genes with high AT content: GC-rich sequences are more liable to mutation. We found that the Spearman correlation between human gene GC content and mutation probability is above 0.9. A change of base composition, even at synonymous sites, affects the mutation probability of nonsynonymous sites and thus of the encoded proteins. There is a unique type of housekeeping gene that is especially hazardous when prone to mutation. Natural selection, which usually removes deleterious mutations, only increases the hazard in the case of these genes, because it can descend to the suborganismal (cellular) level. These are cell cycle-related genes. In accordance with the proposed concept, they have a low GC content at synonymous sites despite being housekeeping genes. Gene-centred protein interaction enrichment analysis (PIEA) revealed core clusters of genes whose interactants are modularly enriched in genes with AT-rich synonymous codons. This interconnected network is involved in double-strand break repair, DNA integrity checkpoints and chromosome pairing at mitosis. Damage to these genes results in genome and chromosome instability, leading to cancer and other ‘error catastrophes’. By reducing nonsynonymous mutations, the usage of AT-rich synonymous codons can decrease the probability of cancer more than 20-fold.
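As a rough illustration of the two quantities named in the abstract, the sketch below computes GC content at third codon positions (used here as a simple proxy for synonymous-site GC content) and a Spearman rank correlation. It is not the authors' pipeline: the function names (`gc_content`, `gc3`, `ranks`, `spearman`), the toy coding sequences and the per-gene mutation values are all hypothetical placeholders, and how mutation probability is actually estimated is not described in this section.

```python
from statistics import mean


def gc_content(seq: str) -> float:
    """Fraction of G or C among the unambiguous bases (A, C, G, T) of seq."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A/C/G/T bases")
    return sum(b in "GC" for b in bases) / len(bases)


def gc3(cds: str) -> float:
    """GC fraction at third codon positions of an in-frame coding sequence."""
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return gc_content(cds[2:usable:3])


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical toy data: invented sequences and invented mutation values.
toy_cds = ["ATGAAATTTGAA", "ATGGCAAAGGAT", "ATGGCCGAGCTG"]
toy_mutation_values = [0.10, 0.25, 0.40]  # placeholders, not measurements

gc3_values = [gc3(s) for s in toy_cds]
rho = spearman(gc3_values, toy_mutation_values)
print(gc3_values, rho)
```

Rank-based correlation is used because it only assumes a monotonic relationship between base composition and mutation probability, not a linear one.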
Acknowledgements
We thank two anonymous reviewers for helpful comments. This work was supported by the Russian Scientific Foundation (Grant number 14-50-00068).
Ethics declarations
Conflict of interest
The authors declare that they have no conflicts of interest.
About this article
Cite this article
Vinogradov, A.E., Anatskaya, O.V. DNA helix: the importance of being AT-rich. Mamm Genome 28, 455–464 (2017). https://doi.org/10.1007/s00335-017-9713-8
DOI: https://doi.org/10.1007/s00335-017-9713-8