Substantial Regional Variation in Substitution Rates in the Human Genome: Importance of GC Content, Gene Density, and Telomere-Specific Effects

Journal of Molecular Evolution

Abstract

This study presents the first global, 1-Mbp-level analysis of patterns of nucleotide substitutions along the human lineage. The study is based on the analysis of a large number of repetitive elements deposited into the human genome since the mammalian radiation, yielding a number of results that would have been difficult to obtain using the more conventional comparative method of analysis. This analysis revealed substantial and consistent variability in rates of substitution, with the variability ranging up to twofold among different regions. The rates of substitution of C or G nucleotides with A or T nucleotides vary much more sharply than the reverse rates, suggesting that much of that variation is due to differences in mutation rates rather than in the probabilities of fixation of C/G vs. A/T nucleotides across the genome. For all types of substitution we observe substantially more hotspots than coldspots, with hotspots showing substantial clustering over tens of Mbp. Our analysis revealed that the GC content of surrounding sequences is the best predictor of the rates of substitution. The pattern of substitution appears very different near telomeres compared to the rest of the genome and cannot be explained by the genome-wide correlations of the substitution rates with GC content or exon density. The telomere pattern of substitution is consistent with natural selection or biased gene conversion acting to increase the GC content of sequences within 10–15 Mbp of the telomere.
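The windowed comparison summarized in the abstract can be illustrated with a minimal Python sketch. This is not the authors' method, which estimates substitution rates from repetitive elements. The function names `gc_content`, `windowed_gc`, and `pearson` are hypothetical, and the sequence and rate values are made-up placeholders chosen only to show the mechanics of correlating per-window GC content with a per-window rate.

```python
from math import sqrt
from typing import List, Sequence


def gc_content(seq: str) -> float:
    """Fraction of G or C among unambiguous bases (A, C, G, T) in seq."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else float("nan")


def windowed_gc(seq: str, window: int) -> List[float]:
    """GC content of consecutive non-overlapping windows.

    A trailing partial window is dropped. In the paper the window is 1 Mbp;
    here it is a free parameter so the toy example stays small.
    """
    return [
        gc_content(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, window)
    ]


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Toy input: three windows of 4 bases each (placeholder, not real data).
toy_sequence = "GGCCATATGCAT"
gc_per_window = windowed_gc(toy_sequence, window=4)

# Placeholder per-window rates, invented purely for illustration.
toy_rates = [3.0, 1.0, 2.0]
r = pearson(gc_per_window, toy_rates)
```

With real data, `gc_per_window` would come from 1-Mbp windows of the genome and `toy_rates` would be replaced by rates estimated for the same windows; the sketch makes no claim about the sign or strength of the correlation reported in the article.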

Acknowledgments

T.H. is supported by the NSF through Grants 0211308, 0216576, and 0225630. D.P. is supported by NSF Grant DEB-0317171, the Terman Award, and the Alfred P. Sloan Fellowship in Computational Molecular Biology. P.A. and D.P. are grateful for the hospitality of the Center for Theoretical Biological Physics at UCSD, where extensive discussions of this research took place.

Author information

Corresponding author

Correspondence to Peter F. Arndt.

Additional information

Reviewing Editor: Dr. Jerzy Jurka

About this article

Cite this article

Arndt, P.F., Hwa, T. & Petrov, D.A. Substantial Regional Variation in Substitution Rates in the Human Genome: Importance of GC Content, Gene Density, and Telomere-Specific Effects. J Mol Evol 60, 748–763 (2005). https://doi.org/10.1007/s00239-004-0222-5

  • DOI: https://doi.org/10.1007/s00239-004-0222-5
