Comparative Analysis of GC Content Variations in Plant Genomes

Tropical Plant Biology

Abstract

GC content, an important compositional feature of the genome, varies significantly among genomes and among regions within a genome. Identifying the forces that shaped GC content and deciphering the biological meaning of its variation will help us understand genome evolution. We analyzed and compared the GC contents of 20 selected plant species representing the major evolutionary lineages. Our results revealed the highest GC content and GC heterogeneity in the grass genomes, followed by the non-grass monocot and dicot genomes. Detailed analysis of GC content in genic regions showed higher GC content in terminal exons than in internal exons in all selected species except Volvox carteri. A strong correlation between the GC contents of exons and their neighboring introns at the termini of genes was observed in all the grass genomes and in the Musa acuminata, Spirodela polyrhiza and Nelumbo nucifera genomes. Our results suggest that the widely reported negative gradient of GC3 (GC content at third codon positions) along coding sequences from 5′ to 3′ is likely an artifact of calculating GC content on an admixture of genes with variable lengths and exon numbers. Our findings support a role for GC-biased gene conversion in shaping the nucleotide composition landscapes of monocots. The U-shaped pattern of GC content along genes may have resulted from variable degrees of interaction among the transcription, replication and DNA repair machineries. Transcription-associated recombination might play a major role in GC content evolution.
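For readers unfamiliar with the two quantities the abstract relies on, the following is a minimal sketch of how GC content and GC3 are commonly computed. It is not the authors' pipeline; the function names `gc_content` and `gc3` are hypothetical, and the sketch assumes that ambiguous bases such as N are excluded from the denominator and that the coding sequence starts in frame.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T).

    Assumption: ambiguous symbols such as N are ignored entirely.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else float("nan")


def gc3(cds: str) -> float:
    """GC content at third codon positions.

    Assumption: `cds` begins at the first base of the first codon.
    The slice [2::3] picks bases 3, 6, 9, ... of the sequence.
    """
    return gc_content(cds[2::3])
```

For example, the two codons `ATG GCC` have G and C at their third positions, so `gc3("ATGGCC")` is 1.0 while `gc_content("ATGGCC")` is 4/6.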

Author information

Corresponding author

Correspondence to Qingyi Yu.

Additional information

Communicated by: Paulo Arruda

Electronic supplementary material

Below is the link to the electronic supplementary material.

Sup. Fig. 1

Variation of GC3 content from the 5′ end to the 3′ end in (a) GC poor (GC < 60 %) and (b) GC rich (GC ≥ 60 %) coding sequences of the 20 selected species. The GC contents of grasses and dicots were averaged and are represented as “Grasses_avg” and “Dicot_avg”, respectively. The error bars represent the standard deviation of GC contents among the members of the grasses and dicots. (PDF 1014 kb)

(PDF 966 kb)
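A positional GC3 profile of the kind described for Sup. Fig. 1 could be sketched as follows. This is an illustration under stated assumptions, not the published method: the names `gc3_profile` and `mean_profiles`, the choice of ten equal-width bins, and the use of overall coding-sequence GC content for the 60 % split are all assumptions introduced here.

```python
from statistics import mean


def gc_fraction(seq):
    """GC fraction over A/C/G/T only; None if no such bases are present."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return gc / (gc + at) if gc + at else None


def gc3_profile(cds, n_bins=10):
    """GC3 in n_bins equal slices of the third-position bases, 5' to 3'."""
    thirds = cds[2::3]
    if len(thirds) < n_bins:
        return None  # too short to give every bin at least one codon
    edges = [i * len(thirds) // n_bins for i in range(n_bins + 1)]
    return [gc_fraction(thirds[a:b]) for a, b in zip(edges, edges[1:])]


def mean_profiles(cds_list, n_bins=10, threshold=0.60):
    """Average the per-bin GC3 separately for GC-poor and GC-rich sequences.

    Assumption: a sequence is GC rich when its overall GC fraction is
    at least `threshold` (0.60 mirrors the 60 % cutoff in the caption).
    """
    groups = {"GC_poor": [], "GC_rich": []}
    for cds in cds_list:
        overall = gc_fraction(cds)
        profile = gc3_profile(cds, n_bins)
        if overall is None or profile is None or None in profile:
            continue
        key = "GC_rich" if overall >= threshold else "GC_poor"
        groups[key].append(profile)
    return {
        name: [mean(col) for col in zip(*profiles)] if profiles else []
        for name, profiles in groups.items()
    }
```

Because every sequence is rescaled to the same number of bins, this kind of profile mixes genes of different lengths and exon numbers, which is the admixture the abstract identifies as a likely source of the apparent 5′ to 3′ gradient.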

Sup. Fig. 2–21

Box plots of the GC content of each exon in subsets of genes grouped by number of exons. Genes with the same number of exons were placed in one group, and a box plot was drawn for each subset individually. The first plot for each species was drawn on the admixture of all genes within the species. Within each set, genes were further divided into GC rich (red) and GC poor (blue). Red boxes are missing in some plots because no GC-rich genes with that exon number were found. The exon index is shown on the x-axis and the GC content on the y-axis. Sup. Fig. 2–21 represent the plant species in the following order: P. trichocarpa; A. thaliana; C. papaya; V. vinifera; N. nucifera; S. polyrhiza; P. equestris; P. dactylifera; M. acuminata; A. comosus; S. bicolor; Z. mays; S. italica; O. sativa; B. distachyon; A. trichopoda; P. abies; S. moellendorffii; P. patens; V. carteri. (PDF 5278 kb)

(PDF 5053 kb)

(PDF 4983 kb)

(PDF 5215 kb)

(PDF 5382 kb)

(PDF 5357 kb)

(PDF 5075 kb)

(PDF 5328 kb)

(PDF 5465 kb)

(PDF 5486 kb)

(PDF 5422 kb)

(PDF 5602 kb)

(PDF 5465 kb)

(PDF 5476 kb)

(PDF 5453 kb)

(PDF 4975 kb)

(PDF 4897 kb)

(PDF 5169 kb)

(PDF 5278 kb)

(PDF 5432 kb)
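The grouping behind Sup. Fig. 2–21 (genes binned by exon number, then GC content collected per exon index) can be sketched as below. The input layout, a mapping from a gene identifier to its ordered list of exon sequences, and the function name `exon_gc_by_exon_count` are assumptions made for illustration; the resulting per-index lists are what a box-plot routine would consume.

```python
from collections import defaultdict


def gc_fraction(seq):
    """GC fraction over A/C/G/T only; None if no such bases are present."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return gc / (gc + at) if gc + at else None


def exon_gc_by_exon_count(genes):
    """Group genes by exon number and collect GC content per exon index.

    genes: dict mapping a gene id to its exon sequences in 5' to 3' order
           (hypothetical input layout).
    Returns {exon_count: [values for exon 1, values for exon 2, ...]}.
    """
    grouped = defaultdict(list)
    for exons in genes.values():
        values = [gc_fraction(exon) for exon in exons]
        if values and None not in values:
            grouped[len(exons)].append(values)
    # Transpose each group so that every inner list holds one exon index.
    return {n: [list(col) for col in zip(*rows)] for n, rows in grouped.items()}
```

Keeping genes with different exon numbers in separate groups means that "exon 3" always refers to the same relative position within a group, so terminal and internal exons are never averaged together.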

Sup. Fig. 22

Matrix plot of correlations of GC contents between indexed intron and exon pairs. The exon index is shown on the x-axis and the intron index on the y-axis. Each circle in the plot represents the correlation of GC content between the intron and the exon at the assigned indices. The size of each circle corresponds to the magnitude of the correlation, and the colors represent its direction. Green (r < 0.4) and red (r ≥ 0.4) colors indicate positive correlation, while yellow (r < −0.4) and purple (r ≥ −0.4) represent negative correlation. (PDF 3342 kb)
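A correlation matrix of this kind could be computed as in the following sketch, which assumes Pearson's r (the coefficient named in the captions of Sup. Fig. 24 and 25) and a set of genes that all share the same exon number. The names `pearson_r` and `exon_intron_correlations` and the input layout are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient; None if it is undefined."""
    n = len(xs)
    if n < 2 or n != len(ys):
        return None
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None  # a constant column has no defined correlation
    return sxy / sqrt(sxx * syy)


def exon_intron_correlations(genes):
    """Correlate GC content of every intron index with every exon index.

    genes: list of (exon_gc_values, intron_gc_values) tuples, one per gene,
           all with the same number of exons (hypothetical input layout).
    Returns matrix[i][j] = r between intron i and exon j across genes.
    """
    exon_cols = list(zip(*(exons for exons, _ in genes)))
    intron_cols = list(zip(*(introns for _, introns in genes)))
    return [[pearson_r(intron, exon) for exon in exon_cols]
            for intron in intron_cols]
```

Each cell of the returned matrix corresponds to one circle in the plot: its absolute value would set the circle size and its sign the color family.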

Sup. Fig. 23

Matrix plot of correlations of GC contents between indexed intron and exon pairs in a subset of genes with 15 exons. The exon index is shown on the x-axis and the intron index on the y-axis. Each circle in the plot represents the correlation of GC content between the intron and the exon at the assigned indices. The size of each circle corresponds to the magnitude of the correlation, and the colors represent its direction. Green (r < 0.4) and red (r ≥ 0.4) colors indicate positive correlation, while yellow (r < −0.4) and purple (r ≥ −0.4) represent negative correlation. (PDF 2436 kb)

Sup. Fig. 24

Scatterplots of intron GC content (y-axis) against exon GC content (x-axis) for all 20 selected genomes. Genes longer than 5000 nt are shown in shades of red and shorter genes in shades of blue. The density of the colors corresponds to the number of genes plotted in the area. Pearson’s correlation coefficients (r) between the GC contents for large and small genes are given below each window. (PDF 5832 kb)

Sup. Fig. 25

Scatterplot of the cumulative length of introns in a gene (y-axis) against the average GC content of exons in the corresponding gene (x-axis). Genes containing 10 or more introns are shown in shades of red and genes with fewer than 10 introns in shades of blue. The density of the colors corresponds to the number of genes plotted in the area. Pearson’s correlation coefficients (r) between the intron length and exon GC content are given below each window. (PDF 5748 kb)

About this article

Cite this article

Singh, R., Ming, R. & Yu, Q. Comparative Analysis of GC Content Variations in Plant Genomes. Tropical Plant Biol. 9, 136–149 (2016). https://doi.org/10.1007/s12042-016-9165-4

  • DOI: https://doi.org/10.1007/s12042-016-9165-4
