An Evolutionary Footprint of Age-Related Natural Selection in Mitochondrial DNA
- Cite this article as:
- Min, X.J. & Hickey, D.A. J Mol Evol (2008) 67: 412. doi:10.1007/s00239-008-9163-8
By comparing mtDNA sequences between different orders of mammals, we show that both longevity and generation time are significantly correlated with the nucleotide content of the mtDNA. Specifically, there is a positive correlation between generation time and mt GC content. This correlation is repeated, at a finer evolutionary scale, within the primates. Moreover, a comparison of human and chimpanzee mtDNAs shows that the effect has been very pronounced during the short evolutionary period since the divergence of these two species, with human mtDNA showing a GC-biased pattern of substitution at the variable sites. In addition to these DNA sequence patterns, comparisons between the human and the chimp mt protein sequences also revealed a surprisingly high substitution rate for threonine residues, resulting in a reduction of threonine in the human mt proteome. These patterns of both DNA and protein evolution can be explained by a balance between AT-biased mutational pressure and age-related purifying selection.
Keywords: Mitochondrial DNA, Mammals, Aging, Generation time, Longevity, Threonine
Mitochondrial (mt) dysfunction is a significant cause of human disease, especially age-related diseases (for reviews see Wallace 2005; Trifunovic 2006; Conley et al. 2008; Passos et al. 2007; Hiona and Leeuwenburgh 2008). Thus, mt function may be an important determinant of lifespan in both humans and other species.
Current research on the link between mt function and aging has two main foci. The primary focus is on the molecular mechanisms that underlie mt malfunction in aging individuals (Wallace 2005; Hiona and Leeuwenburgh 2008). The second research focus is on the evolution of mt genes in response to natural selection for extended lifespan (de Magalhães 2005; Lehmann et al. 2008; Moosmann and Behl 2008). This second line of research could also yield important insights into the aging process because, if we understood the genetic changes that extend lifespan at the interspecific level, we could ask whether similar changes could contribute to intraspecific differences in longevity.
In this study, we analyzed the correlation between the nucleotide content of mt genomes and both generation time and longevity in a wide variety of mammals. In addition to doing a broad survey of all available mammalian mt sequences, we also compared species within the primates and made a detailed comparison of human and chimpanzee mt sequences.
Data and Methods
A total of 206 complete mt genome (mtDNA) sequences of mammals with detailed gene annotation in GenBank format were retrieved from the NCBI RefSeq organelle genome database (February 2007 release) using Entrez (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi). For each of the 206 species, we manually collected data on the age at female sexual maturity (hereafter referred to as generation time) and maximum longevity from the AnAge database of the Human Ageing Genomic Resources (http://genomics.senescence.info/species/) (de Magalhães et al. 2005). Among the 206 species, 164 species (140 Eutheria, 22 Metatheria, and 2 Monotremata) have generation time data available and 128 species have maximum longevity data available (Supplementary Table S1).
The DNA sequences of protein coding genes were extracted from each of the mtDNA sequences. We wrote a computer script to calculate the nucleotide frequencies of each of the mt genomes, 13 common (conserved) protein coding genes (COI, COII, COIII, ND1-6, ND4L, COB, ATP6, and ATP8), and noncoding DNA sequences. The GC and AT asymmetry is measured in terms of GC and AT skews according to the following formulae: GC skew = (G – C)/(G + C), and AT skew = (A – T)/(A + T), where C, G, A, and T are the occurrences of the four nucleotides (Perna and Kocher 1995), based on the DNA sequences of the major coding strand of the mtDNA. The 12 conserved protein sequences encoded on this strand of human and chimpanzee mtDNA were aligned using CLUSTALW (Thompson et al. 1997) within the Mega3 package (Kumar et al. 2004). These protein coding DNA sequences were aligned based on prealigned protein sequences to avoid codon disruption by alignment gaps. The aligned DNA and protein sequences were then concatenated for site-by-site comparison and for generation of the nucleotide and amino acid replacement matrices.
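The GC content and skew formulae above can be illustrated with a minimal Python sketch. It is not the script used in the study; the function names and the example sequence are made up for illustration, and ambiguous symbols are simply ignored, which is an assumption.

```python
from collections import Counter

def base_counts(seq):
    """Count A, C, G, and T, ignoring case and any other symbols."""
    counts = Counter(seq.upper())
    return {base: counts.get(base, 0) for base in "ACGT"}

def gc_content(seq):
    """Fraction of unambiguous bases that are G or C."""
    c = base_counts(seq)
    total = sum(c.values())
    return (c["G"] + c["C"]) / total if total else 0.0

def skews(seq):
    """Return (GC skew, AT skew) = ((G - C)/(G + C), (A - T)/(A + T))."""
    c = base_counts(seq)
    gc = c["G"] + c["C"]
    at = c["A"] + c["T"]
    gc_skew = (c["G"] - c["C"]) / gc if gc else 0.0
    at_skew = (c["A"] - c["T"]) / at if at else 0.0
    return gc_skew, at_skew

# Made-up 20-base sequence standing in for a coding-strand fragment.
example = "ACCCTTAACCGTACCATTCC"
print(gc_content(example), skews(example))
```

A negative GC skew, as in this toy sequence, means that C outnumbers G on the strand being read.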
Tests of phylogenetically independent contrasts were performed using the Contrast program in the PHYLIP package (http://evolution.genetics.washington.edu/phylip/general.html). The phylogenetic tree, based on nuclear 18S rRNA sequences, was built by the neighbor joining (NJ) method after ClustalW alignment using the Mega3 package.
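For readers unfamiliar with independent contrasts, the sketch below shows the standard calculation on a fully resolved tree with branch lengths. It is a hedged illustration of the general technique, not the PHYLIP Contrast implementation; the tuple-based tree encoding, the four-species tree, and the trait values are all hypothetical.

```python
import math

def pic(node, trait, out):
    """Post-order pass returning (trait value, adjusted branch length).

    A tip is (name, branch_length); an internal node is
    (left_subtree, right_subtree, branch_length).
    """
    if isinstance(node[0], str):
        name, length = node
        return trait[name], length
    left, right, length = node
    x1, v1 = pic(left, trait, out)
    x2, v2 = pic(right, trait, out)
    # Standardized contrast between the two daughter lineages.
    out.append((x1 - x2) / math.sqrt(v1 + v2))
    # Weighted ancestral estimate; the branch below it is lengthened to
    # account for the uncertainty of that estimate.
    value = (x1 / v1 + x2 / v2) / (1 / v1 + 1 / v2)
    return value, length + v1 * v2 / (v1 + v2)

def contrasts(tree, trait):
    """Return the n - 1 standardized contrasts for n tips."""
    out = []
    pic(tree, trait, out)
    return out

def origin_correlation(cx, cy):
    """Correlation of two sets of contrasts, computed through the origin."""
    sxy = sum(a * b for a, b in zip(cx, cy))
    return sxy / math.sqrt(sum(a * a for a in cx) * sum(b * b for b in cy))

# Hypothetical tree ((A:1,B:1):1,(C:1,D:1):1) with made-up trait values.
tree = ((("A", 1.0), ("B", 1.0), 1.0), (("C", 1.0), ("D", 1.0), 1.0), 0.0)
trait_x = {"A": 1.0, "B": 3.0, "C": 5.0, "D": 9.0}
trait_y = {"A": 2.0, "B": 6.0, "C": 10.0, "D": 18.0}

cx = contrasts(tree, trait_x)
cy = contrasts(tree, trait_y)
print(cx, origin_correlation(cx, cy))
```

The sign of each contrast depends on the arbitrary ordering of the two daughter lineages, which is why the correlation is computed through the origin rather than around the mean.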
Table: Nucleotide content in human and chimpanzee mtDNA. GC content (%) is reported for all aligned sites (n = 10,836), invariant sites only (n = 9770), and variable sites only (n = 1066).
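The site classes in this table can be derived from a pairwise alignment by separating columns where the two species agree from columns where they differ. The following Python sketch illustrates that partition under the assumption that columns containing gaps or ambiguous bases are skipped; the function names and the two short sequences are invented for the example and are not real mtDNA.

```python
def gc_percent(bases):
    """GC content in percent for a list of unambiguous bases."""
    gc = sum(b in "GC" for b in bases)
    return 100.0 * gc / len(bases) if bases else float("nan")

def gc_by_site_class(seq_a, seq_b):
    """Return {class: (n_sites, GC% in seq_a, GC% in seq_b)}."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    classes = {"all": ([], []), "invariant": ([], []), "variable": ([], [])}
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous symbols (an assumption)
        key = "invariant" if a == b else "variable"
        for k in ("all", key):
            classes[k][0].append(a)
            classes[k][1].append(b)
    return {k: (len(v[0]), gc_percent(v[0]), gc_percent(v[1]))
            for k, v in classes.items()}

# Made-up aligned fragments, one per species.
seq_one = "ACGTCCGA-T"
seq_two = "ACGTTCAA-T"
result = gc_by_site_class(seq_one, seq_two)
for name, (n, gc_a, gc_b) in result.items():
    print(name, n, round(gc_a, 1), round(gc_b, 1))
```

By construction the two sequences have identical GC content at invariant sites, so any difference in overall GC content between them comes from the variable sites.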
Since it is well known that the nucleotide composition of mammalian mtDNAs shows significant strand asymmetry, we asked whether the degree of asymmetry was also correlated with generation time. We found that there was a significant negative correlation between GC skew and generation time for the major coding strand of the mtDNAs (see Supplementary Fig. S1). Both the positive correlations with GC content and the negative correlation with GC skew can be explained when we look at the frequencies of individual nucleotides on the coding strand (Supplementary Fig. S2). Essentially, the increase in GC content with generation time is almost entirely due to an increase in the frequency of C and a decrease in the frequency of T nucleotides on the coding strand; there is relatively little change in the frequencies of A and G nucleotides on the major coding strand. We will see this pattern repeated below for the comparison of the human and chimpanzee mtDNAs.
In this analysis, we used the age of female sexual maturity as a measure of lifespan. An alternative measure would be maximum longevity, and in fact, these two measures are highly correlated (r = 0.86) (see Supplementary Fig. S5). Given this high correlation between these two measures of lifespan, it is not surprising that we also found a high correlation between nucleotide content and maximum longevity (see Supplementary Fig. S6). We chose to use the age of sexual maturity for most of our analyses because this measure is available for more species and because it is based on more extensive measurements.
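The correlation coefficient quoted above is of the kind computed in the sketch below, which implements the Pearson product-moment formula directly. The text does not state whether the life-history values were transformed before correlation, so the sketch uses raw values; the numbers are invented and are not AnAge data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up values: age at sexual maturity (days) and longevity (years).
maturity_days = [40, 300, 1200, 4000]
longevity_years = [3, 12, 30, 90]
print(pearson_r(maturity_days, longevity_years))
```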
A previous study by Gibson et al. (2005) looked at the variation in nucleotide content among mammalian mt sequences, and they noted that changes in base composition between lineages can be attributed, in large part, to shifts between the proportions of C and T on the major coding strand. That is entirely consistent with our results, but we have added the further observation that these fluctuations in the proportions of C and T nucleotides are correlated with lifespan. Other studies have looked for correlations between the mt nucleotide content and the biological properties of the species, most notably the correlation between nucleotide content and basal metabolic rate (Martin 1995, 1999). This led us to ask if there was also a high correlation between lifespan and metabolic rate, but we found that this was not true, as has recently been shown by de Magalhães et al. (2007). When we look at the data in Fig. 1a, however, we see that groups with a high resting metabolic rate, such as the lagomorphs (Hayssen and Lacy 1985), have a GC content that is somewhat higher than expected based on their lifespan, while those with a low metabolic rate, such as the Cetacea (Bismuto et al. 1984), have correspondingly reduced GC contents. This suggests that both metabolic rate and lifespan affect the mtDNA independently, with high metabolic rate combined with long lifespan having the maximal effect.
From our data, it is clear that the changes in mtDNA content are not merely a reflection of selection on its coding capacity. This can best be illustrated by the fact that the nucleotide changes at the largely synonymous third codon positions show the effect most strongly (see Supplementary Fig. S7). This is consistent with the accepted view that the major problem affecting mt function in older individuals is oxidative damage of mtDNA regardless of its coding function. The fact that all of the 13 mt protein genes show these correlations when analyzed separately (Supplementary Fig. S8) is a further indication that the effect is not related to the expression of any particular gene product.
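GC content at each codon position, including the largely synonymous third position discussed above, can be computed from an in-frame coding sequence as in this minimal sketch. The function name and the 12-base example are hypothetical, and the sketch assumes the sequence starts at the first base of a codon.

```python
def gc_by_codon_position(cds):
    """Return GC fractions at codon positions 1, 2, and 3.

    Trailing bases that do not complete a codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    fractions = []
    for pos in range(3):
        bases = [b for b in cds[pos:usable:3] if b in "ACGT"]
        gc = sum(b in "GC" for b in bases)
        fractions.append(gc / len(bases) if bases else float("nan"))
    return tuple(fractions)

# Made-up in-frame fragment: ATG ACC CTA GGC.
example_cds = "ATGACCCTAGGC"
print(gc_by_codon_position(example_cds))
```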
Our results lead us to two obvious questions. The first question is why the correlation between mt nucleotide content and lifespan exists, and the second question is what insight this correlation can give us into the role of mt malfunction in the aging process. While it is tempting to suggest that a higher GC content in mtDNA protects it against oxidative damage, it is more likely that the increase in GC content is the consequence rather than the cause of reduced DNA damage. The idea that a higher GC content is the result of reduced oxidative damage fits with the observation that oxidation of nucleotides promotes GC-to-AT substitutions (Pinz et al. 1995; Stevnsner et al. 2002; Kalam et al. 2006). Thus, the higher GC content reflects a lower substitution rate due to reduced DNA damage. Moreover, this notion fits well with the recent observation of reduced synonymous site variation in larger, long-lived mammals (Nabholz et al. 2008).
A partial answer to the second question (i.e., what provides a higher degree of protection to the mtDNA in the longer-lived animals) may come from our detailed comparison of human and chimpanzee mt sequences. The fact that a majority of the protein differences involve a threonine substitution led us to ask whether there is any evidence that threonine substitutions might be involved in age-related selection. First, we can discount the possibility that the decreased number of threonines in the human lineage is simply a reflection of the increased GC content. Although nucleotide content does affect mt amino acid content (Foster et al. 1997), the correlation between the GC content and the frequency of threonine residues is generally positive (Urbina et al. 2006), whereas in the human-chimpanzee comparison we see a decrease in threonine residues despite an increase in overall GC content. There is also independent experimental evidence for the action of selection in this case. For example, a number of studies have identified fixed differences between human mt mutations that are associated with age-related diseases and identical, nondisease residues at the same sites in other species (de Magalhães 2005). Several of these fixed differences have a threonine residue in the nonhuman species. For example, in the case of Leber’s hereditary optic neuropathy, two of the five fixed differences involve a threonine residue, and in both cases the threonine occurs in the nonhuman species (de Magalhães 2005). More direct evidence comes from the finding that threonine metabolites can promote oxidative damage of mtDNA (Dutra and Bechara 2004). These findings provide independent support for the view that threonine residues have been selected against since the divergence of humans and chimpanzees, and that this selection is directly related to the extension of lifespan in the human lineage.
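The threonine comparison described above amounts to tabulating the differing residue pairs in an aligned pair of protein sequences and asking how the count of one residue changes between them. The sketch below shows one way to do this in Python; the function names and the two short peptides are invented, and a pairwise comparison alone cannot determine on which lineage a substitution occurred.

```python
from collections import Counter

def replacement_counts(prot_a, prot_b):
    """Count (residue in A, residue in B) pairs at differing, gap-free sites."""
    if len(prot_a) != len(prot_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = Counter()
    for a, b in zip(prot_a.upper(), prot_b.upper()):
        if a != b and a != "-" and b != "-":
            pairs[(a, b)] += 1
    return pairs

def net_change(pairs, residue):
    """Count of `residue` in sequence B minus sequence A over differing sites."""
    gained = sum(n for (a, b), n in pairs.items() if b == residue)
    lost = sum(n for (a, b), n in pairs.items() if a == residue)
    return gained - lost

# Made-up aligned peptides; "T" is threonine in one-letter code.
peptide_a = "MTTLATSI"
peptide_b = "MATLAISM"
pairs = replacement_counts(peptide_a, peptide_b)
involving_thr = sum(n for pair, n in pairs.items() if "T" in pair)
print(dict(pairs), involving_thr, net_change(pairs, "T"))
```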
This selection at the protein level also explains the recent finding that the rate of protein sequence evolution in larger mammals is higher than that in smaller animals (Popadin et al. 2007; Rottenberg 2007), despite the fact that the difference between large, longer-lived and small, shorter-lived mammals is in the opposite direction for the synonymous sites (Nabholz et al. 2008).
Recently, Moosmann and Behl (2008) have reported that increased lifespan is correlated with a depletion in cysteine residues in the mt proteome. We did not find a reduction in cysteine in the human mt proteome relative to that of the chimpanzee (see Fig. 3), but it should be noted that there are only a total of two substitutions involving cysteine between these two closely related species. The results of Moosmann and Behl are based on very broad phylogenetic comparisons. Such broad comparisons enable greater statistical power based on the larger differences, but on the other hand, they are also complicated by the very marked differences in nucleotide composition between vertebrate and invertebrate mitochondria.
Our results are also entirely consistent with the recent results of Lehmann et al. (2008). These authors propose that mammalian maximum lifespan is correlated with a combination of two factors: resting metabolic rate and mt GC content. This fits very nicely with our proposal that variations in GC content are reflections of varying rates of DNA damage, at least some of which are mediated by the metabolism of specific amino acids such as threonine.
Although there is an apparent increase in mt GC content with increasing lifespan, it should be noted that all of the mammalian mt genomes are GC-poor in absolute terms (GC content <50%). It would be more accurate to say that increased levels of age-related natural selection in long-lived species mitigate the mutationally based tendency of mt genomes to become increasingly AT-rich, i.e., GC-poor.
In summary, our results show two effects. First, the nucleotide composition of mtDNA shows evidence of reduced oxidative damage in the mitochondria of long-lived mammals. Second, the changes in the human mt protein sequences provide a partial answer about how this reduction is achieved, specifically the selective replacement of amino acids that promote oxidative damage of DNA. It is likely that this is only one among several such mechanisms that contribute to the preservation of mt function in long-lived mammals.
This research was supported by a Discovery Grant from NSERC Canada to D.A.H.
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.