Phylogenetic analysis of mitochondrial substitution rate variation in the angiosperm tribe Sileneae
- First Online:
Recent phylogenetic studies have revealed that the mitochondrial genome of the angiosperm Silene noctiflora (Caryophyllaceae) has experienced a massive mutation-driven acceleration in substitution rate, placing it among the fastest evolving eukaryotic genomes ever identified. To date, it appears that other species within Silene have maintained more typical substitution rates, suggesting that the acceleration in S. noctiflora is a recent and isolated evolutionary event. This assessment, however, is based on a very limited sampling of taxa within this diverse genus.
We analyzed the substitution rates in 4 mitochondrial genes (atp1, atp9, cox3 and nad9) across a broad sample of 74 species within Silene and related genera in the tribe Sileneae. We found that S. noctiflora shares its history of elevated mitochondrial substitution rate with the closely related species S. turkestanica. Another section of the genus (Conoimorpha) has experienced an acceleration of comparable magnitude. The phylogenetic data remain ambiguous as to whether the accelerations in these two clades represent independent evolutionary events or a single ancestral change. Rate variation among genes was equally dramatic. Most of the genus exhibited elevated rates for atp9 such that the average tree-wide substitution rate for this gene approached the values for the fastest evolving branches in the other three genes. In addition, some species exhibited major accelerations in atp1 and/or cox3 with no correlated change in other genes. Rates of non-synonymous substitution did not increase proportionally with synonymous rates but instead remained low and relatively invariant.
The patterns of phylogenetic divergence within Sileneae suggest enormous variability in plant mitochondrial mutation rates and reveal a complex interaction of gene and species effects. The variation in rates across genomic and phylogenetic scales raises questions about the mechanisms responsible for the evolution of mutation rates in plant mitochondrial genomes.
Studies of rate accelerations in plant mitochondrial genomes have consistently shown that these effects are most pronounced at so-called synonymous sites, which do not affect the corresponding amino acid sequence (e.g. ). One of the pillars of the neutral theory of molecular evolution is that the rate of neutral substitutions (i.e. those with no fitness effect) is expected to equal the mutation rate . Synonymous substitutions are not completely neutral, however. They are subject to a variety of selection pressures including translational efficiency, mRNA stability and the conservation of regulatory motifs (reviewed in ), and direct measurements of mutation rates can be more than an order of magnitude higher than those estimated from synonymous substitution rates . Nevertheless, synonymous sites still offer one of our best approximations of the underlying mutation rate. Therefore, considering the absence of well-supported alternative hypotheses, the extreme synonymous substitution rates observed in certain plant mitochondrial genomes are most likely a result of mutational acceleration.
Silene noctiflora (Caryophyllaceae) is a recent addition to a growing list of angiosperms exhibiting major accelerations in mitochondrial synonymous substitution rate [4, 7]. In other well-documented examples (e.g. Plantago and Pelargonium), rate accelerations appear relatively old (ca. 30-80 million years) having preceded the divergence of large clades or even an entire genus . In contrast, the extreme mitochondrial substitution rates of S. noctiflora appear unique relative to other Silene species, suggesting a very recent acceleration. Estimates of mitochondrial substitution rate, however, are available for only a few Silene species, representing a tiny fraction of this large and diverse genus. The sparse sampling severely limits the phylogenetic resolution to detect historical changes in substitution rate.
Sampled species and voucher information.
Agrostemma githago L.
D. Sloan 001 (VPI)
Atocion lerchenfeldianum (Baumg.) M. Popp
Strid 24875 (GB)
Eudianthe laeta (Aiton) Rchb. ex Wilk.
Strandhede et al. 690 (GB)
Heliosperma pusillum (Waldst. & Kit.) Rchb.
E. Zogg ZH 1438 (Z)
Lychnis coronaria (L.) Desr.
N/A. Collected by D. Sloan. Charlottesville, VA, USA
Petrocoptis pyrenaica A.Br.
Schneeweiss et al. 6549 (WU)
Silene acaulis (L.) Jacq.
*Schneeweiss 5315 (WU)
Silene acutifolia Link ex Rohrb.
Rothmaler 13691 (S)
Silene akinfievii Schmalh.
Portenier 3814 (LE)
Silene ammophila Boiss. & Heldr.
Raus 7631 (GB)
Silene antirrhina L.
N/A. Collected by D. Sloan. Kellog, MN, USA
Silene argentina (Pax) Bocquet
M. Popp 2005-11-11 (GB)
Silene armena Boiss.
B. Oxelman 2436 (GB)
Silene auriculata Sibth. & Sm.
Baden & Franzén 795 (Strid)
Silene bellidifolia Jacq.
Strid et al. 35179 (Strid)
Silene caesia Sm.
Baden 1114 (Strid)
Silene caryophylloides (Poir) Otth
Görk et al. 2436 (Strid)
Silene ciliata Pourr.
Franzén et al. 822 (Strid)
Silene conica L.
P. Erixon 70 (UPS)
Silene conoidea L.
A. Rautenberg 290 (GB)
Silene cordifolia All.
Lippert & Merxmüller 17265 (Strid)
Silene davidii (Franch.) Oxelman & Lidén
F. Eggens 85 (UPS)
Silene delicatula Boiss.
B. Oxelman 2456 (GB)
Silene dichotoma Ehrh.
W. Till 17.7.2004 (WU)
Silene douglasii var. oraria (M. Peck) C.L. Hitchc. & Maguire
*N/A. Collected by S. Kephart. Cascade Head, OR, USA
Silene flavescens Waldst. & Kit.
Strid & Papanicolaou 15820 (Strid)
Silene fruticosa L.
B. Oxelman & Tollsten 934 (GB)
Silene gallica L.
D. Sloan 002 (VPI)
Silene gallinyi Heuff. ex Rchb.
Strid & Hansen 9283 (Strid)
Silene gracilicaulis C.L. Tang
Smith 11346 (UPS)
Silene hookeri Nutt. subsp. hookeri
F. Schwartz 107 (WTU)
Silene imbricata Desf.
B. Oxelman 1881 (GB)
Silene integripetala Bory & Chaub.
B. Oxelman 1902 (GB)
Silene involucrata (Cham. & Schltdl.) Bocquet
F. Eggens 7 (UPS)
Silene khasyana Rohrb.
Einarsson et.al 3025 (UPS)
Silene lacera (Stev.) Sims
Schönswetter & Tribsch Iter Georgicum 51 (WU)
Silene laciniata subsp. californica (Durand) J.K. Morton
Schwartz 102-2 (WTU)
Silene latifolia Poir.
*N/A. Collected by J. Greimler. Vienna, Austria
Silene littorea Brot.
P. Erixon 74 (UPS)
Silene macrodonta Boiss.
B. Oxelman 2441 (GB)
Silene menziesii Hook.
Kruckeberg 3436 (WTU)
Silene moorcroftiana Wall. ex Benth
B. Dickoré 17783 (Dickoré)
Silene multicaulis Guss.
Strid & Hansen 9954 (Strid)
Silene muscipula subsp. deserticola Murb.
*Chevalier 548 (WU)
Silene nana Kar. & Kir.
Kereverzova & Mekeda 1976.V.5 (LECB)
Silene nicaeensis All.
D. Sloan 005 (VPI)
Silene noctiflora L.
D. Sloan 003 (VPI)
Silene nutans L.
*Larsen, Larsen & Jeppesen 196 (S)
Silene odontopetala Fenzl
Görk et al. 23817 (Strid)
Silene otites (L.) Wibel
A. Rautenberg 83 (UPS)
Silene paradoxa L.
W. & S. Till 21 July 2002 (WU)
Silene paucifolia Ledeb.
H. Solstad & Elven 04/1384 (O)
Silene pendula L.
A. Rautenberg 289 (GB)
Silene pygmaea Adams
Amirkhanov 22.VI-1977 MW)
Silene repens Patrin
Argus 1068 (UPS)
Silene sachalinensis F. Schmidt
Popov 1949.VII.8 (LE)
Silene samia Melzh. & Christod
B. Oxelman 2208 (UPS)
Silene samojedora (Sambuk) Oxelman
H. Solstad, R. Elven SUP-04-3871 (O)
Silene schafta S.G. Gmel. ex Hohen.
M. Popp 1053 (UPS)
Silene schwarzenbergeri Halácsy
Hartvig & Christiansen 8167 (Strid)
Silene seoulensis Nakai
Hong & Han 13420001 (UPS)
Silene sordida Hub.-Mor. & Reese
B. Oxelman 2206 (GB)
Silene sorensenis (B. Boivin) Bocquet
F. Eggens 48 (UPS)
Silene stellata (L.) W.T. Aiton
N/A. Collected by D. Sloan. Giles County, VA, USA
Silene succulenta Forssk.
Strid & Kit Tan 55028 (Strid)
Silene tunicoides Boiss.
Carlström 5970 (Strid)
Silene turkestanica Regel
K. Kiseleva 20.VI.1970 (MW)
Silene uniflora Roth
P. Erixon 73 (UPS)
Silene vittata Stapf
B. Oxelman 2390 (UPS)
Silene vulgaris (Moench) Garcke
*N/A. Collected by M. Dzhus. Minsk, Belarus
Silene yemensis Deflers
Hepper 5792 (WU)
Silene zawadzkii Herbich
B. Oxelman 2241 (GB)
Viscaria alpina (L.) G. Don
B. Frajman & Schönswetter 11415 (LJU)
Viscaria vulgaris Bernh.
P. Schönswetter & B. Frajman 11097 (LJU)
To compare absolute substitution rates in a gene across lineages requires an estimate of the genealogy with dated nodes (i.e. divergence times). In cases of extreme rate variation, generating such a tree directly from the gene in question is problematic. With rate variation, slowly-evolving taxa can be difficult to resolve, and long branch attraction can favor incorrect topologies . Even with an accurate topology, rate variation can bias the estimate of divergence times with molecular clock based methods. For this reason, previous studies of substitution rate variation in plant mitochondrial genomes have constrained their analyses based on phylogenies and divergence times inferred from nuclear and chloroplasts sequences.
Because both mitochondrial and chloroplast genomes are predominantly maternally inherited in Silene, they are expected to share a common genealogy [23, 24] (although breakdowns in uniparental inheritance may potentially disrupt this relationship [25, 26, 27]). Therefore, we chose the chloroplast gene matK to estimate phylogenetic relationships and divergence times. This gene has proven to be highly informative in phylogenetic reconstruction, partly because of its high rates of substitution [28, 29]. It has also been used in two recent analyses of divergence times within Silene and the Caryophyllaceae [4, 30].
We identified substantial rate accelerations in multiple lineages within the Silene phylogeny as well as major rate differences among mitochondrial genes. Here, we discuss the complex patterns of mitochondrial rate variation in the genus Silene and the implications they have for the evolution of mitochondrial mutation rates and the patterns of selection on mtDNA at the sequence level.
Chloroplast DNA phylogeny
We used three different dating methods, which produced roughly similar estimates of divergence times, but there was a consistent pattern distinguishing them [see Additional files 1 and 2]. Specifically, the Langley-Fitch method produced the youngest estimates of divergence times within Sileneae, while a penalized likelihood method produced the oldest. For example, the estimated age of the root node for the entire Silene/Lychnis clade differed by 50% between the two methods (21.0 vs. 14.0 Myr). The BEAST model (Figure 2) generally produced intermediate estimates of divergence time relative to the other two methods. Only the BEAST values were used for subsequent rate analyses, so the uncertainty in divergence times should be considered when interpreting absolute substitution rate estimates.
Mitochondrial rate variation
Absolute substitution rates by gene (SSB).
nad9 (378 bp)
cox3 (588 bp)
atp1 (960 bp)
atp9 (162 bp)
Pairwise RNand RScorrelation coefficients within and among genes across phylogenetic lineages
Evolutionary congruence between mitochondrial and chloroplast genomes
We utilized a constrained topology derived from cpDNA to analyze evolutionary rates in mtDNA, reflecting the assumption that the two organelle genomes share a single genealogy. Although the mitochondrial genes often yielded limited phylogenetic signal because of the dual problems of low variation and long branch attraction, there was some evidence to support phylogenetic congruence between these genomes [see Additional file 5]. For example, S. hookeri and S. menziesii were consistently paired by both chloroplast and mitochondrial genes, suggesting that the allopolyploid S. hookeri inherited both of its cytoplasmic genomes from the S. menziesii parental lineage . In addition, the rapidly evolving atp9 gene produced a tree that generally agreed with the chloroplast matK topology at younger nodes, which are presumably less susceptible to saturation at synonymous sites.
There were a large number of incongruencies between chloroplast and mitochondrial trees, but they were generally lacking in support. Perhaps the most suspicious example of conflict between mitochondrial and chloroplast topologies was the placement of S. samojedora, S. seoulensis, S. zawadzkii and the major accelerated species from subgenus Behenantha in a clade otherwise populated by subgenus Silene in the cox3 tree [see Additional file 5]. Although this clade was supported by 74% of bootstrap replicates, inspection of the alignments showed that the grouping was based entirely on a single 6 bp region with 5 substitutions, raising doubts about the independence of those characters. Overall, we found no overwhelming evidence of conflicts between mitochondrial and chloroplast topologies, but the lack of mitochondrial divergence in many lineages gave us little statistical power. Therefore, it is possible that topological inaccuracies in our constraint tree could have led to misidentification of small mitochondrial rate accelerations, but given the phylogenetic scale of our analysis, it is unlikely that any of the major rate changes was an artifact of topological conflicts.
To date, studies of angiosperms with major increases in plant mitochondrial substitution rates (e.g. Plantago, Pelargonium and Silene) have concluded that the observed accelerations are largely independent of evolutionary rates in the chloroplast or nuclear genomes (although there is growing evidence for accelerated sequence and structural evolution in the chloroplast genomes of species with high mitochondrial substitution rates [34, 35, 36]). In our dataset, we found little substitution rate variation among species for the chloroplast matK gene and no significant correlation between mitochondrial and chloroplast rates (Table 3).
Mutation rate variation among species
Silene noctiflora has been shown to have dramatically accelerated rates of mitochondrial evolution relative to its congeners [4, 7]. We examined the phylogenetic distribution of this rate acceleration within the tribe Sileneae and identified six Silene species grouped into two clades that exhibited major increases in synonymous substitution rates across all four loci examined (Figure 3). As an illustration of the magnitude of these accelerations, we note the average synonymous pairwise divergence between these two closely-related clades within Silene subgenus Behenantha exceeds the divergence typically observed between flowering plants and liverworts--the deepest split in the land plant phylogeny . Based on the currently available data in seed plants, the synonymous substitution rates exhibited by these rapidly-evolving lineages (Figure 4) are exceeded only by the fastest lineages of Plantago (Figure 1) . In addition, the observed rates are on par with average estimates for mammalian mtDNA, although they still fall well below the fastest mammalian rates . As discussed above (see Background), the observed differences in synonymous substitution rates most likely reflect differences in the underlying mutation rate.
The phylogenetic data remain ambiguous with respect to whether the two clades with rate accelerations represent independent evolutionary events. The matK tree does not strongly support or reject a monophyletic relationship between S. noctiflora/S. turkestanica and section Conoimorpha (Figure 2; [see Additional files 6 and 7]). More thorough phylogenetic analyses of these taxa have recently been conducted, utilizing both chloroplast and nuclear loci . These studies have found that, while cpDNA sequences suggest phylogenetic independence between the two clades, at least some nuclear loci support monophyly. Therefore it is possible but inconclusive that both high rate clades are sister taxa that inherited an accelerated mitochondrial substitution rate from a common ancestor. If so, the two clades must have split shortly after that acceleration, because internal branches shared by the two lineages in the mitochondrial gene trees are quite short relative to the divergence between the lineages [see Additional file 5]. Resolving these phylogenetic relationships could prove difficult because previous studies have shown that the evolutionary history of subgenus Behenantha may be complicated by reticulation [32, 40], such that relationships differ across genes and genomes
Comparisons of mitochondrial sequences from multiple populations of S. noctiflora have revealed very low levels of polymorphism, suggesting that the historically high mutation rates in this lineage may have undergone a reversion to more typical levels ( and unpublished data). This conclusion was, at least partially, supported by our phylogenetic data. The terminal branches for S. noctiflora and S. turkestanica exhibited a marked reduction in RSvalues relative to the ancestral rate for that clade (Figure 4). In contrast, the patterns of divergence within section Conoimorpha gave little indication of rate reversions.
The genus Silene is characterized by great diversity in breeding system and life history, and there has been substantial interest in how these traits may be related to molecular evolution in mitochondrial genomes [14, 26, 41, 42, 43, 44]. There is no clear correlation between breeding system/life history and rate acceleration. The species exhibiting rate acceleration across all four mitochondrial genes are all hermaphroditic/gynomonoecious annuals with the exception of S. turkestanica, which is perennial. However, there are at least ten additional annual lineages represented in our sampling, and breeding system (hermaphroditic/gynomonoecious or gynodioecious) has yet to be determined for most species.
Mutation rate variation among genes
Substitution rates commonly differ among regions within a genome because of variation in selection and/or mutational pressure, and a previous study had already identified substantial rate heterogeneity among Silene mitochondrial genes . Nevertheless, the differences in synonymous substitution rates among mitochondrial genes in the current study are surprisingly large. If the six species that show universal acceleration across all four mitochondrial genes are excluded, atp9 appears to be evolving more than 40 times faster than nad9 at synonymous sites, while cox3 and atp1 fall in between these extremes.
The extreme elevation in atp9 substitution rates calls into question whether a biological mechanism other than an increase in the mutation rate might be responsible. The obvious alternatives to explain high levels of divergence include horizontal gene transfer (HGT) from distantly related species , maintenance of ancient, trans-specific polymorphism by balancing selection [26, 41, 44, 47], re-localization of the gene to the higher mutation rate environment of the nuclear genome , or relaxed selection in a non-functional pseudogene .
None of these explanations, however, are fully consistent with the data. To explain the observed levels of divergence based on HGT without an increase in evolutionary rates would require multiple phylogenetically distant donor species (i.e. outside the angiosperms). Phylogenetic analysis of atp9, however, clearly places these sequences within the Caryophyllaceae ([Additional file 5] and unpublished data; note that this argument also applies to the lineage-specific divergence in S. noctiflora/S. turkestanica and section Conoimorpha). Likewise, in the absence of rate acceleration, an explanation based on balancing selection alone would require that polymorphism be maintained for hundreds of millions of years. Such a model seems extremely unlikely and even still could not explain the retention of partial phylogenetic congruence between atp9 and matK. Of course, the fact that balancing selection alone cannot explain the pattern of divergence in atp9 does not rule out the possibility that balancing selection has been acting on atp9 and other mitochondrial genes in Silene.
It is unlikely that atp9 has been functionally transferred to the nucleus in at least four Silene species--S. latifolia, S. noctiflora, S. vulgaris and S. paradoxa. Whole mitochondrial genome sequences confirm that atp9 is mitochondrially encoded in both S. latifolia and S. noctiflora (Sloan et al., unpublished data). In addition, the gene has been shown to be maternally inherited in S. vulgaris . Comparing cDNA and genomic sequence also confirms that atp9 contains a site that undergoes C-to-U RNA editing in S. paradoxa--a process that is characteristic of organellar but not nuclear genes in plants (Sloan et al., unpublished data). Although we cannot definitively rule out the possibility of nuclear transfer, these data strongly suggest that nuclear transfer is not the driving force behind the pattern of elevated substitution rate observed in atp9.
It is also clear that atp9 is functional based on its low ω values and the absence of internal stop codons. Therefore, we conclude that the most likely explanation for the high levels of divergence is an increased mutation rate that is specific to atp9 (or a subset of the mitochondrial genome that includes atp9).
The molecular evolution of atp9 could be influenced by the presence of multiple gene copies in at least some species (see Methods). The existence of multiple copies could reflect heteroplasmy resulting from paternal leakage , non-functional paralogs in the mitochondria or other genomes , or the existence of multiple functional mitochondrial copies . It is conceivable that atp9 is located in a region of active recombination within the Silene mitochondrial genome or is experiencing frequent retroprocessing back into the genome from mRNA. Both of these processes may be mutagenic as well as lead to gene duplication and, therefore, would be consistent with our observations [6, 52, 53]. Alternatively, high mutations rates in atp9 may have simply increased divergence between heteroplasmic and/or paralogous copies, thereby enhancing our ability to detect multiple copies of atp9 even though they exist for other genes as well. Sequencing complete mitochondrial genomes, analyzing relative copy number of atp9 variants, and sampling multiple individuals per species would help distinguish between these possibilities. In a sample of individuals from 40 different populations of S. vulgaris, we found 4 individuals with multiple atp9 copies, and certain variants were only found in multi-copy individuals (unpublished data). This result suggests there is polymorphism for the presence of a paralogous copy within S. vulgaris, although heteroplasmy involving a rare haplotype is also plausible.
The acceleration in atp9 appears to be common to most of Silene/Lychnis. In contrast, most of the other Sileneae genera exhibit more conventional substitution rates for atp9, although their rates are still elevated on average. This pattern is consistent with an atp9-specific increase in substitution rate very early in the divergence of Silene, which may have been magnified by further accelerations in local areas of the genus.
A previous study of mitochondrial substitution rate variation across the seed plant phylogeny identified a handful of individual species exhibiting elevated divergence in one gene but not others . Our observations of rate variation in atp9 within Silene indicate that such gene-specific effects can be maintained across large clades of species over millions of years. We also found that these effects can occur quite locally. Most notably, S. nutans exhibited an RSvalue of 80 SSB for atp1 (a rate that exceeds all other species for that gene), but it showed no sign of acceleration in nad9 or cox3 (Figure 3). A number of other species showed more modest rate increases in atp1 and/or cox3 without correlated accelerations in other genes. These patterns may reflect local mutational effects within the genome. Alternatively, given the mounting evidence for recombination in plant mtDNA [41, 46, 54] and the existence of rate variation both within and among species , rate discrepancies between genes may be the result of recombination between genomes with different mutational histories. Finally, the possibility of nuclear transfer for a gene such as atp1 in S. nutans should also be considered .
Evolution at synonymous and non-synonymous sites
Despite the massive variation in synonymous substitution rates among genes and species, we found that rates of non-synonymous substitution generally remained low (although there was a positive correlation between RNand RSacross branches: Figure 3, Table 3). Across genes, RSvalues vary by 9 to 41-fold (depending on whether the six species with apparent genome-wide accelerations are included), while RNvalues vary by only 2 to 3-fold (Table 2). As a result, there is an apparent negative relationship between RSand the ratio of non-synonymous to synonymous changes (ω). While RSis commonly interpreted as a measure of the mutation rate, ω is used as an estimate of the intensity/efficacy of purifying selection (i.e. "the selective sieve" ). Under these interpretations, our data would suggest that genes experiencing high mutations rates also face greater purifying selection. In contrast, the opposite pattern has been observed in comparisons of nuclear genes in mammals .
The relationship between RSand ω among mitochondrial genes in Silene should be confirmed in a larger sample, because we have examined only 4 loci in the present study, and atp9 may generally be subject to strong purifying selection . Whether RSand ω can be reliably interpreted as measures of mutation rate and purifying selection depends on the distribution of fitness effects for mutations at synonymous and non-synonymous sites, which are not well understood in plant mitochondrial genomes. These distributions will dictate how the synonymous and non-synonymous substitution rates scale with the mutation rate.
In a comparison of sequence divergence in 15 protein-coding mitochondrial genes between angiosperms and the liverwort Marchantia, Laroche et al.  found much greater variation among genes in dNthan in dS--the opposite of what we observed. This discrepancy highlights the importance of phylogenetic scale in these studies. Across deep nodes in the land plant phylogeny, local differences in gene-specific mutation rates are apparently averaged out, and variation in the magnitude of purifying/positive selection among genes becomes the primary determinant of evolutionary rates. In contrast, at the local phylogenetic scale of our study, the signature of gene-specific differences in mutation rate is apparently maintained.
Because of the high variance and abundance of 0 values associated with short branches in our analysis, it is difficult to test for the same relationship between RSand ω across lineages that we observed across genes. We did see, however, that removing the rapidly evolving branches from each tree raised the ω ratio for all four genes, suggesting that the same pattern may hold. Mower et al.  conducted a broad phylogenetic survey of seed plants in which short branch lengths would be expected to be less of a problem. Their data showed a strong negative relationship between RSand ω across lineages (see also ). Therefore, it appears that major increases in mitochondrial synonymous substitution rate--either gene or taxon-specific--are accompanied by a less than proportional increase in non-synonymous substitution rate such that the effects of an apparent increased mutational pressure on amino acid sequences are greatly dampened.
Uncertainty in divergence time estimates
We used molecular clock based methods to estimate divergence times in our matK gene tree. The estimated ages were generally older than estimates from two previous studies [4, 30]. The discrepancy between these studies is likely attributable to two major differences. First, there is simple difference in calibration age between our study and the analysis of Mower et al. (2007), which utilized an age of 38 Myr for the Beta/Silene divergence derived from a broader molecular clock analysis of the angiosperms . This date appears to be in conflict with our fossil calibration point, as all three of our analyses estimate the age of the Beta/Silene split to be at least 52 Myr old. This distinction, however, cannot explain the contrasting results between our study and that of Frajman et al. , because we used essentially the same calibration point. Instead, we note that there was a significant difference in sampling schemes between these two studies. Our focus on the genus Silene produced a very imbalanced topology with much denser branching in certain parts of the tree than others. In contrast, Frajman et al.  utilized a much more balanced phylogenetic sampling. Because it is easier to detect multiple substitutions at the same site in regions with lots of branching, there is a tendency to estimate longer branch lengths in species-rich parts of a phylogeny (the "node density effect" ). This effect may contribute to our older age estimates within the tribe Sileneae. Because of this potential bias, the uncertainty over calibration points and the many assumptions associated with molecular clock based dating, it is important to stress that the divergence times used in this analysis should be considered only as approximations.
Divergence time estimates are necessary to calculate absolute substitution rates, so dating uncertainty should be considered in comparing absolute rates across studies. For example, re-calibrating node ages within Silene to correspond with the 12 Myr divergence time for the genus estimated by Frajman et al.  would increase our substitution rate estimates by approximately 50%. In relative terms, however, our three dating analyses were quite consistent within Sileneae. Therefore, our estimates of proportional variation in substitution rate across species and genes are less sensitive to dating method.
Based on our analysis of mitochondrial divergence within the tribe Sileneae, we conclude that mutational acceleration is not restricted to a single species nor is it completely confined to a small number of high rate lineages. The patterns of divergence in atp9 illustrated that elevated rates have been maintained throughout much of the genus Silene for at least one mitochondrial gene, highlighting a complex gene × species interaction in the distribution of rate variation. The diversity in phylogenetic and genomics scale suggests that there is no simple rule or single mechanism underlying mutation rate variation in plant mitochondrial genomes. Elucidating the mechanistic forces that shape mutation rate variation should represent a high priority in the field of plant mitochondrial genomics. Silene was targeted for this in-depth sampling of species-level mitochondrial divergence because of a priori knowledge of the rate acceleration in S. noctiflora. Determining whether the patterns of rate variation among species and among genes in Silene are broadly representative of angiosperm genera or represent something unique about the molecular evolution of Silene will require similar levels of sampling in taxa that currently show no evidence of rate increase.
Silene (Caryophyllaceae) comprises approximately 700 predominantly herbaceous species that vary substantially in life history and breeding system . The genus has become a model system for diverse areas of research with a particular focus on the molecular evolution of organelle genomes, including studies of population genetics [61, 62], organelle transmission [25, 27], evolutionary rates [4, 7, 35, 45], and cytoplasmic male sterility [26, 44]. Silene belongs to the tribe Sileneae, which has been the subject of extensive and ongoing phylogenetic analysis [30, 31, 32, 63, 64, 65]. Taxa were selected so as to represent major groups that will appear in a forthcoming revised taxonomy of the genus (Oxelman et al. in prep). For this study, we used a combination of field collected samples and preserved herbarium specimens along with previously published sequence data. Sample collection and voucher information are summarized in Table 1.
DNA extraction, PCR and Sequencing
We extracted total genomic DNA from each sample. For silica-dried samples and herbarium specimens, we followed the protocol described by Oxelman et al.  and performed subsequent purification using the Qiagen QIAquick Purification Kit protocol, Ultra Silica Bead kit (ABgene), or GFX PCR DNA and Gel Band Purification Kit (Amersham Biosciences). For fresh tissue samples, extractions were performed using the Qiagen Plant DNeasy Kit.
We PCR amplified the full-length coding sequence of the chloroplast gene maturase K (matK) and portions of four mitochondrial protein coding genes: ATP synthase subunit 1 (atp1), ATP synthase subunit 9 (atp9), cytochrome c oxidase subunit 3 (cox3) and NADH dehydrogenase subunit 9 (nad9). [See Additional file 8 for PCR primer sequences.]
PCR products were cleaned with Exonuclease I and shrimp alkaline phosphatase (USB Corporation), cycle sequenced with BigDye v3.1 (Applied Biosystems), and analyzed on an ABI 3130 × l capillary sequencer. Automated basecalls were edited manually using published Beta vulgaris sequences as a reference for reading frame, and sequences were assembled into contigs using Sequencher v4.5 (Gene Codes). All sequences obtained for S. sorensenis were identical to those from S. involucrata, so S. sorensenis was excluded to simplify subsequent analysis. DNA sequences have been submitted to GenBank [see Additional file 9 for accession numbers]. Sequence alignments were generated using the Clustal function imbedded in MEGA v4.0  and edited manually [see Additional file 10].
matKphylogenetic analysis and dating
We estimated the phylogeny of our sample based on the matK dataset, using both maximum likelihood (ML) and maximum parsimony (MP) criteria in PAUP* v4.0b10 . In addition to the species listed in Table 1, our analysis also included matK sequences from GenBank for the following outgroups: Beta vulgaris (Amaranthaceae), Illecebrum verticillatum (Caryophyllaceae) and Scleranthus perennis (Caryophyllaceae). Our ML search employed a GTR+Γ substitution model with fixed parameter values identified based on an analysis of our full matK dataset (including outgroups) using the AIC method in ModelTest v3.7 . The ML topology was identified with a heuristic search using the TBR branch swapping algorithm, the MULTREES option in effect, and random addition of sequences with 10 replicates. A MP bootstrap analysis based on 1000 replicate datasets was performed using the same heuristic search settings except with MULTREES off. We performed ML and MP analyses in the same fashion for each mitochondrial gene.
We used three different techniques to estimate divergence times from our matK gene tree: (1) a Bayesian relaxed clock model implemented in BEAST v1.4.8 , (2) a penalized likelihood (PL) method, and (3) the Langley-Fitch (LF) method. The latter two methods were both performed in r8s v1.71 . The LF model is a maximum likelihood strict molecular clock method that enforces a constant substitution rate over the entire tree. The other two methods allow for rate variation among branches. The PL approach assumes that rates are correlated across adjacent branches and penalizes models that require rapid rate changes within the tree. In contrast, the BEAST analysis constrained the rate variation among branches to a lognormal distribution but placed no restriction on correlations between adjacent branches. All dating analyses incorporated an extra outgroup, Nepenthes glabrata (Nepenthaceae), which was added solely to determine the position of the root along the Beta vulgaris branch. It was pruned from the resulting trees and discarded from all subsequent analyses. We used a calibration time of 34 million years (Myr) for the split between Scleranthus and Sileneae, which corresponds to the recent analysis of Frajman et al.  and the fossil evidence described therein.
The BEAST analysis was conducted with a GTR+Γ model of substitution with 4 rate categories, empirical base frequencies and a birth-death process tree prior. We defined a monophyletic ingroup to include all species except Nepenthes, Beta and Illecebrum. The calibration date was effectively fixed by specifying a normal distribution with mean of 34 and standard deviation of 0.00001 as the prior for the time to most recent common ancestor (TMRCA) of the pre-defined ingroup. We ran 3 MCMC chains of length 50 million each with trees saved every 25,000 iterations. The first 1000 trees (50%) from each chain were discarded as burn in, and chains were combined after verifying convergence among runs. We generated a maximum credibility tree with mean node heights as well as a tree that was constrained to the 70% bootstrap consensus topology from our parsimony analysis. BEAST reported the estimated node ages along with the associated 95% high probability densities (HPDs), as well as the posterior probability for each node.
For the PL and LF analyses in r8s, we used the 70% parsimony bootstrap consensus topology with branch lengths optimized under ML in PAUP*. We fixed the root age of the tree to an arbitrary value of 1 and calibrated the resulting output tree with the 34 Myr fossil age for the Scleranthus/Sileneae split. Both analyses utilized the TN search algorithm with 10 restarts, 10 time guesses, and the checkgradient option on. The cross-validation procedure was used to determine an optimal smoothing parameter of 0.0022 for the PL analysis. Error in divergence times for both methods was estimated based on the distribution of 100 bootstrap replicate datasets, following the recommendations in the r8s documentation.
Estimating dNand dS
We estimated the branch lengths of matK and all 4 mitochondrial genes individually in terms of synonymous (dS) and non-synonymous (dN) substitutions per site, using a codon-based model of substitution within the codeml application in PAML v4.0 . We also analyzed a concatenation of all 4 mitochondrial genes and a concatenation of atp1, cox3 and nad9 only. Tree topologies were constrained based on the matK 70% parsimony bootstrap consensus. Codon frequencies were determined by an F1x4 model. The parameters values for ω and transition/transversion ratio were estimated from the data with initial values of 0.4 and 2 respectively. Separate ω values were estimated for each branch. As in the dating analysis, Nepenthes glabrata was used as an outgroup in the matK dataset to identify the position of the root along the Beta vulgaris branch. Arabidopsis thaliana served a similar purpose for the mitochondrial genes. These outgroups were pruned from the resulting trees and not considered further. Because the process of C-to-U RNA editing can bias the estimation of dNand dSvalues in plant mitochondrial genomes , we excluded all codons known to undergo RNA editing in Beta vulgaris .
In the matK dataset, 6 species (Heliosperma pusillum, S. conica, S. paradoxa, S. samojedora, S. seoulensis and S. conoidea) produced sequences with an apparent frameshift indel in one of two homopolymer regions, raising the possibility that we sequenced pseudogenes in these species. In addition, S. otites did not have a start codon at the conserved position in the matK alignment. These sequences showed little indication of elevated rates or other abnormal substitution patterns, and their phylogenetic placement was consistent with a priori information. Therefore, they were retained in the dataset, and codons that contained frameshift indels were removed to keep all sequences in frame.
For the mitochondrial dataset, we obtained sequences for atp1, cox3, and nad9 from all sampled species, but only a subset of atp9 sequences were successfully generated. For 5 of the 74 species in our sample, we failed to successfully amplify and sequence atp9, and an additional 9 species appeared to have multiple atp9 copies. In the latter group, sequencing electropherograms indicated the presence of two different nucleotides in between 1.2 and 19.5% of sites. These samples were excluded from the analysis.
Estimates of absolute substitution rates (RNand RS)
Following the basic methodology described by Cho et al. , absolute substitution rates (substitutions per site per year) can be obtained by dividing branch lengths (defined in terms of substitution per site) by the age of the branch. We calculated branch ages using the divergence times estimated by BEAST. We divided the dNand dSvalues reported by PAML for each branch by the respective branch age to obtain absolute substitution rates in terms of non-synonymous and synonymous sites (RNand RS, respectively). Standard errors for RNand RSwere calculated as described by Parkinson et al.  where the standard errors for node ages were approximated as one quarter of the 95% HPD. Pearson correlations coefficient for RNand RSvalues across and within genes were calculated with PROC CORR in SAS Software v9.1.
We thank Stephanie Goodrich for providing Agrostemma seeds and Nahid Heidari and Vivian Aldén for assistance in the lab. We also appreciate comments on an earlier version of this manuscript from Steve Keller, Magnus Lidén, Matt Olson, and the members of the Taylor lab. This study was supported by NSF DEB-0808452 (to DBS and DRT), NSF DEB-0349558 (to DRT) and grants from the Swedish Research Council for Environment, Agricultural Sciences and Spatial Planning (to BO).
- 11.Gaut BS, Morton BR, McCaig BC, Clegg MT: Substitution rate comparisons between grasses and palms: synonymous rate differences at the nuclear gene Adh parallel rate differences at the plastid gene rbcL. Proc Natl Acad Sci. 1996, 93 (19): 10274-10279. 10.1073/pnas.93.19.10274.PubMedCentralCrossRefPubMedGoogle Scholar
- 23.Desplanque B, Viard F, Bernard J, Forcioli D, Saumitou-Laprade P, Cuguen J, Van Dijk H: The linkage disequilibrium between chloroplast DNA and mitochondrial DNA haplotypes in Beta vulgaris ssp. maritima (L.): the usefulness of both genomes for population genetic studies. Mol Ecol. 2000, 9 (2): 141-154. 10.1046/j.1365-294x.2000.00843.x.CrossRefPubMedGoogle Scholar
- 34.Chumley TW, Palmer JD, Mower JP, Fourcade HM, Calie PJ, Boore JL, Jansen RK: The complete chloroplast genome sequence of Pelargonium × hortorum: organization and evolution of the largest and most highly rearranged chloroplast genome of land plants. Mol Biol Evol. 2006, 23 (11): 2175-10.1093/molbev/msl089.CrossRefPubMedGoogle Scholar
- 39.Rautenberg A: Phylogenetic relationships of Silene sect. Melandrium and allied taxa (Caryophyllaceae), as deduced from multiple gene trees. Ph.D. thesis. 2009, Uppsala University; UppsalaGoogle Scholar
- 48.Adams KL, Qiu YL, Stoutemyer M, Palmer JD: Punctuated evolution of mitochondrial gene content: high and variable rates of mitochondrial gene loss and transfer to the nucleus during angiosperm evolution. Proc Natl Acad Sci. 2002, 99 (15): 9905-9912. 10.1073/pnas.042694899.PubMedCentralCrossRefPubMedGoogle Scholar
- 67.Swofford DL: PAUP*. Phylogenetic Analysis Using Parsimony (* and Other Methods). Version 4. 1998, Sunderland, MA: Sinauer AssociatesGoogle Scholar
- 75.Holmgren PK, Holmgren NH, Barnett LC: Index Herbariorum: Part 1. The herbaria of of the world. 1990, New York: New York Botanical GardenGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.