Introduction

Lineages vary dramatically in their propensity to diversify. When the Lake Victoria basin in east Africa filled with water some 150,000 years ago, it was colonized by a number of freshwater fish (Seehausen 2002). Among these were haplochromine cichlids that subsequently diversified into several hundred new species within the lake. However, not all fish that colonized Lake Victoria diversified in this way, and many did not even produce a morphologically distinct population. This pattern is typical of species radiations. Not every insect that invaded the Hawaiian islands produced a spectacular species radiation like the Hawaiian Drosophila, not every bird species that reached the Galapagos diversified like Darwin’s finches, and so on. Clearly, some organisms are more prone to diversify into new species than others.

A number of factors have been identified that influence speciation rate (reviewed in Coyne and Orr 2004). These include the degree of ecological differentiation (Funk et al. 2006), evolvability of key traits (Galis and Metz 1998), and sexual selection (Kraaijeveld et al. 2010). Together, these factors go some way in explaining variation in speciation rates among lineages. Comparative analyses that incorporate both sexual selection and ecology typically explain 10–50% of the variation in species richness of particular clades (Stuart-Fox and Owens 2003; Sol et al. 2005; Phillimore et al. 2006; Kruger 2008; Seddon et al. 2008), leaving ample room for additional factors.

In this paper, I consider the possibility that genomic architecture, and genome size in particular, predisposes lineages to speciate at relatively fast or slow rates. Theoretical arguments suggest that a large genome could either promote or constrain speciation. I explore both arguments and examine to what extent either is supported by empirical evidence. I will adopt the biological species concept (Mayr 1942) and consider how genome size may contribute to pre- or post-zygotic isolation. I will use the term ‘genome size’ to mean haploid DNA content, or C-value. This is not strictly correct for polyploids, in which the C-value includes several genomes. However, polyploidization events are difficult to distinguish from duplication of large blocks of genes, especially when they happened a long time ago.

Why Large Genome Size Could Promote Speciation

Genomes may accumulate DNA in various ways. The most important mechanisms are duplication of sections of the genome or whole genomes (i.e., polyploidization), and the activity of transposable elements (Gregory 2005). While these are very different processes, they both result in an increase in C-value (to different degrees and in different ways) and may both influence diversification rate. A suite of reasons has been proposed for why these processes might facilitate speciation.

Duplications

All else being equal, there are at least three reasons why gene duplication might lead to an increase in speciation rate: neofunctionalization, subfunctionalization, and differential silencing. Duplicated copies of genes may evolve new functions (neofunctionalization) and facilitate the invasion of new ecological niches (Ohno 1970). Given that ecological diversification consistently correlates with speciation (Funk et al. 2006), neofunctionalization could promote speciation. Alternatively, the original function of a duplicate gene may be divided among the copies (subfunctionalization). If this is resolved in different ways in allopatric populations, differential subfunctionalization could result in genetic incompatibilities that hinder interbreeding (Lynch and Force 2000). Last, one of the copies of a duplicated gene may be lost or silenced. Again, if this is resolved in different ways in allopatric populations, it could lead to incompatibilities upon secondary contact (Taylor et al. 2001; Lynch and Conery 2000; Lynch and Force 2000).

Because the likelihood of neofunctionalization, subfunctionalization, and differential silencing in a given genome increases with the number of duplicated genes, a positive correlation between genome size and speciation rate could be the result.

Transposable Elements and Chromosomal Rearrangements

Transposable elements are an important contributor to genome size in many organisms. The presence of transposons and other repetitive DNA can lead to ectopic recombination and thus to chromosomal rearrangements, and transposition events themselves result in mutations. These processes should be more common in genomes that contain many transposable elements, predicting that large genomes should have higher rates of rearrangement and mutation.

Chromosomal rearrangements and mutations could also increase speciation rate, since they may cause hybrid breakdown (Rebollo et al. 2010). Furthermore, chromosomal speciation models predict that chromosomal rearrangements should promote speciation because of reduced recombination between heterokaryotypes (Rieseberg 2001; Navarro and Barton 2003a). Because of reduced recombination, chromosomes carrying rearrangements will accumulate positively selected mutations independently in the two lineages, leading to genetic incompatibilities (Navarro and Barton 2003b).

Whether the presence and activity of transposable elements indeed increase speciation rate is currently difficult to assess. Environmental stress and hybridization may cause the breakdown of epigenetic control of transposable elements, resulting in bursts of transposition. Table 1 in Rebollo et al. (2010) lists several studies that report concordant timing of massive bursts of transposition and species radiation.

Table 1 Correlations between genome size (C-value) and species richness

Genome Size and Extinction Risk

The scarcity of very large genomes suggests that such genomes are selected against. It has thus been suggested that large genomes increase the risk of extinction. For example, the accumulation of non-functional DNA may play a role in species extinction via mutational meltdown, the downward spiral of mutation accumulation, fitness loss, and population decline (Lynch 2007). In support of this idea, Vinogradov (2003) found that genome size correlated positively with the risk of extinction in land plants. A similar correlation was found in vertebrates, although the pattern was less clear-cut (Vinogradov 2004).

While suggestive, there are at least two reasons to be cautious when interpreting these results. First, we do not know cause and effect in this correlation. Natural selection is ineffective in preventing the accumulation of mildly deleterious TE insertions and other genomic junk in small populations. Thus, organisms at risk of extinction may have large genomes because of their small population size (Lynch 2007). Second, Oliver et al. (2007) showed that the skewed distribution of genome sizes in eukaryotes can be explained without invoking strong selection against large genomes. Regardless of the mechanism of genome size change, the total amount of DNA added or deleted depends strongly on the initial genome size (Oliver et al. 2007). If changes in genome size are proportional to initial genome size, it follows that absolute changes will be much larger for large genomes than for small. It should thus be difficult for a small genome to become and stay very large, but much easier for a large genome to become small (Oliver et al. 2007). Genome size distributions should thus be expected to be strongly skewed towards small genomes, as they are.
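The proportional-change argument can be illustrated with a small simulation. The sketch below is not the model of Oliver et al. (2007); it only assumes that each change multiplies the current genome size by a random factor, with gains and losses equally likely on a relative scale. The function name, step size, number of steps, and starting size are all hypothetical choices made for illustration.

```python
import math
import random
import statistics


def simulate_genome_sizes(n_lineages=2000, n_steps=200, step_sd=0.05,
                          start=1.0, seed=1):
    """Simulate genome sizes under purely proportional change.

    All parameter values are illustrative assumptions, not estimates.
    """
    rng = random.Random(seed)
    sizes = []
    for _ in range(n_lineages):
        size = start
        for _ in range(n_steps):
            # Each step scales the current size, so absolute changes
            # are larger in large genomes than in small ones.
            size *= math.exp(rng.gauss(0.0, step_sd))
        sizes.append(size)
    return sizes


sizes = simulate_genome_sizes()
mean_size = statistics.mean(sizes)
median_size = statistics.median(sizes)
# Share of lineages whose genome is smaller than the mean genome size.
share_below_mean = sum(s < mean_size for s in sizes) / len(sizes)

print(f"mean = {mean_size:.2f}, median = {median_size:.2f}")
print(f"share of lineages below the mean = {share_below_mean:.2f}")
```

Under these assumptions the mean exceeds the median and most lineages fall below the mean, which is the skew towards small genomes described above, obtained here without any selection against large genomes.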

Why Small Genome Size Could Promote Speciation

Increases in genome size place constraints on development and metabolism (Gregory 2005). Life history may therefore select strongly for genome reduction. For example, the fast development of Drosophila larvae appears to constrain gene length, because long pre-mRNAs take longer to transcribe than shorter versions (Swinburne and Silver 2008). This should favour intron-less copies that have arisen through retroposition events. In support of this idea, De Renzis et al. (2007) found that the majority of the early expressed zygotic genes in Drosophila melanogaster lack introns. Experimental and computational studies indicate that retroposed genes are common in eukaryote genomes (Fan et al. 2008). A genome may also shed non-essential DNA through chromatin diminution (Gregory 2005). The process of genome reduction may be resolved in different ways in allopatric populations. For example, intron-less retroposed genes may appear in some populations, but not in others. Upon secondary contact, such differences may cause genetic incompatibility.

There is another poorly understood, but potentially important reason for why small genomes may promote speciation. Pósfai et al. (2006) stripped the genome of Escherichia coli of all non-functional DNA (a reduction of 15% in genome size). Propagation of recombinant genes and plasmids was found to be more efficient in the small-genome strains than in the wild-type, suggesting that small genomes may accumulate mutations and genomic rearrangements more quickly than larger genomes and propagate them more reliably. This could mean that allopatric populations can diverge and become genetically incompatible more quickly if they have small genomes.

Genome Size and Temporal Patterns of Species Diversification

Correlations Between Genome Size and Species Richness

Studies that looked for correlations between genome size and species richness are listed in Table 1. Positive correlations are limited to several groups of fish (Mank and Avise 2006). By contrast, genome size and species diversity are negatively correlated in all other cases (Table 1).

These studies reveal little about the mechanisms underlying these correlations. Species richness is the net result of speciation and extinction. Negative correlations between genome size and species number could thus be due to reduced speciation rate or increased extinction in large genome taxa, or both. Distinguishing between these possibilities is currently not possible.
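The difficulty can be made concrete with a constant-rate birth-death process, in which the expected number of lineages depends only on the difference between the speciation and extinction rates. The sketch below is a minimal illustration under that assumption; the function name and the rate values are hypothetical and are not taken from any of the studies in Table 1.

```python
import math


def expected_richness(speciation, extinction, time, n0=1):
    """Expected lineage count under a constant-rate birth-death process.

    Uses E[N(t)] = n0 * exp((speciation - extinction) * time).
    """
    return n0 * math.exp((speciation - extinction) * time)


# Hypothetical rates per million years, chosen only for illustration.
baseline = expected_richness(speciation=0.30, extinction=0.10, time=20)
low_speciation = expected_richness(speciation=0.20, extinction=0.10, time=20)
high_extinction = expected_richness(speciation=0.30, extinction=0.20, time=20)

print(f"baseline:        {baseline:.1f}")
print(f"low speciation:  {low_speciation:.1f}")
print(f"high extinction: {high_extinction:.1f}")
```

Lowering speciation and raising extinction by the same amount give the same expected richness, so species counts alone cannot tell the two scenarios apart.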

Temporal Patterns of Genome Size Change

Lungfish have the largest genomes among animals (ranging from 40 pg in Protopterus annectens to 133 pg in P. aethiopicus; Gregory 2010). Thomson (1972) measured cell sizes in fossil lungfish and showed that these had increased over the course of their evolution since the Devonian. As cell size is strongly correlated with genome size among extant taxa (Gregory 2001), this indicates that the genomes of lungfish as a group have accumulated large amounts of DNA over time. Simultaneously, the species diversity among lungfish decreased over time (Stanley 1979). The temporal association of genome size and species diversity results in a significant negative correlation between these two ‘traits’ (Fig. 1).

Fig. 1 Left panel: cell size estimates for fossil lungfish (after Thomson 1972). Right panel: taxonomic diversity of lungfish over time (after Stanley 1979). Since taxonomic diversity is a property of time, I combined multiple estimates of cell size for a single period (two estimates each for Pennsylvanian, early Permian, and recent; Spearman rank correlation coefficient = −0.89, n = 9, P = 0.002)
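For readers unfamiliar with the statistic reported for Fig. 1, a Spearman rank correlation is a Pearson correlation computed on ranks, with tied values given their average rank. The sketch below shows the calculation in plain Python. The two input lists are made-up toy values and are not the lungfish measurements of Thomson (1972) or the diversity counts of Stanley (1979); all function and variable names are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy values only (arbitrary units); NOT data from the studies cited.
toy_cell_size = [10, 12, 15, 14, 20, 22, 25, 30, 28]
toy_diversity = [40, 35, 30, 32, 20, 15, 18, 5, 8]

rho = spearman(toy_cell_size, toy_diversity)
print(f"Spearman rank correlation (toy data) = {rho:.2f}")
```

A strongly negative coefficient, as in this toy example, indicates that periods with larger cells tend to be periods with lower diversity, whatever the exact shape of the relationship.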

We currently have no way of knowing whether the negative correlation between genome size and species diversity in lungfish is causal or not. However, the pattern is consistent with the idea that large genomes increase extinction risk (Vinogradov 2003, 2004). In this view, more and more species became extinct as the genomes of lungfish accumulated DNA. On the other hand, a similar pattern of reduced species diversity with increased genome size would be predicted if increased genome size constrained speciation rate. The important point here is that the observed correlation is clearly inconsistent with the idea that large genome size should promote diversification.

It is currently impossible to assess the generality of this pattern. Fossil cell sizes have been measured in a number of other groups, including crossopterygian lobe-finned fishes, Paleozoic amphibians (Thomson and Muraszko 1978) and dinosaurs (Organ et al. 2007). None of these display steady increases in cell size as seen in lungfish.

Net Diversification Rates in Mammals

Species diversification rates can also be estimated from phylogenies. Mammals are represented by multiple such estimates at the family level in the data summarized in Coyne and Orr (2004). I combined these estimates with genome size information (Gregory 2010) and found a significant positive correlation between genome size and diversification interval (Fig. 2). Because a longer interval corresponds to a slower rate of diversification, this means that mammal taxa with larger genomes tend to diversify more slowly. Again, however, the opposing effects of speciation and extinction on species diversity cannot be separated.

Fig. 2 Genome size of mammal taxa in relation to estimates of their speciation interval (data from Coyne and Orr 2004). Linear mixed model with log10-transformed C-values as dependent variable and speciation interval as independent variable: likelihood ratio = 37.26, df = 1, P = 0.0001. Species was included as a random effect to account for multiple estimates of genome size for the same species (n estimates = 229, n groups = 145)

Shifts in Genome Size at the Base of Species Radiations

Large-scale duplication events are at the base of a number of evolutionary radiations. These include the teleost fish radiation (the most speciose group of vertebrates) (Hoegg et al. 2004; Volff 2005) and further duplications within several teleostean clades (Volff 2005). More controversially, a large-scale duplication event may have occurred early in the vertebrate radiation (Ohno 1970, but see Friedman and Hughes 2001).

Polyploidization is particularly common in plants. Most (perhaps all) flowering plants can be traced to a polyploidization event early in the angiosperm radiation (Cui et al. 2006). Polyploidization appears to be a recurring theme in plant speciation, with 15% of angiosperm and 31% of fern speciation events associated with ploidy increases (Wood et al. 2009). However, once established, polyploid plant lineages do not have elevated speciation rates (Wood et al. 2009). Furthermore, genome expansions could be quickly followed by genome reduction, making it difficult to link the subsequent diversification to genome size.

By contrast, the onset of a number of other radiations coincided with significant genome size reductions. Examples include saurischian dinosaurs (Organ et al. 2007), hummingbirds (Gregory et al. 2009), pufferfish (Volff 2005), and Plethodon salamanders (Kozak et al. 2006; see below). Such cases indicate that genome size reductions can sometimes be followed by bursts of diversification or periods of low extinction. When detailed phylogenetic data are available, it should be possible to distinguish between speciation and extinction as the cause of increased diversification. Rabosky and Lovette (2008) showed that bursts of speciation lead to a phylogenetic pattern in which many new taxa are added early in the radiation, followed by a slow-down in diversification. Such ‘explosive-early’ patterns of diversification cannot be explained by increasing extinction rates (Rabosky and Lovette 2008).

The case of the Plethodon salamanders illustrates such an ‘explosive-early’ pattern of diversification. The genus Plethodon occurs throughout North America. In western North America, the genus is represented by only a handful of species, but in the east there are no fewer than 46 recognized species. Phylogenetic analysis by Kozak et al. (2006) revealed that the eastern clade forms a monophyletic group that accumulated lineages at a high rate early in its history. The rate of lineage accumulation later slowed down (Kozak et al. 2006). Data from the animal genome size database (Gregory 2010) reveal that species from the eastern (species-rich) clade have significantly smaller genomes than those from the species-poor western clade (n = 28 and n = 15, respectively; linear mixed model: F(1, 10.7) = 21.31, P = 0.001; species, nested within clade, was included in the model as a random effect to account for multiple genome size estimates for the same species). Thus, a reduction of genome size occurred before or early in a burst of speciation in the eastern clade.

Of course, the example described above represents only a single data point in the comparison of genome size versus speciation rate. However, when more similarly detailed phylogenies become available for which there is also sufficient genome size information, it will become possible to test for a more general association between genome size and speciation, while excluding the effect of extinction.

Radiating Versus Non-Radiating Insects

For taxa that have sufficiently rich fossil records, historic patterns of taxonomic diversity can be used to distinguish currently diversifying groups from non-diversifying ones. Labandeira and Sepkoski (1993) compiled such data for the major insect clades. Figure 3 shows the historical diversity (number of families) in combination with available estimates for genome size. Clades that have been increasing in diversity up to the present have significantly smaller genomes than groups in which diversity has remained constant or has decreased (Fig. 3). All clades that show large increases in taxonomic diversity up to the present (Coleoptera, Diptera, Lepidoptera, Hemiptera, and Hymenoptera) have mean C-values below 2 pg, while clades in which diversity has decreased (Blattaria) or remained relatively constant (Orthoptera and Odonata) tend to have larger genomes.

Fig. 3 Genome sizes of major insect clades (upper panel) and their fossil diversity at the family level (lower panel; after Labandeira and Sepkoski (1993); modified following Labandeira and Eble (in press) by combining Heteroptera and Homoptera under Hemiptera). Clades were classified as having increased in diversity up to the present time (n = 5) or not (n = 6). Linear mixed model with log10-transformed C-values as dependent variable and ‘radiated’ as independent variable: likelihood ratio = 6.57, df = 1, P = 0.01. Order was included in the model as a random effect to account for different numbers of C-value estimates available per order (n estimates = 479, n groups = 11)
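The P value in the caption to Fig. 3 can be checked from the likelihood ratio alone. The sketch below assumes that the statistic was compared against a chi-square distribution with one degree of freedom, which is the usual approach for nested models but is not stated explicitly in the caption. For one degree of freedom, the upper-tail probability can be written with the complementary error function, so no statistics library is needed. The function name is hypothetical.

```python
import math


def lr_test_p_value_df1(likelihood_ratio):
    """Upper-tail chi-square probability for 1 degree of freedom.

    Assumes the likelihood ratio statistic follows a chi-square
    distribution with df = 1 under the null hypothesis.
    """
    if likelihood_ratio < 0:
        raise ValueError("likelihood ratio statistic must be non-negative")
    # For df = 1: P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(likelihood_ratio / 2.0))


# Likelihood ratio reported in the caption to Fig. 3.
p_insects = lr_test_p_value_df1(6.57)
print(f"P = {p_insects:.4f}")
```

Under that assumption the result is roughly 0.01, in line with the value given in the caption.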

At the family level, all insects have very low extinction rates (Labandeira and Sepkoski 1993). The difference in family diversity between clades is thus mainly an effect of differences in the rate at which they generate new families, rather than the rate at which they go extinct.

The correlation between genome size and current diversification rate in insects may be confounded by the degree of metamorphosis. Genome size in insects that undergo complete metamorphosis (holometabolous) appears to be more constrained than in insects that do not (hemimetabolous) (Gregory 2002). Holometabolous insects also tend to have higher diversification rates than hemimetabolous insects (Yang 2001), potentially resulting in a correlation between diversification rate and genome size among insects as a side effect. However, this possibility is refuted by the Hemiptera and Blattaria. Hemiptera are hemimetabolous, but have been increasing in diversity up to the present. The majority of hemipteran genomes are small, especially those of aphids. Blattaria are also hemimetabolous, yet experienced a period of high diversification rate in the late Carboniferous (Labandeira and Sepkoski 1993; Fig. 3). Blattaria are very unlikely to have been undergoing complete metamorphosis in the Carboniferous, but it is not impossible that their genomes were smaller during that time, not unlike the lungfish mentioned above.

Conclusions

There are theoretical reasons to expect that increased genome size should either increase or decrease speciation rate. Of these possibilities, the former has received most attention. There is good evidence that polyploidization events coincided with speciation events in plants and fish, but not all increases in C-value are caused by polyploidy. Furthermore, it is not clear that large-genome taxa sustain higher speciation rates than small-genome taxa. In fact, the bulk of the empirical evidence is more consistent with elevated speciation rates in small genomes. Like genome duplications, genome reductions are at the base of some evolutionary radiations. Furthermore, clades that have small genomes tend to consist of more species and are more likely to be diversifying than clades with large genomes. There are several non-exclusive explanations for this pattern:

  • The effect of genome size on species diversity may operate solely through increased extinction in large genome taxa. Separating the effects of speciation and extinction on species diversity is difficult, but possible in some cases (Barraclough and Nee 2001). The simulations of Rabosky and Lovette (2008) show that the recurrent pattern of a burst of species diversity followed by a slow-down in diversification cannot be explained by changes in extinction rate. It is thus unlikely that extinction can account for all the observations described in this paper.

  • The selective forces that favour small genome size (such as selection on metabolic rate and developmental speed) may also promote speciation.

  • The process of genome reduction may cause incompatibilities between incipient species to a higher degree than genome size increases.

  • New variants may be generated more frequently and/or be more stably inherited in a small genome, causing faster adaptation and species divergence in small-genome clades.

To distinguish between these possibilities, we need more information on the processes involved in genome reductions and how these affect speciation. We also need to study the accumulation of genetic variants in small versus large genomes.