Introduction

Numerous sequence comparisons between closely related taxa have shown that genes encoding gamete-recognition proteins (proteins that mediate sperm-egg interactions during fertilisation) are unusually diverse (Clark et al. 2006; Swanson and Vacquier 2002a; Swanson and Vacquier 2002b). It has been demonstrated that the rapid diversification of such proteins is often driven by positive selection in free-spawning marine species, such as sea urchins and abalones (Hellberg and Vacquier 1999; McCartney and Lessios 2004; Riginos and McDonald 2003; Springer and Crespi 2007; Vacquier et al. 1997; Yang et al. 2000), as well as in several vertebrate species (Calkins et al. 2007; Civetta 2003; Gasper and Swanson 2006; Swanson et al. 2003; Swanson et al. 2001; Turner and Hoekstra 2006). Although positive selection in gamete-recognition proteins thus appears to be a general phenomenon in diverse taxonomic groups, the underlying evolutionary forces are still poorly understood. In marine invertebrates, which are characterised by external fertilization and no mating behaviour, it has been hypothesised that divergence between gamete-recognition proteins can establish prezygotic barriers to reproduction and hence play an important role in speciation (Metz and Palumbi 1996; Palumbi 1992; Vacquier et al. 1997). For example, it has been proposed that selection against hybridisation (reinforcement) drives the evolution of the bindin protein in Echinometra sea urchins (Geyer and Palumbi 2003). In these organisms, bindin shows extreme sequence divergence in sympatric populations, whereas this is not the case in allopatric populations.

In vertebrates, the presence of mating behaviour and mate choice adds a behavioural premating barrier to hybridisation, theoretically reducing the pressure on gametic recognition and binding proteins. A variety of alternative hypotheses have therefore been put forward to explain why gamete-recognition proteins evolve by positive selection in vertebrates, and these primarily involve sperm competition and sexual conflict as evolutionary driving forces. Sexual conflict can arise when conditions that are optimal in one sex simultaneously act to reduce fitness in the other, i.e., when the reproductive interests of the two sexes are not coincident. Sexual conflict over adaptive optima is thought to lead to a co-evolutionary chase between male and female characters (Gavrilets 2000; Rice and Holland 1997). This may apply to gamete-recognition proteins because there is a sexual conflict when sperm competition leads to fast rates of fertilisation (fertilisation being mediated by gamete-recognition proteins). Female organisms may benefit from a more moderate rate to prevent polyspermic fertilisation, i.e., when several sperm bind and fuse with the egg (Frank 2000). Polyspermy results in embryo mortality in most organisms (Gardner and Evans 2006), and elaborate mechanisms have evolved to avoid it. It is often assumed that the larger energy investment put into female gametes makes polyspermy more detrimental to female than male fitness, although it should be kept in mind that the cost of a failed fertilisation could either be a matter of egg versus sperm or egg versus entire ejaculate. If polyspermy avoidance is important for the rapid evolution of gamete-recognition proteins in vertebrates with internal fertilisation, then we can predict that the signatures of positive selection would be absent or weaker in species where polyspermy is the norm.

One way to test this prediction is to study the molecular evolution of gamete-recognition proteins in birds because the principal feature of fertilisation in birds is physiologic polyspermy (Stepinska and Bakst 2007; Tarin and Cano 2000). Consequently, sexual conflict over fertilisation may be less pronounced in the avian system. In fact, it is unclear whether a single sperm can activate an oocyte in birds, and the observation of a positive correlation between the number of sperm entering ova and ovum size could suggest that large ova may require more spermatozoa to ensure fertilization (Birkhead et al. 1994; Bramwell and Howarth 1992), making polyspermy adaptive. The vertebrate egg envelope is composed of a set of related proteins encoded by zona pellucida (ZP) genes (Hughes 2007; Lefievre et al. 2004; Wassarman 1988). These genes can be divided into five classes, ZP1, ZP2, ZP3, ZP4, and ZPAX. ZP genes have been identified in mammals, birds, amphibians, and fish (Litscher and Wassarman 2007). Although avian ZP proteins have not been functionally characterized in detail (compared with the situation in, for example, invertebrates and mice), the presence of ZP genes in the genomes of phylogenetically divergent eukaryotic lineages, coupled with the conserved nature of the fertilization process, suggests that their functional role in birds is similar to that in other organisms.

The molecular evolution of avian ZP3 has previously been studied by Berlin and Smith (2005) and Calkins et al. (2007), with some evidence for adaptive evolution provided by the latter study. In this study we sequenced and analysed avian ZP1, ZP2, ZP4, and ZPAX, along with two other genes known to be involved in gamete recognition, CD9 (Miyado et al. 2000; Runge et al. 2007) and Acrosin (Baba et al. 1994). We investigated the role of positive selection in driving the evolution of these avian proteins and then compared this with the situation for mammalian orthologs.

Materials and Methods

Samples and Sequences

Tissues were collected from one female and one male mallard (Anas platyrhynchos), guinea fowl (Numida meleagris), pheasant (Phasianus colchicus), pigeon (Columba livia), quail (Coturnix coturnix), red grouse (Lagopus lagopus scotica), and turkey (Meleagris gallopavo) and were stored in RNAlater (Qiagen). Some avian and mammalian sequences were taken from GenBank as specified in Table 1 and Supplementary Table 1.

Table 1 Summary of species and GenBank accession numbers for mammalian gamete-recognition proteins analysed in this study

Laboratory Work

Total RNA was extracted from spleen, testes, and ovaries using TRIzol (Invitrogen). First-strand cDNA was synthesized from the total RNA using Oligo(dT)20 primers (Invitrogen), and this cDNA was subsequently used as template for polymerase chain reaction (PCR). The PCR conditions were 95°C for 5 minutes, 35 cycles of 94°C for 1 minute, 55°C to 58°C for 1 minute, and 72°C for 1 minute, and a final 10-minute extension at 72°C. Primer sequences and combinations are listed in Supplementary Table 2. The PCR primers were also used as sequencing primers. All PCR products were cleaned before sequencing by adding 1 μl ExoSAP-IT (Amersham Biosciences) to every 3 μl PCR product. The reactions were incubated for 15 minutes at 37°C and for 15 minutes at 80°C. The samples were sequenced by Macrogen (Seoul, South Korea) on ABI 3730 instruments (Applied Biosystems). DNA chromatograms were edited and checked using Sequencher 4.2.2 (Gene Codes Corp., Ann Arbor).

Table 2 Descriptive data for the six gamete-recognition genes sequenced in birds

Sequence Analyses

The sequences were aligned using CLUSTALW in the Alignment Explorer tool in MEGA 3.1 (Kumar et al. 2004). We used the codeml program in the PAML package version 4 (Yang 1997; Yang 2007) to perform likelihood ratio tests of positive selection for each gene. For these analyses we considered models of codon evolution that allow ω, the ratio of nonsynonymous substitutions per nonsynonymous site to synonymous substitutions per synonymous site (dN/dS or KA/KS), to vary among codons but assume the same distribution in all lineages. We performed three likelihood ratio tests (LRTs), which are thought to provide reliable tests of positive selection (Swanson et al. 2003; Wong et al. 2004), according to the following:

  1. M1a-M2a LRT: The M1a model (one ω class between 0 and 1, and one class of ω = 1) is compared with the M2a model (same as the M1a model plus an extra class of ω > 1).

  2. M7-M8 LRT: The M7 model (a discretised beta distribution for ω between 0 and 1 with 10 equal class proportions) is compared with the M8 model (same as the M7 model plus an extra class of ω ≥ 1).

  3. M8a-M8 LRT: The M8a model (same as M7 plus an extra class of ω = 1) is compared with the M8 model.
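As an illustration of how each of these tests is evaluated, the following minimal Python sketch converts a pair of log-likelihoods into the test statistic (reported in the Results as −2ΔlnL) and a p-value from the χ2 distribution, using the degrees of freedom given in the Results (df = 2 for M1a-M2a and M7-M8, df = 1 for M8a-M8). The function names and the log-likelihood values are hypothetical and serve only as an example; they are not taken from the codeml output of this study.

```python
import math


def lrt_statistic(lnl_null: float, lnl_alt: float) -> float:
    """Twice the log-likelihood gain of the selection model over the null."""
    # Clamp at zero: for nested models the null cannot truly fit better,
    # so small negative values only reflect imperfect optimisation.
    return max(0.0, 2.0 * (lnl_alt - lnl_null))


def chi2_pvalue(statistic: float, df: int) -> float:
    """Upper-tail chi-square probability, closed forms for df = 1 and df = 2."""
    if statistic <= 0.0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(statistic / 2.0))
    if df == 2:
        return math.exp(-statistic / 2.0)
    raise ValueError("only df = 1 or df = 2 are handled in this sketch")


# Hypothetical log-likelihoods for a neutral (M7) and a selection (M8) fit.
stat = lrt_statistic(lnl_null=-1234.50, lnl_alt=-1227.75)
p_value = chi2_pvalue(stat, df=2)
print(f"2*delta lnL = {stat:.1f}, p = {p_value:.4f}")
```

The closed forms avoid any dependency outside the standard library; a general χ2 routine would be needed for other degrees of freedom.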

For all LRTs, equilibrium codon frequencies were obtained using the average base composition at the three codon positions (CodonFreq = 2), and the transition–transversion rate ratio was estimated from the data. The sequences were analysed with gaps included. This type of analysis requires an unrooted phylogeny, and we used the following topologies of species trees depending on the number of taxa from which sequence data were obtained. For the nine bird species analysed for CD9, the following topology was used: (((((((red grouse, turkey), pheasant), quail), chicken), guineafowl), duck), pigeon, zebrafinch) (Kaiser et al. 2007). For ZP4 and ZPAX, for which eight species were analysed, the following topology was used: ((((((turkey, pheasant), quail), chicken), guineafowl), duck), pigeon, zebrafinch). The following seven species were analysed for ZP1 and ZP2: (((((turkey, pheasant), quail), chicken), guineafowl), duck, zebrafinch). Finally, six species were analysed for Acrosin using the following tree: ((((pheasant, quail), chicken), guineafowl), duck, zebrafinch). For analyses of the mammalian orthologues (Table 1), the species trees that were used were based on Murphy et al. (2001) and are presented in the Supplementary Material.
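The settings described above could be assembled into a codeml control file roughly as in the following Python sketch. Only CodonFreq = 2, the estimated transition–transversion rate ratio (fix_kappa = 0), and the CD9 topology come from the text. The file names, the helper functions, the initial values, and the reading of "gaps included" as cleandata = 0 are assumptions made for illustration, and the control files actually used in this study may have differed.

```python
# Unrooted CD9 species tree from the text, written in Newick format
# (spaces in names replaced by underscores).
CD9_TREE = ("(((((((red_grouse,turkey),pheasant),quail),chicken),"
            "guineafowl),duck),pigeon,zebrafinch);")


def leaf_names(newick: str) -> list:
    """Return the taxon labels of a Newick string without branch lengths."""
    cleaned = newick.replace("(", "").replace(")", "").rstrip(";")
    return [name.strip() for name in cleaned.split(",") if name.strip()]


def codeml_ctl(seqfile: str, treefile: str, outfile: str,
               nssites: str, fix_omega: bool = False,
               omega: float = 1.0) -> str:
    """Build the text of a codeml control file for site models."""
    settings = [
        ("seqfile", seqfile),      # hypothetical alignment file name
        ("treefile", treefile),    # hypothetical tree file name
        ("outfile", outfile),      # hypothetical output file name
        ("runmode", 0),            # evaluate the user-supplied tree
        ("seqtype", 1),            # codon sequences
        ("CodonFreq", 2),          # F3x4, as stated in the text
        ("model", 0),              # same omega distribution in all lineages
        ("NSsites", nssites),      # e.g. "1 2 7 8" for M1a, M2a, M7, M8
        ("icode", 0),              # universal genetic code (assumption)
        ("fix_kappa", 0),          # kappa estimated from the data
        ("kappa", 2),              # initial value only (assumption)
        ("fix_omega", 1 if fix_omega else 0),
        ("omega", omega),          # initial value, or fixed value for M8a
        ("ncatG", 10),             # 10 classes for the discretised beta
        ("cleandata", 0),          # keep sites with gaps (assumption)
    ]
    return "\n".join(f"{key} = {value}" for key, value in settings) + "\n"


# M1a, M2a, M7 and M8 can be fitted in one run; M8a is M8 with omega fixed at 1.
site_models_ctl = codeml_ctl("cd9.phy", "cd9.tree", "cd9_sites.out", "1 2 7 8")
m8a_ctl = codeml_ctl("cd9.phy", "cd9.tree", "cd9_m8a.out", "8",
                     fix_omega=True, omega=1.0)
print(len(leaf_names(CD9_TREE)), "taxa in the CD9 tree")
```

Checking the number of leaves against the number of aligned sequences is a cheap way to catch a mismatch between tree and alignment before running the analysis.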

Results

Birds

We sequenced six gamete-recognition genes (CD9, Acrosin, ZP1, ZP2, ZP4, and ZPAX) across a suite of bird species (Table 2 and Supplementary Tables 1 and 3) and analysed the pattern of molecular evolution in these genes. Three pairs of selection and neutral models were compared for each gene using results from codeml: M1a and M2a, M7 and M8, and M8a and M8 (Table 3). For two of the genes (ZP4 and ZPAX), the selection models (M2a and M8) were not significantly different from the neutral models (M1a, M7, and M8a), and we therefore find no evidence for a strong role of positive selection affecting the evolution of these genes in birds. For the other four genes (CD9, Acrosin, ZP1, and ZP2), the selection model M8 fit the data significantly better than the neutral model M7 (CD9: −2ΔlnL = 13.5, df = 2, p = 0.001; Acrosin: −2ΔlnL = 8.2, p = 0.02; ZP1: −2ΔlnL = 10.3, p = 0.006; and ZP2: −2ΔlnL = 9.7, p = 0.008). We obtained similar results for the M8 versus M8a comparisons, where the M8 model was a significantly better fit to the data than the M8a model for all four genes (CD9: −2ΔlnL = 9.8, df = 1, p = 0.002; Acrosin: −2ΔlnL = 3.9, p = 0.05; ZP1: −2ΔlnL = 5.3, p = 0.02; and ZP2: −2ΔlnL = 5.0, p = 0.02). For the more conservative M1a versus M2a comparison, the selection model was significantly different from the neutral model for the CD9 gene (−2ΔlnL = 8.3, df = 2, p = 0.02) but not for the other three genes. These results suggest that CD9, Acrosin, ZP1, and ZP2 have evolved under the influence of positive selection in birds. Detailed results from all genes and all models, including putative selected sites, as determined through the Bayes empirical Bayes analyses for models M2a and M8, are listed in Table 3. The number of selected sites varied between 1 and 7 at the p = 0.10 significance level for the four different genes.

Table 3 Tests of positive selection among avian gamete-recognition genes

Mammals

The same models were compared for the mammalian orthologs of the investigated bird genes using available sequence data (Tables 4 and 5 [ZPAX absent in mammals]). The selection model M8 fit the data significantly better than the neutral M7 model for all genes except for ZP1 (CD9: −2ΔlnL = 10.7, df = 2, p = 0.005; Acrosin: −2ΔlnL = 45.5, p < 0.001; ZP2: −2ΔlnL = 26.1, p < 0.001; and ZP4: −2ΔlnL = 9.8, p = 0.008). Similarly, the selection model M8 fit the data significantly better than the neutral model M8a for the same four genes (CD9: −2ΔlnL = 4.2, df = 1, p = 0.04; Acrosin: −2ΔlnL = 28.2, p < 0.001; ZP2: −2ΔlnL = 15.4, p < 0.001; and ZP4: −2ΔlnL = 8.5, p = 0.004). However, in the more conservative test, the selection model M2a fit the data significantly better than the neutral model M1a for Acrosin (−2ΔlnL = 32.0, df = 2, p < 0.001) and ZP2 (−2ΔlnL = 12.5, p = 0.002) but not for the other genes. To summarize, four of five mammalian gamete-recognition genes showed evidence for adaptive evolution, confirming previous observations (Swanson et al. 2001).

Table 4 Descriptive data for the five gamete-recognition genes in mammals
Table 5 Tests of positive selection among mammalian gamete-recognition genes

The incidence of putatively selected sites in mammalian orthologs (Table 5) was approximately as high as in the corresponding bird genes. For the three genes positively selected in both birds and mammals, the frequencies of selected sites were 9.5% (116 of 1217 sites) and 9.0% (118 of 1314), respectively (χ2 = 0.21 [not significant]). For all genes analysed, the frequencies were in both cases 1.8%. We finally sought to determine if the same sites of orthologs had been subject to positive selection in both mammals and birds. Figure 1 shows an amino-acid alignment of human and chicken CD9. There is a tendency for selected sites to cluster in a homologous region of the two species toward the 3′ end of the protein, but hardly any individual sites were identified as positively selected in both species. For acrosin and ZP2, the overlap in adaptively evolving regions is less clear, with the exception of three adjacent codons of acrosin evolving under positive selection (Supplementary Figs. 1 and 2).
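A comparison of two site frequencies of this kind can be sketched in Python as a Pearson χ2 test on a 2 × 2 table, here filled with the counts quoted above. The text does not state which variant of the test was used (for example, with or without a continuity correction), so this uncorrected version is an assumption and its statistic need not equal the reported value exactly. The function and variable names are illustrative.

```python
import math


def pearson_chi2_2x2(sel_1: int, total_1: int, sel_2: int, total_2: int):
    """Uncorrected Pearson chi-square test for two proportions (df = 1)."""
    a, b = sel_1, total_1 - sel_1      # group 1: selected, not selected
    c, d = sel_2, total_2 - sel_2      # group 2: selected, not selected
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    statistic = numerator / denominator
    # Upper-tail probability of a chi-square variable with one df.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Counts quoted in the text: 116 of 1217 sites and 118 of 1314 sites.
freq_1 = 100.0 * 116 / 1217
freq_2 = 100.0 * 118 / 1314
statistic, p_value = pearson_chi2_2x2(116, 1217, 118, 1314)
print(f"{freq_1:.1f}% vs {freq_2:.1f}%: chi2 = {statistic:.2f}, p = {p_value:.2f}")
```

With one degree of freedom, a statistic below 3.84 corresponds to p > 0.05, which is the sense in which the difference is described as not significant.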

Fig. 1 Amino-acid alignment of chicken and human CD9. Positively selected sites are indicated in bold text

Discussion

We found that four (CD9, Acrosin, ZP1, and ZP2) of six (also including ZP4 and ZPAX) gamete-recognition genes showed evidence for adaptive evolution in birds. For the orthologous genes in mammals, CD9, Acrosin, ZP2, and ZP4 evolved by positive selection, whereas this was not seen for ZP1. There were no significant differences in the number of codons per gene that were positively selected in mammals and birds, suggesting that the pattern and strength of selection are similar in these vertebrate groups for the investigated genes. The results were obtained using a similar number of sequences with similar sequence divergences for the two organismal groups, giving comparable power in the detection of adaptive evolution. We therefore conclude that the molecular evolution of gamete-recognition genes among birds and mammals is not dramatically different despite the fact that there are pronounced differences in reproductive biology between these groups.

The initial stage of avian fertilisation involves the penetration of the inner perivitelline layer (IPVL; analogous to mammalian zona pellucida) by multiple sperm. Multiple sperm then enter into the cytoplasm of a germinal disc (Bakst and Howarth 1977; Birkhead et al. 1994; Okamura and Nishiyama 1978a; Okamura and Nishiyama 1978b; Tarin and Cano 2000). In chicken, >100 sperm can enter an ovum without resulting in decreased fertility (Bramwell et al. 1995). After the sperm have penetrated the IPVL, the outer perivitelline layer is formed, which serves to block pathologic polyspermy (Stepinska and Bakst 2007). Spermatozoal nuclei that have entered the egg cytoplasm decondense and transform into male pronuclei (Okamura and Nishiyama 1978b; Waddington et al. 1998). However, only a single sperm pronucleus fuses with the egg pronucleus, whereas the other supernumerary male pronuclei migrate to the periphery and are degraded by DNases present in the ovum (Stepinska and Bakst 2007). Overall, the IPVL does not seem to form such a critical barrier to fertilization as the zona pellucida does in mammals.

It is important to note that the observation of similar rates of molecular evolution of avian and mammalian gamete-recognition proteins does not necessarily mean that adaptive evolution of these proteins in birds is driven by a different mechanism than in mammals. Although there are arguments in favour of polyspermy avoidance being the fundamental driver in mammals, firm evidence for this conclusion is lacking. We therefore cannot exclude that other forces are acting in both birds and mammals and that the impact of these forces is what gives rise to the similar patterns in the two lineages. However, if one assumes that polyspermy avoidance is a major driver in mammals, then our data suggest that the mechanism is different in birds. This is in line with the study of Calkins et al. (2007) who, based on the observation of adaptive evolution of avian ZP3, concluded that “polyspermy avoidance is not sufficient to explain positive Darwinian selection in reproductive proteins across taxonomic groups.”

If polyspermy avoidance does not explain the rapid evolution of avian gamete-recognition proteins, we must seek alternative explanations. One possibility relates to the fact that heterospecific fertilization (hybridization) causes embryo mortality or results in a hybrid offspring with low fitness. Gamete-recognition proteins may thus evolve under positive selection to form prezygotic copulatory barriers to heterospecific sperm. However, whereas in mammals sperm interaction with the zona pellucida is thought of as a species-specific event (Wassarman et al. 2001), the role for gamete-recognition proteins as reproductive barriers in birds is questionable (Birkhead and Brillard 2007) because it has been demonstrated that chicken sperm can bind to the IPVL of both closely and distantly related species (Stewart et al. 2004). Nevertheless, this suggests that effective postcopulatory barriers have an important role in avian speciation because bird hybrids are rare (Birkhead and Brillard 2007).

An alternative possibility is that the evolution of gamete-recognition proteins in birds is related to postcopulatory sexual selection because sperm competition is a widespread phenomenon among birds, and cryptic female choice is thought to be common (Birkhead et al. 2004; Birkhead and Pizzari 2002). It is not known how specific sperm are selected and at which stage between insemination and fertilisation sperm selection takes place, although Birkhead and Brillard (2007) list five stages where it could happen: (1) when sperm traverse the vagina; (2) when sperm enter or exit the sperm-storage tubules (SSTs); (3) when sperm are transported from the sperm-storage tubules to the infundibulum; (4) when sperm penetrate the IPVL; and (5) when sperm locate and/or fuse with the female pronucleus. Potentially, gamete-recognition proteins may act as the barrier between sperm and egg in one of the later stages.

To conclude, we have shown that gamete-recognition proteins evolve by positive selection in birds, similar to what is the case in mammals (Swanson et al. 2001). The evolutionary forces driving the rapid divergence of these genes are likely to be different forms of sexual conflict between female and male gametes, a conflict created by way of the differential cost of failed or suboptimal fertilisation between male and female organisms. If polyspermy avoidance is not the main mechanism for this in birds, as has been postulated for other organisms, postcopulatory sexual selection, such as cryptic female choice, could create the sexual conflict underlying the rapid evolution of gamete-recognition proteins. It will be interesting to study the molecular evolution of gamete-recognition proteins in other organisms in which physiologic polyspermy is the norm, such as in some reptiles, amphibians, newts, and salamanders.