Introduction

How variation at functional gene regions arises and is maintained in natural populations is a key remaining question of evolutionary genetics (reviewed in Piertney and Oliver 2006). The prime candidate gene region to investigate this question is the major histocompatibility complex (MHC). It is hypothesized that positive selection, acting in concert with a “Red Queen” arms race between pathogens and the immune system (Ladle 1992), drives the maintenance of diversity at the MHC. However, empirical evidence for high levels of selection within the MHC has been mixed, depending on which species, MHC region, or method of analysis is employed. Numerous studies have reported evidence of selection on MHC genes in mammals (Aguilar et al. 2004; Amills et al. 2008; Čížková et al. 2011; Goüy de Bellocq et al. 2009), fish (Evans et al. 2010; Gomez et al. 2010; Schaschl et al. 2008), and birds (Alcaide et al. 2008; Anmarkrud et al. 2010; Loiseau et al. 2009). Few researchers have considered, however, whether selection acts in a constant and consistent way across all MHC genes and if there are differences in selection signatures between alpha and beta class genes of the MHC (Averdam et al. 1992).

Much of the work to date on MHC selection processes has been based on model species including humans, rodents, and fish. More recently, attention has turned to natural populations in an attempt to understand how ecological factors interact to influence the dynamics of MHC evolution and how MHC diversity affects individual fitness and population level processes (Piertney and Oliver 2006). While many of the initial hurdles in the isolation and characterization of individual, functional, highly polymorphic MHC loci have been overcome (see review in Bernatchez and Landry 2003), most studies have restricted themselves to investigations of single MHC loci or single populations (Acevedo-Whitehouse and Cunningham 2006). To better understand how the full range of MHC variation contributes to individual fitness and is maintained in natural populations, it is important to extend the breadth of MHC studies to encompass multiple loci and a greater spatial range of populations (Čížková et al. 2011; Garrigan and Hedrick 2003). While, for some studies, it makes sense to only consider single locus interactions (e.g., MHC-related mate choice or pathogen-driven selection) and, for some species, there are difficulties in amplifying just single interpretable loci, there are clear advantages to a multiple gene approach to MHC. By contrasting patterns of selection at the different MHC gene regions, it has been shown that in many species (e.g., human and nonhuman primates, rats), variability at class II alpha-chain genes is lower than for beta-chain genes (Averdam et al. 2010; Bondinas et al. 2007). On the other hand, in salmonids, the difference is less pronounced (Gomez et al. 2010), reflecting perhaps a different evolutionary trajectory for the MHC of teleost fishes. Comparisons of the levels of diversity and the strengths of selection acting on the different gene regions of the MHC are essential first steps to understanding how adaptive genetic variation arises and is maintained in an ecological framework.

The brown hare (Lepus europaeus) represents an excellent species to examine MHC gene interactions. It belongs to a Pleistocene species flock that occurs in a wide range of ecologically different habitats from arctic/alpine to desert environments. Hares are virtually the only small- or medium-sized mammals in Europe that raise their young above ground right from the day of birth onwards. Their development is very rapid, with body weight increasing to eight to ten times birth weight within the first month of life. Living above ground, they receive little protection against harsh environmental conditions and are usually left alone and only visited by their mother once a day for a very short period of time to suckle (Göritz et al. 2001; Hartl et al. 1995). The sudden exposure to aboveground environmental challenges after an intrauterine life, with virtually no protection from their mother, together with their rapid development suggests a very efficient immune system to guarantee sufficient energy allocation to growth. A diverse set of infectious and parasitic diseases such as European brown hare syndrome (a Lepus-specific calicivirus), tularemia and pseudotuberculosis (bacterial diseases), and coccidiosis (sporozoan infection) are known to strongly affect regional survival (reviewed in Wibbelt and Frölich 2005). Previously, we have examined diversity and selection acting on an alpha-chain gene (DQA; Campos et al. 2011; Goüy de Bellocq et al. 2009; Smith et al. 2010), but no work has been carried out, to date, on other genes of the MHC. Here, we compare those results to additional brown hare data from the three MHC class II genes DQB, DPB, and DRB. Furthermore, we examine the scale of the population recombination rate in this species as compared to point mutations, in order to assess the extent to which recombination and mutation contribute to the diversity at the MHC class II beta loci. We expect the levels of diversity at the beta-chain genes to be higher than the moderate level reported for the DQA locus (Goüy de Bellocq et al. 2009), but in line with other studies of mammalian class II genes (Aguilar et al. 2004; Amills et al. 2008; Averdam et al. 2010; Bryja et al. 2006; Čížková et al. 2011; Oliver and Piertney 2006).

Materials and methods

Animal sampling and sample preparation

All samples were collected during autumn hunts in 2006 and 2007 from three localities in eastern Austria and three in northern Belgium (Fig. 1). In total, 518 individuals were sampled, 291 from Austria and 227 from Belgium. For each individual, a small piece of liver was collected and preserved at −20°C. Total DNA was extracted using the GenElute Mammalian Genomic DNA Miniprep Kit (Sigma) and eluted in 200 μl of water. Additionally, RNA was obtained from four individuals from our breeding station which were to be euthanized for the purpose of an unrelated experiment. A further piece of liver was preserved in RNAlater at −20°C. Total RNA was isolated using the RNeasy Mini Kit with the on-column RNase-free DNase set (Qiagen) according to the manufacturer’s instructions. Total RNA was eluted in 30 μl RNase-free water and stored at −70°C.

Fig. 1 Sampling localities of brown hares from Belgium (hatched circles) and Austria (stars). BK Bulskamp, SL Sint Laureins, MB Moerbeke, OW/STR Oberweiden/Stripfing, ZW Zwerndorf, BG/LA Baumgarten/Lassee

Design of beta gene exon 2-specific primers for the hare

To ensure specificity of the primer design process for brown hares, we performed an RNA ligase-mediated rapid amplification of cDNA ends (RLM-RACE) using a GeneRacer kit (Invitrogen) following the specifications of the manufacturer. Specific conditions for the method are described in detail in Goüy de Bellocq et al. (2009). In brief, RNA from four individuals was pooled two by two into two samples and used as a template for cDNA synthesis. RACE-polymerase chain reaction (PCR) amplification of the 5′ end of the cDNA was performed using the GeneRacer 5′ primer provided with the kit and the universal JS2 primer (Schad et al. 2004), which was designed in a part of exon 2 that is relatively conserved in beta genes. PCR was performed as described in Goüy de Bellocq et al. (2009). Bands representing PCR products within the expected size range (~400 bp) were excised from 1.5% agarose gel, purified, cloned, and sequenced as described in Goüy de Bellocq et al. (2009). We obtained 11 sequences of the 5′ end of the cDNA, one corresponding to the DPB gene (BLAST, Max ident >94% with RLA-DPB of Oryctolagus cuniculus), eight corresponding to the DQB gene (BLAST, Max ident >92% with RLA-DQB of O. cuniculus), and two corresponding to the DRB gene (BLAST, Max ident 86% with RLA-DRB of O. cuniculus; GenBank AN: NC_013680.1). Based on the cDNA sequences obtained with the RACE and rabbit RLA sequences from GenBank, we designed forward and reverse primers embedded within exon 2 to amplify portions of the exon in each of the three beta genes (see Table 1 for primer sequences). The DRB primers within exon 2 amplified an additional amplicon from genomic DNA (gDNA) that was not detected when RNA was used as a template. The sequences generated from gDNA contained a mutation in the exon 2 portion which altered the reading frame such that a stop codon was introduced midsequence. To avoid amplification of this nonfunctional gene from gDNA or the time-consuming process of genotyping from RNA, it was necessary to design a new forward primer within intron 1 at the 5′ end of exon 2. We used a primer designed in exon 1 of the DRB gene (Lepus_DRBex1_F2, identified from the original 5′ RACE-PCR) in combination with primer JS2 to amplify across intron 1 and provide sequences for alignment and primer design. The sequence divergence in this intron was sufficient to design a forward primer (DRB_Int_1f) that excluded the more variable nonfunctional copies and only amplified the functional alleles.

Table 1 Primer sequences used in MHC class II gene assay development and screening

PCR amplification and CE-SSCP

Fragments of exon 2 for each gene (DPB 188 bp, DQB 210 bp, DRB 225 bp excluding primers) were amplified with 6-FAM-labeled forward and NED-labeled reverse primers using the Multiplex PCR kit (Qiagen) following the manufacturer’s instructions in a final reaction volume of 10 μl. The thermal profile started with an initial HotStarTaq DNA polymerase activation at 95°C for 15 min, followed by 30 cycles of denaturation at 94°C for 30 s, annealing at 58°C for 30 s and extension at 72°C for 1 min, and a final extension at 72°C for 10 min. One microliter of 25× diluted PCR product was mixed with 13.75 μl of Hi-Di formamide (ABI) and 0.25 μl of GeneScan 350 ROX size standard, denatured at 96°C for 3 min, and immediately placed on ice. Capillary electrophoresis single-strand conformation polymorphism (CE-SSCP) analysis was carried out using 5% Genescan polymer (ABI) with the addition of 10% glycerol in an ABI 3130xl genetic analyzer. TBE (1×) with 10% glycerol was used as running buffer. Electrophoresis was conducted at 18 and 22°C, samples were injected for 18 s, and the electrophoresis voltage was set to 12 kV. Samples were run for 36 min. Chromatograms were analyzed using the software program GeneMapper 3.7 (ABI).

Cloning and sequencing

We selected individuals with diverse SSCP patterns, representing all identified alleles, to investigate sequence variation at the second exon of each of the three beta-chain loci. The genes were amplified as described above, but using nonlabeled primers. Where homozygous individuals were detected by SSCP analysis, we directly sequenced the PCR product. The PCR products from heterozygous individuals were ligated into pCR2.1 vectors, and chemically competent TOP10 cells were transformed with the ligation products using a TOPO TA cloning kit (Invitrogen). Transformants containing inserts were identified via PCR screening, plasmid DNA was extracted with the QIAprep Spin Miniprep Kit (Qiagen), and inserts were sequenced (Macrogen). Each potential allele (unique conformation pattern) was sequenced at least twice from two different individuals that were taken, if possible, from both the Austrian and Belgian regions. The sequences were edited in Sequencing Analysis 5.1 and aligned in BioEdit (Hall 1999) using ClustalW multiple alignment.

Population genetic parameters and analysis of selection

Population allele frequencies, observed and expected heterozygosities (Ho and He), and tests of deviation from Hardy–Weinberg equilibrium (HWE) were calculated using GENEPOP 4.0 (Rousset 2008). The allelic numbers for the brown hare sequences were assigned according to the guidelines of Klein et al. (1990). Population differentiation was determined by calculation of Jost’s D (Jost 2008) as a more robust measure than Fst. Calculations were performed using the DEMEtics package within the R statistical framework (R Development Core Team), and p-values were calculated by performing 10,000 bootstrap iterations. In addition, a hierarchical analysis of molecular variance from the MHC data was calculated for populations within the two regions, Belgium and Austria, using the Arlequin 3.5.1.2 program (Excoffier and Lischer 2010). In order to address the evidence for positive selection in the MHC genes, we used the OmegaMap program (Wilson and McVean 2006). This program detects positively selected sites in sequences in the presence of recombination. It employs a Bayesian population genetics approximation to the coalescent theory and generates means and credible intervals for the selection parameter (dN/dS = ω) and recombination rate (ρ = 4Nr) for each codon (N and r represent the effective population size and the per codon rate of recombination). The probable values of the mutation rate (μ) and the transition/transversion rate ratio (κ) were adjusted to follow improper inverse distributions: starting values for μ and κ were set at 0.1 and 3.0, respectively, and the selection parameter (ω) and the recombination rate (ρ) were adjusted to follow inverse distributions in the range between 0.01 and 20 for ω and 0.01 and 100 for ρ, respectively. We chose to estimate ω for each codon independently and ρ in a block-like structure of 10 codons. Means for ω and ρ per codon were calculated using the posterior distributions generated with the objective prior set. Two Markov chain Monte Carlo runs of 250,000 iterations each were performed on population allele frequencies at each MHC locus, with a 20,000 iteration burn-in. The data were visualized in R with the R script provided by the program's authors. The program MEGA 4 (Tamura et al. 2007) was used to compute average pairwise nucleotide distances (Kimura 2-parameter model or K2P) and average pairwise Poisson-corrected amino acid distances. Standard errors were obtained with 1,000 bootstrap replicates.
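Two of the summary statistics named above, expected heterozygosity (He) and the K2P nucleotide distance, follow simple closed-form expressions. The following Python sketch illustrates those textbook formulas only; it is not the GENEPOP or MEGA implementation, and the function names and toy inputs are assumptions introduced for illustration.

```python
import math

PURINES = {"A", "G"}


def expected_heterozygosity(allele_freqs):
    """Uncorrected gene diversity: He = 1 - sum(p_i^2).

    allele_freqs is assumed to be a list of allele frequencies summing to 1.
    """
    return 1.0 - sum(p * p for p in allele_freqs)


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned nucleotide sequences.

    d = -1/2 * ln(1 - 2P - Q) - 1/4 * ln(1 - 2Q), where P and Q are the
    proportions of transitional and transversional differences.
    Sites with gaps or ambiguous bases are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [
        (a, b)
        for a, b in zip(seq1.upper(), seq2.upper())
        if a in "ACGT" and b in "ACGT"
    ]
    if not pairs:
        raise ValueError("no comparable sites")
    n = len(pairs)
    # A transition keeps the base class (purine<->purine, pyrimidine<->pyrimidine).
    transitions = sum(
        1 for a, b in pairs if a != b and (a in PURINES) == (b in PURINES)
    )
    transversions = sum(
        1 for a, b in pairs if a != b and (a in PURINES) != (b in PURINES)
    )
    p = transitions / n
    q = transversions / n
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)


if __name__ == "__main__":
    # Toy inputs, not data from this study.
    print(expected_heterozygosity([0.5, 0.3, 0.2]))
    print(k2p_distance("ACGTACGTAC", "GCGTACGTAC"))
```

Averaging `k2p_distance` over all pairs of alleles at a locus would give the average pairwise distance; bootstrap standard errors are not covered by this sketch.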

Analysis of population recombination rate

The LDhat program was used to estimate rates of population mutation (from Watterson’s θ, where θ = 4N_eμ) and population recombination ρ (where ρ = 4N_er; McVean et al. 2002). This program estimates population recombination from a set of aligned sequences using the composite likelihood method (Hudson 2001) within a coalescent framework. This analysis is reliable and valid even for low variability loci. The program considers the population frequency of each allele and each codon independently. Specifically, it uses a population genetics approximation to the coalescent with recombination (ρ) and then applies the reversible-jump MCMC method to perform Bayesian inference on the selection parameter omega. We used the likelihood permutation-based approach to test significance against the null hypothesis of no recombination (i.e., 4N_er = 0). We also calculated the ratio of ρ/θ as an estimate of the relative amount of recombination compared to point mutations (Fearnhead and Donnelly 2001). Additionally, D′ and r² statistics (Awadalla et al. 1999), which assess the correlation of linkage disequilibrium among pairs of polymorphic sites with the distance between them, were calculated as alternate indicators of recombination. In these cases, recombination is inferred if there is a significant decay of linkage disequilibrium with distance.
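As a rough illustration of two quantities used in this section, the Python sketch below computes Watterson's θ from the number of segregating sites and the r² linkage disequilibrium statistic for a pair of biallelic sites. It uses the standard textbook definitions and is not the LDhat composite likelihood procedure; the function names and toy inputs are assumptions made for this example.

```python
def watterson_theta(n_segregating_sites, n_sequences):
    """Watterson's estimator: theta = S / a_n, with a_n = sum_{i=1}^{n-1} 1/i."""
    if n_sequences < 2:
        raise ValueError("at least two sequences are required")
    a_n = sum(1.0 / i for i in range(1, n_sequences))
    return n_segregating_sites / a_n


def r_squared(site_a, site_b):
    """r^2 between two biallelic sites.

    site_a and site_b hold the allele carried by each haplotype at the two
    sites, in the same haplotype order (strings or lists of equal length).
    """
    if len(site_a) != len(site_b) or not site_a:
        raise ValueError("sites must cover the same, non-empty set of haplotypes")
    n = len(site_a)
    a_ref, b_ref = site_a[0], site_b[0]  # arbitrary reference alleles
    p_a = sum(x == a_ref for x in site_a) / n
    p_b = sum(y == b_ref for y in site_b) / n
    p_ab = sum(x == a_ref and y == b_ref for x, y in zip(site_a, site_b)) / n
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic")
    return d * d / denom


if __name__ == "__main__":
    # Toy inputs, not data from this study.
    print(watterson_theta(10, 5))
    print(r_squared("AAGG", "CCTT"))
    print(r_squared("AAGG", "CTCT"))
```

A decay test of the kind described above would then correlate r² for every pair of polymorphic sites with the distance between them; that step and its permutation-based significance test are left out here.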

Results

Population genetic parameters

From the entire sample, we identified 15 DRB alleles, 11 DQB alleles, and four DPB alleles. All are previously undescribed. The same individuals were genotyped for all loci except DPB, which showed very low variability after the initial screen of a subset of 71 individuals (51 from Austria and 20 from Belgium), so it was decided to discontinue the population level analysis for this locus. Allele frequencies and basic diversity parameters for the three MHC class II beta-chain exon 2 loci in the different localities as well as the overall values for the Austrian and Belgian regions are displayed in Table 2. For DRB, there were eight alleles shared between the two regions. Alleles Leeu-DRB*06, Leeu-DRB*10, Leeu-DRB*11, Leeu-DRB*12, Leeu-DRB*14, and Leeu-DRB*15 only occurred in Austria, whereas allele Leeu-DRB*07 only occurred in Belgium. For DQB, nine alleles were shared, with alleles Leeu-DQB*05 and Leeu-DQB*06 only occurring in Austria. Of the four DPB alleles, Leeu-DPB*03 and Leeu-DPB*04 were not detected in Belgium, but it is important to note the small sample sizes for the Belgian subpopulations. HWE analyses highlighted that the DRB locus contained significantly fewer heterozygotes than expected under equilibrium in every subpopulation except OW/STR and also for each region overall. DRB and DQB showed significant linkage disequilibrium. DPB, however, showed no linkage with the other two loci, perhaps reflecting its physical distance from them (as shown for humans and rabbits: Chouchane and Kindt 1992; Trowsdale 1995) and/or as a consequence of the low number of individuals included in analyses involving DPB.

Table 2 Allele frequencies of three MHC class II loci for three localities within Austria and three localities within Belgium as well as overall values for both regions

Population differentiation, as determined by mean D_est, across all three loci was high and significant (mean D_est = 0.16, p < 0.001). For the individual loci, significant differentiation was detected for DRB (D_est = 0.409, p < 0.001) and DQB (D_est = 0.071, p < 0.001), but not for DPB (D_est = 0.00, p = 0.43). When data from the previous study for the DQA locus were included (D_est = 0.594, p < 0.001), the mean D_est value increased sharply (D_est = 0.268, p < 0.001). A hierarchical population structure was evident in the genetic differentiation across all three loci, with 12.4% of the variance attributable to between region differences, 1.9% between localities within regions, and 85.7% to within sampling localities (P < 0.001, P ≪ 0.0001, and P ≪ 0.0001, respectively).
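For readers unfamiliar with Jost's D, the Python sketch below computes the uncorrected form of the statistic, D = [(H_T − H_S)/(1 − H_S)] × [n/(n − 1)], from per-population allele frequencies at one locus. It omits the small-sample bias corrections that distinguish the D_est estimator and the bootstrap p-values, so it is an illustration of the underlying formula rather than a reimplementation of the DEMEtics calculation; the function name and toy frequencies are assumptions.

```python
def jost_d(pop_freqs):
    """Uncorrected Jost's D for one locus.

    pop_freqs is a list with one dict per population, mapping allele name to
    its frequency in that population (frequencies sum to 1 per population).
    """
    n = len(pop_freqs)
    if n < 2:
        raise ValueError("at least two populations are required")
    alleles = set().union(*pop_freqs)
    # H_S: mean within-population expected heterozygosity.
    h_s = sum(1.0 - sum(p * p for p in f.values()) for f in pop_freqs) / n
    # H_T: expected heterozygosity of the pooled (mean) allele frequencies.
    mean_freq = {a: sum(f.get(a, 0.0) for f in pop_freqs) / n for a in alleles}
    h_t = 1.0 - sum(p * p for p in mean_freq.values())
    return (h_t - h_s) / (1.0 - h_s) * n / (n - 1)


if __name__ == "__main__":
    # Toy frequencies, not data from this study.
    no_shared_alleles = [{"a1": 0.5, "a2": 0.5}, {"a3": 0.5, "a4": 0.5}]
    identical = [{"a1": 0.7, "a2": 0.3}, {"a1": 0.7, "a2": 0.3}]
    print(jost_d(no_shared_alleles))  # complete differentiation
    print(jost_d(identical))          # no differentiation
```

The multilocus value reported above is the arithmetic mean of the per-locus values: (0.409 + 0.071 + 0.00)/3 = 0.16, and (0.409 + 0.071 + 0.00 + 0.594)/4 = 0.268 once DQA is included.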

Selection and recombination

The average pairwise nucleotide distance (K2P) for the DQB gene was 0.096 (±0.014) and for the DRB gene 0.101 (±0.014). The average pairwise Poisson-corrected amino acid distances for DQB and DRB were 0.195 (±0.039) and 0.180 (±0.034), respectively. We identified, via OmegaMap, two codons (26S, 60N) in the DQB gene and six codons in the MHC DRB gene (11V, 57I, 67L, 71R, 74S, and 78Y) that showed strong and significant effects of positive selection. In both genes, the positively selected amino acid sites had posterior probabilities greater than 95% (Fig. 2). In the DPB gene, no sites were identified to be under strong positive selection. All of the positively selected sites identified for the DQB and DRB genes correspond to proposed peptide-binding sites of human HLA class II molecules (Bondinas et al. 2007; Brown et al. 1993; Stern et al. 1994). OmegaMap identified no particular recombination hot spots for any of the genes. The estimates of population mutation (θ) and recombination (ρ) obtained with the LDhat program show that each of these processes has played a significant role in generating the diversity seen among the hare MHC class II genes DRB, DQB, and DQA (Table 3). The likelihood permutation test (LPT) indicated that the estimates of ρ were significantly different from zero for DRB, DQB, and DQA (Table 3), but not for DPB. The two alternative statistical tests for recombination, r² and D′, also detected recombination rates significantly different from zero in most cases. For r², however, this was only the case for DRB, and all but DPB were significant for D′ (Table 3). The ratios of ρ/θ for the MHC loci DRB, DQB, and DQA (0.6, 1.4, and 0.2, respectively) indicate that the relative contribution of recombination and point mutations has been close to even in the evolution of these genes in brown hares (ρ/θ < 1 indicates a greater role for point mutations, whereas ρ/θ > 1 indicates more influence from recombination).
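The criterion of a posterior probability above 95% can be illustrated with a short Python sketch that flags codons from posterior draws of ω. The input structure (a dictionary of draws per codon) and the function name are hypothetical and do not reflect the OmegaMap output format; the toy draws are invented for the example.

```python
def positively_selected_codons(omega_draws, threshold=0.95):
    """Return codons whose posterior probability of omega > 1 exceeds threshold.

    omega_draws is a hypothetical mapping: codon label -> list of post-burn-in
    posterior draws of omega for that codon.
    """
    flagged = {}
    for codon, draws in omega_draws.items():
        if not draws:
            continue
        # Posterior probability estimated as the fraction of draws above 1.
        prob = sum(w > 1.0 for w in draws) / len(draws)
        if prob > threshold:
            flagged[codon] = prob
    return flagged


if __name__ == "__main__":
    # Invented draws for two placeholder codons, not results from this study.
    toy_draws = {
        "codon_A": [1.5, 2.0, 3.1, 1.2],
        "codon_B": [0.4, 1.3, 0.9, 0.7],
    }
    print(positively_selected_codons(toy_draws))
```

In this toy input only `codon_A` is flagged, because every one of its draws exceeds 1, whereas only a quarter of the draws for `codon_B` do.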

Fig. 2 Omega plot output for (a) the DQB gene and (b) the DRB gene. The dotted line indicates omega (ω) = 1. The unbroken line shows the mean value for ω from the posterior distribution, and the shadowed area displays the 95% confidence interval. Amino acid codes are shown above each plot, and those presumed to be involved in peptide binding based on the human sequence are in bold. Codon positions are given with respect to the standard human HLA numbering and are considered to be significantly under the influence of positive selection if the mean and 95% confidence interval for ω > 1. Positively selected sites for DQB are 26S and 60N and for DRB are 11V, 57I, 67L, 71R, 74S, and 78Y. All positively selected sites correspond to proposed peptide-binding regions

Table 3 Statistics and probability (P) values for population mutation (Watterson’s θ = 4N_eμ) and population recombination (ρ = 4N_er, McVean et al. 2002) of the brown hare MHC class II genes DQB, DRB, and DPB, as well as the hare DQA gene from a previous study (Goüy de Bellocq et al. 2009)

Discussion

We have previously reported moderate diversity and evidence of selection (10 alleles, ω = 2.69) acting on the MHC class II DQA locus in European brown hares (Goüy de Bellocq et al. 2009). In this study, we extend the coverage of the brown hare MHC to include three other class II genes (DPB, DQB, and DRB). Heterozygosities for the three genes varied substantially from 0.18 for DPB in Belgium to 0.784 for DRB in Austria. Austria tended to exhibit higher He values than Belgium for the three genes, but only significantly so for DQB (t = 4.10, p = 0.015). The lower diversity levels detected for Belgian populations fit with a pattern of recent founder events as the species has expanded northwards from refugia in the Balkans since the last glacial maximum (Kasapidis et al. 2005; Stamatis et al. 2009). Spatial differences were clearly evident in the distribution of genetic variation, with significant structure reported between the regions as well as between populations within each region. Although it is possible that demographic processes may explain a portion of this structuring, the lack of population differentiation at one locus (DPB) and the departure from HWE for the DRB locus suggest that differential selection pressures acting on these immune system genes are likely to also play a role. Some caution is needed in interpreting these data, however, as only a subset of individuals (n = 71) was genotyped for the DPB locus and the departure from HWE at DRB may also be the result of some alleles being missed due to the specificity of the forward primer placed in the intron.

The DQB and DRB genes showed evidence of positive selection at putative peptide binding sites similar to that revealed for the DQA gene. For DPB, however, we found no statistical evidence for positive selection and very low levels of diversity, indicating that this locus may be operating under a different selection regime to the others. Our data correspond with human HLA allele frequency data for class II beta genes (Begovich et al. 1992), which suggest that the DRB, DQA, and DQB loci have been subjected to balancing selection, whereas DPB has had a different evolutionary past. This difference in the evidence for selection and the reduced diversity also reported for the human DPB (Bodmer et al. 1999) is presumably, at least in part, a result of the lower levels of linkage disequilibrium between DPB and the other class II genes (Begovich et al. 1992; Petersdorf et al. 2001). We also found no evidence of recombination (Table 3) or linkage disequilibrium at the DPB locus, which supports the notion of a different evolutionary history for this locus compared to the other class II genes in the study.

The evidence for positive selection at DRB and DQB is consistent with findings for class II genes of other mammalian species (Amills et al. 2008; Averdam et al. 2010; Čížková et al. 2011; Schaschl et al. 2006). However, the rather surprising finding is how few sites were identified as being under positive selection compared to other studies. Much of this is probably due to the conservative approach adopted in our analysis to ensure robust identification of positively selected sites. The task of disentangling the co-occurring signals of selection and recombination is a complicated process, but an important one in a system where both are known to play significant roles (Begovich et al. 1992). The evidence for recombination in brown hare MHC genes, while not as strong as for ungulate DRB genes (average ρ/θ = 8.03, Schaschl et al. 2006), suggests that its effects need to be controlled for in tests of selection. We chose to analyze our data via the OmegaMap program, which takes the potential influence of recombination into account in the detection of selection (Wilson and McVean 2006). Other studies which have not acknowledged the likely presence of recombination within their datasets should be treated with caution, as they are perhaps prone to type I error.

To better understand how immune genes are interacting with parasites or pathogens, a detailed analysis of MHC genes is needed, particularly for nonmodel species. MHC molecules play a key role in directing and shaping the T-cell repertoire during T-cell maturation. Heterozygosity or specific alleles at MHC loci may therefore enhance resistance to infectious diseases by generating a more diverse or specific T-cell repertoire. Combined with the previously characterized DQA gene, there is now a useful panel of MHC class II genes available to investigate how adaptive variation is maintained at a broader level in this wild species, as well as allowing researchers more choice in which gene or genes to select for studies of important associations between fitness and immunogenetic diversity.