Introduction

The major histocompatibility complex (MHC) is a gene complex that encodes molecules of vital importance for the adaptive immune system of vertebrates (Klein 1986; Goldsby et al. 2000; Janeway et al. 2008). The MHC is known to be a remarkably diverse gene complex, containing many loci, a high level of polymorphism and high nucleotide diversity (Klein 1986; Nei and Hughes 1991). The high number of MHC loci is thought to have arisen by gene duplication, since many MHC genes are very similar in sequence (Beck et al. 1999; Hess and Edwards 2002). Balancing selection is thought to maintain MHC diversity and prolong the lifetime of MHC alleles (Potts and Wakeland 1990; Hedrick 2002). Several mechanisms of balancing selection have been proposed (Bodmer 1972; Potts and Wakeland 1990; Nei and Hughes 1991; Borghans et al. 2004; Oosterhout 2008), most of which belong to one of two categories: hypotheses involving pathogen driven selection and those involving mate choice. The pathogen driven selection hypotheses are based on the assumption that heterozygous individuals and/or individuals possessing rare alleles have increased survival probabilities and are more likely to produce offspring, e.g. due to unpredictable pathogen distributions. The mate choice hypotheses assume that selection is imposed by a female preference for mates with genotypes that increase the survival probabilities of their offspring. Selection of males with favourable genotypes can occur either pre- or post-copulatory (Potts and Wakeland 1990; Zeh and Zeh 1997).

The MHC can be divided into three classes: class I (MHC-I) and class II (MHC-II) genes are similar in sequence and are involved in the adaptive immune system, while class III genes encode molecules involved in the nonspecific immune response (Goldsby et al. 2000; Janeway et al. 2008). Class I molecules present peptides that arise from proteins in the cytoplasm and in contiguous compartments such as the nucleus (often from intracellular pathogens), while class II molecules present peptides that arise in intracellular vesicles and the extracellular space (often from extracellular pathogens). However, there is also a phenomenon of “cross-presentation”, in which, for example, extracellular material is presented by class I molecules (Goldsby et al. 2000; Janeway et al. 2008). Specific regions of the MHC (exons 2 and 3 in MHC-I and exon 2 in MHC-II) encode the peptide-binding regions (PBRs) that bind foreign peptides (Goldsby et al. 2000; Janeway et al. 2008). Since each MHC molecule can successfully bind a limited number of peptides, greater polymorphism at the PBRs should increase the number of pathogens that can be recognized by an individual, resulting in balancing selection (Hughes and Nei 1988, 1989).

The MHC of the chicken (Gallus domesticus) was the first to be characterized in a bird. The chicken MHC is smaller with denser gene regions and smaller introns, compared to the human MHC, called human leukocyte antigen complex (HLA). It is also less diverse than the HLA complex with only two class I and two class II loci. The chicken is therefore said to have a “minimal essential MHC” (Kaufman et al. 1995, 1999). Another remarkable feature of the chicken MHC is that it has two gene complexes (the classical BLB and non-classical YLB loci) that both contain MHC-I and II alleles, but these loci segregate independently (Miller et al. 1994; Hunt et al. 2006). At least one of the YLB loci is known to be expressed (Hunt et al. 2006). The BLB and YLB loci have also been reported in the black grouse (Tetrao tetrix) and the ring-necked pheasant (Phasianus colchicus) (Wittzell et al. 1995; Strand et al. 2007) and are thought to be a feature of galliform birds.

In general, the MHC of passerine birds appears to be more diverse in terms of loci, polymorphism and the existence of pseudogenes than the MHC of galliform birds (with the exception of the Japanese quail, Coturnix japonica; Shiina et al. 1995; Westerdahl et al. 2000; Westerdahl 2007). In passerines, studies have mainly focused on MHC-II (e.g. Vincek et al. 1995; Westerdahl et al. 2000; Edwards et al. 2000; Freeman-Gallant et al. 2002; Jarvi et al. 2004; Miller and Lambert 2004; Richardson et al. 2005; Aguilar et al. 2006), while MHC-I has been investigated in only a few species (the house sparrow, Passer domesticus, Bonneaud et al. 2004, the Seychelles warbler, Acrocephalus sechellensis, Richardson and Westerdahl 2003, the great reed warbler, Acrocephalus arundinaceus, Westerdahl et al. 1999, and the scarlet rosefinch, Carpodacus erythrinus, Promerová et al. 2009). Pathogens are, as mentioned above, thought to maintain MHC diversity, and in the last 10 years avian malaria has been investigated extensively in wild bird populations. A wide array of malaria strains infects most bird species, and these parasites potentially exert a substantial selection pressure for MHC diversity. Avian malaria has both extra- and intracellular stages and hence both MHC-I and MHC-II are likely to be involved in the adaptive immune response against malaria parasites (Peirce 1981; Valkiunas 2005; Janeway et al. 2008).

Here, we investigate the MHC-I of the blue tit (Cyanistes caeruleus) across three populations in Europe. In blue tits, several species of intracellular parasites, including avian malaria, are known to negatively affect the survival and reproductive success of infected individuals (Merino et al. 2000; Cichon and Dubiec 2005; Tomás et al. 2007; Arriero et al. 2008; Knowles et al. 2010). Blue tits typically breed as socially monogamous pairs (del Hoyo et al. 2007), but studies in numerous populations revealed that broods often contain offspring sired by males other than the social partner (extra-pair offspring; Kempenaers et al. 1997; Leech et al. 2001; Delhey et al. 2003; Brommer et al. 2006; Magrath et al. 2009). Consequently, the blue tit is an ideal species for investigating the potential role of MHC in both pre- and post-copulatory mate choice. Characterising MHC-I of the blue tit is an important first step towards future investigation into the role of pathogen-mediated selection and mate choice in the maintenance of MHC diversity in passerines.

Specifically, this study aims to: (1) partly characterize the diversity of the MHC-I genes in blue tits using individuals from three populations across their range, (2) compare the blue tit MHC-I diversity with that of other passerine species and (3) gain insights into the selection pressures acting on PBR and non-PBR regions within the blue tit MHC.

Methods

Study species

The blue tit is a common passerine species with a wide distribution; the Cyanistes caeruleus caeruleus subspecies extends throughout Europe and Western and Northern Asia (del Hoyo et al. 2007). We defined birds as migratory when seasonal migration between breeding and wintering grounds occurs, since this may be of importance for the diversity of pathogens encountered during an individual’s lifetime (Møller and Erritzøe 1998). Non-migratory individuals may still disperse. According to our definition, the blue tit is non-migratory across the majority of its distribution, although large-scale dispersal as well as partial migration occurs in the northern part of its range (e.g. Sweden; Smith and Nilsson 1987; Cramp and Perrins 1993).

Sample collection

We used DNA samples from populations in three countries: (1) the Netherlands (NL, the Vosbergen estate, near Groningen, 53°08′ N, 06°35′ E), (2) Spain (Sp, Valsain, central Spain, 40°49′ N, 3°56′ W) and (3) Sweden (Sw, 55°41′ N, 13°26′ E). From the Netherlands, we sequenced both gDNA and cDNA in three individuals and only gDNA in an additional three individuals. From Spain and Sweden, we sequenced seven individuals each (five gDNA, two cDNA; for practical reasons gDNA and cDNA samples were taken from different individuals in Spain and Sweden).

Birds were trapped in nest boxes and blood samples were collected from the brachial vein. Blood samples for DNA analysis were stored in 99% ethanol. An ammonium acetate or phenol/chloroform method was used for DNA extraction (Richardson et al. 2001; Sambrook et al. 2001). For RNA collection, 80–100 μL of blood was added to 100 μL K2EDTA (0.2 M) and 500 μL of Trizol-LS was added immediately after (following Miller and Lambert 2003). All samples were then stored at 4°C.

Restriction fragment length polymorphism

To estimate the number of class I loci in the blue tit genome, a restriction fragment length polymorphism (RFLP) analysis was performed. A restriction digest and Southern blot were performed on 7 μg genomic DNA using the restriction enzyme PvuII and a radioactively labelled class I exon 3 clone (consisting of a purified 215 bp PCR product, see below; for details see Westerdahl et al. 1999).

Sequencing

Initially, several primer combinations were used for gDNA sequencing [PcaH1-A23H3, PcaH1 grw-A23H3; Balakrishnan et al. 2010, PcaH2-A23H3 and A21B (Bonneaud et al. 2004) -A23H3, Table 1 in Supplementary material]. The primer combination A21B-A23H3 was most successful and we continued with only this combination after initial testing. These primers amplified the major part of exon 3 (215 bp, primers not included, Fig. 1 in Supplementary material). The following PCR protocol was used for DNA amplification: 94°C for 2 min, then 35 cycles of (94°C for 30 s, TA for 30 s, 72°C for 30 s), then 72°C for 10 min and finally 4°C on a thermal cycler before the samples were stored at 4°C (see Table 1 in Supplementary material for the annealing temperature, TA). Reagent concentrations: genomic DNA, 50 ng; primers, 0.5 μM; dNTP, 0.15 mM; 10x buffer; MgCl2, 1.5 mM; Taq, 5 U; final volume, 40 μL (AmpliTaq DNA polymerase with GeneAmp, Applied Biosystems, USA). A ligation reaction was performed, in which the PCR products were cloned into a bacterial vector (TOPO TA cloning kit, Invitrogen, California, USA). Between 5 and 20 bacterial colonies per individual were amplified (using the primers of the cloning kit, M13fw-M13rv) and sequenced on a capillary sequencer (ABI prism 3730, Applied Biosystems, California, USA) according to a standard big dye protocol (Big Dye Terminator mix V3.1, Applied Biosystems).
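The thermal profile described above can be written down as a simple data structure, which makes the cycle structure and the total hold time explicit. This is only an illustrative sketch: the function names are hypothetical, the annealing temperature of 55°C is a placeholder (the actual TA is primer-specific and listed in the Supplementary material), and ramping time between steps is ignored.

```python
def pcr_profile(annealing_temp_c, cycles=35):
    """Return the programme as a list of (label, temperature in C, seconds)."""
    steps = [("initial denaturation", 94, 120)]
    for _ in range(cycles):
        steps += [
            ("denaturation", 94, 30),
            ("annealing", annealing_temp_c, 30),
            ("extension", 72, 30),
        ]
    steps.append(("final extension", 72, 600))
    return steps


def hold_time_minutes(steps):
    """Sum of programmed hold times, excluding ramping and the final 4 C hold."""
    return sum(seconds for _, _, seconds in steps) / 60


# 55 is a placeholder value, not the annealing temperature used in the study.
profile = pcr_profile(annealing_temp_c=55)
print(len(profile), "steps,", hold_time_minutes(profile), "min of hold time")
```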

We extracted and cleaned the RNA samples using the RNeasy cleanup kit (Qiagen, Hilden, Germany). A two-step RT-PCR reaction was then performed using the Retroscript kit according to protocol (Ambion, Applied Biosystems) with the A21B-A23H3 primers and finally the obtained cDNA was amplified, ligated and sequenced (see above).

Definition of alleles

All sequences were compared against previously published avian MHC-I sequences (NCBI GenBank) using BLAST for confirmation, and the MHC-I sequences were aligned using BioEdit (Hall 2009). Only completely identical sequences found in two independent PCR events (from either RNA or DNA) were defined as alleles. We refer to these alleles as verified, since the same sequence is unlikely to have arisen twice from independent amplification errors. Throughout this paper, sequences are reported without the primers. The word “allele” is used to indicate a 215 bp exon 3 sequence derived from cDNA and/or gDNA.
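The verification rule can be illustrated with a short sketch. It assumes that every cloned sequence is labelled with the PCR reaction it came from. The identifiers and toy sequences below are hypothetical and not taken from the study.

```python
from collections import defaultdict


def verified_alleles(clones):
    """Return the sequences observed in at least two independent PCR events.

    `clones` is an iterable of (pcr_id, sequence) pairs; several clones from
    the same PCR count as a single observation of that sequence.
    """
    pcrs_by_sequence = defaultdict(set)
    for pcr_id, sequence in clones:
        pcrs_by_sequence[sequence.upper()].add(pcr_id)
    return {seq for seq, pcrs in pcrs_by_sequence.items() if len(pcrs) >= 2}


# Toy data: "TTTT" occurs in one PCR only and is therefore not verified.
clones = [
    ("pcr1", "ACGT"), ("pcr1", "ACGT"), ("pcr2", "ACGT"),
    ("pcr2", "ACGA"), ("pcr3", "acga"), ("pcr3", "TTTT"),
]
print(sorted(verified_alleles(clones)))
```

Requiring exact identity across independent reactions is conservative: a real allele that happened to be amplified only once is discarded together with PCR artefacts.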

Analysis of sequences

A phylogeny of the identified alleles was derived in PAUP* v.4.0b10 (Ronquist and Huelsenbeck 2003). The model of nucleotide evolution was selected according to the Akaike information criterion with MrModeltest2.3 (Nylander 2004): GTR was used as the substitution model, while across-site mutation rates were assumed to be gamma distributed. We analysed the dataset in MrBayes3.1.2 (Huelsenbeck and Ronquist 2001, 2003). We ran four Markov chains for 5,000,000 generations in two parallel replicates with the chain heating parameter set to 0.15. Trees were sampled at intervals of 1,000 generations, and posterior probabilities were calculated from 2,000 trees after excluding 3,000,000 generations as burn-in. As an outgroup, we used a great reed warbler (A. arundinaceus) sequence.
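The number of trees retained after burn-in follows directly from these settings. The arithmetic below is a sketch with illustrative variable names. It ignores any tree sampled at generation 0, and it assumes that the reported figure of 2,000 trees refers to a single replicate, which the text does not state explicitly.

```python
generations = 5_000_000
sample_interval = 1_000
burn_in_generations = 3_000_000
replicates = 2

sampled_per_run = generations // sample_interval            # trees written per replicate
discarded_per_run = burn_in_generations // sample_interval  # trees dropped as burn-in
kept_per_run = sampled_per_run - discarded_per_run

print(kept_per_run)               # trees retained per replicate
print(kept_per_run * replicates)  # trees retained if both replicates are pooled
```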

The MHC-I sequences of the blue tit previously reported by Foerster et al. (2006) were added from NCBI GenBank (accession numbers AM232710-14). These sequences grouped with our sequences in the phylogenetic tree of alleles (data not shown).

The amino acids that comprise the PBR were superimposed on the blue tit sequences using the great reed warbler sequences (Westerdahl et al. 1999). To determine whether the PBR has been under selection, we calculated values of dN/dS (Hughes and Nei 1989; Page and Holmes 1998) using the Nei-Gojobori method and performed a codon-based Z test (Nei-Gojobori, Jukes-Cantor) in MEGA 4.1 (Tamura et al. 2007) to test for selection on the PBR vs. non-PBR regions. Tajima’s D was calculated in Arlequin (Excoffier et al. 2005; Tamura et al. 2007) as an additional indication of selection. The value of dN/dS is the ratio of the rate of non-synonymous substitutions (i.e. substitutions that change the amino acid sequence) to the rate of synonymous substitutions (i.e. substitutions that leave the amino acid sequence intact), while Tajima’s D compares the average number of pairwise differences between sequences with the number of polymorphic sites. A dN/dS ratio larger than 1 as well as positive Tajima’s D values are indicative of balancing selection.
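As a rough illustration of the last step of such a calculation, the sketch below applies the Jukes-Cantor correction to the proportions of non-synonymous and synonymous differences and forms their ratio. It is not the MEGA implementation: counting synonymous and non-synonymous sites and differences (the Nei-Gojobori step) is assumed to have been done already, and the counts in the example are invented for illustration.

```python
import math


def jukes_cantor(p):
    """Jukes-Cantor distance for an observed proportion of differences p."""
    if not 0 <= p < 0.75:
        raise ValueError("proportion of differences must be in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)


def dn_ds(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Return (dN, dS, dN/dS) from already counted differences and sites."""
    d_n = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    d_s = jukes_cantor(syn_diffs / syn_sites)
    ratio = d_n / d_s if d_s > 0 else float("inf")
    return d_n, d_s, ratio


# Hypothetical counts for a short peptide-binding region.
d_n, d_s, ratio = dn_ds(nonsyn_diffs=6, nonsyn_sites=40, syn_diffs=1, syn_sites=14)
print(round(d_n, 3), round(d_s, 3), round(ratio, 2))
```

A ratio above 1 in this sketch only says that corrected non-synonymous divergence exceeds synonymous divergence. Whether the excess is significant is what the codon-based Z test addresses.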

Conserved sites within exon 3 have previously been described in the chicken (Livant et al. 2004), the Japanese quail (Shiina et al. 1995), the duck (Anas platyrhynchos; Mesa et al. 2004), the great reed warbler (Westerdahl et al. 1999) and the scarlet rosefinch (Promerová et al. 2009). To investigate whether the same sites were conserved in the blue tit, the blue tit alleles were compared to previously published MHC-I sequences of the above-mentioned species. These sequences were added from NCBI GenBank (for accession numbers see Fig. 1). Conserved sites were extrapolated following Kaufman et al. (1994).

Fig. 1

Amino acid sequences of exon 3 of the blue tit MHC-I, aligned with amino acid sequences of exon 3 of several other bird species: the chicken (Gallus gallus, Gaga BF*J3, accession number AY327148, alpha 2 region only, Livant et al. 2004), the Japanese quail (C. japonica, Coja, D29813, Shiina et al. 1995), the duck (A. platyrhynchos, Anpl, AY294416, Mesa et al. 2004), the great reed warbler (A. arundinaceus, Acar cN20 exon 3, Westerdahl et al. 1999) and the scarlet rosefinch (C. erythrinus, Caer U*01, FJ392762, Promerová et al. 2009), added from NCBI GenBank. Asterisks mark the PBR. Shaded areas mark the conserved sites, and the numbers below the figure give their names following Kaufman et al. 1994. The columns on the right indicate whether the alleles were found in cDNA and gDNA.

Our verified alleles were added to NCBI GenBank (accession numbers JF742764-80). The blue tit was recently renamed C. caeruleus, but to name our sequences, we use Parus caeruleus to ensure consistency with blue tit MHC-I sequences previously published in NCBI GenBank. To avoid confusion with blue tit MHC-I sequences previously published in GenBank, we numbered our alleles Paca UA*101-Paca UA*117.

Species comparison

To compare genetic diversity at MHC-I exon 3 between the passerine species studied to date, we calculated the nucleotide diversity of the alleles of each species using Arlequin version 3.11 (Excoffier et al. 2005). Nucleotide sequences were obtained from NCBI GenBank. In the blue tit, the phylogenetic tree of alleles revealed a distinct cluster of alleles with very little sequence diversity. For the species comparison of nucleotide diversity, only the eight alleles outside this cluster were used, since these are the alleles with the highest sequence diversity and are most likely to be under strong selection. This pattern of clustering was not found in any of the other species included in this analysis, so for each of these species we randomly selected eight alleles to allow for a comparison of sequence diversity based on equal numbers of alleles.
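The two diversity measures used in this comparison, and the random subsampling to eight alleles, can be sketched as follows. This is not Arlequin's implementation: each allele is counted once (no allele frequencies), sequences are assumed to be aligned and free of gaps, and the function names, the fixed random seed and the toy sequences are all illustrative choices.

```python
import random
from itertools import combinations


def segregating_sites(seqs):
    """Number of alignment columns with more than one nucleotide (S)."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def nucleotide_diversity(seqs):
    """Mean proportion of differing sites over all pairs of sequences (pi)."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    differences = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return differences / (len(pairs) * length)


def subsample(seqs, k=8, seed=0):
    """Draw k alleles at random, without replacement, from one species."""
    return random.Random(seed).sample(seqs, k)


toy = ["AAAA", "AAAT", "AATT"]
print(segregating_sites(toy), round(nucleotide_diversity(toy), 3))
```

Because π depends on which alleles are drawn, repeating the subsampling with several seeds would show how sensitive a between-species comparison is to the particular draw.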

Results

Diversity

Population level

A total of 17 MHC-I alleles were verified in 234 sequences obtained from 20 individuals that originated from Sweden, Spain and the Netherlands (Fig. 1 in Supplementary material). At least eight of these alleles were transcribed, as they were found in cDNA (Fig. 1, Table 2 in Supplementary material), and seven of these eight transcribed alleles translated into different amino acid sequences. In total, 13 different amino acid sequences were found (Fig. 1). Alleles UA*101, UA*110, UA*106 and UA*113 have identical amino acid sequences, as do alleles UA*105 and UA*115 (Fig. 1). No gaps, shifts in reading frame or nonsense codons were detected. Transcribed alleles were found throughout the phylogenetic tree (Fig. 2).

Fig. 2

Phylogenetic tree for class I alleles in the blue tit. Numbers in the tree indicate the posterior probabilities expressed as a percentage (values below 50 not shown). The alleles of the allelic cluster with low diversity (group 1 supported by a posterior probability value of 100) are indicated by white dots, while all other alleles (group 2) have black dots. Abbreviations indicate from which population the DNA sample was taken (NL the Netherlands, SW Swedish, SP Spanish)

Seven amino acid residues that are conserved in exon 3 of MHC-I across bird species have been described (named Y123, T143, K146, W147, Y159, L160 and Y171 in the chicken; Kaufman et al. 1994, see also Shum et al. 1999; Mesa et al. 2004). In the blue tit, five of these sites were also conserved (Y123, T143, W147, Y159 and Y171). K146 is polymorphic in the blue tit and has different amino acids compared to all other bird species, except for the scarlet rosefinch. L160 is also polymorphic in the blue tit: group 2 has the common leucine (L), while group 1 has glutamine (Q) (Fig. 1).

Individual level

There was evidence for at least four class I loci because we found a maximum of seven alleles (215 bp sequences) per individual (individual R, Table 2 in Supplementary material), and a diploid individual carries at most two alleles per locus. This finding was further supported by the RFLP analysis, where each individual had between five and eight RFLP bands (out of a total of ten RFLP bands found in the two families), each RFLP band representing approximately one MHC allele. Four RFLP bands were non-variable and occurred in all individuals (bands no. 1, 4, 8 and 10, Fig. 3).
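The lower bound on the number of loci follows from the fact that a diploid individual carries at most two alleles per locus. A minimal sketch of that reasoning is given below (the function name is illustrative):

```python
import math


def minimum_loci(max_alleles_per_individual, ploidy=2):
    """Smallest number of loci that can hold the observed alleles."""
    return math.ceil(max_alleles_per_individual / ploidy)


print(minimum_loci(7))  # seven sequenced alleles in one individual
print(minimum_loci(8))  # eight RFLP bands, assuming one band per allele
```

Both lines of evidence give the same minimum, but it remains a lower bound: homozygous loci, or alleles missed by the primers, would lead to an underestimate.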

Fig. 3

RFLP gel of two blue tit families. The top row of numbers indicates individuals: M1 and F1 are parents of individuals 1–7 (M male, F female) and M2 and F2 are parents of individuals 8–13. The second row indicates the total number of bands present for each individual. A size standard in kilobase is shown on the right. The position of all RFLP bands found is indicated on the left of the gel. All RFLP bands were between 2.3 and 6.6 kb in length

Selection

Phylogeny

One distinct cluster containing nine MHC-I alleles was observed in the phylogenetic tree (supported by a posterior probability value of 100, Fig. 2). These nine alleles will be referred to as group 1, while the remaining eight alleles will be referred to as group 2. The clustering of the alleles in group 2 lacks phylogenetic support. The nucleotide diversity (π) and the number of segregating sites (S) within group 1 were significantly lower than in group 2 (group 1, π = 0.017, ±0.011 SD, S = 12; group 2, π = 0.056, ±0.033 SD, S = 27, Table 1, p < 0.05, t test). Five of the segregating sites contained the same polymorphism in groups 1 and 2. In addition, four sites were monomorphic within group 1 and monomorphic in group 2 but differed between the groups (Fig. 1 in Supplementary material). Overall, the nucleotide diversity was 0.059, while there were 38 segregating sites.
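The reported counts of segregating sites can be reconciled with simple bookkeeping. The sketch below is one reading of the numbers that reproduces the overall total. It assumes that the five shared sites are polymorphic in both groups (and so would otherwise be counted twice) and that the four sites fixed for different nucleotides in the two groups segregate only in the pooled sample.

```python
s_group1 = 12           # segregating sites within group 1
s_group2 = 27           # segregating sites within group 2
shared_polymorphic = 5  # sites polymorphic in both groups
fixed_differences = 4   # monomorphic within each group, different between groups

s_pooled = s_group1 + s_group2 - shared_polymorphic + fixed_differences
print(s_pooled)
```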

Table 1 Indicators of selection on MHC-I

There was no geographical separation of alleles across the phylogenetic tree, as alleles from all three populations (Spain, the Netherlands and Sweden) were distributed across the entire tree and frequently shared between sample locations (Fig. 2). Six out of the total of 17 alleles were found in all three locations (UA*101, 103, 104, 105, 108, 112), three that belong to group 1 and three to group 2. Some alleles were only found in one location (UA*102, 111, 113, 114, 117, Fig. 2). However, these alleles were highly similar to those found in other locations. Expressed alleles were found in all parts of the phylogenetic tree and in both groups 1 and 2. We found evidence for at least four group 1 loci and two group 2 loci (Table 2 in Supplementary material).

Selection indices

We found significant balancing selection for the PBR of group 2 (dN/dS = 5.488, Z = 2.98, p = 0.02), but no evidence of any kind of selection (i.e. balancing or purifying) acting on the PBR of group 1 (dN/dS = 0.338, Z = −0.50, p = 0.62). None of the non-PBR regions were under significant selection (group 1, dN/dS = 0.179, Z = −1.64, p = 0.10; group 2, dN/dS = 0.805, Z = −0.33, p = 0.74; Table 2; codon-based Z test, Nei-Gojobori, Jukes-Cantor).

Table 2 Overview of the number of non-synonymous (dN) and synonymous (dS) mutations in the PBR and other regions (non-PBR) for class I

The positive values of Tajima’s D in group 2 (D = 1.26, p = 0.82 and D = 0.51, p = 0.62 for PBR and non-PBR, respectively) suggest that alleles in this group have been maintained in the population for longer than expected under neutrality (as the phylogenetic tree indicates, Fig. 2). This effect was most pronounced in the PBR. Both PBR and non-PBR regions of group 1 appear to be under purifying selection, as indicated by negative values of Tajima’s D (D = −0.69, p = 0.19 and D = −0.94, p = 0.15 for PBR and non-PBR, respectively). However, none of the Tajima’s D values provide significant evidence for a deviation from neutrality (Table 1).
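For readers unfamiliar with the statistic, the sketch below computes Tajima's D from a set of aligned sequences using the standard formula (Tajima 1989), which compares the mean number of pairwise differences with the number of segregating sites. It is not Arlequin's code: it assumes gap-free alignments, does not compute p values, and the toy sequences are invented to show how the sign of D arises.

```python
import math
from itertools import combinations


def tajimas_d(seqs):
    """Tajima's D for a list of aligned, equal-length sequences."""
    n = len(seqs)
    if n < 4:
        raise ValueError("at least four sequences are needed")
    s = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if s == 0:
        raise ValueError("no segregating sites; D is undefined")
    pairs = list(combinations(seqs, 2))
    # Mean number of pairwise differences (k).
    k = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (k - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Two divergent haplotypes at equal frequency: excess of pairwise differences.
balanced = ["AAAA", "AAAA", "TTTT", "TTTT"]
# One divergent haplotype present once: excess of rare variants.
singleton = ["AAAA", "AAAA", "AAAA", "TTTT"]
print(tajimas_d(balanced) > 0, tajimas_d(singleton) < 0)
```

Intermediate-frequency variants inflate the pairwise term and push D above zero, whereas rare variants inflate the number of segregating sites relative to the pairwise term and push D below zero.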

Species comparison

We found that the nucleotide diversity of blue tits was significantly lower than that of any of the other species, including the inbred Seychelles warbler (p < 0.05). The great reed warbler had the highest nucleotide diversity (Table 3).

Table 3 Measures of genetic diversity [number of segregating sites (S), Tajima’s D and the nucleotide diversity (π; ±SD)] at the MHC-I for five species of passerine birds

Discussion

Selection pressures

It is reasonable to expect that the prevalence and composition of pathogens vary between our three blue tit populations, from Spain in the south to Sweden in the north, possibly reflecting climatic differences (e.g. Bensch and Akesson 2003; Merino and Møller 2010). Therefore, one may expect different selection pressures on the MHC, resulting in population differentiation in MHC alleles. Our phylogenetic tree of alleles shows no obvious geographical relationship to the three source locations. One of the characteristic features of the MHC genes is that alleles can persist for a long time and that trans-species polymorphism (MHC alleles or lineages that are shared between diverged species) is common (Edwards and Hedrick 1998; Westerdahl 2007). In a phylogeny containing the MHC alleles of closely related species, the alleles rarely cluster in a species-specific manner. Most likely, a geographical relationship to the three source populations in the phylogenetic tree should only be expected when there are substantial differences in the selection pressures operating. Hence, it is perhaps not surprising that we did not find evidence of population differentiation between our sample locations. In a study of population differentiation among three Belgian blue tit populations (on a much smaller spatial scale than our study), Verheyen et al. (1995) found significant population differentiation using selectively neutral markers. Ekblom et al. (2007) studied population differentiation in MHC-II in Scandinavian and east European populations of the great snipe (Gallinago media) and found significant geographical differentiation, but this was reflected in allele frequencies rather than in phylogeny. Therefore, we may expect to find a geographical structure in MHC-I when studying allele frequencies. Unfortunately, our present results do not allow us to test for differences in allele frequencies, since we sampled a limited number of individuals in each population.

The theory of pathogen-driven selection predicts that pathogen distributions vary over time and in space and that selection acts on a species to keep up with its changing environment (Bodmer 1972; Hedrick 2002). To recognize novel pathogens, the PBRs of MHC molecules must be able to adapt rapidly and are expected to be under balancing selection (Hughes and Nei 1988, 1989). We found that the dN/dS ratio in the PBR of the alleles in group 2, the diverse allele group, was significantly higher than 1. Furthermore, the positive (though non-significant) value of Tajima’s D also indicated that group 2 has deeper branches than expected under neutrality. A positive Tajima’s D value could be caused by demographic events in the population history (e.g. a population contraction or population subdivision) or by the occurrence of balancing selection. If the demographic history was responsible for the positive Tajima’s D value in the PBR of group 2, we would also expect to observe a positive value for the non-PBR regions, since they are regions within the same gene locus, but we did not. Therefore, balancing selection acting on the PBR of group 2 is the most likely explanation. It should be noted that we calculated dN/dS and Tajima’s D from sequences derived from at least four loci, though ideally they should be calculated within one locus. This may result in an overestimation of the number of synonymous substitutions (Hughes and Nei 1989). However, these numbers should be overestimated in both PBR and non-PBR regions, so it is safe to conclude that the PBR is under stronger positive selection than non-PBR regions in group 2.

The MHC-I alleles in group 1 have dN/dS ratios lower than 1 and negative Tajima’s D values for the PBR as well as the non-PBR regions. These values are indicative of negative selection on group 1, though the evidence was not statistically significant. The alleles in group 1 also have remarkably little sequence variation, which could be explained by the alleles originating from a bottlenecked population (while the alleles of group 2 diversified under selection) or by strong negative selection. A bottleneck has occurred in the phylogenetic history of the blue tit (Kvist et al. 2004), but we also found some evidence for negative selection. Since the bottleneck in the blue tit population was not very severe, we cannot currently explain the lack of variation in group 1.

MHC-I compared to other passerines

We found evidence for the existence of at least four MHC-I loci in the blue tit. MHC-I has been described for a small number of passerine species: the Seychelles warbler, great reed warbler (Westerdahl et al. 1999; Richardson and Westerdahl 2003), scarlet rosefinch (Promerová et al. 2009), house sparrow (Bonneaud et al. 2004) and now the blue tit. These species represent phylogenetically very different groups within the passerines. Yet, the number of MHC-I loci is very similar in these species, with the exception of the great reed warbler (five loci in the house sparrow; Bonneaud et al. 2004; five in the scarlet rosefinch; Promerová et al. 2009; eight in the great reed warbler; Westerdahl et al. 1999). Preliminary RFLP results suggest that the number of MHC-I loci in the Seychelles warbler is also higher, so the high number of MHC-I loci in the great reed warbler could be representative of this taxonomic group (Richardson and Westerdahl 2003).

It has previously been proposed that the MHC should be more diverse in migratory than in non-migratory species, since migratory species encounter more pathogen species and strains (Møller and Erritzøe 1998; Westerdahl et al. 2000). We did not find a consistent pattern when comparing the number of MHC-I loci or nucleotide diversity between migratory (great reed warbler, scarlet rosefinch) and non-migratory (house sparrow, Seychelles warbler, blue tit) passerines. Genetic diversity within species or populations may further be reflected in genetic polymorphism, a value difficult to compare between species, since it is likely to be correlated to the number of individuals sampled. Our inability to detect a difference in MHC-I diversity may be due to the very small number of species we compared. Hopefully, the characterisation of MHC-I for other passerine species will allow a more robust comparison in the near future.

The nucleotide diversity and the number of segregating sites in the blue tit MHC-I alleles were remarkably low compared to other passerine species, even though only the variable blue tit alleles in group 2 were taken into account. The nucleotide diversity and number of segregating sites are also lower than those of the closely related great tit (Parus major). One explanation for the low diversity could be that the blue tit underwent a population bottleneck, after which the population expanded and spread throughout Europe (Kvist et al. 2004). Kvist et al. (2004) suggest that this bottleneck took place during the last ice age, when birds were left in two refuges, the Iberian Peninsula and the Balkans, resulting in the C. c. caeruleus and Cyanistes caeruleus ogliastrae subspecies. Hence, the C. c. caeruleus subspecies arose from the refuge population in the Balkans and this bottleneck may explain the low MHC-I nucleotide diversity that we observed. The population history of the great tit is thought to be comparable to that of C. c. caeruleus, although the blue tit diverged more recently than the great tit (Kvist et al. 1999a, b), which may partly explain why the nucleotide diversity and number of segregating sites of the great tit are higher. In a species comparison of genetic variation in mitochondrial DNA, Kvist et al. (1999a, b) found that levels of nucleotide diversity are comparable in the blue tit and the great tit (Kvist et al. 1999a). Surprisingly, the nucleotide diversity and number of segregating sites of MHC-I were even lower in the blue tit than in the Seychelles warbler, a species that underwent a severe recent (1920– approx. 1968) population bottleneck (Komdeur and Pels 2005). The bottleneck that the blue tit population went through is less recent and less severe than the bottleneck in the Seychelles warbler population, so we would expect more genetic variation in the blue tit MHC.

Another possible explanation for the low nucleotide diversity and number of segregating sites in the blue tit is a lack of diversifying selection, which could occur in the absence of pathogens. However, there is evidence that blue tits are commonly infected with several species of blood parasites (such as avian malaria; Merino et al. 2000; Cichon and Dubiec 2005; Tomás et al. 2007; Arriero et al. 2008; Knowles et al. 2010) and we have no reason to believe that pathogen pressures are lower in blue tits than in other passerine species. If selection pressures from pathogens had been relaxed during a longer period of the blue tit’s population history, one would expect this to be reflected in a relatively low number of loci and in loci becoming non-functional.

Our characterisation of MHC-I in the blue tit shows the existence of one phylogenetic cluster (group 1 versus the remaining alleles named group 2) with very low sequence diversity and indications of purifying rather than balancing selection, while the remaining alleles show the expected MHC characteristics. A non-variable MHC-I allele cluster has previously been found in another passerine, the house sparrow (Bonneaud et al. 2004). For MHC-II, the division in two allele clusters that differ in nucleotide diversity has been described in Hawaiian honeycreepers (Drepanidinae) and Darwin’s finches (Geospizinae) (Jarvi et al. 2004). The existence of two gene clusters that segregate independently (the classical BLB and non-classical YLB) is known in several galliform species (Miller et al. 1994; Wittzell et al. 1995; Hunt et al. 2006; Strand et al. 2007). Purifying selection is one of the defining characteristics of non-classical MHC alleles (Janeway et al. 2008) and a possible explanation for the lack of polymorphism in the alleles in group 1 could be that these alleles are non-classical in origin, while the alleles in group 2 are classical. So far, we have too little background information on the blue tit MHC to determine whether the alleles we found are classical or non-classical.

We found evidence (although not statistically significant) that the alleles in group 1 are under purifying selection, which could explain the low nucleotide diversity found in this group of alleles. It seems possible that the molecules derived from alleles in group 1 perform an essential function in immune recognition and that selection acts to preserve them. Purifying selection on alleles essential for immune recognition could, through convergent evolution, have led to the existence of a non-variable gene cluster in several passerine species.

One could argue that the alleles in group 1 may be a radiation of recently originated alleles. If the alleles in group 1 were of recent origin, we would expect the values for dN and dS to be similar, since selection has not yet had a strong effect. When a phylogeny is drawn including only synonymous substitutions, the clustering in our phylogenetic tree completely disappears (data not shown) and we conclude that selection must be involved in creating this gene cluster. One may also argue that more clusters of highly similar alleles may exist in the blue tit and that the alleles in group 2 belong to one or more of these clusters, but that we simply did not find the alleles in the other clusters due to a limited sample size. In that case, we would erroneously draw the conclusion that there is a large difference in diversity between group 1 and group 2, since we are comparing the group 1 alleles to alleles potentially belonging to several clusters. For a follow-up project, we designed primers to specifically amplify the alleles of group 2 (data not shown), but we found no evidence for another gene cluster with low diversity.
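The distinction between synonymous and nonsynonymous substitutions used in this argument can be illustrated with a minimal sketch. It is not the analysis performed in this study: the function name and the toy sequences are hypothetical, and the code only returns raw counts of codon differences between two aligned, in-frame sequences under the standard genetic code. A proper estimate of substitution rates would additionally normalise by the numbers of synonymous and nonsynonymous sites and correct for multiple hits.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def count_codon_differences(seq_a, seq_b):
    """Return (synonymous, nonsynonymous, unclassified) codon differences.

    Hypothetical helper for illustration only. Codons that differ at more
    than one position, or that contain characters other than T, C, A, G
    (for example gaps), are reported as unclassified rather than guessed.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned, in frame and of equal length")
    syn = nonsyn = unclassified = 0
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            unclassified += 1
            continue
        n_diff = sum(x != y for x, y in zip(codon_a, codon_b))
        if n_diff > 1:
            unclassified += 1  # the order of changes is unknown
        elif CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            syn += 1           # same amino acid: synonymous
        else:
            nonsyn += 1        # different amino acid: nonsynonymous
    return syn, nonsyn, unclassified


# Toy example (invented sequences): CTG -> CAG changes leucine to glutamine,
# GAA -> GAG leaves glutamate unchanged, and GCT is identical.
result = count_codon_differences("CTGGAAGCT", "CAGGAGGCT")
```

In this toy comparison the first codon contributes one nonsynonymous difference and the second one synonymous difference. A phylogeny restricted to synonymous substitutions would use only the second kind of change.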

Interestingly, one of the seven amino acid residues that are conserved in exon 3 of MHC-I across bird species (L160) differed between groups 1 and 2: in group 1, the common leucine (L160) was replaced by glutamine (Q160). Leucine and glutamine differ in their physicochemical properties (leucine is hydrophobic, whereas glutamine is polar), and such an amino acid change could therefore indicate different peptide-binding abilities in the PBR of groups 1 and 2.

An indication that the number of samples we obtained per individual is limited (for both cDNA and gDNA) may be that certain alleles were only revealed in the cDNA (and not in gDNA) in an individual. In order to reveal all alleles an individual possesses and all existing alleles across all blue tit populations, a very large number of sequences would have to be obtained. Importantly, however, it is unlikely that our main conclusions would be altered by increasing the sample sizes, since (1) the phylogenetic relationships would likely be maintained and (2) including rare alleles in our analysis is unlikely to increase the maximum number of alleles per individual, since our RFLP analysis confirmed our estimation of four loci.

Conclusions and future prospects

This study is the first to characterise major histocompatibility complex class I in the blue tit and is among the few published studies that have characterised MHC-I in passerine species. Besides providing insight into the structure, diversity and selection acting on MHC-I, the characterisation of the blue tit MHC-I is a first step towards the development of high-throughput methods for MHC-I screening. Such methods will make it feasible to explore the mechanisms imposing balancing selection on the PBR of MHC-I in natural blue tit populations. Among other insights, these future analyses may reveal why the blue tit MHC-I exhibits such low sequence diversity compared to other passerine species.