HvPDIL5-1 and HvEIF4E exhibit distinct patterns of sequence diversity in domesticated but not in wild barley
To assess the extent and pattern of global genetic diversity of the two host factor genes HvPDIL5-1 and HvEIF4E, sequence variation in the full-length open reading frames (fl-ORFs) of both genes was surveyed in 365 wild and 2557 domesticated geographically referenced barley accessions (Supplementary Tables S1, S2). The dataset included (i) previously published sequences of HvEIF4E obtained from 1090 cultivated barleys (Hofinger et al. 2011) and of HvPDIL5-1 obtained from 350 wild and 1382 domesticated barleys (Yang et al. 2014b), as well as (ii) newly generated sequences of HvPDIL5-1 and HvEIF4E obtained in this study from 856 cultivated and 320 wild barley accessions, respectively (Table 1). Furthermore, in a subset of the total of 2922 barley accessions, we sequenced the fl-ORFs of HvGT43 and HvMCT-1 (Supplementary Table S1). These genes co-segregate with and are located in close physical proximity to HvPDIL5-1 and HvEIF4E, respectively (Fig. 1a, Stein et al. 2005; Yang et al. 2014a, b), and were included in this study to assess population demography and to test the effect of genomic location on patterns of sequence diversity. Besides residing on different chromosomes, the gene pairs HvGT43/HvPDIL5-1 and HvMCT-1/HvEIF4E differ markedly in their relative genomic positions. The first gene pair is located close to the centromere (a region of low recombination frequency) on the long arm of chromosome 4H (4HL), whereas the second gene pair is located close to the telomere (a region of high recombination frequency) on the long arm of chromosome 3H (3HL) (IBRC 2012). Importantly, fl-ORF sequences of the HvPDIL5-1 and HvEIF4E genes were obtained from a shared set of 1152 genotypes, and fl-ORF sequences of all four genes HvPDIL5-1, HvEIF4E, HvGT43 and HvMCT-1 were obtained from a shared set of 392 barley accessions (Supplementary Table S2).
After removal of low quality data a total of 2588, 1410, 413 and 453 fl-ORF sequences of the four genes HvPDIL5-1, HvEIF4E, HvGT43 and HvMCT-1 were collected and analyzed for sequence diversity, respectively (Fig. 1; Table 1). A total of 30, 65, 17 and 8 haplotypes, respectively, were defined for HvPDIL5-1, HvEIF4E, HvGT43 and HvMCT-1 (Fig. 2; Supplementary Tables S4–S8). This included two newly identified HvPDIL5-1 haplotypes (hap-XXIX and XXX) and 18 newly identified HvEIF4E haplotypes (hap-XLIV–LXI) (Supplementary Tables S4, S5). One predominating haplotype each was found for HvPDIL5-1 (hap-I, 94.5%) and for HvMCT-1 (hap-I, 92.3%), respectively, whereas two and three frequent haplotypes were detected for HvEIF4E (hap-wt0A, 57.1%; hap-I, 17.6%) and HvGT43 (hap-I, 55.9%; hap-II, 11.6%; hap-III, 25.9%) (Supplementary Tables S4–S8).
We further analyzed the coding sequence diversity of these four genes in wild barley in comparison to domesticated barley.
At the HvPDIL5-1 locus, nineteen haplotypes were observed in wild barley (350 analyzed sequences) based on ten non-synonymous mutations (here referring to non-synonymous amino acid changes or in-frame deletions/insertions) and eight synonymous mutations, whereas only 15 haplotypes were found in domesticated barley (2238 sequences), with five containing loss-of-function mutations/deletions, another five containing non-synonymous mutations and the remaining four containing synonymous mutations (Table 1). Synonymous and non-synonymous mutations in wild barley occurred evenly along the coding sequence, whereas loss-of-function sequence variation in domesticated barley was enriched in the central part of the gene (Fig. 1b), which putatively codes for the functional thioredoxin domain (Yang et al. 2014b). Four haplotypes (hap-I, III, IV and XXIV) were shared between wild and domesticated barleys (Fig. 2). Haplotype I predominated in the collection, and the other haplotypes were derived from this ancestral haplotype by one or two mutations (Yang et al. 2014b). Interestingly, loss-of-function haplotypes were significantly accumulated in domesticated but not in wild barley (Fisher’s Exact Test, P = 0.0099). The number of polymorphisms and haplotypes, as well as the values of the parameters H and π, were lower in domesticated barley than in wild barley (Table 1), indicating a reduction of gene sequence diversity in domesticated barley. Statistical tests employing Fu and Li’s D* and F* as well as Tajima’s D indicated low frequencies of rare alleles of HvPDIL5-1 in wild and domesticated barley, possibly related to selection or population size expansion.
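As an illustration of one of the neutrality statistics named above, the following is a minimal sketch of Tajima's D computed from a set of aligned, equal-length coding sequences. The function name `tajimas_d` and the toy input are hypothetical, alignment gaps are not handled, and nothing here is meant to reproduce the software or settings used for Table 1.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for aligned, equal-length sequences (illustrative sketch)."""
    n = len(seqs)
    if n < 4:
        raise ValueError("at least four sequences are required")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must have equal length")

    # S: number of segregating (polymorphic) alignment columns
    seg_sites = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if seg_sites == 0:
        raise ValueError("no segregating sites, D is undefined")

    # k: mean number of pairwise differences between sequences
    n_pairs = n * (n - 1) / 2
    k = sum(
        sum(x != y for x, y in zip(s1, s2))
        for s1, s2 in combinations(seqs, 2)
    ) / n_pairs

    # Constants from Tajima (1989)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (k - seg_sites / a1) / sqrt(variance)


# Toy alignment in which every polymorphism is a singleton
toy = ["AAAA", "TAAA", "ATAA", "AATA", "AAAT"]
print(tajimas_d(toy))
```

In this toy alignment every variant occurs in a single sequence, so the mean pairwise difference falls below S/a1 and D is negative.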
HvGT43, the closely neighboring gene of HvPDIL5-1, exhibited thirteen and five polymorphic nucleotide positions (including non-synonymous and synonymous mutations) in 192 wild and 221 domesticated barley accessions, respectively, defining 14 and seven haplotypes (Table 1). Since the same three major haplotypes (hap-I, II and III) were also present in domesticated barley accessions (95.5%, 211 out of 221 accessions), no strong differences were found for H and π values between the domesticated and wild barleys (Table 1). In the same set of accessions, cultivated barley showed strongly decreased sequence diversity at the HvPDIL5-1 locus relative to wild barley (Supplementary Table S9). This indicated that the observed sequence diversity of HvPDIL5-1 and HvGT43 was gene specific and irrespective of chromosomal position. Importantly, no defective haplotype was found at the HvGT43 locus.
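The diversity parameters H and π compared above can be sketched as follows. This assumes that H denotes Nei's haplotype diversity and π the per-site nucleotide diversity, which the text does not define explicitly; the function names and the toy alignment are hypothetical.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Nei's haplotype diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("at least two sequences are required")
    counts = Counter(seqs)  # identical sequences form one haplotype
    return n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))


def nucleotide_diversity(seqs):
    """Per-site pi: mean pairwise differences divided by alignment length."""
    n = len(seqs)
    length = len(seqs[0])
    if n < 2 or any(len(s) != length for s in seqs):
        raise ValueError("need two or more sequences of equal length")
    total_diff = sum(
        sum(x != y for x, y in zip(s1, s2))
        for s1, s2 in combinations(seqs, 2)
    )
    return total_diff / (n * (n - 1) / 2) / length


# Toy alignment: three haplotypes among four sequences
toy = ["ACGT", "ACGT", "ACGA", "TCGA"]
print(haplotype_diversity(toy), nucleotide_diversity(toy))
```

When a few major haplotypes dominate both groups, as described for HvGT43, both measures remain similar between the groups even if the number of rare haplotypes differs.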
For HvEIF4E, 26 and 47 haplotypes were defined in 320 wild and 1090 domesticated barley accessions, respectively. Fourteen non-synonymous and eleven synonymous nucleotide polymorphisms were detected in wild barley (Table 1), and these were evenly distributed along the coding sequence (Fig. 1b). This pattern was similar to the observations at the HvPDIL5-1 locus, implying a lack of selection acting on any particular region of either gene in wild barley. In domesticated barley, an extreme situation was observed: sequence diversity was caused by a total of 30 non-synonymous mutations, whereas not a single silent (synonymous) mutation was detected (Fisher’s Exact Test, P = 0.0001, Table 1). Non-synonymous mutations were enriched in three regions (nucleotides 336–384, 480–528, and 600–648) (Fig. 1b), coding for amino acid sequences located in proximity to the putative cap-binding domain of the protein (Kanyuka et al. 2005; Stein et al. 2005). In contrast to HvPDIL5-1, which showed a simple evolutionary network, the Median-Joining (MJ) network revealed a complex pattern of sequence variation for HvEIF4E (Fig. 2). For instance, it seems likely that haplotype XII originated by one additional mutation from one of haplotypes wt0A, XIV, XXI, XXXI or XXXII. The number of polymorphic loci and haplotypes, and the H and π values, in cultivated barley were not lower than in wild barley.
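The comparison of mutation classes between wild and domesticated barley can be illustrated with a two-sided Fisher's exact test on a 2×2 table. The sketch below uses only the standard library and exact integer arithmetic. The function name is hypothetical, and arranging the counts quoted above (14 non-synonymous and 11 synonymous in wild barley, 30 and 0 in domesticated barley) into this table is an assumption, so the result need not match the reported P value.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)

    def weight(x):
        # Unnormalised probability of a table whose top-left cell is x
        return comb(row1, x) * comb(row2, col1 - x)

    observed = weight(a)
    extreme = sum(
        weight(x) for x in range(lowest, highest + 1) if weight(x) <= observed
    )
    return extreme / comb(row1 + row2, col1)


# Rows: wild, domesticated; columns: non-synonymous, synonymous
# (counts taken from the text; the table layout is an assumption)
p_value = fisher_exact_two_sided(14, 11, 30, 0)
print(p_value)
```

Comparing integer weights rather than floating-point probabilities avoids rounding problems when deciding which tables count as at least as extreme as the observed one.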
Three synonymous and three non-synonymous mutations of HvMCT-1 were detected in wild barley accessions, while only three non-synonymous mutations were found in domesticated barley (Table 1). In contrast to HvEIF4E, which showed higher sequence diversity in domesticated barley, the same set of barley accessions revealed dramatically decreased sequence diversity at the HvMCT-1 locus in domesticated vs wild barley (Supplementary Table S9). This indicated that the higher sequence diversity of HvEIF4E was not a function of recombination frequency at the 3HL telomeric region.
Collectively, the sequence variation analyses revealed two qualitatively distinct patterns of gene sequence diversity, with a bias for loss-of-function mutations in HvPDIL5-1 and for non-synonymous mutations in HvEIF4E, both occurring only in domesticated but not in wild barley accessions. Importantly, analysis of the sequence diversity of the neighboring genes HvGT43 and HvMCT-1 indicated that the two distinct patterns observed for HvPDIL5-1 and HvEIF4E are gene specific rather than characteristic of the respective chromosomal positions.
Bymovirus resistance correlated with the increased diversity of the host factor genes HvEIF4E and HvPDIL5-1
The barley yellow mosaic virus disease is a highly virulent and persistent biotic stress for cultivated barley in East Asia and Europe (Kühne 2009), and selection for resistance was an important target in recent barley breeding and cultivation in these regions (Chen 2005; Ordon et al. 2005). To determine whether the appearance of haplotypes conferring virus resistance was associated with the occurrence of loss-of-function and non-synonymous mutations in HvPDIL5-1 and HvEIF4E (the corresponding recessive alleles of these genes conferring resistance to bymoviruses are known as rym1/11 and rym4/5, respectively), we tested accessions carrying different haplotypes at the HvPDIL5-1 or HvEIF4E locus for their reaction to a common isolate of Barley mild mosaic virus (BaMMV-ASL). A total of 30 and 65 different haplotypes of HvPDIL5-1 and HvEIF4E, respectively, were defined in wild and domesticated barley accessions (Supplementary Tables S4, S5). Of these, a subset of accessions representative of haplotypes carrying loss-of-function and/or non-synonymous mutations was subjected to mechanical virus inoculation (Supplementary Tables S10, S11). These assays primarily used accessions carrying a known, previously characterized susceptibility haplotype at one locus and a novel haplotype of unknown susceptibility/resistance status at the other locus. For instance, an accession carrying a known susceptibility haplotype of HvPDIL5-1 allowed the susceptibility/resistance status of a novel HvEIF4E haplotype to be tested. Most accessions carrying haplotypes of unknown susceptibility/resistance status at both loci (HvPDIL5-1 and HvEIF4E) were excluded (or the phenotype of the respective haplotype was marked as ‘unknown’). It should be noted that this approach cannot completely rule out the remote possibility that an accession contains additional independent bymovirus resistance loci.
Less than 1% of nearly 10,000 domesticated barley accessions were reported to be completely resistant against isolates of BaYMV and BaMMV in the field in earlier studies (Ruan et al. 1984; Zhou and Cao 1985).
Seven HvPDIL5-1 haplotypes (hap-II, VII, VIII, IX, XVIII, XXVIII, and XXIX) were classified as resistant to BaMMV-ASL (Fig. 3; Supplementary Table S4). This included a previously undescribed loss-of-function haplotype (G256A/hap-XXIX/rym11-f) encoding a truncated PDIL5-1-like protein, and a known haplotype (hap-XVIII) containing a non-synonymous mutation (A239G/Glu80Gly) in HvPDIL5-1. Only one wild barley accession carrying hap-XVIII was identified, and it showed resistance to BaMMV-ASL (Supplementary Table S10). This reveals H. spontaneum as a useful source of resistance to bymoviruses. In domesticated barley, accessions carrying one of the six HvPDIL5-1 haplotypes, hap-II, VII, VIII, IX, XXVIII, and XXIX, also displayed resistance to BaMMV-ASL (Fig. 3). These resistance-conferring haplotypes were present in 38 (1.69%) of 2238 cultivated barley accessions (Figs. 3, 4a), and of these 37 were carrying haplotypes containing loss-of-function mutations in HvPDIL5-1 (Supplementary Table S4). Thus, in the case of HvPDIL5-1, it appears that the mutations resulting in a loss of function of the encoded host factor protein are largely responsible for the observed bymovirus resistance phenotype.
Accessions representing nineteen of the identified HvEIF4E haplotypes were found to be resistant to BaMMV-ASL (Supplementary Tables S5, S11). These included three previously reported resistance alleles (rym4, rym5, and rym4) (Perovic et al. 2014; Stein et al. 2005), and 16 new haplotypes for which bymovirus resistance had not been reported before (hap-IX, X, XIII, XIV, XV, XVII, XIX, XXII, XXIII, XXIV, XXVII, XXVIII, XXXI, XXXVII, XLI, and XLV) (Fig. 3). Wild barley accessions carrying HvEIF4E hap-XLV were also resistant to BaMMV-ASL (Supplementary Table S11), confirming that H. spontaneum is a useful source of resistance to bymoviruses. In domesticated barley, sixty-eight (5.31%) of 1090 accessions carried one of the remaining 18 resistance-conferring HvEIF4E haplotypes (Fig. 3). These haplotypes were found in many barley cultivation areas except the Americas, Near East and Oceania (Fig. 4b), where no reports about the occurrence of the bymovirus disease are available. Interestingly, all resistance-conferring haplotypes contained non-synonymous mutations in the coding sequence of HvEIF4E.
Collectively, the data presented above revealed a strong correlation between bymovirus resistance and either the loss-of-function mutations in HvPDIL5-1 or the non-synonymous mutations in HvEIF4E. Importantly, resistance-conferring haplotypes were found preferentially in cultivated barley rather than wild barley, suggesting the more frequent rise of bymovirus resistance after domestication.
Type and frequency of mutations in the two host factor genes in the context of geographical origins of the domesticated barley accessions
The large collection of geographically referenced domesticated barley accessions allowed us to investigate if any geographic region contributed disproportionally to gene sequence diversity (loss-of-function mutations of HvPDIL5-1 and non-synonymous mutations of HvEIF4E), and if there were any preferred geographic origins of resistance-conferring haplotypes. Seven sub-populations (Africa, Americas, Central Asia, East Asia, Europe, Near East and Oceania) of the domesticated barley accessions were defined (Supplementary Table S1), and analyzed for diversity in the host factor genes HvPDIL5-1 and HvEIF4E (Tables 2, 3).
The number of polymorphic loci, the number of haplotypes, and the H and π values of HvPDIL5-1 were lower in all seven sub-populations of domesticated barley than in wild barley (Tables 1, 2). Six HvPDIL5-1 haplotypes (hap-II, VII, VIII, IX, XXVIII and XXIX) were found to be associated with bymovirus resistance, with five of those containing a loss-of-function mutation/deletion and the remaining one carrying a non-synonymous mutation. One of these resistance-conferring haplotypes of HvPDIL5-1, rym11-a (hap-XXVIII), carrying a 1375-bp deletion, was found in only one accession from the Near East (Yang et al. 2014b). The other five haplotypes were found in 36 accessions originating from East Asia and in one accession (Russia 57, hap-II, rym11-b) for which the exact location of the collection site was uncertain (Fig. 3). Of these, 35 contained haplotypes with loss-of-function mutations, indicating a disproportionate frequency of resistance-conferring haplotypes in East Asia compared to other barley cultivation regions (Fig. 4a). The frequency of these haplotypes varied between 0.23% (e.g. hap-VIII) and 6.56% (hap-II) in the set of 442 accessions from East Asia used in this study. In comparison, no accessions carrying resistance alleles were found in Europe, where the bymovirus disease is also known to occur (Kühne 2009).
The genetic diversity of HvEIF4E varied among cultivated barley from the different regions of origin (Table 3). The highest diversity was observed in barley from East Asia. Most of the bymovirus resistance-conferring haplotypes were found exclusively in a single barley cultivation area, except hap-XXII and XXVIII, which were present in more than two major geographic regions (Fig. 3). Two haplotypes, rym4 and rym5, were extensively found in Europe and East Asia, respectively (Fig. 3). This is consistent with the known origins of the barley landraces Ragusa (from Croatia) and Mokusekko 3 (from China), which were originally identified as carriers of rym4 and rym5, respectively, and used for introgression of bymovirus resistance into elite barley germplasm by breeding (Ordon and Friedt 1993). In East Asia, fourteen resistance-conferring haplotypes were found (Fig. 3). They were present in 46 (24.2%) of 190 cultivated barley accessions from this region in our collection (Table 3). Each haplotype was present at relatively low frequency in the set of accessions tested in this study, ranging between 0.53% (1 of 190, e.g. hap-rym5) and 8.95% (17 of 190, hap-XIV). By contrast, in Europe only three haplotypes (hap-rym4, XL and XXII) conferring resistance to bymoviruses were found. These haplotypes were present in 15 out of 304 (4.93%) barley accessions from this region (Table 3).
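The regional frequencies quoted above follow directly from the accession counts. As a minimal sketch, the hypothetical helper `percent` below recomputes them from the counts given in the text; the dictionary layout is illustrative only.

```python
def percent(count, total, digits=2):
    """Share of accessions carrying a haplotype class, in percent."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * count / total, digits)


# (carriers of resistance-conferring HvEIF4E haplotypes, accessions tested),
# counts as quoted in the text
regions = {
    "East Asia": (46, 190),
    "Europe": (15, 304),
}

for region, (carriers, tested) in regions.items():
    print(region, percent(carriers, tested))
```

The same helper applies to single-haplotype frequencies, for example one carrier among the 190 East Asian accessions.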
Taken together, the bymovirus resistance-conferring haplotypes caused by loss-of-function mutations in HvPDIL5-1 and non-synonymous mutations in HvEIF4E were greatly overrepresented in accessions from East Asia (Tables 2, 3), and the predominant occurrence as minor haplotypes together with statistics indicating selection suggest an evolution of bymovirus resistance alleles in barley germplasm from this geographic region.