Background

Rice (Oryza sativa) is providing approximately 20% dietary energy supply for world’s people [1]. However, rice production worldwide is severely threatened by bacterial leaf blight (BLB), a plant disease caused by Xanthomonas oryzae pv. oryzae (Xoo) [2]. Xoo is now prevalent in rice-growing areas of world, but due to its host-shifting capacity, this bacterium also threatens wheat production in both south America and Asia [3, 4]. Damage from this disease has led to rice production losses of 20%–30%, reaching devastating levels of up to 80%–90% in India and Philippines [5, 6]. Generally, this damage begins at the tillering stage, becoming more widespread as the incidence of disease increases with host plant growth peaking at the flowering stage [7]. The easiest way to prevent BLB is to apply chemical pesticides; however large-scale use of a variety of pesticides threatens the safety of rice food products. Additionally, because BLB spreads rapidly, such chemical control applied in a monsoon climate is ultimately unpractical, since once a BLB infestation occurs on a large scale, its effective control by pesticides is difficult if not impossible [8]. Therefore, developing and applying resistant rice cultivars is the most effective way to control this disease and ensure food security.

To date, more than 42 BLB resistance genes and hundreds of quantitative trait loci (QTLs) have been detected in rice plants. Among these genes, some were identified from wild species [9,10,11]. Other resistance genes or alleles have been found by mutating cultivated rice lines [12, 13]; e.g. Xa1, Xa3/Xa26, Xa21, Xa23, Xa27, xa5, xa13, xa25, and xa41 [14,15,16,17,18,19,20,21,22]. Moreover, the complete spectrum of BLB R-genes reportedly consists of 16 genes: xa5, xa8, xa9, xa13, xa15, xa19, xa20, xa24, xa25, xa26b, xa28, xa31, xa32, xa33, xa34, and xa42 [19, 22]. Importantly, BLB has high race specificity, and the rapid loss of BLB resistance in rice lines containing a single resistance gene remains a pressing problem for breeders. Combining multiple resistance genes or QTLs can contribute to broad spectrum and durable resistance to Xoo that is effective [23]. Therefore, detecting novel resistance genes and QTLs in many rice lines is imperative for successfully breeding rice capable of resisting BLB.

Most studies of BLB resistance have focused on a single resistant parent or bi-parental genetic mapping populations. To genetically map many agronomic traits and disease resistance loci in plants, genome-wide association study (GWAS) based on high-density single nucleotide polymorphisms (SNPs) and next-generation sequencing has been widely used [24, 25]. For example, Li et al. (2015) detected 97 loci associated with resistance to stripe rust in wheat via GWAS methods; and genes conferring resistance to Verticillium dahlia were identified in cotton by GWAS [26]. For rice, researchers have used GWAS to detect 27 loci related to rice blast resistance [27]. Yet due to its insufficient marker density and linkage disequilibrium (LD), GWAS does not provide an accurate target gene at a given locus. Transcriptome analyses can overcome this limitation by detecting and distinguishing the expression of candidate genes of different genotypes. Recently, Wen et al. [28] identified a set of candidate genes associated with white mold resistance in soya bean by combining transcriptome and GWAS approaches.

In this study, 259 rice lines were inoculated with two Xoo races P3 and P6, respectively, to evaluate their BLB resistance. We performed a GWAS of BLB resistance using 2 888 332 high-confidence SNPs (missing data < 20%; minor allele frequency [MAF] > 1%). Building on this, we explored candidate resistance genes by analyzing the transcriptomes of the most resistant rice line NSIC RC154 and the most susceptible line CT 9737–6-1-3P-M, at five post-infection time points (0, 12, 24, 48, and 72 h). Our results allowed us to detect candidate genes linked to BLB resistance in these rice lines. This should provide important gene resource for improving disease resistance breeding in rice.

Results

Genomic variation and population structure

To search for BLB resistance genes, we sequenced 259 rice accessions using the Illumina Hi-Seq platform (Fig. 1A; Table S1). This generated 1.33 Tb of raw reads (Table S2), and 2 888 332 high-confidence SNPs (missing data < 20%; minor allele frequency [MAF] > 1%) were obtained after mapped onto the Nipponbare rice genome (Table S3). Among 2 888 332 high-confidence SNPs, there are 1 146 191, 555 884, 363 883, 464 911, and 318 546, SNPs were located in intergenic regions, exons, upstream regions, introns, and downstream regions, respectively (Table S3). In the coding sequences (CDS), the 16 416 stop-gain, 312 857 nonsynonymous, 2 850 splicing, and 1 085 stop-loss SNPs were found (Table S3). The number of SNPs among the 12 rice chromosomes ranged from 334 353 (Chr. 1) to 192 058 (Chr. 9), and the highest SNPs’ frequency were found on Chr. 8 (8.50 SNPs/kb) (Table S4). This SNP data set of 259 rice lines provides an abundant resource for use in the molecular improvement of BLB resistance in rice.

Fig. 1
figure 1

Population structure of 259 rice accessions. (A) Neighbor-joining phylogenetic tree generated using 2,888,332 high-quality SNPs; (B) Individual ancestry coefficients of 259 rice accessions determined using ADMIXTURE with the number of ancestry kinships (K) set to 2 or 3. Each accession is denoted by a vertical bar; the proportion of different colors in each bar indicates the proportion of genetic from each of the ancestral populations; (C) Principal component analysis (PCA) plot of 259 rice accessions. (D) Genome-wide average linkage disequilibrium (LD) decay rate of 259 rice accessions and three different subgroups

Population-structure and phylogenetic-tree analysis indicated that these 259 rice varieties contained three subgroups (K = 3): landrace indica cultivar (94), improved indica cultivar (146) and japonica (19) (Fig. 1B). According to the principal component analysis (PCA) of these 259 rice lines, the 29.54% genetic variation was explained on the first two PCs (Fig. 1C). These results indicated that the rice varieties used in this study harbor abundant genetic variation in the core germplasm of rice. The LD decay for 259 rice cultivars was estimated to be 194 kb (Fig. 1D), suggested the rice lines exhibited moderate LD [29].

Phenotypic variation among rice cultivars

All 259 rice lines were inoculated with two Xoo races (i.e., P3 and P6) at the booting stages, and the resistant phenotypes to each were investigated over 2 years. For the P3 Xoo race, the BLB disease severity index (DSI) in all lines ranged from 0.26% to 99.46% in 2018 (average = 37.94%) and from 2.82% to 100% in 2019 (average = 30.34%). For P6, the corresponding BLB DSI values were 1.46–85.32% (2018; average = 30.92%) and 2.49–97.84% (2019; average = 31.92%) (Fig. 2A-D; Table S1). The wide range of BLB DSI values observed in the different rice lines (a 16- and 12-fold difference, respectively, for the two Xoo races) demonstrated substantial genotypic variability is associated with rice resistance to BLB. Additionally, the BLB incidence rate for the two races was normal (Fig. 2A-D). Analysis of variance (ANOVA) of BLB DSI for both races revealed significant differences among genotypes, implying the presence of dominant loci conferring BLB resistance. Crucially, the level of BLB resistance was highest among the improved cultivars and lowest among the landraces (Fig. 2E, F). This latter result indicated that artificial selection has been successful in rice breeding applications.

Fig. 2
figure 2

Susceptible and resistant reactions of 259 rice lines inoculated with two Xanthomonas oryzae pv. oryzae (Xoo) strains. (A, B) The frequency distribution of incidence rate of 259 rice lines after inoculation with Xoo P3 strain at 2 years; (C, D) The frequency distribution of incidence rate of 259 rice lines after inoculation with Xoo P6 strain at 2 years; (E) Histograms and box-plots showing the resistance phenotypic data of three rice subgroup to Xoo P3 strain; (F) Histograms and box-plots showing the resistance phenotypic data of three rice subgroup to Xoo P6 strain

GWAS for bacterial leaf blight resistance

Based on the 2 888 332 high-confidence SNPs, the GWAS of two resistance traits (each Xoo race was presumed to select for different plant trait) was conducted by a mixed linear model (MLM). For the GWAS analysis, phenotypic data of these two traits (in 2 years) and their best linear unbiased prediction (BLUP) values of each trait per year were used. A total of 196 and 164 SNP loci were significantly associated with resistance to P3 and P6, respectively (− log10P ≥ 5) (Fig. 3; Tables S5, S6, S7, S8, S9 and S10). Intriguingly, 63 SNP loci of them were common to all traits. Next, the GWAS was applied to the selected 240 varieties of indica rice. Similarly, the phenotypic data of the two traits and their yearly BLUP values were used this analysis. Among significant association SNPs, when compared with results of the GWAS implemented for all 259 rice varieties, the strongest signals was also identified on Chr.7 (Figure S1). However, some known loci did not appear in the GWAS results of 240 indica rice varieties (Fig. 3 and Figure S1). Accordingly, the 19 japonica rice varieties can contribute to the BLB GWAS, so the GWAS results of all 259 rice varieties were investigated further. Candidate genes were detected within 200 kb upstream and downstream of the significant associated SNP, according to the LD decay of the rice genome (Fig. 1D). A total of 2 081 genes (significant for 2 years and BLUP values) were detected for P3, and likewise 1 297 genes were significantly detected for P6. Of those, 954 genes were common to both resistance traits (Tables S5, S6, S7, S8, S9 and S10). The number of these genes was higher on Chr. 7 and Chr. 11, when compared with the other rice chromosomes (Table S5, S6, S7, S8, S9 and S10).

Fig. 3
figure 3

Manhattan and quantile–quantile plots resulting from the genome-wide association study (GWAS) for bacterial leaf blight (BLB) resistance in rice. (A, B, C) GWAS for BLB resistance in (A) P3_2018, (B) P3_2019 and (C) Best linear unbiased prediction (BLUP) values of P3 at 2 years. (D, E, F) GWAS for BLB resistance in (D) P6_2018, (E) P6_2019 and (F) Best linear unbiased prediction (BLUP) values of P6 at 2 years. The x-axis shows the single nucleotide polymorphism (SNPs) along each chromosome; the y-axis is the –log10P for the association

We summarized 41 previously reported fine-mapped QTLs or genes related to BLB resistance using Oryzabase database (http://www.shigen.nig.ac.jp/rice/oryzabase/gene/list). To further confirm these significant genes or SNPs found associated with BLB, the results were compared with those of 41 previously reported fine-mapped QTLs or genes. Of the loci detected in the P3 strain, the sixteen previously QTLs fine-mapped QTLs or genes were identified (Fig. 3A-C). For the loci that were detected in the P6 strain, seven SNPs were adjacent to previously reported QTL or genes (Fig. 3D-F). This confirmed that our results were reliable, has a strong potential for more deeply exploring novel resistance genes in rice.

Comparison of the transcriptomes of resistant NSIC RC154 and susceptible CT 9737–6-1-3P-M lines of rice

Because of the strong population structure and large extent of LD in rice, GWAS-identified loci often fall within gene deserts or in regions with many equally plausible causative genes. This makes it difficult to robustly identify functional genes. Combining the results of GWAS with RNA-seq data has been used to detect the function genes.

Furthermore, previous reports also indicated that genes that were present different expression levels in different resistance lines are most likely to be association with disease resistance [28].

Among the 259 lines tested in the 2-year field trial, NSIC RC154 (DSI = 6.01%) and CT 9737–6-1-3P-M (DSI = 100%) emerged as the most resistant and susceptible lines, respectively. To further identify those genes related to BLB resistance, we performed a transcriptome analysis on leaves from both lines to analyze the levels of gene expression at 12, 24, 48, and 72 h post-inoculation (hpi) with Xoo. Overall, we respectively obtained 2, 080, 282, 498 clean reads (Table S11), and the mapped reads value was 91.89–94.29% and 91.81%-94.37% for NSIC RC154 and CT 9737–6-1-3P-M, respectively (Table S11). The resistant line NSIC RC154 and the susceptible line CT 9737–6-1-3P-M exhibited a substantially different response to the Xoo inoculation; the number of differentially expressed genes (DEGs) in NSIC RC154 (11 674; 3 167 up-regulated and 8 507 down-regulated) (Table S12) exceeded those in CT 9737–6-1-3P-M (11 436; 2929 up-regulated, 8 254 down-regulated) (Table S13). Among these DEGs, although 8 851 DEGs of them were common by the two rice varieties at different time points, 2 817 DEGs were only found in the resistance line NSIC RC154 (Table S14). Therefore, those latter genes may figure prominently in conferring resistance to BLB. At earlier stages (12 h and 24 h) of inoculation, 9 023 DEGs (2 477 up-regulated, and 6 546 down-regulated) were detected in the resistant line NSIC RC154, compared with 8 040 DEGs (2 201 up-regulated, and 5 839 down-regulated) in the susceptible line CT 9737–6-1-3P-M. When challenged with the pathogen, NSIC RC154 harbored more DEGs than did CT 9737–6-1-3P-M at the early stages. This suggested that NSIC RC154 reacts more strongly than CT 9737–6-1-3P-M to pathogen attack and invasion. We presumed that NSIC RC154 might have numerous resistance-related genes that are primed for a quick response to pathogen infection.

Expression analysis of the GWAS-identified genes

Among the 954 genes found significantly associated with resistance against the two Xoo races, 161 (36 up-regulated, 125 down-regulated) were DEGs in the resistant rice line NSIC RC154, and 197 (64 up-regulated, 140 down-regulated) were DEGs in the susceptible rice line CT 9737–6-1-3P-M. Furthermore, a set of 109 DEGs that underwent significant differential expression between resistant and susceptible rice lines was detected (Fig. 4A). Gene Ontology (GO) analysis revealed a stark enrichment of DEGs in several functional categories, namely signal transducer activity, purine nucleoside binding, receptor activity, motor activity, and molecular transducer activity (Fig. 4B). The KEGG (Kyoto Encyclopedia of Genes and Genomes) annotations indicated that the pathways enriched with these DEGs were closely related to phenylpropanoid biosynthesis, biosynthesis of secondary metabolites, photosynthesis, diterpenoid biosynthesis, as well as the biosynthesis of stilbenoid, diarylheptanoid and gingerol (Fig. 4C). Elucidating the detailed mechanisms of BLB resistance based on these results is difficultly; however, these DEGs were great importance as they most likely as candidate genes for enhancing resistance to BLB.

Fig. 4
figure 4

The candidate genes were detected via GWAS and different expression genes (DEGs) data set. (A) Heat map for expression patterns of the 109 candidate genes at 12, 24, 48, and 72 h post inoculation with Xoo strain P6 in R genotype (NSIC RC154) and S genotype (CT 9737–6-1-3P-M); (B) GO classification of candidate genes. Gene Ontology terms are classified into three main categories: biological process, cellular component, and molecular function; (C) KEGG annotation of putative proteins. The y-axis indicates the name of the KEGG metabolic pathway. The x-axis indicates the percentage of the number of unigenes annotated to the pathway out of the total number of unigenes annotated

Furthermore, among those 109 DEGs with significant differential expression levels between resistant and susceptible rice lines, nearly half (45) were expressed more in the resistant than susceptible line (Fig. 4A). These 45 genes contain a large number of defense-associated protein encoded genes, such as the NBS-LRR disease resistance protein encoding the genes LOC_Os11g44960, LOC_Os11g45050, LOC_Os07g02570, LOC_Os11g44990, LOC_Os07g02620, and LOC_Os07g02560. It is known that the NBS (nucleotide-binding state)-LRR (Leucine-rich repeat) disease resistance protein is critically involved in plant-pathogen interactions, and its NBS play a key role in regulate activity of this protein [30]. Perhaps more importantly, of the six NBS-LRR disease resistance protein-encoding genes, LOC_Os07g02570, LOC_Os11g44990, LOC_Os07g02620, and LOC_Os07g02560 displayed induced up-regulation in the resistant line at 24 hpi, whereas no significant changes in those four genes were detectable in the susceptible line after the Xoo inoculation. Reactive oxygen species (ROS) is important in plants by triggering plant basal defense responses [31]. The peroxidase precursor-encoding gene, LOC_Os07g02440, was up-regulated in the resistant line at 24 hpi, but not induced in the susceptible line at all time points tested (Fig. 4A). The expression of this gene implies that ROS could be crucially involved in marshaling defense against the Xoo pathogen early in its infection of rice.

Moreover, the natural-resistance-associated macrophage protein (NRAMP)-encoding gene LOC_Os07g15460, the glycosyl hydrolase family 3 protein-encoding gene LOC_Os11g44950, and the cytochrome P450-encoding gene LOC_Os02g17760 all underwent significantly higher expression in the resistant line compared with the susceptible line. According to other research, iron is a key element for most living organisms, and pathogens are likely to compete with their hosts for its acquisition [32]. The bacterial plant pathogen Dickeya dadantii depends strongly on its siderophore-mediated iron uptake system for systemic disease progression on several host plants, including Arabidopsis thaliana [33]. In rice plants, several metal ions such as Zn2+, Mn2+, Fe2+, and Cd2+ are transported via NRAMP transporter proteins [34]. In our study, the NRAMP-encoding gene LOC_Os07g15460 was found up-regulated in the resistant line at 72 hpi, yet it underwent no significant expression change in the susceptible line. This gene may thus contribute to resistance to BLB via iron transfer. Furthermore, three protein kinase-encoding genes (LOC_Os11g44660, LOC_Os11g44560, and LOC_Os07g02450) were down-regulated in the resistant line by the Xoo inoculation, but their expression levels were not induced in the susceptible line’s transcriptome at any time point examined. Further research should try to functionally validate effects of these genes, which is needed to reveal the molecular mechanisms of complex BLB resistance traits in rice.

Expression validation of candidate genes by qRT-PCR

The seven candidate genes were selected for verify the RNA-Seq data. These consisted of six encoding NBS-LRR disease resistance protein (LOC_Os11g44960, LOC_Os07g02560, LOC_Os07g02570, LOC_Os11g44990, and LOC_Os07g02440) and one protein kinase gene (LOC_Os11g44660) (Fig. 5). The threshold cycle (Ct) values of each gene were normalized relative to those of the UBQ gene (internal control). The relative expression levels of these genes were detected using qRT-PCR method and compared with transcriptome results. The results indicated that these genes expressed differently in the resistance line after the Xoo P6 strain inoculation, in way that was consistent with the RNA-Seq data (Fig. 5).

Fig. 5
figure 5

Haplotype analysis of the peak associated with the gene on chromosome 7. (A) Manhattan plots of loci on chromosome 07 associated with BLB incidence. LD heat map (bottom) reflected that associated SNP localized in a haploid between the red dashed lines; (B) Expression of two NBS-LRR genes LOC_Os07g02560 and LOC_Os07g02570 in the NBS-LRR gene cluster in resistant and susceptible rice lines (Transcriptome data); (C) Expression analysis of LOC_Os07g02560 and LOC_Os07g02570 in six resistant and six susceptible varieties at 24 h post inoculation (hpi) of Xoo by qRT–PCR

Identification of new BLB resistance genes by haplotype and expression analyses

In this study, the strongest signals identified on Chr.7 were novel (Fig. 3), and the region of SNPs located on Chr.7 contained many genes that encode NBS-LRR resistance proteins. The candidate region was estimated using pairwise LD correlations. We focused on the locus mapped from 0.747 to 0.981 Mb with 58 candidate genes (Fig. 6A). Among these genes, the transcriptome data uncovered two NLR protein-encoding genes, LOC_Os07g02560 and LOC_Os07g02570, which went significantly high expression in the resistant line but had low expression in the susceptible line of rice (Fig. 6B). To confirm the functioning of LOC_Os07g02560 and LOC_Os07g02570 in BLB resistance, six resistant and six susceptible varieties were selected to examine the expression levels of LOC_Os07g02560 and LOC_Os07g02570 after inoculation with Xoo P6. Expression of these two genes was higher in the resistant than susceptible varieties at 24 hpi (Fig. 6C). We also analyzed the sequences difference of these two candidate genes in resistant (NSIC RC154) and susceptible (CT 9737–6-1-3P-M) varieties. We found four SNPs (Chr.7_ 922,336, T-C; Chr.7_ 922,460, T-C; Chr.7_ 922,462, A-G; Chr.7_ 922,694, G-A) in upstream of LOC_Os07g02560, and six SNPs (Chr.7_ 920,121, C-A; Chr.7_ 920,184, T-C; Chr.7_ 920,534, G-A; Chr.7_ 920,762, G-A; Chr.7_ 920,766, C-T; Chr.7_ 920,861, T-G) were contained in coding region. For LOC_Os07g02570, two SNPs (Chr.7_927311, G-A; and Chr.7_928738, A-G) in upstream, and two SNPs (Chr.7_925408, A-G; and Chr.7_926479, A-G) in coding region were detected. These results indicated those two genes are significantly associated with resistance to BLB, and so they may be promising candidate resistance genes for this disease.

Fig. 6
figure 6

Verification of seven candidate resistant genes expression in resistant and susceptible rice lines at different infection time points by qRT-PCR. Error bars represent standard errors from four biological replicates (*, P < 0.05; **, P < 0.01; ***P < 0.001)

Discussion

For 259 rice accessions, we evaluated their resistance levels to BLB over 2 years. The results showed that not all rice sub-populations were equally resistant. Moreover, among the resistant levels of the 259 rice lines, there was a remarkable disparity between the two Xoo pathogen races tested, which indicated that the R genes with specific resistance were only carried by certain rice varieties. This finding is consistent with those of a smaller study by Zhang et al. [35], in which the level of indica rice resistance differed significantly among six Xoo races, with the latter divided into three groups based on the lesions’ size (length values). Furthermore, the resistance level of each line could be inferred from their incidence rate: < 10% distinguished the resistant lines. Among our 259 rice accessions, six lines (IR 10M126, NSIC RC154, CT 16,658–5-2-3SR-2–1-MMP, CT 15,765–12-1–4-2–1-M, CT 15,765–13-3–8-3–3-M, IR61009-37–2-1–2, and HHZ 12-DT 10-SAL 1-DT 1) had considerably high and stable resistance to both Xoo races. These results suggest the possibility of selecting appropriate materials for accelerating BLB-resistance breeding and the genetic study of rice.

GWAS is an important approach for detecting the function genes of complex traits. It has been used to detect new genes associated with resistance to diseases and important agronomic traits in plants. In maize, for instance, Li et al. [36] used GWAS to illuminate the role of ZmFBL41, which encodes an F-box protein, in that corn plant’s resistance to sheath blight. In this study, we identified 63 BLB resistance loci, containing 954 significantly associated genes, though a GWAS of 2,888,332 high-confidence SNPs. The five significant SNPs found, namely Chr. 12_17579641, Chr. 11_28745675, Chr. 11_21379864, Chr. 08_27163888, and Chr. 11_17915331 were respectively located near the cloned R genes xa25 [37], Xa26 [14], Xa21 [38], Xa23 [39], Xa33 [40], xa13 [16], and Os-11N3 [41] (Fig. 3). These results indicate that a relatively high resolution of GWAS is attainable when using a relatively large population, which considerably strengthened the investigation of genetic diversity and generation of a high-density SNP map for rice. By successfully combining the GWAS with transcriptome data to discover BLB resistance candidate genes in rice, novel loci were identified on chromosomes 7 (Chr.7_707158) that were significantly associated with rice resistance to both P3 and P6. Our data provide important information for future gene function studies of BLB resistance. It is anticipated these findings can serve as a robust reference for function gene discover on complex traits in rice and other plant species.

Compared with the susceptible line CT 9737–6-1-3P-M, the resistant line NSIC RC154 had more up-regulated genes at early time points following the pathogen inoculation. Furthermore, in combining the GWAS and transcriptome data, 109 significant associations with BLB-resistance DEGs were obtained. After assigning these genes to KEGG functional annotation, the pathways of phenylpropanoid biosynthesis, biosynthesis of secondary metabolites, photosynthesis, diterpenoid biosynthesis, and stilbenoid, diarylheptanoid, and gingerol biosynthesis were found to be all enriched. Phenylpropanoid compounds play key roles in plant defense, ranging from constitutive or inducible physical and chemical barriers against pathogenic infections, to acting as signal molecules involved in local and systemic signaling for the induction of one or more defense genes [42]. The enriched phenylpropanoid biosynthesis suggests that secondary defense metabolites figure prominently in early resistance to BLB in rice. Other pathways, however, such as diterpenoid biosynthesis and photosynthesis, likely also participated in plant disease resistance. For example, Oscyp71Z2 governs BLB resistance by regulating the biosynthesis of diterpenoid phytoalexin [43]. Our results suggest that rice resistance to BLB is a rather complex trait, in that it depends on a well-coordinated and activated network of multiple defense pathways.

Many BLB resistance genes have been searched for and applied to rice breeding [44]. Although some Xoo resistance genes are now known, most are specific resistance genes; e.g., xa25, xa26, and Xa1 [30, 45, 46]. Therefore, given the diminished plant resistance to BLB caused by evolving Xoo populations, it is imperative we find new genes conferring resistance traits and combine them with known resistance genes to develop durable and sustainable resistant lines of rice. In this study, multiple NLR genes were found localized at the Chr.7_707158 locus, where several NBS-LRR genes are also clustered. Through the haplotype and expression analyses, an NBS-LRR gene cluster (LOC_Os07g02560 and LOC_Os07g02570) that confers broad-spectrum resistance in rice to both Xoo races was uncovered. But whether all these mutations in fact drove the loss of plant resistance to Xoo requires further careful investigation. Furthermore, plant genomes often will encode several hundred NLR proteins that are involved in defense responses, some of which occur in clusters at specific loci following gene duplication and amplification events. Previous reports indicate that genetically linked NLR genes may act together to recognize a pathogen’s avirulent effectors, such as RPS4/RPS1, RGA4/RGA5, and Pikp-1/Pikp-2. In such pairs, one gene functions as a “sensor” that perceives pathogen effectors while the other is a “helper” required to activate immune signaling [47,48,49]. Similarly, LOC_Os07g02560 and LOC_Os07g02570 were co-localized in an LD block. Yet whether these two NBS-LRR genes in the R-cluster are subject to the same regulatory mechanism as RPS4/RPS1, RGA4/RGA5 or Pikp-1/Pikp-2 is unknown and merits further study.

Conclusions

We integrated the GWAS and transcriptome results of our study to provide some new, useful gene resources against bacterial blight in rice. Two candidate genes LOC_Os07g02560 and LOC_Os07g02570, were thus obtained. These findings should provide reliable targets for assessing candidate genes for use in the breeding of BLB resistance. More work remains to be done, however, to verify which additional genes underpin resistance to BLB in rice.

Methods

Plant materials and phenotypic evaluation

A total of 259 rice lines were used in this study. These varieties were collected from different country, including Senegal, China, Malaysia, Brazil, Colombia, and Mexico. Of these, 146 lines were provided by the International Rice Research Institute (IRRI) while the other 113 had been preserved by the Rice Research Institute of Sichuan Agriculture University, China. Information of 259 rice lines can be found in Table S1. To evaluate BLB resistance, the seeds of all 259 rice varieties were sown in a greenhouse. Then transplant to an experimental field (at Sichuan Agriculture University) after 30-day-old, with 10 plants per row.

We used two representative Xoo strains P3 and P6 to artificially inoculate plants. The strains were cultivated separately on peptone dextrose agar (PDA) medium for 2 days at 30 °C; each Xoo race was suspending using sterile water at a concentration of 108 cells ml−1 as inoculum. At the rice tillering stage, 15 of the uppermost leaves of each rice variety were infected with the two Xoo races, using the leaf-clipping method [50]. Lesion lengths were measured at 14 days after inoculation, when lesions were easily visible. A BLB disease score was recorded for each line, as the lesion length divided by the leaf length. The average disease score of each variety was calculated based on 15 individual leafs. Data were processed with Microsoft Excel 2010. Statistical analysis of BLB scores among different rice varieties or sub-populations was done using ANOVA followed by Dunnett’s multi-comparison test in SPSS v16.0 (IBM Corp., Armonk, USA).

DNA extraction and sequencing

Young leaves of 21-day-old seedlings of each rice variety were sampled to extract their genomic DNA. The cetyl trimethyl ammonium bromide (CTAB) method was used to extract genomic DNA [51]. The purity and concentration of DNA was determined, respectively, through a NanoPhotometer spectrophotometer (IMPLEN, CA, USA) and a Qubit DNA Assay Kit with a Qubit 2.0 Fluorometer (Life Technologies, CA, USA). The DNA samples of all 259 rice varieties were first fragmented by sonication to 350 bp fragments. These DNA fragments were sequencing after end-polished, A-tailed, and ligated to full-length adapters. Next, the raw sequences having a 150-bp read length were obtained. Among raw data, those reads that contained adapter sequence stretches of –Ns, or had low quality scores were deleted. The high quality paired-end reads were mapped to the Nipponbare rice genome (ftp://ftp.ensemblgenomes.org/pub/plants/release_36/fasta/oryza_indica/dna/) by Burrows-Wheeler Aligner software tool with the command “mem -t 4 -k 32 –M” [52]. After their alignment, genomic variants in GVCF format for each accession were identified by the Haplotype Caller module and GVCF model in, Genome Analysis Toolkit (GATK) software [53]. All the GVCF files were then merged together. A raw genotype file was then filtered by these parameters: depth for each individual ≥ 4; minor allele frequency (MAF) ≥ 0.01, genotype quality for each individual ≥ 4; and; a miss rate ≤ 0.2. In this way, a total of 2 888 332 SNPs were obtains, and further annotated using ANNOVAR software (v2013-05–20) [42]. These SNPs were divided into five groups according to their annotations: CDS SNPs, upstream SNPs (those positioned within 1 kb of the transcription start site), intergenic SNPs, downstream SNPs (located within 1 kb of the transcription stop site), and intron SNPs.

PCA, population structure, and LD analysis

The neighbor-joining (NJ) tree was built through the P-distance, by using the 2,888,332 SNPs in the Tree Best software (v1.9.2) and 1000 bootstrap replications [54]. The population structure of 259 rice lines was identified by the program ADMIXTURE (v1.23) [55], with a K-value ranging from 2 to 3. A principal component analysis (PCA) was then carried out using GCTA software [56]. To do this, the genetic relationship matrix was first calculated through the parameter “–make-grm”; the top three principal components were obtained by the parameter “–pca3”. To identify the LD of this rice population, the squared Pearson correlation coefficient (r2) between pairwise SNPs was calculated through the “Pop-LD-decay” software tool [57], whose program parameters were set to “-MaxDist 1000 kb-MAF 0.05 -Miss 0.1”. The average r2 value was calculated for pairwise markers in a 1-kb window, and these values averaged across the whole rice genome.

Estimation of breeding value

The breeding values were calculated by BLUP (best linear unbiased predictor), using the “lme4” package in the R computing platform (v. 3.2.2) [58], as follows:

$$Y = \mu + Line + Loc + \left( {Line \times Loc} \right) + \left( {Rep \times Loc} \right) + \varepsilon$$

where, the Y, μ, Line, and Loc are respectively the phenotype, intercept, variety effects, and environmental effects. The Rep is the number of replications, and ε indicated the random effects; the Line × Loc term denotes the interaction between variety and environment, while Rep × Loc is the interaction between replication and environment.

GWAS analysis

Only those SNPs with a sequencing depth ≥ 4, missing rate < 0.2 and MAF ≥ 0.01 were used in the GWAS, with the latter analyzed using the EMMAX (beta version) software package [59]. The matrix of pairwise genetic distances, which calculated using EMMAX, formed the variance–covariance matrix of random effects.

Transcriptome analysis

To further identify candidate resistance genes positioned around the GWAS-identified loci, the resistant rice variety, NSIC RC154, and the susceptible variety CT 9737–6-1-3P-M (both confirmed) were grown and inoculated with the more virulent Xoo race P6 in a greenhouse by the leaf-clipping method [50]. From each rice variety, leaves sample were obtained at 12, 24, 48, and 72 hpi, respectively, and each treatment has three replicates. Control samples of non-inoculated, fresh leaves of seedlings at 12 h were also collected. Place all leaf samples in liquid nitrogen and stored at -80℃ for their RNA isolation. Total RNA was isolated with the Plant Total RNA Isolation Kit (Sangon Biotech, Shanghai, China), according to the manufacturer’s instructions. We used the NEBNext Ultra™ RNA Library Prep Kit for Illumina (NEB, USA) for RNA-Seq libraries construction. The Illumina Hi-Seq platform was used sequencing, and 125-bp paired-end reads were generated. Among raw data, the reads having a low quality score and those containing adaptor sequences and stretches of -Ns were removed. An index of the Nipponbare rice reference genome was built using Bowtie v2.2.3, to which the above paired-end reads were aligned using TopHat v2.0.12 [60,61,62]. To count the number of reads mapped to each gene, HTSeq v0.6.1 software was used [63]. The expression value of each gene was present based on FPKM (fragments per kilobase of transcript sequence per million) that calculated using Cuffdiff software (v2.2.1).

The differential expression analysis of two treatments (each treatment contains three biological replicates) was carried out in R, using the “DESeq” package (v1.18.0) [64]. Differential expression levels of gene in the two treatments sample comparisons were determined based on the negative binomial distribution. Benjamini and Hochberg’s approach was used to adjust P-values for controlling the false discovery rate (FDR). Genes with the |log twofold change |> 1 and adjusted P-values of < 0.05 were designated as differentially expressed [65].

Haplotype analyses

Haplotype blocks were distinguished by using the confidence interval method [66], and Haploview software [67]. For this, the Hardy–Weinberg P-value cut-off was set to 0.001, with a MAF of 0.05.

qRT-PCR

Relative expression levels of seven candidate genes were investigated in rice plants by qRT-PCR. Total RNA extraction and reverse transcription were performed as described previously. The PCR reactions were using 20µL volume, which contained cDNA template 3 µL and forward and reverse gene-specific primers 0.8 µL, respectively. Each PCR set four replicated times. The ubiquitin (UBQ) gene was used as an internal control for data normalization. The 2−∆∆Ct method was used calculating gene expression levels. The primers used in this experiment are provided in Table S15.