Background

Stress caused by salinity is one of the most serious environmental factors, which inhibits plant growth and decreases crop productivity worldwide [1]. Primary effects occurring at the beginning of salt stress include retarded cell division and expansion [2], stomata closure and photosynthesis reduction [3]. During long-term exposure to salt stress, accumulation of salt ions in plant aerial parts via the transpiration stream leads to ionic stress [1, 2, 4]. To adaptively respond and survive under salinity, plants require changes of various cellular, physiological and metabolic mechanisms, which are controlled by the regulated expression of specific stress-related genes through cascades of complex regulatory networks [5,6,7].

Rice (Oryza sativa L.), one of the world’s most important cereal crops, is classified as a salinity-sensitive crop [1, 8]. An electrical conductivity (EC) of ~6 dS m−1 (~56.98 mM NaCl [9]) would result in more than 50% reduction in yield of many rice varieties [10]. Therefore, plant breeders are continuously improving salt-tolerant rice cultivars to increase yield [11]. However, salt tolerance is a multigenic trait whose underlying mechanisms are controlled by many genes and affected by the environment. Breeding efforts for developing salt-tolerant rice have been limited because the salt tolerance mechanisms and the genes that control them are not completely understood [12,13,14]. To fill the knowledge gap between genotypes and phenotypes of the salt stress response in rice, forward and reverse genetics have been used to identify salt-responsive loci/genes, for example by genetic mapping of quantitative trait loci (QTLs) using cross populations; screening of mutants generated by chemical- or irradiation-induced mutagenesis; and transgenic approaches [15, 16]. Using cross populations, a number of mapping studies have identified QTLs of physiological traits related to salinity tolerance in economic crops such as soybean, barley and rice [17,18,19]. Although QTL mapping is a powerful and popular method to tag salt tolerance regions in plants, the range of variation that can be examined is limited, because QTL mapping can identify only the allelic diversity that segregates between the parents of a particular F2 cross or within recombinant inbred lines, and the mapping resolution is limited by the number of genetic recombination events occurring in the mapping populations [20, 21]. Moreover, genotyping by SSR markers, which is usually based on polymerase chain reaction (PCR), is limited in the kinds of variation it can examine, and is laborious and time-consuming when high-density genotyping is needed for a large number of individuals [22].

Over the past several years, next generation sequencing has been used to rapidly generate a large amount of accurate genomic data, providing a powerful approach for functional genomics and molecular breeding studies, including the genome-wide association study (GWAS) [23]. GWAS, which is the analysis of the statistical association between genetic variants and traits on the whole genome scale in a large number of individuals within an organism, has been employed to identify causal genetic variability for target traits, including those in Arabidopsis and crop species [22, 24,25,26]. Compared with the QTL linkage mapping method, GWAS provides high resolution mapping using single nucleotide polymorphisms (SNPs) as genetic markers [22, 27]. GWAS in rice was performed for agronomic traits such as tiller number, grain width, grain length and spikelet number in the indica subspecies based on SNPs identified by whole-genome sequencing [28, 29]. In another report, the genetic architecture of rice chlorophyll content at the heading stage was revealed by GWAS. Forty-six significant loci were identified and Ghd7 was highlighted as a major locus for the natural variation of the chlorophyll content [30]. GWAS also revealed three QTLs (qER1–3) located on chromosomes 3, 6 and 12 associated with the responsiveness of yield-determination traits under field conditions [31]. Application of GWAS for causative gene identification has been reported in rice responding to abiotic stresses such as aluminum, boron, cold, drought and salt stresses [32,33,34,35,36]. For salt stress, there are several GWA studies in rice covering different growth stages and traits. Shi et al. [37] performed GWAS at the germination stage of salt-treated rice using ~6,000,000 SNPs and identified 11 loci containing 22 significant SNPs associated with stress-susceptibility indices of the vigor index and germination time. The strongest association region for germination time was detected on chromosome 1, near a salt-tolerance QTL controlling Na+ uptake and K+ concentration. At the tillering stage, GWAS was performed on rice exposed to short- (6 h), medium- (7 d) and long-term (30 d) salt stress based on ~200,000 SNPs. Around 1200 candidate genes associated with growth parameters and with Na+ and K+ content were identified [36]. For salt-treated rice at the reproductive stage, only one study, by Kumar et al. [38], has been reported. Based on 6000 SNPs, it showed that 20 loci were associated with the Na+/K+ ratio and 44 loci were associated with other traits. Twelve of the associations with the Na+/K+ ratio were located on chromosome 1, where Saltol, a major QTL that controls shoot Na+/K+ homeostasis in rice at the seedling stage, is located. However, GWAS has not been applied to the analysis of photosynthetic and yield-related traits in rice exposed to salt stress at the flowering stage, which is a highly salt-sensitive stage. Additionally, no rice accessions from Thailand, where a large collection of diverse rice germplasm can provide new allelic diversity for salt tolerance [39], have been analyzed by GWAS.

The objectives of this research were (1) to investigate and cluster Thai and Asian rice accessions based on physiological responses and yield-related traits under the salt-stress condition at the flowering stage and (2) to perform GWAS for these traits to identify regions/genes responsible for salt tolerance.

Methods

Plant materials and growing conditions

The association panel consisted of a diverse collection of 190 rice (O. sativa) cultivars including both standard salt-tolerant (Pokkali) and salt-sensitive (IR29) varieties. The rice accessions in this study were kindly provided by the Pathum Thani Rice Research Center (Additional file 1: Table S1). The experiment used a randomized complete block design with four replications. Because data collection was time-consuming, the experiment was performed as three separate experimental sets. The standard salt-tolerant and salt-sensitive cultivars were included in every experimental set. Twenty-one-day-old seedlings, cultivated using a hydroponic system with WP No. 2 nutrient solution [40], were transplanted into pots containing soil (5 kg) and maintained until harvest. At the heading stage in the flowering phase of each accession, water on the soil surface was drained before salt stress treatment. Rice plants were then watered with 900 mL of 150 mM NaCl solution to reach the desired final soil electrical conductivity (EC) of 8–9 dS m−1 and treated for 9 days. For the control condition, rice plants were treated with tap water for the same period. The water level was kept at 2 cm above the soil surface throughout the experimental period. For recovery, tap water was used to wash out salt ions in the soil every day until the soil EC was lower than 2 dS m−1; this condition was maintained until harvest to collect yield-related traits. These experiments were conducted in the greenhouse facility at the Nakhon Ratchasima Rice Research Center, Rice Department, Ministry of Agriculture and Cooperatives. The air temperature reached a maximum of 32 °C under natural light and a minimum of 21.1 °C during the night. The average relative humidity was 72.5%.

Parameter collection for association analysis

Photosynthetic parameters consisting of net photosynthetic rate (PN), stomatal conductance (gs), transpiration rate (E), and intercellular CO2 concentration (Ci) were measured during the same period (8:30–11:30 a.m.) with the LI-6400 XT portable photosynthesis system (LI-COR, Lincoln, NE) on the middle portion of the 2nd leaf (penultimate leaf) of the main tiller on days 0, 3, 6 and 9 after salt stress treatment. The photosynthetic photon flux density used was 1200 μmol photon m−2 s−1. The leaf temperatures and ambient CO2 concentration used during the measurement were 27–30 °C and 380 ppm, respectively. A method modified from Blum and Ebercon [41] was used for cell membrane stability (CMS) measurement. One gram of the 2nd leaf was cut into segments of 2 mm in length, put into 10 ml of deionized water in a test tube and left at room temperature for 2 h. The initial electrical conductivity (EC1) of the sample solutions was measured using a conductivity meter (SevenCompact™ conductivity S230, Mettler Toledo, USA). Then, the tissues in the test tubes were boiled for 15 min, cooled to room temperature, and the final electrical conductivity (EC2, maximum conductivity of the tissues) was measured. The percent CMS was calculated as [1 – (EC1/EC2)] × 100. For yield-related traits, the numbers of tillers (TIL), panicles (PAN), filled grains (FG) and unfilled grains (UFG) per pot were recorded at the end of the experiment.
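
As a worked illustration of the CMS calculation, the short Python sketch below assumes the percentage form CMS = (1 − EC1/EC2) × 100. The function name and the conductivity readings are hypothetical and are not taken from the study.

```python
def cell_membrane_stability(ec1: float, ec2: float) -> float:
    """Return percent CMS from the initial (ec1) and post-boiling (ec2) conductivity.

    Assumes CMS (%) = (1 - EC1/EC2) * 100, so that leakier tissue
    (EC1 closer to EC2) gives a lower stability value.
    """
    if ec2 <= 0:
        raise ValueError("EC2 must be positive")
    if ec1 < 0 or ec1 > ec2:
        raise ValueError("EC1 must lie between 0 and EC2")
    return (1.0 - ec1 / ec2) * 100.0


# Hypothetical readings (same units for both), for illustration only.
print(cell_membrane_stability(30.0, 120.0))  # 75.0
```

Both readings must be in the same units, because only their ratio enters the calculation.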

Exome library preparation

Rice gDNA was extracted from leaf tissue using the Genomic DNA Mini Kit (Plant) (Geneaid Biotech Ltd., Taiwan) and the amount of DNA was quantified using a spectrophotometer. For the exome library preparation, gDNA was fragmented using dsDNA Fragmentase (New England Biolabs, Ipswich, MA). The sheared DNA was modified using an end-repair enzyme (New England Biolabs) and deoxyadenosine was added at the 3′ end using a Klenow fragment (New England Biolabs). A unique DNA barcode (Bioo Scientific, Austin, TX) was joined to the DNA in each library using DNA ligase (New England Biolabs). Pre-capture libraries were hybridized with the capture probes of the rice exome region, which were designed based on the O. sativa ‘Nipponbare’ database (Michigan State University [MSU] Rice Genome Annotation Project). The capture libraries were cleaned using AMPure (Beckman Coulter, Indianapolis, IN) and amplified by PCR using post-capture primers. The final yields were quantified with a Bioanalyzer (Agilent Technologies, Santa Clara, CA). Exome-capture libraries (18–23) were pooled in each lane and sequenced following the Illumina HiSeq2000 protocol on the Illumina Genome Analyzer (San Diego, CA).

Data analysis

Statistical analysis of phenotypic traits

Statistical analyses were performed with IBM SPSS ver. 22 (IBM Corp., Armonk, USA). Analysis of variance (ANOVA) was carried out to assess the effects of genotype, environment, and G × E interactions using the general linear model procedure. Duncan’s Multiple Range Test was used to compare mean values for tests of significance. Cluster analysis among physiological responses was performed with JMP ver. 11 (SAS Institute Inc., Cary NC, USA) and the R ‘corrplot’ package [42].

SNP genotyping and genotype data analysis

The short-sequence reads from the Illumina Genome Analyzer were grouped into the correct categories using the pipeline created by Missirian et al. [43]. The rice reference genome was downloaded from the database (Ensembl version IRGSP-1.0) and indexed by SAMtools [44]. Raw reads were aligned against the reference genome using the Burrows-Wheeler Aligner (BWA version 0.5.7–1) [45]. Variants were called using the Genome Analysis Toolkit (GATK; version 3.3–0) [46]. Variants were filtered using the following criteria: for a position to be called heterozygous, the minimum coverage and minimum percentage of each of the two observed major basecalls were 5 and 20, respectively, and the minimum total coverage was 10; for a position to be called homozygous, the minimum coverage was 6, or 3 if positions with the minimum coverage of 6 were present in at least 10 accessions. SNP density was visualized using R 'CMplot' (https://github.com/YinLiLin/R-CMplot).

Population structure and linkage disequilibrium analysis

To estimate the number of subgroups in the panel to select the appropriate statistical model for association between the phenotypic and genomic data, analysis of the population structure within the rice population was performed using EIGENSOFT version 6.0.1, which used principal component analysis (PCA) to model ancestry differences in a population [47, 48]. Population stratification was visualized by plotting the first two PCs.

Linkage disequilibrium (LD) was assessed by computing the correlation (r2) in frequency across a pair of SNP loci. The r2 values between pairs of SNPs were calculated using the PLINK [49] command --r2 --ld-window-kb 2000 --ld-window 999999 --ld-window-r2 0. This command was used to calculate LD among SNP pairs up to a distance of 2000 kb. LD decay analysis was conducted by dividing marker pairs within the 2000-kb region into bins of 20 kb and averaging the r2 values within each bin. To visualize the result, the r2 values were sorted and plotted against the physical distance [38].
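
The binning step of the LD decay analysis can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the input is taken to be a list of (distance in kb, r2) pairs such as could be derived from a PLINK pairwise LD report, and the decay distance is read as the first bin, at or after the bin with the highest mean, whose mean r2 falls to half of that maximum.

```python
from collections import defaultdict


def bin_ld(pairs, bin_kb=20, max_kb=2000):
    """Average r2 within distance bins.

    pairs: iterable of (distance_kb, r2) for SNP pairs.
    Returns a dict mapping the upper edge of each bin (kb) to its mean r2.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for dist_kb, r2 in pairs:
        if 0 <= dist_kb < max_kb:
            b = int(dist_kb // bin_kb)
            sums[b] += r2
            counts[b] += 1
    return {(b + 1) * bin_kb: sums[b] / counts[b] for b in sorted(counts)}


def half_decay_distance(binned):
    """Return the first bin edge, at or after the bin with the highest mean r2,
    where the mean r2 has dropped to half of that maximum (None if never reached)."""
    edges = sorted(binned)
    peak_edge = max(edges, key=lambda e: binned[e])
    half = binned[peak_edge] / 2.0
    for edge in edges:
        if edge >= peak_edge and binned[edge] <= half:
            return edge
    return None


# Hypothetical SNP pairs, for illustration only.
pairs = [(5, 0.4), (15, 0.3), (25, 0.3), (45, 0.2), (65, 0.15)]
binned = bin_ld(pairs)
print(binned)
print(half_decay_distance(binned))  # 80
```

Averaging within bins before locating the half-maximum smooths the large pair-to-pair scatter in r2 that would otherwise dominate the curve.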

Association mapping

To identify loci underlying the genetic regulation of the traits mentioned above, SNPs were first filtered with PLINK 1.07 [49]: a SNP was removed from the analysis if its minor allele frequency was less than 5% across the panel or if its genotype was unknown for > 40% of the varieties. The remaining missing genotypes were imputed using Beagle 5.0 [50]. Genome-wide association (GWA) mapping was conducted using GEMMA software based on the SNP data and the phenotypic data [51, 52]. To visualize the association results, quantile–quantile (Q-Q) plots of observed p-values were constructed against expected p-values, and Manhattan plots were constructed with the chromosome position on the X-axis against –log10 (p-value) of all SNPs using the R ‘qqman’ package [53]. The p-values of the SNP markers were corrected for multiple testing by calculating the q-value (FDR adjusted p-value) for each trait. SNPs with a q-value lower than 0.05 were selected as significant markers.
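
The multiple-testing correction can be illustrated with a short Python sketch. It assumes the Benjamini-Hochberg procedure as the FDR adjustment, which the text does not specify, and the function name and p-values are hypothetical.

```python
def bh_qvalues(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: the FDR adjustment is Benjamini-Hochberg; other q-value
    estimators would give somewhat different numbers.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each q-value is monotone in rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


# Hypothetical p-values for four SNPs, for illustration only.
pvals = [0.01, 0.04, 0.03, 0.5]
qvals = bh_qvalues(pvals)
significant = [i for i, q in enumerate(qvals) if q < 0.05]
print(qvals)
print(significant)  # [0]
```

In this toy example the second and third SNPs have raw p-values below 0.05 but q-values above it, which is the kind of marginal association the correction is meant to screen out.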

QTL analysis

The list of candidate genes from GWA mapping was compared with the salt QTL mapping that was previously reported by Hu et al. [54], Patishtan et al. [36] and summarized in TropGENE [55], Gramene and http://www.plantstress.com/files/qtls_for_resistance.htm#salinity.

Results

Phenotypic variation among Thai rice accessions under salt effect

We evaluated photosynthetic parameters and cell membrane stability on 104 rice accessions individually at the flowering stage after salt stress for 3, 6 and 9 days and analyzed yield-related traits at harvesting time. The mean values and frequency distributions of all parameters of each accession are shown in Additional file 2: Table S2 and Additional file 3: Figure S1. The highest reduction of phenotypic traits was observed at day 9 after salt stress: photosynthetic rate, PN (− 49%); stomatal conductance, gs (− 50%); transpiration rate, E (− 43%), and cell membrane stability, CMS (− 18%) when compared with the control condition (Table 1). However, we found that the mean values of intercellular CO2 concentration, Ci increased about 6% at day 9 after salt stress treatment. For yield-related traits, on average, number of tillers per plant, TIL; number of panicles per plant, PAN; number of filled grains per plant, FG decreased by 19, 11 and 26%, respectively, whereas number of unfilled grains per plant, UFG increased by 10% (Table 1). To determine substantial genotypic variation in salt-stress responses, relative phenotypic values were calculated by the salt stability index of each rice accession [(salt/control) × 100] (Fig. 1). These parameters tended to decrease when plants were exposed to salt stress, except Ci, which tended to increase under salt stress. The variations of phenotypic traits were found in all parameters and were pronounced, particularly in the case of PN, FG and UFG (Fig. 1).
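
The salt stability index used above, (salt/control) × 100, can be written as a small Python helper. The function name and the trait values are hypothetical and serve only to show how an index of 50 corresponds to a 50% reduction under salt stress.

```python
def salt_stability_index(salt_value: float, control_value: float) -> float:
    """Return the relative phenotypic value (salt / control) * 100."""
    if control_value == 0:
        raise ValueError("control value must be non-zero")
    return salt_value / control_value * 100.0


# Hypothetical trait values for one accession, for illustration only.
index = salt_stability_index(10.0, 20.0)
print(index)          # 50.0
print(index - 100.0)  # -50.0, the percent change relative to control
```

An index above 100 indicates a trait that increases under salt stress, as reported here for Ci and UFG.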

Table 1 Mean and range of phenotypic values and yield-related traits of 104 rice accessions. Mean phenotypic values of control and salt-treated plants (n = 104) are shown with the SD

Fig. 1

Box plots for relative phenotypic values (a) and yield-related traits (b) (calculated as percent phenotypic value in salt divided by control condition). The median of each trait is shown as a horizontal bar in the box, and the lower and upper sides of a box represent the first and third quartile values of the distribution, respectively. Whiskers represent maximum/minimum values. PN: net photosynthetic rate; gs: stomatal conductance; E: transpiration rate; Ci: intercellular CO2 concentration; CMS: cell membrane stability; TIL: number of tillers per plant; PAN: number of panicles per plant; FG: number of filled grains per plant and UFG: number of unfilled grains per plant

The relationships among the salt stability indices of all parameters were determined by Pearson’s correlation r (Additional file 4: Table S3). We found a strong positive correlation between PN and gs, or E (Fig. 2). PN also had a positive correlation with CMS, though weaker, at days 6 and 9 after salt treatment. Conversely, a strong negative correlation between PN and Ci was found. As expected for yield-related traits, the strongest positive correlation was observed between TIL, PAN and UFG. In addition, relationships between photosynthetic performance and yield-related traits were observed. TIL was negatively correlated with gs at days 3 and 6, and with E at day 3. Similarly, PAN was negatively correlated with gs at day 3, as well as with PN. Following the same trend, UFG was negatively correlated with gs or E at day 3 after salt treatment, and with Ci at both days 3 and 6 (Fig. 2 and Additional file 4: Table S3). In an opposite trend, a positive correlation was found between FG and gs at day 6. At day 9, no correlation was observed between photosynthetic parameters and yield-related traits.
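
For reference, Pearson’s r between two sets of stability indices can be computed as in the following Python sketch. The function name and the index values are hypothetical; the code only illustrates the coefficient itself and does not reproduce the significance testing behind Fig. 2.

```python
import math


def pearson_r(x, y):
    """Return Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical stability indices for four accessions, for illustration only.
pn_index = [40.0, 55.0, 60.0, 80.0]
gs_index = [35.0, 50.0, 65.0, 85.0]
print(round(pearson_r(pn_index, gs_index), 3))
```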

Fig. 2

Pearson correlation coefficients computed for susceptibility indices of all traits and timings. PN: net photosynthetic rate; gs: stomatal conductance; E: transpiration rate; Ci: intercellular CO2 concentration; CMS: cell membrane stability, TIL: numbers of tillers per plant; PAN: number of panicles per plant; FG: number of filled grains per plant and UFG: number of unfilled grains per plant. Cells with correlation values not significant at p-value < 0.05 are left blank

SNP data, population structure and LD pattern in the panel

The list of rice accessions used for exome sequencing is shown in Additional file 1: Table S1. In total, 190 rice accessions were used for exome sequencing, with the capture probes designed to cover about 50 Mb of target sequence across all 12 chromosomes of rice. SNPs that showed a minor allele frequency (MAF) of < 5% in our population were removed to avoid overestimating the effect of low-MAF SNPs. The resulting 112,565 high-quality SNPs genotyped across this population (Fig. 3) were subsequently used for GWAS.
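
The SNP filtering rule (MAF of at least 5% and, as stated in the Methods, genotypes known for at least 60% of varieties) can be sketched in Python as below. This is not the PLINK implementation: the function name is hypothetical, and genotypes are assumed to be coded as alternate-allele counts (0, 1 or 2) with None for a missing call.

```python
def passes_filters(genotypes, min_maf=0.05, max_missing=0.40):
    """Return True if a SNP passes the MAF and missing-rate filters.

    genotypes: list of alternate-allele counts (0, 1, 2) or None when missing.
    """
    n = len(genotypes)
    if n == 0:
        return False
    called = [g for g in genotypes if g is not None]
    if (n - len(called)) / n > max_missing or not called:
        return False
    alt_freq = sum(called) / (2 * len(called))
    maf = min(alt_freq, 1.0 - alt_freq)
    return maf >= min_maf


# Hypothetical genotype vectors for 20 accessions, for illustration only.
common_snp = [0] * 14 + [2] * 6
rare_snp = [0] * 20
sparse_snp = [None] * 10 + [0, 2] * 5
print(passes_filters(common_snp))  # True
print(passes_filters(rare_snp))    # False
print(passes_filters(sparse_snp))  # False
```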

Fig. 3

The number of SNPs called using GATK within 1 Mb window size in 12 rice chromosomes

EIGENSOFT was used for population structure analysis, which was based on PCA. Using SNPs identified by exome sequencing, two main subpopulations were delineated (Additional file 5: Figure S2), consisting of five accessions in the first group and 185 accessions in the second group. The rice accessions in the first group included ‘Ai Tai’, ‘Jao Haw’, ‘Beu Saw Mi’, ‘E-Puang’ and ‘Leung Tah Young’ rice, which were grouped as upland rice (Additional file 1: Table S1). We also found that ‘Pokkali’ rice, which is a standard salt-tolerant variety, was separated from the two main subpopulations. Therefore, before association analysis, we removed the upland rice accessions to reduce strong subpopulation structure that may generate spurious associations between the phenotype and unlinked SNP markers.

For LD decay analysis of the panel, the binned r2 values were plotted against the physical distance, and the distance at which the average r2 dropped to half of its maximum value was taken as the LD decay distance. The average LD for SNPs at 20 kb distance from each other was 0.34 (r2), which decayed to its half value (~0.17) at around 80 kb (Fig. 4). Additionally, PLINK was used to calculate chromosome-wise LD between SNP pairs. At 5 kb from each other, the greatest r2 was found on chromosome 3 (r2 = 0.57) and the lowest r2 was found on chromosome 11 (r2 = 0.27).

Fig. 4

LD pattern and LD decay in the whole panel. The whole genome r2 values from PLINK were first sorted by r2 values, and then divided into 100 blocks of 20 kb. The r2 values in each block were averaged and plotted against the physical distance

Genome-wide association mapping and candidate loci associated with salt tolerance at flowering stage

To identify potential genes associated with salt stress in the Thai rice population, GWAS was performed using the SNP data and the phenotypic data of 10 parameters with GEMMA software [51]. Manhattan plots were generated to illustrate the significance of exome-sequencing SNPs associated with each trait. Using the mixed linear model (MLM), after correction for multiple testing, markers with a q-value (an FDR adjusted p-value) < 0.05 were considered significant. An FDR adjusted p-value threshold of 0.05 means that about 5% of the tests declared significant are expected to be false positives, so the number of spurious associations was greatly reduced. Altogether, 448 significant SNPs were found from GWA mapping of four traits, and the list of SNP positions, alternate SNPs and candidate loci is presented in Additional file 6: Table S4. A significant SNP was found in the GWA mapping of net photosynthesis at day 6 (PN.day6), located on chromosome 10 (Fig. 5a). There were two SNP peaks in the GWA mapping of PAN, on chromosomes 2 and 10 (Fig. 5c). A SNP peak was also found in the GWA mapping of FG, located on chromosome 4 (Fig. 5e), and there were four SNP peaks in the GWA mapping of UFG, on chromosomes 1, 7, 8 and 12 (Fig. 5g). Q-Q plots of expected versus observed p-values were drawn, and SNPs whose p-values deviated from the diagonal indicated plausible positives (Fig. 5b, d, f, h). Table 2 lists the loci identified by GWAS that contained multiple neighboring significant SNPs and/or significant SNPs of low p-value. Figure 6 and Additional file 7: Figure S3a and b show the regions containing those significant SNPs, with the shade of color in the blue bar representing the pair-wise LD, indicated by r2 value, for the SNP of the lowest p-value in that region.

Fig. 5

Manhattan and quantile-quantile (Q-Q) plots of GWAS. GWAS analysis was carried out using SNP markers from the exon region associated with (a and b) PN on day 6 after the salt stress treatment: PN.day6 (c and d) PAN (e and f) FG (g and h) UFG of 104 accessions as phenotypic data. For Manhattan plots, the x-axis represents SNP positions across the entire rice genome by chromosome and the y-axis is the negative logarithm of the p-value, -log10 (p), of each SNP. For Q-Q plots, the x-axis represents expected -log10 (p) and the y-axis is observed -log10 (p) of each SNP. PN: net photosynthetic rate; gs: stomatal conductance; E: transpiration rate; Ci: intercellular CO2 concentration; CMS: cell membrane stability; TIL: number of tillers per plant; PAN: number of panicles per plant; FG: number of filled grains per plant and UFG: number of unfilled grains per plant

Table 2 List of genes identified by GWAS that contained multiple neighboring significant SNPs and/or significant SNPs of low p-value

Fig. 6

The peak regions on rice chromosomes containing significant SNPs from GWAS of PN.day6 (a), PAN (b and c), FG (d) and UFG (e and f). The pair-wise LD for the SNP of the lowest p-value (red letters) is indicated as r2 values, where the markers were divided into bins of 5 kb and the r2 values were averaged and shown as blue bars; the darkest blue represents a value of 1 and the lightest represents a value of 0. The dotted lines denote the regions containing the LD blocks in which the significant SNPs reside. Examples of other significant SNPs are shown in green letters. Note that the diagram of r2 values represents all neighboring SNPs present in that region and is not proportional to the physical distance on the chromosome. PN: net photosynthetic rate; PAN: number of panicles per plant; FG: number of filled grains per plant and UFG: number of unfilled grains per plant

Comparison of the GWAS prediction and previously reported QTLs

Overall, GWAS mapping identified 448 significant SNPs in the exome, which were located in 200 genes (Additional file 6: Table S4). Among these, 146 genes co-localized with salinity-related QTLs, accounting for 73% of all candidate genes and covering all rice chromosomes. Figure 7 shows the salt-related QTLs on which candidate genes were co-located. The top four regions that contained a high number of significant SNPs were found on chromosomes 8, 12, 1 and 2, respectively (Table 3). The region containing the highest density of significant SNPs (100 SNPs) was located between markers RM7027 and RM826 on chromosome 8, which was related to the salt evaluation score (SES) of rice [56]. The second highest density of significant SNPs (33 SNPs) was located in qGY12.1 (RM519-RM1103) on chromosome 12, which was associated with grain yield (GY) of rice under salt stress at the reproductive stage. On chromosome 3, 28 significant SNPs were located in qGP3 (RM49–RM6712), which is involved in the germination percentage (GP) of rice under salt stress [57]. This region overlapped with two QTLs, qPL3.1 s (RM520–RM570) and qSHL-3 (RM7000–RM7389), which were correlated with panicle length (PL) and shoot length (SHL) [56, 58]. Finally, among salt-related QTLs of rice involving days of seedling survival (DSS) on chromosome 2, we found that qDSS2.1 (RM109-RM110) contained 19 significant SNPs identified in this study.
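
The co-localization comparison amounts to checking whether a candidate gene overlaps a reported QTL interval on the same chromosome. The Python sketch below illustrates that check under stated assumptions: the function name, gene names and all coordinates are hypothetical, and QTL intervals are assumed to be already converted from flanking markers to physical positions.

```python
def colocalized_genes(genes, qtls):
    """Return the set of gene names that overlap at least one QTL interval.

    genes: dict mapping gene name to (chromosome, start, end).
    qtls:  list of (chromosome, start, end) intervals.
    """
    hits = set()
    for name, (chrom, start, end) in genes.items():
        for q_chrom, q_start, q_end in qtls:
            # Two closed intervals overlap when each starts before the other ends.
            if chrom == q_chrom and start <= q_end and end >= q_start:
                hits.add(name)
                break
    return hits


# Hypothetical genes and QTL intervals (positions in bp), for illustration only.
genes = {
    "gene_A": (8, 1_200_000, 1_204_000),
    "gene_B": (8, 9_500_000, 9_503_000),
    "gene_C": (10, 1_700_000, 1_702_500),
}
qtls = [(8, 1_000_000, 2_000_000), (12, 500_000, 900_000)]

hits = colocalized_genes(genes, qtls)
print(sorted(hits))                  # ['gene_A']
print(100 * len(hits) / len(genes))  # percent of candidate genes inside a QTL
```

Applied to the counts reported above, the same percentage calculation gives 100 × 146 / 200 = 73%.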

Fig. 7

Locations of reported salt QTLs on the 12 rice chromosomes that overlap with candidate genes from GWA mapping. The positions of the QTL regions correspond to Table 3. The black lines in each chromosome represent the locations of significant loci. The boxes on the right-hand side of each chromosome are salt QTLs identified in previous studies: Ammar et al. [106], Bimpong et al. [73], Ghomi et al. [56], Hossain et al. [107], Koyama et al. [19], Lee et al. [76], Liang et al. [108], Lin et al. [75], Yao et al. [77], Mohammadi et al. [58], Prasad et al. [109], Qiu et al. [74], Sabouri and Sabouri [80], Takehisa et al. [78], Ul Haq et al. [110], Wang et al. [57], Wang et al. [79]. R 'chromPlot' was used to draw this illustration [111].

Table 3 Summary of reported QTLs and candidate genes from GWA mapping associated with salt tolerance that are co-located to these segments

Discussion

Variations of the salt tolerance of rice at the flowering stage

Salt tolerance of rice is a dynamic trait affected by growing stage and genotype [59, 60]. This study is the first large-scale tolerance evaluation and GWAS focusing on salt stress at the reproductive stage of rice [61]. In this study, we applied salt stress to flowering rice and evaluated four parameters of photosynthetic performance (photosynthetic rate, PN; stomatal conductance, gs; transpiration rate, E; and intercellular CO2 concentration, Ci), cell membrane stability (CMS) and four yield-related traits (number of tillers per plant, TIL; number of panicles per plant, PAN; number of filled grains per plant, FG; and number of unfilled grains per plant, UFG). On average, the stability indices of photosynthetic performance and CMS decreased under salt stress, whereas Ci increased, as did the yield parameter UFG (Table 1 and Fig. 1). For PN, E and gs, these findings agreed with previous studies describing salt-induced photosynthesis reduction in rice seedlings [62,63,64]. Consistently, during the reproductive stage, Moradi and Ismail [65] found that PN, E and gs were inhibited in the flag leaf under salinity. Additionally, the salt-sensitive rice cultivar IR29 displayed higher Ci than others when exposed to salt stress during both seedling and reproductive stages. As indicated by Burghardt et al. [27], GWAS has greater power to discover genes affecting a trait of interest when phenotypic variation is large than when it is small. In this study, large variation under salt stress was observed in PN among the photosynthetic parameters and in PAN, FG and UFG among the yield-related traits (Fig. 1). Correspondingly, we found high detection power of association mapping for these parameters, whereas GWAS for the other parameters, which exhibited lower variation, was not successful (Fig. 5).

Rice genome variations and genome-wide association mapping

Using efficient, high-precision exome capture and sequencing, we have identified 112,565 SNPs. Previous studies used SNP arrays to identify SNP markers for GWAS in rice and yielded fewer SNP markers than our study [33, 38, 66]. Although the present study focused on exonic regions, which account for only about one-sixth of the rice genome, it yielded more than 100,000 SNPs. Although several statistically robust models have been developed for GWAS [67], population structure can limit its effectiveness [68, 69]. Our Thai rice population belongs to the indica group (Additional file 5: Figure S2). While its size is relatively small compared with human studies, GWAS with a similar population size has been effective in Arabidopsis [24] and rice [26]. Indeed, the diversified composition of our population, its lack of strong subpopulation structure, and its homozygosity facilitated GWAS [26, 38].

Candidate genes associated with salt tolerance of rice at the flowering stage

Altogether, GWAS using this Thai rice population leveraged more than 110,000 SNPs to identify 448 SNPs associated with salt tolerance, which were located in 200 loci in the rice genome. As presented in Table 3, 73% of candidate genes from association mapping associated with salt stress were located within salinity tolerance QTLs identified in bi-parental segregating populations.

Functional annotation of the 200 identified genes revealed a number of plausible candidates. The gene annotations we employed rely on the presence of a protein domain or of a homolog with a known function in rice and other crop species such as maize or sorghum [70, 71], as well as Arabidopsis [72]. Two chromosomes contained the highest number of reported salt QTLs (Table 3 and Fig. 7) overlapping with 7 of our candidate loci: chromosome 1, which included 16 QTLs [58, 73,74,75,76,77,78, 107] and chromosome 2, which included 10 QTLs [56, 58, 74, 77, 79, 80]. The nature of the candidate genes indicates that different molecular and cellular strategies have evolved to favor survival during salt stress [81]. Several genes belong to the receptor kinase family (LOC_Os01g66740, LOC_Os01g66760, LOC_Os02g02120 and LOC_Os02g56630), encoding signaling factors during environmental stresses [82, 83]. LOC_Os01g18850, one of the candidate genes detected by GWA mapping of the UFG trait, encodes a SQUAMOSA promoter binding protein-like (SPL) transcription factor (TF), a plant-specific TF suggested to affect a broad range of growth and development processes, including flower development [84]; 19 SPL genes have been identified in rice [85]. The role of SPL genes in the salt stress response has been studied by Mao et al. [86]. Thirty-one SPL genes were identified in maize, and their expression profiles revealed that most SPL genes were induced under salt stress.

One candidate gene identified here (LOC_Os01g59020) encodes a cytochrome P450 monooxygenase. Enzymes of this family, found in bacteria, plants and humans, share a common catalytic center: a heme with an iron coordinated to the thiolate of a conserved cysteine [87]. They oxidize disparate substrates through activation of molecular oxygen. The plant P450 gene superfamily plays crucial roles in plant metabolic processes [88]. Narusaka et al. [89] analyzed the expression of 49 Arabidopsis P450 genes under various stresses, including salt stress, and found that 29 of them were induced by at least some of these stresses. In the CYP709B subfamily of P450, a cyp709b3 Arabidopsis mutant showed sensitivity to salt stress during germination and high salt damage at the seedling stage [90]. In rice, Tamiru et al. [91] reported that a P450 gene, OsDSS1, located on chromosome 3, was involved in growth and drought stress responses. Compared to WT, the dss1 rice mutant exhibited improved recovery after germination under drought stress. Additionally, ectopic expression of the P450 gene PtCYP714A3 from Populus trichocarpa was studied in rice. Transgenic rice expressing PtCYP714A3 was semi-dwarf with improved tolerance to salt and osmotic stress, resulting in higher survival rates than WT [92].

Interestingly, several novel candidate loci with 144 significant SNPs identified from this GWA mapping were found on chromosome 10, where no salt QTL has been reported. This represents the highest density of significant SNPs found in the same LD block (Fig. 6c). Seven of these SNP-associated genes encode F-box domain containing proteins (LOC_Os10g03620, LOC_Os10g03660, LOC_Os10g03730, LOC_Os10g03740, LOC_Os10g03780, LOC_Os10g03930 and LOC_Os10g05500). Previous studies have reported the role of F-box proteins in regulating various abiotic stress responses in Arabidopsis, wheat and rice [93,94,95,96,97]. F-box proteins, characterized by a conserved N-terminal F-box domain (40–50 amino acids), are components of the multi-subunit ubiquitin E3 ligase, the enzyme acting in the last step of the ubiquitination pathway [98, 99]. The rice genome harbors more than 600 F-box genes whose divergence is consistent with adaptive roles [100], and the regulation of 25 of these genes responds to salinity stress [94]. Rice seedlings overexpressing the F-box protein gene MAIF1 showed reduced inhibition of root growth and reduced tolerance under salt stress compared with WT [97]. Salt induced the expression of OsMsr9, a novel putative F-box containing protein in rice, especially in the panicle. Overexpression of OsMsr9 increased root length, shoot length and survival rate under salt stress [101].

Moreover, the SNP with the lowest p-value (9.04 × 10^−11), found on chromosome 12 in the GWA mapping of UFG (Table 2, Additional file 7: Figure S3d), was located in LOC_Os12g36630, which is annotated as a universal stress protein (USP) domain containing protein. USP genes, which encode proteins containing the 140–160 highly conserved residues of the Universal Stress Protein A domain (USPA, Pfam accession number PF00582), are widely distributed across many organisms, including plants. These genes have been reported as environmental stress-responsive genes that contribute to the ability of plants to respond to stress [102, 103]. To date, there is no report on the role of USP genes in salt-treated rice. However, a study of OsUsp1 in rice under oxygen deficiency found that OsUsp1 expression was strongly induced within 1 h of submergence and that it played a role in ethylene-mediated stress adaptation [104]. Furthermore, a role of the USP protein (At3g53990: AtUsp) in enhancing oxidative stress tolerance has been reported in the model plant Arabidopsis [105]. The authors found that over-expression of AtUSP conferred strong tolerance to oxidative stress, primarily via its chaperone function.

Conclusion

High-quality genotyping data from high-throughput sequencing combined with robust statistical analysis enables GWA mapping of complex quantitative traits. We conducted GWAS for salt tolerance during rice reproduction based on high-density SNPs in exon regions using indica Thai accessions. Altogether, the significant SNPs were located in 200 loci distributed among all rice chromosomes. Our GWA mapping was highly consistent with previous salt tolerance QTL mapping studies conducted in bi-parental populations. Overall, more than 73% of the candidate genes controlling salt tolerance identified in our GWAS overlap with the salt QTLs. While many of the candidate genes are novel, their annotation is consistent with potential involvement in plant salt tolerance and in related agronomic traits. These significant SNPs help narrow down the regions within these QTLs where the likely underlying candidate genes can be identified. Knowledge of the varieties with high salt tolerance, as well as the associated SNPs from this study, will be useful for future improvement of rice yield productivity under salt stress.