Introduction

Utilization of renewable energy resources could decrease the negative environmental impacts caused by burning non-renewable fossil fuels. Second-generation, lignocellulosic feedstocks such as switchgrass (Panicum virgatum L.) [1] could produce fuel with less impact on food security than grain crops like corn (Zea mays L.). Switchgrass is a perennial warm season (C4) bunchgrass native to North America [2] and a predominant species of the tallgrass prairie ecosystems naturally grown from northern Mexico to southern Canada east of the Rocky Mountains [3]. Switchgrass is an allogamous species with abundant genetic variation that can be harnessed to improve biomass yield and cell wall composition traits for biofuel conversion [4, 5].

Lignocellulosic biomass is naturally recalcitrant to saccharification, limiting the production of liquid fuels. The plant cell wall is a complex structure consisting of cellulose microfibrils embedded in a network of hemicellulose and lignin. The presence of lignin in plant cell walls obstructs access of the hydrolysis and fermentation enzymes to the structural carbohydrates. This limits sugar recovery from cellulose and hemicellulose [6] and, consequently, impacts ethanol yield from biomass feedstock [7, 8]. Genotypic variation for lignin content and the ratio of syringyl to guaiacyl units (S/G ratio) were shown to significantly affect amenability of Populus (Populus trichocarpa Torr. & Gray) biomass feedstocks to hydrolysis [9]. An increase in dilute acid hydrolysis rate was observed with decreasing S/G ratios, which may be due to reduced covalent cross-linking [9]. Compared to woody feedstock species such as pine and eucalyptus, switchgrass had faster hydrolysis rates and higher sugar yields with dilute acid and ionic liquid pretreatment [10] due to its inherent low S/G ratio.

Research at the Department of Energy (DOE) Bioenergy Science Center (BESC) is focused on reducing recalcitrance of plant cell walls, mostly through transgenic approaches. Genetic modification of the cell wall composition may not only reduce recalcitrance and enhance amenability of lignocellulose to hydrolysis into simple sugars for fermentation but may also significantly enhance the economics and efficiency of the conversion processes [11]. Reduction of lignin content has been achieved through downregulation of genes involved in the phenylpropanoid pathway or overexpression of gene transcription suppressors [12–15]. However, some transgenic switchgrass plants with significantly reduced lignin content have also exhibited abnormal growth and developmental characteristics [12, 13, 15–17], which may limit their utility as a feedstock.

In addition to reducing switchgrass recalcitrance via a transgenic approach, due attention should be given to the assessment of the natural genetic variation that exists for lignin content and sugar release efficiency, their ultimate relationship, and the genomic regions underlying these traits. Quantitative trait loci (QTL) underlying cell wall composition could be manipulated using genetic markers, and marker-assisted breeding could be used to rapidly and efficiently develop improved biofuel-ready switchgrass cultivars. The objective of this study was to identify QTL for cell wall composition traits such as lignin content, S/G ratio, and sugar (glucose and xylose) release in a segregating switchgrass population derived by crossing an individual genotype from each of the two main switchgrass ecotypes (lowland and upland).

Materials and Methods

Mapping Population and Field Evaluations

An F1 pseudo-testcross population was developed by crossing the lowland ecotype AP13 (female) with the upland ecotype VS16 (male) and has been previously described [18]. Field experiments were conducted at Ardmore and Burneyville in OK, and Watkinsville, GA as described in detail in Serba et al. [19]. Briefly, each genotype was clonally propagated and grown in four replicates at each of the three locations. The field experiments at Ardmore and Burneyville were transplanted on July 19, 2007 and May 08, 2008, respectively, using a randomized complete block design (RCBD) arranged in a honeycomb [20]. At Watkinsville, the experiment was laid out in RCBD with four replications. Two replicates were transplanted in the field on September 25, 2007, while the remaining two were transplanted adjacent to the first two on April 30, 2008. Detailed crop management practices such as seedbed preparation, fertilizer application, weed control, and interplant cultivation were as described previously [19]. Biomass was harvested after complete senescence in the fall in all the environments. Harvesting was conducted manually using hedge trimmers.

Cell Wall Composition Analysis

Total shoot biomass, including stem, leaf, and panicle harvested after senescence, was sampled from two replications in 2009 and 2010 in both locations in OK and in 2010 in GA, for a total of five location-year environments. Samples were oven dried at approximately 40 °C for 72 h and milled to a 1-mm particle size (mesh size 20) using a Thomas Wiley® Mill Model 4 (Thomas Scientific, Swedesboro, NJ, USA). The milled samples were analyzed for lignin content and sugar (glucose and xylose) release estimates at the National Renewable Energy Laboratory (NREL), Golden, CO, USA.

Sugar release estimates were generated using a high-throughput screening method that involved hydrothermal pretreatment and enzymatic hydrolysis [21–23]. Biomass was treated with alpha-amylase (Spirizyme Ultra, 0.25 %) and beta-glucosidase (Liquozyme SC DS, 1.5 %) in 0.1 M sodium acetate (24 h, 55 °C, pH 5.0) to remove possible starch content using 16 mL enzyme solution per 1 g biomass. This was followed by an ethanol (190 proof) Soxhlet extraction for an additional 24 h to remove other soluble compounds. After drying the samples overnight, 5 mg (±0.5 mg) was weighed in triplicate into one of 96 wells in a solid Hastelloy microtiter plate. Water was added (250 μL); the samples were sealed with silicone adhesive and Teflon tape and heated to 180 °C for 17.5 min. Once cooled, 40 μL of buffer-enzyme stock (8 % CTec2 (Novozymes) in 1 M sodium citrate buffer) was added. The samples were then gently mixed and left to statically incubate at 50 °C for 70 h. After the 70-h incubation, an aliquot of the saccharified hydrolysate was diluted and tested using Megazyme (Wicklow, Ireland) glucose oxidase/peroxidase (GOPOD) and xylose dehydrogenase (XDH) assays. For quantitation, results were calculated using mixed glucose/xylose solutions, and the amount of glucose and xylose released to liquid was measured using a colorimetric assay. Sugar release data were reported in milligrams of sugar released per gram of biomass residues. The average of the triplicates for each sample was used as a data point for statistical analysis.

Lignin content and S/G lignin monomer analysis were performed using pyrolysis molecular beam mass spectrometry (py-MBMS), following the protocol described by Sykes et al. [24]. Approximately 4 mg of the cell wall residue per sample was loaded into 80-μL stainless steel cups and pyrolyzed at 500 °C in a quartz reactor using a Frontier PY-2020 autosampler. The resulting pyrolysis vapors exit through a 250-μm crystal orifice and expand into a vacuum, quenching the reaction. A molecular beam is formed when a portion of the pyrolysis vapor stream is pulled through a 1-mm stainless steel skimmer, and the beam is then ionized using electron impact (EI) ionization at −17 eV. Ions travel through a series of focusing lenses and are then bent 90° to travel through a single quadrupole to be detected as a mass spectrum. The relative intensities of the mass spectra peaks identified for lignin precursors were summed to estimate total lignin content [24]. S/G ratio was determined by dividing the sum of the intensity of the syringyl peaks by the sum of the intensity of the guaiacyl peaks.
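The S/G calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the py-MBMS processing pipeline of Sykes et al. [24]: the function name `sg_ratio` and the example intensities are hypothetical, and the assignment of specific mass spectrum peaks to syringyl or guaiacyl units is assumed to have been done beforehand.

```python
def sg_ratio(syringyl_intensities, guaiacyl_intensities):
    """Return S/G as the summed syringyl intensity over the summed guaiacyl intensity."""
    s_total = sum(syringyl_intensities)
    g_total = sum(guaiacyl_intensities)
    if g_total <= 0:
        raise ValueError("summed guaiacyl intensity must be positive")
    return s_total / g_total


# Hypothetical normalized peak intensities, for illustration only.
example_s = [1.2, 0.8, 1.0]
example_g = [1.5, 1.6, 1.4]
print(round(sg_ratio(example_s, example_g), 2))
```

A ratio below 1, as in this made-up example, corresponds to lignin in which guaiacyl units predominate.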

Statistical Data Analysis

Normality of the composition trait data was checked with the UNIVARIATE procedure using a Q-Q plot of residuals. All the traits fit a normal distribution. Analysis of variance was conducted to test the effects of environment, genotype, and genotype × environment interaction on cell wall composition using the mixed procedure of the SAS statistical software program version 9.3 (SAS Institute, Cary, NC, USA). Replication and environments (location-year combination) were considered to be random, and genotypes were considered fixed effects. Least square means of genotypes across environments were computed and used for QTL analysis.

Broad sense heritability (H²) for all traits in the combined dataset was calculated from the variance components generated by PROC VARCOMP in SAS. Heritability estimates were calculated using the following formula:

$$ {H}^2=\frac{\mathrm{genotypic}\ \mathrm{variance}}{\left[\mathrm{genotypic}\ \mathrm{variance}+\left(\frac{\mathrm{gei}\ \mathrm{variance}}{n}\right)+\left(\frac{\mathrm{error}\ \mathrm{variance}}{nr}\right)\right]} $$

where “gei” is genotype × environment interaction, “r” is the number of replications, and “n” is the number of environments.
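As a worked illustration of the formula, the sketch below computes H² from a set of variance components. The function name and the numerical values are hypothetical and are not estimates from this study; in practice the variance components would come from a procedure such as PROC VARCOMP.

```python
def broad_sense_heritability(var_g, var_gei, var_error, n_env, n_rep):
    """H^2 = var_g / (var_g + var_gei / n + var_error / (n * r))."""
    if n_env <= 0 or n_rep <= 0:
        raise ValueError("n_env and n_rep must be positive")
    denominator = var_g + var_gei / n_env + var_error / (n_env * n_rep)
    if denominator <= 0:
        raise ValueError("phenotypic variance must be positive")
    return var_g / denominator


# Hypothetical variance components with n = 5 environments and r = 2 replications:
# 4 / (4 + 5/5 + 20/(5*2)) = 4 / 7
h2 = broad_sense_heritability(var_g=4.0, var_gei=5.0, var_error=20.0, n_env=5, n_rep=2)
print(round(h2, 3))
```

Because the interaction and error variances are divided by n and nr, this is a heritability on an entry-mean basis: testing in more environments or replications raises H² for the same underlying variance components.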

QTL Detection

QTL analysis for sugar release, lignin content, and lignin S/G ratio was conducted using least square means across environments for the 188 progeny plants for which marker genotype data were available. Because we did not detect significant genotype × environment (G × E) interactions for the lignin content and sugar release traits, we analyzed data across environments to obtain genotypic means for QTL analysis. The QTL analysis was conducted using WinQTL Cartographer v2.5 [25]. A three-step QTL analysis was conducted starting with a non-parametric single-marker analysis [26] to establish marker-trait associations. Then, simple interval mapping [27] was conducted using a linear model for F1 progenies in an outcrossing species [28] for estimating relationships between phenotypic values and putative QTL positions. Finally, the QTL were confirmed with composite interval mapping (CIM) [29, 30]. CIM was performed with forward and backward stepwise regressions at a threshold of p < 0.05 for automatic cofactor selection, a window size of 10 cM, and a 1.0 cM walking speed along the linkage groups (LGs). Permutation analysis repeated 1000 times was used to set the genome-wide threshold of significance (p < 0.05) for QTL at 2.4 for glucose release, 2.7 for xylose release, and 3.1 for lignin content. We declared a “putative QTL” at the threshold log of odds (LOD) minus 0.5. Main effect QTL were validated by inclusive composite interval mapping (IciMapping) using the QTL in biparental population (BIP) functionality [31]. Epistatic QTL were rescanned using the QTL-by-environment interaction in biparental population functionality of the IciMapping software. QTL designations consisted of abbreviations for the trait names (Glu = glucose, Xyl = xylose, and Lig = lignin content) followed by the LG name (1 to 9) and sub-genome (“a” or “b”) and a serial number when there were two or more QTL on the same LG.
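The logic of a genome-wide permutation threshold can be sketched as follows. This is a deliberately simplified illustration, not the CIM permutation routine of WinQTL Cartographer: it assumes a single-marker scan with two genotype classes per marker (as in a pseudo-testcross), computes the LOD from a one-way regression as (n/2)·log10(RSS0/RSS1), and uses hypothetical function names.

```python
import math
import random


def marker_lod(phenotypes, genotypes):
    """LOD of a marker-class model against the null (grand mean) model."""
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    rss_null = sum((y - grand_mean) ** 2 for y in phenotypes)
    rss_marker = 0.0
    for cls in set(genotypes):
        values = [y for y, g in zip(phenotypes, genotypes) if g == cls]
        cls_mean = sum(values) / len(values)
        rss_marker += sum((y - cls_mean) ** 2 for y in values)
    if rss_null == 0:
        return 0.0  # no phenotypic variation to explain
    if rss_marker == 0:
        return math.inf  # marker classes explain all variation
    return (n / 2) * math.log10(rss_null / rss_marker)


def permutation_threshold(phenotypes, marker_matrix, n_perm=1000, alpha=0.05, seed=0):
    """Empirical (1 - alpha) quantile of the genome-wide maximum LOD under permutation."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    max_lods = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the genotype-phenotype association
        max_lods.append(max(marker_lod(shuffled, m) for m in marker_matrix))
    max_lods.sort()
    index = math.ceil((1 - alpha) * len(max_lods)) - 1
    index = max(0, min(len(max_lods) - 1, index))
    return max_lods[index]
```

Taking the maximum LOD over all markers in each permutation is what makes the threshold genome-wide rather than marker-wise. A putative QTL in the sense used here would then be any peak exceeding this threshold minus 0.5.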

Localization of QTL Flanking Markers in the Switchgrass Genome Sequence

To determine the physical location of QTL and identify putative candidate genes underlying the QTL, flanking marker sequences were used as queries in a BLASTn search [32] against the P. virgatum AP13 v1.1 genome sequence assembly (P. virgatum v1.1, DOE-JGI, http://www.phytozome.net/panicumvirgatum, accessed November 20, 2015). Genes annotated in the region spanned by the QTL flanking markers plus 50 kbp at either side were selected as candidate genes, and their functional annotation was recorded. Most of the QTL identified in this study had LOD scores <3.0. Thus, we delimited putative QTL intervals using the criterion “threshold LOD minus 0.5” with whiskers at LOD scores “threshold minus 1.”
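The interval-based selection of candidate genes can be sketched as below. The sketch assumes that the BLASTn hits of the two flanking markers have already been reduced to base-pair positions on one chromosome and that gene annotations are available as simple records; the `Gene` type, the function name, and the example gene and chromosome names are all hypothetical.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    chrom: str
    start: int  # bp
    end: int    # bp


def candidate_genes(genes, chrom, left_marker_pos, right_marker_pos, flank_bp=50_000):
    """Return genes overlapping the flanking-marker interval extended by flank_bp on each side."""
    lo = max(0, min(left_marker_pos, right_marker_pos) - flank_bp)
    hi = max(left_marker_pos, right_marker_pos) + flank_bp
    return [g for g in genes if g.chrom == chrom and g.end >= lo and g.start <= hi]


# Hypothetical annotation records, for illustration only.
annotation = [
    Gene("geneA", "Chr03a", 100_000, 102_000),
    Gene("geneB", "Chr03a", 400_000, 405_000),
    Gene("geneC", "Chr03a", 560_000, 565_000),
    Gene("geneD", "Chr07b", 300_000, 301_000),
]
hits = candidate_genes(annotation, "Chr03a", 150_000, 500_000)
print([g.name for g in hits])
```

With markers at 150 and 500 kbp, the search window runs from 100 to 550 kbp, so only the first two hypothetical genes are retained.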

Results and Discussion

Population Performance

Statistically significant differences were observed among genotypes and environments (years and locations) for the cell wall composition traits investigated (Table 1). Genotype-by-environment interaction effects were only significant for S/G ratio. The two parents significantly differed in sugar release and lignin content but were similar in their S/G ratio (Table 2). All traits segregated in the progeny, and the prevalence of genetic variation among genotypes of the population for the cell wall traits suggests that significant improvements can be achieved through selection. Since we are mapping in an F1 pseudo-testcross population, it is the variation present within each of the heterozygous parents that is pertinent.

Table 1 Effects of various sources of variance on sugar release, lignin content, and S/G ratio for AP13 × VS16 F1 pseudo-testcross population combined across five environments (Ardmore and Burneyville, OK in 2009 and 2010 and Watkinsville, GA in 2010)
Table 2 Population mean and heritability of sugar release, lignin content, and S/G ratio for AP13 × VS16 F1 pseudo-testcross population combined across five environments (Ardmore and Burneyville, OK in 2009 and 2010 and Watkinsville, GA in 2010)

The average total sugar (glucose and xylose) release across five environments ranged from 287 to 384 mg g−1, with an overall average of 322 mg g−1. The highest average sugar release was observed at Ardmore in 2009 and the lowest at Burneyville in 2010 (Fig. 1a). The mean lignin content across the five environments ranged from 23.0 to 26.7 % (Fig. 1b). The mean S/G ratio ranged from 0.63 to 0.74 (Fig. 1c), indicating that switchgrass lignin is mainly composed of the guaiacyl (G) monomers derived from coniferyl alcohol. A low concentration of syringyl (S) units, derived from sinapyl alcohol, is in line with results from previous analyses of lignin composition in grasses [10].

Fig. 1 Mean sugar release (a), lignin content (b), and S/G ratio (c) for the AP13 × VS16 population across five environments (Ardmore and Burneyville, OK in 2009 and 2010 and Watkinsville, GA in 2010). Means with the same letter within a given trait are not significantly different (p < 0.05)

Mean sugar release was lower in 2010 than in 2009 across locations, concomitant with and probably due to an increase in lignin content, which was 12 % higher in 2010 than in 2009 (Fig. 2). The S/G ratio of lignin composition was higher in 2010 than in 2009. This could be favorable for biofuel processing because a higher concentration of S lignin relative to G lignin increases sugar release or the efficiency of biomass conversion [8, 33]. However, a small reduction in S/G ratio was found to improve dilute acid hydrolysis in Populus [9]. Variation in lignin content across years could be caused by environmental factors that affected plant growth (plant height and biomass yield were also higher in 2010 than in 2009 [19]) or by biotic and abiotic stresses that are known to affect lignification [34]. It is also possible that the differences in lignin content across years are due to changes in the leaf to stem ratio because stems have a higher lignin content and a higher S/G ratio than leaves [35–37]. Future assessments of the natural variation in lignin content across environments should also consider the relative ratio of leaves to stems in harvested biomass.

Fig. 2 Frequency distribution of sugar release and lignin content in the AP13 × VS16 F1 pseudo-testcross mapping population evaluated across five environments (site-year combinations)

The broad sense heritability for cell wall composition traits was estimated from a combined analysis of variance components (Table 2) at 0.27 for glucose release and 0.54 for xylose release. The heritability of lignin content was 0.34 and that of S/G ratio was 0.38. These low to moderate heritability estimates indicate that a substantial share of the phenotypic variation is under genetic control. The estimates are similar to those made for biomass yield (0.29–0.65) and related traits (plant height <0.27, tillering 0.28–0.48, etc.) in other populations of switchgrass [38–40].

Impact of Lignin Content on Sugar Release

A concomitant reduction in total sugar release was observed with an increase in lignin content (Fig. 3a). Lignin is a complex phenolic polymer that has an important role in structural integrity of the stems [41]. Lignification and cross-linkage of lignin with other cell wall components form a natural barrier against the penetration of degrading enzymes through the cell wall [42]. This barrier negatively impacts sugar release from cellulose and hemicellulose [43]. We found that a 1 % increase in lignin content reduced sugar release by 10.6 mg g−1 (Fig. 3). The impact of lignin content was higher for glucose release than for xylose release (Fig. 3b, c). From the regression analysis, we estimated that a 1 % increase in lignin content brought about a 7.9 mg g−1 reduction in glucose release and a 2.75 mg g−1 reduction in xylose release. Interestingly, in a study of natural variants of Populus, Studer et al. [8] found that the strong negative impact of lignin content on glucose release was observed only for pretreated samples with a low S/G ratio (<2.0). Xylose release, on the other hand, did not correlate with lignin content. Since the S/G ratio is low in switchgrass, our observation that lignin content predominantly affects glucose release efficiency agrees with those observations. Decreased lignin content due to downregulation of 4-coumarate:coenzyme A ligase (4CL), a gene in the phenylpropanoid biosynthetic pathway, resulted in an increase in glucose but not xylose release [15].
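The regression slopes reported above are related by simple arithmetic: because total sugar is the sum of glucose and xylose, the least squares slope for total sugar equals the sum of the two component slopes (7.9 + 2.75 = 10.65 mg g−1 per 1 % lignin, consistent with the reported 10.6 up to rounding). The sketch below demonstrates this property with ordinary least squares on hypothetical data points that were constructed to reproduce the reported slopes; they are not observations from this study.

```python
def ols_slope_intercept(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical genotype means: lignin (%) and sugar release (mg per g).
lignin = [23.0, 24.0, 25.0, 26.0]
glucose = [230.0, 222.1, 214.2, 206.3]
xylose = [110.0, 107.25, 104.5, 101.75]
total = [g + x for g, x in zip(glucose, xylose)]

slope_glu, _ = ols_slope_intercept(lignin, glucose)
slope_xyl, _ = ols_slope_intercept(lignin, xylose)
slope_total, _ = ols_slope_intercept(lignin, total)
print(round(slope_glu, 2), round(slope_xyl, 2), round(slope_total, 2))
```

The additivity of slopes holds for any data set fitted against the same predictor, so it serves as a quick consistency check on the three panels of the regression figure.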

Fig. 3 Simple linear regression of lignin content to sugar release in AP13 × VS16 F1 pseudo-testcross mapping population. Data were averaged across five environments: (a) total sugar release, (b) glucose release, and (c) xylose release

QTL Mapping

A total of nine QTL for sugar release and 14 QTL for lignin content were detected in the female and male maps (Table 3, Fig. 4). Three QTL each for glucose release were detected in the female and male maps, while one and two xylose release QTL were detected in the female and male maps, respectively. No QTL was detected for S/G ratio at the threshold LOD (1.97) determined, as for the other traits, from the genome-wide 1000-permutation test minus 0.5. Most of the QTL for sugar release and lignin content that mapped to the same LG had opposite additive effects. For instance, on LG VIIb-f, Glu7bf had a positive additive effect on sugar release and Lig7bf had a negative additive effect on lignin content.

Table 3 Main effect QTL detected for sugar release and lignin content in AP13 × VS16 F1 pseudo-testcross mapping population tested across five environments (Ardmore and Burneyville, OK in 2009 and 2010 and Watkinsville, GA in 2010)
Fig. 4 Main effect QTL for sugar release and lignin content detected with overall LSM across five environments in AP13 (female, lowland) and VS16 (male, upland) linkage maps. QTL with LOD values equal to or greater than 2.0 are presented. QTL for glucose and xylose release, and lignin content were named as Glu, Xyl, and Lig, respectively, followed by the LG name on which the QTL was detected (serial numbers were included when there was more than one QTL on an LG). The box on the QTL chart indicates a threshold LOD minus 0.5, and the whiskers on one or both ends of the QTL are a threshold LOD minus 1.0

The individual QTL identified for lignin and sugar release each explained less than 10 % of the phenotypic variation in the population, except for Lig7bf, which explained 11.1 % (Table 3). The relatively low phenotypic variation explained (PVE) by individual QTL indicates that these traits are controlled by several QTL with small effects. Therefore, improvement in feedstock quality can occur by accumulating positive alleles at multiple loci (Fig. 5). All the additive effects of the glucose and xylose release QTL mapped in the female and male maps were positive, except Glu9am which had an additive effect of −4.4 mg g−1. The additive effects of most of the lignin concentration QTL detected in the female map were negative, while they were positive for QTL detected in the male map.

Fig. 5 Allelic effect plot of selected QTL for glucose release (a–e), xylose release (f–h), and lignin content (i–l) in the AP13 and VS16 maps. AA and AB represent the allelic states of the markers close to the peak of the QTL

In addition to the main effect QTL, 12 epistatic QTL for glucose release, 9 for xylose, and 11 for lignin concentration were detected (Table 4). In general, the epistatic QTL for the sugar release traits accounted for more of the phenotypic variation than the main effect QTL detected for the traits. The epistatic QTL for glucose release had 8.9 to 18.6 % PVE with the additive by additive effect ranging from −17.6 to 13.5 mg g−1. The nine epistatic QTL detected for xylose release had PVE from 8.9 to 17.1 %. The additive by additive effect of the xylose epistatic QTL ranged from −12.6 to 8.3 mg g−1. Similarly, the epistatic QTL detected for lignin concentration had PVE ranging from 7.6 to 12.2 %. The additive by additive effect of the epistatic QTL detected for lignin ranged from −7.8 to 8.0 g kg−1. All epistatic QTL for lignin except those on LGs Ib-f/IIa-f, IIb-f/VIIIb-f, and IIIa-f/IXb-f in the female and IIIa-m/IVa-m in the male had positive additive effects (Table 4).

Table 4 Epistatic effect QTL detected for sugar release and lignin content in AP13 × VS16 F1 pseudo-testcross mapping population tested across five environments (Ardmore and Burneyville, OK in 2009 and 2010 and Watkinsville, GA in 2010)

The importance of epistatic effect QTL has been documented for various traits in rice [44–48], wheat [49–51], and soybean [52], demonstrating that both main effect and epistatic QTL underlie various quantitative traits in plants. The negative correlation of lignin content and sugar release observed in this population suggests that some segregants or natural variants with reasonably low lignin content and above average sugar release may be identified through selection.

Co-mapping of QTL for Yield and Quality

Several of the QTL detected for cell wall composition traits co-mapped with biomass yield and plant height QTL reported previously [19]. One xylose release QTL on IIIa-f mapped very close to a biomass yield QTL but had a contrasting allelic effect. Similarly, a glucose release QTL that co-mapped with plant height on LG IVa-m had contrasting effects on glucose release and plant height. Three lignin QTL, namely Lig3af, Lig9bf, and Lig3am, co-mapped or closely mapped with biomass yield QTL, and two lignin QTL (Lig1bf and Lig7af) co-mapped with plant height. As expected, alleles that increased biomass yield or height also increased lignin content. Exceptions were the alleles at QTL Lig9bf and Lig3am, which increased lignin content but reduced biomass yield. As the breeding targets for a new cultivar are increased biomass with reduced lignin content, selection of beneficial alleles at Lig9bf and Lig3am will be important. Where QTL for biomass and cell wall composition co-map and the direction of the effect is the same (higher biomass, higher lignin), it will be hard to uncouple them to improve both yield and recalcitrance simultaneously. In this case, the focus of breeding should be on regions where QTL do not co-map for those two target traits. More generally, lignin content and biomass yield are positively correlated, which makes it a considerable breeding challenge to combine both traits in high yielding genotypes with improved biomass conversion [53].

Candidate Genes for Cell Wall Composition

The linkage maps of switchgrass used for the QTL detection were constructed using SSR, STS, and DArT markers. A BLASTn analysis of the genomic and EST sequences of the markers flanking the QTL was conducted against the P. virgatum AP13 v1.1 assembly, and the regions from 50 kbp upstream of the left flanking marker to 50 kbp downstream of the right flanking marker were searched for possible candidate genes with functional relationships to the QTL. Most of the QTL regions harbored genes involved in cell wall modifications (Table 5). Among the candidate genes identified for the glucose main effect QTL detected on LGs IIIa-f, VIIb-f, IVa-m, Va-m, and IXa-m were hexokinases, which are enzymes that phosphorylate six-carbon sugars and form hexose phosphate [54]. An auxin-induced transcription factor that functions as a repressor of early auxin response genes at low-auxin concentrations [55] also colocalized with glucose QTL. The MYB family transcription factors, transferase family proteins, and other transcriptional regulators were also identified in the vicinity of glucose main effect QTL. QTL for xylose release on LGs IIIa-f, IIb-m, and Va-m were found within 50 kbp of putative candidate genes such as glutathione S-transferases, glycosyltransferases, CDP-alcohol phosphatidyltransferases, oligosaccharyltransferases, and phosphotransferases. Glutathione S-transferase is involved in the reduction of organic hydroperoxides formed during oxidative stress [56]. Glycosyltransferase establishes natural glycosidic linkages, including the biosynthesis of polysaccharides [57]; CDP-alcohol phosphatidyltransferase is involved in phospholipid biosynthesis; oligosaccharyltransferase is a membrane protein complex that transfers a 14-sugar oligosaccharide from dolichol to nascent protein; and phosphotransferase catalyzes phosphorylation reactions (moving sugars into the cell).

Table 5 Selected sugar release and lignin content QTL colocalized genes in the switchgrass genome sequence v1.1 assembly

Lignin QTL colocalized with genes encoding phosphoglycerate mutase, a key catalyst of glycolysis [58] that plays an important role in vegetative growth and stomatal movement [59] (Table 5). A glycosyl hydrolase, a MYB family transcription factor which plays a regulatory role in plant developmental processes and defense responses [60], 2-oxoglutarate dehydrogenase, a citrate transporter protein, a protein kinase, a fasciclin-like arabinogalactan protein, a terpene synthase, a 3-hydroxyacyl CoA dehydrogenase, a flavonol synthase, a 1,3-beta-glucan synthase, and a kelch motif family protein were also identified in lignin QTL regions. The phenylpropanoid pathway includes about ten genes involved in lignin biosynthesis [61, 62]. Several lignin pathway genes have been mapped in the vicinity of lignin content QTL in maize [63]. Until further validation is conducted, it cannot be concluded that the identified genes play a role in the control of the cell wall composition traits mapped in the AP13 × VS16 population.

Conclusions

This population derived from a cross between an upland and a lowland genotype exhibited substantial variation for cell wall composition traits of importance for bioenergy production. The QTL analysis revealed six main effect QTL for glucose release which collectively explained about 32 % of the phenotypic variation of the trait. Three main effect QTL detected for xylose release explained a total of 21 % of the phenotypic variation. Fourteen main effect QTL for lignin content accounted for a total of 84 % of the phenotypic variation. Most of the sugar release and lignin content QTL colocalized with genes functioning in carbohydrate processing and metabolism and other functions related to plant growth and development including cell wall biosynthesis.

Selection of favorable alleles at this limited number of QTL could lead to a significant improvement in the cell wall composition traits in switchgrass that are relevant to improved feedstock production. Both parents carried alleles for lower lignin and higher sugar release, suggesting that both ecotypes can contribute to desirable breeding targets. The cell wall composition traits had no defined relationship with the yield traits reported in the same population [19], which may facilitate simultaneous improvement for both yield and quality. The markers or candidate genes that are tightly associated with the QTL peaks can provide a high level of selection accuracy for the target traits in breeding programs. Future isolation of the actual genes underlying the QTL will require either fine-mapping or validation of strong candidates using molecular approaches.