Introduction

Biomass is a promising renewable resource among the diverse portfolio of energy sources necessary to meet US energy demands and contribute to national energy security [1]. The 2016 Billion-Ton Report (BT16) estimates that by 2040, over one billion tons of biomass will be available to realize a vision of a sustainable bioeconomy [2]. The biomass resources in the BT16 Report range from woody and herbaceous residues to energy crops and waste streams such as municipal solid waste. Characteristics of lignocellulosic biomass are highly variable, both within and among plant species [3,4,5]. Environmental conditions, natural phenomena, and human-controlled factors involved in cultivation and throughout the supply chain impact the quality of the biomass material delivered to a biorefinery. Biorefineries, however, are designed to convert, process, and operate using feedstocks with consistent characteristics [6]. Variability in chemical and physical characteristics of biomass poses substantial operational and economic risk for a biorefinery [3,4,5,6]. A grading system is one mechanism that many industries have employed to allow buyers with diverse needs to purchase products based on categorized quality characteristics and, simultaneously, to increase returns for producers and increase satisfaction of buyers [7]. A biomass grading system addressing lignocellulosic biomass variability could form the basis for valuation of these variable resources by defining grades with uniform quality standards and their associated costs, similar to grading systems in other industries.

Grading systems exist for many agricultural commodities as well as in the energy sector for crude oil and coal. These grading systems are typically based on a few, key technical parameters that influence performance in a given process or limit the use or application of the material or product. For example, the market price of crude oil is primarily based on American Petroleum Institute (API) gravity, a measure of density relative to that of water, and sulfur content with low sulfur referred to as sweet and high sulfur as sour [8]. Crude oils range from heavy, sour grades that require more processing to light, sweet grades that require less processing and sell for a higher price according to a price differential [9]. Total acid number (TAN) became important after high acid crude oils were observed in the 1920s and have since been found worldwide [10,11,12]. Coal is similarly ranked based on a few chemical characteristics—fixed carbon, volatile matter, and heating value—to assess its characteristics and commercial uses, with a few additional recommended analyses such as Hardgrove Grindability Index [13, 14]. Coal is classified using ASTM D388-17 into four groups–anthracitic, bituminous, subbituminous, and lignite [13, 14].

Lignocellulosic biomass resources are highly complex and heterogeneous when compared to crude oil and coal, but ideally, biomass would also be categorized and ranked based on the energy or fuel-related value of its inherent characteristics for the bioenergy sector. Since grading is a marketing tool, biomass price would then be set according to physical and chemical characteristics as opposed to just the current dry mass basis. A grading system would not only change the paradigms associated with biomass quality and feedstock performance but would also enable transparent inclusion of biomass and waste resources currently rated as low quality and low cost into the bioeconomy market. Heat and power end users have been developing quality standards for solid biofuels. In the case of the European Union, the standards (EN) focus on non-industrial uses while International (ISO) standards are focused on both non-industrial and industrial uses [15, 16]. These standards are evolving into grading systems by incorporating terminology; fuel specifications and classes; fuel quality assurance; sampling and sample reduction; and chemical, physical, and mechanical test methods [16]. Similar ISO standard development is underway in Canada [17] and the USA [18, 19]. The Biomass Energy Resource Center, one of the organizations involved in the US adoption of ISO standards for solid biofuels, has outlined four grades of woodchips for boiler fuels: paper-grade chip (high quality), bole chip (medium quality), whole-tree chip (low quality), and urban-derived wood fuel (lowest quality). These grades are based on particle size, ash content, moisture content, energy content, and contaminants like rocks and debris [20]. Very recently, the approval of a new US standard has been announced by ASABE [21]. 
In addition, the Pellet Fuels Institute (PFI), a North American trade association, runs an auditing program that assigns and certifies with a Quality Mark pellets meeting the PFI grade requirements [22].

In addition to standards development related to lignocellulosic biomass use for heat and power, there are associated grading systems for lignocellulosic biomass grown for animal feed. For example, alfalfa grading systems have been continually under development since 1933, with the original federal standards based on simple visual inspections [23]. The most recent standards included chemical characteristics and relationships of these characteristics to animal performance [24, 25]. The USDA Agricultural Marketing Service has a set of quality categories used in the nationwide Market News reporting program that are based on acid detergent fiber, neutral detergent fiber, total digestible nutrients, crude protein, and relative forage quality (RFQ), which is an index system that takes into account many of these chemical characteristics for determining the nutritional quality for animal feed [26,27,28]. In this sense, the animal acts as a biological conversion process and parallels could be drawn to conversion processes being pursued by second generation biorefineries. After over 80 years of development, there is still not a hay grading system that has been adopted nationwide as each region has adapted similar grading methods to meet their local feedstock supply and end user needs [23].

In summary, grading systems are material/product dependent. The main driver is not only quality and its relationship to market price; the end user type and application also set the performance needs and requirements that the grading system must consider. No grading system exists today for the biofuel manufacturing industry, and there are no reports of any attempt to define quality grades or to develop or adopt a grading system. Nevertheless, addressing this challenge early in the development of the bioenergy industry will allow for a scientific approach to the problem and, more importantly, a consistent nationwide evaluation system. An economically based assessment made by Thompson and Tyner [29] established a set of penalties that a biorefinery may impose on the biomass provider due to the effect moisture and ash may have on biorefinery operations and biomass performance. The cost associated with these penalties allowed differentiation into four potential grades, which determined the impact of quality on models of farm production decisions and farm profit. This study also showed that there are certain biomass properties that can create value for both the farmer and the biorefinery. A grading system should set the basis for creating this value in the most transparent form possible. Developing a grading system is a substantial effort that will require extensive data collection on how characteristics of diverse biomass resources affect each specific end process. In addition, a comprehensive grading system would consider factors such as stability (preservation in storage), grindability (particle size distribution resulting from comminution), flowability (handling characteristics), and convertibility (the amount of reactant that transforms into product(s) by virtue of chemical reactions) in a systematic framework that allows comparison and valuation of diverse biomass resources. 
For example, current practices already measure moisture, particle size distribution, and ash of corn stover as it is harvested or prior to entry into a refinery. These values are of particular importance for storage, milling, and conveying, as industry input has suggested these factors be included in a comprehensive grading system [30].

The basis of a grading system is depicted in Fig. 1. This drawing illustrates how different types of biomass may be harvested and graded into a few categories based on the inherent quality of the material. Grading informs processing that may be required to transform biomass to a feedstock that meets the specifications of a given conversion process for production of a given end product, such as fuels, chemicals, or products. Grading systems can take many forms but are generally based upon a few key parameters that distinguish whether a given resource is well suited to produce a desired product. In this work, we will exemplify the approach followed for the development of a grading system for lignocellulosic biomass—particularly herbaceous grasses—used in the production of monomeric carbohydrates for subsequent conversion to bio-ethanol. The driver criterion is the conversion performance and carbohydrate yields during dilute-acid pretreatment and enzymatic hydrolysis. These are preliminary results that set the foundations for the approach. The inclusion of other potential biomass resources, like short rotation woody crops, or characteristics that might be considered in the definition of biomass grades such as foreign objects (e.g., rocks, glass), performance in preprocessing, intrinsic ash, soil contamination, and moisture will be a natural evolution towards the development of the system.

Fig. 1 Depiction of a biomass grading system beginning with example images of biomass that may enter a future grading system (from top to bottom: corn stover being harvested, corn stover bales, municipal solid waste). Next, the biomass would be graded (or binned) based on key biomass characteristics with 1 being the highest quality biomass and 5 being the lowest quality. Graded biomass would then be sent for preprocessing to produce feedstock that meets necessary specifications for a given conversion process to produce a final product.

This case study focuses on one basic scenario of this complex undertaking by demonstrating a conversion-based biomass grading system for one biochemical conversion pathway, whereby the framework will be established for future expansion of the grading concept and criteria [31]. One of the primary challenges for the definition of the grades is how to select the right characteristics from all of the available data to support a grading system and develop an easy to use methodology for biomass valuation. In this work, a three-step approach was used to define the grades employing a large variety of herbaceous biomass resources that could be used in the referenced biochemical conversion process. The first step was to identify the biomass characteristics that could be directly included in the grading system, by determining characteristics with the greatest impact on produced carbohydrate yields using a linear regression analysis. This regression analysis was used to determine the minimum number of characteristics required to explain the value ranges for carbohydrate release following dilute-acid pretreatment and enzymatic hydrolysis. Second, a diverse set of biomass resources was used to determine the ranges of biomass variability exhibited by commercially available materials for the identified characteristics. Finally, the ranges of variability were binned using hierarchical cluster analysis, based on conversion performance, and the upper and lower bounds of the bins were used as the grade boundaries. This work can be considered the first attempt at discussing a systemic and systematic development of a biomass grading system for the biofuel industry.

Materials and Methods

Biomass

Ten herbaceous biomass samples representing five different biomass materials, namely corn stover (CS), sorghum (SO), switchgrass (SG), Miscanthus (MG), and grass clippings (GC), were selected to cover a diverse range of this biomass type. To put this limited sample set in context, the characteristics of these samples were compared to 65 herbaceous samples from the Bioenergy Feedstock Library to determine whether their range of compositional variability is representative of a larger sample set. Table 1 summarizes information on the ten samples included in the study. Samples were milled to pass through a 2-mm sieve with round holes using a Thomas Model 4 Wiley Mill (Thomas Scientific, Swedesboro, NJ, USA) in preparation for chemical analyses and conversion performance testing. Supporting datasets are available in the Bioenergy Feedstock Library (bioenergylibrary.inl.gov) and can be accessed using the globally unique identifiers in Online Resource Table S1.

Table 1 Information for samples in the study

Chemical Composition

The chemical composition of each sample was determined using NREL laboratory analytical procedures (LAP) [32]. Each sample was analyzed in duplicate. The extractions detailed in the LAP were done using an Accelerated Solvent Extractor (Dionex™ ASE 350, Thermo Fisher Scientific, Waltham, MA, USA). Carbohydrates removed during the water extraction were measured using high-performance liquid chromatography (HPLC) following an acid hydrolysis of the liquors. To ensure adequate and consistent extraction of materials with potentially high extractives content, samples GC 2 and SO 1 were extracted two times for both water and ethanol and GC 1 was extracted two times for water and one time for ethanol. The water-extracted phase was acid hydrolyzed prior to HPLC for carbohydrate content determination. Water extractives were adjusted to 4% acid using 72% sulfuric acid. The samples were then autoclaved at 121 °C for 1 h, neutralized using calcium carbonate, and filtered through a 0.2-μm filter. Water-extracted carbohydrates in monomeric form were measured on duplicate samples by HPLC equipped with an Aminex HPX-87P column (BioRad Laboratories, Hercules, CA, USA) and a refractive index detector, using 18 MΩ ultrapure water flowing at 0.6 mL min−1 as mobile phase, at a column temperature of 85 °C. Monomeric carbohydrates and organic acids were quantified by HPLC in the hydrolysis liquor present after the two-stage acid hydrolysis of the water/ethanol extracted solids. The same HPLC methods previously described were used for carbohydrates. Organic acids were analyzed by HPLC using a diode array detector and an Aminex HPX-87H column (BioRad Laboratories, Hercules, CA, USA) at a column temperature of 55 °C and a mobile phase of 0.01 N sulfuric acid aqueous solution at a flow rate of 0.6 mL min−1. 
The concentration of the acid-soluble lignin in the hydrolysis liquor was evaluated by UV-Vis spectroscopy, at 320 nm using a Varian Cary 50 (Agilent, Santa Clara, CA, USA) and calculated using an extinction coefficient of 30. The acid-insoluble lignin was measured gravimetrically after the hydrolysis liquor was filtered and acid-insoluble ash subtracted. A LECO TruSpec CHN (St. Joseph, MI, USA) was used to measure nitrogen content that was then converted to protein content using a nitrogen-to-protein conversion factor of 4.6 [33]. Ash after heating in air to 575 °C was measured gravimetrically and reported on a 105 °C dry basis.

Conversion Performance

Conversion performance was evaluated using a laboratory-scale dilute-acid pretreatment and enzymatic hydrolysis assay according to Wolfrum et al. [34] and Selig et al. [35]. Quadruplicates of each raw, untreated sample were analyzed using this assay. Pretreatment experiments were performed using a Dionex™ ASE 350, by filling the 66-mL zirconium cells with 3.0 ± 0.03 g biomass and 30 mL of 1% sulfuric acid (w/w). The solid loading for each cell was 10% (w/w), and the acid-to-biomass loading was 0.08 g g−1. The pretreatment temperature was 130 °C, and cells were heated for 7 min with a subsequent 7-min static time. Then, the cells were purged with N2 for 200 s to displace the liquid in the cell. After the nitrogen purge, the temperature was reduced to 100 °C. The cells were rinsed with 100 to 150 mL of nanopure water and purged with N2 gas for 200 s to remove all rinsate, and the rinsate was then collected for analysis of total and monomeric carbohydrates and organic acids using HPLC as described in the previous chemical composition section. The ASE extraction was completed in quadruplicate for each sample. The solid pretreated biomass remaining in the ASE cells from the four runs was used as quadruplicates for the subsequent enzymatic hydrolysis assay. In some cases, the pretreatment was not completed, such as when the ASE instrument failed and fewer than four cells were extracted. This was the case for CS 1, SO 1, and SO 2, for which only three replicates had pretreatment data and therefore enzymatic hydrolysis data. Six replicates were analyzed for MG 1.

Enzymatic hydrolysis was conducted using a modified version of the procedure described in Selig et al. [35]. Pretreated solids were enzymatically hydrolyzed by adding to a 50-mL incubation flask 1.0 g of biomass on a dry basis, which was calculated based on a separate aliquot that was dried at 105 °C to determine the moisture content using a LECO Thermogravimetric Analyzer (TGA) 701 (St. Joseph, MI, USA). Then, citric acid buffer (pH 4.8) and sodium azide were added to final concentrations of 0.05 M and 0.2%, respectively, and a total reaction volume of 10 mL. Glucan content in the pretreated solids was determined by subtracting out the glucan measured in the pretreatment liquors and enzyme cocktails (cellulase and xylanase) were added at an enzyme/glucan wt/wt ratio basis of 40 mg g−1 of Cellic® CTec2 (Novozymes, Franklin, NC, USA) and 4 mg g−1 of Cellic® HTec2. The density of all biomass solutions was assumed to be 1 g mL−1. Controls were included through use of enzyme and substrate blanks. Samples were then incubated at 50 °C, and 100 μL aliquots of liquor were removed at 6 h, 12 h, 24 h, 48 h, and 120 h. Monomeric carbohydrates were measured in these aliquots by HPLC as described in the previous section. The resulting carbohydrate releases from pretreatment, enzymatic hydrolysis, and pretreatment and enzymatic hydrolysis combined were reported on a g monomeric carbohydrate per g dry raw biomass for both glucose and xylose yields. Four replicates were run for all samples, except those noted previously and GC 1 that had only one replicate for the 6 h aliquot, because not enough liquid was generated after 6 h in the other enzymatic hydrolysis assay replicates. Carbohydrate data for the 12 h enzymatic hydrolysis aliquot is not available for SG 1, SO 1, or SO 2.

Conversion Models

A conversion model was developed using JMP, Version 10.0.0 (SAS Institute, Inc., Cary, NC). Multivariate linear regression models were developed with the goal of identifying combinations of compositional characteristics that were adequate (defined in this study as R2 > 0.90) for explaining the resulting carbohydrate (glucose, xylose) release in pretreatment and enzymatic hydrolysis at 120 h from the considered biomass samples. Compositional characteristics used in the multivariate linear regressions were averages of two analytical replicates. The carbohydrate release response data consisted of four replicates for each sample except where noted previously in “Conversion Performance”. Variables contributing significantly (p < 0.05) to explaining the observed response factor were retained in the models. Chemical characteristics with high levels of multicollinearity, based on calculated variance inflation factors (VIF) above 10, were avoided; however, some intercorrelation was inevitable due to the small number of samples and complexity of the chemical characteristics and analytical methods used. The selected models and reported model merits—coefficient of determination (R2) and root mean square errors of calibration and cross validation (RMSEC/CV)—were assessed with a leave-one-out cross validation to provide a representation of the model robustness.
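The calibration and leave-one-out cross-validation errors described above can be illustrated with a minimal, standard-library Python sketch. It is not the JMP workflow used in the study. The predictor and response values are hypothetical placeholders, the helper names (`fit_ols`, `rmsec_and_rmsecv`) are introduced here for illustration only, and the RMSE is assumed to divide by the number of observations, which may differ from the convention used by the authors.

```python
from math import sqrt

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_ols(X, y):
    """Ordinary least squares via the normal equations; returns [intercept, b1, ...]."""
    A = [[1.0] + list(row) for row in X]
    p = len(A[0])
    ata = [[sum(r[i] * r[j] for r in A) for j in range(p)] for i in range(p)]
    aty = [sum(r[i] * yi for r, yi in zip(A, y)) for i in range(p)]
    return solve(ata, aty)

def predict(coef, row):
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))

def rmse(errors):
    # Assumption: plain root mean square of the residuals (divides by n).
    return sqrt(sum(e * e for e in errors) / len(errors))

def rmsec_and_rmsecv(X, y):
    """Return (RMSEC, RMSECV) where RMSECV comes from leave-one-out refits."""
    coef = fit_ols(X, y)
    cal = [predict(coef, r) - yi for r, yi in zip(X, y)]
    cv = []
    for i in range(len(y)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        c = fit_ols(X_train, y_train)
        cv.append(predict(c, X[i]) - y[i])  # error on the held-out observation
    return rmse(cal), rmse(cv)

# Hypothetical data: two compositional predictors (% dry weight) and a response (g/g).
X = [[30, 12], [35, 15], [40, 18], [25, 10], [38, 14], [33, 17]]
y = [0.40, 0.36, 0.33, 0.41, 0.42, 0.30]
rmsec, rmsecv = rmsec_and_rmsecv(X, y)
print(f"RMSEC={rmsec:.3f}  RMSECV={rmsecv:.3f}")
```

For an ordinary least squares fit, the cross-validation error cannot be smaller than the calibration error, so a large gap between the two is a simple warning sign of a model that leans heavily on individual samples.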

Grading Methodology

The Bioenergy Feedstock Library [36] was used as a resource for compositional analysis data on commercially relevant biomass resources for pursuing the definition of preliminary grades. Commercially relevant was defined as material currently available for commercial purposes and harvested and/or collected using commercially available equipment or practices, including smaller-scale equipment for harvest and baling of field-scale research plots. Sixty-five samples, spanning eight biomass types with the relevant compositional analysis data, were identified and grouped into three categories: crop residues comprised of corn stover, wheat straw, and barley straw; energy crops comprised of mixed perennial grasses, sorghum, Miscanthus, and switchgrass; and grass (lawn) clippings. These samples form the grade definition set (GDS). Carbohydrate release for each sample was estimated using the linear regressions described in the “Conversion Models” section. The binning (grade definition) was carried out through a hierarchical cluster analysis using Ward’s minimum variance method [37], which determined mathematically similar groupings of the 65 samples based on the carbohydrate release predicted from the key characteristics identified in the linear regressions. The upper and lower boundaries of these groupings were used to set the grade boundaries.
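The binning step can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the authors' implementation: the predicted carbohydrate release values are hypothetical, the number of grades `k` is chosen arbitrarily, and the function names are introduced here. The merge cost is the usual Ward criterion, i.e., the increase in within-cluster sum of squares caused by joining two clusters.

```python
def ward_clusters(values, k):
    """Agglomerative clustering of 1-D values with Ward's minimum variance criterion.

    Starts with one cluster per value and repeatedly merges the pair whose
    merge gives the smallest increase in within-cluster sum of squares,
    until k clusters remain. Clusters are returned from highest to lowest mean.
    """
    clusters = [[v] for v in values]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                a, b = clusters[i], clusters[j]
                mean_a = sum(a) / len(a)
                mean_b = sum(b) / len(b)
                # Ward merge cost: n_a * n_b / (n_a + n_b) * (mean_a - mean_b)^2
                cost = len(a) * len(b) / (len(a) + len(b)) * (mean_a - mean_b) ** 2
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return sorted(clusters, key=lambda c: sum(c) / len(c), reverse=True)

def grade_bounds(clusters):
    """Lower and upper bound of each cluster; the first entry is grade 1."""
    return [(min(c), max(c)) for c in clusters]

# Hypothetical predicted carbohydrate release values (g/g dry biomass).
predicted = [0.48, 0.47, 0.45, 0.38, 0.37, 0.36, 0.25, 0.24, 0.22]
bounds = grade_bounds(ward_clusters(predicted, k=3))
for grade, (low, high) in enumerate(bounds, start=1):
    print(f"Grade {grade}: {low:.2f} to {high:.2f} g/g")
```

With real data, the number of clusters would normally be chosen by inspecting the dendrogram or the merge costs rather than fixed in advance.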

Results and Discussion

Variability in Biomass Samples

Grading systems exist to assign value to materials with quality differences that impact their use, performance, or application. The quality of lignocellulosic biomass is diverse and inherently variable. In order to define quality grades of biomass, this study analyzed commercially relevant lignocellulosic biomass with distinct and variable chemical composition. Five different biomass materials—corn stover, grass clippings, Miscanthus, switchgrass, and sorghum—were selected for this study (Table 1). Biomass materials in Table 1 represent multiple species and cultivars collected from eight different US states in 4 years using a variety of harvest practices, because environmental, genetic, and production factors can create substantial amounts of variability in chemical and physical biomass properties [3]. Many biorefineries opted to ensure supply and minimize logistics costs by using a single biomass resource (and provider). Considering that may not be an option in the future or some other technological solution may facilitate logistics and supply, this work incorporated multiple biomass resources. In addition, in many other industries, including agriculture, grading has become a marketing tool that has enabled and supported market growth, by creating a transparent mechanism that links supply with demand. Biomass is no exception, and biomass grading is required to mobilize the diverse resources necessary for a vibrant bioeconomy and allows more refineries to become operational for the processing of diverse biomass materials. Overall, grading would mitigate the risk associated with reliance on a single biomass resource by supporting a low-risk supply of multiple biomass materials [38], with a well-defined, uniform range of characteristics. This approach would incorporate more of the available billion tons of biomass into the market in support of a sustainable, future bioeconomy [2].

Table 2 shows the wide variability of 17 compositional components across the ten biomass samples. Total ash (extractable and non-extractable ash) for the samples included in this study ranged from < 1% (Miscanthus) to 16% (grass clippings). Water extractives exceeded 30% in one of the grass clipping samples, while Miscanthus had water extractives between 3 and 7% (Table 2). Similarly, ethanol extractives were higher in grass clippings (6–7%) than in the rest of the samples, which had ethanol extractives from 2 to 3%. Regarding structural carbohydrate content, values varied from 16 to 42% glucan and 7 to 24% xylan (Table 2). Total lignin, a known recalcitrant component, ranged from 12 to 20%, with Miscanthus and switchgrass having the highest lignin content and grass clippings having the lowest lignin content.

Table 2 Chemical composition on a percent dry weight basis for ten samples (mean of analytical duplicates). Sample descriptions are in Table 1. CS = corn stover, GC = grass clippings, MG = Miscanthus, SG = switchgrass, SO = sorghum

For the definition of the grades, the key characteristics must be related to a representative performance metric of interest for the biofuels industry end user. The preliminary performance metric in this case was the total primary monomeric carbohydrates—glucose plus xylose—released in a bench-scale, dilute-acid pretreatment and enzymatic hydrolysis assay on a gram per gram biomass dry basis. Glucose release from pretreatment was less than 0.05 g g−1 for all biomass resources (Fig. 2a). Xylose release in pretreatment was the lowest for grass clippings followed by sorghum (Fig. 2b). In general, corn stover, Miscanthus, and switchgrass had the highest xylose release during pretreatment. Glucose release from pretreatment and enzymatic hydrolysis measured after 120 h ranged from 0.16 g g−1 in Miscanthus to 0.29 g g−1 in corn stover (Fig. 2c). Xylose release from pretreatment and enzymatic hydrolysis ranged from 0.06 g g−1 in grass clippings to 0.21 g g−1 in corn stover (Fig. 2d). Glucose and xylose releases from enzymatic hydrolysis were also monitored over the course of the 120 h duration of the reaction (Fig. 3). At 24 h and 48 h, respectively, the glucose release from glucan for all samples was on average 88% and 94% of that achieved overall after the 120 h reaction (Fig. 3a). Xylose release from xylan was 80% of the maximum at 24 h and 88% of the maximum at 48 h on average for all samples (Fig. 3b).

Fig. 2 Mean (SD) glucose and xylose release (g g−1 dry biomass) from dilute-acid pretreatment (a, b; PT) at 130 °C and enzymatic hydrolysis (c, d; EH) at 120 h (n = 4, except CS 1, SO 1, and SO 2 that had n = 3 and MG 1 that had n = 6). Sample information for each identifier is in Table 1. CS = corn stover, GC = grass clippings, MG = Miscanthus, SG = switchgrass, SO = sorghum

Fig. 3 Enzymatic hydrolysis (EH) glucose and xylose release (g g−1 dry biomass) for 6, 12, 24, 48, and 120 h (mean (SD), n = 4 for most points). The two samples for each crop type correspond to those listed in Table 1. CS = corn stover, GC = grass clippings, MG = Miscanthus, SG = switchgrass, SO = sorghum

Key Characteristic Identification and Conversion Model Development

The robustness of a grading system relies on the relevance of the defined grades for the end user and on the transparency of their definition and, later, of their assignment. The definition of grades requires the identification of a set of biomass characteristics critical for determining how the material will perform throughout the processes of the biofuels production value chain. The development of a preliminary grading system for lignocellulosic biomass for biofuels was approached here by defining grades in a three-step methodology that started with the identification of the key characteristics affecting conversion performance. Linear regression methods were used both to identify the key chemical characteristics impacting conversion performance and to develop models relating these characteristics to the selected conversion performance metric. Of the compositional analysis components assessed, it was determined that structural glucan, hemicellulose (xylan, arabinan, galactan), total ash, acid-insoluble lignin, and acid-soluble lignin could be used in multivariate linear regression models to understand carbohydrate release. All identified variables significantly impacted carbohydrate release (p < 0.05) and did not show high levels of multicollinearity (defined as VIF > 10). Acid-insoluble lignin and glucan were correlated, 0.903, with a VIF score between these two components of about 10, but it was not high enough to warrant removal from the model as some intercorrelation was anticipated due to the nature of the data. After multiple trials of different combinations of variables, the best linear regression was achieved by using a selection of the chemical components and resulted in an R2 of 0.94 with a RMSEC and RMSECV of 0.01 and 0.02, respectively (Fig. 4a). 
Equation 1 shows the regression equation that fitted carbohydrate release (Carb Release is the total glucose and xylose released in pretreatment and enzymatic hydrolysis on a gram per gram dry biomass basis) from biomass resources, using four identified key characteristics, namely, structural glucan (Gluc), acid-soluble lignin (ASL), acid-insoluble lignin (AIL), and total ash (Ash).

Fig. 4 Actual carbohydrate release (glucose + xylose, g g−1 dry biomass) from a dilute-acid pretreatment and enzymatic hydrolysis compared to predicted carbohydrate release from Eq. 1 (a) and Eq. 2 (b) for two samples each of five different biomass resources (n = 4, except CS 1, SO 1, and SO 2 that had n = 3 and MG 1 that had n = 6)

$$ \widehat{\mathrm{Carb\ Release}}\ \left(\mathrm{g}\ {\mathrm{g}}^{-1}\right) = 0.463 + 0.003\,\mathrm{Gluc} + 0.262\,\mathrm{ASL} - 0.026\,\mathrm{AIL} - 0.017\,\mathrm{Ash} \tag{1} $$

Another combination of variables, including Gluc, hemicellulose (Hemi), AIL, and ASL (shown in Eq. 2), resulted in a similar, but slightly lower, R2 value of 0.92 with slightly higher RMSEC and RMSECV of 0.02 and 0.03, respectively (Fig. 4b).

$$ \widehat{\mathrm{Carb\ Release}}\ \left(\mathrm{g}\ {\mathrm{g}}^{-1}\right) = -0.028 + 0.004\,\mathrm{Gluc} + 0.008\,\mathrm{Hemi} + 0.225\,\mathrm{ASL} - 0.013\,\mathrm{AIL} \tag{2} $$

In Eq. 2, Hemi is hemicellulose carbohydrates and includes xylan, galactan and arabinan and all the other variables have the same meaning as in Eq. 1. It should be noted that xylan alone is correlated to the overall hemicellulose carbohydrates—an R2 of 0.98 for the data used here—and could be used in place of Hemi. The factor Hemi was selected over xylan based on the slightly better model. In addition, in the context of creation of a grading system, hemicellulose might be a more flexible factor to accommodate other common compositional analysis methods including Van Soest et al. [39] methods that use acid detergent fiber and neutral detergent fiber for determination of extractives, hemicellulose, cellulose, and acid-insoluble lignin.
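As a worked illustration, Eqs. 1 and 2 can be written as two small Python functions. The coefficients are taken directly from the equations; the input units are assumed to be percent dry weight, as in Table 2, and the example composition below is a hypothetical sample, not one of the study samples.

```python
def carb_release_eq1(gluc, asl, ail, ash):
    """Predicted glucose + xylose release (g/g dry biomass) from Eq. 1.

    Inputs are assumed to be in % dry weight: structural glucan,
    acid-soluble lignin, acid-insoluble lignin, and total ash.
    """
    return 0.463 + 0.003 * gluc + 0.262 * asl - 0.026 * ail - 0.017 * ash

def carb_release_eq2(gluc, hemi, asl, ail):
    """Predicted glucose + xylose release (g/g dry biomass) from Eq. 2.

    Inputs are assumed to be in % dry weight: structural glucan,
    hemicellulose (xylan + galactan + arabinan), acid-soluble lignin,
    and acid-insoluble lignin.
    """
    return -0.028 + 0.004 * gluc + 0.008 * hemi + 0.225 * asl - 0.013 * ail

# Hypothetical composition (% dry weight), for illustration only.
sample = {"gluc": 35.0, "hemi": 25.0, "asl": 1.0, "ail": 15.0, "ash": 5.0}

eq1 = carb_release_eq1(sample["gluc"], sample["asl"], sample["ail"], sample["ash"])
eq2 = carb_release_eq2(sample["gluc"], sample["hemi"], sample["asl"], sample["ail"])
print(f"Eq. 1: {eq1:.3f} g/g   Eq. 2: {eq2:.3f} g/g")
```

For this hypothetical composition, the two equations give 0.355 and 0.342 g g−1, respectively. Both models were fitted on a small set of herbaceous samples, so predictions for compositions outside the ranges in Table 2 should be treated with caution.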

The regression model plots (Fig. 4) demonstrate the feasibility of using linear regression as a tool for the identification of key characteristics that affect performance for a wide variety of biomass resources. In addition, the robustness of the model can be deduced from a leave-one-out cross validation also displayed in Fig. 4, in terms of the average R2, RMSEC, and RMSECV. The R2 values ranged from 0.908 to 0.962 for Eq. 1 and 0.872 to 0.948 for Eq. 2. The RMSECV ranged from 0.012 to 0.035 for Eq. 1 and 0.015 to 0.062 for Eq. 2. For both equations, the extreme samples, CS 1 and GC 2, with the highest and lowest carbohydrate release resulted in the most model deviation with R2 values of 0.908 and 0.916 for Eq. 1 and 0.898 and 0.872 for Eq. 2, respectively. However, these samples did not have the best or worst prediction errors for the cross validation results, indicating the robustness of the model. In general, the performance of each resource type was predicted with similar accuracy using the proposed linear regressions. While total ash was also considered as a variable in the second equation, the high amount of intercorrelation (VIF ~ 53) seen for total ash was due to the combination of correlations between total ash and acid-insoluble lignin (− 0.94; Online Resource Table S2), glucan (− 0.95; Online Resource Table S2), and hemicellulose (− 0.91; Online Resource Table S2), making it unreasonable to include total ash in the same model in combination with these other components. In addition, the correlation between hemicellulose and total ash, which is a combination of physiological ash and soil contamination, is likely a result of the specific dataset used rather than plant physiological relationships like inorganics cross-linked to the biomass structure. It may also be a result of different levels of plant maturity. Past research has demonstrated decreases in ash and increases in hemicellulose as plants mature [40, 41]. Total ash is also inversely correlated to the acid-insoluble lignin and carbohydrate components, relationships that have also been reported in studies of plant maturity. It should be noted that other regression techniques, such as principal component regression (PCR) and partial least squares (PLS) regression, might be more suitable for dealing with intercorrelation between variables; however, for the present objective of identification of the key properties to be considered as part of a grading system, multivariate linear regression (MLR) techniques are simpler and do not require special software to implement.
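The leave-one-out cross validation and the variance inflation factor (VIF) discussed above can be illustrated with a short sketch. This is not the authors' implementation: the helper names (`solve_linear`, `fit_ols`, `predict`, `rmse`, `loocv`, `vif`) are hypothetical, the data are synthetic, and RMSEC and RMSECV are assumed here to be root-mean-square errors divided by the number of samples, which may differ from the convention used for Fig. 4.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(X, y):
    """Return [intercept, b1, ...] from the normal equations (X'X) b = X'y."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve_linear(xtx, xty)


def predict(coef, row):
    """Evaluate the fitted linear model for one sample."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))


def rmse(errors):
    """Root-mean-square error (assumption: divide by the sample count)."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def loocv(X, y):
    """Return (RMSEC, RMSECV) for a multiple linear regression."""
    full = fit_ols(X, y)
    rmsec = rmse([predict(full, r) - yi for r, yi in zip(X, y)])
    cv_errors = []
    for i in range(len(y)):
        # Refit without sample i, then predict the held-out sample.
        coef = fit_ols(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        cv_errors.append(predict(coef, X[i]) - y[i])
    return rmsec, rmse(cv_errors)


def vif(X, j):
    """VIF of predictor j: 1 / (1 - R^2) from regressing it on the others."""
    target = [row[j] for row in X]
    others = [row[:j] + row[j + 1:] for row in X]
    coef = fit_ols(others, target)
    mean_t = sum(target) / len(target)
    sse = sum((predict(coef, r) - t) ** 2 for r, t in zip(others, target))
    sst = sum((t - mean_t) ** 2 for t in target)
    return sst / sse  # equals 1 / (1 - R^2); undefined for perfect collinearity


# Synthetic, illustrative data only (two predictors, eight samples).
X = [[0.30, 0.15], [0.32, 0.12], [0.35, 0.18], [0.38, 0.11],
     [0.40, 0.16], [0.33, 0.14], [0.36, 0.19], [0.41, 0.13]]
noise = [0.01, -0.01, 0.005, -0.005, 0.01, -0.01, 0.0, 0.005]
y = [0.1 + 0.8 * a - 0.5 * b + e for (a, b), e in zip(X, noise)]

rmsec, rmsecv = loocv(X, y)
print(f"RMSEC={rmsec:.4f}  RMSECV={rmsecv:.4f}")
print("VIF:", [round(vif(X, j), 2) for j in range(2)])
```

In this sketch, an RMSECV close to the RMSEC suggests that no single sample dominates the fit, and a large VIF for a predictor (such as the value of about 53 reported for total ash) signals that it is largely explained by the other predictors.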

The key characteristics identified through this process have logical support and constitute typical variables for characterization studies of lignocellulosic biomass destined for a biochemical conversion process. The selected performance metric, Carb Release, includes the two major structural carbohydrates originally present in the resource, glucan and xylan. The fact that the Carb Release values correlate with the structural glucan content might indicate a kinetic control effect, which seems to be confirmed by the level of correlation found with the hemicellulose content in Eq. 2 as well. During dilute-acid pretreatment, hemicellulose is typically deconstructed, thereby increasing enzymatic access to cellulose during subsequent enzymatic hydrolysis. Hemicellulose will be further enzymatically hydrolyzed to release xylose by xylanase present in enzyme cocktails, explaining why hemicellulose content would kinetically drive Carb Release and become one of the key characteristics identified.

A high polysaccharide content in a biomass resource does not directly translate into high monomeric sugar release during biochemical conversion processes, since inhibitors formed during thermochemical pretreatment reduce enzyme activity. In addition, lignin components can also inhibit access to cellulose and hemicellulose. Lignin was identified as a key recalcitrant characteristic necessary for prediction of carbohydrate release. However, there are independent effects of acid-insoluble lignin, also known as Klason lignin, and the acid-soluble lignin as a result of the type of chemical reactivity exhibited by each. As can be derived from Eqs. 1 and 2, the impact of each of these two lignin components was opposite. Specifically, acid-insoluble lignin has a negative effect, whereas acid-soluble lignin seems to favor the conversion performance metric; the two lignin components are only slightly negatively correlated with each other (− 0.114; Online Resource Table S2). The relative importance of each of these parameters can be determined with the correlation coefficients for each of the model equations. In both equations, the acid-soluble lignin actually has the most impact on the carbohydrate release in a positive manner. This is most likely a representation of the relative reactivity/recalcitrance of the lignin components (ASL/AIL). Therefore, using total lignin confounds the results, and the two components need to be separated to accurately characterize the biomass quality. The negative effects of lignin on biomass pretreatment for carbohydrate conversion have been well documented [4]. However, the observation of a positive effect (higher reactivity), provoked by increased solubilization of acid-soluble lignin components during pretreatment as predicted by our correlations, has not been reported. In both equations, the amount of carbohydrates (glucan and/or hemicellulose) is the lowest contributing component. In Eq. 1, ash along with the lignin components contributes more to the prediction of carbohydrate release than glucan. This model shows that it is not simply a question of having lower lignin or lower ash, but of having a greater proportion of acid-soluble lignin relative to acid-insoluble lignin.

Regarding the analytical methods, lignin components are measured after using a concentrated sulfuric acid hydrolysis method. A similar chemistry, but to a lesser extent, may take place during dilute-sulfuric acid pretreatment and explain why these parameters might be important. However, the accurate measurement of lignin remains a challenge. Although the methods described here are state of the art in the industry, there are still sample-dependent considerations regarding sample preparation and procedural variation that must be consistently taken into account to ensure accurate measurements of acid-insoluble and acid-soluble lignin [42]. A future grading system needs to include standard methods of analysis and inspection, which would require both accurate wet chemical measurements of lignin and rapid lignin measurements that can distinguish lignin types by acid solubility at the point of sale. Development of rapid, standard analytical methods represents one of the many difficulties the current industry would face in implementing a grading system in the future.

Total ash, the final component included in Eq. 1, is likely identified because ash is not convertible and does not contribute to carbohydrate release. Ash can also neutralize acid during pretreatment, which decreases efficacy; however, only certain elements in ash have a neutralizing effect [43]. Ash speciation was measured for all of these samples (data not shown), but its incorporation in the correlation did not lead to further improvements. Therefore, other effects may be at play, and further work is needed to explain the role of ash in carbohydrate release and its contribution to model development.

Other properties and characteristics beyond composition, like cellulase adsorption, water retention value, and particle size, were considered and assessed (data not shown) in this study. However, these variables did not affect the conversion performance metric selected, as shown in Fig. 4. The reasons for this varied. Cellulase adsorption and water retention values were tested on untreated biomass resources, as requisite for grading, but impact on performance would be best associated with these measurements and procedures conducted on pretreated solids. Particle size was fairly similar because all biomass samples were milled using the same methods.

Grade Definition

The final step in the approach was to establish preliminary grades by identifying the range of variation for each of these characteristics using a sample set as large as possible and then binning these ranges to set grade boundaries. The Bioenergy Feedstock Library [36] was used to determine the range of variation for each identified key characteristic (acid-soluble lignin, acid-insoluble lignin, glucan, hemicellulose [xylan, galactan, arabinan], and total ash) for 65 commercially relevant samples (Fig. 5a–e; Online Resource Table S3). These 65 samples were not tested for conversion performance and were only used for the purpose of defining grade boundaries. These samples form the grade definition set (GDS). The glucan content for the calibration samples in the model spanned the glucan contents of the set of 65 samples in the GDS in Fig. 5. The glucan content for the samples used in the identification of key properties falls within 1.5% of the minimum and maximum glucan for the GDS samples. Hemicellulose in the calibration set had a similar minimum to the GDS samples (~ 12.5%), but the maximum was 6% greater for the GDS samples compared to the calibration sample set (Table 2; Fig. 5). This indicates that it would be beneficial to evaluate performance of more samples with higher hemicellulose for future iterations of the linear model. Acid-insoluble lignin ranged from 7 to 21% for the GDS, but only from 11 to 19% for the calibration sample set; however, the acid-soluble lignin ranges were similar for both sample sets (Table 2; Fig. 5). Total ash results were similar, with the maximum ash content being 16% for the model samples while the maximum for the GDS was 20% (Table 2; Fig. 5).
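The comparison of calibration and GDS ranges can be made explicit with a small calculation. The sketch below is illustrative only: the function names `range_coverage` and `is_extrapolated` are hypothetical, and the only values taken from the text are the acid-insoluble lignin ranges (11 to 19% for the calibration set, 7 to 21% for the GDS).

```python
def range_coverage(calibration, target):
    """Fraction of the target (low, high) range overlapped by the calibration range."""
    cal_lo, cal_hi = calibration
    tgt_lo, tgt_hi = target
    overlap = max(0.0, min(cal_hi, tgt_hi) - max(cal_lo, tgt_lo))
    return overlap / (tgt_hi - tgt_lo)


def is_extrapolated(value, calibration):
    """True if a sample value lies outside the calibrated range of the model."""
    cal_lo, cal_hi = calibration
    return value < cal_lo or value > cal_hi


# Acid-insoluble lignin ranges quoted in the text (% dry basis).
ail_calibration = (11.0, 19.0)
ail_gds = (7.0, 21.0)

coverage = range_coverage(ail_calibration, ail_gds)
print(f"AIL coverage of GDS range: {coverage:.0%}")
print("20% AIL requires extrapolation:", is_extrapolated(20.0, ail_calibration))
```

For acid-insoluble lignin, the calibration set spans 8 of the 14 percentage points covered by the GDS, so predictions for GDS samples near either extreme rely on extrapolation of the linear model.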

Fig. 5

Histograms of the variability of the five identified explanatory variables for carbohydrate release (glucose + xylose, g g−1 dry biomass), acid-insoluble lignin (a), glucan (b), acid-soluble lignin (c), hemicellulose carbohydrates (d), and total ash (e; n = 65). Gray bars = all biomass types, black lines = crop residues, red lines = energy crops, blue lines = grass clippings. A histogram of predicted carbohydrate release (g g−1 dry biomass) calculated using the identified explanatory variables acid-insoluble lignin, acid-soluble lignin, glucan, and total ash is in f with grade designations identified (n = 65). 1 = high quality, 2 = medium high quality, 3 = medium quality, 4 = medium low quality, 5 = low quality

The 65 GDS biomass resource samples were clustered into two sets of five groups using a hierarchical cluster analysis of the carbohydrate release predicted by the regressions in Eqs. 1 and 2 (Online Resource Table S3). The five groups, as a function of carbohydrate release from Eq. 1, are displayed in Fig. 5f. These five groups may represent grades with grade 1 having the highest carbohydrate release and grade 5 having the lowest (Fig. 5f). Equation 2 had overall higher carbohydrate release ranges for each grade compared to Eq. 1 (Table 3). For Eq. 1, 15% of the samples were in the highest quality, grade 1, while approximately the same percentage of the 65 samples was in the lowest quality, grade 5. For Eq. 2, very few samples fell into grade 1, only 5%, with the highest percentage of samples, 28%, falling into the medium low quality grade 4 (Table 3). For both grade systems in Table 3, the grade 1 material was composed solely of crop residues (corn stover, wheat straw, and barley straw); however, in Eq. 1, grade 1 captured 44% of the crop residues classified while Eq. 2 only captured 13%. In addition, in Eq. 1, one crop residue sample was considered a grade 5 while Eq. 2 did not result in any grade 5 crop residues. For energy crops, the assigned grades span from grade 2 to grade 5 in Eq. 1, while Eq. 2 does not result in any energy crops assigned to grade 2. Finally, Eq. 1 shows grass clippings samples represented in grades 2 through 4 while Eq. 2 results in a majority of the grass clippings samples having the lowest quality grade.
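The grouping step can be sketched as a bottom-up (agglomerative) clustering of the predicted carbohydrate release values. The linkage criterion used in the study is not stated here, so the sketch assumes a Ward-style merge cost applied to neighboring clusters of the sorted one-dimensional values; the function name `cluster_into_grades` and the input values are hypothetical.

```python
def cluster_into_grades(values, n_grades=5):
    """Agglomerative clustering of 1-D values into n_grades groups.

    Returns a list of grades aligned with `values`; grade 1 holds the
    highest values. Assumption: Ward-style merge cost between neighboring
    clusters in sorted order (the linkage of the original study is unknown).
    """
    if len(values) < n_grades:
        raise ValueError("need at least as many samples as grades")

    # Start with one cluster per sample, ordered by ascending value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    clusters = [[i] for i in order]

    def mean(cluster):
        return sum(values[i] for i in cluster) / len(cluster)

    def merge_cost(a, b):
        # Increase in within-cluster sum of squares if a and b are merged.
        return len(a) * len(b) / (len(a) + len(b)) * (mean(a) - mean(b)) ** 2

    while len(clusters) > n_grades:
        k = min(range(len(clusters) - 1),
                key=lambda j: merge_cost(clusters[j], clusters[j + 1]))
        clusters[k:k + 2] = [clusters[k] + clusters[k + 1]]

    grades = [0] * len(values)
    for grade, cluster in enumerate(reversed(clusters), start=1):
        for i in cluster:
            grades[i] = grade
    return grades


# Hypothetical predicted carbohydrate release values (g g-1 dry biomass).
predicted = [0.52, 0.50, 0.44, 0.43, 0.38, 0.37, 0.31, 0.30, 0.22, 0.21]
for value, grade in zip(predicted, cluster_into_grades(predicted)):
    print(f"{value:.2f} -> grade {grade}")
```

Because the clusters follow the data rather than fixed cut points, the resulting grade boundaries, and the share of samples in each grade, depend on which regression supplies the predicted values, which is consistent with the different outcomes reported for Eqs. 1 and 2.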

Table 3 Grade definition for a biochemical conversion pathway based on predicted carbohydrate release (glucose + xylose, g g−1 dry biomass) from the explanatory variables in Eq. 1 (structural glucan, acid-soluble lignin, acid-insoluble lignin, total ash) and Eq. 2 (structural glucan, hemicellulose carbohydrates, acid-soluble lignin, acid-insoluble lignin). Samples represented are the numbers of samples out of 65 that are in each grade. Also included is the number of samples represented by each biomass category (crop residues, energy crops, and grass clippings) for each equation and grade

The differences in grading outcomes between these two equations exemplify the importance of selecting appropriate conversion performance metrics and methods, key biomass characteristics, representative calibration samples, and grading development samples as all of these factors influence the final grades using this approach. In addition, grades were defined using the carbohydrate yield predicted from the identified key characteristics, as it is not possible to define the grades based on the key characteristics directly when there are complex interactions between them. These complex interactions, probably derived from the complexity of biomass composition, require multiple components to be taken into account simultaneously. For example, MG 2 has the highest structural glucan content (42%); however, it has one of the lowest carbohydrate release values for pretreatment and enzymatic hydrolysis (0.29 g g−1 carbohydrates released) due to the interactions and impacts of the other components in the sample that contribute to decreasing overall reactivity.

As mentioned, the pretreatment and enzymatic hydrolysis conditions used to establish the conversion performance metric and determine the key characteristics impact the final step of grade definition. The conditions used here were selected based on Wolfrum et al. [34], where multiple biomass materials were pretreated at temperatures between 110 and 200 °C, of which 130 °C was determined to reveal the most differences between biomass types and was selected as the optimum for screening biomass reactivity. This allowed selection of one temperature to demonstrate the grading approach; however, future work should focus on confirming that these patterns hold across a variety of conditions. Future work may also consider inclusion of other types of biomass resources, e.g., woody, since resources like short rotation woody crops have been considered for biochemical conversion processes [44], because the ultimate goal of a grading system is to broaden the range of biomass feedstocks considered for biorefining. This may add complexity to the methodology used for developing a grading system, as these resources have physical and chemical properties that differ from the herbaceous grasses included in this case study, but it stresses the need for systemic and systematic approaches like the one followed in this case study. However, it should be mentioned that poplar was included in the study [34] referenced to determine the pretreatment conditions employed in our work. The selected metrics for this work were based on results of the impact of carbohydrate recovery in the pretreatment/hydrolysis steps on the bioethanol yield of the conversion step, which in turn is the strongest driver of biorefinery economics [45]. However, it has been established that different biomass types require different pretreatments to achieve optimal carbohydrate release and product yields. Environmental factors, like drought, can also impact optimal pretreatment conditions [46, 47]. For example, drought can decrease recalcitrance of grasses, leading to increased carbohydrate yields but also increased fermentation inhibitors [46, 47]. This indicates that not only do many biomass resources need to be incorporated, but other effects and factors also have to be considered, such as growing conditions or the type of testing prior to sale as a certification protocol. In addition, a comprehensive grade set would consider different pretreatment chemistries and enzymatic hydrolysis conditions to determine what grade boundaries are ideal across many conversion conditions, prior to any standardization methodology. Finally, enzyme and fermentation inhibitors were not included in the models in this study. Inhibitors were measured in the pretreatment liquors, but since the pretreatment was at a fairly low temperature, few inhibitors were detected (Online Resource Table S4). Further development of the preliminary grading system demonstrated here will require incorporation of additional pretreatment chemistries, inhibitors of enzymes and fermentation, and a wide variety of biomass types and samples to verify consistency and sustainability.

Finally, it must be considered how biomass grades would be used. As mentioned above, quality grades are directly associated with value and consequently with price. A grading system needs to include transparent mechanisms of translation between quality, costs, and price. A methodology for assigning rewards or penalties, like a dockage system, might be an option to incorporate into a system framework. Bonner et al. [48] have discussed how samples high in ash as a result of harvest practices could result in severe payment reductions to growers. It is clear that a transparent and robust grading system will create the required incentives for the growers to harvest using best management practices for biofuel production.
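One way to picture the translation from grade to price is a grade premium combined with an ash dockage. The sketch below is purely illustrative: the function name `settlement_price`, the base price, the premiums, the ash threshold, and the dockage rate are all placeholder assumptions and are not taken from this study or from Bonner et al. [48].

```python
def settlement_price(base_price, grade, grade_premiums,
                     ash_pct, ash_threshold, dock_per_point):
    """Price per dry ton after a grade premium and an ash dockage.

    All monetary values are hypothetical placeholders. The dockage is
    applied per percentage point of ash above `ash_threshold`, and the
    price is not allowed to fall below zero.
    """
    premium = grade_premiums[grade]
    excess_ash = max(0.0, ash_pct - ash_threshold)
    return max(0.0, base_price + premium - dock_per_point * excess_ash)


# Hypothetical premium schedule: grade 1 earns the most, grade 5 the least.
premiums = {1: 10.0, 2: 5.0, 3: 0.0, 4: -5.0, 5: -10.0}

price = settlement_price(base_price=60.0, grade=2, grade_premiums=premiums,
                         ash_pct=12.0, ash_threshold=8.0, dock_per_point=1.5)
print(f"Settlement price: {price:.2f} per dry ton")
```

Under a scheme of this kind, a grower delivering a high-grade bale with excess soil-derived ash would see the premium partly or fully offset by the dockage, which is the incentive toward better harvest practices described above.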

The implementation of a grading system requires that the involved characteristics be measurable at point of sale using rapid measurements such as near-infrared spectroscopy in the future [49,50,51,52]. This is particularly important as biomass is often delivered in smaller, more heterogeneous quantities than other feedstocks like oil or coal, and it will be necessary to know what to quantify and how to do it accurately to understand the feedstock delivered and how to process it. However, growers might have less control over characteristics like structural carbohydrates or lignin content even if they can be rapidly measured. Components such as soil contamination can be controlled via harvest method to some extent. Carbohydrate and lignin content are more likely to be controlled via cultivar selection or cut height at harvest [53]. Ash, carbohydrates, and lignin could possibly be controlled by harvest timing, as crop maturity can affect these parameters [40, 41]. However, tradeoffs will need to be understood and considered, as lower ash later in the growing season may also mean higher lignin contents. Currently, for crops like corn stover, these parameters are being selected based on grain yield, not stover yield. Unless stover is the primary crop produced or considered a valuable co-product, cultivar and harvest will continue to be optimized for grain yield. Nonetheless, if a grower can produce a stover crop with higher carbohydrate content, then they will be able to ask a higher price. Conversely, if a grower cannot charge enough to justify harvesting and baling stover, then they will forgo this altogether, as discussed in Thompson and Tyner [29], who used grades and dockage in modeling scenarios to help evaluate production decisions.

Conclusions

A transparent biomass grading system intended for the bioenergy industry would lead to a situation where stakeholders on both the supply and demand side benefit. The current situation of the biorefineries represents a high level of uncertainty for the biomass suppliers. In particular, the demand, needs, and requirements of biomass supply are far from being known. This work is a preliminary attempt to define grades and start the process of developing a comprehensive grading system, with the future vision being one in which the biomass is graded on a small number of intrinsic characteristics that growers can control to minimize losses and increase product value and which can be easily and rapidly measured. This work has demonstrated that measurable biomass characteristics, like glucan, hemicellulose (xylan, arabinan, galactan), lignin, and ash content, can be correlated to conversion performance for the definition of biomass grades. However, future challenges still exist in three main areas: (1) technology is in development for both the supply chain and biorefineries, making performance metrics a moving target; (2) critical relationships between characteristics of biomass and performance at every segment of the value chain, from field and preprocessing through conversion, are not available today; and (3) technology needs to be developed that can rapidly measure key biomass properties at the plant gate.