Introduction

Liquid transportation fuels from lignocellulosic materials are seen as a major potential substitute for petroleum [35]. Lignocellulosic ethanol is predicted to have a favorable greenhouse gas profile, alleviate dependence on foreign oil, compensate for decreasing worldwide petroleum reserves, and provide an economic boost to rural communities [16].

Most strategies for converting lignocellulosic biomass to ethanol or other liquid fuels involve the enzyme-catalyzed depolymerization of polysaccharides. However, enzymes are intrinsically expensive because they must be produced by living systems and are thermodynamically unstable. Due to the chemical and physical recalcitrance of lignocellulose, high enzyme loadings are necessary to obtain reasonable degradation rates. The ultimate cost of enzymes is thus one of the major expenses hindering the development of an economically viable lignocellulosic ethanol industry [40].

Enzymes for Biomass Conversion

The biochemistry and chemistry of plant lignocellulosic materials and of the microbial enzymes that depolymerize them have been frequently reviewed [12, 40, 58]. Suffice it to remind readers that the rate and efficiency of conversion of lignocellulosic materials to fermentable sugars are a function of the starting biomass material, the pretreatment, the involvement of somewhere between six and 60 enzymatic activities, and the definition of a “fermentable sugar”. Different feedstocks (wood, corn stover, sugar cane bagasse, etc.) subjected to different pretreatments contain different (and variable) concentrations of different sugars. In regard to the desired end products of depolymerization, wild-type Saccharomyces cerevisiae (yeast) can ferment glucose but not xylose, whereas xylose and cellooligosaccharides are acceptable to other native or engineered microbes [28, 51]. Therefore, biomass source, pretreatment, enzyme mixture, and fermentation microbe are interdependent variables.

From the perspective of the best-studied enzyme systems, the heart of depolymerization of crystalline cellulose comprises cellobiohydrolase (CBH), endo-β1,4-glucanase (EG), and β-glucosidase (BG). Our knowledge of the enzymology of hemicellulose breakdown is less extensive. Whereas cellulose is chemically homogeneous and available in highly pure forms, hemicelluloses are more diverse, both between plant species and between tissues within a single plant. Dicotyledonous hemicelluloses, for example, contain fucose, O-acetylated galactose, and α1,6-xylose-substituted β1,4-glucan [10, 19, 46]. Enzymes that are necessary to degrade the linkages found in dicot hemicelluloses therefore include β1,4-glucanases (which can be specialized for hemicelluloses such as xyloglucan), α-fucosidase, α-glucuronidase, β1,4-xylanase, α- and β-xylosidase, α-arabinosidase, and several classes of esterase. Pectins are deconstructed by pectin lyases, pectinases, pectin methylesterases, and probably other enzymes in the case of complex pectins such as rhamnogalacturonan II. In contrast to herbaceous dicots, cereals contain lower levels of pectins, higher levels of glucuronoarabinoxylan, and esterified phenolics. The depolymerization of these constituents requires xylanase, α1,2- and α1,3-arabinosidases, α-glucuronidase, and esterases [55]. Cereals also contain mixed-linkage glucan (MLG), which is found locally in high concentrations in some tissues such as young seedlings and endosperm walls. The latter is a major component of dried distillers’ grains (DDG), a byproduct of corn starch ethanol production. MLG is hydrolyzed by some conventional cellulases (i.e., β1,4-glucanases) and also by specialized enzymes (called mixed-linked glucanases, β-glucanases, or lichenases).

In addition to enzymes that act directly on the covalent bonds in plant cell wall polysaccharides, proteins that act indirectly might also be important in lignocellulose breakdown. There are three categories of such proteins. The first comprises nonenzymatic proteins that contribute to wall loosening, such as expansins and their fungal and bacterial homologs [52]. The second comprises enzymes that degrade nonglycosidic wall components, such as lignin and proteins, thereby facilitating access of the glycosyl hydrolases [44]. The third comprises potentially important auxiliary enzymes that degrade small molecules, released by pretreatments, that inhibit the core degradative enzymes or the downstream fermentation steps [6, 7, 31].

The Research Landscape of Biomass Enzymes

Enzymes for biomass deconstruction (as well as for most other industrial applications) are currently derived from fungi, especially species of Trichoderma (whose sexual stage is known as Hypocrea) and Aspergillus (sexual stage Emericella). Some enzyme preparations, used mainly for food processing, come from the mitosporic ascomycete Humicola insolens or from bacteria such as species of Bacillus. Since the 1950s, Trichoderma reesei has been subjected to multiple rounds of strain improvement for enhanced cellulase production [41]. Enhanced cellulase production has come from reduction of catabolite repression [23], reduction in protease activity [42], and the development of methods to grow the fungus to high densities on simple nutritional feedstocks and inexpensive cellulase inducers. Current protein production by T. reesei is reported to approach 100 g/l and to require minimal post-fermentation processing.

The most salient aspect of the current research landscape on biomass enzymes is its concentration in the private sector, especially within the two dominant industrial enzyme companies, Genencor and Novozymes. This state of affairs is explicitly supported by the major funder of bioenergy research in the USA, the Department of Energy (DOE). In the past 10 years, there have been two large programs to fund research at these and other enzyme companies. In 2001, Novozymes and Genencor were awarded grants of $15 million and $17 million, respectively. This funding has been stated to have led to a 20- to 30-fold reduction in the cost of enzymes for ethanol production from acid-treated corn stover [53; http://www.nrel.gov/awards/2004hrvtd.html?print]. However, the data have not been published in peer-reviewed journals, and therefore this claim cannot be validated. In an interview published in June 2009, a spokesperson for Novozymes gave the current cost of enzymes as “approximately $1/gallon” [9], implying that the cost prior to the federally funded research was $20-30/gallon. Clearly, there is much uncertainty in this area.
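The arithmetic behind the implied earlier cost can be made explicit. The following is a minimal sketch, assuming that the quoted figure of approximately $1/gallon and the stated 20- to 30-fold reduction refer to the same cost basis, which the available sources do not confirm. The function name is hypothetical and used only for illustration.

```python
def implied_prior_cost(current_cost, fold_reduction):
    """Cost before a given fold reduction, assuming both figures share one cost basis."""
    return current_cost * fold_reduction

# Approximate current cost quoted in the 2009 interview, in USD per gallon of ethanol.
current = 1.0

# Lower and upper bounds of the stated fold reduction.
low = implied_prior_cost(current, 20)
high = implied_prior_cost(current, 30)

print(f"Implied prior enzyme cost: ${low:.0f}-{high:.0f}/gallon")
```

Because neither figure has been validated in the peer-reviewed literature, the result is only as reliable as its two inputs.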

In 2008, a new round of funding of $33.8 million for enzyme research was distributed among four companies—Novozymes, Genencor, DSM Innovation Center, and Verenium (http://www.energy.gov/print/6015.htm). Novozymes and Genencor are Danish companies, and DSM Innovation is based in the Netherlands. Partners on this funding include three DOE national laboratories (NREL, PNNL, and Sandia) and Abengoa Bioenergy New Technologies (a Spanish company).

A critical feature of the 2008 DOE call for proposals was a requirement that all awardees use dilute acid-treated corn stover as the experimental feedstock. Acid pretreatment solubilizes most of the hemicellulose in plant cell walls, which is discarded (thus losing a significant amount of the carbon), or neutralized and re-added to the insoluble (mainly cellulose) fraction, or processed independently. None of these strategies are ideal from an efficiency point of view. There are several technologies that are strong contenders to become the future pretreatment method of choice (e.g., ionic liquids and several types of alkaline chemistries). Enzyme mixtures will need to be re-optimized for each.

While it cannot be denied that the major enzyme companies have tremendous expertise in all aspects of industrial enzymes, there are two potential drawbacks to having so much of the available research dollars invested in them. First, the size of the grants in one specific area of bioenergy research makes it difficult for other labs to compete (i.e., small companies, universities, and the DOE National Labs, venues where a few hundred thousand dollars is considered significant funding). Second, the enzyme companies are not obligated to publish their results, and in fact this is not in their best interests, because much of their intellectual property is held as trade secrets (that is, not disclosed even in the patent literature). Very few data resulting from the first round of DOE funding (in 2001) have emerged into the public sector, and the results from the second round (in 2008) will probably also be closely held. The lack of accountability, peer review, or public access can reasonably be expected to act as a strong restraint on progress in this critical area of the future of lignocellulosic ethanol. The simple existence of this funding situation, even without knowing what experimental avenues the companies are pursuing, can be presumed to have a stifling effect on research elsewhere.

What do we know about the enzyme research being performed with the DOE funding in the private sector? Based on press releases, patents, talks at scientific meetings, and presentations scattered across the Internet, the companies are taking several approaches to reduce the cost of enzymes. These include screening new organisms (mainly fungi) for superior versions of current enzymes and for enzymes that act synergistically with existing commercial enzymes, continued strain improvement by conventional and molecular mutagenesis, enzyme improvement by protein engineering and directed evolution, and improved efficiencies in industrial-scale enzyme production [1, 11, 40, 48, 49, 56].

Current Strategies to Improve Enzymes

It is a given that enzymes are too expensive. What is the potential for reducing the cost 10- to 100-fold? As with any industrial process, reductions in cost can come from engineering and marketing solutions such as more energy-efficient fermentation tanks or the sale of fermentation by-products. Here, we will restrict the discussion to considering ways of reducing enzyme costs that involve manipulations of the organisms or the enzymes. Insofar as enzymes are sold on a protein mass basis, the overall goal of these strategies is to increase the specific activity of the mixtures, that is, to obtain equivalent fermentable sugar yield with less protein.
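The relationship between specific activity and protein requirement can be illustrated with a short calculation. This is a minimal sketch that assumes sugar release scales linearly with enzyme dose, which is a simplification, because real hydrolysis curves saturate. The function name and all numerical values are hypothetical placeholders, not measurements.

```python
def protein_required(target_sugar_g, sugar_g_per_mg_protein):
    """Protein (mg) needed to release a target mass of sugar.

    Assumes sugar release is proportional to enzyme dose (a simplification).
    """
    if sugar_g_per_mg_protein <= 0:
        raise ValueError("specific activity must be positive")
    return target_sugar_g / sugar_g_per_mg_protein

# Hypothetical values chosen only for illustration.
target = 100.0             # g fermentable sugar to be released
baseline_activity = 0.05   # g sugar released per mg protein
improved_activity = 0.10   # twice the baseline specific activity

baseline_mg = protein_required(target, baseline_activity)
improved_mg = protein_required(target, improved_activity)

print(f"Baseline mixture: {baseline_mg:.0f} mg protein")
print(f"Improved mixture: {improved_mg:.0f} mg protein")
```

Under this linear assumption, doubling the specific activity halves the protein, and hence the enzyme purchase, needed for the same sugar yield.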

In regard to strain improvement, T. reesei has already undergone extensive improvement. It is difficult to see how protein yields greater than 100 g/l can be achieved. However, further improvements of strains could focus on tailoring their protein ratios to particular biomass substrate/pretreatment combinations or on genetically eliminating the secreted enzymes that are not necessary.

If fungal fermentation becomes the limiting factor in enzyme cost reduction, production of enzymes in other systems is possible. Plant agriculture provides tremendous yields of protein per acre. A number of cell wall-degrading enzymes have been produced in plants, e.g., bacterial endoglucanase and fungal cellobiohydrolase, and several start-up companies are based on this technology [22, 64]. Enzymes could be produced together in a single plant, or individually, and extracted as soluble protein with existing technologies. Enzymes could be targeted to the apoplast, to vacuoles, or to plastids. Enzyme genes could be regulated so that they are expressed in response to a gratuitous inducer or at a particular developmental stage, e.g., at senescence [58].

Another strategy to improve enzymes that has generated much interest is protein engineering. In this strategy, the three-dimensional structure of an enzyme guides the identification and modification of amino acid residues that affect some property such as specific activity or thermal stability. A number of proof-of-principle studies have shown that this approach is feasible [20, 47, 62], but it is not clear how ultimately successful it will be because the physico-chemical limits on specific activity of biomass enzymes are not well understood. A fungal CBH of high specific activity is not known in nature and perhaps has never arisen through the process of natural selection because it is just not thermodynamically possible. Well-known examples of the limits of evolution on enzyme behavior include the low affinity of RUBISCO for CO2 and the fact that only a few microbes, almost exclusively basidiomycete fungi, have evolved the capacity to degrade lignin [29]. Furthermore, enzymes have critical properties besides specific activity and thermal tolerance that must be considered but which can be difficult to assay in vitro. For example, besides catalyzing a particular chemical reaction, enzymes must be efficiently translated and secreted, able to resist proteases, act cooperatively with other enzymes, and have low product and feedback inhibition. One can easily imagine that an “improved” enzyme, based on an assay in isolation on a model substrate, might perform poorly in a real-world situation [62].

Considering the power of natural selection to evolve proteins with astonishing properties and the large number and high metabolic diversity of microbes, many researchers have turned to exploring nature for better enzymes. Such “bioprospecting” can be more or less random, or can be guided by evolutionary or ecological principles. It can take the form of isolating microbes that grow better on biomass substrates, mining databases of sequenced genomes, cloning variants of known enzyme genes by polymerase chain reaction (PCR), or finding new genes by metagenomics. One method exploits random cloning into expression hosts, the resulting transformants then being screened directly on plates containing cellulose or other model biomass substrates (D Mead and P Brumm, Lucigen, Inc., personal communication). Bioprospecting can be used to find better alternatives of enzymes known to be important (e.g., CBH and EG) or enzymes that enhance (“synergize”) with existing commercial cellulase mixtures. Imaginative bioprospecting is going on in many academic, government, and private labs, exploring ecological niches as varied as tropical compost piles, termite gut, wood-boring wasps, and hot springs. However, despite the broad appeal of bioprospecting and the technical ease of DNA sequencing, gene discovery is not currently a limiting factor in finding new and better enzymes. Thousands of genes annotated as “cellulase” are already present in the public databases, and thousands more are emerging monthly from the high throughput sequencing facilities. Instead, the major limiting factor, both currently and for the future, is the capacity to evaluate the biochemical activities encoded by those genes. Unfortunately, there is no reliable way to tell a “better” cellulase on the basis of its predicted amino acid sequence. The only reliable way to evaluate a new cellulase is to produce it and test it in a realistic biochemical assay.

The Future of Enzymes: Bacterial or Fungal?

To date, the majority of enzymes developed and being tested for lignocellulose degradation are from fungi. A reasonable question is how much additional progress is possible with fungal-based enzymes or whether the way forward will require new prokaryotic paradigms. This point of view is reflected in the influential US Department of Energy report, “Breaking the Biological Barriers to Cellulosic Ethanol”, in which cellulosomal bacteria are discussed extensively (98 mentions), whereas Trichoderma is mentioned only twice and both times qualified by the phrase “short term” [60]. However, the published literature is still agnostic on whether bacterial enzymes are superior to fungal ones and (a related question) whether cellulosomal (“complexed”) enzyme systems are superior to “noncomplexed” systems [63]. Free and complexed enzyme systems are found in both prokaryotes and fungi, although in both cases, complexed systems are found only in anaerobic organisms [14, 37, 60].

There have been only a few side-by-side comparisons of bacterial and fungal enzymes. Irwin et al. [24] concluded that the exoglucanases E3 and E6 from the bacterium Thermomonospora fusca are approximately equivalent to CBH1 and CBH2 of T. reesei when assayed on filter paper, and T. fusca cellulase E3 and T. reesei CBH2 are functionally equivalent in synergism experiments. Johnson et al. [26] concluded that a cell-free cellulase preparation of Clostridium thermocellum was comparable to the activity of T. reesei, but with different temperature and pH optima. Ng and Zeikus [43] found that the extracellular cellulase activity of C. thermocellum was one-half as active as T. reesei. These studies do not support the conclusion that bacterial enzymes are superior to fungal ones. However, this conclusion must be tempered by the fact that both of the comparative studies on C. thermocellum predated the discovery of cellulosomes, and therefore comparing “cell-free” preparations might have put C. thermocellum at a disadvantage [5].

One argument against fungi as a potential future source of better enzymes is their relatively low metabolic and ecological diversity compared to prokaryotes. At first glance, filamentous fungi (at least ascomycetes) seem rather similar to each other in their panoply of cell wall active enzymes. Most fungi (excluding fungi in the Saccharomycotina and symbiotic or biotrophic basidiomycetes such as species in the genera Ustilago, Laccaria, and Amanita) have a large and overlapping assortment of cell wall active enzymes (typically >150 glycosyl hydrolases) [38, 42]. This could be used as an argument that “one fungus is as good as another”. On the other hand, there are reasons to believe that many additional enzymes remain to be discovered from fungi, both superior forms of the currently known enzymes as well as enzymes with novel structures and activities. For example, some cell wall active enzymes have narrow taxonomic distribution among fungi, such as vanadate chloroperoxidase (largely restricted to the Dothideomycetes/Pleosporales), a pectate lyase with a CBM (to date found only in Fusarium graminearum; gene identifier FG10004), a pectin methylesterase with four catalytic domains (found in a handful of fungi, including F. graminearum; FG04439) and swollenin (restricted to a few species in the Eurotiales and Hypocreales). Another fact that points to what one could call “cryptic” diversity is the low level of homology (amino acid identity) between enzymes of the same classes among fungi. For example, the best matches to T. reesei CBH1 (Cel7A) in the nonredundant database (outside Trichoderma) have <65% amino acid identity, raising the question of what is the significance of those >35% different amino acids.
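For readers unfamiliar with how an amino acid identity figure is derived, the calculation can be sketched as follows. This is a minimal illustration that assumes the two sequences have already been aligned by an external tool and that identity is computed over all aligned columns other than gap-against-gap columns; other denominators are in use and give somewhat different values. The function name and the two short sequences are hypothetical and are not real CBH1 sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where both sequences carry a gap are skipped. A residue opposite
    a gap counts as a mismatch. This is one convention among several.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared

# Toy aligned fragments, invented for illustration only.
seq_a = "MYRKLAVISA"
seq_b = "MYQKLA-LSA"

identity = percent_identity(seq_a, seq_b)
print(f"{identity:.1f}% identical, {100.0 - identity:.1f}% different")
```

A single identity percentage says nothing about where the differences fall, which is why the functional significance of the divergent residues remains an open question.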

As an example of how fungi continue to contribute novel insights into our understanding of lignocellulose breakdown, the genome of the brown-rot fungus Postia placenta (Phylum Basidiomycota: Polyporales) was recently shown to have a reduced number of glycosyl hydrolases compared to other fungi [39]. Partly for this reason, it has been hypothesized that P. placenta degrades cellulose by an oxidative mechanism, analogous to how white rot fungi such as Phanerochaete chrysosporium degrade lignin (i.e., Fenton chemistry and production of strong diffusible oxidants) [29].

Another new fungal paradigm is represented by the anaerobic fungi (Phylum Neocallimastigomycota), which organize their wall-active enzymes in cellulosome-like structures [14]. Orpinomyces sp. strain PC-2 makes several cellulases, all members of GH families 5 and 6, as well as xylanase, β-glucosidase, lichenase, and at least two esterases [37]. Much remains to be learned about the genome organization, biosynthesis, assembly, and activity of fungal cellulosomes.

Of relevance to any discussion of the enzyme models of the future is the manifest fact that in nature lignocellulose is decomposed not by cell-free extracts but by living organisms organized into complex communities. One implication of this is that growth rates of intact organisms on lignocellulose might not reflect their potential contribution when reduced to a cell-free system. That is, we should not necessarily expect a good correlation between in vivo growth on lignocellulose and in vitro enzymatic activity against lignocellulose. This has implications for the search for new enzymes by bioprospecting at the organismal level; for example, some fungi (such as filamentous ascomycetes) typically grow much more quickly in culture than others (such as gilled basidiomycetes) even though their genomes indicate that they both have high genetic potential for lignocellulose degradation. Unfortunately, even though the Basidiomycota clearly have lignocellulolytic capacities not found in any other organism (viz., oxidative degradation of cellulose by P. placenta and lignin by Phanerochaete chrysosporium), their slow growth discourages bioprospecting in this phylum.

There is some evidence that lignocellulose degradation in vivo requires active metabolism on the part of the degradative organisms and cannot be completely replicated by cell-free preparations. P. placenta, for example, has a variety of enzymes, such as iron permeases, ferric reductases, P450s, and quinate transporters that are probably involved in lignocellulose degradation but are cytoplasmic or membrane-bound. Such enzymes could be critical for generating oxidants or mediators used as substrates by the extracellular enzymes. Even for well-studied noncomplexed systems, such as T. reesei and aerobic bacteria, we know little about the involvement of enzymes that remain attached to the cell, i.e., intracellular, bound to the plasma membrane, or attached to the cell wall. An unresolved but intriguing aspect of noncomplexed systems such as T. reesei is the role of oxidative reactions. T. reesei secretes few oxidoreductases [38, 42], yet the presence of oxygen has been reported to stimulate degradation of cellulose by cell-free enzyme preparations of T. reesei and other fungi [15].

It is commonly accepted that the future of lignocellulosic ethanol lies in consolidated bioprocessing (CBP), that is, enzyme production and ethanol production combined into a single microbe [61]. In this case, it is possible that bacterial enzymes will work better in a prokaryotic CBP microbe (e.g., Zymomonas mobilis or C. thermocellum), and fungal enzymes will work better in a eukaryotic CBP microbe (such as S. cerevisiae, Pichia stipitis, or T. reesei) [61]. Therefore, there may well be a need for good prokaryotic as well as good eukaryotic enzymes, depending on their purposes.

Importance of Accessory Enzymes

Although there are currently a number of enzyme preparations being sold for bioenergy applications, they are highly similar in activities and composition. This is because, in accordance with the DOE mandate (see above), they have been optimized for acid-pretreated corn stover. However, the bioenergy landscape of the future will probably include multiple feedstocks and multiple pretreatment chemistries, and therefore there will be a need for many different enzyme cocktails. Corn leaves, corn cobs, and corn stems differ significantly in polysaccharide and lignin composition [3]. DDG, of which there is currently a large supply due to grain ethanol production, has a polysaccharide profile that is distinct from other parts of the corn plant. Miscanthus and switchgrass, although related to corn, have quantifiable differences in wall composition, and all three are quite different from dicotyledonous or coniferous woody plants [46]. There are major differences in the polysaccharide and hence monosaccharide composition of biomass materials subjected to different pretreatments; e.g., acid, but not base, strips away most of the hemicellulose.

The customized enzyme mixtures of the future will probably differ from current mixtures not mainly in the core cellulases and xylanases, but rather in the myriad other “accessory” enzymes that act on the less abundant linkages found in plant cell walls. Accessory enzymes of importance could include arabinanases, galactanases, lyases, pectinases, and several types of esterases [8]. All of these are known to be secreted by lignocellulose-degrading fungi such as T. reesei [42]. Filamentous fungi also secrete a number of nonenzymatic proteins (e.g., swollenin) and proteins of unknown function (e.g., predicted glycosyl hydrolases). Plant cell walls contain significant amounts of proteins such as xyloglucan endo-transglycosidases, extensins and arabinogalactan-rich glycoproteins, and therefore proteases might contribute to efficient biomass conversion (even though they have been engineered out of commercial T. reesei strains) [42, 53]. A number of filamentous fungi (but not T. reesei) produce oxidoreductases such as laccases and lignin peroxidases, which cooperate in the oxidative depolymerization of plant cell wall polysaccharides and lignins [29, 39]. These enzymes might increase the efficiency of the standard glycosyl hydrolases of fungi such as T. reesei, either by providing an alternative way to cleave polysaccharide linkages or by making the polysaccharides more accessible.

The Future of Enzyme Research: Defined Enzyme Mixtures

In our opinion, there are two pressing needs in biomass enzyme research. The first is to improve our understanding of which enzymes or proteins are critical for deconstruction of lignocellulose. Whereas it is well understood that the core cellulases and xylanases are essential for cellulose and xylan conversion to glucose and xylose, respectively, the importance of the numerous other proteins secreted by all lignocellulose-degrading microbes remains largely unknown. This knowledge is essential for guiding enzyme bioprospecting, engineering, and production in plants (Fig. 1).

Fig. 1 Scheme showing the central importance of a core set for improving enzymes for biomass

A systematic understanding of which proteins secreted by a lignocellulolytic microbe are actually involved in lignocellulose breakdown has several additional ramifications. For example, it would also lead to the identification of unimportant enzymes, elimination of which from the genome of an enzyme production host would effectively increase the enzymatic specific activity. Another ramification arises from the fact that there is frequently not a simple relationship between the importance of an enzyme and substrate abundance, nor a simple correspondence between catalytic activity in vitro and in vivo. There are many possible reasons for this. One could attempt to predict which enzymes are necessary for lignocellulose degradation based on our knowledge of the presence of the cognate chemical linkages in cell walls, and one could even hazard a quantitative prediction based on the abundance of a particular linkage. However, many enzymes have multiple activities (e.g., some β1,4-xylanases also cleave β1,4-glucan), and others have not been accurately characterized (their names reflect the substrate used to purify them and do not necessarily represent their in vivo activities). Some proteins have no known activity on covalent bonds (e.g., swollenin), and yet others might appear disproportionately important because their substrates mask the substrates of other enzymes. Some enzymes might work on certain linkages when present in synthetic substrates but be unable to physically approach those same bonds in a lignocellulosic context. Collectively, these factors can confound our ability to predict which enzymes and proteins are necessary for effective lignocellulose deconstruction. The only reliable method to determine the contribution of a particular protein is to control its presence in a complex mixture by being able to add or remove it, combined with the use of a real lignocellulosic substrate.
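The add-or-remove logic described above can be sketched as a simple dropout analysis. This is a minimal illustration, assuming that each mixture has been assayed on a real lignocellulosic substrate and that yield is reported as a single number per mixture. The function names, component list, and yield values are hypothetical placeholders, not experimental data.

```python
def dropout_mixtures(components):
    """Return the full mixture followed by every mixture lacking exactly one component."""
    full = frozenset(components)
    return [full] + [full - {c} for c in components]

def contribution_by_dropout(yields, components):
    """Yield lost when each component is omitted from the full mixture."""
    full = frozenset(components)
    return {c: yields[full] - yields[full - {c}] for c in components}

components = ["CBH", "EG", "BG", "swollenin"]
full = frozenset(components)

# Hypothetical glucose yields (% of theoretical maximum) for each mixture.
yields = {
    full: 60.0,
    full - {"CBH"}: 15.0,
    full - {"EG"}: 40.0,
    full - {"BG"}: 25.0,
    full - {"swollenin"}: 58.0,
}

losses = contribution_by_dropout(yields, components)
for name, loss in sorted(losses.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {loss:.1f} percentage points lost when omitted")
```

A single-dropout design of this kind measures the contribution of a component in the context of all the others, but it cannot by itself detect redundancy between two components. That would require removing pairs as well.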

Another ramification of studying the role of individual enzymes in lignocellulose degradation is that we can thereby learn which catalytic functions, and therefore which specific chemical bonds, are important for lignocellulose deconstruction. This information could then be used to guide efforts to breed better biomass plants and to design more effective pretreatment chemistries.

The second critical need in enzyme research is to have a method to evaluate new alternative enzymes in a realistic way. For example, given that CBH is an essential part of any enzyme mix, one naturally wants to find (or synthesize) the best CBH. But how will we know if one CBH is better than another? (To simplify the discussion, we will restrict “better” to mean higher specific activity under standard conditions, although protease resistance, thermal stability, or altered pH optimum might be more relevant). It is now generally recognized that (1) mono-component assays do not capture important enzyme features such as degree of synergism, and (2) assays based on synthetic substrates (such as p-nitrophenol sugars or pure cellulose) do not reflect behavior against real substrates such as pretreated native lignocellulose. Both of these restrictions make it harder to design evaluation strategies for new (improved) enzymes. As discussed earlier, there are already thousands of cellulases, xylanases, etc., in the public sequence databases, yet there is no reliable way, short of actual enzyme assays in combination with the other required enzymes on realistic substrates, to ascertain if an enzyme is better or not than another.

In an attempt to address the two questions posed above, our lab is taking the approach of building synthetic enzyme mixtures in order to define essential lignocellulolytic activities and their optimal ratios (Fig. 1). Previously, there have been some attempts to create and test mixtures for these purposes. Most work has not progressed beyond cellulose degradation to glucose, i.e., it has not dealt with realistic lignocellulosic substrates nor sugars other than glucose. Walker et al. [59] compared mixtures of CBH, EG, and BG from mixed bacterial and fungal sources using cellulose (Avicel) as substrate. An optimized mixture was still not as good as a crude T. reesei preparation, leading the authors to conclude that “additional cellulases are needed”. Irwin et al. [24] compared six cellulases purified from the alkalothermophilic actinomycete Thermomonospora fusca with and without the addition of CBH1 and CBH2 from T. reesei. Mixtures contained up to six components. T. fusca exoglucanase (E3) and T. reesei CBH2 were equivalent in mixtures. By itself, bacterial E3 was somewhat less active than the two fungal CBHs on carboxymethylcellulose, swollen cellulose, or filter paper. A mixture of all six T. fusca cellulases was equivalent to a crude T. fusca mixture, but addition of T. reesei CBH1 increased activity by another 67%. Kim et al. [30] worked with the same enzymes, but used factorial experimental design to optimize the ratios. Rosgaard et al. [50] optimized the ratios of four T. reesei enzymes (two CBHs and two EGs). However, it is difficult to evaluate their results because the enzymes were obtained by expression in Aspergillus oryzae or Fusarium venenatum, and no purification protocol or evidence of purity was presented. Both of these fungi would be expected to secrete a number of cellulases similar to the ones of T. reesei.
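To give a sense of the experimental space involved in optimizing ratios, the candidate mixtures for a fixed total protein loading can be enumerated. The following is a minimal sketch using a simplex-lattice grid, which is one common mixture design and not necessarily the design used in any of the studies cited above. The function name, component labels, and step size are assumptions made for illustration.

```python
from itertools import product

def simplex_lattice(components, steps):
    """All mixtures whose proportions are multiples of 1/steps and sum to 1."""
    designs = []
    for counts in product(range(steps + 1), repeat=len(components)):
        if sum(counts) == steps:
            designs.append({c: n / steps for c, n in zip(components, counts)})
    return designs

# Four core cellulases, with proportions varied in 25% increments.
components = ["CBH1", "CBH2", "EG1", "EG2"]
designs = simplex_lattice(components, steps=4)

print(f"{len(designs)} mixtures to assay")
print(designs[0])
```

The number of mixtures grows quickly with both the number of components and the resolution of the grid, which is one reason that statistical designs sampling only part of this space are attractive when many accessory enzymes are included.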

In regard to synthetic enzyme mixtures, some of the most interesting research is available only in the non-peer-reviewed patent literature. Hill et al. [21] optimized a mixture of T. reesei cellulases for degradation of acid-pretreated wheat straw. Their benchmark blend was 57% CBH1 (GenBank CAA49596), 29% CBH2 (P07987), 7% EG1 (AAA34212), and 7% EG2 (AAA34213). Optimization of the ratios was able to improve activity by 10-15%. Although the focus of this work was on optimization of two CBHs and two EGs, these enzyme combinations will not release free glucose unless BG is added. The authors indicate they also added BG from Aspergillus niger, but give no indication of its purity. The standard A. niger BG preparation used in many labs (Novozyme 188) contains many contaminating enzyme activities, including cellulases. If this is the BG used by Hill et al. [21], then interpretation of their results is more complicated. In a subsequent patent application from the same group, Scott et al. [54] tested combinations of the accessory enzymes Cip1 (GenBank AAP57751), Cel61A (CAA71999), and swollenin (CAB92328) when added to the core enzymes. An improvement of approximately 35% was obtained.

Progress in Enzyme Research in the DOE Great Lakes Bioenergy Research Center

Given that T. reesei is an efficient degrader of cellulose (although not necessarily the best), it is of interest to know what proteins it secretes. To answer that, we analyzed by sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) and liquid chromatography-mass spectrometry the proteins secreted by T. reesei RUT-C30 when grown on ammonia fiber expansion (AFEX)-treated corn stover [42]. In parallel, we analyzed the proteins in a commercial “cellulase” preparation (Spezyme CP). To the best of our knowledge, this is the first published proteomics analysis of the T. reesei secretome to be based on the complete genome sequence [38].

Several conclusions can be drawn from this analysis and from comparison of the two T. reesei preparations. First, the secretome is complex, containing at least 80 proteins. In this regard, it is similar to other ascomycete secretomes [45]. This number is a conservative estimate using a 95% probability cutoff (Scaffold, Proteome Software Inc., Portland, OR, USA). At 90% probability, 234 proteins could be detected. Second, the commercial and the “homemade” preparations are quite similar, despite the fact that one is from a highly selected industrial strain grown under industrial conditions and the other from a “primitive” selected strain grown in the laboratory on corn stover. There are no major qualitative differences (except for proteases, see below) and only a few proteins differ significantly in relative abundance (e.g., xylanase 3 [BAA89465] and β-xylosidase [CAA93248]). There are no proteins in Spezyme CP from any other organism, indicating that the commercial production strain of T. reesei used to make Spezyme CP has not been genetically engineered with heterologous genes (at least not yet). The biggest qualitative difference between Spezyme CP and RUT-C30 is the absence of five abundant proteases from the commercial cellulase. The simplest explanation for this is that the genes encoding the proteases have been intentionally mutated in the commercial production strain in order to improve stability of the commercial product; to the best of our knowledge, this technical advance has not been published. Third, although as expected, glycosyl hydrolases dominate in abundance and diversity in both preparations, T. reesei also secretes many other cell wall active proteins, including carbohydrate esterases, proteases, and nonenzymatic or unknown proteins such as swollenin and Cip (both of which contain CBMs). In contrast to some fungi such as P. chrysosporium, only one putative oxidoreductase is present in the secretome of T. reesei [42].

Our lab is using these proteome results to guide the construction of a minimal, synthetic set of enzymes. The goal is to produce a mixture that can perform as well as Spezyme CP or other commercial cellulases but at a lower enzyme loading, by optimizing the ratios of the necessary enzymes and by omitting the unnecessary proteins.

Several considerations guided us in our choice of enzymes to constitute the synthetic set. First, we felt it was preferable to choose enzymes from a single organism on the grounds that any enzyme-enzyme interactions that might take place would favor cooperativity between co-evolved proteins. Second, we chose to start with the enzymes of T. reesei because we know the biochemical functions of more of its enzymes, in more detail, than of any other single organism (bacterial or fungal). This point is especially important because in order to use enzymes from most other organisms, one must deduce function by orthology, which carries significant risk. Considering the diversity and low level of amino acid identity among orthologous proteins in fungi (often below 60%), it seemed prudent to choose enzymes whose biochemical functions had been experimentally verified. As an example of how deduction of function by orthology can be misleading, most of the GH family 3 enzymes from Aspergillus nidulans (of which there are ∼20) probably do not cleave cellobiose, as does the dominant enzyme of GH family 3 in T. reesei. It is not possible to determine which A. nidulans gene encodes a functional cellobiase from sequence alignments [4, 42, and unpublished results]. Third, although T. reesei has fewer glycosyl hydrolase genes than some other ascomycetes [42], it has good representation of at least a single gene from all of the major known, important cell wall-degrading enzyme families. The deficiency of T. reesei in overall number of glycosyl hydrolases is mainly due to a strong reduction in the numbers of genes in CAZy GH families 43 and 61 and an absence of the redundant GH family 51 (containing mainly α-arabinosidases) and GH family 53 (endo-β1,4-galactanases) [42].

The major experimental challenge in this line of research is the difficulty of obtaining sufficient quantities of highly pure enzymes. Previous published studies have obtained pure enzymes in several ways, including purification from commercial cellulase preparations, purification from the source fungus grown in house, or heterologous expression in another filamentous fungus. It is difficult to evaluate many of these studies because they often do not provide evidence of purity and, in some cases, information about the enzyme purification methods used. A surprising number of published reports do not indicate the method of protein quantitation, or, if the method is given, what protein was used as a standard. Without accurate information on purity and specific activities, it is impossible to compare results between laboratories.

Despite major advances in gene expression and protein purification, it is still challenging to make sufficient quantities of highly pure enzymes, i.e., it is expensive. As many other researchers have discovered, production of pure proteins is not technically trivial, whether it be done by conventional protein purification from commercial cellulase preparations, by heterologous expression, or in vitro [18]. Proteins can be purified directly from commercial cellulase preparations, which have high protein concentrations (∼100 mg/ml) [17]. However, purification becomes progressively more difficult past the most abundant enzymes (CBH1, EG, EX, etc.). Heterologous expression in hosts such as Pichia pastoris, Escherichia coli, or S. cerevisiae is attractive, especially because they have low background of potentially interfering activities [13]. Many fungal glycosyl hydrolases and other proteins have been successfully expressed in P. pastoris [e.g., 25, 32, 36]. There are two problems associated with purification from native sources and expression in heterologous systems, namely, glycosylation heterogeneity and abnormal glycosylation, respectively (other posttranslational modifications can also complicate the issue—see Lappalainen et al. [33]). Glycosylation of many secreted fungal proteins is intrinsically heterogeneous, which hinders efficient purification by methods such as ion exchange. For example, T. reesei EX2 (GenBank AAB29346) overexpressed in T. reesei runs as a tight doublet on SDS-PAGE but splits into >12 peaks by anion exchange chromatography (our unpublished results). The differences in the protein isoforms that account for this behavior have not been identified, although earlier work on this same enzyme suggested that natural heterogeneity is produced, in part, by deamination of glutamine [33]. Some heterologous hosts produce enzymes with abnormal glycosylation, e.g., yeast and P. pastoris. This can be ameliorated by using glycosylation mutants [20].

Concerns that abnormal glycosylation might adversely affect key enzyme properties (activity, protein folding, secretion efficiency, stability, solubility, pH optimum, interactions with other proteins, etc.) make homologous expression more attractive. The major problem with homologous expression (i.e., in T. reesei itself) is contamination by endogenous glycosyl hydrolases and other secreted enzymes. This can be avoided by engineering a tag such as His6 into the protein, which can then be used for purification with a single chromatographic step on a nickel resin. Another approach to avoid contamination from the major enzymes is to use a knockout strain as a host. A strain of T. reesei missing the two major CBHs and two major EGs has been used for this purpose [27].

In our own experiments, we have found both P. pastoris and T. reesei to be reasonable hosts for medium-scale (∼10-20 mg) production of highly pure enzymes. For now, we are discounting the potential problem of abnormal glycosylation in P. pastoris. We base this decision on two factors. First, there are actually very few reports that glycosylation makes a large difference to activity, at least in vitro. Second, there is no such thing as “normal” glycosylation; glycosylation in T. reesei, for example, depends on both strain and growth conditions [57]. Thus, an enzyme such as CBH1 from different T. reesei strains is not strictly the same enzyme.

In preliminary experiments, we are focusing on making and optimizing a “core” set that can be used to find better replacement enzymes and to serve as a platform to test accessory enzymes (Fig. 1). The rationale for a core set is that without at least these enzymes, no significant release of free glucose or xylose is expected from any lignocellulosic substrate. Our core set includes one cellobiohydrolase (CBH; GenBank CAA49596), one endoglucanase (EG; AAA34212), β-glucosidase (BG; AAA18473), one endo-xylanase (EX; BAA89465), and β-xylosidase (BX; CAA93248). All of these are among the most abundant proteins secreted by T. reesei when grown on AFEX-treated corn stover [42].

Having established a platform for optimizing enzyme mixtures, one can use robotic liquid handlers and statistical experimental design to optimize a core set for diverse conditions, such as alternate feedstocks and alternate pretreatments. Of greatest interest to us is the use of the platform to test alternate core enzymes (i.e., to find better CBH1, EG, etc., by substitution) and to test accessory enzymes when added to the core set (e.g., α-glucuronidase and acetyl xylan esterase). Robotically assisted assays and sugar measurements, combined with appropriate statistical methods such as response surface methodology, make this task practical and meaningful [2, 17, 30, 34].
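
To illustrate what a statistically designed mixture experiment for a five-enzyme core set might look like, the Python sketch below enumerates a simplex-lattice design, one common layout for mixture experiments in which component proportions must sum to 1. It is offered only as an illustration and is not necessarily the design used in the cited studies. The lattice degree of 4, the 5% minimum proportion per enzyme, and the function names simplex_lattice and with_minimum are assumptions introduced here.

```python
from fractions import Fraction
from itertools import product

# Core set named in the text: CBH, EG, BG, EX, BX.
CORE_ENZYMES = ("CBH", "EG", "BG", "EX", "BX")


def simplex_lattice(n_components, degree):
    """Return every mixture whose proportions are multiples of 1/degree and sum to 1."""
    points = []
    for counts in product(range(degree + 1), repeat=n_components):
        if sum(counts) == degree:
            points.append(tuple(Fraction(c, degree) for c in counts))
    return points


def with_minimum(point, minimum):
    """Rescale a lattice point so that every component is at least `minimum`.

    This is the usual pseudo-component transform: each enzyme gets the
    minimum share, and the remainder is distributed by the lattice point.
    """
    spare = 1 - minimum * len(point)
    return tuple(minimum + spare * x for x in point)


# Assumed settings: degree-4 lattice, each enzyme at no less than 5% of the mixture.
lattice = simplex_lattice(len(CORE_ENZYMES), 4)
design = [with_minimum(p, Fraction(1, 20)) for p in lattice]

print(f"{len(design)} mixtures to assay")
first = dict(zip(CORE_ENZYMES, (float(x) for x in design[0])))
print(first)
```

Each entry in design is one mixture that a liquid handler could dispense at a fixed total protein loading; the measured glucose and xylose yields would then be fitted with a response surface model to estimate the optimal proportions.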