Mining the biomass deconstructing capabilities of rice yellow stem borer symbionts
Efficient deconstruction of lignocellulosic biomass into simple sugars in an economically viable manner is a prerequisite for its global acceptance as a feedstock in bioethanol production. This is achieved in nature by suites of enzymes with the capability of efficiently depolymerizing all the components of lignocellulose. Here, we provide detailed insight into the repertoire of enzymes produced by microorganisms enriched from the gut of the crop pathogen rice yellow stem borer (Scirpophaga incertulas).
A microbial community was enriched from the gut of the rice yellow stem borer for enhanced rice straw degradation by sub-culturing every 10 days, for 1 year, in minimal medium with rice straw as the main carbon source. The enriched culture demonstrated high cellulolytic and xylanolytic activity in the culture supernatant. Metatranscriptomic and metaexoproteomic analysis revealed a large array of enzymes potentially involved in rice straw deconstruction. The consortium was found to encode genes ascribed to all five classes of carbohydrate-active enzymes (GHs, GTs, CEs, PLs, and AAs), including carbohydrate-binding modules (CBMs), categorized in the carbohydrate-active enzymes (CAZy) database. The GHs were the most abundant class of CAZymes. Predicted enzymes from these CAZy classes have the potential to digest each cell-wall component of rice straw, i.e., cellulose, hemicellulose, pectin, callose, and lignin. Several identified CAZy proteins appeared novel, having an unknown or hypothetical catalytic counterpart with a known class of CBM. To validate the findings, one of the identified enzymes, which belongs to the GH10 family, was functionally characterized. The enzyme expressed in E. coli efficiently hydrolyzed beechwood xylan, and pretreated and untreated rice straw.
This is the first report describing the enrichment of lignocellulose degrading bacteria from the gut of the rice yellow stem borer to deconstruct rice straw, identifying a plethora of enzymes secreted by the microbial community when growing on rice straw as a carbon source. These enzymes could be important candidates for biorefineries to overcome the current bottlenecks in biomass processing.
Keywords: Rice yellow stem borer, Gut consortium, Microbial diversity, Targeted enrichment, Metaexoproteome, Carbohydrate-active enzymes, Xylanase, GH10 family
Abbreviations
YSB: yellow stem borer
LC–MS/MS: liquid chromatography–tandem mass spectrometry
OTU: operational taxonomic unit
COG: cluster of orthologous group
ORF: open reading frame
The use of lignocellulosic ethanol as a sustainable alternative to fossil fuel-derived transportation fuel or first generation biofuels depends upon consistent biomass availability and the economic viability of the bioethanol production process. Among all the lignocellulosic biomass available as potential feedstocks in lignocellulosic ethanol production, agricultural residues are attractive, as the amount produced on an annual basis is likely to increase in the future due to increased demand for crop production to fulfil the nutritional requirements of the rapidly growing world population. Rice straw, wheat straw, sugarcane bagasse, and corn stover are currently the most available agricultural residues, with rice straw being the most abundant (731 million tons), totalling more than the sum of the other three crops (663 million tons). Rice straw also contains the least amount of lignin (one of the limiting factors towards making lignocellulosic ethanol cost competitive) when compared to all other abundantly available agricultural residues [3, 4, 5], making it a desirable choice as feedstock for lignocellulosic ethanol production [6, 7, 8, 9]. Moreover, because its high silica content limits its suitability for other purposes [10, 11], farmers usually burn the rice straw in the field, wasting a potentially valuable resource, releasing emissions of black carbon and CO2, and generating tropospheric ozone [12, 13, 14]. A major barrier in delivering cost effective lignocellulosic bioethanol is the availability of enzymes that can efficiently deconstruct each component of the plant cell wall. Indeed, none of the current formulations of biomass degrading enzymes fully meet the requirements of the biofuels industry. To overcome these limitations, a diverse range of lignocellulose-degrading organisms are being explored for new enzyme activities, including insects, which have evolved to digest a wider range of lignocellulosic substrates [16, 17, 18].
The type of enzymes required for effective deconstruction of biomass depends on the nature and structural components of its cell wall. There is no universal cocktail of enzymes that can effectively deconstruct each type of biomass, and the cocktail is usually customized on the basis of biomass composition [19, 20]. Most enzymes used in commercial lignocellulosic ethanol production have been discovered from pure fungal or bacterial isolates. In this paper, we describe the selective enrichment of a microbial consortium from the gut of a rice yellow stem borer (Scirpophaga incertulas) using rice straw as the sole carbon source. The yellow stem borer (YSB) is monophagous, i.e., it derives nutrition solely from stems of rice plants. It is, therefore, highly specialized to deconstruct the cell walls of rice plants into simple sugars. Microbial communities residing in the gut of biomass degrading insects are known to interact synergistically for comprehensive biomass deconstruction [23, 24, 25, 26]. A metatranscriptomic and metaexoproteomic study was performed on a rice straw-enriched microbial community from rice stem borer larvae to investigate the CAZy proteins mediating the deconstruction of rice plant cell walls. Several new enzymes categorized to different CAZy classes were identified. One of these, belonging to family GH10, was heterologously expressed in E. coli and its deconstruction ability towards the hemicellulose component of rice straw was established.
Microbial diversity of a rice yellow stem borer gut consortium
[Table: Bacterial diversity in rice YSB gut consortium. Columns: name of phylum; number of genera.]
Enrichment of a rice yellow stem borer gut microbial consortium
Changes in the diversity of rice yellow stem borer gut consortium during enrichment process
16S rRNA gene analysis of the microbial community after 12 months of serial passaging on rice straw showed that the major phyla Proteobacteria and Bacteroidetes were enriched from 92.5 to 99.3% of the community, while the relative abundance of Firmicutes and Verrucomicrobia decreased from 7.1 to 0.2% compared to the original starting culture (Fig. 1a, c). The proportion of Actinobacteria remained similar in both the gut fluid and the enriched culture at 0.3%.
There was a greater diversity of microorganisms in the original gut fluid, with 178 genera identified compared to 83 in the enriched culture, and while certain strains diminished during the enrichment process, others became dominant (Fig. 2a, c). For example, the top 5 genera, which constituted 65% of all genera present in the gut, were Asticcacaulis (37%), Pedobacter (11%), Stenotrophomonas (7%), Rhizobium (5%), and Bacillus (5%) (Fig. 2a), while in the enriched culture, except for Pedobacter (8%), all the other genera were replaced in the top 5 ranking by Pseudomonas (49%), Ensifer (10%), Flavobacterium (8%), and Aeromonas (5%), together constituting 80% of total abundance (Fig. 2c). We also observed differences between the quantitative abundance and the number of unique OTUs detected for each genus. For example, Azotobacter had the highest number of species detected in the gut consortium, while it was 7th in terms of abundance (Fig. 2a, b). In the enriched culture, Pseudomonas remained highest in both abundance and number of species detected, but Azorhizophilus was 2nd highest for number of species detected, while it was 23rd in terms of abundance (Fig. 2c, d, Additional file 1: Figure S2). More than 99.9% of the genera present in the enriched YSB consortium were also present in the original consortium, albeit in varying abundance, suggesting that the chance of contamination arising during passaging was negligible (Additional file 1: Table S1).
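As a quick arithmetic check on the genus-level shifts described above, the following minimal Python sketch re-adds the top-5 shares quoted in the text and lists which genera remain in the top 5 after enrichment. The percentages come from the paragraph; the function names, variable names, and the read counts in the usage line are illustrative assumptions and are not part of the original analysis pipeline.

```python
# Top-5 genus shares (% of community) as quoted in the text.
GUT_TOP5 = {
    "Asticcacaulis": 37, "Pedobacter": 11, "Stenotrophomonas": 7,
    "Rhizobium": 5, "Bacillus": 5,
}
ENRICHED_TOP5 = {
    "Pseudomonas": 49, "Ensifer": 10, "Flavobacterium": 8,
    "Pedobacter": 8, "Aeromonas": 5,
}


def relative_abundance(read_counts):
    """Convert raw per-genus read counts to percentages of the total."""
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("no reads to normalise")
    return {genus: 100.0 * n / total for genus, n in read_counts.items()}


def top_n(abundance, n=5):
    """Return the n most abundant genera as (genus, percent) pairs."""
    return sorted(abundance.items(), key=lambda kv: kv[1], reverse=True)[:n]


gut_share = sum(GUT_TOP5.values())            # combined share of the gut top 5
enriched_share = sum(ENRICHED_TOP5.values())  # combined share after enrichment
retained = sorted(set(GUT_TOP5) & set(ENRICHED_TOP5))

# Hypothetical read counts, only to show how the helpers are called.
example = relative_abundance({"GenusA": 600, "GenusB": 300, "GenusC": 100})
print(gut_share, enriched_share, retained, top_n(example, 2))
```

The sums reproduce the 65% and 80% figures given in the text, and Pedobacter is the only genus retained in both top-5 lists.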
Mining CAZy proteins in the enriched consortium
The enriched consortium was superior in rice straw deconstruction in liquid culture compared to the original gut microbial consortium (Fig. 4b). We, therefore, investigated the CAZy proteins produced by this enriched consortium by collecting protein samples on days 3, 7, 13, and 20 from the culture to capture proteins produced at early, mid, and late stages of the rice straw deconstruction. Metaexoproteomic analysis was performed on the secreted proteins present at each of these timepoints with a view to understanding the nature and relative abundance of potential enzymes and ancillary proteins, and also to investigate how the profile and abundance of these proteins change over time. Secretory proteins available in two discrete fractions were extracted from the rice straw degrading cultures: a soluble extract was isolated by precipitating proteins from the culture supernatant, while a ‘bound fraction’ was obtained using a biotin-labelling methodology as described previously. This methodology allowed the specific targeting of proteins tightly bound to the rice straw. Soluble and biomass-bound protein extracts were then analysed by LC–MS/MS and searched against the metatranscriptomic library generated from the enriched consortium.
[Table: CAZy families detected in rice YSB metaexoproteome. Columns: nature of domains; number of domains identified. Rows: carbohydrate-binding modules (CBMs), glycoside hydrolases (GHs), carbohydrate esterases (CEs), glycosyl transferases (GTs), auxiliary activities (AAs), surface layer homology (SLH), polysaccharide lyases (PLs).]
[Table: Relative abundance of top 20 GH family proteins observed in the rice YSB gut consortium. Columns: relative abundance rank; total emPAI score. Function listed for the surviving entries: cellulose and hemicellulose deconstruction.]
In terms of CBMs, a total of 95 CBMs from 15 families were identified in the enriched consortium metaexoproteome. Among those identified, 33 CBM domains (from 13 different families) were found exclusively in the bound fraction, 17 CBM domains (from 4 different families) were found exclusively in the supernatant fraction, while 45 CBM domains (representing 5 families) were identified in both fractions. By far, the most represented CBM family in the metaexoproteome was CBM44 (known for binding to cellulose and xyloglucan), accounting for 56/212 of all CAZy annotated domains. However, based on relative abundance, the most abundant CBM domains identified in the YSB metaexoproteome were CBM4 (xylan, glucan, and amorphous cellulose binding) and CBM2 (predominantly cellulose binding); their relative abundance is given in Additional file 1: Table S2. When we categorized these CBMs on the basis of their binding specificity, we found CBM3 and CBM63, known for cellulose binding, and CBM13 and CBM22, known for hemicellulose binding, while CBM2, CBM4, CBM6, CBM9, and CBM44 are known to bind both cellulose and hemicellulose. CBM families known to bind to pectin (CBM32), starch (CBM20 and CBM48), glycoproteins (CBM32 and CBM40), peptidoglycans (CBM50), and chitin (CBM2 and CBM3) were also identified.
Metaexoproteome analysis also identified a total of 21 domains belonging to the carbohydrate esterase (CE) CAZy class, assigned to 5 families. Among them, 18 domains (representing 4 families) were present exclusively in the bound fraction, 2 domains (from 2 families) were present only in the supernatant fraction, and 1 domain was present in both. The most abundant CE domains identified in the metaexoproteome were assigned to the CE1 and CE10 families; their relative abundance in each fraction is given in Additional file 1: Table S3. In terms of substrate recognition, CE7 is known for hemicellulose deconstruction, CE1 and CE16 are known to hydrolyse hemicellulose and pectin, the CE10 domain is categorized as hemicellulose and lignin deconstructing, while the carbohydrate esterases of the CE4 family have specificity for hemicellulose, chitin, and peptidoglycan.
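The domain counts quoted above for CBMs and CEs can be cross-checked with simple arithmetic. The minimal Python sketch below uses only the numbers stated in the text; the class and attribute names are illustrative assumptions introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FractionCounts:
    """Domain counts split by the fraction in which they were detected."""
    bound_only: int
    supernatant_only: int
    both: int

    @property
    def total(self) -> int:
        return self.bound_only + self.supernatant_only + self.both

    def percent_bound_only(self) -> float:
        return 100.0 * self.bound_only / self.total


# Counts quoted in the text.
CBM = FractionCounts(bound_only=33, supernatant_only=17, both=45)
CE = FractionCounts(bound_only=18, supernatant_only=2, both=1)

# CBM44 share of all CAZy-annotated domains (56 of 212, as quoted).
cbm44_percent = 100.0 * 56 / 212

print(CBM.total, CE.total)
print(round(CBM.percent_bound_only(), 1), round(CE.percent_bound_only(), 1))
print(round(cbm44_percent, 1))
```

The totals match the 95 CBM and 21 CE domains reported, and the bound-only share is much larger for CEs than for CBMs.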
When we investigated the presence of auxiliary activity (AA) proteins in the metaexoproteome, we found a total of 16 domains assigned to 3 families: AA2, AA7, and AA10. All 16 domains were found exclusively in the bound fractions. Of all the CAZy annotated domains, the AA10 from Protein c4515_g1_i1_1 was the most abundant, and when compared with the relative abundance of all other identified proteins, it ranked 11/1088. The three AA families represented in the proteome are reported to specifically deconstruct separate components of the plant cell wall; AA10 deconstructs cellulose, AA7 deconstructs cellulose and hemicellulose, and AA2 deconstructs lignin.
In addition, the enriched consortium metaexoproteome contained polysaccharide lyases (PL) represented by two PL families: PL1 and PL2. Pectate lyase and exo-polygalacturonate lyase are two important enzymes known in these families, and they are known to depolymerise pectin present in the primary and secondary cell walls of plant biomass through eliminative cleavage.
[Table: Architecture of multi-domain CAZymes identified in the rice YSB gut consortium. Columns: ORF of the YSB contig; domain architecture of translated proteins.]
Dynamics of CAZy protein expression
Recombinant expression and functional validation of a xylanase from the GH10 family
To identify new microbial sources of lignocellulolytic enzymes, we extracted gut fluids from YSB larvae and enriched for rice straw deconstruction by sub-culturing on rice straw for over a year. As expected, we observed much higher deconstruction of rice straw by the enriched microbial consortium as compared to the freshly isolated YSB gut consortium. The enriched consortium demonstrated significant cellulase and xylanase activities and diverse colony morphology on agar plates. Since there has been little published information on the diversity of the microbiome of the rice YSB gut, we performed 16S rRNA gene analysis and explored changes in microbial population in the enriched consortium compared to the native one. The dominant phyla in the YSB gut consortium were Proteobacteria, Bacteroidetes, and Firmicutes, which were similar to those observed by Reetha and Mohan while studying culturable microbes of the pink stem borer, an important insect pest of several different types of crop including rice. The dominance of Proteobacteria, Bacteroidetes, and Firmicutes in the YSB gut community provides a strong indication of their importance in facilitating depolymerisation of the complex rice straw cell-wall components to monomeric sugars that can be absorbed by the host insect. Following serial sub-culturing, we observed an increase in Proteobacteria and Bacteroidetes and a decline in Firmicutes and Verrucomicrobia. As a result of cellulolytic bacteria enrichment in the consortium, we observed a decrease in the diversity of total bacterial species. Interestingly, bacterial genera known for biomass deconstruction such as Pseudomonas, Azotobacter, Dyadobacter, Flavobacterium, Prosthecobacter, Chitinophaga, Sphingobium, Pseudoxanthomonas, Mucilaginibacter, Giofilum, Ensifer, and Cellulomonas were identified in both the original and enriched consortia.
We further cultured the enriched consortium on rice straw for 20 days and mined the CAZy proteins through metaexoproteomics. We analyzed proteins present in the culture supernatant as well as those bound to the rice straw biomass. Analysis of all the CAZymes present in the metaexoproteome showed that enzymes exclusively bound to the rice straw were significantly higher in abundance (9.5-fold) compared to those in the culture supernatant. In the bound fractions, CAZy family proteins known for high catalytic activity on cellulose or hemicellulose, such as GH10, GH9, GH48, and GH5, were identified in high abundance.
In addition to single domain CAZymes, we also identified several enzymes with multi-domain molecular architecture. An enzyme was identified with a single catalytic domain and two different carbohydrate-binding modules (CBM2 and CBM3), indicating that the enzyme may possess broad specificity for different substrates. Interestingly, CAZymes with multiple repeats of CBMs belonging to families CBM13, CBM20, and CBM44 were also identified. Multimerization of CBM44 in different enzymes was in the range of 2–11 binding domains. To date, the multimerization of CBMs is mostly reported for thermostable enzymes such as CenC from Clostridium thermocellum, xylanase from Thermoanaerobacterium aotearoense, and CelA from Caldicellulosiruptor bescii. These enzymes catalyze hydrolysis at high temperature, which results in weakened binding to the insoluble substrate because of increased kinetic energy. The availability of several CBMs possibly provides better accessibility of insoluble substrate to the enzyme at these higher temperatures. Moreover, some thermophilic bacteria are reported to secrete non-catalytic proteins to increase the accessibility of the insoluble substrate to the biomass deconstructing enzymes, and this may also apply to the consortium from the YSB. Another interesting finding is the identification of several polypeptides with unknown catalytic domains linked to known CBMs. The presence of CBMs with domains of unknown function suggests that these proteins play a role in lignocellulose deconstruction and present interesting targets for characterization and for potentially boosting saccharification of biomass feedstocks.
One of the most abundant enzymes (maximum emPAI score) in the enriched consortium was a GH10 xylanase which we confirmed by showing that the recombinant enzyme was capable of hydrolyzing beechwood xylan and the hemicellulosic component of both treated and untreated rice straw.
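The emPAI score used to rank protein abundance is not defined in this section. The sketch below assumes the commonly used definition, emPAI = 10^PAI - 1, where PAI is the number of observed peptides divided by the number of observable peptides, and converts scores to a molar percentage for ranking. The protein labels and peptide counts are hypothetical and serve only to show the calculation; they are not data from this study.

```python
def empai(n_observed: int, n_observable: int) -> float:
    """emPAI = 10**PAI - 1, with PAI = observed / observable peptides."""
    if n_observable <= 0:
        raise ValueError("n_observable must be positive")
    return 10 ** (n_observed / n_observable) - 1


def molar_percent(scores):
    """Express each emPAI score as a percentage of the summed scores."""
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("sum of emPAI scores must be positive")
    return {name: 100.0 * s / total for name, s in scores.items()}


# Hypothetical (observed, observable) peptide counts for three proteins.
peptides = {"protein_A": (12, 20), "protein_B": (5, 25), "protein_C": (2, 30)}
scores = {name: empai(obs, possible) for name, (obs, possible) in peptides.items()}
ranked = sorted(scores, key=scores.get, reverse=True)

for name in ranked:
    print(name, round(scores[name], 3), round(molar_percent(scores)[name], 1))
```

Because the score is exponential in the peptide ratio, a protein with many of its observable peptides detected dominates the ranking, which is consistent with using the maximum emPAI score to pick out the most abundant enzymes.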
The present study was aimed at enriching a rice yellow stem borer (YSB) microbial consortium for better lignocellulosic biomass deconstruction ability, particularly against untreated rice straw. As a result, the enriched rice YSB consortium was found to deconstruct ~ 67% of the rice straw in 7 days, which is high compared to other reported microbial consortia. Wang et al. found 31.5% degradation efficiency against untreated rice straw in 30 days by the rice straw adapted (RSA) compost consortia. Wongwilaiwalin et al. and Yan et al. reported 45% (MC3F compost consortium) and 49% (BYND-5 compost consortium) degradation efficiency against untreated rice straw in 7 days, respectively. The domains of unknown function linked to CBMs and the enzymes with multi-domain architecture discovered here present interesting targets for further characterization and possible biotechnological application.
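To put the degradation figures above on a common footing, the following minimal Python sketch computes a mean daily degradation rate for each consortium from the percentages and incubation times quoted in the text. Two assumptions are made for illustration: degradation efficiency is treated as percentage dry-weight loss (the section does not state how it was measured), and the mean daily rate is only a rough comparison because degradation is unlikely to be linear over time.

```python
# (degradation %, incubation days) as quoted in the text.
REPORTED = {
    "YSB enriched consortium": (67.0, 7),
    "RSA compost consortium": (31.5, 30),
    "MC3F compost consortium": (45.0, 7),
    "BYND-5 compost consortium": (49.0, 7),
}


def weight_loss_percent(initial_g: float, residual_g: float) -> float:
    """Percentage of the starting dry weight lost (assumed definition)."""
    if initial_g <= 0:
        raise ValueError("initial weight must be positive")
    return 100.0 * (initial_g - residual_g) / initial_g


def mean_daily_rate(percent: float, days: int) -> float:
    """Average percentage points degraded per day of incubation."""
    return percent / days


rates = {name: mean_daily_rate(p, d) for name, (p, d) in REPORTED.items()}
ranking = sorted(rates, key=rates.get, reverse=True)

for name in ranking:
    print(f"{name}: {rates[name]:.2f} % per day")
```

On this crude measure the enriched YSB consortium ranks first and the 30-day RSA consortium last, in line with the comparison made in the text.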
Rice YSB gut consortium cultivation for induced expression and mining of biomass deconstructing enzymes
The insect Scirpophaga incertulas, commonly known as rice yellow stem borer (YSB), was selected in this study for targeted discovery of rice straw deconstructing enzymes. Insect larvae (approximately 25) were collected from the paddy fields of the Biotechnological Research Experiments field, Raipur University, Chhattisgarh, India in October 2011. Insect larvae were dissected aseptically, the gut was isolated, and the microbial community harboured in the gut was used as inoculum for further experiments. The YSB gut microbial community was inoculated in three different media: (1) Tryptic Soya Broth (TSB) (1.7% tryptone, 0.3% soya peptone, 0.25% K2HPO4, 0.5% NaCl, and 0.25% glucose); (2) rice straw in water having salt only (0.25% K2HPO4, 0.5% NaCl, and 0.5% rice straw of ~ 0.5 cm), and (3) rice straw in water having salt and 0.1% yeast extract (0.25% K2HPO4, 0.5% NaCl, 0.1% yeast extract, and 0.5% rice straw of ~ 0.5 cm). The YSB gut microbial community was cultured in the three media separately for 7 days at 30 °C with 150 rpm shaking. After 7 days, the culture was centrifuged at 10,000 rpm for 20 min, and the supernatant and cell pellet were collected separately. The supernatant was filtered through a 0.22 µm syringe filter and used for enzyme assays, while the cell pellet was sonicated at 4 °C and centrifuged at 10,000 rpm, and the total soluble proteins (TSP) were used for the enzyme assays. CMCase and xylanase assays were performed for both the secretory (culture supernatant) and cell bound protein fractions collected from all three cultures.
For enrichment of the rice straw hydrolysing microbial consortium, the insect gut microbial consortium was cultured in a medium containing salt [NaCl (0.5%), K2HPO4 (0.25%)], 0.1% yeast extract, and rice straw as the main carbon source, and passaged every 7 days for 1 year. The 1-year passaged culture was evaluated for its biomass deconstruction ability and for changes in microbial community structure or diversity.
Enzyme assays using carboxymethyl cellulose (CMC) and beechwood xylan were performed as described previously with some modifications. CMC (Sigma) and beechwood xylan (HiMedia) were selected as substrates for evaluating the cellulose and hemicellulose deconstruction ability of the consortium, respectively. 250 µL of substrate (2% w/v in sodium phosphate buffer pH 7.4) was mixed with 250 µL of protein sample and incubated at 50 °C for 30 min. 500 µL of dinitrosalicylic acid (DNSA) was then added and the solution was boiled at 100 °C for 5 min. The solution was cooled to room temperature and the reducing sugar content was estimated using glucose and xylose as standards for the CMCase and xylanase assays, respectively. One unit of enzyme activity was defined as the amount of enzyme that released 1 μmol of reducing sugar per min.
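The unit definition above can be turned into a short calculation. The minimal Python sketch below fits a linear standard curve (absorbance against µmol of sugar standard) and converts a blank-corrected sample absorbance into units per mL, using the 30 min incubation and 250 µL sample volume stated in the text. The standard-curve values, absorbance readings, and function names are hypothetical, and the linear fit is an assumption for illustration rather than the authors' stated procedure.

```python
def fit_standard_curve(sugar_umol, absorbance):
    """Least-squares line: absorbance = slope * sugar_umol + intercept."""
    n = len(sugar_umol)
    if n < 2 or n != len(absorbance):
        raise ValueError("need at least two paired standard points")
    mean_x = sum(sugar_umol) / n
    mean_y = sum(absorbance) / n
    sxx = sum((x - mean_x) ** 2 for x in sugar_umol)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(sugar_umol, absorbance))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def enzyme_units_per_ml(a_sample, a_blank, slope,
                        minutes=30.0, sample_volume_ml=0.25):
    """U/mL, where 1 U releases 1 umol reducing sugar per minute.

    Subtracting the blank cancels the curve intercept, so only the
    slope is needed to convert absorbance to umol of sugar.
    """
    umol_released = (a_sample - a_blank) / slope
    return umol_released / (minutes * sample_volume_ml)


# Hypothetical glucose standards (umol per reaction) and absorbances.
slope, intercept = fit_standard_curve([0, 1, 2, 3], [0.0, 0.2, 0.4, 0.6])
activity = enzyme_units_per_ml(a_sample=0.5, a_blank=0.1, slope=slope)
print(round(slope, 3), round(activity, 3))
```

Dividing by total protein in the sample would give a specific activity, if protein concentration is measured separately.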
For the plate assay, equal volumes of CMC or xylan (1% w/v in water) and tryptic soya broth medium (2x) (with 1.5% agar and 0.5% trypan blue dye) were autoclaved separately. After autoclaving, both solutions were mixed together and poured into Petri plates in a laminar flow hood. The protein solution was applied on the surface of the solid agar plate under aseptic conditions and incubated at 37 °C. After 48 h, plates were visually inspected for clearance zone formation.
CMCase and xylanase activity using zymograms on SDS-PAGE gels was assessed as described earlier. In brief, the protein sample was resolved on a 12% SDS-PAGE gel containing either 0.5% (w/v) CMC or 0.5% (w/v) beechwood xylan. After electrophoresis, the gel was washed once with 20% (v/v) isopropanol in phosphate-buffered saline (PBS) for 1 min followed by three washes of 20 min each in PBS. The gel was incubated in PBS at 37 °C for 1 h, stained with 0.1% (w/v) Congo red for 30 min, and destained with 1 M NaCl. Clear bands against the red background indicated CMCase or xylanase activity. Protein concentrations were estimated with the bicinchoninic acid (BCA) Protein Assay kit (Pierce) using bovine serum albumin as a standard.
Microbial diversity assessment using Ion PGM sequencer platform
The original rice YSB gut consortium and the enriched consortium passaged for 1 year were processed for total DNA extraction as described in a later section. Extracted DNA was then treated with RNase, cleaned and concentrated using the Genomic DNA clean-up kit (ZymoResearch). The purified DNA was used as a template to amplify the V4 hypervariable region of the 16S rRNA gene in the consortium. Phusion High-Fidelity DNA Polymerase (Finnzymes OY, Espoo, Finland) and a primer pair covering the V4 hypervariable region (520 forward: 5′ AYTGGGYDTAAAGNG 3′, and 802 reverse: 5′ TACNVGGGTATCTAATCC 3′) were used in the amplification reaction. The amplified fragments were purified with Agencourt AMPure XP (Beckman Coulter). The quantity and quality of the purified PCR products were analyzed using an Agilent Tape Station with an Agilent DNA 1000 kit. Libraries were prepared using the Ion Plus Fragment Library Kit (Life Technologies Corporation) and barcoded using the Ion Xpress Barcode Adapters 1–16 Kit (Life Technologies Corporation). The libraries were quantified using Invitrogen Qubit, and an equimolar pool of the initial and passaged libraries with unique barcodes was generated to create the final library. Template preparation was carried out with the pooled libraries using the Ion One Touch 2 system with an Ion PGM Template OT2 400 Kit (Life Technologies Corporation). Quality control at the pre-enriched template stage was performed using the Ion Sphere Quality Control Kit (Life Technologies Corporation) and the Qubit 2.0 Fluorometer (Invitrogen). The templated libraries were sequenced using an Ion PGM sequencer platform (Thermo Fisher Scientific). Instrument cleaning, initialization, and sequencing were performed with reagents provided in the Ion PGM 400 Sequencing Kit (Life Technologies Corporation) using an Ion 314 Chip v2.
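The two V4 primers contain IUPAC ambiguity codes (Y, D, N, V), so each represents a pool of concrete sequences. The minimal Python sketch below expands such a primer into a regular expression and counts how many distinct sequences it encodes. The primer strings are copied from the text and the ambiguity table follows the standard IUPAC nucleotide codes; the function names are illustrative assumptions.

```python
import re

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def primer_to_regex(primer: str) -> str:
    """Turn a degenerate primer into a regex matching every variant."""
    parts = []
    for base in primer.upper():
        options = IUPAC[base]
        parts.append(options if len(options) == 1 else f"[{options}]")
    return "".join(parts)


def degeneracy(primer: str) -> int:
    """Number of distinct sequences the degenerate primer represents."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


FORWARD_520 = "AYTGGGYDTAAAGNG"
REVERSE_802 = "TACNVGGGTATCTAATCC"

print(primer_to_regex(FORWARD_520), degeneracy(FORWARD_520))
print(primer_to_regex(REVERSE_802), degeneracy(REVERSE_802))
```

The same regular expressions can be used to locate and trim primer sequences at the ends of reads before clustering.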
Data processing and analysis for microbial diversity
Amplicon FASTQ files were converted to FASTA and quality files using the QIIME convert_fastaqual_fastq.py script. The resulting files were quality filtered by removing reads outside the minimum (-l 180) and maximum (-L 250) read length and reads with low quality scores (Q < 25). During the split_libraries.py process, forward and reverse primer sequences were also trimmed. Filtered files were concatenated and replicated sequences with a minimum size of two were removed with VSEARCH-derep_fulllength command. OTU clustering and chimera filtering were performed using the UPARSE–cluster_otu command at 97% identity. The pipeline produced two output files, an OTU table in txt format (further converted into biom file format), and a set of representative sequences for each OTU in fasta format. The representative sequences were then assigned to taxonomy using UCLUST and the Greengenes database as a reference on QIIME (assign_taxonomy.py). Taxonomy was added to the OTU table using the biom add-metadata script. Alpha and beta diversity and taxonomy summary analyses were performed by running the default commands in QIIME. Visualization and statistical analysis were done using Prism7.
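The length and quality filter described above can be illustrated with a small stand-alone Python sketch. It is not the QIIME implementation: it assumes four-line FASTQ records with Phred+33 encoding and interprets the quality threshold as a minimum mean Phred score of 25 per read, which is an assumption made for this example. The function names and the toy reads are hypothetical.

```python
from io import StringIO


def parse_fastq(handle):
    """Yield (read_id, sequence, quality) from four-line FASTQ records."""
    while True:
        header = handle.readline().rstrip()
        if not header:
            break
        seq = handle.readline().rstrip()
        handle.readline()  # '+' separator line
        qual = handle.readline().rstrip()
        yield header[1:], seq, qual


def mean_phred(qual: str, offset: int = 33) -> float:
    """Mean Phred score of a quality string (Phred+33 assumed)."""
    return sum(ord(c) - offset for c in qual) / len(qual)


def passes_filter(seq, qual, min_len=180, max_len=250, min_q=25):
    """Keep reads within the length window and at or above min_q."""
    return min_len <= len(seq) <= max_len and mean_phred(qual) >= min_q


# Toy input: one good read, one too short, one with low quality.
toy = StringIO(
    "@good\n" + "A" * 200 + "\n+\n" + "I" * 200 + "\n"
    "@short\n" + "A" * 100 + "\n+\n" + "I" * 100 + "\n"
    "@lowq\n" + "A" * 200 + "\n+\n" + "5" * 200 + "\n"
)
kept = [rid for rid, seq, qual in parse_fastq(toy) if passes_filter(seq, qual)]
print(kept)
```

The length window matches the -l 180 and -L 250 settings quoted in the text; other definitions of the quality cut-off (for example, a sliding window) would change which reads are retained.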
Experimental design and sample collection for metatranscriptomic and metaexoproteomic study
To investigate candidate biomass deconstructing proteins/enzymes and their encoding genes, metaexoproteomics and metatranscriptomics of the stable rice YSB gut consortium were performed, respectively. Three replicate 2 L flasks containing 500 mL medium (0.5% NaCl, 0.25% K2HPO4, 1% Yeast Extract, pH 7) with 1.5% rice straw were prepared and autoclaved, inoculated with 2% YSB seed culture, and incubated at 30 °C and 150 rpm for 20 days. In addition to these three cultures, a negative control flask was also set up as outlined above, but without the addition of the YSB seed culture. 100 mL samples were collected at 3, 7, 13, and 20 days post-inoculation for protein and DNA/RNA extraction for metaexoproteomics and metatranscriptomics, respectively.
DNA and RNA extraction
Triplicate samples of DNA and RNA were extracted from all three cultures and the negative control at each timepoint by following the protocol reported previously with some modification. In brief, collected samples were spun at 12,000×g at 4 °C for 10 min. Supernatant was used for protein preparation, while pelleted biomass (microbial and rice straw) was used for DNA/RNA preparation. 0.5 g of the biomass pellet was transferred into a 2 mL microcentrifuge tube containing glass beads (0.5 g, 0.5 mm and 0.5 g, 0.1 mm), and 0.5 mL CTAB buffer (10% CTAB in 0.7 M NaCl, 240 mM potassium phosphate buffer, pH 8.0, and 1 µL β-mercaptoethanol/mL buffer) was added and vortexed. For nucleic acid extraction, 0.5 mL phenol:chloroform:isoamyl alcohol (25:24:1, pH 8.0) was added, mixed, and then homogenised using a TissueLyser II (Qiagen) for 4 × 2.5 min at a speed setting of 28 s−1. The samples were phase separated by centrifugation at 13,000×g, 4 °C for 10 min, and the resulting aqueous phase was extracted with an equal volume of chloroform:isoamyl alcohol (24:1). The nucleic acids were precipitated overnight at 4 °C from the final aqueous fraction by adding 2 volumes of precipitation solution (1.6 M NaCl, 20% PEG8000 buffer 0.1% DEPC treated). The resulting pellet was washed twice with 1 mL ice-cold 75% ethanol, air-dried, and re-suspended in 50 μL RNase/DNase free water.
Metatranscriptome (Illumina shotgun) sequencing
A sample of the extracted nucleic acids was treated to remove DNA by addition of DNase (Mo Bio, USA) as recommended by the manufacturer. Total RNA was then processed for small RNA removal and purification with the RNA Clean and Concentrator kit (Zymo Research, USA). For each timepoint, purified total RNA (0.7 µg) from each of the three biological replicates was pooled (total 2.1 µg) and processed for ribosomal RNA removal using the Ribo-Zero™ Magnetic Gold (Epidemiology) kit (Epicentre or Illumina, USA), using the protocol recommended by the manufacturer. The quality of the ribosomal RNA (rRNA)-depleted sample was analyzed using an Agilent TapeStation 2200 with High Sensitivity (HS) RNA ScreenTape (Agilent, USA). Finally, 100 ng of rRNA-depleted RNA was used for library preparation for sequencing on the Illumina HiSeq 2500 platform (Illumina, USA). For all four timepoints, the library was prepared using the TruSeq RNA Sample Prep v2 kit (Part# 15026495, Illumina) following the protocol recommended by the manufacturer. During library preparation, different indexing adapters were added to the pooled RNA samples for each of the four timepoints. These four libraries were normalized, pooled in equimolar amounts, and subsequently diluted to 10 pM.
For sequencing, rapid run mode was followed. The library template along with 1% PhiX template was hybridised onto an Illumina flow cell (single lane) placed on the cBot system, and complete cluster generation was done on the HiSeq 2500 instrument. The TruSeq Rapid PE Cluster v1 kit (Illumina) was used for cluster generation following the protocol recommended by the manufacturer. Sequencing by synthesis (SBS) chemistry was applied for clustered library sequencing using the TruSeq Rapid SBS v1 kit for 100 cycles for each paired-end read. HiSeq Control Software (HCS) 2.2.58, Real-Time Analysis software 1.18.64, and Sequencing Analysis Viewer software were used in sequencing run processing and data acquisition. Sequences were obtained in the form of reads in BCL format. Reads were demultiplexed by removing the 6 bp index using the CASAVA v1.8 program, allowing for a one base-pair mismatch per library, and converted to FASTQ format using bcl2fastq. The sequenced libraries were searched against the SILVA 115 database to identify rRNA genes using Bowtie 2 software. Those reads, as well as orphans and poor quality sequences, were removed with the next-generation sequencing Short Reads Trimmer (ngsShoRT) software. Filtered reads from all timepoints were pooled prior to assembly, and the Trinity package with a k-mer length of 43 was used for de novo assembly.
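The demultiplexing step above assigns each read to a library by its 6 bp index while tolerating one mismatch. The following minimal Python sketch shows that idea with a Hamming-distance comparison. It is not the CASAVA implementation, and the mapping of index sequences to timepoints is hypothetical, since the actual indexes used for each library are not given in the text.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def assign_sample(index_read, barcodes, max_mismatches=1):
    """Return the sample whose barcode is within max_mismatches.

    Returns None when no barcode, or more than one barcode, matches,
    so ambiguous reads are left unassigned.
    """
    hits = [sample for sample, barcode in barcodes.items()
            if hamming(index_read, barcode) <= max_mismatches]
    return hits[0] if len(hits) == 1 else None


# Hypothetical 6 bp indexes for the four timepoint libraries.
BARCODES = {
    "day3": "ATCACG",
    "day7": "CGATGT",
    "day13": "TTAGGC",
    "day20": "TGACCA",
}

print(assign_sample("ATCACG", BARCODES))  # exact match
print(assign_sample("ATCACA", BARCODES))  # one mismatch
print(assign_sample("AAAAAA", BARCODES))  # too many mismatches
```

Allowing one mismatch is only safe when every pair of indexes differs at three or more positions, which holds for the example set used here.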
Metaexoproteomics of enriched gut consortium
A sample of the biomass deconstructing enriched microbial community culture (30 mL) was collected at all four timepoints from all three biological replicates. This was centrifuged at 12,000×g at 4 °C for 10 min. Both supernatant and pelleted biomass fractions were collected to be processed for protein concentration and LC–MS/MS analysis. 3 × 5 mL of the collected supernatant was filtered through a 0.22 µm syringe filter, precipitated by addition of 100% ice-cold acetone, and incubated for 16 h at − 20 °C. The precipitated protein was collected by centrifugation at 10,000×g and washed two times with 80% ice-cold acetone. Pellets were finally air-dried and re-suspended in 0.5× phosphate-buffered saline (PBS, 68 mM NaCl, 1.34 mM KCl, 5 mM Na2HPO4, 0.88 mM KH2PO4), snap frozen, and stored at − 80 °C until processed for the next step.
The pelleted biomass fraction was presumed to contain microbes, rice straw and secreted proteins attached to both. In triplicate, 2 g of biomass was aliquoted into 50 mL tubes and washed twice with 25 mL ice-cold 0.5× PBS buffer. Washed biomass was re-suspended in 19 mL 0.5× PBS, with the addition of 10 mM freshly prepared EZ-link-Sulfo-NHS-SS-biotin (Thermo Scientific), and incubated on a rotator at 4 °C for 1 h. Samples were pelleted (10,000×g, at 4 °C for 10 min), and the supernatant discarded. The biotinylation reaction was quenched by the addition of 25 mL 50 mM Tris–Cl pH 8.0 and a further 30 min incubation with rotation at 4 °C. The soluble fraction was recovered and washed twice with 0.5× PBS, and bound proteins were liberated by resuspension in 10 mL of 2% SDS (pre-heated to 60 °C) and incubation at room temperature for 1 h with rotation. To recover the liberated biotin-labelled proteins, the samples were clarified by centrifugation (10,000×g, 4 °C for 10 min) and the supernatant was collected. The protein present in the supernatant was precipitated with ice-cold acetone and incubated at −20 °C for 16 h. The precipitate was then washed twice with 80% ice-cold acetone, air-dried and re-suspended in 1 mL 1× PBS containing 0.1% SDS. Re-suspended proteins were filtered through a 0.2 µm filter and loaded onto a HiTrap™ Streptavidin HP column (GE, Sweden) pre-packed with 1 mL Streptavidin immobilized on a Sepharose bead matrix. The column was equilibrated with 10 column volumes (CV) of 1× PBS containing 0.1% SDS (equilibration buffer). After protein loading, the column was washed with 10 CV of equilibration buffer. For elution of bound protein, 1 mL of freshly prepared 1× PBS buffer containing 50 mM DTT (elution buffer) was added to the column and incubated overnight at 4 °C before eluting.
In preparation for label-free LC–MS/MS, both the bound fraction proteins and the protein samples collected from the culture supernatant were desalted using a 7K MWCO Zeba Spin desalting column (Thermo Fisher Scientific, USA) according to the manufacturer's instructions. Protein samples were then freeze-dried and re-suspended in SDS-PAGE protein loading buffer, loaded onto 10% Bis–Tris gels and resolved for 6 min at 180 V to store protein samples in-gel. After staining, protein bands were excised and stored at −80 °C prior to LC–MS/MS analysis.
Liquid chromatography coupled tandem mass spectrometric analysis
The sliced gel pieces were subjected to tryptic digestion after reduction and alkylation. The resulting peptides were reconstituted in 0.1% trifluoroacetic acid (TFA) and processed for nano LC–MS/MS as described previously. In brief, reconstituted peptides were loaded onto a nanoAcquity UPLC system (Waters, Milford, MA, USA) equipped with a nanoAcquity Symmetry C18, 5-μm trap (180 μm × 20 mm) and a nanoAcquity BEH130 1.7-μm C18 capillary column (75 μm × 250 mm). The trap was washed for 5 min with 0.1% aqueous formic acid at a flow rate of 10 μL/min before switching flow to the capillary column. Separation on the capillary column was achieved by gradient elution of two solvents (solvent A: 0.1% formic acid in water; solvent B: 0.1% formic acid in acetonitrile) at a flow rate of 300 nL/min. The column temperature was 60 °C, and the gradient profile was as follows: initial conditions 5% solvent B (2 min), followed by a linear gradient to 35% solvent B over 20 min and then a wash with 95% solvent B for 2.5 min. The nanoLC system was interfaced with a maXis liquid chromatography coupled to tandem mass spectrometry (LC-Q-TOF) system (Bruker Daltonics) with a nanoelectrospray source fitted with a steel emitter needle (180 μm o.d. × 30 μm i.d.; Proxeon). Positive electrospray ionization (ESI)-MS and MS/MS spectra were acquired using AutoMSMS mode. Instrument control, data acquisition, and processing were performed using Compass 1.3 SP1 software (microTOF control, HyStar, and Data Analysis software; Bruker Daltonics). The following instrument settings were used: ion spray voltage = 1400 V; dry gas 4 L/min; dry gas temperature = 160 °C and ion acquisition range m/z 50–2200.
AutoMSMS settings were as follows: MS = 0.5 s (acquisition of survey spectrum); MS/MS [collision induced dissociation (CID) with N2 as collision gas]; ion acquisition range, m/z = 350–1400; 0.1-s acquisition for precursor intensities above 100,000 counts; for signals of lower intensities down to 1000 counts, acquisition time increased linearly to 1.5 s; the collision energy and isolation width settings were automatically calculated using the AutoMSMS fragmentation table; 3 precursor ions, absolute threshold 1000 counts, preferred charge states, 2–4; singly charged ions excluded. Two MS/MS spectra were acquired for each precursor and former target ions were excluded for 60 s.
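The intensity-dependent MS/MS acquisition time described above can be sketched as a simple interpolation. This is only an illustration of the stated settings, not the instrument's control logic, and it assumes that the interpolation is linear in raw precursor intensity (rather than, for example, log intensity), which the text does not specify. The function name is hypothetical.

```python
def msms_acquisition_time(intensity: float,
                          low_counts: float = 1_000,
                          high_counts: float = 100_000,
                          slow_s: float = 1.5,
                          fast_s: float = 0.1):
    """Return the MS/MS acquisition time in seconds for a precursor.

    Precursors below the absolute threshold are not selected (None).
    Between the two intensity bounds the time is interpolated linearly
    in raw intensity, which is an assumption of this sketch.
    """
    if intensity < low_counts:
        return None
    if intensity >= high_counts:
        return fast_s
    fraction = (intensity - low_counts) / (high_counts - low_counts)
    return slow_s + fraction * (fast_s - slow_s)


for counts in (500, 1_000, 50_500, 100_000, 250_000):
    print(counts, msms_acquisition_time(counts))
```

The effect is that weak precursors are given longer accumulation times to improve spectral quality, while abundant precursors are sampled quickly so that more ions can be fragmented per cycle.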
Acquired MS/MS data were searched against the previously prepared YSB metatranscriptome database using the Mascot search engine (Matrix Science Ltd., version 2.4) through the Bruker ProteinScape interface (version 2.1). The following parameters were applied: tryptic digestion, carbamidomethyl cysteine as a fixed modification, and oxidized methionine and deamidation of asparagine and glutamine as variable modifications. A maximum of one missed cleavage was allowed. The peptide mass tolerance was set to 10 ppm and the MS/MS fragment mass tolerance was set to 0.1 Da. The protein false discovery rate (FDR) was adjusted to 1%. A minimum of two significant peptides and one unique peptide were required for each identified protein.
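As a worked illustration of two of the criteria above, the following sketch shows how a 10 ppm precursor tolerance and the peptide-count requirement might be expressed. It does not reproduce Mascot or ProteinScape behaviour; the function names and the example m/z values are hypothetical.

```python
def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz: float, theoretical_mz: float,
                     tol_ppm: float = 10.0) -> bool:
    """True if the absolute mass error does not exceed tol_ppm."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


def passes_protein_filter(n_significant: int, n_unique: int) -> bool:
    """Require at least two significant peptides and one unique peptide."""
    return n_significant >= 2 and n_unique >= 1


# Hypothetical values: a peptide with theoretical m/z 1000.0000.
print(within_tolerance(1000.008, 1000.0))   # about 8 ppm, accepted
print(within_tolerance(1000.012, 1000.0))   # about 12 ppm, rejected
print(passes_protein_filter(n_significant=3, n_unique=1))
print(passes_protein_filter(n_significant=1, n_unique=1))
```

At m/z 1000, a 10 ppm window corresponds to ±0.01 m/z units, which is considerably tighter than the 0.1 Da window applied to fragment ions.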
Bioinformatic analysis of metaexoproteomes
Nucleotide sequences of contigs matched to observed proteins by Mascot were retrieved from the metatranscriptomic databases using standalone BLAST 2.2.30+. The EMBOSS application was used to generate all possible open reading frames (ORFs) from these matched contigs, defined as any region > 300 bases between a start (ATG) and a stop codon. These ORF libraries were converted into amino acid sequences, and the resulting proteins were annotated using BLASTP searches against the non-redundant NCBI database with an E value threshold of 1 × 10⁻⁵. Protein sequences were also annotated using dbCAN to identify likely carbohydrate-active domains. Subcellular localisation was predicted using the SignalP v. 4.1 program with the default cut-off value.
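The ORF definition given above (a region of more than 300 bases between an ATG and a stop codon) can be sketched in a few lines. This is a simplified stand-in and not the EMBOSS implementation: it assumes that the ORF length is counted from the ATG through the stop codon inclusive, that the first in-frame ATG after a stop is taken as the start, and that ORFs running off the end of a contig are ignored.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOPS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_strand(seq: str, min_len: int = 300):
    """Yield ATG-to-stop regions longer than min_len bases in three frames."""
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOPS:
                if i + 3 - start > min_len:
                    yield seq[start:i + 3]
                start = None


def find_orfs(contig: str, min_len: int = 300):
    """Collect ORFs from all six reading frames of a contig."""
    contig = contig.upper()
    found = list(orfs_in_strand(contig, min_len))
    found += list(orfs_in_strand(reverse_complement(contig), min_len))
    return found


# Synthetic contig for illustration: one 306-base ORF flanked by spacer bases.
contig = "CC" + "ATG" + "GCT" * 100 + "TAA" + "GG"
for orf in find_orfs(contig):
    print(len(orf), orf[:9], orf[-3:])
```

Scanning both strands matters for assembled transcripts because the orientation of a de novo contig relative to the original mRNA is not guaranteed.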
Functional validation of rice YSB gut symbionts’ xylanase affiliated to family GH10
The open reading frame (1416 bp) of the metatranscriptome-assembled contig c64390_g1_i1, encoding a putative endoxylanase of CAZy family GH10, was selected for functional validation in Escherichia coli. The encoded protein was 471 amino acids long, including an N-terminal signal peptide of 35 amino acids. For recombinant expression, the encoding gene without the signal peptide (1320 bp) was codon optimized and synthesised commercially (Genscript), and subcloned into the pET30a vector at the NdeI and HindIII sites. This construct was transformed into the BL21(DE3) and SHuffle (NEB) strains of E. coli. Expression profiles for both expression hosts were evaluated by SDS-PAGE and, owing to the higher expression level of soluble target protein in SHuffle cells, these cells were selected for scaled-up protein expression in a 2 L culture, followed by affinity purification of the recombinant xylanase using Ni–NTA agarose matrix (Qiagen). The concentration of the purified protein was determined using the BCA Protein Assay kit as described earlier.
The enzymatic activity of the purified protein was tested for its ability to hydrolyse CMC (carboxymethyl cellulose, Sigma), PASC (phosphoric acid swollen cellulose prepared from Avicel PH-101, Sigma) and xylan (beechwood xylan, HiMedia). The reducing sugars released when the recombinant protein was incubated with the different substrates were measured by the dinitrosalicylic acid (DNSA) method as described previously. Briefly, a crude enzyme solution (0.125 mL) was mixed with 0.125 mL of a 2% substrate solution in 20 mM Tris–Cl pH 7.0 buffer and incubated at 50 °C for 30 min. Enzymatic reactions against PASC were incubated for 60 min. The reducing sugar produced in these experiments was measured with the DNSA reagent at 540 nm. One unit of enzymatic activity was defined as the amount of enzyme that released 1 µmol of reducing sugar from the substrate per minute under the above conditions.
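The unit definition above translates directly into a short calculation. The sketch below assumes that the amount of reducing sugar has already been read from a DNSA standard curve; the function names and the example values are hypothetical and are not measurements from this study.

```python
def units_per_ml(sugar_umol: float, minutes: float, enzyme_ml: float) -> float:
    """Activity in U/mL: µmol reducing sugar released per minute per mL enzyme."""
    if minutes <= 0 or enzyme_ml <= 0:
        raise ValueError("incubation time and enzyme volume must be positive")
    return sugar_umol / minutes / enzyme_ml


def specific_activity(u_per_ml: float, protein_mg_per_ml: float) -> float:
    """Specific activity in U/mg protein."""
    if protein_mg_per_ml <= 0:
        raise ValueError("protein concentration must be positive")
    return u_per_ml / protein_mg_per_ml


# Hypothetical example: 1.5 µmol of reducing sugar released in a 30 min
# reaction containing 0.125 mL of enzyme at 0.05 mg protein per mL.
activity = units_per_ml(sugar_umol=1.5, minutes=30, enzyme_ml=0.125)
print(f"{activity:.2f} U/mL")
print(f"{specific_activity(activity, 0.05):.1f} U/mg")
```

For PASC, the same calculation applies with the incubation time set to 60 min.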
Determination of optimal reaction conditions, kinetic parameters and biomass hydrolysis capability of recombinant RSB_GH10_Xylanase
The optimum temperature for maximum xylanase activity was determined by varying the enzymatic reaction temperature in the range of 40–100 °C. For optimum pH assessment, purified protein was dialysed against buffers ranging in pH from 4 to 9. The buffer for the pH range 4–6 was 20 mM citrate buffer containing 150 mM NaCl, while the buffer for the pH range 7–9 was 20 mM Tris–Cl containing 150 mM NaCl. Activity assays were performed as described above.
The kinetic parameters of the recombinant xylanase were determined using beechwood xylan at substrate concentrations ranging from 0.5 to 10 mg/mL in 20 mM phosphate buffer (pH 7.0) at 60 °C. The kinetic constants, KM and Vmax, were estimated using GraphPad Prism 7.02 (GraphPad Software, Inc., San Diego, CA).
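For readers without access to GraphPad Prism, the estimation of KM and Vmax can be illustrated with a linearised fit. The sketch below uses the Hanes-Woolf transformation (S/v = S/Vmax + KM/Vmax) with ordinary least squares; this is a deliberately simpler technique than the nonlinear regression normally performed by Prism and may weight the data differently. The velocities are synthetic values generated from assumed parameters, not data from this study.

```python
def michaelis_menten(s: float, vmax: float, km: float) -> float:
    """Initial velocity predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)


def hanes_woolf_fit(substrate, velocity):
    """Estimate (Vmax, KM) by least squares on S/v = S/Vmax + KM/Vmax."""
    xs = list(substrate)
    ys = [s / v for s, v in zip(substrate, velocity)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                  # equals 1 / Vmax
    intercept = mean_y - slope * mean_x  # equals KM / Vmax
    vmax = 1.0 / slope
    km = intercept * vmax
    return vmax, km


# Synthetic data: substrate in mg/mL across the range used in the assay,
# velocities generated from assumed Vmax = 100 and KM = 2.0 (arbitrary units).
substrate = [0.5, 1.0, 2.0, 4.0, 6.0, 8.0, 10.0]
velocity = [michaelis_menten(s, vmax=100.0, km=2.0) for s in substrate]

vmax_est, km_est = hanes_woolf_fit(substrate, velocity)
print(f"Vmax = {vmax_est:.2f}, KM = {km_est:.2f} mg/mL")
```

With noise-free data the linearised fit recovers the generating parameters; with real measurements, nonlinear regression on the untransformed data is generally preferred because the transformation distorts the error structure.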
Rice straw deconstruction by recombinant RSB_GH10_Xylanase was determined as follows. Sodium hydroxide-treated and untreated rice straw (kindly provided by Prof. Arvind Lali) were deconstructed by incubating 16 mg of straw with 30 µg of purified recombinant xylanase for 8 h at 60 °C. After incubation, the reaction mixture was centrifuged at 20,000×g for 15 min, and the supernatant was filtered through a 0.45 µm filter and analyzed on an Aminex column (Bio-Rad) using xylotetraose, xylotriose, xylobiose and xylose as standards. Biomass incubated with buffer and protein incubated with buffer were used as negative controls.
The authors thank Prof. Arvind M. Lali, ICT, Mumbai for providing untreated and alkali-treated rice straw, Dr. Raj Bhatnagar for help in procuring rice stem borer and Dr. Jyothilakshmi, NIPGR for providing cutworm.
SSY, NCB and SMQM conceived the idea, designed and coordinated the study, and provided expertise; SSY and RS wrote the manuscript, which was edited and revised by NCB and JPB; RS and DE designed and performed YSB consortium cultivation, rice straw deconstruction and enzyme assays. RS and JPB prepared samples for the metatranscriptome and metaexoproteome, respectively; AAD performed the mass spectrometry and assisted with the MS/MS analysis. JPB, RS and MG analyzed metatranscriptome and metaexoproteome data; RS and JPB prepared samples for microbial diversity analysis, and the data were processed and analyzed by MG and AMA. RS and MS expressed the xylanase gene in E. coli for functional validation. All authors reviewed the results. All authors read and approved the final manuscript.
This work was funded by the Department of Biotechnology (DBT), Government of India Grants BT/IN/INDO-UK/SuBB/21/SSY/2013 & BT/PB/Center/03/2011 and Biotechnology and Biological Sciences Research Council (BBSRC) Grants BB/K020358/1 & BB/I018492/1.
Ethics approval and consent to participate
Consent for publication
All authors agreed to publish this article.
The authors declare that they have no competing interests.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.