Background

The genus Pandoraea is a recently classified genus, proposed in the year 2000. Bacteria belonging to the genus Pandoraea are Gram-negative, non-sporulating, motile bacteria with a single polar flagellum [1]. The genus belongs to the family Burkholderiaceae and the class β-proteobacteria. Members of Pandoraea were earlier misidentified and grouped together with Burkholderia or Ralstonia [1]. This genus contains five species (Pandoraea pnomenusa, Pandoraea sputorum, Pandoraea norimbergensis, Pandoraea apista, and Pandoraea pulmonicola) and four genomospecies of thiosulfate-oxidizing (Pandoraea thiooxydans) and oxalate-oxidizing (Pandoraea vervacti, Pandoraea faecigallinarum, and Pandoraea oxalativorans) species. Pandoraea is a taxonomically distinct genus with close similarity to Burkholderia and Ralstonia. Pandoraea has been isolated from various environments such as soil, landfill sites, sediments, clinical samples (only P. apista, P. pnomenusa, and P. sputorum to date), and water [1,2,3,4]. Burkholderia and Ralstonia are well-explored and established genera with promising environmental and industrial applications. Pandoraea is a relatively new genus, so very few findings are available about its biotechnological potential. Species from this genus have been documented for the utilization of polychlorinated biphenyl, dichloromethane, dyes, lignin, oxalate, and thiosulfate, and for quorum sensing [3,4,5,6]. At present, genomic insights into Pandoraea are limited, and such studies would eventually help to widen the biotechnological prospects of this genus.

Lignin is a complex aromatic heteropolymer and the most abundant aromatic polymer available on earth. In nature, lignin is degraded mainly by bacteria and fungi. Fungi have been studied extensively for lignin degradation, whereas only a few bacterial species have been reported to degrade lignin [7, 8]. Compared to fungi, bacteria offer advantages: their genomes are small, and genetic manipulation and large-scale recombinant expression of important enzymes can be performed with greater ease. Therefore, the focus has again shifted to bacteria for the identification of novel strains and enzymes for lignin degradation. The discovery of novel ligninolytic microbes and enzymes, and their biochemical characterization, will help in the deconstruction of biomass for application in the biofuel and bioproduct industries [6, 9,10,11]. The application of advanced ‘omics’ approaches such as genomics, transcriptomics, and proteomics to individual microbial strains or microbial communities will help in the identification and functional characterization of novel ligninolytic enzymes in the near future [12,13,14]. With the increase in genomic data for bacteria and fungi, the biomass-degrading potential across different taxa can be identified, which will further enhance our understanding of lignin degradation [12, 13]. Lignin-degrading isolates belong to actinobacteria, alpha proteobacteria, beta proteobacteria, gamma proteobacteria, delta proteobacteria, bacteroides, and archaea [7]. Novel bacterial enzymes responsible for lignin degradation and their mechanisms of action have also been described [15]. In recent years, LC–MS-based proteomics studies have been widely performed. Quantitative LC–MS-based proteomics methods, such as label-free and iTRAQ labeling-based quantification, are generally used to identify novel enzymes and their level of expression in a particular process [16,17,18].

We have earlier sequenced the genome of Pandoraea sp. ISTKB, and the sequence has been submitted to NCBI under accession number MAOS00000000.1, which is openly available [19]. In the present study, we describe a comprehensive analysis of the Pandoraea sp. ISTKB genome. Bioinformatics analysis was performed to identify a large set of genes and pathways putatively responsible for lignin degradation and PHA production. The important gene clusters responsible for lignin degradation and PHA production are also highlighted. This strain has already been shown to utilize monoaromatic lignin derivatives with greater ease than polymeric kraft lignin for PHA production [20]. Therefore, a proteomic study of Pandoraea sp. ISTKB was performed to identify the set of proteins expressed during growth on the monoaromatic vanillic acid (VA) and on the aromatic polymer kraft lignin (KL), proteins that could be overexpressed for enhanced KL utilization. VA was selected because the degradation of most lignin linkages proceeds through the generation of vanillin or VA as a nodal point [21]. Proteomic studies provide insight into the protein profile and also complement the genomic analysis. Genomic and proteomic analyses would enable us to understand the novel enzymes and pathways responsible for lignin degradation and biovalorization.

Results

Salient features of Pandoraea sp. ISTKB genome

Pandoraea sp. ISTKB was previously characterized for lignin degradation and successfully applied for pretreatment of sugarcane bagasse and polyhydroxyalkanoate (PHA) production [6, 20, 22]. The genome of Pandoraea sp. ISTKB is 6.37 Mb in size, sequenced at 65× coverage, with a GC content of 62.05% and 5356 predicted protein-coding genes [prokaryotic genome annotation pipeline (PGAP) and Pfam annotation]; the other general genome features have been reported earlier [19]. Among the predicted proteins, 1740 were categorized as hypothetical proteins, and 456 were identified as having signal sequences. A circular map of genomic features provides a space-efficient and clear representation of gene arrangement on the genome, as shown in Fig. 1. The annotation of important genes and pathways related to lignin or aromatic compound degradation is also represented in the circular plot. KEGG–KAAS pathway analysis of protein-coding genes from Pandoraea sp. ISTKB categorized 2590 genes into 22 different functional KAAS pathways (Additional file 1: Table S1). KEGG predicted 148 proteins responsible for the degradation and metabolism of aromatic and xenobiotic compounds. Annotation and analysis by RAST predicted 5658 coding genes, and 48% of the coding genes were classified into 26 subsystem features. The percent contribution of genes present in different functional groups in the subsystem features is represented in Fig. 2. The subsystem feature counts showed a dominance of general processes related to carbohydrate, amino acid, cell wall component, prosthetic group, cofactor, protein, and lipid metabolism. After these normal cellular processes, the subsystem feature counts are dominated by membrane transport, aromatic compound metabolism, respiration, stress response, regulation, and cell signaling.
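
The GC content quoted above, and the GC skew track drawn in the circular map, are simple base-composition statistics. The following is a minimal Python sketch of how they are commonly computed, assuming the conventional definitions GC content = (G + C)/(A + C + G + T) and GC skew = (G − C)/(G + C) over non-overlapping windows. The function names, the window size, and the toy sequence are hypothetical illustrations and are not taken from the annotation pipeline used in this study.

```python
def gc_content(seq: str) -> float:
    """Return the GC fraction (0-1) of a DNA sequence, ignoring non-ACGT symbols."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    a, t = seq.count("A"), seq.count("T")
    total = a + c + g + t
    return (g + c) / total if total else 0.0


def gc_skew(seq: str, window: int) -> list:
    """Return (G - C) / (G + C) for consecutive non-overlapping windows."""
    seq = seq.upper()
    skews = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        # Windows without any G or C are assigned a skew of 0.0.
        skews.append((g - c) / (g + c) if (g + c) else 0.0)
    return skews


# Hypothetical toy sequence, not genome data from this study.
toy = "GGGCATGCCCAT"
print(round(gc_content(toy) * 100, 2))
print(gc_skew(toy, window=6))
```

On a real assembly the same functions would be applied per scaffold, with a window of a few kilobases rather than six bases.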

Fig. 1

Circos plot of genes compared with the genome for Pandoraea sp. ISTKB. Circles from outside to inside represent: a scaffold arrangement, b gene position on the scaffolds, c GC skew, and d GC content. The inner links give a syntenic representation of genes associated with the selected pathways in Pandoraea sp. ISTKB. Genes associated with different pathways are shown with different colors and shapes

Fig. 2

Classification of proteins in subsystem features and their abundance in different functional groups shown in Pandoraea sp. ISTKB

Gene ontology (GO) analysis was performed to gain functional information about predicted proteins in the genome. The analysis provided information about distribution of genes among various metabolic processes, cellular functions, and molecular components in the genome of Pandoraea sp. ISTKB (Fig. 3). In the biological processes, the organic substance metabolic process was found to be the dominant process. Molecular functions analysis revealed the major distribution of proteins into three important functions, i.e., organic cyclic compound binding, heterocyclic compound binding, and oxidoreductase activity. Abundance of ion binding and small molecule-binding proteins indicates their role in transcriptional regulation and transportation of molecules across cell membrane. Representation of transferase and hydrolase in good proportion indicates their assistance during metabolism of organic compounds.

Fig. 3

GO analysis of Pandoraea sp. ISTKB genome and classification of genes into biological processes, cellular components, and molecular functions

Metabolism, respiratory mechanism, transporters, and transcriptional factors in Pandoraea sp. ISTKB genome

Pandoraea sp. ISTKB can metabolize diverse substrates, including five- and six-carbon sugars. As predicted by KEGG, this bacterium carries pathways for the metabolism of monosaccharides (galactose, mannose, and fructose), disaccharides (sucrose), polysaccharides (starch), glucuronate, ascorbate, aldarate, amino sugars and nucleotide sugars, propionate, and butanoate. The strain can also utilize pentoses (xylose, xylulose), C5-branched dibasic acids, glyoxylate, dicarboxylates, and pyruvate. The growth of this strain was observed to be poor on glucose, and the KEGG pathway analysis of carbohydrate metabolism supported this observation. Analysis of the respiratory mechanism showed various terminal electron acceptors, electron donors, and other genes related to respiration. The abundance of formate dehydrogenase, quinone oxidoreductase family proteins, oxidoreductases, ubiquinol oxidase, soluble cytochromes, and other related electron carriers highlights their importance in the metabolism of various recalcitrant compounds (Additional file 1: Figure S1). There were 346 transcriptional factors identified in the genome, among which the LysR family was dominant. Transcriptional regulator families related to the metabolism of aromatic compounds, such as GntR, MarR, IclR, XRE, aromatic hydrocarbon utilization, anaerobic benzoate metabolism, and organic hydroperoxide regulators, are also present in this strain (Additional file 1: Figure S2). There are 587 transporters identified in the genome, of which 279 are ABC family transporters. This family represents almost half of the total transporters present in the genome and is dominant, followed by two-component system and MFS transporters (Additional file 1: Figure S3).
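
The statement that ABC transporters make up almost half of all transporters follows directly from the two counts given above (279 of 587). A small Python sketch of that arithmetic is shown here; the helper name `percent_share` is a hypothetical illustration, and only the two counts come from the text.

```python
def percent_share(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    if whole <= 0:
        raise ValueError("whole must be a positive count")
    return 100.0 * part / whole


# Counts quoted in the text.
total_transporters = 587
abc_transporters = 279

abc_share = percent_share(abc_transporters, total_transporters)
print(f"ABC family: {abc_share:.1f}% of annotated transporters")
```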

Metabolism of aromatic compounds

Pandoraea sp. ISTKB genes were annotated and classified into pathways involved in lignin or aromatic compound degradation by KEGG pathway analysis, BLAST search against the ‘nr’ database, and the subsystem features of RAST. There were 42 dioxygenases, 25 monooxygenases, 17 peroxidases (including one DyP-type peroxidase), and 2 laccases discovered in the genome (Additional file 1: Figure S4; Tables S2, S3, and S4). The presence of various oxidoreductases [grouped into FAD, NAD(P)H, SDR, GMC, YggW, quinone, pyridine nucleotide–disulfide, flavin, Fe–S, and unclassified oxidoreductases], reductases, dehydrogenases, esterases, thioesterases, transferases, and hydrolases has also been observed.

The pathway analysis revealed genes responsible for lignin degradation and diverse aromatic compound metabolism (Fig. 4). Genes responsible for funneling lignin or aromatic components through peripheral degradation pathways have been observed. Genes related to pathways for the degradation of vanillin, ferulate, biphenyl, phenylpropanoic acid, benzoyl-CoA-mediated degradation, phenylacetate, and phenol were observed, and their abundance is depicted in Fig. 4 and Additional file 1: Table S5. Subsystem feature analysis identified genes annotated as ‘lignin degradation fragments’ that are responsible for lignin metabolism; these are discussed as a cluster in a later section. The KEGG analysis indicates that this strain can utilize various xenobiotic compounds such as benzoate derivatives (amino, ethyl, p-hydroxy, and fluoro), BTX, salicylate esters, quinate, pesticides, PAHs, synthetic aromatic monomers, furfural, and steroids. The degradation of lignin and xenobiotic aromatic compounds results in the generation of a restricted set of common central intermediates (catechol, protocatechuate, and gentisate) that are further metabolized by the beta-ketoadipate and aromatic ring-cleaving pathways. The genes responsible for the degradation of central intermediates were identified in abundance (Fig. 4 and Additional file 1: Table S6). The genes observed in the central intermediate pathways can metabolize common aromatic intermediates through both ortho and meta cleavage pathways [23]. The genes responsible for the metabolism of central intermediates such as catechol, protocatechuate, salicylate, homogentisate, and N-heterocyclic aromatic compounds, and for meta cleavage pathways, were also identified.

Fig. 4

Predicted lignin and aromatic compounds degradation genes and their number responsible for funneling into peripheral pathways and central intermediate metabolism

Identification of stress response genes, secondary metabolites, and genomic islands

Lignin or aromatic compound degradation requires concerted action of various oxidoreductases. The degradation process generates free radicals and reactive intermediates and their removal or transformation into stable and less toxic component is essential for cell survival. Genome analysis identified various proteins related to stress response and detoxification mechanisms (Additional file 1: Figure S5 and Table S7). The presence of superoxide dismutase, catalases, glutathione, thioredoxin, peroxiredoxins, glyoxylases, rubrerythrin, glutaredoxins, aldo/keto reductase, and alkyl hydroperoxidase highlights this strain’s arsenal against oxidative stress, protection from reactive species and detoxification of toxic components during aromatic metabolism [24, 25].

Nine gene clusters were identified in the genome of Pandoraea sp. ISTKB; their contigs and positions are given in Additional file 1: Table S8. Secondary metabolite cluster analysis identified some novel metabolites that are specific to Pandoraea sp. ISTKB. These clusters included genes responsible for the synthesis of terpenes, nonribosomal peptides, thailanstatin/mangotoxin, arylpropane, two homoserine lactones, phosphonate–terpene, bacteriocin, and lassopeptide. Cluster 9 (lassopeptide), cluster 2 (NRPS), and cluster 4 (arylpropane) were found to be unique to this strain: cluster 9 did not show any match with the genus Pandoraea or the genus Burkholderia, whereas clusters 2 and 4 each showed only one match with Burkholderia. Clusters 1 (terpenes), 3 (thailanstatin/mangotoxin), and 5 (homoserine lactone) are distributed among the genera Pandoraea and Burkholderia. Moreover, clusters 6 (phosphonate–terpene), 7 (bacteriocin), and 8 (homoserine lactone) are highly represented in the genus Pandoraea. The novel clusters 9 (lassopeptide), 2 (NRPS), and 4 (arylpropane) can prove to be significant, as they are unique to this strain.

There were 12 genomic islands identified in the genome, and these are dominated by hypothetical proteins (Additional file 1: Figure S6 and Additional file 2: Table S9). The other proteins present were related to DNA replication, cell division and partitioning, transposition, recombination, phage-mediated integration, repair, and DNA-binding response regulators. Various proteins identified in the islands play important roles in stress response, detoxification mechanisms and their regulation, electron transport, antibiotic resistance, metal resistance, and transportation of molecules across the cell membrane. Proteins related to phosphate and sulfur metabolism, and a few for aromatic compound degradation, were also observed.

Identification of gene clusters for the degradation of lignin derivatives and PHA production

Two gene clusters responsible for the degradation of lignin derivatives have been identified, and the order of gene arrangement in each cluster is shown in Fig. 5a, b. The first cluster, the ‘lignin degradation fragment’ predicted by RAST, contains genes responsible for protocatechuate meta cleavage-mediated degradation of lignin derivatives. A LysR family transcriptional regulator for aromatics can be observed in the cluster. ABC transporters and an MFS transporter, which might regulate the movement of aromatic compounds across the cell membrane, were also present in this cluster. The benzoyl formate decarboxylase present in the cluster is known for the degradation of benzene, xylene, and toluene. The second cluster contains genes mainly responsible for the degradation of vanillic acid. ABC transporters for regulating the movement of molecules can also be observed in this cluster. This cluster also contains glutathione peroxidase, dehydrogenases, and glyoxylase, which play an important role in protection from oxidative damage by detoxifying reactive intermediates such as methylglyoxal and other aldehydes formed during the metabolism of aromatic compounds [25].

Fig. 5

Gene clusters with contig number 40.1 and 13.1 identified in Pandoraea genome responsible for lignin degradation represented as a and b. The size of DNA fragment selected for cluster analysis is between 12 and 17 Kb

PHA is a carbon and energy reserve accumulated by microbes under nutrient imbalance conditions [26]. We have earlier characterized PHA production by strain ISTKB growing on lignin and its derivatives (as sole carbon source), and the genes responsible for PHA synthesis have been identified in the genome [20]. Here, the arrangement of the PHA biosynthetic genes in clusters was analyzed in detail (Fig. 6a–c). The clusters were identified as regions spanning a PHA synthase (polymerase) gene annotated in the genome. The first cluster revealed the presence of the complete set of genes (acetoacetyl-CoA reductase, β-ketothiolase, PHA polymerase, and regulatory protein) responsible for short-chain PHA production. In the second cluster, PHA polymerase was followed by acetoacetyl-CoA reductase, but β-ketothiolase was missing; β-ketothiolase was, however, present in multiple copies elsewhere in the genome. This cluster is dominated by stress-responsive proteins primarily related to heavy metal or multidrug efflux systems. The third cluster contains only PHA synthase; the genes surrounding the polymerase in this cluster are predominantly related to oxidative stress, such as thiol-disulfide interchange protein, protein disulfide reductase, thioredoxin, two-component system response regulator protein, sensory proteins, secretory proteins, and ABC-type multidrug permeases.

Fig. 6

Gene clusters with contig numbers 23.1, 34.1, and 48.1 identified in Pandoraea genome responsible for PHA production represented as a–c. The size of DNA fragment selected for cluster analysis is between 12 and 17 Kb

Proteomics analysis on kraft lignin and vanillic acid

Proteomic analysis was performed to identify the proteins expressed on the monoaromatic compound vanillic acid and the polyaromatic compound kraft lignin. The identification of important proteins responsible for polymeric lignin degradation, and their overexpression, will provide opportunities for lignin valorization. A total of 2484 proteins were detected during LC–MS analysis, covering almost 44.61% of the total protein-coding genes present in the genome. There were 2318 proteins common to both KL and VA, and 166 proteins were found to be expressed either on KL or on VA. Among the 166 expressed proteins, 74 were expressed on VA and 78 on KL, as shown in Fig. 7a, b. GO analysis was performed on the proteins expressed on KL and VA to obtain an overview of the functional information about the proteins involved in various biological processes, cellular components, and molecular functions.

Fig. 7

a Venn diagram showing total number of proteins expressed on kraft lignin and vanillic acid and their distribution among KL and VA. b Heat map showing differential expression of relevant proteins on kraft lignin–vanillic acid that are responsible for lignin degradation

The GO analysis of the genome was supported by the proteomic data on KL and VA, especially for biological processes and molecular functions (Fig. 8). In the molecular functions category, proteins for catalytic activity, heterocyclic compound binding, organic compound binding, and transcription factor activity were abundant on KL and absent on VA. Single-organism process was found to be dominant on both KL and VA (after normal cellular and metabolic processes), indicating processes specific to this strain. The proteins involved in the localization process on VA were almost double those on KL. Membrane proteins were present on both KL and VA, but their representation on VA was more than double that on KL, and transporters were also expressed more on VA.

Fig. 8

GO analysis of protein expressed by Pandoraea sp. ISTKB while growing on KL and VA. The expressed proteins were classified into biological processes, cellular components, and molecular functions

Expressed proteins involved in lignin or aromatic compound degradation

The proteomic profile of Pandoraea sp. ISTKB revealed relevant proteins expressed only on KL or VA (Table 1) and on KL–VA, as represented in Tables 2, 3, and 4. There are 17, 29, and 394 uncharacterized proteins observed on KL, VA, and KL–VA, respectively. Various functionally active oxidoreductases, methyltransferases, hydrolases, isomerases, dehydrogenases, reductases, transferases, esterases, transporters, transcriptional factors, and stress response and detoxification-related proteins were observed that could play an important role in the degradation of lignin or aromatic compounds.

Table 1 Identification of relevant proteins expressed only on kraft lignin (KL) or vanillic acid (VA) that can assist in lignin degradation
Table 2 Differentially expressed proteins for phenylacetic acid, benzene degradation, and various oxidoreductases on kraft lignin
Table 3 Differentially expressed antioxidant and stress response proteins on kraft lignin
Table 4 Differentially expressed reductase, dehydrogenase, transferase, and hydratase proteins on kraft lignin

Important proteins expressed either on kraft lignin or on vanillic acid

The analysis of the expression profile on KL revealed the presence of 1,2-phenylacetyl-CoA epoxidase (monooxygenase), phenylacetic acid degradation protein, and 2-hydroxyhepta-2,4-diene-1,7-dioate isomerase, enzymes for the degradation of phenylacetate. Proteins such as benzoyl-CoA oxygenase, enoyl-CoA hydratase, tryptophan 2,3-dioxygenase, and salicylate hydroxylase were also active on KL. Proteins for methyl group transfer and decarboxylation, such as SAM-dependent methyltransferase, pyruvate ferredoxin oxidoreductase, and (2Fe–2S)-binding protein, were also observed. The generation of reactive intermediates and their detoxification were represented by oxidative stress-resistance protein, glycolate oxidase, and NADPH:quinone reductase. Glycine betaine ABC transporter substrate-binding protein and formyl-CoA:oxalate CoA-transferase (FCOCT), proteins for osmoprotection and acid response regulation, were present to maintain the smooth functioning of the intracellular environment. Six LysR family, two unclassified, and one each of GntR family, AsnC family, Cd(II)/Pb(II)-responsive, Crp/Fnr family, MarR, and MerR transcriptional regulators were found on KL. The VA profile was mainly dominated by transporters and stress response proteins [glutathione S-transferase, Rieske (2Fe–2S) protein, thioesterase, glycine betaine permease, and alkene reductase]. A methyltransferase, an aminomethyltransferase, and a LysR family transcriptional regulator were also observed.

Proteins differentially expressed on kraft lignin and vanillic acid

There were 1979 proteins obtained on KL–VA after normalization; among these, 1110 proteins were upregulated and 869 were downregulated on kraft lignin. There are 164 transporters detected, of which 127 are ABC, 5 RND, and 4 MFS. There are 163 transcription factors identified, comprising 34 LysR family, 21 GntR family, 17 TetR family, and 12 each of the MarR and IclR families. Here, we discuss the important proteins that can perform lignin degradation and transformation. Some of the differentially expressed proteins that may be involved in lignin degradation are shown in Fig. 7b. Various oxidoreductases, dehydrogenases, reductases, transferases, PHA biosynthetic proteins, and several stress response and detoxification proteins were detected in the expression profile. The phenylacetic acid degradation protein and proteins for ‘CoA’-mediated degradation of phenylacetate, phenylpropionate, and benzoate were found to be upregulated on kraft lignin. The DyP-type peroxidase, peroxidase-like proteins, and various accessory enzymes such as aldehyde oxidase, glycolate oxidase, cytochrome c oxidase, oxidase, NADH:quinone oxidoreductase, FAD-linked oxidase, and GMC family oxidoreductase were found to be upregulated on KL. GMC family oxidoreductases such as aryl alcohol oxidase are known as auxiliary enzymes in fungi, and their role in lignin degradation is established [27]. The homogentisate 1,2-dioxygenase, quercetin 2,3-dioxygenase, 4-hydroxyphenylpyruvate dioxygenase, dioxygenase, and nitropropane dioxygenase were found to be upregulated on KL. There were six SAM-dependent methyltransferases and one methyltransferase identified on KL–VA. Four SAM-dependent methyltransferases and the methyltransferase were upregulated on KL, and two SAM-dependent methyltransferases were upregulated on VA.
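
The split of the 1979 quantified proteins into 1110 upregulated and 869 downregulated on KL is the kind of partition obtained by comparing normalized abundances between the two conditions. The sketch below illustrates one common way to do this, assuming a simple log2(KL/VA) ratio with a cutoff at zero. This criterion, the function names, and the example abundances are all hypothetical; the source does not state the exact rule or software used.

```python
import math


def log2_ratio(kl: float, va: float) -> float:
    """log2(KL / VA); positive values mean higher abundance on kraft lignin."""
    return math.log2(kl / va)


def split_by_direction(abundances: dict) -> tuple:
    """Partition proteins into (up on KL, down on KL) by the sign of log2(KL/VA).

    A ratio of exactly zero is placed in the 'down' list in this sketch.
    """
    up, down = [], []
    for name, (kl, va) in abundances.items():
        (up if log2_ratio(kl, va) > 0 else down).append(name)
    return up, down


# Hypothetical normalized abundances as (KL, VA); not values from this study.
example = {
    "protein_A": (8.0, 2.0),
    "protein_B": (1.0, 4.0),
    "protein_C": (3.0, 1.5),
}
up_on_kl, down_on_kl = split_by_direction(example)
print(up_on_kl, down_on_kl)

# Consistency check on the counts quoted in the text.
assert 1110 + 869 == 1979
```

In practice a fold-change threshold and a statistical test would usually be applied before calling a protein differentially expressed; the zero cutoff here only mirrors the two-way split reported in the text.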

The expression of the antioxidant and stress response proteins glutathione peroxidase, glutathione-disulfide reductase, catalase, glyoxylase, thioredoxin, peroxiredoxin, alkyl hydroperoxide reductase, aldo/keto reductase, and glutathione S-transferases was upregulated on KL. Superoxide dismutase was downregulated on KL, and catalases were downregulated on VA. The proteins formyl-CoA transferase and formate dehydrogenase, for oxalate and formate metabolism, were also found to be upregulated on KL. Various other dehydrogenases, reductases, and transferases, such as hydroxypyruvate reductase, NAD dehydrogenase, alcohol dehydrogenase, aldehyde dehydrogenase, ferredoxin reductase, ferredoxin, acyl-CoA dehydrogenase, acetyltransferases, and enoyl-CoA hydratase, were upregulated on KL.

The expression of vanillate O-demethylase oxidoreductase, chloroperoxidase, hydroglutathione hydrolase, protocatechuate 3,4-dioxygenase, protocatechuate 4,5-dioxygenase, 2OG-Fe(II) oxygenase, antibiotic synthesis monooxygenase, 2-hydroxyl acid oxidase, cytochrome c oxidase, NADH quinone oxidoreductase, glutathione peroxidase, and other oxidoreductases was upregulated on VA. The expression of protocatechuate 4,5-dioxygenase was more than double that of protocatechuate 3,4-dioxygenase on VA. Compared to KL, the expression of oxidase enzymes was much lower on VA. The expression of laccase, FAD-dependent oxidoreductase, phytanoyl-CoA dioxygenase, YggW family oxidoreductase, ubiquinol oxidase, one glutathione S-transferase, and NADH quinone oxidoreductase was almost the same on KL and VA. Several NADH:quinone oxidoreductases were observed on KL–VA; some were upregulated on KL and others on VA. Short-chain dehydrogenase, acyl-CoA dehydrogenase, alcohol dehydrogenase, acyltransferase, alkene reductase, FMN reductase, NADH:quinone reductase, and acetyl-CoA acetyltransferase were found to be upregulated on VA.

The clusters predicted for lignin degradation and PHA production were found to be functionally active: the genes for the degradation of lignin derivatives, as well as all three PHA polymerases, were present in the expression profile (Additional file 3: Table S10, which also contains other dehydrogenases, reductases, transferases, esterases, thioesterases, and hydrolases not discussed here but expressed on KL–VA). PHA production was induced on both substrates, i.e., kraft lignin and vanillic acid. The activation of PHA biosynthetic genes on lignin was also recently reported [17].

Discussion

Detailed genomic and proteomic studies of lignin-degrading bacteria are limited, so we have attempted to provide a comprehensive genomic and proteomic analysis of the lignin-degrading bacterium Pandoraea sp. ISTKB. The genome sizes available in NCBI for this genus vary between 4.4 and 6.5 Mb, and this strain’s genome is one of the largest available to date from the genus Pandoraea. The degradation of aromatic compounds by bacteria is mostly aerobic and is a tightly regulated process. Their degradation by oxidoreductases generates reactive intermediates, so a robust stress response and detoxification mechanism is required for the survival of microbes. The dominance of subsystem features such as respiration, aromatic metabolism, and stress response (after normal cellular processes), and the way they complement one another, highlights the ability of Pandoraea sp. ISTKB to survive on and metabolize lignin or aromatic compounds.

The GO analysis, especially of biological processes and molecular functions, supported this strain’s robust genomic machinery for the utilization of organic substances and organic cyclic compounds, heterocyclic compound binding, solute binding, ion binding, and oxidoreductase activity. The abundance of localization process proteins, membrane proteins, and transporters on VA as compared to KL may be explained by these proteins being localized near the membrane and actively involved in the transport of VA into the cell and its metabolism. The absence on VA of proteins for organic cyclic compound binding, heterocyclic compound binding, iron–sulfur cluster binding, receptor activity, ion binding, cofactor binding, and small molecule binding, and their presence on KL, suggests that these are the important molecular function categories that would have facilitated the depolymerization and utilization of the polymer KL by this strain.

The analysis of the expression profile on KL indicates the presence of meta cleavage and unusual pathways, i.e., ‘-CoA’-mediated degradation of lignin derivatives in aerobic microorganisms. The presence of 2-hydroxyhepta-2,4-diene-1,7-dioate isomerase in the expression profile on KL possibly indicates 4-hydroxyphenylacetate degradation through meta cleavage pathways [28]. Benzoyl-CoA oxygenase-mediated degradation of aromatic compounds is a completely different mechanism and is observed in 4–5% of sequenced bacterial genomes. This mechanism helps to overcome the high resonance stabilization of the aromatic ring by forming an epoxide. Benzoyl-CoA oxygenase leads to the formation of a 2,3-epoxide, followed by enoyl-CoA hydratase (also expressed on KL) and NADP+-dependent aldehyde dehydrogenase (upregulated on KL)-mediated degradation, resulting in the formation of formic acid, acetyl-CoA, and succinyl-CoA [29]. 1,2-Phenylacetyl-CoA epoxidase-mediated degradation of phenylacetic acid occurs via a 1,2-epoxide intermediate; this pathway is found to be functional in only 16% of all reported bacterial genomes and has also been observed in Escherichia coli and Pseudomonas putida [30]. The upregulation of salicylate hydroxylase on lignin was also observed in the case of Pseudomonas A514 strain [17].

The expression of glycolate oxidase, oxidase, aldehyde oxidase, and GMC family oxidoreductase (aryl alcohol oxidase) was observed on KL–VA; these act as accessory enzymes, and the peroxides they produce are utilized by peroxidases for lignin degradation [27, 31]. Expression of these oxidases has also been reported recently in Pseudomonas A514 and Pantoea ananatis Sd-1 [17, 27]. The detection of NADPH:quinone oxidoreductase in Pandoraea strain ISTKB suggests lignin degradation by the Fenton reaction. NADPH:quinone oxidoreductase overexpression on lignin and rice straw was also reported recently [17, 27, 32, 33]. The quinone oxidoreductase system is of special interest in lignin degradation, as fungi, especially brown rot fungi, use Fenton chemistry for lignin degradation with the help of quinone oxidoreductase [9, 31]. The role of NADPH:quinone oxidoreductase in the degradation and depolymerization of lignin is well established and has been reported for Phanerochaete chrysosporium and Trametes versicolor [34, 35].

DyP-type peroxidases are the bacterial counterparts of the fungal peroxidases (LiP or MnP) for lignin degradation. Peroxidases such as DyP-type peroxidase, peroxidase, chloroperoxidase, and peroxidase-like protein were detected in the Pandoraea sp. ISTKB genome and proteome. Some DyPs are secreted through the TAT pathway, and their encapsulation has been shown to increase the enzyme’s activity [36]. Various functions have been reported recently for bacterial DyPs, such as depolymerization, dimer formation, and degradation of aryl ether bonds in lignin and lignin-containing compounds [15, 36, 37]. Laccases can degrade lignin in the presence of mediators, and several natural mediators have been observed during lignin degradation [38, 39]. Two laccase genes were discovered in the genome and found to be functionally active in this strain. Laccases have been reported to cleave ether linkages (aryl β-O-4) and β-1 bonds in lignin model dimers. The degradation of phenolic as well as non-phenolic substrates by laccases in the presence of mediators has also been reported [40, 41]. Formate dehydrogenase converts formate into carbon dioxide, and the formate radicals generated induce MnP activity, as MnP can use them in place of peroxide in the absence of H2O2 [31]. Formyl transferase is reported for oxalate degradation; oxalate forms a complex with Mn3+ (MnP oxidizes Mn2+ to Mn3+), and the complex acts as a diffusible redox mediator for the degradation of phenolics in lignin [31]. Quinone oxidoreductase, acetyl-CoA acetyltransferase, enoyl-CoA hydratase, dehydrogenase (responsible for cleavage of ether linkage), and cytochrome peroxidase were expressed on lignin in Bacillus ligniniphilus L1, but other known bacterial lignin-degrading enzymes were not observed in its expression profile [33]. Catalase/hydroperoxidase, multicopper oxidase, GMC oxidoreductase, glutathione S-transferase, and quinone oxidoreductases were observed in the secretome of P. ananatis Sd-1 on rice straw [27]. In addition to these proteins, various other proteins responsible for lignin degradation were also expressed in Pandoraea sp. strain ISTKB.

The presence of demethylases, methyltransferases, and SAM-dependent methyltransferase indicates demethylation or rearrangement of methyl groups during lignin degradation [42]. Demethylation is an important process in the conversion of lignin-derived aromatic intermediates into common central intermediates such as catechol, protocatechuate, or gallate, which further undergo ring cleavage. The demethylation system removes methyl groups from methoxy-substituted lignin-derived aromatic compounds such as syringate, vanillate, or guaiacol in the presence of cofactors. The demethylases comprise a Rieske-type component ([2Fe–2S] cluster) and a reductase component (a flavin and a [2Fe–2S] redox center). Demethylases or methyltransferases have also been reported and functionally validated in Pseudomonas and Acinetobacter [9, 42, 43]. Several acyl-CoA synthetases, acyl-CoA hydratases/lyases, acyl-CoA transferase, acetyl-CoA acetyltransferases, and decarboxylases were discovered in the Pandoraea sp. ISTKB genome and expression profile. These enzymes help in the activation and decarboxylation of aromatic compounds (hydroxycinnamates, carboxyvanillin) and play an important role in diverting substrates towards central degradation [42,43,44]. The expression of both protocatechuate 3,4-dioxygenase and protocatechuate 4,5-dioxygenase on both KL and VA indicates that this strain has functional ortho and meta cleavage pathways for the degradation of lignin and its derivatives. Expression of the meta-cleavage pathway exceeded that of the ortho pathway on vanillic acid. The presence of both ortho and meta cleavage pathways in a single strain is a rare phenomenon, and the ortho cleavage pathway was found to be dominant among lignin-degrading bacteria [9, 23]. The expression of both pathways in this strain illustrates its robust metabolic machinery for the degradation of aromatic compounds.

Various glutathione-dependent enzymes were identified in Pandoraea sp. ISTKB; glutathione is known for its role in detoxification and stress-related responses. However, glutathione-dependent cleavage of β-aryl ether linkages (the most dominant linkage in lignin) by β-etherase has also been described in Novosphingobium, Sphingobium SYK-6, Novosphingobium sp. PP1Y, and Thiobacillus denitrificans ATCC 25259 [15, 45, 46]. Therefore, the glutathione-dependent enzymes present in this strain may contribute to lignin degradation. Superoxide dismutase and catalase–peroxidases were recently reported to act on lignin or lignin model compounds in Sphingobacterium sp. T2 and Amycolatopsis sp. 75iv2, respectively, and these were also observed on KL and VA in this strain [47, 48].

Dehydrogenases act on toxic aldehydes and convert them into less toxic intermediates inside cells, and have also been reported to cleave ether bonds [43, 44]. Various dehydrogenases were observed in this strain, and these might play an important role in ether linkage degradation. The dehydrogenase-mediated degradation of ether linkages in lignin model compounds by SG61-1L and by the LigDEG enzyme system in Sphingobium sp. SYK-6 has been well documented [42, 49]. The combined action of alcohol dehydrogenase from the short-chain dehydrogenase/reductase family and glutathione S-transferases has been shown to degrade the ether linkage (the most prominent linkage in lignin, 50–70%) in lignin model compounds [50]. The pathway for cleavage of the β-aryl ether linkage in lignin by NAD-dependent dehydrogenases (LigD, LigO, and LigL) and the glutathione-dependent lyase (LigG) has been structurally and biochemically characterized [51]. Glutathione enzymes, superoxide dismutase, catalases, alkyl hydroperoxidase, thioredoxin, glyoxalase, aldo/keto reductase, and peroxiredoxin were identified in Pandoraea sp. ISTKB. The presence of these stress response and detoxification proteins has also been reported in the genome sequence of Pseudomonas fluorescens Pf-5 [52]. The specificity of aldo/keto reductase towards various lignin-derived phenolics, aldehydes, and fermentation inhibitors has been demonstrated, and the enzyme was also shown to produce ROS and initiate the Fenton reaction [53]. Alkyl hydroperoxide reductase has greater catalytic efficiency under low H2O2 concentration and is responsible for the detoxification of organic hydroperoxides, as catalases cannot degrade organic hydroperoxides [54]. The analysis of such a diverse set of proteins and their levels of expression helped us to identify the important enzymes responsible for lignin or aromatic compound degradation, which will provide further opportunities for lignocellulosic biomass valorization.

Conclusion

The genomic and proteomic analysis of Pandoraea sp. ISTKB revealed the presence of various candidate genes responsible for lignin degradation and PHA production. GO analysis of genomic and proteomic data also supported the findings. The peroxidase-accessory enzyme system, the Fenton reaction, and ‘CoA’-mediated degradation of phenylacetate and benzoate are the major pathways observed for lignin degradation. The gene cluster responsible for lignin degradation and PHA production was found to be functionally active. The functional analysis supported the genomic findings, and a strong antioxidant and stress-responsive machinery for survival on and metabolism of lignin or aromatic compounds was observed. Some secondary metabolites, such as a lasso peptide unique to this strain, were also predicted and need to be validated. The study indicates the pathways and enzymes important for the metabolism of lignin or aromatic compounds that can be applied in the future for value addition to lignocellulosics.

Methods

The draft genome of Pandoraea sp. ISTKB was sequenced using the Illumina MiSeq platform, and the raw data processing, quality reads, assembly, scaffold generation, and gene prediction were carried out as described earlier [19]. The arrangement of genes of Pandoraea sp. strain ISTKB with respect to its genome was visualized using ClicO FS, i.e., Circular Layout Interactive Converter Free Services [55]. Proteins having a signal sequence were identified using the SignalP 3.0 software [56]. The annotation and analysis of the Pandoraea sp. ISTKB genome were also performed by Rapid Annotations using Subsystems Technology (RAST). The RAST subsystem classification was followed by pathway analysis [57, 58]. GO analysis was performed using Blast2GO, and the genes predicted in the genome were classified into the major categories of biological process, cellular component, and molecular function [59]. To identify the potential involvement of the genes of Pandoraea sp. ISTKB in biological pathways, genes were mapped to reference canonical pathways in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database using the KEGG Automatic Annotation Server, KAAS (http://www.genome.jp/kaas-bin/kaasmain) [60]. The output of the KEGG analysis includes KEGG orthology (KO) assignments, the corresponding enzyme commission (EC) numbers, and the metabolic pathways of the genes. A total of 5568 genes of Pandoraea sp. ISTKB were provided as input to KEGG–KAAS, and the genes involved in different pathways were further classified into 22 functional pathways. The antimicrobial and secondary metabolite clusters were predicted by antiSMASH 3.0, and genomic islands were predicted using IslandViewer 4 [61, 62].

Culture conditions and sample preparation for proteomic analysis

Pandoraea sp. ISTKB was grown in mineral medium (MM) containing vanillic acid (VA) or kraft lignin (KL) as the sole carbon source. The composition of MM was the same as described earlier [6]. A single colony was transferred from an LB plate to LB broth and incubated overnight at 30 °C and 165 rpm. One milliliter of the overnight culture was transferred to 100 ml of fresh LB medium and allowed to grow until the OD600 reached around 0.5. The cells were pelleted, washed twice with phosphate-buffered saline (PBS), and inoculated into flasks containing VA or KL at an initial OD of around 0.06. Bacteria were grown at 30 °C and 165 rpm, and the OD was monitored at regular intervals. The culture was harvested during the exponential growth phase for the proteomics study. Cells were pelleted by cold centrifugation at 10,000 rpm for 15 min, washed with PBS, and then resuspended in lysis buffer followed by sonication as described earlier [6]. Total protein concentration was estimated by the Bradford method, and digestion was then performed taking equal amounts of protein from both KL and VA.
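The inoculation step can be sanity-checked with the usual dilution relation C1·V1 = C2·V2. The sketch below is illustrative only: it assumes that OD600 scales linearly with cell density in this range, and the suspension OD (0.5) and final culture volume (100 ml) used in the example are hypothetical values, not figures reported for the flasks in this study.

```python
def inoculum_volume_ml(stock_od: float, target_od: float, final_volume_ml: float) -> float:
    """Volume of washed cell suspension needed so that the final culture
    starts at target_od, using C1 * V1 = C2 * V2 (linear OD assumed)."""
    if stock_od <= 0 or target_od <= 0 or final_volume_ml <= 0:
        raise ValueError("all inputs must be positive")
    if target_od > stock_od:
        raise ValueError("target OD cannot exceed the OD of the suspension")
    return target_od * final_volume_ml / stock_od


# Hypothetical example: suspension at OD600 0.5, 100 ml culture, target OD 0.06.
volume = inoculum_volume_ml(stock_od=0.5, target_od=0.06, final_volume_ml=100.0)
print(f"Add about {volume:.1f} ml of suspension")
```

The function raises an error when the target OD exceeds that of the suspension, since dilution alone cannot reach it.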

Digestion of proteins, LC–MS/MS analyses, and data analysis

A 25 µg protein sample from each of KL and VA was reduced with 5 mM TCEP for 10 min at room temperature and then alkylated with 15 mM iodoacetamide in the dark at room temperature for 30 min. The sample was diluted to a final Gn-HCl concentration of 0.6 M with 25 mM ammonium bicarbonate buffer. For protein digestion, trypsin was added at a trypsin-to-lysate ratio of 1:50, and incubation was performed overnight at 37 °C. The supernatant was vacuum dried, and the peptides were reconstituted in 5% formic acid, purified using a C18 silica cartridge, and dried using a speed vac. The dried pellets were resuspended in buffer A (5% acetonitrile/0.15 formic acid).
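The quantities in the digestion step follow from simple arithmetic, sketched below. The 25 µg protein input, the 1:50 trypsin-to-lysate ratio, and the 0.6 M final Gn-HCl concentration come from the text; the 6 M starting Gn-HCl concentration is an assumption made only for illustration, as the lysis buffer concentration is not stated here. The function names are hypothetical.

```python
def trypsin_mass_ug(protein_ug: float, ratio_denominator: float = 50.0) -> float:
    """Trypsin mass for a 1:ratio_denominator trypsin-to-lysate (w/w) ratio."""
    return protein_ug / ratio_denominator


def dilution_factor(initial_molar: float, final_molar: float) -> float:
    """Fold dilution needed to go from initial_molar to final_molar."""
    if final_molar <= 0 or initial_molar < final_molar:
        raise ValueError("need 0 < final_molar <= initial_molar")
    return initial_molar / final_molar


# 25 µg of protein at a 1:50 ratio requires 0.5 µg of trypsin.
print(trypsin_mass_ug(25.0))

# ASSUMPTION: lysate in 6 M Gn-HCl (not stated in the text); target is 0.6 M.
print(dilution_factor(initial_molar=6.0, final_molar=0.6))
```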

The peptides were analyzed using an EASY-nLC 1000 system (Thermo Fisher Scientific) coupled to a Q Exactive mass spectrometer (Thermo Fisher Scientific) equipped with a nanoelectrospray ion source. One microgram of the peptide mixture was loaded on a precolumn and resolved using a 15 cm PicoFrit column filled with 1.8 µm C18 resin (Dr. Maisch). The sample was run for 90 min, and the peptides were eluted with a 0–40% gradient of buffer B (95% acetonitrile/0.1% formic acid) at a flow rate of 300 nl/min. The Q Exactive was operated in Top10 HCD data-dependent acquisition mode with a full-scan resolution of 70,000 at m/z 400. The MS/MS scans were acquired at a resolution of 17,500 at m/z 400. The lock mass option was enabled for polydimethylcyclosiloxane (PCM) ions (m/z = 445.120025) for internal recalibration during the run. The Q Exactive raw files were analyzed using the MaxQuant software and searched against databases at a false-discovery rate (FDR) of 1%. The protein groups identified were further filtered according to their label-free quantitation (LFQ) intensity values, and their respective fold change values were calculated. Heat maps and profile plots were generated using the Perseus software for the protein groups filtered on the basis of normalized LFQ intensity values. Proteins with at least two unique peptides detected were selected for quantification and the differential expression study.
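The filtering and fold change logic described above can be illustrated with a minimal sketch. This is not the MaxQuant or Perseus workflow itself. The record fields, protein names, and intensity values are hypothetical, and expressing the fold change as log2(KL/VA) is an assumption about the direction of the comparison.

```python
import math

MIN_UNIQUE_PEPTIDES = 2  # threshold stated in the text


def log2_fold_change(lfq_kl: float, lfq_va: float) -> float:
    """log2 ratio of LFQ intensities, KL relative to VA (assumed direction)."""
    return math.log2(lfq_kl / lfq_va)


def quantifiable(proteins: list[dict]) -> list[dict]:
    """Keep protein groups with enough unique peptides and a non-zero
    LFQ intensity in both conditions, so that a ratio can be computed."""
    return [
        p for p in proteins
        if p["unique_peptides"] >= MIN_UNIQUE_PEPTIDES
        and p["lfq_kl"] > 0
        and p["lfq_va"] > 0
    ]


# Hypothetical protein groups, for illustration only.
protein_groups = [
    {"name": "protein_A", "unique_peptides": 5, "lfq_kl": 8.0e6, "lfq_va": 2.0e6},
    {"name": "protein_B", "unique_peptides": 1, "lfq_kl": 3.0e6, "lfq_va": 1.0e6},
    {"name": "protein_C", "unique_peptides": 3, "lfq_kl": 1.0e6, "lfq_va": 4.0e6},
]

for p in quantifiable(protein_groups):
    print(p["name"], log2_fold_change(p["lfq_kl"], p["lfq_va"]))
```

A positive value indicates higher abundance on KL and a negative value indicates higher abundance on VA. Protein groups detected in only one condition are excluded here rather than imputed, which is a simplification.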