Background

Saccharopolyspora erythraea, formerly called Streptomyces erythraeus (Weber et al. 1985), is a Gram-positive filamentous bacterium that produces the medically useful antibiotic erythromycin A (Oliynyk et al. 2007). Erythromycin is an important broad-spectrum 14-membered macrolide antibiotic that has been widely used to treat diseases caused by pathogenic Gram-positive bacteria (Mironov et al. 2004). Today, erythromycin is produced industrially mainly in submerged culture. As with antibiotics produced through the secondary metabolism of other actinomycetes (Bibb 2005; Medema et al. 2010; Wentzel et al. 2012), the synthesis of erythromycin is complex and is strongly influenced by the composition of the medium and the culture conditions (Martin and Bushell 1996; Mcdermott et al. 1993; Rostamza et al. 2008).

Systems biology is a discipline that combines experimental and computational methods (Feist and Al 2009). It enables comprehensive analysis and prediction of complex intracellular biological systems. Omics techniques have made it possible to analyze abundant intracellular data, including the correlations among the various components of the cell. Among today’s most widely used omics analyses, the significance of genomics is evident (Ellis and Goodacre 2012). The development of mathematical models improves the ability to analyze and integrate these omics data (Gehlenborg et al. 2010; Hollywood et al. 2006; Stefanovic et al. 2017). Genome data and intracellular metabolic pathways can be transformed into mathematical models, which has advanced systems biology tremendously (Ellis and Goodacre 2012). Most importantly, genome-scale metabolic models (GEMs) are becoming one of the most significant tools for analyzing metabolites and metabolic pathways in metabolic engineering (Kim et al. 2012). With the development of modern genome sequencing, it is possible to integrate the reconstruction of metabolic pathways into GEMs.

The genome-scale metabolic reconstruction (GSMR) of S. erythraea was built in 2012 (Licona-Cassani et al. 2012). In that work, the authors compiled the intracellular metabolic reactions of S. erythraea and used the model to optimize a medium suitable for erythromycin production. To improve the quality of the S. erythraea GEM, at least four aspects need attention, judging by the GEMs of other model strains (O’Brien et al. 2013; Tomàs-Gamisans et al. 2016). First, the balance of mass and electrical charge should be checked; second, ineffective reactions should be deleted and missing metabolic pathways filled in; third, gene-protein-reaction (GPR) relationships should be constructed so that the model can be used to design strains through gene target prediction. Finally, the accuracy and predictive power of the model should be validated by comparing biomass growth parameters, metabolic fluxes, and other physiological parameters with the simulation results.

In this study, a constraint-based GEM was reconstructed and analyzed in silico by flux balance analysis (FBA) to compare the physiological data and metabolic states of S. erythraea across different cultivation environments. We integrated the latest omics information into our GEM (Oliynyk et al. 2007; Peano et al. 2012) and then used the model to predict the essential genes, the secretion of product, and growth in different media. Overall, the reconstructed model describes the metabolic characteristics of S. erythraea better than the previous reconstruction and therefore provides a better platform for studying the systems metabolic engineering of S. erythraea in vivo.

Methods

Microorganism

The S. erythraea strains NRRL23338 and E3 were used in this study.

Media and culture conditions

The chemically defined medium used for the pre-batch culture contained (per liter of deionized water): 30 g glucose, 7 g K2HPO4, 3 g KH2PO4, 5.5 g (NH4)2SO4, 0.25 g MgSO4·7H2O, 25 mg FeSO4·7H2O, 0.53 mg CuCl2, 0.55 mg CoCl2, 13.8 mg CaCl2·7H2O, 10.4 mg ZnCl2, 6.2 mg MnCl2, 0.3 mg Na2MoO4 (Bushell et al. 1997). The medium for the carbon-limited chemostat culture was the same as for the pre-batch culture except that the glucose concentration was reduced to 15 g/L (Ghojavand et al. 2011; Mcdermott et al. 1993).

The culture conditions for seed culture, pre-batch culture, and chemostat culture followed McDermott et al. (1993). Pre-batch culture and chemostat culture were carried out in a 5-L bioreactor (National Engineering Research Center for Biotechnology, Shanghai, China) with a working volume of 3 L. The dissolved oxygen (DO) was maintained above 40% by adjusting the aeration and agitation to ensure fully aerobic conditions. The oxygen uptake rate (OUR), carbon dioxide evolution rate (CER), and respiratory quotient (RQ) were measured online using a process mass spectrometer (MAX300-LG, Extrel, USA). Temperature, pH, and pressure were set to 34 °C, 7.0 (maintained by adding 1 M NaOH), and 0.05 MPa, respectively. The specific growth rate (μ) of the chemostat culture was controlled through the dilution rate (D).

Analyses

The cell concentration was monitored by measuring the OD600. The dry cell weight (DCW) was measured as described by Carreras et al. (2002). The fermentation broth supernatant was used to measure the concentrations of residual glucose and organic acids. Residual glucose was analyzed using a glucose kit (Sinopharm Chemical Reagent Co., Ltd, China) according to the manufacturer’s protocol. Organic acids were analyzed by high-performance liquid chromatography (HPLC) as described by Albert and Martens (1997).

Procedures for model reconstruction

The GEM of S. erythraea was built from the whole-genome annotation of S. erythraea, information from databases (KEGG, UniProtKB, BioCyc, Enzyme), and the literature (Caspi et al. 2012; Licona-Cassani et al. 2012; Oliynyk et al. 2007), following the standard three-step process (Thiele and Palsson 2010). First, because the model created in 2012 lacked a metabolite list and GPR relationships, we reassembled all metabolic reactions and metabolites from the annotated genes with reference to KEGG and PubChem (Kanehisa et al. 2006; Kim et al. 2016). All reactions were manually refined and checked to ensure that the structure of each metabolite was consistent and that every reaction was balanced in mass and charge.
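The mass and charge check described above can be illustrated with a minimal sketch. It is not the curation script used for iZZ1342; the function names, the simple formula parser (no brackets or hydrates), and the example reaction with its charged formulas at neutral pH are assumptions chosen for illustration.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Count atoms in a simple formula such as 'C6H12O6' (no brackets or hydrates)."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def imbalance(reaction, metabolites):
    """Return the net element and charge imbalance of one reaction.

    reaction: {metabolite_id: coefficient}, negative for substrates.
    metabolites: {metabolite_id: (formula, charge)}.
    An empty dict means the reaction is balanced.
    """
    net = Counter()
    for met, coeff in reaction.items():
        formula, charge = metabolites[met]
        for element, n in parse_formula(formula).items():
            net[element] += coeff * n
        net["charge"] += coeff * charge
    return {key: value for key, value in net.items() if abs(value) > 1e-9}


# Illustrative metabolites (hypothetical ids, formulas as charged species).
metabolites = {
    "glc": ("C6H12O6", 0),
    "atp": ("C10H12N5O13P3", -4),
    "g6p": ("C6H11O9P", -2),
    "adp": ("C10H12N5O10P2", -3),
    "h": ("H", 1),
}

hexokinase = {"glc": -1, "atp": -1, "g6p": 1, "adp": 1, "h": 1}
missing_proton = {"glc": -1, "atp": -1, "g6p": 1, "adp": 1}

print(imbalance(hexokinase, metabolites))      # balanced
print(imbalance(missing_proton, metabolites))  # one H and one charge unit short
```

A reaction that returns a non-empty dictionary would be flagged for manual correction, for example by adding the missing proton.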

Subsequently, to expand the S. erythraea GEM, we retrieved the latest gene annotation information from three major databases (KEGG, UniProtKB, IMG) to increase the number of effective reactions and metabolites (Markowitz et al. 2012). At the same time, we used the GapFind algorithm to check the connectivity of all pathways in the model (Kumar et al. 2007). Where missing links were identified, we filled the gaps in two steps: first, we added reactions from the metabolic pathways of other organisms reported in the literature; second, if no synthetic pathway for the metabolite could be found, we introduced transport reactions to allow metabolite exchange. Each new reaction was formatted according to the standard protocol so that the final model met the standardization requirements (Caspi et al. 2012).
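The idea behind the connectivity check can be sketched with a purely topological test for dead-end metabolites. This is a simplification and not the GapFind procedure of Kumar et al. (2007), which is optimization based; the toy network and all identifiers below are hypothetical.

```python
def dead_end_metabolites(reactions):
    """Find metabolites that are only produced or only consumed.

    reactions: {reaction_id: (stoichiometry, reversible)}, where stoichiometry
    maps metabolite id to coefficient (negative = consumed in the forward
    direction) and reversible is a bool.
    """
    produced, consumed = set(), set()
    for stoich, reversible in reactions.values():
        for met, coeff in stoich.items():
            if coeff > 0 or reversible:
                produced.add(met)
            if coeff < 0 or reversible:
                consumed.add(met)
    all_mets = produced | consumed
    return {
        "no_producer": sorted(all_mets - produced),
        "no_consumer": sorted(all_mets - consumed),
    }


# Toy network with a gap between C and D.
toy = {
    "uptake_A": ({"A": 1}, False),
    "r1": ({"A": -1, "B": 1}, False),
    "r2": ({"B": -1, "C": 1}, False),
    "r3": ({"D": -1, "E": 1}, False),
    "biomass": ({"E": -1}, False),
}
print(dead_end_metabolites(toy))

# Gap filling: add a connecting reaction (step one) or, failing that,
# a transport reaction for the blocked metabolite (step two).
filled = dict(toy)
filled["r_fill"] = ({"C": -1, "D": 1}, False)
print(dead_end_metabolites(filled))
```

In the toy network, C has no consuming reaction and D has no producing reaction; a single connecting reaction resolves both.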

Lastly, because the model created in 2012 lacked gene-protein-reaction (GPR) relationships, we established the GPRs in our model GEM-iZZ1342. We retrieved all the genes from the NCBI database and then linked each gene to enzymes and reactions according to its function in the KEGG database. With the GPRs in place, the number of ineffective reactions was greatly reduced and the relationships between reactions and genes were clearly established. Finally, we re-added the non-gene-associated reactions to the model, including metabolite transport reactions, exchange reactions, and other reactions that lack gene annotation. Detailed information about the databases used in this study can be found in Additional file 1.
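To show how GPR relationships connect genes to reactions, the following sketch evaluates a GPR rule under a gene deletion. The representation (an OR over isozymes, each an AND over the subunits of one complex) is a common convention and is assumed here; the gene and reaction identifiers are hypothetical and do not come from iZZ1342.

```python
def reaction_active(gpr, deleted_genes):
    """Evaluate a GPR rule given a set of deleted genes.

    gpr is a list of gene sets: the outer list is OR (isozymes) and each inner
    set is AND (subunits of one complex). An empty list means the reaction has
    no gene association and is always kept.
    """
    if not gpr:
        return True
    return any(not (complex_genes & deleted_genes) for complex_genes in gpr)


def disabled_reactions(gprs, deleted_genes):
    """List the reactions that lose all of their catalysts after the deletion."""
    return sorted(
        rxn for rxn, gpr in gprs.items() if not reaction_active(gpr, deleted_genes)
    )


gprs = {
    "R_isozymes": [{"g1"}, {"g2"}],  # g1 OR g2
    "R_complex": [{"g3", "g4"}],     # g3 AND g4
    "R_transport": [],               # no gene association
}

print(disabled_reactions(gprs, {"g1"}))        # isozyme g2 still carries the reaction
print(disabled_reactions(gprs, {"g3"}))        # complex is broken
print(disabled_reactions(gprs, {"g1", "g2"}))  # both isozymes removed
```

This distinction between single-gene and multi-gene reactions is what makes gene target prediction possible once the disabled reactions are constrained to zero flux.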

Sensitivity analysis

To investigate the sensitivity of iZZ1342, the specific glucose uptake rate (qs) and the specific oxygen uptake rate (qO2) were set to 0–1.5 mmol glucose/gDCW h and 0–1.0 mmol O2/gDCW h, respectively. Of the six parameters (protein composition, RNA composition, DNA composition, cofactor composition, growth-associated maintenance (GAM), and non-growth-associated maintenance (NGAM)), only one was changed in each simulation. The ranges of variation were: protein (22.8–68.4%), RNA (4.9–14.7%), DNA (2.2–6.6%), cofactor (1.5–4.5%), GAM (16–48 mmol ATP/gDCW h), and NGAM (1.25–3.75 mmol ATP/gDCW h). Finally, we calculated the specific growth rate (μ) and qO2 to assess the effect of changing each parameter of the model. All simulations were performed in Matlab (Mathworks, Inc).
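The one-at-a-time design can be sketched as follows. The ranges are those stated above; taking the midpoint of each range as the baseline is an assumption (each range then spans 0.5 to 1.5 times that midpoint), as are the parameter names and the number of steps. Each scenario would be passed to an FBA solver, which is not shown here.

```python
# Ranges from the text; keys are hypothetical parameter names.
RANGES = {
    "protein_pct": (22.8, 68.4),
    "rna_pct": (4.9, 14.7),
    "dna_pct": (2.2, 6.6),
    "cofactor_pct": (1.5, 4.5),
    "gam": (16.0, 48.0),    # mmol ATP/gDCW h
    "ngam": (1.25, 3.75),   # mmol ATP/gDCW h
}


def one_at_a_time(ranges, steps=5):
    """Build scenarios in which exactly one parameter leaves its baseline."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    # Assumption: the baseline is the midpoint of each reported range.
    baseline = {name: (lo + hi) / 2 for name, (lo, hi) in ranges.items()}
    scenarios = []
    for name, (lo, hi) in ranges.items():
        for i in range(steps):
            scenario = dict(baseline)
            scenario[name] = lo + (hi - lo) * i / (steps - 1)
            scenarios.append((name, scenario))
    return baseline, scenarios


baseline, scenarios = one_at_a_time(RANGES)
print(len(scenarios), "scenarios; baseline GAM =", baseline["gam"])
```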

Biomass composition

An equation describing the conversion of every cellular component into biomass was derived from a previous publication on Streptomyces coelicolor (Borodina et al. 2005). The biomass is composed of the following macromolecules: protein, DNA, RNA, lipid, carbohydrates, and cofactors. The detailed biomass components (Additional file 2) of S. erythraea were taken from Donachie and Begg (1970) and Borodina et al. (2005).

In silico computation using flux balance analysis

Metabolic fluxes of S. erythraea were calculated by flux balance analysis (FBA), whose constraints are imposed by the stoichiometric matrix of the metabolic network (Bordbar et al. 2014; Orth et al. 2010). The stoichiometric matrix imposes flux balance constraints on the system, ensuring that the total amount of each metabolite produced equals the total amount consumed at steady state; this is the so-called pseudo-steady state. The net sum of all production and consumption fluxes for each internal metabolite is therefore set to zero. In FBA, an objective function, written as a linear combination of fluxes, is used to calculate the optimal solution. According to linear optimization theory, an optimal solution lies at a corner (vertex) of the feasible flux space. In matrix form, the problem can be stated as follows:

$${\text{maximize:}}\;c^{\text{T}} \cdot v$$
$${\text{subject}}\;{\text{to:}}\;S \cdot v = 0$$
$$v_{\text{min} } \le v \le v_{\text{max} },$$

where S is the stoichiometric matrix containing the stoichiometric coefficients of the metabolic reactions in the network and v is the vector of all metabolic fluxes. vmin and vmax are the lower and upper bounds on the fluxes and are also used to encode the maximal enzymatic rates and the irreversibility of reactions. c is the vector of weights that defines the linear combination of metabolic fluxes to be optimized, and cT is its transpose. In this study, the biomass production rate was used as the objective function to be maximized. We adopted this method to estimate the metabolic fluxes under the assumption that the strain is in exponential phase, during which cells grow at the maximum rate. In all simulations, glucose was the sole carbon source, and the following external metabolites were allowed to be transported freely across the cell membrane: H2O, CO2, NH4+, PO4, and SO4. All calculations were performed in Matlab (Mathworks, Inc).
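A small worked example may make the formulation concrete. The network below is a toy, not part of iZZ1342: two internal metabolites A and B and four hypothetical reactions, v1 (uptake of A), v2 (A to B), v3 (secretion of A as a by-product), and v4 (drain of B into biomass), with the biomass drain as the objective.

$$S = \begin{pmatrix} 1 & -1 & -1 & 0 \\ 0 & 1 & 0 & -1 \end{pmatrix},\quad v = (v_1, v_2, v_3, v_4)^{\text{T}},\quad c = (0, 0, 0, 1)^{\text{T}}$$
$$S \cdot v = 0 \;\Rightarrow\; v_1 = v_2 + v_3,\quad v_2 = v_4$$
$$0 \le v_1 \le 10,\quad v_2, v_3, v_4 \ge 0 \;\Rightarrow\; v_4 = v_1 - v_3 \le 10$$

The maximum of the objective, v4 = 10, is therefore reached at v = (10, 10, 0, 10), where the uptake bound and the lower bound on v3 are both active. This point is a vertex of the feasible flux space, in line with the linear optimization argument above.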

Model prediction of the cell growth on different carbon and nitrogen sources

To comprehensively evaluate the predictive ability of iZZ1342, physiological data were obtained from two sources: previous publications and experiments performed in our laboratory in which S. erythraea was cultivated on different carbon and nitrogen sources. To predict the utilization of a carbon source, NH4+ was set as the only nitrogen source, while phosphate and sulfate were kept as the only phosphorus and sulfur sources, respectively. The fluxes of the exchange reactions for all carbon sources other than the target carbon source were then set to zero. Similarly, to predict the utilization of a nitrogen source, glucose was set as the only carbon source. In the simulation, the target substrate was considered growth supporting if the predicted growth rate was clearly above zero.

Model prediction of essential genes

To predict the essential, non-essential, and partially essential genes, the single-gene deletion function of the COBRA Toolbox v2.0 was used (Schellenberger et al. 2011). Based on the specific growth rate calculated when a given gene is knocked out, the genes were divided into three groups: essential genes (the predicted specific growth rate is zero or approaches zero), non-essential genes (the predicted specific growth rate equals the maximum value), and partially essential genes (the predicted specific growth rate lies between zero and the maximum value). Both the minimal and the optimized chemically defined media were used for these predictions. The minimal chemically defined medium consisted of glucose, oxygen, ammonia, sulfur, and phosphorus, whereas the optimized chemically defined medium was the one formulated by Licona-Cassani et al. (2012).
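The three-way classification can be written down as a short sketch. The tolerance used to decide that a growth rate "approaches zero" or "equals the maximum" is an assumption, since no numerical threshold is given; the gene names and growth rates are hypothetical.

```python
def classify_gene(ko_growth, wt_growth, tol=1e-6):
    """Classify a gene from its knockout growth rate relative to the wild type.

    tol is an assumed relative tolerance; the threshold is not specified
    in the text.
    """
    if wt_growth <= 0:
        raise ValueError("wild-type growth rate must be positive")
    ratio = ko_growth / wt_growth
    if ratio <= tol:
        return "essential"
    if ratio >= 1 - tol:
        return "non-essential"
    return "partially essential"


# Hypothetical knockout growth rates (1/h) for a wild-type rate of 0.05 1/h.
wild_type = 0.05
knockouts = {"g1": 0.0, "g2": 0.05, "g3": 0.032}
classes = {gene: classify_gene(mu, wild_type) for gene, mu in knockouts.items()}
print(classes)
```

Running the same classification with the exchange bounds of the minimal and of the optimized medium would give the two gene sets compared in the Results.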

Results and discussion

Reconstruction of the S. erythraea GEM iZZ1342

The genome-scale metabolic model (GEM) of S. erythraea was reconstructed in a three-step procedure (see “Methods”). Throughout the reconstruction, the materials and procedures shown in Fig. 1 required manual curation.

Fig. 1

The reconstruction process of the genome-scale metabolic model of S. erythraea iZZ1342

Compared with the GSMR of S. erythraea published in 2012, the reconstructed GEM iZZ1342 shows clear improvements. First, the number of ORFs (open reading frames) increased from 1272 to 1342, and the total number of reactions decreased from 3985 to 1684 after ineffective reactions were removed and GPR associations were added. Second, we compiled a metabolite list that contains information on all metabolites in the reactions. Third, we manually checked and balanced the mass and electrical charge of the reactions according to the process described in “Methods.” Finally, we ran the gap-finding procedure, identified all orphan reactions, and resolved the gaps by adding connecting reactions. Detailed information on iZZ1342 can be found in Additional files 3 and 4.

We compared the parameters of the S. erythraea GEM with those of other actinomycetes (Alam et al. 2010, 2011; Kjeldsen and Nielsen 2009). The comparison is provided in Table 1. The numbers of total reactions and metabolites (Additional file 5) in iZZ1342 are larger than those in the S. erythraea NRRL23338 GSMR, which indicates that the scale of the model has increased. The improvements are also reflected in the number of assigned genes, the coverage of the annotated genes, and the number of gene-associated reactions. Compared with the GEMs of other actinomycetes with almost the same genome size, our GEM is also larger in scale.

Table 1 Comparison of the main characteristics of S. erythraea and other actinomycetes

Model verification by transcriptomic analyses

We verified the new ORFs in the updated GEM of S. erythraea against the latest transcriptomic data (Carata et al. 2009; Li et al. 2013; Peano et al. 2012). These data were chosen because, compared with microarray data, they allow as many low-expression genes as possible to be identified (Wang et al. 2009). From the RNA sequencing data, information on the gene sequences and enzymes of different pathways during the sampling period can be obtained, and from these the corresponding genes and reactions can be identified. Because GPR associations were created in this study, we could distinguish single-gene-associated reactions from multi-gene-associated reactions. The transcriptomic data showed that about 7186 genes were detected under the cultivation conditions (Additional file 6). iZZ1342 contains 1342 genes, and the expression of most of them (86.3%) was detected in the transcriptomic data under the sampling conditions (Fig. 2a). Among all reactions (except the exchange reactions), 4.6% (71 reactions) were associated with genes whose expression could not be measured (Fig. 2b). After the exchange reactions and other reactions without annotated genes were removed, 702 and 739 reactions were annotated with single and multiple genes, respectively. The transcriptome analysis verified about 85.4% of the single-gene-associated reactions and 89.2% of the multi-gene-associated reactions (Fig. 2c, d), indicating that most reactions in iZZ1342 are supported by expression data.

Fig. 2

Verification of the GEM iZZ1342 by transcriptomic data. a Pie chart of the expressed and unexpressed genes in iZZ1342. b Pie chart of the verified reactions and other reactions among the 1441 reactions in total (excluding the exchange and transport reactions). c Pie chart of the verified reactions and other reactions among the single-gene reactions. d Pie chart of the verified reactions and other reactions among the multi-gene reactions

Sensitivity analysis of iZZ1342

To check the sensitivity of the FBA results generated with iZZ1342, we varied, one at a time, the content of four biomass components (protein, RNA, DNA, and cofactor) and the two energy parameters (GAM, NGAM) (Feist et al. 2007). The specific growth rate (μ) and the specific oxygen uptake rate (qO2) were investigated under aerobic, glucose-limited conditions (Fig. 3). When the content of protein, RNA, DNA, or cofactor was changed, qO2 was hardly affected, whereas μ decreased slightly when the protein or RNA content was changed. In contrast, μ and qO2 were strongly affected by the energy parameters: as GAM and NGAM increased, μ decreased markedly while qO2 increased markedly (Fig. 3e, f). The sensitivity analysis indicates that iZZ1342 is much more sensitive to the energy parameters than to the cell composition parameters, which is consistent with the results obtained with E. coli GEMs (Feist et al. 2007).

Fig. 3

Sensitivity analysis of different model parameters with iZZ1342. The effect of changing each parameter on the specific growth rate (A1–F1) and the specific oxygen uptake rate (A2–F2). The simulations were performed under glucose-limited conditions by varying a the protein content (22.8–68.4%), b the RNA content (4.9–14.7%), c the DNA content (2.2–6.6%), d the cofactor content (1.5–4.5%), e the GAM (16–48 mmol ATP/gDCW h), and f the NGAM (1.25–3.75 mmol ATP/gDCW h). Red represents the simulated results at the high value of the input parameter and black those at the low value

Model prediction by measuring availability of different carbon and nitrogen sources

To predict the physiological state of S. erythraea growing under different conditions, we collected reported experimental phenotype data (El-Enshasy et al. 2008; Zou et al. 2009). For carbon and nitrogen sources that are important but for which no reference could be found, we performed additional experiments. In total, 27 carbon sources and 33 nitrogen sources were tested. FBA was used to analyze growth on every carbon or nitrogen source. According to the physiological data from the publications and our laboratory, S. erythraea could grow on 23 carbon sources and 27 nitrogen sources. iZZ1342 predicted growth of S. erythraea on 17 carbon sources and 25 nitrogen sources, and the accuracy rates were 77.8 and 87.9%, respectively. Although the accuracy is already high, the remaining discrepancies point to parts of the metabolic network that are still incomplete and provide room for improvement in the next round of model updating. The growth results can be found in Tables 2 and 3.
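The accuracy figures can be reproduced with a short sketch, assuming that accuracy is defined as the fraction of substrates for which predicted and observed growth agree (the formula is not spelled out in the text). Under that assumption, 77.8 and 87.9% correspond to 21 of 27 carbon sources and 29 of 33 nitrogen sources. The mini-table below is hypothetical and does not reproduce Tables 2 and 3.

```python
def prediction_accuracy(observed, predicted):
    """Fraction of substrates where predicted and observed growth agree."""
    if observed.keys() != predicted.keys():
        raise ValueError("observed and predicted must cover the same substrates")
    matches = sum(observed[s] == predicted[s] for s in observed)
    return matches / len(observed)


# Hypothetical mini-table (True = growth, False = no growth).
observed = {"glucose": True, "src_b": True, "src_c": False, "src_d": True}
predicted = {"glucose": True, "src_b": False, "src_c": False, "src_d": True}
print(f"{prediction_accuracy(observed, predicted):.1%}")  # 75.0%

# Arithmetic check against the reported percentages.
carbon_pct = round(100 * 21 / 27, 1)
nitrogen_pct = round(100 * 29 / 33, 1)
print(carbon_pct, nitrogen_pct)  # 77.8 87.9
```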

Table 2 Prediction of growth capability of iZZ1342 on different carbon sources (+ represents growth and − represents non-growth)
Table 3 Prediction of growth capability of iZZ1342 on different nitrogen sources (+ represents growth and − represents non-growth)

Model validation using physiological growth parameters

To validate the GEM iZZ1342, we compared its phenotype predictions with experimental data obtained from chemostat cultures in minimal chemically defined medium (Mcdermott et al. 1993). First, S. erythraea NRRL23338 was grown in carbon-limited medium at five dilution rates (0.01, 0.02, 0.03, 0.04, and 0.05/h), and the uptake and secretion rates of glucose, O2, and CO2 and the dry cell weight (DCW) were measured. We then calculated the specific growth rate (μ), qO2, qCO2, and qs. In all cultures, more than 91% of the substrate carbon was recovered in biomass, CO2, and organic acids.
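The conversion from chemostat measurements to specific rates can be sketched with the standard steady-state relations (μ = D, and each specific rate equals the volumetric rate divided by the biomass concentration). The exact calculation used in this study is not given, so the function below is an assumption; all input values except the 15 g/L feed glucose and the dilution rate are hypothetical.

```python
GLUCOSE_MW = 180.16  # g/mol


def specific_rates(dilution, s_feed, s_resid, biomass, our, cer):
    """Steady-state chemostat rates.

    dilution: dilution rate D (1/h)
    s_feed, s_resid: glucose in the feed and in the broth (g/L)
    biomass: dry cell weight (gDCW/L)
    our, cer: volumetric O2 uptake and CO2 evolution rates (mmol/L/h)
    """
    mu = dilution  # at steady state the specific growth rate equals D
    qs = dilution * (s_feed - s_resid) / biomass / GLUCOSE_MW * 1000.0
    return {
        "mu": mu,                 # 1/h
        "qs": qs,                 # mmol glucose/gDCW/h
        "qO2": our / biomass,     # mmol O2/gDCW/h
        "qCO2": cer / biomass,    # mmol CO2/gDCW/h
        "RQ": cer / our,          # dimensionless
    }


# Hypothetical measurements at D = 0.05/h with 15 g/L glucose in the feed.
rates = specific_rates(dilution=0.05, s_feed=15.0, s_resid=0.1,
                       biomass=5.0, our=4.0, cer=4.2)
for name, value in rates.items():
    print(f"{name}: {value:.3f}")
```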

To simulate cellular growth in carbon-limited medium, we maximized cell growth while constraining the glucose uptake rate, based on the hypothesis that cells tend toward maximal growth during exponential phase (Mishra et al. 2016). The exchange fluxes of NH4+, phosphate, sulfate, H2O, and H+ were left unconstrained to provide basic nutrients for cell growth. The non-growth-associated maintenance (NGAM) was set to 3 mmol ATP/gDCW/h, as observed for S. coelicolor (Borodina et al. 2005). As shown in Fig. 4, the predictions of iZZ1342 matched the chemostat data reasonably well. When qs was varied from 0.5 to 1.5 mmol/gDCW/h, the in silico and in vivo values of μ, qO2, and qCO2 were quite similar, indicating that the new model performs well across this range of growth conditions.

Fig. 4

Predicted and measured μ, qO2, and qCO2 for chemostat cultivation of S. erythraea NRRL23338. The NGAM used in the simulation was 3 mmol ATP/gDCW h. Black represents the simulated results of the GEM iZZ1342, and red represents the experimental data from our lab

Model validation by in vivo 13C fluxes

Cellular metabolic flux is a significant and direct indicator of the physiological state (Nielsen 2003). GEMs can be used to predict cellular reaction fluxes because they contain all the reactions that can be carried out in the strain. Furthermore, they avoid biases caused by lumping reactions or omitting pathways a priori (Saratram and Maranas 2015). However, obvious discrepancies may still exist between the flux determined in vivo and the flux simulated in silico (Damiani et al. 2015). Therefore, we compared the in vivo fluxes obtained by 13C metabolic flux analysis (13C MFA) with the fluxes simulated by FBA to further evaluate the prediction accuracy of iZZ1342.

To evaluate the prediction accuracy of the model, we first used 13C labeling to obtain the flux distribution of specific metabolic pathways (Hong et al. 2016) and then compared it with the flux distribution simulated in silico with our model (Additional file 7). The aim of this work was to assess how well the fluxes obtained from our constraint-based GEM reflect the real flux distribution.

For the 13C metabolic flux analysis, we combined the corrected mass isotopomer distributions, the extracellular fluxes, and the metabolic network. INCA, a software package for calculating central carbon metabolic fluxes, was used to iteratively calculate the absolute flux solution that best described the data (Young 2014). The central carbon metabolic fluxes determined by INCA and by simulation are shown in Fig. 5. As shown in Fig. 5a, the flux profiles from the FBA simulations agreed well with those observed experimentally. In Fig. 5b, the correlation coefficient between the simulated fluxes and the 13C-derived fluxes is 0.97, indicating the good performance of iZZ1342.
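The comparison between the two flux sets can be illustrated with a plain Pearson correlation. The flux values below are hypothetical and are not taken from Fig. 5 or Additional file 7; whether the reported value was obtained in exactly this way is an assumption. Note that the correlation coefficient r and the coefficient of determination R2 = r2 of a simple linear fit are different numbers, so it matters which one is quoted.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical relative fluxes (glucose uptake normalized to 100).
mfa_fluxes = [100.0, 62.0, 35.0, 80.0, 12.0]
fba_fluxes = [100.0, 58.0, 40.0, 85.0, 10.0]

r = pearson_r(mfa_fluxes, fba_fluxes)
print(f"r = {r:.4f}, R2 = {r * r:.4f}")
```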

Fig. 5

Distribution of the central metabolic fluxes in the cell. a Metabolic flux profiles of the central metabolism of S. erythraea. The upper number represents the flux obtained from 13C MFA and the lower number represents the flux simulated with our model iZZ1342. b Consistent changes in fluxes are found in both the 13C-derived fluxes and the FBA calculation using iZZ1342

Essential genes target prediction in silico for strain design

The molecular mechanisms underlying the traditional mutation and screening approach to improving erythromycin production are still poorly understood, yet this information is important for designing rational strategies for high-yield strains (Peano et al. 2012). In this study, we used iZZ1342 to identify essential gene targets and to provide reliable information for strain design.

During the reconstruction of our GEM, we established the relationships among genes, proteins, and reactions (GPRs), which allow the effect of a genotype to be predicted efficiently. We used the single-gene deletion function of the COBRA Toolbox v2.0 to predict the essential, partially essential, and non-essential genes (Additional file 8). With the minimal chemically defined medium, the simulation identified 318 essential genes (Fig. 6). These genes are mainly distributed in the TCA cycle, amino acid biosynthesis and metabolism, energy metabolism, and so on. With the optimized chemically defined medium, however, the number of essential genes declined markedly, from 318 to 186 (Fig. 6). This is because the optimized medium contains abundant nitrogen sources, which can substitute for the synthesis pathways of some amino acids. Furthermore, 89 genes were identified as partially essential (Additional file 9); their defining characteristic is that knocking them out has only a subtle impact on cell growth. However, these genes may play a crucial role in product synthesis. They are important targets for subsequent strain design because product yield may increase when cell growth slows (Pan and Qiang 2012). Some of the 89 partially essential genes have already been validated as targets by knockout experiments in our lab and in published studies, including SACE_5639 (Chen et al. 2016; Weber et al. 2012), SACE_0728 (Mironov et al. 2004), SACE_0731 (Minas et al. 1998), and SACE_6669 (Hong et al. 2017). The knockout results can be found in Table 4. Further knockout experiments are needed to validate the other gene targets predicted by iZZ1342.

Fig. 6

Results of the single-gene deletion analysis with iZZ1342. a The gene expression data and the KEGG categories of the expressed genes in iZZ1342 (red, essential genes; blue, partially essential genes; yellow, non-essential genes). b The relative growth rate changes of S. erythraea between the minimal chemically defined medium and the optimized chemically defined medium

Table 4 The knockout targets that have been validated in our lab and in other published papers

Conclusion

We have reconstructed and evaluated a genome-scale metabolic model (GEM) of S. erythraea, called iZZ1342, which incorporates the latest gene annotation information, physiological parameters, and detailed GPR relationships. We have also checked the mass and charge balance of all reactions. Metabolic pathways that lacked key reactions were completed by gap filling. The new model iZZ1342 contains 1614 metabolites and 1684 reactions, of which 1441 are annotated with genes. Compared with the previous model, the new model is improved in the following respects: first, the balance of mass and electrical charge has been checked; second, ineffective reactions have been deleted and missing metabolic pathways filled in; third, gene-protein-reaction (GPR) relationships have been constructed so that the model can be used for strain design through gene target prediction; finally, the accuracy and predictive power of the model have been validated by comparing biomass growth parameters, metabolic fluxes, and other physiological parameters with the simulation results.

We validated the new model in several ways. First, we tested the sensitivity (robustness) of the model: when the content of each component was varied, the results were consistent with those reported for an E. coli GSMM. Second, we tested growth predictions on different carbon and nitrogen sources; the prediction accuracy was 77.8% for carbon sources and 87.9% for nitrogen sources. Third, we tested the model against physiological growth parameters: with glucose as the only carbon source, the simulation results correlated positively with the experimental data. Finally, we validated the model against in vivo 13C fluxes. In the main metabolic pathways, the two sets of fluxes agreed well, with an R2 of 0.9638 between the MFA and GEM fluxes. In a few other pathways, however, the discrepancies indicate where the model needs further improvement.

We used our model to identify all the partially essential genes, which are important targets for subsequent strain design. According to published studies, four of these genes have been knocked out successfully. Further knockout experiments are needed to validate the remaining gene targets predicted by iZZ1342.