Classical and Bayesian predictions applied to Bacillus toxin production

Ennouri, Karim; Ben Ayed, Rayda; Mazzarello, Maura; Ottaviani, Ennio; Hertelli, Fathi; Azzouz, Hichem

doi:10.1007/s13205-016-0527-2

Classical and Bayesian predictions applied to Bacillus toxin production

Original Article
Open access
Published: 26 September 2016

Volume 6, article number 206, (2016)
Cite this article

Download PDF

You have full access to this open access article

3 Biotech Aims and scope Submit manuscript

Classical and Bayesian predictions applied to Bacillus toxin production

Download PDF

Karim Ennouri¹,
Rayda Ben Ayed¹,
Maura Mazzarello²,
Ennio Ottaviani²,
Fathi Hertelli¹ &
…
Hichem Azzouz¹

1380 Accesses
5 Citations
Explore all metrics

Abstract

Bacillus thuringiensis is a bacterium with unusual properties that make it useful for pest control in ecoagriculture. It can form a parasporal crystal containing polypeptides (also called delta-endotoxins). These entomopathogenic toxins are made during the stationary phase of the bacterial growth cycle and were initially characterized as an insect pathogen. Nowadays, the use of saturated two-level designs is very popular. This method is especially used in industrial applications where the cost of experiments is expensive. Standard classical approaches are not appropriate to analyze data from saturated designs. It is due to the fact that they only allow to estimate the main factor effects and cannot assure enough freedom degrees to estimate the error variance. In this paper, we propose the use of empirical Bayesian procedures to get inferences for data obtained from saturated designs, inspired from Hadamard matrices. The proposed methodology is illustrated by assuming a dataset to prove the model robustness. The comparison between the two studied mathematical techniques, based on mean square error values (MSE), revealed that Bayesian method (BM) was more accurate than least square method (LSM): for example, the results showed that 2002 and 2000.7 mg/l for experimental and predicted (BM) data were obtained against 2002 and 1991 mg/l for experimental and predicted (LSM) data. This suggested method could be generalized in several application fields in biological sciences.

Nutritional Requirements to Improve Delta-Endotoxins Production of Bacillus thuringiensis var. kurstaki Using Mixed Designs Modelling

Article 19 July 2015

Multiple linear regression and artificial neural networks for delta-endotoxin and protease yields modelling of Bacillus thuringiensis

Article 29 June 2017

BayFesian inference and model choice for Holling’s disc equation: a case study on an insect predator-prey system

Article Open access 01 June 2016

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Introduction

Bacillus thuringiensis (Bt) is a spore-forming bacterium well known for its insecticidal properties associated with its aptitude to produce crystal inclusions through sporulation. These toxins are also specific and biodegradable; therefore, no toxic products are accumulated in the environment. Bt biopesticides are produced by fermentation technology. The fermentation is a process in which an agent causes an organic substance to break down into simpler substances. It is considered as a major step in industrial biotechnology and it is necessary to take under consideration the optimization of medium composition to affect the formation of a specific final product during fermentation process (Schmidt 2005; Singh et al. 2008).

A fermentation enhancement program begins by measuring product yield as a response to factors like strength of medium components. Nutritional requirement can be manipulated either by the conventional or statistical approach. Conventional methods, commonly used for a long time in industrial production, involve changing one independent variable while others are kept at fixed level. However, statistical methods are more recent and offer many advantages when compared to conventional methods because they are rapid and reliable. Furthermore, these techniques are useful for short-listing significant nutrients, and then reducing the total number of experiments, hence an important gain of time and chemical products. A systematic experiment was then performed by setting the independent variables according to a statistical design carried out from Hadamard matrices at two levels and delta-endotoxins produced by B. thuringiensis strain was measured in each batch. A statistical analysis was then carried out to interpret the significant medium components. Such approach is a useful screening process employed to identify the contribution of each medium component to the response of the system and thus allows for a reduction of variable numbers that need to be considered (Liu et al. 2008).

In this work, we study the significance levels of medium components for delta-endotoxin production using statistical design. Moreover, we present a statistical method to identify complex ingredients interactions, during fermentation process, with Bayesian network (BN) structure learning for specific conditions. This method discovers the dependency relationship between components which implies their complex interactions on heterogeneous data sets.

Materials and methods

Bacillus thuringiensis strain

Bacillus thuringiensis kurstaki, an isolated strain having high delta-endotoxin production (Ennouri et al. 2013), was used in this study. The bacterial strain was grown on Luria–Bertani (LB) agar medium composed of 10 g/l of peptone, 5 g/l of NaCl, 5 g/l of yeast extract and 15 g/l of agar (BIO BASIC^®, Ontario, Canada) and then stored at 4 °C.

Microorganism and cultivation media

LB medium with the following composition (g/l) was used for the preparation of the pre-inoculum and inoculum: peptone, 10.0; yeast extract, 5.0; NaCl, 5.0. For fermentation medium, a modified complex medium was used (Ghribi et al. 2007). CaCO₃ (20 g/l) was added for keeping pH stability. All media used in this study were adjusted to pH 7.0 ± 0.01 before autoclaving.

Culture conditions

For pre-inoculum preparation, a loopful of B. thuringiensis grown on LB plate was used to inoculate 3 ml of sterilized LB medium and incubated in a rotary shaker (New Brunswick Scientific, model INNOVA^® 44, USA) at 30 °C and 200 rates per minute overnight (14–18 h). For inoculum preparation, 250-ml Erlenmeyer flasks containing 50 ml of LB medium were inoculated with 1 % (v/v) of the pre-inoculum and incubated in a rotary shaker at 30 °C and 200 rates per minute for 6 h. The volume of culture inoculum was determined on the basis of a final absorbance of approximately 0.15 measured at 600 nm. The optical density at 600 nm (OD₆₀₀) was determined using a SmartSpec™ 3000 UV–visible spectrophotometer (Bio-Rad Laboratories). The 500-ml flasks containing 50 ml of complex medium were incubated with estimated inoculum volume. In such media, the initial OD was not measured after inoculation but calculated according to the OD measured in the inoculum. Samples taken periodically from the incubated cultures were subjected to microscopical examination. When 90 % (or more) of the B. thuringiensis cells had lysed, releasing the spores and crystals, the fermentation process was considered as finished.

Determination of delta-endotoxins concentration

1 ml of collected samples at the end of fermentation was centrifuged at 13,000×g for 10 min at a temperature of 4 °C. The supernatants were discarded. The pellets were washed twice with 1 ml of 1 M NaCl solution and twice with 1 ml of bidistilled autoclaved water. The crystal proteins in the pellet were dissolved with 1 ml of 50 mM NaOH (pH 12.5) for 2 h at 30 °C with vigorous shaking. The suspension was centrifuged at 13,000×g for 10 min at 4 °C and the pellet was discarded. The supernatant containing the alkali-soluble insecticidal crystal proteins was used to define the delta-endotoxin concentration by Bradford method using bovine serum albumin as standard protein (Bradford 1976). Delta-endotoxin concentration was measured spectrophotometrically at 595 nm using a UV–visible spectrophotometer (Bio-Rad Laboratories, Inc.). The obtained values were the mean of three values of two separate experiments.

Statistical design

Seven factors were selected as the key factors affecting the production of delta-endotoxins in this investigation. Thereafter, the screening design was applied to evaluate the importance of the seven selected factors. Plackett–Burman experimental design (Plackett and Burman 1946), an ameliorated technique based on Hadamard matrices (Hadamard 1893), was applied to evaluate the significance of various medium components affecting delta-endotoxin production by B. thuringiensis strain. The different factors were prepared in two levels: −1 for low level and +1 for high level, based on statistical matrix design, which is a fraction of a two-level factorial design and allows the investigation of (n−1) variables in at least n experiments. On the base of seven independent variables (Table 1), twelve combinations were screened according to the design shown in Table 2. All trials were performed in triplicate and the average of observations was considered as the final result. Plackett–Burman experimental design is based on the first-order model:

$$Y = \beta_{ 0} + \sum \beta_{\text{i}} X_{\text{i}}$$

(1)

where Y is the predicted response (delta-endotoxin concentration), β ₀ , β _i are constant coefficients, and X _i is the coded independent variable estimates or factors. The data of delta-endotoxin concentration were statistically analyzed. Factors having highest t value and confidence level over 95 % were considered to be highly significant on delta-endotoxin production.

Table 1 Coded values used in factorial design (g l⁻¹)

Full size table

Table 2 Hadamard matrix of seven variables

Full size table

This method utilizes least square estimation to approximate the main effects; however, there are no degrees of freedom to estimate the error. It is difficult to obtain inferences from data of a saturated design as Plackett–Burman design; in this case, the usual analysis of variance (ANOVA) cannot be used.

Assuming that the matrix X′X is nonsingular in the saturated design, the least squares estimates or maximum likelihood estimation of the main effects are given as follows:

$$\widetilde{\beta } = (X^{\prime}X)^{ - 1} X^{\prime}y$$

(2)

In practice, generally the experimenter exploits normal plots to determine the importance of the factors; nevertheless, the interpretation of the normal plots depends on how strongly the researcher believes in factor sparsity (Baba et al. 2013). In our study, the Bayesian estimation can be applied to analyzing more deeply experimental data from saturated two-level designs.

Bayesian estimation

We assume a Bayesian approach to analyze data from a saturated design. In the maximum likelihood approach, it is assumed that there is enough information to have a meaningful estimation of the parameters β and the variance σ ². However, in the Bayesian approach, the data are added with information in a prior probability distribution form. The prior belief about the parameters is combined with the data’s likelihood function according to Bayes formula to give the posterior distribution of the parameters β and σ ². Different priors could be considered to analyze data from saturated design (Baba and Gilmour 2006), but since data from saturated designs provide only limited information, the interpretation of these data depends heavily on the prior assumptions. The use of priors is very inflexible since very informative priors can lead to explicit posterior distribution (Baba and Gilmour 2006).

Bootstrap method

The bootstrap is a well-known resampling method for evaluating the standard error of a statistical estimator, thus providing confidence intervals (Efron and Tibshirani 1993). Bootstrapping could be considered as a complementary method for model comparison and applied to further investigate the stability of the model ranking and parameter estimation. The bootstrap is a commonly used resampling method that permits estimation of the sampling variability of estimated parameters (Austin and Small 2014). Here, we used bootstrapping to test the robustness of our estimates. In fact, the bootstrap technique helps determine whether available data are sufficient for a robust classification (Wang et al. 2007). In our study, 100 simulated samples of data are used for validation: random samples were generated with replacement from a data set where each observation is randomly selected from the original data set.

Results and discussion

Statistical design

Seven different factors including medium components were screened for their effect on delta-endotoxin production using the experimental statistical design. The independent variables examined and their settings are shown in Table 1. The design plan is shown in Table 2. The variables X ₁−X ₇ represented the medium constituents. This method is based upon the existence of Hadamard matrices, which are square matrices of order N with entries at two levels, +1 and −1. These matrices are orthogonal such that for each column the number of +1 is equal to the number of −1.

Table 3 shows the observed and predicted concentrations of delta-endotoxins. The concentrations ranged from 1502 to 3571 mg l⁻¹. The collected output data are generated from the experimental process using bootstrapping method (Efron 1979). The bootstrap is a type of Monte Carlo method applied based on observed data (Mooney and Duval 1993). The predictive distribution is presented in terms of the difference between the predicted and the observed data (Table 3). One hundred simulated samples of data are used for validation. The stopping criterion for training is a minimum value of the mean square error (MSE). In this study, the minimum value of MSE was taken as 10⁻⁶. The MSE was used as the criterion for the training and test data sets to compare the accuracy of the model. The enlarged versions of the simulation output based on Bayesian and linear models are presented in Fig. 1 to illustrate the difference between the mentioned variables. According to the figure, we showed that the majority of the peak points on the Bayesian method curve may indicate that this technique certainly increases goodness of model when compared to least squares method curve, and thus may improve the prediction rate. Differences in residuals between the two models are small when compared pairwise. However, the fact is that Bayesian model residuals are consistently smaller than the linear model residuals. It indicates the degree of robustness of Bayesian method compared to conventional data analysis methods [2002 and 2000.7 mg/l for experimental and predicted (LSM) data vs 2002 and 1991 mg/l for experimental and predicted (BM) data]. In fact, usual approaches such as least square method begin by fitting a model and then optimizing the model to obtain optimal operating settings. These methods do not account for any uncertainty in the parameters or in the form of the model. Bayesian approaches have been proposed recently to account for the uncertainty on the parameters of the model, assuming the model form is identified.

Table 3 Experimental and predicted values (least square and Bayesian methods) for the delta-endotoxin concentrations

Full size table

In this study, a statistical design based on Hadamard matrices was employed to evaluate the main effect of the medium components for the delta-endotoxins production by B. thuringiensis. Table 4 provides the results of effect of each medium component on delta-endotoxins production as well as the two coefficients: t value and p value. In fact, the t test compares the actual difference between two means in relation to data variation, while the p value represents the probability that random chance can explain the end result. In general, a p value of 5 % or lower is considered to be statistically significant. The main effect of each variable was calculated simply as the difference between the average of measurements made at the high setting (+1) and the average of measurements observed at low setting (−1) of that factor. The components were screened at the confidence level of 95 % on the basis of their effects. That component, which showed significance at or above 95 % confidence level and its effect was positive, was interpreted as being required in higher concentration than the indicated high value (+). However, if its effect was found negative, then it indicated that the component was effective in delta-endotoxin production but the amount required was lower than the indicated low (−) concentration in Hadamard matrix. All factors in this study have shown influence on the delta-endotoxin production with confidence level at or above 95 % confidence limit and were considered to be significant for delta-endotoxins production by B. thuringiensis (Table 4).

Table 4 Regression analysis of least square method (LSM) and Bayesian method (BM)

Full size table

Based on the obtained results of Table 4, the soybean meal, known as main source of organic nitrogen, showed the maximum positive effect on toxins production, followed by starch, K₂HPO₄ and KH₂PO₄. The t values of FeSO₄, MnSO₄ and KH₂PO₄ were negative which suggested that these components are required in the medium for delta-endotoxin production but in lower concentration than the low level. From the experimental design results, soybean meal was found to be the most significant medium component effect on delta-endotoxin production, followed by FeSO₄. However, all other medium ingredients were found to have no significant effect on delta-endotoxin production (p value >0.05). Regression analysis of Bayesian method was more accurate than least square method. Therefore, the use of the proposed empirical Bayesian method in this present work could be a powerful technique and applied as complementary and profound assessment of saturated two-level designs.

The saturated factorial designs have been extensively used by industrial researchers and engineers as a powerful methodology for screening factors, especially in the presence of a great number of factors. Usually, the use of linear model based on specific experimental design in the interpretation of factorial two-level experiments could be very subjective and in many cases finding the active factors on the response of interest was quite difficult. The use of the empirical Bayesian approach introduced in this paper could be of great interest in applications.

Conclusion

In this study, the comparison between two statistical techniques, based on mean square error values (MSE), demonstrated that the Bayesian method (BM) was more precise than least square method (LSM). The obtained results confirmed this finding and showed that the Bayesian method MSE values were smaller than least square method MSE values. For instance, according to the first run, we illustrated that concentrations of the predicted data were 1863.4 mg/l and 1853.66 mg/l using BM and LSM, respectively. Likewise, we observed excellent parameter forecasts and inferences accomplished with this suggested technique. Moreover, the Bayesian approach demonstrated the potential enhancement when spatial variability was explained in the model. This proposed method could be applied in various domains, particularly to understand complex interactions on heterogeneous data.

References

Austin PC, Small DS (2014) The use of bootstrapping when using propensity-score matching without replacement: a simulation study. Stat Med 33(24):4306–4319
Article Google Scholar
Baba MY, Gilmour SG (2006) Bayesian estimation from saturated factorial designs. In: Colosimo BM, Castilho E (eds) Bayesian process monitoring, control and optimization. Chapman & Hall/CRC, USA, pp 311–322
Chapter Google Scholar
Baba MY, Achcar JA, Moala FA, Oikawa SM, Piratelli CL (2013) A useful empirical Bayesian method to analyse industrial data from saturated factorial designs. Int J Ind Eng Comput 4:337–344
Google Scholar
Bradford MM (1976) A rapid and sensitive method for the quantitation of microgram quantities of protein utilizing the principle of protein-dye binding. Anal Biochem 72:248–254
Article CAS Google Scholar
Efron B (1979) Bootstrap methods: another look at jackknife. Ann Stat 7:1–26
Article Google Scholar
Efron B, Tibshirani RJ (1993) An introduction to the bootstrap. Chapman & Hall, USA
Book Google Scholar
Ennouri K, Ben Khedher S, Jaoua S, Zouari N (2013) Correlation between delta-endotoxin and proteolytic activities produced by Bacillus thuringiensis subsp. kurstaki growing in an economic production medium. Biocontrol Sci Technol 23(7):756–767
Article Google Scholar
Ghribi D, Zouari N, Trigui W, Jaoua S (2007) Use of sea water as salts source in starch and soya bean-based media, for the production of Bacillus thuringiensis bioinsecticides. Process Biochem 42:374–378
Article CAS Google Scholar
Hadamard J (1893) Résolution d’une question relative aux déterminants. Bull Sci Math 17:240–246
Google Scholar
Liu J, Xing J, Ghang T, Zhiya MA, Liu H (2008) Optimization of nutritional conditions for nattokinase production by Bacillus natto NLSSE using statistical experimental methods. Process Biochem 40(8):2757–2762
Article Google Scholar
Mooney CZ, Duval RD (1993) Bootstrapping: a non-parametric approach to statistical Inference. Sage Publications, Newbury Park
Book Google Scholar
Plackett RL, Burman JP (1946) The design of optimum multifactorial experiments. Biometrika 33(4):305–325
Article Google Scholar
Schmidt FR (2005) Optimization and scale up of industrial fermentation processes. Appl Microbiol Biotechnol 68:425–435
Article CAS Google Scholar
Singh A, Majumder A, Goyal A (2008) Artificial intelligence based optimization of exocellular glucansucrase production from Leuconostoc dextranicum NRRL B-1146. Bioresour Technol 99:8201–8206
Article CAS Google Scholar
Wang Q, Garrity GM, Tiedje JM, Cole JR (2007) Naïve Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy. Appl Environ Microbiol 73(16):5261–5267
Article CAS Google Scholar

Download references

Acknowledgments

This study was supported by grants from the ‘‘Tunisian Ministry of Higher Education, Scientific Research and Technology’’.

Author information

Authors and Affiliations

Centre of Biotechnology of Sfax, 3038, Sfax, Tunisia
Karim Ennouri, Rayda Ben Ayed, Fathi Hertelli & Hichem Azzouz
OnAir s.r.l.-26, Via Carlo Barabino, Genoa, Italy
Maura Mazzarello & Ennio Ottaviani

Authors

Karim Ennouri
View author publications
You can also search for this author in PubMed Google Scholar
Rayda Ben Ayed
View author publications
You can also search for this author in PubMed Google Scholar
Maura Mazzarello
View author publications
You can also search for this author in PubMed Google Scholar
Ennio Ottaviani
View author publications
You can also search for this author in PubMed Google Scholar
Fathi Hertelli
View author publications
You can also search for this author in PubMed Google Scholar
Hichem Azzouz
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Karim Ennouri.

Ethics declarations

Conflict of interest

The authors do not declare any conflict of interest.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Reprints and permissions

About this article

Cite this article

Ennouri, K., Ben Ayed, R., Mazzarello, M. et al. Classical and Bayesian predictions applied to Bacillus toxin production. 3 Biotech 6, 206 (2016). https://doi.org/10.1007/s13205-016-0527-2

Download citation

Received: 22 August 2016
Accepted: 19 September 2016
Published: 26 September 2016
DOI: https://doi.org/10.1007/s13205-016-0527-2

Keywords

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Classical and Bayesian predictions applied to Bacillus toxin production

Abstract

Similar content being viewed by others

Nutritional Requirements to Improve Delta-Endotoxins Production of Bacillus thuringiensis var. kurstaki Using Mixed Designs Modelling

Multiple linear regression and artificial neural networks for delta-endotoxin and protease yields modelling of Bacillus thuringiensis

BayFesian inference and model choice for Holling’s disc equation: a case study on an insect predator-prey system

Introduction

Materials and methods

Bacillus thuringiensis strain

Microorganism and cultivation media

Culture conditions

Determination of delta-endotoxins concentration

Statistical design

Bayesian estimation

Bootstrap method

Results and discussion

Statistical design

Conclusion

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Conflict of interest

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Classical and Bayesian predictions applied to Bacillus toxin production

Abstract

Similar content being viewed by others

Nutritional Requirements to Improve Delta-Endotoxins Production of Bacillus thuringiensis var. kurstaki Using Mixed Designs Modelling

Multiple linear regression and artificial neural networks for delta-endotoxin and protease yields modelling of Bacillus thuringiensis

BayFesian inference and model choice for Holling’s disc equation: a case study on an insect predator-prey system

Introduction

Materials and methods

Bacillus thuringiensis strain

Microorganism and cultivation media

Culture conditions

Determination of delta-endotoxins concentration

Statistical design

Bayesian estimation

Bootstrap method

Results and discussion

Statistical design

Conclusion

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Conflict of interest

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation