Abstract
This work describes a multivariate monitoring and control concept for bioprocesses based on historical process data. The concept is demonstrated for a Saccharomyces cerevisiae (baker’s yeast) fermentation process executed in a small-scale bioreactor equipped with common probes for analyzing the broth and off-gases. Data from “in-control” fermentation processes were evaluated by principal component analysis to define confidence limits for subsequent fermentations. A violation of these limits indicated that a process had to be classified as “out-of-control”. Fault diagnosis was provided by the components of the squared prediction error, which can also be used to determine appropriate counteractions, e.g. via an expert system control strategy as described in this study. The sensitivity of the fault diagnosis was demonstrated using various erroneous runs. The duration of bioprocesses can vary considerably, which complicates the definition of time-dependent control limits. Therefore, this study uses a three-component partial least squares regression model to quantify the current batch maturity during the process. This maturity is then used to reference current data to the appropriate historical data and the assigned control limits.
Notes
\(X_{res} =\frac{1}{J-A}\mathop {\sum }\limits _{j=1}^J \left( {x_j -{\widehat{x}}_{j,A}} \right) ^{2}\) The X-residual is a similar expression found in the literature; the only difference is that it is weighted using A, the number of principal components used in the model. \(X_{res}\) increases with A, and vice versa.
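As a rough illustration of the expression above, the X-residual of a single standardized observation could be computed as follows. This is a minimal sketch that assumes a PCA model whose A loading vectors are orthonormal and stored column-wise; the function name `spe_and_x_residual` and the toy numbers are hypothetical and not taken from the study. The per-variable squared residuals it returns are the components of the squared prediction error that the abstract mentions as the basis for fault diagnosis.

```python
import numpy as np

def spe_and_x_residual(x, loadings):
    """Return (SPE, X_res, per-variable contributions) for one observation.

    x        : standardized observation, length J
    loadings : J x A matrix of orthonormal PCA loading vectors (assumed)
    """
    x = np.asarray(x, dtype=float)
    P = np.asarray(loadings, dtype=float)
    J, A = P.shape
    if x.shape != (J,):
        raise ValueError("x must have one entry per variable (length J)")
    if A >= J:
        raise ValueError("X_res is only defined for A < J")

    x_hat = P @ (P.T @ x)               # reconstruction from A components
    contributions = (x - x_hat) ** 2    # squared residual per variable
    spe = float(contributions.sum())    # squared prediction error
    x_res = spe / (J - A)               # weighted by 1 / (J - A)
    return spe, x_res, contributions

# Toy example: J = 3 variables, A = 1 component along the first axis.
P_toy = np.array([[1.0], [0.0], [0.0]])
spe, x_res, contrib = spe_and_x_residual([1.0, 2.0, 2.0], P_toy)
print(spe, x_res, contrib)
```

The variable with the largest entry in `contrib` is the one that is reconstructed worst by the model, which is the usual starting point for a contribution-based diagnosis.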
Abbreviations
- n: Index of the batch
- k: Index of the maturity
- j: Index of the variable
- a: Index of PLS or PCA dimensions
- A: Number of considered PLS or PCA dimensions
- \(\varvec{X}\): Standardized observations for PLS model(s) (no. of rows = no. of observations; no. of columns = no. of variables)
- \(\varvec{Y}\): Standardized response for PLS model(s), i.e., the maturity
- \(\widehat{{\varvec{Y}}}\): (Non-standardized) response for PLS model(s), i.e., the maturity
- \(\varvec{E}\): Residual matrix for \(\varvec{X}\) in PLS
- \(\varvec{F}\): Residual matrix for \({\varvec{Y}}\) in PLS
- \(t_a \): Score values of the \(a^\mathrm{th}\) PC of the PLS model(s)
- \(\varvec{p}_{\mathbf {a}}, \varvec{q}_{\mathbf {a}}\): (Normalized) loading vectors of the \(a^\mathrm{th}\) PC of the PLS model(s)
- \(\varvec{Y}_{PLS i} \): Maturity predicted by an \(i^\mathrm{th}\) PLS model using 90 % of the observations in \(\varvec{X}\)
- \(\mathop {\varvec{X}}\limits ^{\prime } \): Standardized observations for PCA model(s) (no. of rows = no. of observations; no. of columns = no. of variables)
- \(\mathop {\varvec{x}}\limits ^{\prime }\): Single standardized observation
- \(\mathop {\varvec{E}}\limits ^{\prime } \): Residual matrix for \(\mathop {\varvec{X}}\limits ^{\prime }\) in PCA
- \({\mathop {\varvec{p}}\limits ^{\prime }}_a \): (Normalized) loading vectors of the \(a^\mathrm{th}\) PC of the PCA model
- \(\overline{{\mathop {t}\limits ^{\prime }}_{a}}\): Score values of the \(a^\mathrm{th}\) principal component averaged over the in-control batches
- mat: Batch maturity
References
Albert, S., & Kinley, R. D. (2001). Multivariate statistical monitoring of batch processes: An industrial case study of fermentation supervision. Trends in Biotechnology, 19(2), 53–62. http://www.ncbi.nlm.nih.gov/pubmed/11164554.
Alford, J. S. (2006). Bioprocess control: Advances and challenges. Computers & Chemical Engineering, 30(10–12), 1464–1475. doi:10.1016/j.compchemeng.2006.05.039.
Alt, F. B., & Smith, N. D. (1988). Quality control and reliability. Handbook of statistics, vol. 7. Amsterdam: Elsevier. doi:10.1016/S0169-7161(88)07019-1.
Chiang, L. H., Leardi, R., Pell, R. J., & Seasholtz, M. B. (2006). Industrial experiences with multivariate statistical analysis of batch process data. Chemometrics and Intelligent Laboratory Systems, 81(2), 109–119. doi:10.1016/j.chemolab.2005.10.006.
Cimander, C., Bachinger, T., & Mandenius, C.-F. (2003). Integration of distributed multi-analyzer monitoring and control in bioprocessing based on a real-time expert system. Journal of Biotechnology, 103(3), 237–248. doi:10.1016/S0168-1656(03)00121-4.
Doan, X.-T., & Srinivasan, R. (2008). Online monitoring of multi-phase batch processes using phase-based multivariate statistical process control. Computers & Chemical Engineering, 32(1–2), 230–243. doi:10.1016/j.compchemeng.2007.05.010.
FDA. (2004). Guidance for industry: PAT—A framework for innovative pharmaceutical development, manufacturing, and quality assurance. Pharmaceutical CGMPs.
Ferreira, A. P., Lopes, J. A., & Menezes, J. C. (2007). Study of the application of multiway multivariate techniques to model data from an industrial fermentation process. Analytica Chimica Acta, 595(1–2), 120–127. doi:10.1016/j.aca.2007.05.007.
Fransson, M., & Folestad, S. (2006). Real-time alignment of batch process data using COW for on-line process monitoring. Chemometrics and Intelligent Laboratory Systems, 84(1–2), 56–61. doi:10.1016/j.chemolab.2006.04.020.
Gao, W. J., Jane, H. J., Lin, K. T. L., & Liao, B. Q. (2010). Influence of elevated pH shocks on the performance of a submerged anaerobic membrane bioreactor. Process Biochemistry, 45(8), 1279–1287. doi:10.1016/j.procbio.2010.04.018.
Glassey, J., Montague, G., & Mohan, P. (2000). Issues in the development of an industrial bioprocess advisory system. Trends in Biotechnology, 18(4), 136–41. http://www.ncbi.nlm.nih.gov/pubmed/10740258.
González-Martínez, J. M., Ferrer, A., & Westerhuis, J. A. (2011). Real-time synchronization of batch trajectories for on-line multivariate statistical process control using dynamic time warping. Chemometrics and Intelligent Laboratory Systems, 105(2), 195–206. doi:10.1016/j.chemolab.2011.01.003.
Gregersen, L., & Jørgensen, S. B. (1999). Supervision of fed-batch fermentations. Chemical Engineering Journal, 75(1), 69–76. doi:10.1016/S1385-8947(99)00018-2.
Honda, H., & Kobayashi, T. (2004). Industrial application of fuzzy control in bioprocesses. Advances in Biochemical Engineering/biotechnology, 87, 151–71. http://www.ncbi.nlm.nih.gov/pubmed/15217106.
Ijima, H., Kakeya, Y., Ogata, T., & Sakai, T. (2009). Development of a practical small-scale circulation bioreactor and application to a drug metabolism simulator. Biochemical Engineering Journal, 44(2–3), 292–296. doi:10.1016/j.bej.2008.12.015.
International Conference on Harmonization (2004). Guidance for Industry: Q8(R2) Pharmaceutical Development.
International Conference on Harmonization (2009). Guidance for Industry: Q9 Quality Risk Management.
Jaumot, J., Igne, B., Anderson, C. A., Drennen, J. K., & de Juan, A. (2013). Blending process modeling and control by multivariate curve resolution. Talanta, 117(117C), 492–504. doi:10.1016/j.talanta.2013.09.037.
Jiménez-González, C., & Woodley, J. M. (2010). Bioprocesses: Modeling needs for process evaluation and sustainability assessment. Computers & Chemical Engineering, 34(7), 1009–1017. doi:10.1016/j.compchemeng.2010.03.010.
Jørgensen, P., Pedersen, J. G., Jensen, E. P., & Esbensen, K. H. (2004). On-line batch fermentation process monitoring (NIR)-introducing ‘biological process time’. Journal of Chemometrics, 18(2), 81–91. doi:10.1002/cem.850.
Kandel, T. P., Gislum, R., Jørgensen, U., & Lærke, P. E. (2013). Prediction of biogas yield and its kinetics in reed canary grass using near infrared reflectance spectroscopy and chemometrics. Bioresource Technology, 146(October), 282–287. doi:10.1016/j.biortech.2013.07.092.
Karadag, D., & Puhakka, J. A. (2010). Effect of changing temperature on anaerobic hydrogen production and microbial community composition in an open-mixed culture bioreactor. International Journal of Hydrogen Energy, 35(20), 10954–10959. doi:10.1016/j.ijhydene.2010.07.070.
Kourti, T. (2006). Process analytical technology beyond real-time analyzers: The role of multivariate analysis. Critical Reviews in Analytical Chemistry, 36(3–4), 257–278. doi:10.1080/10408340600969957.
Kourti, T., Nomikos, P., & MacGregor, J. F. (1995). Analysis, monitoring and fault diagnosis of batch processes using multiblock and multiway PLS. Journal of Process Control, 5(4), 277–284. doi:10.1016/0959-1524(95)00019-M.
Kresta, J. V., Macgregor, J. F., & Marlin, T. E. (1991). Multivariate statistical monitoring of process operating performance. The Canadian Journal of Chemical Engineering, 69(1), 35–47. doi:10.1002/cjce.5450690105.
Lee, D. S., & Vanrolleghem, P. A. (2003). Monitoring of a sequencing batch reactor using adaptive multiblock principal component analysis. Biotechnology and Bioengineering, 82(4), 489–497. doi:10.1002/bit.10589.
Lennox, B., Montague, G. A., Hiden, H. G., Kornfeld, G., & Goulding, P. R. (2001). Process monitoring of an industrial fed-batch fermentation. Biotechnology and Bioengineering, 74(2), 125–35. doi:10.1002/bit.1102.
Lopes, J. A., Menezes, J. C., Westerhuis, J. A., & Smilde, A. K. (2002). Multiblock PLS analysis of an industrial pharmaceutical process. Biotechnology and Bioengineering, 80(4), 419–427. doi:10.1002/bit.10382.
Luttmann, R., Borchert, S.-O., Mueller, C., Loegering, K., Aupert, F., Weyand, S., et al. (2015). Sequential/parallel production of potential malaria vaccines—A direct way from single batch to quasi-continuous integrated production. Journal of Biotechnology, 213(February), 83–96. doi:10.1016/j.jbiotec.2015.02.022.
MacGregor, J. F., Jaeckle, C., Kiparissides, C., & Koutoudi, M. (1994). Process monitoring and diagnosis by multiblock PLS methods. AIChE Journal, 40(5), 826–838. doi:10.1002/aic.690400509.
MacGregor, J. F., & Kourti, T. (1995). Statistical process control of multivariate processes. Control Engineering Practice, 3(3), 403–414. doi:10.1016/0967-0661(95)00014-L.
Martin, E. B., Morris, A. J., & Zhang, J. (1996). Process performance monitoring using multivariate statistical process control. IEE Proceedings—Control Theory and Applications, 143(2), 132–144. doi:10.1049/ip-cta:19960321.
Menezes, J. C. (2011). Comprehensive biotechnology. Amsterdam: Elsevier. doi:10.1016/B978-0-08-088504-9.00205-1.
Nomikos, P., & MacGregor, J. F. (1995a). Multivariate SPC charts for monitoring batch processes. Technometrics, 37(1), 41–59. doi:10.1080/00401706.1995.10485888.
Nomikos, P., & MacGregor, J. F. (1995b). Multi-way partial least squares in monitoring batch processes. Chemometrics and Intelligent Laboratory Systems, 30(1), 97–108. doi:10.1016/0169-7439(95)00043-7.
Rathore, A. S. (2014). QbD/PAT for bioprocessing: Moving from theory to implementation. Current Opinion in Chemical Engineering, 6, 1–8. doi:10.1016/j.coche.2014.05.006.
Ryan, T. P. (2011). Statistical methods for quality improvement (3rd ed.). New Jersey: Wiley.
Sarraguça, M. C., Ribeiro, P. R. S., Santos, A. O., Silva, M. C. D., & Lopes, J. A. (2014). A PAT approach for the on-line monitoring of pharmaceutical co-crystals formation with near infrared spectroscopy. International Journal of Pharmaceutics, 471(1–2), 478–484. doi:10.1016/j.ijpharm.2014.06.003.
Shewhart, W. A. (1986). Statistical method from the viewpoint of quality control. Edited by W. Edwards Deming. Dover.
Varmuza, K., & Filzmoser, P. (2009). Introduction to multivariate statistical analysis in chemometrics. Boca Raton: CRC Press/Taylor & Francis.
Vojinović, V., Cabral, J. M. S., & Fonseca, L. P. (2006). Real-time bioprocess monitoring. Sensors and Actuators B: Chemical, 114(2), 1083–1091. doi:10.1016/j.snb.2005.07.059.
Wold, S., Kettaneh, N., Fridén, H., & Holmberg, A. (1998). Modelling and diagnostics of batch processes and analogous kinetic experiments. Chemometrics and Intelligent Laboratory Systems, 44(1–2), 331–340. doi:10.1016/S0169-7439(98)00162-2.
Wold, S., Kettaneh-Wold, N., MacGregor, J. F., & Dunn, K. G. (2009). Comprehensive chemometrics. Amsterdam: Elsevier. doi:10.1016/B978-044452701-1.00108-3.
Wold, S., Sjöström, M., & Eriksson, L. (2001). PLS-regression: A basic tool of chemometrics. Chemometrics and Intelligent Laboratory Systems, 58(2), 109–130. doi:10.1016/S0169-7439(01)00155-1.
Yang, W.-A. (2013). Monitoring and diagnosing of mean shifts in multivariate manufacturing processes using two-level selective ensemble of learning vector quantization neural networks. Journal of Intelligent Manufacturing, 26(4), 769–783. doi:10.1007/s10845-013-0833-z.
Zhu, D., Bai, J., & Yang, S. X. (2010). A multi-fault diagnosis method for sensor systems based on principle component analysis. Sensors (Basel, Switzerland), 10(1), 241–253. doi:10.3390/s100100241.
Acknowledgments
We would like to thank Johannes Österreicher and Johannes Scheiblauer for support in terms of sensor implementation and execution of fermentations.
Appendix: Statistically in-control/out-of-control
The selection of historical data for generating (M)SPC models is commonly referred to as Phase I. This phase is of utmost importance for obtaining statistical models that are sensitive to out-of-control events in future batches (Phase II).
First, it has to be defined how, and why, processed batches are classified as in-control. The answer depends on the chosen quality characteristics, i.e., on whether process histories (selected variables at various points in time) or simply product quality attributes are considered. Since the latter are commonly multivariate as well (e.g., yield, purity, metabolite concentrations), this appendix discusses retrospective testing of batch processes via multivariate analysis.
Assuming that \(\nu \) quality characteristics (e.g. \(\nu = \mathbf{JK}\) for process histories, obtained by unfolding the three-way data structure as shown in Fig. 1b) follow a \(\nu \)-variate normal distribution, Hotelling’s \(\hbox {T}^{2}\) statistic is applied. \({\upchi }^{2}\) statistics are used if the (co)variances are known or can be estimated accurately, e.g. if the number of observations is much higher than \(\nu \).
Hotelling’s \(\hbox {T}^{2}\) statistic for deviations from a known mean
\(\hbox {T}^{2}\) as defined in Eq. 10 is used to indicate if averaged quality characteristics deviate significantly from a known \(\upmu \), representing the process mean under stable conditions.
Here \(\upmu \) and \({\bar{x}} \) are \(\nu \) -dimensional vectors, \(S^{-1}\) is the inverse of the estimated covariance matrix and n is the number of observations (i.e., batches) averaged to obtain \({\bar{x}} \). When \(\upmu =\upmu _0 \) for the null hypothesis (e.g. when quality characteristics are assumed to fluctuate around known/expected values), \(T^{2}\) is distributed like:
where \(F_{({\nu , n-\nu })}\) refers to the F-distribution with \(\nu \) and \(n-\nu \) degrees of freedom. Hence, an upper control limit with a significance level \(\alpha \) can be defined as:
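The displayed equations of this subsection (Eq. 10 and the two expressions that follow it) are not reproduced in this text. Assuming they take the standard textbook form of Hotelling’s \(\hbox {T}^{2}\) for a known mean, which agrees with the symbols and degrees of freedom defined above, they would read as sketched below. The numbers 11 and 12 are inferred from the sequence and are an assumption.

```latex
% Eq. 10: T^2 for the mean of n observations against a known mean
T^{2} = n\,(\bar{x}-\upmu)^{\mathsf{T}}\, S^{-1}\, (\bar{x}-\upmu)

% Eq. 11 (number inferred): distribution under the null hypothesis \upmu = \upmu_0
T^{2} \sim \frac{\nu\,(n-1)}{n-\nu}\; F_{(\nu,\,n-\nu)}

% Eq. 12 (number inferred): upper control limit at significance level \alpha
\mathrm{UCL} = \frac{\nu\,(n-1)}{n-\nu}\; F_{\alpha,(\nu,\,n-\nu)}
```

Here \(F_{\alpha,(\nu,\,n-\nu)}\) denotes the upper \(\alpha \) quantile of the F-distribution, so the limit is only defined for \(n > \nu \).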
Hotelling’s \(\hbox {T}^{2}\) statistic for individual observations
As discussed in Ryan (2011), Eq. 13 is used to investigate \(i=1,2, \ldots , m\) individual multivariate observations.
The \(T_i^2 \) statistics represent the distance of (future) individual observations \(x_i \) from the mean vector \({\bar{x}}_m\) weighted by the covariance matrix \(S_m\). Here \({\bar{x}}_m \) and \(S_m\) are estimated from all m observations. For future observations, \(T_i^2 \) is distributed like:
and an upper control limit can be defined as:
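Eqs. 13–15 are likewise not reproduced in this text. Assuming the standard form for individual observations with mean and covariance estimated from m observations, as commonly given in the SPC literature, they would read as sketched below. The exact notation of the original equations is an assumption.

```latex
% Eq. 13: T^2 of an individual observation x_i
T_i^{2} = (x_i-\bar{x}_m)^{\mathsf{T}}\, S_m^{-1}\, (x_i-\bar{x}_m)

% Eq. 14: distribution for future observations (not used to estimate \bar{x}_m, S_m)
T_i^{2} \sim \frac{\nu\,(m+1)(m-1)}{m\,(m-\nu)}\; F_{(\nu,\,m-\nu)}

% Eq. 15: upper control limit at significance level \alpha
\mathrm{UCL} = \frac{\nu\,(m+1)(m-1)}{m\,(m-\nu)}\; F_{\alpha,(\nu,\,m-\nu)}
```

The prefactor tends to \(\nu \) as m grows, so the limit approaches the corresponding \({\upchi }^{2}\) quantile when many historical observations are available.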
PCA for retrospective testing of batches
\(T^{2}\) statistics cannot be computed if quality characteristics are highly correlated, as the covariance matrix becomes non-invertible. This issue can be avoided by applying PCA and basing the \(T^{2}\) statistics on the scores. Furthermore, the number of observations can be small compared to the number of quality characteristics, which requires data compression, e.g. via PCA.
Below, these principles are demonstrated using the fermentation runs discussed in this work (1–6) and two additional runs (7–8) which were not considered in Phase I.
To test whether the fermentations are statistically in-control, the three-way data structure is unfolded as shown in Fig. 1b, i.e. into N observations with \(\nu = \mathbf{JK}\) variables each. Afterwards, a PCA is executed to reduce the number of variables \(\nu \). PCA can provide an output even if \(\mathbf{N}< \nu \). In that case, PCs with an index higher than N do not contain valuable information. Furthermore, the robustness of the PCA results needs to be tested, e.g. via bootstrapping, before scores are used in \(T^{2}\) statistics. Since only 6–8 batches were considered here (i.e., \(\mathbf{N}=8\)), the number of variables had to be reduced to obtain consistent results. Therefore, only \({og-CO}_{2}\) and \({pO}_{2}\) (see Table 1) were used (i.e., \(\mathbf{J}=2\)) at 100 time bins (i.e., \(\mathbf{K}=100\)).
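The unfolding and score-based \(T^{2}\) computation described above can be sketched as follows. This is an illustration under stated assumptions, not the study's implementation: the data are random numbers standing in for six in-control batches, the column ordering produced by `reshape` may differ from the unfolding in Fig. 1b, and all function names are hypothetical.

```python
import numpy as np

def unfold_batchwise(data):
    """Unfold an (N batches, J variables, K time bins) array to N x (J*K)."""
    n_batches, n_vars, n_bins = data.shape
    return data.reshape(n_batches, n_vars * n_bins)

def fit_pca(X, n_components):
    """Autoscale the columns of X and fit a PCA via SVD."""
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0                      # guard against constant columns
    Z = (X - mean) / std
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    loadings = Vt[:n_components].T           # (J*K) x A, orthonormal columns
    scores = Z @ loadings                    # N x A
    return {"mean": mean, "std": std, "loadings": loadings,
            "score_var": scores.var(axis=0, ddof=1)}

def t2_from_model(X_new, model):
    """Hotelling T^2 per row of X_new, based on the PCA scores."""
    Z = (X_new - model["mean"]) / model["std"]
    scores = Z @ model["loadings"]
    # PCA scores are uncorrelated, so their covariance matrix is diagonal.
    return np.sum(scores ** 2 / model["score_var"], axis=1)

rng = np.random.default_rng(0)
in_control = rng.normal(size=(6, 2, 100))          # N = 6, J = 2, K = 100
candidates = rng.normal(loc=3.0, size=(2, 2, 100)) # two hypothetical extra runs

X = unfold_batchwise(in_control)
model = fit_pca(X, n_components=2)
t2_train = t2_from_model(X, model)
t2_new = t2_from_model(unfold_batchwise(candidates), model)
print(t2_train, t2_new)
```

Because the model is fitted on the in-control batches only, additional batches are projected onto the existing loadings rather than refitted, which mirrors the idea of testing runs that were not part of Phase I.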
Score plots are a useful tool to screen batches for abnormalities. Figure 14 shows the projections of batches 1–8 and 1–6 in the \(\hbox {t}_{1}-\hbox {t}_{2}\) plane. This score plot clearly indicates that fermentations 7–8 differ from fermentations 1–6. PCA results were shown to be robust for the first three PCs, but only the first two were used in the \(T^{2}\) statistics for individual observations (Eqs. 13–15), since they already explain 78 % of the variance. Figure 15 shows the \(T^{2}\) values obtained when considering only fermentations 1–6 in the PCA. This plot again quantifies the discrepancy between the fermentations and shows why fermentations 7–8 were not considered in Phase I. Such \(T^{2}\) charts can be used to test retrospectively (exploratory data analysis) whether the quality characteristics of a processed batch deviate significantly from historical data (i.e., Phase II chart).
The number of observations used here is rather low for testing whether the statistical variations among fermentations are significant. If the number of observations and the number of variables are of the same order, upper control limits might not be estimated accurately. This is illustrated in Fig. 16, which shows upper control limits calculated via Eq. 15 and \(T^{2}\) for future observations.
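The dependence of the upper control limit on the number of historical observations can be explored with a short sketch. It assumes that Eq. 15 has the standard form for future individual observations, \(\nu (m+1)(m-1)/(m(m-\nu ))\) times the upper \(\alpha \) quantile of \(F_{(\nu , m-\nu )}\); the function name `t2_ucl_future` and the chosen values of m are hypothetical.

```python
from scipy.stats import f

def t2_ucl_future(m, nu, alpha=0.05):
    """Upper control limit for T^2 of a future individual observation.

    m     : number of historical observations used for mean and covariance
    nu    : number of variables (here: retained principal components)
    alpha : significance level
    Assumes the standard form of the limit; requires m > nu.
    """
    if m <= nu:
        raise ValueError("the limit is only defined for m > nu")
    factor = nu * (m + 1) * (m - 1) / (m * (m - nu))
    return factor * f.ppf(1.0 - alpha, nu, m - nu)

# Two retained PCs (nu = 2) and an increasing number of batches m.
for m in (6, 8, 20, 100):
    print(m, round(t2_ucl_future(m, nu=2), 2))
```

With only a handful of batches the limit is several times larger than for a large historical data set, so only very pronounced deviations would be flagged.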
Cite this article
Besenhard, M.O., Scheibelhofer, O., François, K. et al. A multivariate process monitoring strategy and control concept for a small-scale fermenter in a PAT environment. J Intell Manuf 29, 1501–1514 (2018). https://doi.org/10.1007/s10845-015-1192-8
DOI: https://doi.org/10.1007/s10845-015-1192-8