Intelligent modelling of bioprocesses: a comparison of structured and unstructured approaches

Benjamin J. Hodgson, Christopher N. Taylor, Misti Ushio, J. R. Leigh, Tatiana Kalganova, Frank Baganz

Original paper

First Online: 18 September 2004
Received: 25 November 2002
Accepted: 13 July 2004
DOI: 10.1007/s00449-004-0382-0

Cite this article as: Hodgson, B.J., Taylor, C.N., Ushio, M. et al. Bioprocess Biosyst Eng (2004) 26: 353. doi:10.1007/s00449-004-0382-0
Abstract

This contribution moves in the direction of answering some general questions about the most effective and useful ways of modelling bioprocesses. We investigate the characteristics of models that are good at extrapolating. We trained three fully predictive models with different representational structures (differential equations, differential equations with inheritance of rates, and a network of reactions) on Saccharopolyspora erythraea shake flask fermentation data using genetic programming. The models were then tested on unseen data outside the range of the training data and the resulting performances were compared. It was found that constrained models with mathematical forms analogous to internal mass balancing and stoichiometric relations were superior to flexible unconstrained models, even though no a priori knowledge of this fermentation was used.

Keywords: Genetic programming, Predictive modelling, Fermentation development

List of symbols

N _{vars} Number of variables

N _{batches} Number of batches

N _{nodes} Number of nodes making up an individual

t _{F} Time of last sample

C _{i} , C _{j} , C _{k} Hidden variables used internally by models

C _{Gluc} , C _{Nitrate} , C _{Biomass} , C _{Red} Predicted values of measured variables, glucose, nitrate, biomass, red pigment, respectively (g/l)

M _{Gluc} , M _{Nitrate} , M _{Biomass} , M _{Red} Measured values of glucose, nitrate, biomass, red pigment, respectively (g/l)

r _{batch} Pearson correlation coefficient between measured and predicted values at a given time point varying with respect to initial conditions

r _{time} Pearson correlation coefficient between measured and predicted values with respect to time

R ^{2} Root mean squared error

a , b , c , d Floating point weights

ɛ _{s} Scaled error of model on batch

ɛ _{av} Error between the average profiles of training data and the actual value on that batch
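
The two correlation coefficients in the symbol list differ only in the axis along which the measured and predicted values are paired. The following is a minimal sketch of that distinction, not the authors' implementation. It assumes the data are stored as nested lists indexed [batch][time]; the function names and the glucose numbers are hypothetical and are not taken from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def r_time(measured, predicted, batch):
    """Correlation along the time axis for one batch (assumed layout [batch][time])."""
    return pearson(measured[batch], predicted[batch])


def r_batch(measured, predicted, time_index):
    """Correlation across batches (different initial conditions) at one time point."""
    m = [profile[time_index] for profile in measured]
    c = [profile[time_index] for profile in predicted]
    return pearson(m, c)


# Hypothetical glucose values (g/l): 3 batches x 4 sample times, for illustration only.
M_gluc = [[20.0, 15.0, 8.0, 2.0],
          [30.0, 24.0, 15.0, 6.0],
          [40.0, 33.0, 22.0, 10.0]]
C_gluc = [[19.0, 15.5, 9.0, 2.5],
          [31.0, 23.0, 14.0, 7.0],
          [39.0, 34.0, 23.0, 9.0]]

if __name__ == "__main__":
    print("r_time, batch 0:", r_time(M_gluc, C_gluc, 0))
    print("r_batch, first sample:", r_batch(M_gluc, C_gluc, 0))
```

Under this layout, r_time measures how well a model follows the shape of a single fermentation profile, whereas r_batch measures how well it tracks the effect of changing initial conditions at a fixed sample time.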

Paper presented at the international conference on trends in monitoring and control of life science applications, 7–8 October 2002, Lyngby, Denmark.

Authors and Affiliations

Benjamin J. Hodgson, Christopher N. Taylor, Misti Ushio, J. R. Leigh, Tatiana Kalganova, Frank Baganz

1. The Advanced Centre for Biochemical Engineering, University College London, London, UK
2. Lilly Systems Biology Pte Ltd, Singapore
3. Control Engineering Centre, Department of Electrical Engineering and Electronics, Brunel University, Uxbridge, UK
4. Electrical and Computing Engineering Department, Brunel University, Uxbridge, UK