Abstract
The error of single step-ahead output prediction is the information traditionally used to correct the state estimate while exploiting the new measurement of the system output. However, its dynamics and statistical properties can be further studied and exploited in other ways. It is known that in the case of suboptimal state estimation, this output prediction error forms a correlated sequence, hence it can be effectively predicted in real time. Such a suboptimal scenario is typical in applications where the process noise model is not known or it is uncertain. Therefore, the paper deals with the problems of analytical and empirical modeling, identification, and prediction of the output error of the suboptimal state estimator for the sake of improving the output prediction accuracy and ultimately the performance of the model predictive control. The improvements are validated on an empirical model of type 1 diabetes within an in-silico experiment focused on glycemia prediction and implementation of the MPC-based artificial pancreas.
1 Introduction
In many applications and problems, like the model predictive control of glycemia in diabetic subjects, there is a need to estimate the system state based solely on noisy measurements of the system output. Traditional recursive state estimators/observers, such as the Kalman filter [1], are algorithms based on the prediction and correction of the state estimate by the new output measurement. It means that the state estimate is corrected (innovated) according to the single step-ahead output prediction error produced by the system model.
It is known from the theory of Kalman filtering that, for an optimal state estimator, the sequence of the output error of the state observer (OESO), also called the innovation sequence, has the properties of Gaussian white noise. However, in the case of suboptimal state estimation, this innovation sequence is correlated [2, 3] and can therefore be predicted in real time.
Why, then, should one consider suboptimal state estimation instead of the optimal one? Suboptimal state estimation is not a matter of choice, but rather an inevitable consequence of unknown or poorly estimated parameters of the process noise model. The design of the Kalman filter and its optimality rely on exact knowledge of the process noise model, whose parameters are highly uncertain in many applications; often, the entries of the process noise covariance matrix are used simply as tuning parameters. If there is a mismatch between the noise model and the actual statistical properties of the system, the OESO forms a correlated sequence.
The main practical motivation for studying the dynamics of the OESO is to try to predict it in real time and then correct the predictions of the output variable accordingly, yet the ultimate aim is to involve this prediction in the model predictive control. Therefore, the outlined strategy can be seen as a relatively feasible and cheap way to improve prediction and control performance by effectively compensating for the effect of suboptimal state estimation.
Concerning the target application domain outlined in the title, performing highly accurate predictions yields better forecasting of severe hyper- and hypo-glycemia states, as these are the major risks linked to diabetes and its treatment [4, 5]. Another application of the proposed strategy is possible within an implementation of the model predictive control-based artificial pancreas [6, 7] to control glycemia in subjects with type 1 diabetes by automating the insulin dosing.
The rationale for choosing this application domain to demonstrate the effectiveness of the proposed strategy is supported by the fact that in most available studies, e.g. in [8,9,10,11,12,13,14,15,16,17], the crucial entries of the covariance matrix of the process noise are considered tuning parameters in the Kalman filter design. Therefore, it can be concluded that the state estimator works suboptimally in such scenarios.
It should be mentioned for completeness that there also exist sophisticated, dedicated methods [18,19,20,21,22] for estimating the noise models, which can potentially eliminate the problem of suboptimal state estimation. Unfortunately, these covariance matrix estimation methods have limited practical applicability, as they often provide biased estimates and typically require very large datasets, which can be infeasible in many applications.
In this paper, we provide important insights into the dynamics of the OESO from the analytical point of view, but also from the practical data-driven perspective where reduced models are considered. The primary motivation for using the reduced structures, such as the autoregressive and the moving average model, rather than the full analytical model, is their feasible prediction and identifiability from the experimentally obtained OESO sequence.
The paper is divided into the following sections: Sect. 2 introduces the basic preliminaries, formulation of the stochastic state-space model, and equations describing the conventional state observer. In Sect. 3, the full analytical model of the OESO is derived to provide some theoretical background. Section 4 considers the reduced autoregressive model and comprises the formulation of the corresponding identification problem and the predictive equation. In Sect. 5, the moving average model is studied in a similar way. Section 6 covers the idea of using the identified models to enhance the output prediction accuracy and their inclusion in the model predictive control. The setup of the experiment aimed at prediction and model predictive control of glycemia in subjects with type 1 diabetes is outlined in Sect. 7, while its results are discussed in Sect. 8.
It should be noted in passing that the stochastic nature of glycemia dynamics does not necessarily have to be modeled in the state space by the process noise and the measurement noise, since basic input–output transfer function models like the autoregressive-exogenous, autoregressive moving average with exogenous inputs, and Box–Jenkins model can also be used as reported in [23,24,25,26]. The stochastic part of these input–output models is usually estimated in one step together with the deterministic submodels as a result of the system identification procedure from the experimental data, so there is no need to worry about the optimality of the state estimation.
An application of the unconstrained model predictive control with the state-space model along with the state estimation based on the Kalman filter was presented in study [27]. A similar control strategy was reported in [28, 29], but due to the input–output problem formulation the state estimator was not required. As another example of similar artificial pancreas scheme, in [30] the Kalman filter was used to estimate the state for the linear model predictive control with disturbance rejection.
In our recent work [31], a novel optimal state estimator was proposed as an alternative to the Kalman filter. However, this algorithm was not based on the traditional recursive correction of the state estimate by the OESO, but it used the generalized least squares formulation of the problem. Also in this case, the optimality of the state estimate depended on the exact knowledge of the process noise model.
From the perspective of the unique contributions of this paper, it is important to remark that neither the studies referenced above nor the latest comprehensive survey papers [32,33,34] propose or discuss the strategy of predicting the OESO and compensating for the suboptimality of the state estimation. It can also be concluded that most authors relied on suboptimal state estimation with ad hoc tuning of the process noise model, leaving significant headroom for improving the control performance by targeting this problem.
Unlike the conventional MPC-based artificial pancreas that typically uses the suboptimal state estimator, which normally results in a correlated OESO sequence, we propose a new strategy to effectively compensate for this effect. The proposed modification involves the prediction of the OESO to directly correct the prediction of the system free response. It is worth noting that this modification is easy to embed in already existing MPC schemes [32,33,34], adding little computational cost and requiring no additional hardware modifications. In other words, the proposed OESO models and the prediction/correction strategy can be retrofitted to any advanced MPC-based artificial pancreas design that utilizes the state estimator, at the expense of a relatively straightforward structural modification. Note that the other features of the artificial pancreas, such as the safety algorithms and constraints, do not directly interact with the proposed modification.
2 Model structure and preliminaries
The general discrete-time stochastic state-space empirical model of glycemia dynamics in subjects with type 1 diabetes is postulated as [31]
where \(k \in {\mathbb {N}}\) is the current sample, the output y [mmol/l] stands for the deviation of glycemia from its steady-state value, the input u [U/min] denotes the deviation of the insulin administration rate from the basal rate, and d [g/min] represents the carbohydrate intake rate input. The state vector of this nth order model is denoted \(x ~ \left[ n \! \times \! 1 \right] \), \(w ~ \left[ n \! \times \! 1 \right] \) is the process noise vector and the zero-mean uncorrelated random process \(v \sim {\mathcal {N}}\left( 0,{\mathcal {R}}\right) \) represents the measurement noise of the glucose monitoring device [35].
The parameters of model (1) include the state-transition matrix \(A ~ \left[ n \! \times \! n \right] \), the input matrix \(B ~ \left[ n \! \times \! 2 \right] \), and the output vector \(C ~ \left[ 1 \! \times \! n \right] \). The state-transition matrix A consists of the submatrices \(A^{u}\), \(A^{d}\) and the zero matrices \(\varvec{0}\) of the conforming dimensions as
where matrices \(A^{u} ~ \left[ n_u \! \times \! n_u \right] \) and \(A^{d} ~ \left[ n_d \! \times \! n_d \right] \) are in the canonical form and comprise the model coefficients \(a^{\frac{u}{d}}\) such that
The input matrix B is simply
where \(\varvec{0}\) are the zero vectors of conforming dimensions and \(B^{u} ~ \left[ n_u \! \times \! 1 \right] \), \(B^{d} ~ \left[ n_d \! \times \! 1 \right] \) are equal to
The output vector C gets
where \(C^{u} ~ \left[ 1 \! \times \! n_u \right] \) and \(C^{d} ~ \left[ 1 \! \times \! n_d \right] \) will comprise the model coefficients \(c^{\frac{u}{d}}\) as
The state vector x also holds the canonical form
where \(n_u\) and \(n_d\) are the orders of the corresponding submodels, so the overall model order is \(n\!=\!n_u\!+\!n_d\). The state variables \(x^u\), \(x^d\) represent the partial effects of insulin administration and carbohydrate intake, respectively.
Consider that the process noise w in model (1) represents the effect of input uncertainties, so one can write
The first random input \(\gamma _{(k)}\!\sim \!{\mathcal {N}}\left( 0,\sigma ^2_\gamma \right) \) reflects various unmeasurable disturbances, including physiological changes in insulin absorption and action. The second random input \(\delta _{(k)}\!\sim \!{\mathcal {N}}\left( 0,\sigma ^2_\delta \right) \) represents the uncertainty of the meal announcing induced by the patient.
Since all stochastic terms in (1) were defined as zero-mean uncorrelated stationary random processes, the covariance matrix \({\mathcal {Q}} ~ \left[ n \! \times \! n \right] \) of the process noise (9) and the variance \({\mathcal {R}}\) of the measurement noise are equal to
2.1 State observer
The state vector x of model (1) is usually estimated using the recursive state observer
where \({\hat{x}} ~ [n \! \times \! 1]\) is the estimated state and \(K ~ [n \! \times \! 1]\) is the gain vector, which is the subject of the observer design. The design is usually based on the optimal Kalman approach [1, 36] in the case of stochastic system assumption, or using the pole-placement method if the system is deterministic.
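As an illustration of the Kalman-based design mentioned above, the following minimal sketch iterates the Riccati recursion of the one-step-ahead predictor until the gain settles. It assumes a single-output model, so the innovation variance is a scalar and no matrix inversion is needed. The function name `kalman_gain`, the fixed iteration count, and the list-based matrix representation are illustrative assumptions, not part of the original design procedure.

```python
def kalman_gain(A, C, Q, R, iters=500):
    """Approximate the steady-state predictor gain K for a single-output model.

    A, Q: n x n lists of lists; C: list of n floats (row vector); R: scalar variance.
    Assumption: a fixed number of Riccati iterations is enough for convergence.
    """
    n = len(A)
    P = [row[:] for row in Q]  # initial prediction-error covariance
    K = [0.0] * n
    for _ in range(iters):
        PCt = [sum(P[i][j] * C[j] for j in range(n)) for i in range(n)]    # P C^T
        S = sum(C[i] * PCt[i] for i in range(n)) + R                       # C P C^T + R (scalar)
        K = [sum(A[i][j] * PCt[j] for j in range(n)) / S for i in range(n)]  # A P C^T / S
        AP = [[sum(A[i][m] * P[m][j] for m in range(n)) for j in range(n)] for i in range(n)]
        APAt = [[sum(AP[i][m] * A[j][m] for m in range(n)) for j in range(n)] for i in range(n)]
        # Riccati update: P <- A P A^T + Q - K S K^T
        P = [[APAt[i][j] + Q[i][j] - K[i] * S * K[j] for j in range(n)] for i in range(n)]
    return K


# Toy first-order example (values chosen only for illustration)
K_demo = kalman_gain([[0.9]], [1.0], [[1.0]], 1.0)
```

The resulting gain is optimal only if `Q` and `R` match the true noise statistics; with empirically tuned entries of `Q`, the same computation yields a suboptimal observer.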
The state estimate residual \(e ~ \left[ n \! \times \! 1 \right] \) will be defined as
Finally, the single step-ahead model output prediction error \(\epsilon \), i.e. the OESO, gets
\[ \epsilon _{(k)} = y_{(k)} - C {\hat{x}}_{(k)} \tag{14} \]
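A single recursion of the observer could be sketched as follows, assuming the common predictor form \({\hat{x}}_{(k+1)} = A{\hat{x}}_{(k)} + B\,[u_{(k)} ~ d_{(k)}]^{\textrm{T}} + K\epsilon _{(k)}\) with \(\epsilon _{(k)} = y_{(k)} - C{\hat{x}}_{(k)}\). The function name `observer_step` and the list-based data layout are hypothetical and serve only to make the data flow explicit.

```python
def observer_step(A, B, C, K, x_hat, inputs, y):
    """One step of a predictor-form state observer (assumed structure).

    A: n x n, B: n x m (lists of lists); C, K, x_hat: lists of n floats;
    inputs: list of m input values [u(k), d(k)]; y: measured output y(k).
    Returns (eps, x_hat_next), where eps is the OESO at sample k.
    """
    n = len(A)
    # OESO: single step-ahead output prediction error
    eps = y - sum(C[j] * x_hat[j] for j in range(n))
    # Model-based prediction corrected by the weighted OESO
    x_hat_next = [
        sum(A[i][j] * x_hat[j] for j in range(n))
        + sum(B[i][m] * inputs[m] for m in range(len(inputs)))
        + K[i] * eps
        for i in range(n)
    ]
    return eps, x_hat_next


# Toy two-state example (numbers are illustrative only)
eps_demo, x_demo = observer_step(
    [[0.5, 0.0], [0.0, 0.8]], [[1.0, 0.0], [0.0, 1.0]],
    [1.0, 1.0], [0.1, 0.2], [1.0, 2.0], [0.0, 0.0], 4.0,
)
```

Collecting `eps` over the whole experiment gives the OESO sequence that is analyzed and modeled in the following sections.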
3 Dynamics of the state observer output error
The model of the state estimate residual e can be derived by substituting \({\hat{x}}_{(k+1)}\) from (12) and \(x_{(k+1)}\) according to (1a) into (13) while substituting the output \(y_{(k+1)}\) in the terms of (1b) as
Eq. (15) can be transformed by applying the forward time-shift operator z as \(e_{(k+1)}\!=\!z e_{(k)}\) obtaining
The above equation can be reshaped to separate vector \(e_{(k)}\) as
where I is the unit matrix of the conforming dimensions.
According to (14) and (17), \(\epsilon _{(k)}\) holds
Eq. (18) implies that the dynamics of the OESO is represented by a stochastic system with multiple independent noise inputs. One may realize that term \(C\left( zI\!-\!A\!+\!KC \right) ^{-1}\) results in a row vector of rational functions \(\frac{p_i(z)}{s(z)}\) with the common denominator s(z) as the characteristic polynomial of this system. This consideration yields the transfer function model
where
If \({\mathcal {Q}}\) is the covariance matrix of the process noise according to (10), then \(w_{(k)}\) can be written as
where \(\varvec{0} ~ \left[ n \! \times \! 1 \right] \) is the zero vector, \(\varvec{\eta }~ [n+1 \! \times \! 1]\) is the vector of uncorrelated noise inputs with the unit variance, i.e., \(\textrm{cov}(\varvec{\eta },\varvec{\eta })\!=\!I\),
and \(\sqrt{{\mathcal {Q}}} ~ \left[ n \! \times \! n \right] \) is the Cholesky decomposition satisfying \({\mathcal {Q}}\!=\!\sqrt{{\mathcal {Q}}} \left( \sqrt{{\mathcal {Q}}}\right) ^{\textrm{T}}\) [37]. Similarly, the measurement noise \(v_{(k)}\) can be replaced by
where \({\mathcal {R}}\) is the variance of the measurement noise according to (11).
Finally, model (19) can be generalized as the sum of \(n\!+\!1\) autoregressive-moving-average (ARMA) models by substituting \(w_{(k)}\) in the terms of (21) and \(v_{(k)}\) from (23) as
Since the process noise (9) has only two nonzero components and diagonal covariance matrix (10), general model (24) can be reduced to
However, model (24) or (25) cannot be used to predict the OESO, since the random input vector \(\varvec{\eta }\) as well as the partial outputs \(\frac{r_i(z)}{s(z)}\) are unmeasurable in practice.
Another important paradox concerning this analytical model is that since the covariance matrix of the process noise is considered unknown in the case of suboptimal state estimation, analytical model (24) simply cannot be determined. On the contrary, if the covariance matrix of the process noise is known, which implies that the state estimator works optimally, then model (24) will be known, but it will be unnecessary since the OESO is uncorrelated and hence cannot be predicted.
Concerning the identification of full model (24) directly from experimental data, i.e. based on the OESO sequence, this would be hardly possible primarily due to its structure and the large number of parameters to be estimated.
The aforementioned issues with predictability and identifiability are the main motivation for further considering two reduced model structures, particularly the autoregressive and the moving average model.
3.1 Optimality test
To test whether the state estimator works optimally or not, the sample autocorrelation function of the OESO sequence has to be analyzed. In the case of optimal state estimation, this autocorrelation function should show a character similar to that of Dirac delta function.
Supposing a finite-length experiment with N samples, the autocorrelation function \(R_{\epsilon \epsilon }(n T_\textrm{s})\) is estimated as [38, 39]
where \(n \in {\mathbb {Z}}\) satisfies \(n \!<\! N\).
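The optimality test can be sketched in a few lines. The estimator below is an assumption: it uses the biased sample autocorrelation with the \(1/N\) normalization for non-negative lags, optionally scaled by the zero-lag value. The helper `whiteness_bound` returns the usual approximate 95 % band \(\pm 1.96/\sqrt{N}\) for the normalized autocorrelation of a white sequence; both function names are illustrative.

```python
def autocorr(eps, max_lag, normalize=True):
    """Biased sample autocorrelation of eps for lags 0..max_lag (assumed estimator)."""
    N = len(eps)
    r = [
        sum(eps[k] * eps[k + lag] for k in range(N - lag)) / N
        for lag in range(max_lag + 1)
    ]
    if normalize and r[0] > 0:
        r = [v / r[0] for v in r]  # scale so that the zero-lag value equals 1
    return r


def whiteness_bound(N):
    """Approximate 95 % band for normalized autocorrelation of white noise."""
    return 1.96 / N ** 0.5


# Toy alternating sequence: strongly correlated, so nonzero lags are far from zero
r_demo = autocorr([1.0, -1.0, 1.0, -1.0], 2)
```

If the normalized values at nonzero lags stay inside the band, the OESO sequence is consistent with white noise; values clearly outside the band indicate suboptimal state estimation.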
4 Autoregressive model
In this section, the dynamics of the OESO will be approximated by the single-input single-output autoregressive model defined as
where \(\eta \!\sim \!{\mathcal {N}}\left( 0,\sigma ^2_\eta \right) \) is a random process with the properties of zero-mean white noise.
The polynomial q(z) of this \(n_\textrm{q}\)th order model gets
The equivalent difference equation of model (27) holds
The parameter vector \(\varvec{q}\) gets
4.1 Identification strategy
According to difference equation (29), the corresponding linear regression system considering N available samples takes
or using the shorthand notation
where \(\varvec{q}\) is the parameter vector (30), \(H_\mathrm{{AR}} ~ \left[ N \! \times \! n_\textrm{q} \right] \) is the regression matrix and \(\varvec{\epsilon } ~ \left[ N \! \times \! 1 \right] \), \(\varvec{\eta } ~ \left[ N \! \times \! 1 \right] \) are vectors.
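The parameter vector \(\varvec{q}\) can be estimated as \(\hat{\varvec{q}}\) in a straightforward way using the least squares method with the optimal parameter estimate determined analytically as [40]
\[ \hat{\varvec{q}} = \left( H_\mathrm{{AR}}^{\textrm{T}} H_\mathrm{{AR}} \right) ^{-1} H_\mathrm{{AR}}^{\textrm{T}} \varvec{\epsilon } \tag{33} \]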
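The least squares identification of the autoregressive model can be sketched as below. The sign convention is an assumption: the difference equation is taken as \(\epsilon _{(k)} = -q_1 \epsilon _{(k-1)} - \cdots - q_{n_\textrm{q}} \epsilon _{(k-n_\textrm{q})} + \eta _{(k)}\), so the regressors are the negated past OESO samples. The helpers `solve` and `fit_ar` are hypothetical names, and the normal equations are solved with a small Gaussian elimination routine to keep the example free of external libraries.

```python
def solve(M, v):
    """Solve M x = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    a = [row[:] + [v[i]] for i, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for j in range(c, n + 1):
                a[r][j] -= f * a[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (a[i][n] - sum(a[i][j] * x[j] for j in range(i + 1, n))) / a[i][i]
    return x


def fit_ar(eps, nq):
    """Least squares estimate of q in eps(k) = -sum_i q_i eps(k-i) + eta(k)."""
    rows = [[-eps[k - i] for i in range(1, nq + 1)] for k in range(nq, len(eps))]
    rhs = [eps[k] for k in range(nq, len(eps))]
    HtH = [[sum(r[i] * r[j] for r in rows) for j in range(nq)] for i in range(nq)]
    Hty = [sum(r[i] * y for r, y in zip(rows, rhs)) for i in range(nq)]
    return solve(HtH, Hty)  # normal equations (H^T H) q = H^T eps


# Noise-free toy sequence generated with q = [-1.5, 0.7], so the fit is exact
eps_demo = [1.0, 0.5]
for _ in range(30):
    eps_demo.append(1.5 * eps_demo[-1] - 0.7 * eps_demo[-2])
q_hat = fit_ar(eps_demo, 2)
```

With real OESO data the residual of this regression is the estimated white noise input, and its autocorrelation can be inspected to validate the chosen order.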
4.2 Predictive form
For model (27), the explicit prediction formula can be derived based on difference equation (29). The future values of the white noise input are obviously unknown, so assuming that its statistically unbiased prediction is zero, i.e. \(E\left\{ \eta _{(k+i)}\right\} \!=\!0\), the predictive form considering the prediction horizon \(n_\textrm{p}\) gets
where vectors \(\varvec{\epsilon }_\textrm{p} ~ \left[ n_\textrm{q} \! \times \! 1 \right] \) and \(\hat{\varvec{\epsilon }}_f ~ \left[ n_\textrm{p} \! \times \! 1 \right] \) are defined as
and the matrices \(M^\epsilon _f ~ \left[ n_\textrm{p} \! \times \! n_\textrm{p} \right] \), \(M^\epsilon _\textrm{p} ~ \left[ n_\textrm{p} \! \times \! n_\textrm{q} \right] \) comprise the elements of vector \(\varvec{q}\) (30) as
Since the noise input in (27) is unmeasurable, by reshaping equation (29), \(\eta _{(k)}\) can be estimated as
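The prediction and the noise estimate can be illustrated with a recursive sketch, which is equivalent to solving a lower triangular system by forward substitution. It assumes the convention \(\epsilon _{(k)} = -\sum _i q_i \epsilon _{(k-i)} + \eta _{(k)}\) and sets all future noise inputs to zero; the function names `predict_ar` and `ar_noise_estimate` are illustrative.

```python
def predict_ar(q, eps_past, n_p):
    """Predict eps(k+1)..eps(k+n_p) from past samples (most recent last).

    Assumes len(eps_past) >= len(q) and zero expected future noise.
    """
    hist = list(eps_past)
    preds = []
    for _ in range(n_p):
        nxt = -sum(q[i] * hist[-1 - i] for i in range(len(q)))
        preds.append(nxt)
        hist.append(nxt)  # predictions feed the following steps
    return preds


def ar_noise_estimate(q, eps):
    """Estimate eta(k) = eps(k) + sum_i q_i eps(k-i) for k >= len(q)."""
    nq = len(q)
    return [
        eps[k] + sum(q[i] * eps[k - 1 - i] for i in range(nq))
        for k in range(nq, len(eps))
    ]


# First-order toy model: eps(k) = 0.5 eps(k-1) + eta(k), i.e. q = [-0.5]
pred_demo = predict_ar([-0.5], [2.0], 3)
eta_demo = ar_noise_estimate([-0.5], [4.0, 2.0, 1.0])
```

For a stable autoregressive model the predictions decay toward zero with the horizon, so the correction matters most for the first few steps ahead.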
5 Moving average model
In this section, the dynamics of the OESO will be approximated by the moving average model
where \(\eta \!\sim \!{\mathcal {N}}\left( 0,\sigma ^2_\eta \right) \) is the zero-mean white noise input.
The polynomial g(z) in the \(n_\textrm{g}\)th order model (40) gets
The difference equation for model (40) can be written as
The parameter vector \( \varvec{g} ~ \left[ n_\textrm{g} \! \times \! 1 \right] \) comprising the coefficients \(g_i\) gets
5.1 Identification strategy
It is well known that estimating the moving average processes is more difficult than estimating the autoregressive processes [41]. Since the input noise \(\eta \) is unmeasurable in practice, the straightforward approach based on the least squares minimization of the model single step-ahead prediction error cannot be directly applied in this case.
Therefore, to estimate the coefficient vector (43) using the available OESO sequence, the two-step method of Durbin [41, 42] will be adopted. The first step of this method consists of fitting an autoregressive model to the OESO sequence via the ordinary least squares method in the terms of Sect. 4, and then using the identified autoregressive model to estimate the input noise \(\eta \) by filtering the OESO sequence by the inverse of the estimated autoregressive model according to (39).
The second step uses this estimated input noise sequence \({\hat{\eta }}\) to create the regression system and to estimate the parameters of the moving average process in the least squares sense.
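The two-step procedure described above can be sketched as follows. Several details are assumptions: the auxiliary autoregressive order `n_ar` used in the first step is a free parameter of this sketch, the moving average model is taken as \(\epsilon _{(k)} = \eta _{(k)} + \sum _i g_i \eta _{(k-i)}\), and the second regression uses only the lagged noise estimates as regressors. The helper names `solve`, `lstsq`, and `fit_ma_two_step` are hypothetical.

```python
import random


def solve(M, v):
    """Solve M x = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    a = [row[:] + [v[i]] for i, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for j in range(c, n + 1):
                a[r][j] -= f * a[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (a[i][n] - sum(a[i][j] * x[j] for j in range(i + 1, n))) / a[i][i]
    return x


def lstsq(rows, rhs):
    """Ordinary least squares via the normal equations."""
    m = len(rows[0])
    HtH = [[sum(r[i] * r[j] for r in rows) for j in range(m)] for i in range(m)]
    Hty = [sum(r[i] * y for r, y in zip(rows, rhs)) for i in range(m)]
    return solve(HtH, Hty)


def fit_ma_two_step(eps, ng, n_ar):
    """Two-step estimate of the moving average coefficients g_1..g_ng."""
    N = len(eps)
    # Step 1: fit an auxiliary AR model and inverse-filter eps to estimate the noise
    rows = [[-eps[k - i] for i in range(1, n_ar + 1)] for k in range(n_ar, N)]
    q = lstsq(rows, eps[n_ar:])
    eta = [0.0] * n_ar + [
        eps[k] + sum(q[i] * eps[k - 1 - i] for i in range(n_ar))
        for k in range(n_ar, N)
    ]
    # Step 2: regress eps(k) on the lagged noise estimates
    start = n_ar + ng  # skip samples whose lags are not yet valid estimates
    rows = [[eta[k - i] for i in range(1, ng + 1)] for k in range(start, N)]
    return lstsq(rows, eps[start:])


# Synthetic check: first-order moving average process with g_1 = 0.5
random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(4000)]
eps_demo = [noise[k] + 0.5 * noise[k - 1] for k in range(1, len(noise))]
g_hat = fit_ma_two_step(eps_demo, ng=1, n_ar=10)
```

Choosing `n_ar` noticeably larger than `ng` is a common heuristic, since the autoregressive model only approximates the inverse of the moving average process.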
The corresponding regression system then gets
or using the shorthand notation,
where \(\varvec{g}\) is the parameter vector (43), \(H_{MA} ~ \left[ N \! \times \! n_\textrm{g} \right] \) is the regression matrix and \(\varvec{\epsilon } ~ \left[ N \! \times \! 1 \right] \), \(\varvec{\eta } ~ \left[ N \! \times \! 1 \right] \) are vectors.
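The optimal parameter vector \(\varvec{g}\) can be estimated as
\[ \hat{\varvec{g}} = \left( H_{MA}^{\textrm{T}} H_{MA} \right) ^{-1} H_{MA}^{\textrm{T}} \varvec{\epsilon } \tag{46} \]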
5.2 Predictive form
Having the model parameters estimated, the OESO (36) can be predicted. Assuming that the statistically unbiased prediction of the input zero-mean white noise is zero, the predictive form of the moving average model (40) can be derived according to the difference equation (42) as
where matrix \(M^\eta _\textrm{p} ~ \left[ n_\textrm{p} \! \times \! n_\textrm{g} \right] \) is formed by the elements of vector \(\varvec{g}\) (43) such that
and vector \(\hat{\varvec{\eta }}_\textrm{p} ~ \left[ n_\textrm{g} \! \times \! 1 \right] \) comprises the estimated past values of the noise input
In practice, the input noise \(\eta \) cannot be measured, so it has to be estimated based on the inverse filtering of \(\epsilon \) according to the difference equation (42) as
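A minimal sketch of the inverse filtering and the prediction, assuming the moving average model \(\epsilon _{(k)} = \eta _{(k)} + \sum _{j=1}^{n_\textrm{g}} g_j \eta _{(k-j)}\), zero initial conditions for the filter, and zero expected future noise. The function names are illustrative. Note that the inverse filter is stable only if the zeros of g(z) lie inside the unit circle.

```python
def ma_noise_estimate(g, eps):
    """Inverse filtering: eta_hat(k) = eps(k) - sum_j g_j eta_hat(k-j)."""
    eta = []
    for k, e in enumerate(eps):
        past = sum(g[j] * eta[k - 1 - j] for j in range(len(g)) if k - 1 - j >= 0)
        eta.append(e - past)
    return eta


def predict_ma(g, eta_past, n_p):
    """Predict eps(k+1)..eps(k+n_p) from estimated noise (most recent last).

    Only noise samples up to time k contribute; beyond n_g steps the
    prediction is zero because all remaining noise terms lie in the future.
    """
    ng = len(g)
    preds = []
    for i in range(1, n_p + 1):
        preds.append(sum(
            g[j - 1] * eta_past[-1 - (j - i)]
            for j in range(i, ng + 1)
            if j - i < len(eta_past)
        ))
    return preds


# Toy second-order example: eta(k-1) = 1, eta(k) = 2
pred_demo = predict_ma([0.5, 0.25], [1.0, 2.0], 3)
eta_demo = ma_noise_estimate([0.5], [1.0, 0.5, 0.25])
```

Unlike the autoregressive predictor, the moving average predictor has a finite memory, so its correction vanishes exactly after \(n_\textrm{g}\) steps.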
6 Prediction and model predictive control with the OESO compensation
In this section, the algorithm of model predictive control will be adopted from [31], while an important modification that concerns the prediction of the OESO will be proposed.
Prediction of the state vector x and the output y can be expressed considering (1) as
where \(k \in {\mathbb {N}}\) is the current sample and \(i \in {\mathbb {N}}\) takes the values \(i=1, \ldots , n_\textrm{p}\), with \(n_\textrm{p} \in {\mathbb {N}}\) being the prediction horizon.
Notice that in (52), the OESO \({\hat{\epsilon }}\), which was predicted by the identified autoregressive or the moving average model, was taken into account by correcting the output prediction \({\hat{y}}\). This is the most important modification of the traditional prediction and predictive control algorithms as it allows us to effectively compensate for the suboptimality of the state estimation.
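The corrected prediction can be sketched as follows, assuming that the state prediction is obtained by iterating the deterministic part of the model from the current estimate and that the predicted OESO is simply added to the predicted output. The function name `predict_output` and the argument layout are hypothetical.

```python
def predict_output(A, B, C, x_hat, future_inputs, eps_hat):
    """Corrected output predictions y_hat(k+i) = C x_hat(k+i) + eps_hat(k+i).

    future_inputs[i] holds [u, d] applied at sample k+i (i = 0..n_p-1);
    eps_hat[i] is the predicted OESO for sample k+i+1, e.g. from an
    autoregressive or a moving average model.
    """
    n = len(A)
    x = list(x_hat)
    y_pred = []
    for inp, e in zip(future_inputs, eps_hat):
        # Propagate the state estimate with the deterministic model part
        x = [
            sum(A[i][j] * x[j] for j in range(n))
            + sum(B[i][m] * inp[m] for m in range(len(inp)))
            for i in range(n)
        ]
        # Add the predicted OESO to compensate for suboptimal estimation
        y_pred.append(sum(C[j] * x[j] for j in range(n)) + e)
    return y_pred


# Toy first-order example (illustrative numbers)
y_demo = predict_output(
    [[0.5]], [[1.0, 0.0]], [2.0], [1.0],
    [[0.0, 0.0], [1.0, 0.0]], [0.1, 0.2],
)
```

Passing a zero vector as `eps_hat` recovers the conventional uncorrected prediction, which makes the two variants easy to compare on the same data.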
The predictive control minimizes the quadratic cost function of the model-based predictions of chosen system variables over the prediction horizon \(n_\textrm{p}\). The corresponding quadratic form gets [43, 44]
where \(\Delta u_f ~ \left[ n_\textrm{c} \! \times \! 1 \right] \) is the vector of future changes of the manipulated variable, while assuming \(n_\textrm{c}\) is the control horizon. Matrix \(\varvec{A} ~ \left[ n_\textrm{c} \! \times \! n_\textrm{c} \right] \), vector \(\varvec{b} ~ \left[ n_\textrm{c} \! \times \! 1 \right] \) and scalar c are defined as
where \(y_\textrm{r} ~ \left[ n_\textrm{p} \! \times \! 1 \right] \) is the reference vector, \({\hat{y}}_{\textrm{free}} ~ \left[ n_\textrm{p} \! \times \! 1 \right] \) is the system free response vector, and scalar \(\lambda ^u\) denotes the weight of the manipulated variable increments penalty.
The free response prediction \({\hat{y}}_{\textrm{free}}\) in (54) gets
Matrices \({\mathcal {B}}^u ~ \left[ n_\textrm{p} \! \times \! n_\textrm{p} \right] \) and \({\mathcal {B}}^d ~ \left[ n_\textrm{p} \! \times \! n_\textrm{p} \right] \) get the lower triangular form
where matrices \(A^{\frac{u}{d}}\), \(B^{\frac{u}{d}}\), \(C^{\frac{u}{d}}\) were defined by (3), (5), (7), respectively.
The forced response matrix \(H_f ~ \left[ n_\textrm{p} \! \times \! n_\textrm{c} \right] \) in (54) gets
The optimization problem (53) can be solved by quadratic programming if linear inequality constraints are considered. For the sake of simplicity, the elements of the reference vector \(y_\textrm{r}\) are equal to an appropriately chosen constant \(G_\textrm{t}\) representing the target glycemia. Pursuing the receding horizon strategy, only the first element of the optimal solution \(\Delta u_f\) is actually applied, so one can write
Note that the manipulated variable has to be constrained, so the minimal insulin infusion rate \(u_{\min }\!=\!0\) U/min, while \(u_{\max }\) will be adopted from [27]. The corresponding linear inequalities system with respect to the decision vector \(\Delta u_f\) can be formed by involving matrix \(\varPsi \) (57) as in [45]
Compared to [31], the crucial modification of the control algorithm is made here by adding the prediction of the OESO \({\hat{\epsilon }}_{(k+i)}\) to equation (52).
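To show where the corrected free response enters the controller, the sketch below solves a simplified, unconstrained version of the problem. It assumes the cost \(\Vert y_\textrm{r} - {\hat{y}}_{\textrm{free}} - H_f \Delta u_f \Vert ^2 + \lambda ^u \Vert \Delta u_f \Vert ^2\) and replaces the quadratic program with a closed-form solution followed by clipping of the applied input, which is not equivalent to the constrained formulation. The names `solve` and `mpc_step` are illustrative.

```python
def solve(M, v):
    """Solve M x = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    a = [row[:] + [v[i]] for i, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for j in range(c, n + 1):
                a[r][j] -= f * a[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (a[i][n] - sum(a[i][j] * x[j] for j in range(i + 1, n))) / a[i][i]
    return x


def mpc_step(H_f, y_ref, y_free, lam, u_prev, u_min, u_max):
    """Receding-horizon step of a simplified unconstrained MPC (assumed cost).

    y_free is the free response, which may already include the predicted OESO.
    """
    n_p, n_c = len(H_f), len(H_f[0])
    err = [r - f for r, f in zip(y_ref, y_free)]
    # Normal equations: (H_f^T H_f + lam I) du = H_f^T (y_ref - y_free)
    M = [
        [sum(H_f[k][i] * H_f[k][j] for k in range(n_p)) + (lam if i == j else 0.0)
         for j in range(n_c)]
        for i in range(n_c)
    ]
    rhs = [sum(H_f[k][i] * err[k] for k in range(n_p)) for i in range(n_c)]
    du = solve(M, rhs)
    # Apply only the first increment and clip to the input limits (simplification)
    return min(max(u_prev + du[0], u_min), u_max)


# Toy example with n_p = 2 and n_c = 1
u_demo = mpc_step([[1.0], [1.0]], [1.0, 1.0], [0.0, 0.0], 2.0, 0.0, 0.0, 1.0)
u_clipped = mpc_step([[1.0], [1.0]], [1.0, 1.0], [0.0, 0.0], 2.0, 0.0, 0.0, 0.3)
```

In this structure the OESO compensation changes only the vector `y_free`, so the optimization itself stays untouched.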
Concerning the safety features of automated insulin therapy, various additional strategies could be considered to enhance the current configuration. To avoid the adverse and dangerous insulin stacking phenomenon, the insulin on board [46,47,48] representing the amount of insulin still active from the previously administered doses can be involved. The dynamics of insulin on board can be represented by simple linear models like in [49, 50] or [51], while this signal can be used to form a special quadratic penalty that is added to the cost function (53) of the MPC. Alternatively, hard constraints for the insulin on board signal can also be assumed by extending the linear inequalities system (59) of the quadratic program accordingly. Another type of safety feature is to hard constrain the controlled variable by modifying the inequalities system (59) to prevent the risk of hypoglycemia and hyperglycemia as proposed in [26]. As this strategy can potentially induce control infeasibility, soft formulation of the controlled variable constraints should be preferred, as proposed in [52, 53]. However, further details are beyond the scope of this paper. For information on safety features, see, e.g. [46, 48, 54]. Also note that these safety features do not directly interact with the proposed strategy of predicting the output error of the suboptimal state estimator, which is the main contribution of the paper.
7 Experimental setup
To validate the proposed strategy and assess its practical effectiveness in application to the problem of prediction and predictive control of glycemia in subjects with type 1 diabetes, a simulation-based experiment was designed and evaluated.
The glycemia response for this experiment was obtained by in-silico approach, simulating the complex physiology-based nonlinear model that was described in [55, 56] and the references therein. The basal state of this model was determined with respect to the basal glycemia \(G_b\!=\!6\) mmol/l and the corresponding basal insulin administration rate \(v_b\!=\!0.01\) U/min.
The orders of empirical model (1) were chosen as \(n_u\!=\!n_d\!=\!4\), implying the overall order \(n\!=\!8\). Note that the theory related to estimation of the model parameters \(a_i^{\frac{u}{d}}\) in (3) and \(c_i^{\frac{u}{d}}\) in (7), is not within the scope of this paper, so we suppose that model (1) was identified with parameters (60). However, for more details on this topic, we refer an interested reader to our recent works [25, 57].
The prediction horizon and the control horizon were assumed as \(n_\textrm{p}\!=\!15\), \(n_\textrm{c}\!=\!10\), while the sample time was chosen as \(T_\textrm{s}\!=\!10\) min.
The variance of the measurement noise (11) and the variances of the process noise (10) were empirically adjusted as
Note that since the variances of the process noise were just empirically tuned to obtain acceptable performance of the Kalman filter while the actual values cannot be determined because they are not based on any particular physiological mechanisms or characteristics of a diabetic subject, the state estimator will perform only suboptimally in this case, and hence the OESO sequence will be correlated.
The observer gain vector K was calculated according to the Kalman filter design [1, 36] while considering the model parameters (60) and the noise model parameters (61) as
Concerning the initial tuning of the proposed empirical models of the OESO, the order of autoregressive model (27) was set as \(n_\textrm{q}=4\), whereas the order of moving average model (40) was chosen as \(n_\textrm{g}=12\).
The experiments were designed to mimic the insulin treatment of a subject with type 1 diabetes during a two-day period. The first investigated problem concerns the prediction of glycemia during standard insulin bolus therapy that was carried out according to the bolus calculator rule (see [58] and the references therein). The second deals with automated insulin dosing managed by the model predictive control algorithm of the artificial pancreas. Both algorithms were modified in the terms of Sect. 6.
8 Discussion
In this section, the results of the outlined simulation experiment will be comprehensively analyzed and discussed.
The sequence of the OESO obtained in the terms of equation (14) during regular insulin treatment while simultaneously performing the state estimation according to (12) by considering the observer gain (62) is plotted in Fig. 1. This figure suggests that although the state estimate asymptotically converges to the actual state, the character of the OESO sequence is far from ideal uncorrelated noise.
To verify this, the autocorrelation function \({\hat{R}}_{\epsilon \epsilon }\) of the OESO was estimated according to (26) by processing the sequence from Fig. 1 and is plotted in Fig. 2. Analyzing this autocorrelation function, one can conclude that the OESO sequence is correlated, which confirms that the state estimator works suboptimally due to the empirically adjusted variances of the process noise (61).
Fig. 2 Estimated autocorrelation function \({\hat{R}}_{\epsilon \epsilon }(n T_\textrm{s})\) of the OESO sequence from Fig. 1
Performing the estimation of both reduced models of the OESO by pursuing the strategies presented in Sections 4 and 5, the following coefficients were estimated.
Parameter vector \(\varvec{q}\) of the autoregressive model (27):
Parameter vector \(\varvec{g}\) of the moving average model (40):
To validate the models, the correlated OESO sequence \(\epsilon \) was filtered by the inverse of each of the identified empirical models, i.e. by q(z) and \(\frac{1}{g(z)}\), respectively, so that the input noise sequence was estimated according to (39) for the autoregressive model and according to (50) for the moving average model. The autocorrelation functions \({\hat{R}}_{{\hat{\eta }}{\hat{\eta }}}(n T_\textrm{s})\) of these sequences for both model structures are depicted in Figs. 3 and 4, which show their Dirac delta function-like nature, indicating that the estimated models are valid.
Next, the OESO was predicted using the predictive form (34) of the autoregressive model and the predictive form (47) of the moving average model, respectively. Relatively accurate predictions for randomly chosen starting points can be observed in Fig. 5. Both model structures showed almost identical performance and could predict the future evolution of the correlated OESO with satisfactory accuracy considering the highly stochastic nature of this signal.
The next comparison concerns the practical impact of correcting the glycemia prediction by predicted OESO as the original contribution of the paper. In Fig. 6, one can see the uncorrected prediction of glycemia (\({\hat{G}}\)) as the conventional strategy, as well as the predictions that involved the corrections by OESO predicted using the autoregressive (\({\hat{G}}^\mathrm{{AR}}\)) and the moving average (\({\hat{G}}^\mathrm{{MA}}\)) model. By a basic visual assessment, one can observe an improvement of the prediction accuracy, while the improvements concerned primarily the peaks of the response. Keep in mind that such differences between the uncorrected and the corrected prediction can be critical in situations such as decision making with regard to the application of insulin therapy.
In addition to the graphical assessment, the prediction performance will be quantified by the quadratic metric
which will provide a better assessment of the prediction performance.
The last part of the experiment is focused on the model predictive control of glycemia in the context of the artificial pancreas implementation, where a positive effect of the proposed predictors on control performance is anticipated. To demonstrate this, Fig. 7 shows the closed loop glycemia response, where involving the predictions of the OESO visibly improved the control performance in terms of tighter control with respect to the reference value and reduced maximal and minimal observed glycemia, which is especially significant for reducing the risk of hyperglycemia and hypoglycemia. It can be concluded that both predictors performed almost identically, and considerably better than the original case without compensating for the OESO.
Keep in mind that since typical values of the OESO are relatively low compared to the magnitude of the controlled variable (see Fig. 1), the proposed strategy can naturally yield a limited effect. It can also be claimed that the strength of the desired effect is directly related to the performance level of the state observer and thus to the degree of mismatch between the process noise model and the actual statistical properties of the system. Therefore, for systems with an empirically tuned covariance matrix of the process noise, the strategy proposed in this paper is highly recommended.
The control performance will be quantified by the maximal \(G_{\max }\) and the minimal \(G_{\min }\) observed glycemia, and by the quadratic metric
where \(G_\textrm{t}\) is the target glycemia.
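As an illustration of how such control metrics can be evaluated from a recorded closed-loop response, consider the following minimal Python sketch. The exact form and normalization of the quadratic metric are not restated here, so the mean squared deviation from the target is used as an assumed stand-in. The function name and the sample values are hypothetical.

```python
def control_metrics(glycemia, target):
    """Return (G_max, G_min, quadratic metric) for a glycemia response.

    The quadratic metric is ASSUMED to be the mean squared deviation
    from the target; the paper's own definition may differ.
    """
    if not glycemia:
        raise ValueError("glycemia response must not be empty")
    g_max = max(glycemia)
    g_min = min(glycemia)
    quad = sum((g - target) ** 2 for g in glycemia) / len(glycemia)
    return g_max, g_min, quad


# Hypothetical closed-loop samples and target value.
response = [5.0, 6.0, 8.0, 7.0]
g_max, g_min, quad = control_metrics(response, target=6.0)
print(g_max, g_min, quad)
```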
The summary of prediction and control performance metrics obtained during the experiment is documented in Table 1, which also confirms the observations from Figs. 6 and 7.
To investigate the effect of the choice of the tunable parameters of the OESO models, particularly the order of the autoregressive model (27) \(n_\textrm{q}\) and the order of the moving average model (40) \(n_\textrm{g}\) on the resulting performance of the proposed strategy, the experimentation was repeated under various configurations, yielding the results summarized in Tables 2 and 3.
It can be concluded that the moving average model of the OESO performs slightly better in both prediction and predictive control than the autoregressive model, while the choice of the corresponding model orders \(n_\textrm{q}\), \(n_\textrm{g}\) also affected the overall performance, yet not consistently for all the metrics considered. However, all studied models performed better than the original uncompensated configuration.
9 Conclusions
This study stressed that the OESO sequence is correlated in the case of suboptimal state estimation and that it can be effectively predicted by the autoregressive or the moving average models. These two reduced models provided good predictability and identifiability from the experimental data. The predicted OESO was then used to correct the output variable prediction and hence ultimately improve the performance of the model predictive control. It can be concluded that the presented strategy made it possible to effectively compensate for the suboptimality of the state estimation caused by the inaccuracy of the process noise model in a relatively inexpensive and feasible way.
We also obtained theoretical results demonstrating that the actual dynamics of the OESO is analytically described as the sum of ARMA processes, while this full structure was approximated by the autoregressive and the moving average models in practice.
A promising application of our results would be possible within an implementation of the artificial pancreas in subjects with type 1 diabetes, where maximizing the performance of the model predictive control is of highest priority. The presented simulation-based experiment demonstrated that the dynamics of the OESO can be effectively predicted by both proposed reduced models, with documented positive effects on the accuracy of the glycemia prediction and on the performance of the predictive control. Note that although qualitative improvements do not appear to be significant at first sight, any feasible improvement matters considerably when managing glycemia and can have significant long-term consequences for patient health.
The conservative structure of the MPC-based artificial pancreas [6, 32,33,34] utilizes Kalman filter state estimation with a typically empirically tuned covariance matrix of the process noise, which normally results in suboptimal performance. Compared with this structure, the strategy proposed in this paper additionally involves an easily identifiable prediction model of the OESO to effectively compensate for the adverse effect of suboptimality of the state estimation by correcting the system free response prediction. Moreover, this structural modification is not computationally demanding and can be easily embedded into the existing modular MPC schemes [6, 32,33,34] without involving any hardware adjustments or directly interacting with other features of the artificial pancreas such as constraints.
In contrast, the alternative solution relies on methodological estimation of the covariance matrix of the process noise according to the methods presented in [18,19,20,21,22] to implicitly ensure the optimality of the state estimation. These methods typically require very large experimental datasets (tens of thousands of samples) to provide reliable and unbiased estimates, whereas the statistical models of the OESO proposed in this paper can be identified from relatively limited experimental data (hundreds of samples).
In a nutshell, the most significant contributions presented in this work include the derivation of the analytical stochastic ARMA model of the OESO and its approximations which can be used to improve the performance of the suboptimal state observer in applications when the process noise model is not exactly known or is uncertain. It can be concluded that the actual effect of the proposed strategy depends primarily on the degree of plant-model mismatch of the noise model.
References
Kalman, R. (1960). A new approach to linear filtering and prediction problems. Journal of Basic Engineering (ASME), 82(1), 35–45. https://doi.org/10.1115/1.3662552
Mehra, R. (1972). Approaches to adaptive filtering. IEEE Transactions on Automatic Control, 17(5), 693–698. https://doi.org/10.1109/TAC.1972.1100100
Mehra, R. (1970). On the identification of variances and adaptive Kalman filtering. IEEE Transactions on Automatic Control, 15(2), 175–184. https://doi.org/10.1109/TAC.1970.1099422
Kirchsteiger, H., Jørgensen, J., Renard, E., & del Re, L. (eds.) (2016). Prediction Methods for Blood Glucose Concentration: Design, Use and Evaluation. Lecture Notes in Bioengineering. Cham: Springer. https://doi.org/10.1007/978-3-319-25913-0.
Cescon, M., & Johansson, R. (2014). Linear modeling and prediction in diabetes physiology. Lecture notes in bioengineering (pp. 187–222). Springer. https://doi.org/10.1007/978-3-642-54464-4_
Sánchez-Peña, R., & Chernavvsky, D. (2019). Artificial pancreas: Current situation and future directions (p. 306). Academic Press.
Toffanin, C., Magni, L., & Cobelli, C. (2021). Artificial pancreas: In silico study shows no need of meal announcement and improved time in range of glucose with intraperitoneal vs. subcutaneous insulin delivery. IEEE Transactions on Medical Robotics and Bionics, 3, 306–314.
Boiroux, D., Hagdrup, M., Mahmoudi, Z., Poulsen, N. K., Madsen, H., & Jørgensen, J. B. (2016). Model identification using continuous glucose monitoring data for type 1 diabetes. IFAC-PapersOnLine, 49(7), 759–764. https://doi.org/10.1016/j.ifacol.2016.07.279
Turksoy, K., Samadi, S., Feng, J., Littlejohn, E., Quinn, L., & Cinar, A. (2016). Meal detection in patients with type 1 diabetes: A new module for the multivariable adaptive artificial pancreas control system. IEEE Journal of Biomedical and Health Informatics, 20(1), 47–54. https://doi.org/10.1109/JBHI.2015.2446413
Sala-Mira, I., Siket, M., Kovács, L., Eigner, G., & Bondia, J. (2021). Effect of model, observer and their interaction on state and disturbance estimation in artificial pancreas: An in-silico study. IEEE Access, 9, 143549–143563. https://doi.org/10.1109/ACCESS.2021.3120880
Hou, L., Zhang, H., Wang, J., & Shi, D. (2019). Optimal blood glucose prediction based on intermittent data from wearable glucose monitoring sensors. In 2019 Chinese control conference (CCC) (pp. 5463–5467). Guangzhou, China. https://doi.org/10.23919/ChiCC.2019.8866572.
Griva, L. O., Martínez, R., & Basualdo, M. S. (2019). Combining short and long-term models for predicting blood glucose concentrations on diabetic patients. In 2019 XVIII workshop on information processing and control (RPIC) (pp. 123–128). Salvador, Brazil. https://doi.org/10.1109/RPIC.2019.8882152.
Xie, J., & Wang, Q. (2017). A variable state dimension approach to meal detection and meal size estimation: In silico evaluation through basal-bolus insulin therapy for type 1 diabetes. IEEE Transactions on Biomedical Engineering, 64(6), 1249–1260. https://doi.org/10.1109/TBME.2016.2599073
Acharya, D., & Das, D. K. (2022). Extended Kalman filter state estimation-based nonlinear explicit model predictive control design for blood glucose regulation of type 1 diabetic patient. Medical and Biological Engineering and Computing, 60(5), 1347–1361. https://doi.org/10.1007/s11517-022-02511-5
Sala-Mira, I., Siket, M., Eigner, G., Bondia, J., & Kovacs, L. (2020). Kalman filter and sliding mode observer in artificial pancreas: An in-silico comparison. IFAC-Papers Online, 53(2), 16227–16232. https://doi.org/10.1016/j.ifacol.2020.12.617
Fathi, A. E., Palisaitis, E., Boulet, B., Legault, L., & Haidar, A. (2019). An unannounced meal detection module for artificial pancreas control systems. In 2019 American control conference (ACC) (pp. 4130–4135). Philadelphia, PA, USA. https://doi.org/10.23919/ACC.2019.8814932.
Kovács, L., Siket, M., Rudas, I., Szakál, A., & Eigner, G. (2019). Discrete LPV based parameter estimation for TIDM patients by using dual extended Kalman filtering method. In 2019 IEEE international conference on systems, man and cybernetics (SMC) (pp. 1390–1395). Bari, Italy. https://doi.org/10.1109/SMC.2019.8914014
Dodek, M., & Miklovičová, E. (2023). Estimation of process noise variances from the measured output sequence with application to the empirical model of type 1 diabetes. Biomedical Signal Processing and Control, 84, 104773. https://doi.org/10.1016/j.bspc.2023.104773
Kost, O., Dunik, J., & Straka, O. (2022). Measurement difference method: A universal tool for noise identification. IEEE Transactions on Automatic Control, 68(3), 1792–1799. https://doi.org/10.1109/TAC.2022.3160679
Bianchi, F., Formentin, S., & Piroddi, L. (2020). Process noise covariance estimation via stochastic approximation. International Journal of Adaptive Control and Signal Processing, 34(1), 63–76. https://doi.org/10.1002/acs.3068
Odelson, B. J., Rajamani, M. R., & Rawlings, J. B. (2006). A new autocovariance least-squares method for estimating noise covariances. Automatica, 42(2), 303–308. https://doi.org/10.1016/j.automatica.2005.09.006
Duník, J., Kost, O., & Straka, O. (2018). Design of measurement difference autocovariance method for estimation of process and measurement noise covariances. Automatica, 90, 16–24. https://doi.org/10.1016/j.automatica.2017.12.040
Finan, D. A., Palerm, C. C., Doyle, F. J., III., Seborg, D. E., Zisser, H., Bevier, W. C., & Jovanovic, L. (2009). Effect of input excitation on the quality of empirical dynamic models for type 1 diabetes. AIChE Journal, 55(5), 1135–1146. https://doi.org/10.1002/aic.11699
Ståhl, F., & Johansson, R. (2009). Diabetes mellitus modeling and short-term prediction based on blood glucose measurements. Mathematical Biosciences, 217(2), 101–117. https://doi.org/10.1016/j.mbs.2008.10.008
Dodek, M., & Miklovičová, E. (2021). Physiology-compliant empirical model for glycemia prediction. International Review of Automatic Control (IREACO), 14(6). https://doi.org/10.15866/ireaco.v14i6.21283.
Dodek, M., & Miklovičová, E. (2022). Maximizing performance of linear model predictive control of glycemia for T1DM subjects. Archives of Control Sciences, 32(2), 305–333. https://doi.org/10.24425/acs.2022.141714
Parker, R. S., Doyle, F. J., & Peppas, N. A. (1999). A model-based algorithm for blood glucose control in type I diabetic patients. IEEE Transactions on Biomedical Engineering, 46(2), 148–157. https://doi.org/10.1109/10.740877
Magni, L., Raimondo, D. M., Bossi, L., Man, C. D., Nicolao, G. D., Kovatchev, B., & Cobelli, C. (2007). Model predictive control of type 1 diabetes: An in silico trial. Journal of Diabetes Science and Technology, 1(6), 804–812. https://doi.org/10.1177/193229680700100603
Magni, L., Raimondo, D. M., Man, C. D., De Nicolao, G., Kovatchev, B., & Cobelli, C. (2008). Model predictive control of glucose concentration in subjects with type 1 diabetes: an in silico trial. IFAC Proceedings Volumes, 41(2), 4246–4251 (17th IFAC World Congress). https://doi.org/10.3182/20080706-5-KR-1001.00714
Soru, P., De Nicolao, G., Toffanin, C., Dalla Man, C., Cobelli, C., & Magni, L. (2012). MPC based artificial pancreas: Strategies for individualization and meal compensation. Annual Reviews in Control, 36, 118–128. https://doi.org/10.1016/j.arcontrol.2012.03.009
Dodek, M., & Miklovičová, E. (2022). Optimal state estimation for the artificial pancreas. In 2022 23rd international Carpathian control conference (ICCC) (pp. 88–93). https://doi.org/10.1109/ICCC54292.2022.9805903
Mehmood, S., Ahmad, I., Arif, H., Ammara, U. E., & Majeed, A. (2020). Artificial pancreas control strategies used for type 1 diabetes control and treatment: A comprehensive analysis. Applied System Innovation, 3(3), 31. https://doi.org/10.3390/asi3030031
Moon, S. J., Jung, I., & Park, C.-Y. (2021). Current advances of artificial pancreas systems: A comprehensive review of the clinical evidence. Diabetes & Metabolism Journal, 45(6), 813–839. https://doi.org/10.4093/dmj.2021.0177
Tasic, J., Takacs, M., & Kovacs, L. (2022). Control engineering methods for blood glucose levels regulation. Acta Polytechnica Hungarica, 19(7), 127–152. https://doi.org/10.12700/APH.19.7.2022.7.7
Fabris, C., & Kovatchev, B. P. (2020). Glucose monitoring devices: Measuring blood glucose to manage and control diabetes (p. 350). Amsterdam: Elsevier Science. https://doi.org/10.1016/C2018-0-00515-0
Anderson, B. D. O., & Moore, J. B. (2012). Optimal filtering. Dover Publications.
Golub, G. H., & Van Loan, C. F. (2013). Matrix computations. Johns Hopkins studies in the mathematical sciences. Johns Hopkins University Press.
Gubner, J. A. (2006). Probability and random processes for electrical and computer engineers. Cambridge University Press. https://doi.org/10.1017/CBO9780511813610
Jenkins, G. M., & Watts, D. G. (1969). Spectral analysis and its applications. Holden-Day series in time series analysis and digital signal processing. Holden-Day.
Ljung, L. (1999). System identification: Theory for the user. Prentice Hall information and system sciences series. Prentice Hall PTR.
Sandgren, N., Stoica, P., & Babu, P. (2012). On moving average parameter estimation. In 2012 Proceedings of the 20th European signal processing conference (EUSIPCO) (pp. 2348–2351). Bucharest, Romania.
Durbin, J. (1959). Efficient estimation of parameters in moving-average models. Biometrika, 46(3/4), 306–316.
Haber, R., Bars, R., & Schmitz, U. (2023). Predictive control in process engineering (p. 629). Wiley. https://doi.org/10.1002/9783527636242
Maciejowski, J. M. (2002). Predictive control: With constraints. Prentice Hall.
Nebeluk, R., & Marusak, P. (2020). Efficient MPC algorithms with variable trajectories of parameters weighting predicted control errors. Archives of Control Sciences, 30(2), 325–363. https://doi.org/10.24425/acs.2020.133502.
Cobelli, C., Dalla Man, C., Sparacino, G., Magni, L., De Nicolao, G., & Kovatchev, B. P. (2009). Diabetes: Models, signals, and control. IEEE Reviews in Biomedical Engineering, 2, 54–96. https://doi.org/10.1109/RBME.2009.2036073
Zisser, H., Robinson, L., Bevier, W., Dassau, E., Ellingsen, C., Doyle, F. J., & Jovanovic, L. (2008). Bolus calculator: A review of four “smart” insulin pumps. Diabetes Technology & Therapeutics, 10(6), 441–444. https://doi.org/10.1089/dia.2007.0284
Ellingsen, C., Dassau, E., Zisser, H., Grosman, B., Percival, M. W., Jovanovic, L., & Doyle, F. J., III. (2009). Safety constraints in an artificial pancreatic beta cell: An implementation of model predictive control with insulin on board. Journal of Diabetes Science and Technology, 3(3), 536–544. https://doi.org/10.1177/193229680900300319
Lee, H., Buckingham, B. A., Wilson, D. M., & Bequette, B. W. (2009). A closed-loop artificial pancreas using model predictive control and a sliding meal size estimator. Journal of Diabetes Science and Technology, 3(5), 1082–1090. https://doi.org/10.1177/193229680900300511. PMID: 20144421.
Hu, R., & Li, C. (2015). An improved PID algorithm based on insulin-on-board estimate for blood glucose control with type 1 diabetes. Computational and Mathematical Methods in Medicine. https://doi.org/10.1155/2015/281589
Wilinska, M. E., Chassin, L. J., Schaller, H. C., Schaupp, L., Pieber, T. R., & Hovorka, R. (2005). Insulin kinetics in type-1 diabetes: Continuous and bolus delivery of rapid acting insulin. IEEE Transactions on Biomedical Engineering, 52(1), 3–12. https://doi.org/10.1109/TBME.2004.839639
Boiroux, D., Schmidt, S., Duun-Henriksen, A., Frøssing, L., Nørgaard, K., Madsbad, S., Skyggebjerg, O., Poulsen, N., Madsen, H., & Jørgensen, J. (2012). Control of blood glucose for people with type 1 diabetes: an in vivo study. In Proceedings of the 17th Nordic process control workshop (pp. 133–140). Technical University of Denmark.
Sun, X., Cinar, A., Liu, J., Rashid, M., & Yu, X. (2023). Prior-knowledge-embedded model predictive control for blood glucose regulation: Towards efficient and safe artificial pancreas. Biomedical Signal Processing and Control, 82, 104551. https://doi.org/10.1016/j.bspc.2022.104551
De Nicolao, G., Magni, L., Man, C. D., & Cobelli, C. (2011). Modeling and control of diabetes: Towards the artificial pancreas. IFAC Proceedings Volumes, 44(1), 7092–7101 (18th IFAC World Congress). https://doi.org/10.3182/20110828-6-IT-1002.03036
Dalla Man, C., Rizza, R., & Cobelli, C. (2006). Mixed meal simulation model of glucose-insulin system. In 28th annual international conference of the IEEE engineering in medicine and biology society (pp. 307–310). New York, NY, USA. https://doi.org/10.1109/IEMBS.2006.260810
Dalla Man, C., Rizza, R. A., & Cobelli, C. (2007). Meal simulation model of the glucose-insulin system. IEEE Transactions on Biomedical Engineering, 54(10), 1740–1749. https://doi.org/10.1109/TBME.2007.893506
Dodek, M., Miklovičová, E., & Tárník, M. (2022). Correlation method for identification of a nonparametric model of type 1 diabetes. IEEE Access, 10, 106369–106385. https://doi.org/10.1109/ACCESS.2022.3212435
Schmidt, S., & Nørgaard, K. (2014). Bolus calculators. Journal of Diabetes Science and Technology, 8(5), 1035–1041. https://doi.org/10.1177/1932296814532906
Funding
Open access funding provided by The Ministry of Education, Science, Research and Sport of the Slovak Republic in cooperation with Centre for Scientific and Technical Information of the Slovak Republic.
Ethics declarations
Conflict of interest
The authors have no competing interests to declare that are relevant to the content of this article.
Additional information
This work was supported by the grant VEGA 1/0049/20—Modelling and Control of Biosystems, the Ministry of Education, Science, Development and Sport of the Slovak Republic.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Dodek, M., Miklovičová, E. Predicting the output error of the suboptimal state estimator to improve the performance of the MPC-based artificial pancreas. Control Theory Technol. 21, 541–554 (2023). https://doi.org/10.1007/s11768-023-00142-1
DOI: https://doi.org/10.1007/s11768-023-00142-1