
1 Introduction

In science and technology, dynamic processes are often described by time-variant mathematical models. However, the accurate prediction of the motion and behavior of technical systems is still challenging. Due to our incomplete knowledge of the internal relations between model parameters, state variables and the time domain, the user frequently encounters model uncertainty [20]. In [12] we developed an algorithm to identify this model uncertainty using parameter estimation, optimal experimental design and classical hypothesis testing. It is the aim of this paper to extend this approach to dynamic models. In the following, we adopt the framework described in [12] and extend it to mathematical models comprising time-variant partial differential equations (PDEs).

There is abundant literature on the assessment of descriptive and predictive qualities of dynamic models. Most common are techniques like residual analysis [27, 29] and interval simulation [25], maximum likelihood methods [29] and Bayesian model updating [31]. Our approach comes from a frequentist perspective and offers an alternative: we minimize the extent of data uncertainty by optimizing the experimental design and employ a k-fold cross-validation to test the model’s fitness and consistency. The validation itself is performed via a classical hypothesis test in the parameter space.

A subproblem that needs to be solved in our approach to detect model uncertainty is the PDE-constrained optimal experimental design (OED) problem where the PDE is time-dependent. We specifically focus on experiments where sensors need to be positioned and inputs must be chosen in order to achieve a maximum information gain for the estimated values of the model parameters. Optimal sensor placement has been addressed within the PDE context in [1, 2, 23], and optimal input configuration has been extensively analyzed for both linear and nonlinear ordinary differential equations in various engineering applications [6, 17, 21, 22, 28]. However, in these cases the problem dimension is small compared to a (discretized) time-variant PDE, and thus gradient-based optimization with a sensitivity approach, as suggested by [4] and [19], works well. In our case, this approach is no longer computationally tractable. Our framework copes with the high dimensionality by employing efficient adjoint techniques in a sequential quadratic programming (SQP) solver scheme.

This paper is organized as follows. In Sect. 2 and Sect. 3, we introduce our model equations and briefly present the concepts of parameter estimation and OED followed by efficient solution techniques for the OED problem. Then, in Sect. 4 we show how our algorithm to detect model uncertainty is adapted to the dynamic setting. Section 5 contains numerical results for the OED problem applied to vibrations of a truss and the application of our algorithm to detect model uncertainty. We end the paper with concluding remarks.

2 Model Equations of Transient Linear Elasticity and Their Discretization

Let \( G \) be a bounded Lipschitz domain with sufficiently smooth boundary \( \partial G = \varGamma _\mathrm {D} \cup \varGamma _\mathrm {F} \cup \varGamma _\mathrm {N} \), where \( \varGamma _\mathrm {D}, \varGamma _\mathrm {F}, \varGamma _\mathrm {N} \) are pairwise disjoint and non-empty. Furthermore, let \( (0, T) \), with \( T > 0 \), be an open and bounded time interval. We consider the parameter-dependent equations of motion for the linear-elastic body G of mass density \( \varrho > 0 \) and weak damping constant \( a > 0 \), see [15, Sec. 7.2]:

$$\begin{aligned} \begin{aligned} \varrho \, \partial ^2_{t t} y + a \varrho \, \partial _{t} y - \mathrm {div}\, \sigma (y, \partial _{t} y, p)&= 0&&\text {in } G \times (0, T), \\ y&= 0&&\text {on } \varGamma _\mathrm {D} \times (0, T), \\ \sigma (y, \partial _{t} y, p) \, n&= 0&&\text {on } \varGamma _\mathrm {F} \times (0, T), \\ \sigma (y, \partial _{t} y, p) \, n&= u&&\text {on } \varGamma _\mathrm {N} \times (0, T), \\ y(\cdot , 0) = 0, \quad \partial _{t} y(\cdot , 0)&= 0&&\text {in } G. \end{aligned} \end{aligned}$$
(1)

We include Rayleigh damping in our modeling by the generalized law of Hooke:

$$ \sigma (y, \partial _{t} y, p) = \mathcal {C}(p) \big (\varepsilon (y) + b \varepsilon (\partial _{t} y)\big ), $$

where \( b > 0 \) is the strong damping constant, \( \varepsilon (y) = \frac{1}{2} \!\left( \nabla y^\top + \nabla y\right) \) is the linearized strain and \( \mathcal {C} \; : \; \varepsilon \mapsto p_1 \cdot \mathrm {trace}\left( \varepsilon \right) I + 2 p_2 \cdot \varepsilon \) is the fourth order elasticity tensor, see also [7]. The parameters in this PDE are the well-known Lamé constants \( p = (p_1, p_2)^\top \). It is evident from (1) that the displacement is caused by the boundary traction u alone.

After adopting the weak formulation of (1) according to [15, Sec. 7.2] and [9], we perform a finite-dimensional approximation of this weak formulation, known as the Galerkin ansatz. For the space discretization of the elastic body G we employ standard quadratic finite elements. The finite element approximation then leads to the (high-dimensional) second-order ordinary differential equation

$$\begin{aligned} M \partial ^2_{t t} y(t) + C(p) \partial _{t} y(t) + A(p) y(t) - N u(t) = 0, \end{aligned}$$
(2)

with the stiffness matrix A(p) , the mass matrix M and the boundary mass matrix N. For the Rayleigh damping term, we introduce the damping matrix

$$\begin{aligned} C(p) := a M + b A(p), \end{aligned}$$
(3)

where \( a, b > 0 \) are the damping constants as before.

We want to use a numerical time-update scheme with a predefined step size \( \varDelta t \) to solve (2). Therefore, we rewrite (2) in the form

$$\begin{aligned} M a_{n} + C(p) v_{n} + A(p) d_{n} - N u_{n} = 0, \end{aligned}$$

with the acceleration vector \( a_n = \partial ^2_{t t} y(t_n) \), the velocities \( v_n = \partial _{t} y(t_n) \) and the displacements \( d_n = y(t_n) \) at the time steps \( t_n \), \( n = 1, \ldots , n_\mathrm {t} \). We solve this equation with the implicit Newmark method, which can be implemented in the following way. First, we choose the constants \( \beta _\mathrm {N} = \frac{1}{4} \), \( \gamma _\mathrm {N} = \frac{1}{2} \) for stability reasons, see [15, 30], and use them to define further constants \( \alpha _1, \ldots , \alpha _6 \) as depicted in [30, Sec. 6.1.2]. Then the iteration scheme reads as follows:

$$\begin{aligned} \begin{aligned} a_{n+1} = \,&\alpha _1 \!\left( d_{n+1} - d_n\right) - \alpha _2 v_n - \alpha _3 a_n, \\ v_{n+1} = \,&\alpha _4 \!\left( d_{n+1} - d_n\right) + \alpha _5 v_n + \alpha _6 a_n, \\ \left[ \alpha _1 M + \alpha _4 C(p) + A(p)\right] d_{n+1} = \,&N u_{n+1} + M \!\left( \alpha _1 d_n + \alpha _2 v_n + \alpha _3 a_n\right) \\&+ C(p) \!\left( \alpha _4 d_n - \alpha _5 v_n - \alpha _6 a_n\right) . \end{aligned} \end{aligned}$$
(4)
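To make the update (4) concrete, the following NumPy/SciPy sketch integrates the semi-discrete system (2) with \( \beta _\mathrm {N} = \frac{1}{4} \), \( \gamma _\mathrm {N} = \frac{1}{2} \). The coefficients \( \alpha _1, \ldots , \alpha _6 \) are taken from the standard average-acceleration parameterization; their sign conventions are our assumption and may differ from [30]. All function and variable names are illustrative.

```python
import numpy as np
import scipy.sparse.linalg as spla

def newmark_constants(dt, beta=0.25, gamma=0.5):
    """Average-acceleration coefficients alpha_1, ..., alpha_6.
    Sign conventions of alpha_5, alpha_6 are an assumption here."""
    a1 = 1.0 / (beta * dt**2)
    a2 = 1.0 / (beta * dt)
    a3 = 1.0 / (2.0 * beta) - 1.0
    a4 = gamma / (beta * dt)
    a5 = 1.0 - gamma / beta
    a6 = dt * (1.0 - gamma / (2.0 * beta))
    return a1, a2, a3, a4, a5, a6

def newmark(M, C, A, N, u, dt):
    """Forward sweep of scheme (4) for M a + C v + A d = N u with
    homogeneous initial displacement and velocity (traction-driven).
    M, C, A: (n_d, n_d) sparse; N: (n_d, n_u) sparse; u: (n_t, n_u)."""
    a1, a2, a3, a4, a5, a6 = newmark_constants(dt)
    n_t, n_d = u.shape[0], M.shape[0]
    d = np.zeros((n_t, n_d)); v = np.zeros((n_t, n_d)); acc = np.zeros((n_t, n_d))
    # first block row Q(p) y_1 = E_0 u_1:  M a_1 = N u_1, v_1 = d_1 = 0
    acc[0] = spla.spsolve(M.tocsc(), N @ u[0])
    # factor the effective stiffness D(p) = a1*M + a4*C + A once
    D = spla.splu((a1 * M + a4 * C + A).tocsc())
    for n in range(n_t - 1):
        rhs = (N @ u[n + 1]
               + M @ (a1 * d[n] + a2 * v[n] + a3 * acc[n])
               + C @ (a4 * d[n] - a5 * v[n] - a6 * acc[n]))
        d[n + 1] = D.solve(rhs)
        acc[n + 1] = a1 * (d[n + 1] - d[n]) - a2 * v[n] - a3 * acc[n]
        v[n + 1] = a4 * (d[n + 1] - d[n]) + a5 * v[n] + a6 * acc[n]
    return d, v, acc
```

Since the effective stiffness \( D(p) = \alpha _1 M + \alpha _4 C(p) + A(p) \) is constant in time, it is factorized once and the factorization is reused in every step.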

This scheme can be written in matrix form:

$$\begin{aligned} L(p) y - F u = 0, \end{aligned}$$

where \( y = (y_1, \ldots , y_{n_\mathrm {t}})^\top \) are the states, \( u = (u_1, \ldots , u_{n_\mathrm {t}})^\top \) are the boundary forces at all time points, \( y_n = (a_n, v_n, d_n)^\top \) and \( u_n = (u_{n,x}, u_{n,y})^\top \). The matrices L and F have the block form

$$\begin{aligned} L(p)&= \begin{bmatrix} Q(p) &{} &{} &{} \\ P(p) &{} X(p) &{} &{} \\ &{} \ddots &{} \ddots &{} \\ &{} &{} P(p) &{} X(p) \end{bmatrix},&F&= \begin{bmatrix} E_0 &{} &{} &{} \\ &{} E_1 &{} &{} \\ &{} &{} \ddots &{} \\ &{} &{} &{} E_1 \end{bmatrix}, \end{aligned}$$

where

$$\begin{aligned} Q(p)&= \begin{bmatrix} M &{} C(p) &{} A(p) \\ &{} I &{} \\ &{} &{} I \end{bmatrix},&X(p)&= \begin{bmatrix} I &{} 0 &{} -\alpha _1 I \\ 0 &{} I &{} -\alpha _4 I \\ 0 &{} 0 &{} D(p) \end{bmatrix},&E_0&= \begin{bmatrix} N \\ 0 \\ 0 \end{bmatrix},&E_1&= \begin{bmatrix} 0 \\ 0 \\ N \end{bmatrix}, \end{aligned}$$

with \( D(p) := \alpha _1 M + \alpha _4 C(p) + A(p) \) and

$$\begin{aligned} P(p)&= \begin{bmatrix} \alpha _3 I &{} \alpha _2 I &{} \alpha _1 I \\ -\alpha _6 I &{} -\alpha _5 I &{} \alpha _4 I \\ -\alpha _3 M + \alpha _6 C(p) &{} -\alpha _2 M + \alpha _5 C(p) &{} -\alpha _1 M - \alpha _4 C(p) \end{bmatrix}. \end{aligned}$$

For the following optimization problems let \( {{\,\mathrm{\textit{Y}}\,}}:= \mathbb {R}^{n_{\mathrm {y}}} \) be the state space, where \( n_{\mathrm {y}} = n_{\mathrm {d}} n_{\mathrm {t}} \) is the product of the space dimension after discretization \( n_{\mathrm {d}} \) and the number of time steps \( n_{\mathrm {t}} \), and let

$$ {{\,\mathrm{\textit{U}_{\mathrm {ad}}}\,}}:= \left\{ u \in H^1\!\left( 0, T; \varGamma _{\mathrm {N}}\right) : [u(t)](x) = c(t) \text { and } u_{\mathrm {min}} \le u \le u_{\mathrm {max}}\right\} , $$

be the space of admissible inputs. Furthermore, let \( e \; : \; {{\,\mathrm{\textit{Y}}\,}}\times \mathbb {R}^2 \times {{\,\mathrm{\textit{U}_{\mathrm {ad}}}\,}}\rightarrow {{\,\mathrm{\textit{Y}}\,}}\) be an operator defining the state equation as

$$\begin{aligned} e(y, p, u) := L(p) y - F u = 0 \end{aligned}$$
(5)

and denote its unique solution by \( y(p, u) \). We assume the operator \( \partial _{y} e(y, p, u) \) to be continuously invertible such that we can use the Implicit Function Theorem to define a mapping \( p \mapsto y(p, u) \). Its derivatives \( s_i := \partial _{p_i} y(p, u) \) for \( i = 1,2 \) are computed by solving

$$\begin{aligned} \partial _{y} e(y(p, u), p, u) s_i + \partial _{p_i} e(y(p, u), p, u) = 0, \end{aligned}$$

which in our setting is equivalent to

$$\begin{aligned} L(p) s_i + \partial _{p_i} L(p) y(p, u) = 0, \qquad i = 1,2. \end{aligned}$$
(6)

Thus, the sensitivity variable \( s := [s_1, s_2] \in {{\,\mathrm{\textit{Y}}\,}}\times {{\,\mathrm{\textit{Y}}\,}}\) depends on the solution of the state equation and on the parameters, i.e., \( s_i = s_i(y(p, u), p_i) \). Equations (6) are solved by rewriting them using the iteration scheme (4).
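Concretely, differentiating scheme (4) with respect to \( p_i \) and using \( \partial _{p_i} C(p) = b \, \partial _{p_i} A(p) \) from (3) shows that the sensitivities obey the very same Newmark recursion, driven by the force \( -\partial _{p_i} A(p) \left( d_{n+1} + b\, v_{n+1}\right) \) instead of \( N u_{n+1} \). The following sketch illustrates this, assuming \( A(p) = p_1 A_1 + p_2 A_2 \) with \( A_i = \partial _{p_i} A \) (which holds since \( \mathcal {C}(p) \) is linear in p), that the forward sweep stored the trajectories d and v, and reusing newmark_constants from the sketch above:

```python
import numpy as np
import scipy.sparse.linalg as spla

def newmark_sensitivity(M, C, A, A_i, b, d, v, dt):
    """Sensitivity s_i = dy/dp_i via scheme (4): differentiating the
    discrete update in p_i (with dA/dp_i = A_i, dC/dp_i = b*A_i by (3))
    yields (6) as a Newmark sweep with force f = -A_i @ (d + b*v)."""
    a1, a2, a3, a4, a5, a6 = newmark_constants(dt)
    n_t, n_d = d.shape
    sd = np.zeros((n_t, n_d)); sv = np.zeros((n_t, n_d)); sa = np.zeros((n_t, n_d))
    D = spla.splu((a1 * M + a4 * C + A).tocsc())
    for n in range(n_t - 1):
        f = -A_i @ (d[n + 1] + b * v[n + 1])   # parameter-derivative force
        rhs = (f + M @ (a1 * sd[n] + a2 * sv[n] + a3 * sa[n])
                 + C @ (a4 * sd[n] - a5 * sv[n] - a6 * sa[n]))
        sd[n + 1] = D.solve(rhs)
        sa[n + 1] = a1 * (sd[n + 1] - sd[n]) - a2 * sv[n] - a3 * sa[n]
        sv[n + 1] = a4 * (sd[n + 1] - sd[n]) + a5 * sv[n] + a6 * sa[n]
    return sd, sv, sa
```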

For the input space \( {{\,\mathrm{\textit{U}_{\mathrm {ad}}}\,}}\) we employ a time discretization with linear finite elements and denote by \( M_{\mathrm {T}} \) the mass matrix and by \( A_{\mathrm {T}} \) the stiffness matrix in the time domain.

3 Lamé-Parameter Estimation and the Optimal Experimental Design Problem

Given a set of experimental data, we are concerned with an accurate estimation of the Lamé parameters which are part of the model equations. The measurements are taken at selected points on the discretized free boundary part \( \varGamma _{\mathrm {F}} \) of the elastic body G with specified sensor types. We denote by \( n_{\mathrm {s}} \) the number of available sensors. In order to compare the output of the model equations, i.e., the state, with experimental data we introduce a nonlinear observation operator \( {{\,\mathrm{\textit{h}}\,}}\; : \; {{\,\mathrm{\textit{Y}}\,}}\rightarrow \mathbb {R}^{n_{\mathrm {z}}} \) that maps components of the state to quantities that are actually measured during the experiment at all \( n_{\mathrm {t}} \) time steps.

Within the framework of optimal experimental design, we introduce binary weights \( \omega \in \left\{ 0, 1\right\} ^{n_{\mathrm {s}}} \) for all sensor locations and types. These weights operate as a selection tool, i.e., \( \omega _k = 1 \) if, and only if, sensor k is used at its specified location. Since the position of these sensors and their usage throughout the experiments stay the same, the values of \( \omega \) are copied \( n_{\mathrm {t}} \) times and summarized in the diagonal matrix \( {{\,\mathrm{\varOmega }\,}}\in \mathbb {R}^{n_{\mathrm {z}} \times n_{\mathrm {z}}} \), where \( n_{\mathrm {z}} = n_{\mathrm {s}} n_{\mathrm {t}} \). In addition, each sensor has a fixed operating precision, i.e., standard deviation, which we denote by the variable \( \sigma _{\mathrm {pr}} \in \mathbb {R}^{n_{\mathrm {s}}} \). We again summarize \( n_{\mathrm {t}} \) copies of \( \sigma _{\mathrm {pr}} \) in a diagonal matrix \( {{\,\mathrm{\varSigma }\,}}\in \mathbb {R}^{n_{\mathrm {z}} \times n_{\mathrm {z}}} \).

The data is used to estimate the parameters by solving a least-squares problem:

$$\begin{aligned} \overline{p} \in \mathop {\mathrm {arg\,min}}\limits _{p} \; \frac{1}{2} \, r(z, y(p, u))^\top {{\,\mathrm{\varOmega }\,}}{{\,\mathrm{\varSigma }\,}}^{-2} \, r(z, y(p, u)), \end{aligned}$$
(7)

where \( r(z, y(p, u)) := {{\,\mathrm{\textit{h}}\,}}(y(p, u)) - z \) are the residuals and \( y(p, u) \) is the unique solution of (5) for given p and u. Since the measurements are random variables \( z = z^*+ \varepsilon \) with unknown true values \( z^*\) and noise \( \varepsilon \), so are the parameters. We model the noise to be Gaussian, i.e., \( \varepsilon \sim \mathcal {N}(0, {{\,\mathrm{\varOmega }\,}}^{-1}{{\,\mathrm{\varSigma }\,}}^2) \). In a first order approximation, as in a Gauss-Newton solver scheme, the parameters are also Gaussian with unknown mean \( p^*\) and covariance matrix C, see [8, 19]. Then the confidence region of the parameters with a fixed confidence level \( 1 - \alpha \), where \( \alpha \in (0,1) \), is given by the ellipsoid

$$ K = \left\{ p \; : \; \left( p - \overline{p}\right) ^\top C^{-1} \left( p - \overline{p}\right) \le \chi ^2_{2}(1 - \alpha )\right\} , $$

where \( \chi ^2_{2}(1 - \alpha ) \) denotes the \( (1 - \alpha ) \)-quantile of the \( \chi ^2 \)-distribution with two degrees of freedom.

We assume that the solution \( \overline{p}\) of (7) for given z and \( \omega \), obtained from the Gauss-Newton algorithm, is sufficiently close to \( p^*\), i.e., that for a given data set \( \overline{p}\) is a good approximation of \( p^*\). Then the covariance matrix C can be approximated by employing the Gauss-Newton scheme as well, and it has the following form [8]:

$$\begin{aligned} C_{\mathrm {GN}} = \left[ s(y(p, u), p)^\top \partial _y\! {{\,\mathrm{\textit{h}}\,}}(y(p, u))^\top {{\,\mathrm{\varOmega }\,}}{{\,\mathrm{\varSigma }\,}}^{-2} \partial _y\! {{\,\mathrm{\textit{h}}\,}}(y(p, u)) s(y(p, u), p)\right] ^{-1}. \end{aligned}$$

We aim at minimizing the confidence region in which the estimated parameters \( \overline{p}\) lie. The reduction of the size of the confidence ellipsoid K is equivalent to reducing the “size” of the covariance matrix. This is realized by choosing the best sensor locations, determined by the weights \( \omega \), and by finding optimal inputs u. In practice, there are various design criteria \( \varPsi \) that measure the “size” of a matrix C, see [11]. In this paper we use the E-criterion, which is related to the maximal expansion of K:

$$\begin{aligned} \varPsi (C) = \varPsi _{\mathrm {E}}(C)&= \lambda _{\mathrm {max}}\,(C) \sim \mathrm {diameter}(K)^2. \end{aligned}$$
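For the two Lamé parameters, \( C_{\mathrm {GN}} \) is a \( 2 \times 2 \) matrix, so \( \varPsi _{\mathrm {E}} \) is cheap to evaluate once the sensitivities have been mapped through the observation operator. A minimal sketch (all names are ours):

```python
import numpy as np

def e_criterion(Hs, omega_rep, sigma_rep):
    """Psi_E(C_GN) = lambda_max(C_GN) for the 2x2 Lame-parameter case.

    Hs        : (n_z, 2) array, rows are dh/dy @ s at all observations
    omega_rep : (n_z,) sensor weights, copied over all time steps
    sigma_rep : (n_z,) sensor standard deviations, copied likewise
    """
    w = omega_rep / sigma_rep**2
    info = Hs.T @ (w[:, None] * Hs)       # 2x2 information matrix
    # C_GN = inv(info), so lambda_max(C_GN) = 1 / lambda_min(info)
    lam_min = np.linalg.eigvalsh(info)[0]
    return 1.0 / lam_min
```

Here \( \lambda _{\mathrm {max}}(C_{\mathrm {GN}}) \) is computed as the reciprocal of the smallest eigenvalue of the information matrix, which avoids forming the inverse explicitly.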

We add a cost term \( P_\varepsilon (\omega ) \) to penalize the number of used sensors and a regularizer \( R(u) := u^\top (M_{\mathrm {T}} + A_\mathrm {T}) u \) to the objective function. Moreover, we relax the binary restriction on \( \omega \) to employ gradient-based solution techniques for the following optimal experimental design problem.

Definition 1

Let \( p \) be an estimate of \( p^*\) and let \( \kappa , \beta > 0 \) be fixed. Furthermore, choose \( \varepsilon \in (0,1] \). Then we call \( (\overline{\omega }, \overline{u}) \) an optimal design of an experiment with the linear-elastic body G if it is the solution of

$$\begin{aligned} \min _{\omega , u, y, s} \; \;&\varPsi \!\left( C_{\mathrm {GN}}(\omega , y, s)\right) + \kappa \cdot P_\varepsilon (\omega ) + \beta \cdot R(u), \end{aligned}$$
(8)

where (uys) are subject to the equality constraints

$$\begin{aligned} \begin{aligned} L(p) y - F u&= 0, \\ L(p) s_i + \partial _{p_i} L(p) y&= 0, \end{aligned} \end{aligned}$$
(9)

for \( i = 1, 2 \) and \( (\omega , u) \) satisfy the inequality constraints

$$\begin{aligned} \omega \in \left[ 0, 1\right] ^{n_{\mathrm {s}}}, \quad u \in {{\,\mathrm{\textit{U}_{\mathrm {ad}}}\,}}. \end{aligned}$$
(10)

The penalty term \( P_\varepsilon (\omega ) \) is a smooth approximation of the \( l_0 \)-“norm”. It ensures sparse solutions in \( \omega \) for suitable choices of \( \kappa \) but does not lead to \( \left\{ 0, 1\right\} \)-valued weights yet. To achieve the latter, we adopt a continuation strategy as described in [1, 2].
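The precise form of \( P_\varepsilon \) follows [1, 2]; for illustration, one common smooth \( l_0 \) surrogate with this behavior is

$$ P_\varepsilon (\omega ) = \sum _{k=1}^{n_{\mathrm {s}}} \frac{\omega _k}{\omega _k + \varepsilon }, $$

whose k-th term tends to the indicator of \( \left\{ \omega _k > 0\right\} \) as \( \varepsilon \searrow 0 \), so that \( P_\varepsilon (\omega ) \) approaches \( \left\| \omega \right\| _0 \) and small values of \( \varepsilon \) promote sparsity.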

Note that the penalty parameter \( \kappa \) must not be chosen too large, since the matrix \( C_{\mathrm {GN}} \) becomes singular if too many weights \( \omega \) are switched to zero. We refer to [19] for more details on lower bounds for the sum of the weight variables.

In practice, problem (8)–(10) is solved in its reduced formulation, i.e., by eliminating the equality constraints (9) and inserting \( y(p, u) \) and \( s(y(p, u), p) \) into the objective function.

3.1 Derivative and Adjoint Computation

Let \( J(\omega , u, y, s_1, s_2) \) be the objective function in (8). We show how the derivative of the reduced objective function \( \hat{J}(\omega , u) \), where the solutions y(pu) and s(y(pu), p) of (9) have been inserted into J, with respect to the inputs u is efficiently computed. To do so, we follow a standard Lagrangian view of the optimization problem (8)–(10). For simplicity, we ignore the inequality constraints (10) and still denote by \( \partial _y \varPsi \) the derivative of \( \varPsi \) with respect to y even though we used the Clarke directional derivatives in the case of \( \varPsi = \varPsi _{\mathrm {E}} \), cf. [13]. Let \( \mu , \lambda _{1}, \lambda _{2} \in {{\,\mathrm{\textit{Y}^{*}}\,}}\) be Lagrange multipliers and let the Lagrangian be defined as

$$\begin{aligned} \mathcal {L}(\omega , u, y, s_1, s_2, \mu , \lambda _1, \lambda _2) :=&\; J(\omega , u, y, s_1, s_2) + \left\langle \mu , L(p) y - F u\right\rangle _{{{\,\mathrm{\textit{Y}^{*}}\,}}, {{\,\mathrm{\textit{Y}}\,}}} \\&\; + \sum \limits _{i = 1}^2 \left\langle \lambda _i, L(p) s_i + \partial _{p_i}L(p) y\right\rangle _{{{\,\mathrm{\textit{Y}^{*}}\,}}, {{\,\mathrm{\textit{Y}}\,}}}. \end{aligned}$$

The adjoint equations follow from \( \partial _{y} \mathcal {L} = \partial _{s_i} \mathcal {L} = 0 \) for \( i = 1, 2 \):

$$\begin{aligned} \begin{aligned} L(p)^\top \mu + \left[ \partial _{p_1}L(p)\right] ^\top \lambda _{1} + \left[ \partial _{p_2}L(p)\right] ^\top \lambda _{2} + \partial _{y} \varPsi&= 0, \\ L(p)^\top \lambda _{1} + \partial _{s_1} \varPsi&= 0, \\ L(p)^\top \lambda _{2} + \partial _{s_2} \varPsi&= 0. \end{aligned} \end{aligned}$$
(11)

The fact that the matrix L(p) is transposed on the left-hand side of (11) leads to an iteration scheme backwards in time. We demonstrate this for the second and third adjoint equations in order to obtain \( \lambda _i \), adopting ideas from [18, Sec. 5.4]. Let \( \lambda = \lambda _{i} \) and \( r := \partial _{s_i} \varPsi \) for \( i \in \left\{ 1,2\right\} \). Note that \( r = (r_1, \ldots , r_{n_{\mathrm {t}}})^\top \) and \( r_n = (r_n^d, 0, 0) \) since the velocities and accelerations do not enter \( \varPsi \). At the terminal time point \( t_{n_{\mathrm {t}}} \) we have to solve

$$\begin{aligned} X(p)^\top \lambda _{n_{\mathrm {t}}} = r_{n_{\mathrm {t}}}, \end{aligned}$$

or equivalently \( \lambda _{n_{\mathrm {t}}}^a = 0, \, \lambda _{n_{\mathrm {t}}}^v = 0 \) and

$$\begin{aligned} -\alpha _1 \lambda _{n_{\mathrm {t}}}^a - \alpha _4 \lambda _{n_{\mathrm {t}}}^v + D(p) \lambda _{n_{\mathrm {t}}}^d = r_{n_{\mathrm {t}}}^d. \end{aligned}$$

For the other time points \( t_n \), \( n \ne 1 \), the current iterate is obtained from the one a step ahead in time:

$$\begin{aligned} \left[ X(p)^\top , P(p)^\top \right] \begin{pmatrix} \lambda _n \\ \lambda _{n+1} \end{pmatrix} = r_n, \end{aligned}$$

or equivalently

$$\begin{aligned} \lambda _n^a&= \alpha _3 \lambda _{n+1}^a + \alpha _6 \lambda _{n+1}^v + \left[ \alpha _3 M - \alpha _6 C(p)\right] \lambda _{n+1}^d, \\ \lambda _n^v&= \alpha _2 \lambda _{n+1}^a + \alpha _5 \lambda _{n+1}^v + \left[ \alpha _2 M - \alpha _5 C(p)\right] \lambda _{n+1}^d, \\ - \alpha _1 \lambda _{n}^a - \alpha _4 \lambda _{n}^v + D(p) \lambda _{n}^d&= r_n^d - \alpha _1 \lambda _{n+1}^a - \alpha _4 \lambda _{n+1}^v + \left[ \alpha _1 M + \alpha _4 C(p)\right] \lambda _{n+1}^d. \end{aligned}$$

In order to obtain the adjoint variable at the initial time point \( t_1 \) we solve

$$\begin{aligned} \left[ Q(p)^\top , P(p)^\top \right] \begin{pmatrix} \lambda _1 \\ \lambda _2 \end{pmatrix} = r_1, \end{aligned}$$

or equivalently

$$\begin{aligned} M \lambda _{1}^a&= \alpha _3 \lambda _{2}^a + \alpha _6 \lambda _{2}^v + \left[ \alpha _3 M - \alpha _6 C(p)\right] \lambda _{2}^d, \\ C(p) \lambda _{1}^a + \lambda _{1}^v&= \alpha _2 \lambda _{2}^a + \alpha _5 \lambda _{2}^v + \left[ \alpha _2 M - \alpha _5 C(p)\right] \lambda _{2}^d, \\ A(p) \lambda _{1}^a + \lambda _{1}^d&= r_1^d - \alpha _1 \lambda _{2}^a - \alpha _4 \lambda _{2}^v + \left[ \alpha _1 M + \alpha _4 C(p)\right] \lambda _{2}^d. \end{aligned}$$

The matrix vector product \( q := \left[ \partial _{p_i} L(p)\right] ^\top \lambda _i \) is computed likewise using the iteration scheme.
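A compact sketch of this backward sweep (our notation; D_lu and M_lu are precomputed factorizations of D(p) and M, alphas holds \( \alpha _1, \ldots , \alpha _6 \), and r_d collects the displacement blocks \( r_n^d \)):

```python
import numpy as np

def adjoint_sweep(M, C, A, r_d, alphas, D_lu, M_lu):
    """Backward-in-time solve of L(p)^T lam = r for the adjoint (11),
    transcribing the recursions displayed above."""
    a1, a2, a3, a4, a5, a6 = alphas
    n_t, n_d = r_d.shape
    la = np.zeros((n_t, n_d)); lv = np.zeros((n_t, n_d)); ld = np.zeros((n_t, n_d))
    # terminal step: lam^a = lam^v = 0 and D(p) lam^d = r^d
    ld[-1] = D_lu.solve(r_d[-1])
    for n in range(n_t - 2, 0, -1):       # intermediate steps, n != 1
        la[n] = a3*la[n+1] + a6*lv[n+1] + M @ (a3*ld[n+1]) - C @ (a6*ld[n+1])
        lv[n] = a2*la[n+1] + a5*lv[n+1] + M @ (a2*ld[n+1]) - C @ (a5*ld[n+1])
        rhs = (r_d[n] + a1*la[n] + a4*lv[n] - a1*la[n+1] - a4*lv[n+1]
               + M @ (a1*ld[n+1]) + C @ (a4*ld[n+1]))
        ld[n] = D_lu.solve(rhs)
    # initial step n = 1: solve with the Q(p)^T block
    la[0] = M_lu.solve(a3*la[1] + a6*lv[1] + M @ (a3*ld[1]) - C @ (a6*ld[1]))
    lv[0] = a2*la[1] + a5*lv[1] + M @ (a2*ld[1]) - C @ (a5*ld[1]) - C @ la[0]
    ld[0] = (r_d[0] - a1*la[1] - a4*lv[1]
             + M @ (a1*ld[1]) + C @ (a4*ld[1]) - A @ la[0])
    return la, lv, ld
```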

Finally, the full derivative of the reduced objective function \( \hat{J}(\omega , u) \) with respect to the inputs u is given by

$$\begin{aligned} \dfrac{{{\,\mathrm{\mathrm {d}\!}\,}}\hat{J}}{{{\,\mathrm{\mathrm {d}\!}\,}}u} = - F^\top \mu + 2 \beta (M_{\mathrm {T}} + A_\mathrm {T}) u, \end{aligned}$$

where \( \mu \in {{\,\mathrm{\textit{Y}^{*}}\,}}\) is the adjoint variable obtained from (11).

3.2 Computational Remarks

In order to solve (8)–(10) we employ an SQP algorithm with BFGS updates [10] for the Hessian \( H_k \) of the Lagrangian. We modify the update formula in the following way:

$$\begin{aligned} H_0&= \begin{bmatrix} I &{} \\ &{} 2\beta (M_{\mathrm {T}} + A_{\mathrm {T}}) \end{bmatrix}, \\ H_{k+1}&= H_k - \dfrac{H_k d^k (H_k d^k)^\top }{(d^k)^\top H_k d^k} + \dfrac{r^k (r^k)^\top }{(r^k)^\top d^k}, \end{aligned}$$

where \( d^k \) is the current step, \( y^k \) is the difference between the gradients of the Lagrangian at the new and old iterates, and

$$\begin{aligned} r^k&:= {\left\{ \begin{array}{ll} y^k &{} \text { if } (y^k)^\top d^k \ge 0.2 (d^k)^\top H_k d^k, \\ \theta y^k + (1 - \theta ) H_k d^k&{} \text { otherwise, } \end{array}\right. } \end{aligned}$$

with \( \theta = \frac{0.8 (d^k)^\top H_k d^k}{(d^k)^\top H_k d^k -(y^k)^\top d^k} \). After every tenth iteration we reset the Hessian to \( H_0 \) to avoid matrix fill-in and to restore a gradient descent direction with respect to \( \omega \) from time to time.
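This damped update can be sketched as a direct transcription of the formulas above:

```python
import numpy as np

def damped_bfgs_update(H, d, y):
    """Powell-damped BFGS update of Sect. 3.2: y^k is replaced by r^k
    whenever (y^k)^T d^k is too small relative to (d^k)^T H_k d^k."""
    Hd = H @ d
    dHd = d @ Hd
    if y @ d >= 0.2 * dHd:
        r = y
    else:
        theta = 0.8 * dHd / (dHd - y @ d)
        r = theta * y + (1.0 - theta) * Hd
    return H - np.outer(Hd, Hd) / dHd + np.outer(r, r) / (r @ d)
```

The damping guarantees \( (r^k)^\top d^k > 0 \), so positive definiteness of \( H_{k+1} \) is preserved.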

4 Detection of Uncertainty in Dynamic Models

We adopt the algorithm presented in [12] and describe the main differences when applied to a time-variant model \( \mathcal {M} \) of a dynamic process. In general, we presuppose that a valid model should reproduce all measurements obtained with all admissible inputs at all sensor locations with the same set of parameters. Our approach is summarized in Algorithm 1.

First, initial (or artificial) data \( z_{\mathrm {ini}} \) is needed for an appropriate guess \( p_{\mathrm {ini}} \) of the parameter values. Having fixed these parameters, one can solve the OED problem (8)–(10) to obtain best sensor positions \( \overline{\omega }\) and optimal input configurations \( \overline{u}\), see lines 02 and 03.

The acquisition of experimental data z in line 04 is done at the optimal sensor locations and for inputs close to the optimum. Since we assume that the true values of the model parameters remain the same for all inputs \( u \in {{\,\mathrm{\textit{U}_{\mathrm {ad}}}\,}}\), we can ensure that our data are truly diverse by performing measurements for different input values within a small neighborhood of the optimum \( \overline{u}\). Evidently, the size of the confidence ellipsoid K stays small because of the continuity of the objective function (8) with respect to the inputs.

Recall that for time-variant systems each measurement at a given time depends on the past. Since the order of the data is important, the splitting of z into one calibration and one validation set must not happen over the time axis. As our methodology differs from forecasting [5], we do not allow such splittings over the time domain.

Instead, we perform the division with respect to the different inputs in a k-fold cross-validation manner [16]: we divide the data into k groups, where each group is distinguished by the input for which it was collected over the whole time domain. We then use \( k-1 \) groups for calibration and the remaining group for validation. Repeating this procedure, we run through all k possible combinations.
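In code, this splitting is a plain iteration over input-indexed groups, never over the time axis (a schematic sketch):

```python
def input_folds(datasets):
    """k-fold split where each fold is the complete time series recorded
    for one input, so a measurement's time axis is never cut."""
    k = len(datasets)
    for i in range(k):
        calibration = [datasets[j] for j in range(k) if j != i]
        validation = datasets[i]
        yield calibration, validation
```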

Algorithm 1. Detection of model uncertainty, adapted from [12] to the dynamic setting.

For the validation itself, we perform a classical hypothesis test from line 08 onward as documented in [12]. The threshold \( \mathtt {TOL} \) is identical to the error of the first kind, and it is common to set a \( 5 \% \) limit on this error. The \( \alpha _{\mathrm {min}} \) computed in line 08 is the p-value of the statistical test, i.e., the smallest test level at which the null hypothesis can just be rejected.

There is no need to account for the problem of multiple testing here, since the k-fold cross-validation division of the data ensures pairwise disjoint validation sets.
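The exact test statistic is the one specified in [12]; as a schematic illustration only, one may check whether the parameter estimate from the validation set lies inside the confidence ellipsoid of the calibration estimate and report the corresponding p-value:

```python
import numpy as np
from scipy.stats import chi2

def p_value(p_cal, C_cal, p_val, n_p=2):
    """Illustrative test in the parameter space: alpha_min is the
    smallest level at which H0 (both estimates share the same true
    parameters) is rejected under the Gaussian approximation."""
    diff = p_val - p_cal
    t = diff @ np.linalg.solve(C_cal, diff)   # squared Mahalanobis distance
    return chi2.sf(t, df=n_p)                 # alpha_min (p-value)

# reject the model for this fold if p_value(...) < TOL, e.g. 0.05
```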

5 Numerical Results for Simulated Vibrations of a Truss

We employ a 2D-truss consisting of nine beams and six connectors with about 5 000 spatial degrees of freedom in order to exemplify the application of Algorithm 1. The Dirichlet boundary \( \varGamma _{\mathrm {D}} \) is located at the two outer top connectors and the Neumann boundary \( \varGamma _{\mathrm {N}} \) on the bottom left connector, see Fig. 1. We use pairs of strain gauges as sensors that can measure either the axial deflection or the displacement caused by bending of the beams, see [14] and [26]. The strain gauges are located on the upper and lower boundaries of the beams, indicated as black bullets and connecting lines in the figure, which are part of the free boundary \( \varGamma _{\mathrm {F}} \) of the body G. Each strain gauge measures the relative displacement of two adjacent nodes: \( \varepsilon _{\mathrm {u}} = y_{\mathrm {N1}} - y_{\mathrm {N2}} \) and \( \varepsilon _{\mathrm {\ell }} = y_{\mathrm {N3}} - y_{\mathrm {N4}} \), see Fig. 1a. For simplicity, we compute the square of the axial deflection \( {{\,\mathrm{\textit{h}}\,}}_{\mathrm {a}}(y) \) and the square of the displacement caused by bending \( {{\,\mathrm{\textit{h}}\,}}_{\mathrm {b}}(y) \):

$$\begin{aligned} {{\,\mathrm{\textit{h}}\,}}_{\mathrm {a}}(y)&= \dfrac{1}{4} \left\| \varepsilon _{\mathrm {u}} + \varepsilon _{\ell }\right\| ^2,&{{\,\mathrm{\textit{h}}\,}}_{\mathrm {b}}(y)&= \dfrac{1}{4} \left\| \varepsilon _{\mathrm {u}} - \varepsilon _{\ell }\right\| ^2. \end{aligned}$$
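A direct transcription of these two observation quantities for a single strain-gauge pair:

```python
import numpy as np

def observe(y_N1, y_N2, y_N3, y_N4):
    """Squared axial deflection h_a and squared bending displacement h_b
    from the four nodal displacements of one strain-gauge pair."""
    eps_u = y_N1 - y_N2          # upper strain gauge
    eps_l = y_N3 - y_N4          # lower strain gauge
    h_a = 0.25 * np.linalg.norm(eps_u + eps_l)**2
    h_b = 0.25 * np.linalg.norm(eps_u - eps_l)**2
    return h_a, h_b
```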

Thus, the overall observation operator \( {{\,\mathrm{\textit{h}}\,}}\) consists of \( {{\,\mathrm{\textit{h}}\,}}_{\mathrm {a}} \) and \( {{\,\mathrm{\textit{h}}\,}}_{\mathrm {b}} \) at all time points, and we create five weight variables for each such sensor. These additional weights give the experimenter information about which pairs of strain gauges are more important than others. The discretization of the truss allows for 117 sensors in total. Hence, we have \( n_{\mathrm {s}} = 117 \times 2 \times 5 = {1\,170} \) weight variables.

Throughout our numerical simulations we use pure stiffness damping, i.e., \( a = 0 \) in (3). This promises a better resemblance to actual experimental data, see [3] and [24]. The accuracy of our sensors is fixed to \( \sigma _{\mathrm {pr}, k}\) = 10 \(\upmu \mathrm {m}\) for \( k = 1, \ldots , n_{\mathrm {s}} \).

Fig. 1. (a) Snapshot of the truss with all possible locations for the strain gauges marked as bullets with connecting lines and the excitation force displayed by a red arrow; (b) snapshot after problem (8)–(10) has been solved, with the optimal positions for the strain gauges and the optimal excitation displayed.

We simulated vibrations of the truss for \( n_{\mathrm {t}} = 600 \) time steps with a step size of \( \varDelta t = 5 \) ms. Thus, three seconds were simulated in total, and the solution of the PDE (1) involves about 3 000 000 degrees of freedom. Initially, all 117 pairs of strain gauges were in use, measuring both the axial deflection and the displacement caused by bending with maximum weight. We also use a constant maximally feasible force as a starting point for the inputs u. The excitation forces u act solely on the Neumann boundary \( \varGamma _{\mathrm {N}} \).

Since we were not able to conduct real experiments, all the data was simulated, i.e., generated on the computer with random numbers. Thus, line 05 in Algorithm 1 became obsolete. We assume the beams of the real truss \( \mathcal {R} \) to have an equal cross-sectional area in the displacement-free state, except for two beams having a \( 5 \% \) and a \( 7\% \) smaller diameter, respectively. For the detection of model uncertainty it is not important to know which beams differ from the standard diameter. However, our model \( \mathcal {M} \) operates on the assumption that all beams have the same cross-sectional area. This directly impacts the mathematical terms in the mass, damping and stiffness matrices, see (2), since a model with different cross-sectional beam areas would induce different finite element terms. It is our aim in this section to show that Algorithm 1 successfully detects model uncertainty in \( \mathcal {M} \) when compared to \( \mathcal {R} \).

Fig. 2. Optimization results for problem (8)–(10): (a) first-order optimality and norm of the step; (b) objective function, design criterion, penalty and regularization value.

Since we only simulate experiments, we skipped line 02 in Algorithm 1 and adopted textbook values for \( p_{\mathrm {ini}} \), namely the well-known Lamé constants for steel, \( \lambda _{\mathrm {L}} = \) 121 154 N/mm\(^2 \) and \( \mu _{\mathrm {L}} = \) 80 769 N/mm\(^2 \). These are the values which we use to generate all measurements from the real truss \( \mathcal {R} \). Problem (8)–(10) is solved after about 80 iterations with an overall computation time of about 8 h on an AMD EPYC 48 \(\times \) 2.8 GHz machine. The design criterion decreased by \(\approx \)99%, which means that the maximal expansion of the confidence ellipsoid decreased by \(\approx \)98% compared to the initial design, see Fig. 2. The final design employs only two pairs of strain gauges that measure the axial deflection, the upper with weight two and the lower with weight five, cf. Fig. 1b.

Let \( \overline{u}\) be the optimal input force obtained from solving (8)–(10). For the application of the hypothesis test in Algorithm 1, consider the following perturbed inputs:

$$\begin{aligned} u_1(t)&= \overline{u}(t) + \delta _1,&u_2(t)&= \overline{u}(t) + 4 \sin \!\left( t / (2\pi )\right) , \\ u_3(t)&= \overline{u}(t) + 4 \cos \!\left( t / (2\pi )\right) ,&u_4(t)&= \overline{u}(t) + \delta _2(t), \\ u_5(t)&= \overline{u}(t) + \delta _3(t),&u_6(t)&= \overline{u}(t) \cdot \!\left( 1 + 0.06 \sin \!\left( t / (2\pi )\right) \right) , \\ u_7(t)&= \overline{u}(t) \cdot \!\left( 1 + 0.06 \cos \!\left( t / (2\pi )\right) \right) ,&u_8(t)&= \overline{u}(t) + \delta _4, \end{aligned}$$

where \( \delta _1, \delta _4 \sim \mathcal {N}(0, 4 \cdot I) \) and \( \delta _2(t), \delta _3(t) \sim \mathcal {N}(0, 4t/n_{\mathrm {t}} \cdot I)\) for all \( t \in \left\{ t_1, \ldots , t_{n_{\mathrm {t}}}\right\} \), with the same time step size \( \varDelta t \) as introduced before. With these inputs we generate eight different data sets and perform an 8-fold cross-validation, using seven sets for calibration and one set for validation in line 06 of Algorithm 1. Thus, we conduct eight different hypothesis tests, four of which are shown in Table 1. It is clearly seen that the model \( \mathcal {M} \) does not pass any test when a threshold of \( 5 \% \) is applied to \( \alpha _{\mathrm {min}} \). According to our premise that a valid model should reproduce all measurements conducted with all admissible inputs with the same set of parameters, this is a significant indication of model uncertainty.
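These input perturbations can be generated, for instance, as follows (u_bar is assumed to hold one force component of \( \overline{u}\) on the time grid; dt and n_t are as before; the random seed is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(42)            # arbitrary seed
t = dt * np.arange(1, n_t + 1)             # time grid t_1, ..., t_{n_t}

u_perturbed = [
    u_bar + rng.normal(0.0, 2.0, n_t),                  # u_1: delta_1 ~ N(0, 4 I)
    u_bar + 4.0 * np.sin(t / (2.0 * np.pi)),            # u_2
    u_bar + 4.0 * np.cos(t / (2.0 * np.pi)),            # u_3
    u_bar + rng.normal(0.0, np.sqrt(4.0 * t / n_t)),    # u_4: delta_2(t)
    u_bar + rng.normal(0.0, np.sqrt(4.0 * t / n_t)),    # u_5: delta_3(t)
    u_bar * (1.0 + 0.06 * np.sin(t / (2.0 * np.pi))),   # u_6
    u_bar * (1.0 + 0.06 * np.cos(t / (2.0 * np.pi))),   # u_7
    u_bar + rng.normal(0.0, 2.0, n_t),                  # u_8: delta_4 ~ N(0, 4 I)
]
```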

Table 1. Excerpt of the results for the hypothesis tests from Algorithm 1

6 Conclusion

In this paper we showed that our algorithm to detect model uncertainty, first presented in [12], is applicable to dynamic models. We efficiently solved the OED problem with time-dependent PDE constraints using modified BFGS updates and adjoint methods within an SQP solver scheme. Thus, in finding optimal sensor positions and optimal inputs we were able to significantly reduce the size of the confidence region of the estimated model parameters. By an 8-fold cross-validation using hypothesis tests in the parameter space, we demonstrated on simulated vibrations of a truss that our algorithm is able to detect inaccuracies of the linear-elastic model, which is deficient in the geometrical description of the truss. It is the subject of further investigation to show that our algorithm detects other kinds of model uncertainty as well.