Abstract
In this paper we derive error estimates for Runge–Kutta schemes of optimal control problems subject to index one differential–algebraic equations (DAEs). Usually, Runge–Kutta methods applied to DAEs approximate the differential and algebraic state in an analogous manner. These schemes can be considered as discretizations of the index reduced system where the algebraic equation is solved for the algebraic variable to get an explicit ordinary differential equation. However, in optimal control this approach yields discrete necessary conditions that are not consistent with the continuous necessary conditions which are essential for deriving error estimates. Therefore, we suggest treating the algebraic variable like a control, obtaining a new type of Runge–Kutta scheme. For this method we derive consistent necessary conditions and compare the discrete and continuous systems to get error estimates up to order three for the states and control as well as the multipliers.
1 Introduction
Direct discretization methods are often utilized to numerically solve optimal control problems because they are robust and able to solve difficult problems with state and control constraints (cf. Betts [7], Kraft [23], and von Stryk [37]). In order to justify the application of approximation schemes, an investigation of error estimates between the solutions of the continuous and discrete problem is crucial. In addition, the analysis reveals conditions under which discretization schemes yield a solution and at which rate it converges. In process engineering, mechanical engineering, and path planning, optimal control problems subject to differential–algebraic equations (DAEs) might occur (cf. Kunkel and Mehrmann [24]). These can be solved efficiently with Runge–Kutta schemes as proposed in Gerdts [16]. However, a theoretical analysis of these discretizations applied to problems with DAEs is missing in the literature. Runge–Kutta schemes are especially important for DAEs of index 3 and higher since the Euler method does not converge in that case (cf. Brennan et al. [9]). As a first step, to reduce the gap between applications and theory, we analyze Runge–Kutta schemes applied to optimal control problems with index 1 DAEs. The knowledge gained from these investigations will then be utilized for problems with higher index DAEs.
The study of discretized optimal control problems is still a current field of research particularly in the context of DAEs. The Euler-scheme applied to nonlinear problems with ordinary differential equations (ODEs) and smooth controls has been analyzed in [8, 12, 14, 25]. Herein, Malanowski et al. [25] consider problems with mixed control-state constraints, Dontchev et al. [12] also include pure state constraints, whereas Bonnans and Festa [8] and Dontchev and Hager [14] consider problems solely with pure state constraints. Gerdts and Kunkel [17] analyze nonlinear problems with control-state constraints and controls of bounded variation. They derive error estimates of order \(\frac{1}{p}\) with respect to the \(L_p\)-norm. Runge–Kutta schemes for problems with convex control constraints are examined by Dontchev et al. [13] for order 2 methods and by Hager [18] up to order 4 methods.
First order discretization methods applied to problems with bang-bang optimal control have been discussed in [1,2,3,4,5,6, 32, 35, 36]. Alt et al. [1,2,3, 5, 36] examine linear and linear-quadratic problems. They assume that the switching function does not have singular subarcs. Linear-quadratic problems with additional \(L_1\)-sparsity terms in the cost functional are analyzed by Alt and Schneider [4] and Schneider and Wachsmuth [35]. Alt et al. [6] and Osmolovskii and Veliov [32] study affine problems. In terms of higher order discretization schemes, these types of problems have been examined by Veliov [38] for Runge–Kutta methods, by Haunschmied et al. [21] using the stability concept of strong bi-metric regularity, and by Pietrus et al. [33] based on second order Volterra-Fliess approximations (compare also Scarinci and Veliov [34]).
Recently, using the implicit Euler-scheme, Martens and Gerdts [26,27,28,29,30] and Martens and Schneider [31] derived error estimates for different types of optimal control problems with DAEs. The nonlinear index 1 case was discussed in [26]. Convergence for the index 2 case was analyzed for problems with mixed control-state constraints in [28] and with pure state constraints in [30]. Linear quadratic problems and affine problems with bang-bang controls have been discussed in [29, 31], respectively.
In this paper we consider the following type of problem:
with the functional \(\varphi :{\mathbb {R}}^{n_x} \rightarrow {\mathbb {R}}\) and the functions \(f:{\mathbb {R}}^{n_x} \times {\mathbb {R}}^{n_y} \times {\mathbb {R}}^{n_u} \rightarrow {\mathbb {R}}^{n_x}\) and \(g:{\mathbb {R}}^{n_x} \times {\mathbb {R}}^{n_y} \times {\mathbb {R}}^{n_u} \rightarrow {\mathbb {R}}^{n_y}\). Herein, (1), (2) is a DAE in semi-explicit form with Lipschitz continuous differential state \(x \in W_{1,\infty }^{n_x}\) and essentially bounded algebraic variable and control \(y \in L_{\infty }^{n_y}\), \(u \in L_{\infty }^{n_u}\). The algebraic state y is implicitly determined by the algebraic constraint (2). DAEs are characterized by a quantity called index, which has various concepts, e.g., differentiation and perturbation index (cf. [24]). By differentiating the algebraic equation (2) with respect to t we get
$$\begin{aligned} 0 = g'_x(x(t),y(t),u(t)) \dot{x}(t) + g'_y(x(t),y(t),u(t)) \dot{y}(t) + g'_u(x(t),y(t),u(t)) \dot{u}(t). \end{aligned}$$(4)
If we assume that the matrix \(g'_y\) is non-singular along a trajectory (compare Sect. 3), then we are able to solve this equation for \(\dot{y}\) to obtain an explicit ODE for the algebraic state. Therefore, the DAE (1), (2) has differentiation index 1 since differentiating once with respect to t was sufficient to derive an explicit ODE.
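The index 1 argument above can be checked numerically on a small example. The following is a minimal sketch, assuming a hypothetical scalar constraint \(g(x,y,u) = y + y^3 - x - u\) and arbitrarily chosen smooth paths \(x(t) = \sin t\), \(u(t) = \cos t\); none of these functions come from the paper. Since \(g'_y = 1 + 3y^2 > 0\), the differentiated constraint can be solved for \(\dot{y}\), and the result is compared with a finite difference of the implicitly defined \(y\).

```python
import math

# Hypothetical toy constraint (not from the paper): g(x, y, u) = y + y^3 - x - u.
def g(x, y, u):
    return y + y**3 - x - u

def g_y(y):
    # Partial derivative of g with respect to y; strictly positive,
    # so the index 1 condition holds for every y.
    return 1.0 + 3.0 * y**2

def solve_y(x, u, y0=0.0, tol=1e-14, max_iter=50):
    """Newton's method for g(x, y, u) = 0 in the unknown y."""
    y = y0
    for _ in range(max_iter):
        step = g(x, y, u) / g_y(y)
        y -= step
        if abs(step) < tol:
            break
    return y

def y_dot_formula(x_dot, y, u_dot):
    """Solve 0 = g_x*x' + g_y*y' + g_u*u' for y' (here g_x = g_u = -1)."""
    return (x_dot + u_dot) / g_y(y)

# Smooth sample paths chosen only for illustration.
x, u = math.sin, math.cos
t, eps = 0.3, 1e-6

y_t = solve_y(x(t), u(t))
# Central finite difference of t -> y(x(t), u(t)).
fd = (solve_y(x(t + eps), u(t + eps), y_t)
      - solve_y(x(t - eps), u(t - eps), y_t)) / (2.0 * eps)
formula = y_dot_formula(math.cos(t), y_t, -math.sin(t))
print(fd, formula)
```

Both values should agree up to the finite difference error, which illustrates that one differentiation suffices to recover \(\dot{y}\) whenever \(g'_y\) is non-singular.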
Remark 1
The non-singularity of the matrix \(g'_y\) along a trajectory implies that Eq. (2) is (implicitly) solvable for y with respect to x and u. In theory, it would then be possible to reduce (1), (2) to an ODE and to apply a numerical scheme afterwards. However, in the context of DAEs, this has the drawback that the algebraic constraints are no longer enforced in the discrete system. Thus, depending on the dynamics, the discrete solution may suffer from the drift-off effect (cf. [10, 20]). Therefore, we suggest discretizing the system (OCP) directly and then solving the discrete optimization problem.
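The drift-off effect mentioned in Remark 1 can be observed on a toy problem. The sketch below is an illustration under assumptions: the functions \(f(x,y,u) = -x + y\), \(g(x,y,u) = y + y^3 - x - u\) and the prescribed control \(u(t) = \sin t\) are hypothetical, and the explicit Euler method is used only because it is the simplest scheme. It compares the constraint residual after integrating the index reduced ODE for \((x,y)\) with the residual obtained when \(y\) is recomputed from \(g = 0\) at every node.

```python
import math

# Hypothetical toy DAE (not from the paper).
def f(x, y, u):
    return -x + y

def g(x, y, u):
    return y + y**3 - x - u

def g_y(y):
    return 1.0 + 3.0 * y**2

def solve_y(x, u, y0):
    """Newton's method for g(x, y, u) = 0 in y, started at y0."""
    y = y0
    for _ in range(50):
        step = g(x, y, u) / g_y(y)
        y -= step
        if abs(step) < 1e-14:
            break
    return y

def u(t):          # prescribed control, illustration only
    return math.sin(t)

def u_dot(t):
    return math.cos(t)

def euler_reduced(n):
    """Explicit Euler on the index reduced ODE; returns |g| at t = 1."""
    h = 1.0 / n
    x, y = 0.0, 0.0                      # consistent start: g(0, 0, u(0)) = 0
    for i in range(n):
        t = i * h
        x_dot = f(x, y, u(t))
        y_dot = (x_dot + u_dot(t)) / g_y(y)   # from differentiating g = 0
        x, y = x + h * x_dot, y + h * y_dot
    return abs(g(x, y, u(1.0)))

def euler_direct(n):
    """Explicit Euler on x only; y is recomputed from g = 0 at every node."""
    h = 1.0 / n
    x, y = 0.0, 0.0
    for i in range(n):
        t = i * h
        x = x + h * f(x, y, u(t))
        y = solve_y(x, u(t + h), y)
    return abs(g(x, y, u(1.0)))

print(euler_reduced(20), euler_direct(20))
```

In this sketch the reduced formulation leaves a constraint residual that shrinks only with the step size, while the direct formulation keeps the residual at the level of the Newton tolerance.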
Runge–Kutta schemes are often implemented to get accurate numerical solutions of DAEs. Hairer et al. [19] proved convergence of Runge–Kutta methods for Hessenberg DAEs up to index 3. Usually, in order to approximate a DAE with these schemes, we proceed as follows: for \(N \in {\mathbb {N}}\), \(N \ge 2\) we define the mesh size \(h:= \frac{1}{N}\) and choose coefficients \(b_j, a_{jk}\) for \(j,k = 1,\ldots ,s\). Then, we approximate the differential and algebraic state by
with stage derivatives \(p_i^j,q_i^j\) and stage approximations \(z_i^j,w_i^j\) determined by
Herein, \(z_i^j\) and \(w_i^j\) approximate the differential and algebraic state at \(t = t_i + c_j h\) with \(t_i:= i h\) and \(c_j:= \sum \limits _{k = 1}^{s} a_{jk}\). Moreover, \(q_i^j\) is an approximation of \(\dot{y}\) at \(t = t_i + c_j h\). Thus, we have a discretization of the index reduced system, which we get by solving (4) for \(\dot{y}\), i.e.,
Note that the second equation also depends on \(\dot{u}\). Furthermore, if we derive discrete necessary conditions with the standard Runge–Kutta scheme, we do not obtain an approximation of the continuous necessary conditions associated with (OCP) (compare (9)–(12)). Instead, we get a discretization for the necessary conditions of the index reduced ODE system. To generate an approximation for the necessary conditions of (OCP) and avoid the dependency on \(\dot{u}\), we suggest treating the algebraic state analogously to the control (cf. [13, 18]). This yields the following approximation:
with stage approximations \(z_i^j \approx x(t_i + c_j h)\) as well as intermediate algebraic variable \(y_i^j \approx y(t_i + c_j h)\) and control \(u_i^j \approx u(t_i + c_j h)\). Then, with the abbreviation \(x_i':= \frac{x_{i+1} - x_i}{h}\) for \(i = 0,\ldots ,N-1\), we have the discrete optimization problem
The objective of this paper is to prove that this system has a local solution, which satisfies certain error estimates with respect to the continuous solution. To that end, we proceed as follows:
In Sect. 2 we derive discrete necessary conditions for (DOCP), which are consistent with the continuous necessary conditions of (OCP). In Sect. 3 we introduce the main result (Theorem 1) of this paper and the assumptions required to prove it. We transform the discrete problem into an abstract setting for which we can apply a convergence theorem in Sect. 4. In Sect. 5 we estimate the consistency error and show that the discretization scheme is stable with respect to small perturbations. This allows us to apply Proposition 2 and prove the main result Theorem 1. Numerical experiments confirming the theoretical deductions are provided in Sect. 6 for different schemes. In Sect. 7 we summarize the results of this paper and give an outlook for the index 2 case. We moved some technical details that would disturb the reading flow to Appendix 1.
Notation We denote by \({\mathbb {R}}^n\) the \(n\)-dimensional Euclidean space with the norm \(\vert \cdot \vert \). The space of \(n\times m\) matrices A is endowed with the spectral norm \(\Vert A \Vert \) and the \(n\)-dimensional unit matrix is denoted by \(I_n\). Let \(\mathfrak {B}_r(w)\) be the open ball with center w and radius \(r > 0\). For generic, non-negative constants we use \(\Gamma ,\Gamma _1,\Gamma _2,\ldots \). Furthermore, for vector-valued functions \(w:\left[ 0,1\right] \rightarrow {\mathbb {R}}^n\), \(p \in \left[ 1,\infty \right] \), and \(k \in {\mathbb {N}}\) we introduce the Banach spaces
equipped, respectively, with the norms
Moreover, we associate discrete sequences \( \left( w_i\right) _{i=0,\ldots ,N} \subset {\mathbb {R}}^n\) with the spaces
for \(i = 0,\ldots ,N-1\), \(p\in \left[ 1,\infty \right] \) and define the discrete norms
Throughout the paper we use i as the index for the discrete time \(t_i\) and j, k for the index of coefficients. Finally, to simplify notation, we often use the abbreviation F[t] for functions of type \(F(\hat{w}(t))\) where \(\hat{w}\) is a local minimizer or Karush–Kuhn–Tucker (KKT) point.
2 Necessary conditions
The general procedure to derive error estimates for solutions of optimal control problems is to compare the associated necessary conditions and the dynamic with its discrete counterparts. Therefore, in this section we derive necessary conditions associated with (DOCP), which are consistent with the continuous necessary conditions. These hold if (OCP) has a (local) solution. A tuple \((\hat{x},\hat{y},\hat{u}) \in W_{1,\infty }^{n_x} \times L_{\infty }^{n_y} \times L_{\infty }^{n_u}\) satisfying (1)–(3) is a local minimizer of (OCP) if there exists \(\epsilon > 0\) such that
for all feasible \((x,y,u) \in \mathfrak {B}_\epsilon (\hat{x},\hat{y},\hat{u}) \subset W_{1,\infty }^{n_x} \times L_{\infty }^{n_y} \times L_{\infty }^{n_u}\) satisfying (1)–(3). Consequently, if (OCP) has a solution \((\hat{x},\hat{y},\hat{u})\) and the index 1 condition in Sect. 3 holds, then there exist Lagrange multipliers \(\lambda \in W_{1,\infty }^{n_x}\) and \(\mu \in L_{\infty }^{n_y}\) such that the normalized necessary conditions for (OCP) (cf. [16, Theorem 3.4.3])
are satisfied with the Hamilton function
Herein, we have the adjoint DAE (9), (10) with adjoint differential state \(\lambda \) and adjoint algebraic variable \(\mu \) as well as the endpoint condition (11). Next, deriving necessary conditions associated with the discrete system (DOCP) yields adjoint equations for multipliers \(\lambda _{i}\), \(\eta _i^j\), \(\mu _i^j\), \(i = 0,\ldots ,N-1\), \(j = 1,\ldots ,s\):
Assuming \(b_j > 0\) for all \(j = 1,\ldots ,s\) and introducing the new multiplier
gives us
and therefore
Inserting \(\lambda _{i+1}\) into the second equation yields the new necessary conditions
with the coefficients
Eqs. (14)–(18) can be transformed back to the original conditions with the multiplier \(\eta _i^j\), using the relation (13), i.e., both systems are equivalent (cf. [18, Proposition 3.1]). In (14)–(18) the adjoint variables \(\nu _i^j\) and \(\mu _i^j\) can be viewed as stage approximations for \(\lambda \) and intermediate adjoint algebraic states for \(\mu \), respectively. However, Eqs. (5)–(8) and (14)–(18) are not a Runge–Kutta scheme applied to the KKT-system (1)–(3), (9)–(12) since the coefficients \(a_{jk}\) and \({\tilde{a}}_{jk}\) are not equal in general. Hence, further analysis is required to obtain error estimates.
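To see concretely that \(a_{jk}\) and \({\tilde{a}}_{jk}\) differ in general, the coefficients can be computed for specific tableaus. The sketch below assumes the transformation known from the ODE setting in [18], namely \({\tilde{a}}_{jk} = (b_j b_k - b_k a_{kj})/b_j\); that this matches the coefficients used here is an assumption, and the function name is hypothetical.

```python
from fractions import Fraction as Fr

def transformed_coefficients(a, b):
    """Return a_tilde[j][k] = (b_j*b_k - b_k*a[k][j]) / b_j (assumed form).

    Requires b_j > 0 for all j, as assumed in the text.
    """
    s = len(b)
    if any(bj <= 0 for bj in b):
        raise ValueError("all weights b_j must be positive")
    return [[(b[j] * b[k] - b[k] * a[k][j]) / b[j] for k in range(s)]
            for j in range(s)]

# Heun's method: explicit, two stages.
heun_a = [[Fr(0), Fr(0)], [Fr(1), Fr(0)]]
heun_b = [Fr(1, 2), Fr(1, 2)]

# Implicit midpoint rule: one stage.
midpoint_a = [[Fr(1, 2)]]
midpoint_b = [Fr(1)]

heun_tilde = transformed_coefficients(heun_a, heun_b)
midpoint_tilde = transformed_coefficients(midpoint_a, midpoint_b)

print(heun_tilde == heun_a)          # coefficients differ for Heun's method
print(midpoint_tilde == midpoint_a)  # coefficients coincide for the midpoint rule
```

Under this assumed formula, Heun's method yields adjoint coefficients that differ from the state coefficients, whereas the implicit midpoint rule reproduces its own coefficients.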
3 Assumptions and main theorem
Before formulating the main result of this paper, we introduce the assumptions required to prove it. To conduct a convergence analysis in optimal control, it is typically presumed that the problem has a sufficiently smooth solution and that the system satisfies certain regularity properties. Moreover, in the nonlinear case, second order sufficient conditions are exploited since they imply stability of the problem. For the rest of the paper we assume the following:
- (Smoothness):
-
(OCP) has a local solution \((\hat{x},\hat{y},\hat{u}) \in W_{\kappa ,\infty }^{n_x} \times W_{\kappa -1,\infty }^{n_y} \times W_{\kappa -1,\infty }^{n_u} \) for \(\kappa \in \lbrace 2, 3 \rbrace \). For an open set \(\mathcal {M} \subset {\mathbb {R}}^{n_x} \times {\mathbb {R}}^{n_y} \times {\mathbb {R}}^{n_u}\) and \(\rho > 0\) such that \(\mathfrak {B}_\rho (\hat{x}(t),\hat{y}(t),\hat{u}(t)) \subset \mathcal {M}\) for all \(t \in \left[ 0,1\right] \) the first \(\kappa \) derivatives of f and g exist and are Lipschitz continuous on \(\mathcal {M}\). Furthermore, the first \(\kappa \) derivatives of \(\varphi \) exist and are Lipschitz continuous on \(\mathfrak {B}_\rho (\hat{x}(1))\).
- (Index 1):
-
The matrix \(g'_y(\hat{x}(t),\hat{y}(t),\hat{u}(t))\) is non-singular for all \(t \in \left[ 0,1\right] \).
- (Coercivity):
-
There exists \(\gamma >0\) such that the quadratic form
$$\begin{aligned} \mathcal {P}(x,y,u) :=&\frac{1}{2} \left( x(1)^\top \nabla ^2 \varphi [1] x(1) + \int \limits _{0}^{1} \left( \begin{array}{c} x(t) \\ y(t) \\ u(t) \end{array}\right) ^\top \right. \\&\left. \nabla ^2_{(x,y,u)(x,y,u)} \mathcal {H}[t] \left( \begin{array}{c} x(t) \\ y(t) \\ u(t) \end{array}\right) dt\right) . \end{aligned}$$satisfies
$$\begin{aligned} \mathcal {P}(x,y,u) \ge \gamma \left\| u \right\| _{L_2}^2 \end{aligned}$$(19)for all \((x,y,u) \in W_{1,2}^{n_x} \times L_2^{n_y} \times L_2^{n_u}\) such that
$$\begin{aligned} \dot{x}(t)&= f'_x[t] x(t) + f'_y[t] y(t) + f'_u[t] u(t),&\text {a.e. in} \left[ 0,1\right] , \\ 0&= g'_x[t] x(t) + g'_y[t] y(t) + g'_u[t] u(t),&\text {a.e. in} \left[ 0,1\right] , \\ x(0)&= 0. \end{aligned}$$
Remark 2
-
(i)
If smoothness for \(\kappa \in \lbrace 2, 3 \rbrace \) and the index 1 condition are satisfied, then the Lagrange multipliers \(({\hat{\lambda }},{\hat{\mu }})\) associated with the local solution \((\hat{x},\hat{y},\hat{u})\) are in the space \(W_{\kappa ,\infty }^{n_x} \times W_{\kappa -1,\infty }^{n_y}\). This can be seen by solving the adjoint algebraic equation (10) for \({\hat{\mu }}\) which yields \({\hat{\mu }} \in W_{1,\infty }^{n_y}\). Then, the adjoint differential equation (9) implies \({\hat{\lambda }} \in W_{2,\infty }^{n_x}\). The process is repeated until the suggested smoothness is reached.
-
(ii)
If the assumptions are satisfied, then there exists some \(\beta >0\) such that the Legendre-Clebsch condition
$$\begin{aligned} (w^\top ,v^\top ) \nabla ^2_{(y,u)(y,u)} \mathcal {H}[t] \left( \begin{array}{c} w \\ v \end{array}\right) \ge \beta (\left| w \right| ^2 + \left| v \right| ^2) \end{aligned}$$holds for all \((w,v) \in \ker (g'_y[t],g'_u[t])\) and \(t \in \left[ 0,1\right] \) (cf. [11, Lemma 2], [15, 27, Lemma 5.3.3]). In addition, this implies that the matrix
$$\begin{aligned} \left( \begin{array}{ccc} \nabla ^2_{yy} \mathcal {H}[t] &{} \nabla ^2_{yu} \mathcal {H}[t] &{} g'_y[t]^\top \\ \nabla ^2_{uy} \mathcal {H}[t] &{} \nabla ^2_{uu} \mathcal {H}[t] &{} g'_u[t]^\top \\ g'_y[t] &{} g'_u[t] &{} 0 \end{array}\right) \end{aligned}$$(20)is non-singular for all \(t \in \left[ 0,1\right] \).
-
(iii)
Smoothness and the index 1 condition imply that \(g'_y[t]\) and its inverse are continuous and uniformly bounded. Thus, we can solve the linear algebraic equation in the coercivity assumption for y and insert it into the differential equation to obtain
$$\begin{aligned} \dot{x}(t)&= A(t) x(t) + B(t) u(t) \end{aligned}$$(21)$$\begin{aligned} x(0)&= 0 \end{aligned}$$(22)with the abbreviations
$$\begin{aligned} A(t):= f'_x[t] - f'_y[t] g'_y[t]^{-1} g'_x[t], \quad B(t):= f'_u[t] - f'_y[t] g'_y[t]^{-1} g'_u[t]. \end{aligned}$$The quadratic form then reduces to
$$\begin{aligned} \widetilde{\mathcal {P}}(x,u)&:= \frac{1}{2} x(1)^\top \nabla ^2 \varphi [1] x(1) \\&\qquad + \frac{1}{2} \int \limits _{0}^{1} x(t)^\top P(t) x(t) + 2 x(t)^\top S(t) u(t) + u(t)^\top Q(t) u(t) dt \end{aligned}$$with the matrix functions
$$\begin{aligned} P(t)&:= \nabla ^2_{xx} \mathcal {H}[t] - 2 \nabla ^2_{xy} \mathcal {H}[t] g'_y[t]^{-1} g'_x[t] + (g'_y[t]^{-1} g'_x[t])^\top \nabla ^2_{yy} \mathcal {H}[t] g'_y[t]^{-1} g'_x[t], \\ S(t)&:= \nabla ^2_{xu} \mathcal {H}[t] - \nabla ^2_{xy} \mathcal {H}[t] g'_y[t]^{-1} g'_u[t] - (g'_y[t]^{-1} g'_x[t])^\top \nabla ^2_{yu} \mathcal {H}[t] \\&\qquad + (g'_y[t]^{-1} g'_x[t])^\top \nabla ^2_{yy} \mathcal {H}[t] g'_y[t]^{-1} g'_u[t], \\ Q(t)&:= \nabla ^2_{uu} \mathcal {H}[t] - 2 (g'_y[t]^{-1} g'_u[t])^\top \nabla ^2_{yu} \mathcal {H}[t] \\&\qquad + (g'_y[t]^{-1} g'_u[t])^\top \nabla ^2_{yy} \mathcal {H}[t] g'_y[t]^{-1} g'_u[t]. \end{aligned}$$For this reduced form we also have the coercivity condition \(\widetilde{\mathcal {P}}(x,u) \ge \gamma \left\| u \right\| _{L_2}^2\) for all \((x,u) \in W_{1,2}^{n_x} \times L_2^{n_u}\) satisfying (21), (22). Furthermore, the Legendre-Clebsch condition \(v^\top Q(t) v \ge \gamma \left| v \right| ^2\) holds for all \(t \in \left[ 0,1\right] \) (cf. [11, Lemma 2], [15]).
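The elimination of \(y\) in Remark 2 (iii) can be verified with a small computation. The following sketch treats the scalar case \(n_x = n_y = n_u = 1\) at a frozen time; all numerical values and function names are hypothetical. It checks that \(A\) and \(B\) reproduce the linearized differential equation after solving the linearized constraint for \(y\), and that \(Q\) coincides with the Hessian block in \((y,u)\) evaluated on the kernel of \((g'_y, g'_u)\).

```python
def reduce_linearization(f_x, f_y, f_u, g_x, g_y, g_u):
    """Return (A, B) after eliminating y from the linearized DAE (scalar case)."""
    if g_y == 0:
        raise ValueError("index 1 condition violated: g_y is singular")
    A = f_x - f_y * g_x / g_y
    B = f_u - f_y * g_u / g_y
    return A, B

def reduced_Q(H_yy, H_yu, H_uu, g_y, g_u):
    """Scalar version of Q = H_uu - 2 k H_yu + k H_yy k with k = g_u / g_y."""
    k = g_u / g_y
    return H_uu - 2.0 * k * H_yu + k * H_yy * k

# Hypothetical values of the partial derivatives at some fixed time t.
f_x, f_y, f_u = -1.0, 2.0, 0.5
g_x, g_y, g_u = 1.0, 4.0, -2.0
H_yy, H_yu, H_uu = 3.0, 1.0, 2.0

A, B = reduce_linearization(f_x, f_y, f_u, g_x, g_y, g_u)

# Pick any (x, u), solve the linearized constraint for y, and compare.
x, u = 0.7, -0.3
y = -(g_x * x + g_u * u) / g_y
x_dot_full = f_x * x + f_y * y + f_u * u
x_dot_reduced = A * x + B * u

# Kernel vector (w, v) of (g_y, g_u): g_y*w + g_u*v = 0.
v = 1.0
w = -g_u / g_y * v
form_on_kernel = H_yy * w * w + 2.0 * H_yu * w * v + H_uu * v * v
Q = reduced_Q(H_yy, H_yu, H_uu, g_y, g_u)
print(x_dot_full, x_dot_reduced, form_on_kernel, Q)
```

In the scalar case this makes the link between the Legendre-Clebsch condition on the kernel and the positivity of \(Q\) explicit.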
Next, with the abbreviations \(c_j:= \sum \limits _{k = 1}^{s} a_{jk}\) and \(d_j:= \sum \limits _{k = 1}^{s} b_k a_{kj}\) we introduce the conditions required to get Runge–Kutta methods of order one to three in Table 1.
Herein, we have the additional condition \(\sum \limits _{j = 1}^{s} \frac{d_j^2}{b_j} = \frac{1}{3}\) for third order, which is not needed for Runge–Kutta methods applied to DAEs. But in the context of optimal control, some extra conditions for the coefficients arise since \(a_{jk}\) and \({\tilde{a}}_{jk}\) are not equal in general.
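The conditions named explicitly in the text can be evaluated for a given tableau. The sketch below is a partial check only: it computes \(\sum_j b_j\), \(\sum_j b_j c_j\), \(\sum_j d_j\), and \(\sum_j d_j^2/b_j\), and it does not claim to cover every entry of Table 1. The function name and the two example tableaus (Heun's method and Kutta's classical three-stage method) are chosen for illustration.

```python
from fractions import Fraction as Fr

def named_conditions(a, b):
    """Evaluate the sums that appear in the order conditions quoted in the text."""
    s = len(b)
    c = [sum(a[j]) for j in range(s)]                                 # c_j
    d = [sum(b[k] * a[k][j] for k in range(s)) for j in range(s)]     # d_j
    return {
        "sum_b": sum(b),
        "sum_bc": sum(bj * cj for bj, cj in zip(b, c)),
        "sum_d": sum(d),
        "sum_d2_over_b": sum(dj * dj / bj for dj, bj in zip(d, b)),
    }

# Heun's method (two stages).
heun = named_conditions(
    [[Fr(0), Fr(0)], [Fr(1), Fr(0)]],
    [Fr(1, 2), Fr(1, 2)],
)

# Kutta's classical third order method (three stages).
kutta3 = named_conditions(
    [[Fr(0), Fr(0), Fr(0)],
     [Fr(1, 2), Fr(0), Fr(0)],
     [Fr(-1), Fr(2), Fr(0)]],
    [Fr(1, 6), Fr(2, 3), Fr(1, 6)],
)

print(heun)
print(kutta3)
```

Both tableaus satisfy \(\sum_j b_j = 1\) and \(\sum_j b_j c_j = \sum_j d_j = \frac{1}{2}\), but only Kutta's method also satisfies \(\sum_j d_j^2/b_j = \frac{1}{3}\); for Heun's method this sum equals \(\frac{1}{2}\).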
In the following sections we will show that we can solve the Eqs. (7), (16), (18) for \((y,u,\mu )\) depending on \((z,\nu )\) near the continuous differential state and multiplier \((\hat{x},{\hat{\lambda }})\). Then, we get a discrete solution \((\hat{x}_h,{\hat{\lambda }}_h)\) of a reduced system, which satisfies error estimates with respect to \((\hat{x},{\hat{\lambda }})\). This further implies that the discrete problem (DOCP) has a local solution \((\hat{x}_h,\hat{z}_h,\hat{y}_h,\hat{u}_h)\) associated with multipliers \(({\hat{\lambda }}_h,{\hat{\nu }}_h,{\hat{\mu }}_h)\). However, the algebraic states \(\hat{y}_h\), \({\hat{\mu }}_h\) and the control \(\hat{u}_h\) might not converge at the same rate as the differential states, which we will later confirm with numerical experiments (compare Sect. 6). Nevertheless, it is possible to obtain discrete algebraic states and a control that satisfy the same order of error estimates as the differential states by solving the algebraic equations (7), (16), (18) for \((y,u,\mu )\) with respect to \((x,\lambda ) = (\hat{x}_h,{\hat{\lambda }}_h)\), i.e., \((y(\hat{x}_h,{\hat{\lambda }}_h),u(\hat{x}_h,{\hat{\lambda }}_h),\mu (\hat{x}_h,{\hat{\lambda }}_h))\). This allows us to formulate the main result of this paper:
Theorem 1
For \(\kappa \in \left\{ 2,3 \right\} \) let smoothness, the index 1 condition, coercivity, \(b_j > 0\) for \(j = 1,\ldots ,s\), and the conditions in Table 1 up to order \(\kappa \) be satisfied. Then, (DOCP) has a local solution \((\hat{x}_h,\hat{z}_h,\hat{y}_h,\hat{u}_h)\) associated with the multipliers \(({\hat{\lambda }}_h,{\hat{\nu }}_h,{\hat{\mu }}_h)\) such that we have the error estimates
for some constant \(\Gamma \ge 0\) and sufficiently small h. Herein, \((y(\hat{x}_h,{\hat{\lambda }}_h),u(\hat{x}_h,{\hat{\lambda }}_h),\mu (\hat{x}_h,{\hat{\lambda }}_h))\) is obtained by solving the algebraic constraints (7), (16), (18) for \((y,u,\mu )\) with respect to \((x,\lambda ) = (\hat{x}_h,{\hat{\lambda }}_h)\).
Note that the order \(\kappa \) of the error estimates in Theorem 1 is closely related to the assumed smoothness of the functions in (OCP). To derive error estimates, we exploit appropriate Taylor expansions, which contain higher order derivatives of the functions, i.e., they have to be sufficiently smooth (see Appendix 1 and Lemma 1).
4 Convergence theorem and abstract setting
Before we prove Theorem 1 in Sect. 5, we first transform the discrete KKT-system (5)–(8), (14)–(18) to obtain an equation \(0 = \mathcal {T}(\omega )\) and show that this system has a solution, which satisfies certain error estimates. To that end, we require the following result (cf. [14, Theorem 3.1], [12, Proposition 3.1], [18, Proposition 5.1]):
Proposition 2
Let \(\Omega \) be a Banach space and \(\Pi \) a linear, normed space. For some \({\bar{\omega }} \in \Omega \) and \(r>0\) let the function \(\mathcal {T}:\mathfrak {B}_r({\bar{\omega }}) \subset \Omega \rightarrow \Pi \) be continuously Fréchet differentiable and let \(\mathcal {L}:\Omega \rightarrow \Pi \) be a linear, bounded operator. Suppose there exist \(\theta ,\vartheta ,\sigma > 0\) such that
-
\(\left\| \mathcal {T}'(\omega ) - \mathcal {L} \right\| \le \theta \) for all \(\omega \in \mathfrak {B}_r({\bar{\omega }})\).
-
The mapping \(\mathcal {L}^{-1}:\mathfrak {B}_\sigma ({\hat{\pi }}) \rightarrow \Omega \) for \({\hat{\pi }} = \mathcal {T}({\bar{\omega }}) - \mathcal {L}({\bar{\omega }})\) is single valued and Lipschitz continuous with constant \(\vartheta \).
If \(\theta \vartheta < 1\), \(\theta r \le \sigma \), and \(\left\| \mathcal {T}({\bar{\omega }}) \right\| _\Pi \le \min \left\{ \sigma , \frac{(1 - \vartheta \theta )r}{\vartheta } \right\} \), then there exists a unique solution \(\omega \in \mathfrak {B}_r({\bar{\omega }})\) of \( 0 = \mathcal {T}(\omega )\) satisfying the bound
Our goal now is to transform the discrete KKT-system (5)–(8), (14)–(18) into a discretization of a boundary value problem such that we can apply Proposition 2. For that purpose, we introduce the abbreviations
Then, we can write the discrete system (5)–(8), (14)–(18) as
which is an approximation of the DAE boundary value problem
According to (20), the matrix
is non-singular along the trajectory \((\hat{X},\hat{Y}) = (\hat{x},{\hat{\lambda }},\hat{y},\hat{u},{\hat{\mu }})\), i.e., the DAE has index 1. Since \(\Lambda _{jk}\) contains \(a_{jk}\) and \({\tilde{a}}_{jk}\), which are not equal in general, the discretization (23)–(26) with stage approximations \(Z_i^j\) is not a classic Runge–Kutta scheme and existing convergence results cannot be applied. Hence, further examination is required. We proceed by reducing the system (23)–(26) such that we obtain a discretization of an ODE boundary value problem. We first consider the algebraic equations
which are satisfied for \({\bar{Z}}^j = \hat{X}(t_i)\), \({\bar{Y}}^j = \hat{Y}(t_i)\), \(j = 1,\ldots ,s\) for any \(i = 0,\ldots ,N-1\). Moreover, the matrix \(G'_Y[t_i]^{-1}\) exists for \(i = 0,\ldots ,N-1\) (compare (20)). Thus, by the implicit function theorem (cf. [22, p. 29]), there exist some \(\epsilon ,\delta > 0\) such that the mappings \(Y^j:\mathfrak {B}_\epsilon (\bar{\textbf{Z}}) \rightarrow \mathfrak {B}_\delta ({\bar{Y}}^j)\), \(j = 1,\ldots ,s\) are continuously differentiable and satisfy
for \( \textbf{Z} \in \mathfrak {B}_\epsilon (\bar{\textbf{Z}})\). Now, we insert the functions \(Y^j(\cdot )\), \(j = 1,\ldots ,s\) into the stage approximation (24) and consider the equations
with some parameters \(\textbf{P} = (P^1,\ldots ,P^s)\) replacing \(X_i\). These equations are satisfied for \({\bar{Z}}^j = \hat{X}(t_i)\) and \({\bar{P}}^j = \hat{X}(t_i) - h \Upsilon _{j} F[t_i]\), \(j = 1,\ldots ,s\) for any \(i = 0,\ldots ,N-1\). Differentiating the right hand side with respect to \(\textbf{Z}\) yields a matrix of the form \(I_{2 s n_x} + O(h)\). Hence, this matrix is non-singular for sufficiently small h. Applying the implicit function theorem once more gives us continuously differentiable mappings \(Z^j:\mathfrak {B}_{{\tilde{\epsilon }}}(\bar{\textbf{P}}) \rightarrow \mathfrak {B}_{{\tilde{\delta }}}({\bar{Z}}^j)\) for \(j = 1,\ldots ,s\) and some \({\tilde{\epsilon }}, {\tilde{\delta }} > 0\). Then, for all \(\textbf{P} \in \mathfrak {B}_{{\tilde{\epsilon }}}(\bar{\textbf{P}})\) we have
Furthermore, since \({\bar{P}}^j - \hat{X}(t_i) = O(h)\) we get \(\bar{\textbf{Z}} = (\underbrace{\hat{X}(t_i),\ldots ,\hat{X}(t_i)}_{s-\text {times}}) \in \mathfrak {B}_{{\tilde{\epsilon }}}(\bar{\textbf{P}})\) for sufficiently small h. Thus, the function \(Z^j(\cdot )\) is also continuously differentiable on \((\mathfrak {B}_{{\hat{\epsilon }}}(\hat{X}(t_i)))^s\) for some \(0< {\hat{\epsilon }} < {\tilde{\epsilon }}\) and satisfies the stage approximations (29) for \(P^j = \hat{X}(t_i)\). Finally, we insert \(Z^j(\cdot )\) and \(Y^j(\cdot )\) into the difference equation (23) to obtain
for \(X_i \in \mathfrak {B}_{{\hat{\epsilon }}}(\hat{X}(t_i))\), \(i = 0,\ldots ,N\). This system only depends on \(X = (x,\lambda )\) and is an approximation of the ODE boundary value problem we get by solving (2), (10), (12) for \((y,u,\mu )\) with respect to \((x,\lambda )\) and inserting the result into the differential equations (1) and (9). For this discrete equation we intend to apply Proposition 2 to get error estimates for x and \(\lambda \). Therefore, we introduce the Banach space \(\Omega := W_{1,\infty ,h}^{2 n_x}\) and the linear, normed space \(\Pi := L_{1,h}^{2n_x} \times {\mathbb {R}}^{2n_x}\). Finally, with \({\bar{X}} \in \Omega \) defined by \({\bar{X}}_i:= \hat{X}(t_i)\) for \(i = 0,\ldots ,N\) and some sufficiently small \(r \in (0,{\hat{\epsilon }})\) we have the continuously Fréchet differentiable function \(\mathcal {T}: \mathfrak {B}_{r}({\bar{X}}) \subset \Omega \rightarrow \Pi \) and the linear operator \(\mathcal {L}: \Omega \rightarrow \Pi \)
with \(\mathbb {X}_i:= (x_i,\lambda _{i+1})\) for \(i = 0,\ldots ,N-1\).
5 Proof of the main theorem
Before proving Theorem 1, we derive an upper bound for the consistency error \(\left\| \mathcal {T}({\bar{X}}) \right\| _\Pi \) with respect to h, which gives us error estimates for the discrete solution of \(0 = \mathcal {T}(X)\) with respect to \({\bar{X}}\) after applying Proposition 2.
Lemma 1
If smoothness for \(\kappa \in \left\{ 2,3 \right\} \), the index 1 condition, \(b_j > 0\) for \(j = 1,\ldots ,s\), and the conditions in Table 1 up to order \(\kappa \) are satisfied, then we have
for some constant \(\Gamma \ge 0\) and sufficiently small h.
Proof
The second component of \(\mathcal {T}({\bar{X}})\) is zero. Thus, it remains to estimate
In Appendix 1 we derive Taylor expansions for \(\frac{{\bar{X}}_{i+1} - {\bar{X}}_i}{h}\) and \(F(Z^j(\bar{\textbf{X}}_i),Y^j(\textbf{Z}(\bar{\textbf{X}}_i)))\) for \(i = 0,\ldots ,N-1\) such that the remainder terms are of order \(O(h^3)\). Herein, we omit arguments, i.e., we write \(\hat{F} = F[t_i]\) etc. for \(i = 0,\ldots ,N-1\). Now, we compare the terms of order O(1), O(h), and \(O(h^2)\) for both expansions. The O(1) terms \(\hat{F}\) and \(\sum \limits _{j = 1}^{s} b_j \hat{F}\) are equal if \(\sum \limits _{j = 1}^{s} b_j = 1\), which corresponds to the order 1 condition in Table 1. For order O(h) terms we have
Since \(\Upsilon _j = \sum \limits _{k = 1}^{s} \Lambda _{jk} = \left( \begin{array}{cc} c_j I_{n_x} &{} 0 \\ 0 &{} {\tilde{c}}_j I_{n_x} \end{array}\right) \) with \({\tilde{c}}_j:= \sum \limits _{k = 1}^{s} {\tilde{a}}_{jk}\), we can write
According to Table 1, for second order schemes we assume \(\sum \limits _{j = 1}^{s} b_j c_j = \sum \limits _{j = 1}^{s} d_j = \frac{1}{2}\). Furthermore, we get
Thus, the terms are equal if the second order condition in Table 1 is satisfied. For third order we first consider linear terms of order \(O(h^2)\) such as
and
These terms are equal if
which is satisfied if the third order conditions in Table 1 hold (cf. [18, Theorem 4.1]). For the quadratic terms we have, e.g., \(\frac{1}{6} \hat{F}''_{XX}(\hat{F},\hat{F})\) and
which are equal if
According to [18, Theorem 4.1], this holds if the third order conditions in Table 1 are satisfied. Thus, we obtain
for Runge–Kutta schemes satisfying all conditions in Table 1. In summary, we have
for some \(\Gamma \ge 0\) if the conditions in Table 1 up to order \(\kappa \) are satisfied. \(\square \)
Next, we verify that the conditions of Proposition 2 are satisfied for our abstract setting:
Lemma 2
Let the assumptions of Theorem 1 be satisfied. Then, for arbitrary \(\theta > 0\) there exist some sufficiently small \(r > 0\) and \(h>0\) such that \(\left\| \mathcal {T}'(W) - \mathcal {L} \right\| \le \theta \) for all \(W \in \mathfrak {B}_r({\bar{X}})\). Furthermore, for \({\hat{\pi }} = \mathcal {T}({\bar{X}}) - \mathcal {L}({\bar{X}})\) and some constant \(\sigma > 0\) the mapping \(\mathcal {L}^{-1}:\mathfrak {B}_\sigma ({\hat{\pi }}) \rightarrow \Omega \) is single valued and Lipschitz continuous with constant \(\vartheta \ge 0\).
Proof
First, we estimate the norm for the linear operator \( \mathcal {T}'(W) - \mathcal {L}:\Omega = W_{1,\infty ,h}^{2n_x} \rightarrow \Pi \), i.e.,
To this end, we abbreviate
which is Lipschitz continuous with respect to (X, Y) close to \((\hat{X}(t),\hat{Y}(t))\) for some \(t \in \left[ 0,1\right] \). Using the sensitivities (27), (28), (30), (31) we get
Then, the linearization of \(\mathcal {T}\) in some \(W \in \mathfrak {B}_r({\bar{X}})\) yields
and we can write the linear operator \(\mathcal {L}\) as
For the first component of \(\mathcal {T}'(W) X - \mathcal {L}(X)\) we get
for \(i = 0,\ldots ,N-1\) and some generic constants \(\Gamma _1,\Gamma _2,\Gamma _3 \ge 0\). For the second component we exploit the Lipschitz continuity of \(\Phi '\) to obtain the bound \(r \left\| X \right\| _{1,\infty ,h}\). Thus, we have \(\left\| \mathcal {T}'(W) - \mathcal {L} \right\| \le \theta \) for arbitrary \(\theta > 0\) if r and h are sufficiently small.
Next, we need to verify that \(\mathcal {L}^{-1}:\mathfrak {B}_\sigma ({\hat{\pi }}) \rightarrow \Omega \) exists and is bounded. To this end, we consider the linear, perturbed system
for \(i = 0,\ldots ,N-1\), \(j = 1,\ldots ,s\), and perturbations
These are the KKT-conditions associated with the linear quadratic program
for \(i = 0,\ldots ,N-1\), \(j = 1,\ldots ,s\), and with the discrete quadratic form
Since the matrix \(g'_y[t_i]\) is non-singular for all \(i = 0,\ldots ,N\) by the index 1 assumption, we can solve (34) for \(y_i^j\) and obtain the reduced system
where the reduced quadratic form is defined by
and the perturbations \(({\tilde{\pi }}_f^j,{\tilde{\pi }}_{\mathcal {H}_x}^j,{\tilde{\pi }}_{\mathcal {H}_u}^j) \in L_{1,h}^{n_x} \times L_{1,h}^{n_x} \times L_{\infty ,h}^{n_u}\) by
The matrix functions \(A,\ldots ,Q\) are defined in Remark 2 (iii). Furthermore, the matrix \(Q(t_i)\) is uniformly positive definite for all \(i = 0,\ldots ,N-1\) according to Remark 2 (iii). Hence, we can apply [13, Lemma 6.1] which yields a Lipschitz continuous solution of (RLQP) with respect to the perturbations \(({\tilde{\pi }}_f^j,\pi _0,{\tilde{\pi }}_{\mathcal {H}_x}^j,\pi _{\varphi },{\tilde{\pi }}_{\mathcal {H}_u}^j)\), \(j = 1,\ldots ,s\). This in turn implies that the linear quadratic program (LQP) and therefore the KKT-system (33) have a solution for any perturbation. Moreover, we can write (33) as
with \(\pi _{F,i}:= (\pi _{f,i},-\pi _{\mathcal {H}_x,i})\), \(\pi _{G,i}^j:= (\pi _{\mathcal {H}_y,i}^j,\pi _{\mathcal {H}_u,i}^j,\pi _{g,i}^j)\), and \(\pi _\Phi := (\pi _0,\pi _{\varphi })\). Since the matrix \(G'_Y[t_i]\) is invertible by (20), we can solve the second equation for \(Y_i^j\) and insert the result into the first equation. This yields the perturbed equation \(\mathcal {L}(X) = \pi \), which has a unique Lipschitz continuous solution with respect to \(\pi \in \Pi \) and some Lipschitz constant \(\vartheta \ge 0\) as we have verified. Thus, the single valued mapping \(\mathcal {L}^{-1}:\mathfrak {B}_\sigma ({\hat{\pi }}) \rightarrow \Omega \) exists and is Lipschitz continuous with constant \(\vartheta \). \(\square \)
With the results of Lemmas 1 and 2 we are finally able to prove the main theorem of this paper:
Proof of Theorem 1
In Lemmas 1 and 2 we derived an upper bound for \(\left\| \mathcal {T}({\bar{X}}) \right\| _\Pi \) (compare (32)) and verified the stability conditions of Proposition 2 for some \(r, \theta , \vartheta , \sigma > 0\), respectively. We can choose \(\theta \) and \(r\) sufficiently small such that \(\theta \vartheta < 1\) and \(\theta r \le \sigma \). Additionally, since \(\left\| \mathcal {T}({\bar{X}}) \right\| _\Pi = O(h)\), for sufficiently small \(h\) we have \(\left\| \mathcal {T}({\bar{X}}) \right\| _\Pi \le \min \left\{ \sigma , \frac{(1 - \vartheta \theta )r}{\vartheta } \right\} \). Then, according to Proposition 2, the equation \( 0 = \mathcal {T}(X)\) has a solution \(\hat{X}_h =: (\hat{x}_h,{\hat{\lambda }}_h)\) satisfying the bound
for sufficiently small h. Recalling \({\bar{P}}_i^j = \hat{X}(t_i) - h \sum \limits _{k = 1}^{s} \Lambda _{jk} F[t_i]\), we conclude
for \(i = 0,\ldots ,N\) and \(j = 1,\ldots ,s\). This implies \(\hat{\textbf{X}}_h \in \mathfrak {B}_{{\tilde{\epsilon }}}(\bar{\textbf{P}})\) for sufficiently small h and therefore \(Z^j(\hat{\textbf{X}}_h) =: (\hat{z}^j_h,{\hat{\nu }}^j_h)\) exists for \(j = 1,\ldots ,s\). In addition, we have
for \(\Gamma _1 \ge 0\). Since \({\bar{Z}}^j_i = \hat{X}(t_i)\), we get \(\textbf{Z}(\hat{\textbf{X}}_h) \in \mathfrak {B}_\epsilon (\bar{\textbf{Z}})\) for sufficiently small h. Thus, \(Y^j(\textbf{Z}(\hat{\textbf{X}}_h)) =: (\hat{y}^j_h,\hat{u}^j_h,{\hat{\mu }}^j_h)\) exists for \(j = 1,\ldots ,s\). Moreover, the bound
holds for some \(\Gamma _2 \ge 0\). Therefore, \((\hat{x}_h,\hat{z}_h,\hat{y}_h,\hat{u}_h)\) is feasible for (DOCP) and a local minimizer for sufficiently small h according to [13, Lemma 7.2]. Finally, we verify the (improved) error estimates for \(Y^j(\hat{\textbf{X}}_h) =: (y^j(\hat{x}_h,{\hat{\lambda }}_h),u^j(\hat{x}_h,{\hat{\lambda }}_h),\mu ^j(\hat{x}_h,{\hat{\lambda }}_h))\), which are obtained by solving the algebraic equations (7), (16), (18) for \((y,u,\mu )\) with respect to \((x,\lambda ) = (\hat{x}_h,{\hat{\lambda }}_h)\). These exist for sufficiently small h since \(\left\| \hat{\textbf{X}}_h - \bar{\textbf{Z}} \right\| _{\infty ,h} = \left\| \hat{\textbf{X}}_h - \hat{\textbf{X}} \right\| _{\infty ,h} = O(h^\kappa ) < \epsilon .\) Using the Lipschitz continuity of \(Y^j\) we get
for some \(\Gamma \ge 0\), which proves the assertion. \(\square \)
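The role of the constants \(r, \theta , \vartheta \) in the proof can be illustrated with a scalar toy problem. The following is only a sketch under assumptions: we assume that Proposition 2 is of the usual stability type, i.e., that it yields a zero of \(\mathcal {T}\) within distance \(\frac{\vartheta }{1 - \vartheta \theta } \left\| \mathcal {T}({\bar{X}}) \right\| \) of \({\bar{X}}\). The map `T`, the reference point 0, and all numerical values are hypothetical and are not taken from the optimal control problem; the constant \(\sigma \) is not modelled.

```python
import math


def T(x, delta):
    """Hypothetical scalar map; x_bar = 0 has residual |T(0)| = delta."""
    return x + 0.5 * x * x - delta


L_op = 1.0      # linearization T'(0) of the toy map at x_bar = 0
vartheta = 1.0  # Lipschitz constant of the inverse of L_op
r = 0.5         # radius of the ball around x_bar
theta = r       # |T'(w) - L_op| = |w| <= r on that ball
delta = 0.1     # residual |T(x_bar)|, playing the role of the O(h) term

# Conditions used in the proof (toy values, sigma omitted).
assert theta * vartheta < 1.0
assert delta <= (1.0 - vartheta * theta) * r / vartheta

# Simplified Newton iteration with the frozen linearization L_op.
x = 0.0
for _ in range(60):
    x = x - T(x, delta) / L_op

# Assumed form of the distance bound between the zero and x_bar.
bound = vartheta / (1.0 - vartheta * theta) * delta
print(f"zero of T: {x:.6f}, distance bound: {bound:.6f}")
```

In this toy setting the iteration is a contraction on the ball of radius \(r\), and the computed zero stays within the assumed bound. This mirrors how the consistency error \(O(h)\) is turned into an error estimate of the same order.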
6 Numerical example
In order to verify the results of Theorem 1, we consider the following optimal control problem with a parameter \(\alpha \ne 0\):
Utilizing the normalized necessary conditions associated with this problem
yields the solution
For \(\alpha \in {\mathbb {R}}\setminus \left\{ -4,-1,0 \right\} \) the smoothness assumptions are satisfied. Furthermore, we can (explicitly) solve the algebraic equation for y, i.e., the DAE has index 1. Moreover, we have the quadratic form
and the linearized dynamics
If \(\alpha > 0\), then the coercivity condition (19) is obviously satisfied. For negative \(\alpha \) the linearized dynamics imply \(x_1(t) = 0\) and \(y(t) = \frac{1}{2} u(t)\) for all \(t \in \left[ 0,1\right] \). Hence, we obtain
which is positive for \(\alpha \in (-4,0)\). However, the coercivity assumption does not hold for \(\alpha < - 4\). To cover both situations, we ran numerical experiments for the parameter values \(\alpha = 1\) and \(\alpha = -4.2\) with the Runge–Kutta schemes in Table 2.
The convergence order \(\kappa \) for \(\hat{w} = \hat{x}, \hat{y},\ldots \) can be approximated with the formula
Therefore, in Figs. 1 and 2 we plotted \(-\log _2\left( \left\| \hat{w}_h - \hat{w} \right\| _{\infty ,h}\right) \) for the different errors and the values \(N = 20, 40, 80, 160, 320, 640\). Since \(N\) doubles from one point to the next, the convergence order is indicated by the vertical distance between two consecutive points.
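The way the order is read off from such data can be sketched in a few lines of Python. This is a minimal illustration under assumptions and not the experiment of this section: instead of the optimal control problem we apply Heun's method to the scalar test equation \(x' = -x\), \(x(0) = 1\) on \([0,1]\), whose exact solution is known, and the function name `heun_error` is hypothetical. The grid sizes are the ones used above, and the error is measured in the maximum norm over the grid points.

```python
import math


def heun_error(n_steps):
    """Maximum grid error of Heun's method for x' = -x, x(0) = 1 on [0, 1]."""
    h = 1.0 / n_steps
    x, err = 1.0, 0.0
    for i in range(n_steps):
        k1 = -x                   # slope at the left end point
        k2 = -(x + h * k1)        # slope at the explicit Euler predictor
        x += 0.5 * h * (k1 + k2)  # Heun update (average of both slopes)
        err = max(err, abs(x - math.exp(-(i + 1) * h)))
    return err


grid_sizes = [20, 40, 80, 160, 320, 640]
# Heights of the plotted points: -log2 of the discrete maximum-norm error.
heights = [-math.log2(heun_error(n)) for n in grid_sizes]
# Vertical distance between consecutive points = estimated order.
orders = [b - a for a, b in zip(heights, heights[1:])]

for n, height, order in zip(grid_sizes[1:], heights[1:], orders):
    print(f"N = {n:4d}   -log2(error) = {height:7.3f}   order = {order:5.3f}")
```

Each entry of `orders` equals \(\log _2\) of the ratio of the errors for two consecutive grids, so for a second order scheme the values should be close to 2.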
The Heun scheme is displayed in Fig. 1 with the aforementioned values of \(\alpha \). Solving the algebraic constraints for \((y,\mu ,u)\) with respect to \((x,\lambda ) = (\hat{x}_h,{\hat{\lambda }}_h)\) improves the accuracy for the algebraic states and control but not the convergence order of 2. Note that even if the coercivity condition is not satisfied (for \(\alpha = -4.2\)) we still get second order error estimates. For the third order scheme Radau IA and \(\alpha = 1\) we get third order error estimates for the differential states but only second order error estimates for the algebraic states and control (compare Fig. 2). The order of convergence is improved by solving the algebraic constraints for \((y,\mu ,u)\) with respect to \((x,\lambda ) = (\hat{x}_h,{\hat{\lambda }}_h)\). For the parameter value \(\alpha = -4.2\) we do not get third order error estimates for the differential states but still second order error estimates for the algebraic variables and control (compare Fig. 2). Since the differential states do not satisfy third order estimates, solving the algebraic constraints will not improve the convergence order for the algebraic states and control. Therefore, these errors were omitted for the \(\alpha = -4.2\) case.
7 Conclusions
In this paper we proposed a new type of Runge–Kutta scheme for optimal control problems with DAEs. Instead of approximating the algebraic variable with stage derivatives, we treated the algebraic state like a control. For this discretization scheme we derived necessary conditions that are consistent with the continuous conditions and we were able to establish error estimates for Runge–Kutta schemes applied to optimal control problems with index 1 DAEs. The next step is to apply this method to problems with index 2 DAEs and derive error estimates. However, for optimal control problems with index 2 DAEs some new difficulties occur. There exists a discrepancy between the continuous and discrete necessary conditions, i.e., the adjoint continuous DAE has index 1 while the discrete system approximates an index 2 DAE. In [28] we were able to overcome this problem by performing an index reduction for the discrete system. In that case we used the implicit Euler discretization. To obtain higher order error estimates, the idea is to perform such a discrete index reduction in the context of Runge–Kutta schemes. Unfortunately, so far we have not been able to find a way to derive consistent necessary conditions.
References
Alt, W., Baier, R., Lempio, F., Gerdts, M.: Approximations of linear control problems with bang–bang solutions. Optimization 62(1), 9–32 (2011). https://doi.org/10.1080/02331934.2011.568619
Alt, W., Baier, R., Gerdts, M., Lempio, F.: Error bounds for Euler approximation of linear-quadratic control problems with bang–bang solutions. Numer. Algebra Contr. Optim. 2(3), 547–570 (2012). https://doi.org/10.3934/naco.2012.2.547
Alt, W., Seydenschwanz, M.: An implicit discretization scheme for linear-quadratic control problems with bang–bang solutions. Optim. Methods Softw. 29(3), 535–560 (2014). https://doi.org/10.1080/10556788.2013.821612
Alt, W., Schneider, C.: Linear-quadratic control problems with \({L}^1\)-control cost. Optim. Contr. Appl. Methods 36(4), 512–534 (2015). https://doi.org/10.1002/oca.2126
Alt, W., Schneider, C., Seydenschwanz, M.: Regularization and implicit Euler discretization of linear-quadratic optimal control problems with bang–bang solutions. Appl. Math. Comput. 287–288(5), 104–124 (2016). https://doi.org/10.1016/j.amc.2016.04.028
Alt, W., Felgenhauer, U., Seydenschwanz, M.: Euler discretization for a class of nonlinear optimal control problems with control appearing linearly. Comput. Optim. Appl. 69(3), 825–856 (2018). https://doi.org/10.1007/s10589-017-9969-7
Betts, J. T.: Practical Methods for Optimal Control and Estimation Using Nonlinear Programming, 2nd edn. Advances in Design and Control. SIAM (2010). https://doi.org/10.1137/1.9780898718577
Bonnans, J.F., Festa, A.: Error estimates for the Euler discretization of an optimal control problem with first-order state constraints. SIAM J. Numer. Anal. 55(2), 445–471 (2017). https://doi.org/10.1137/140999621
Brennan, K.E., Campbell, S.L., Petzold, L.R.: Numerical Solution of Initial-Value Problems in Differential-Algebraic Equations. volume 14 of Classics in Applied Mathematics. SIAM (1996). https://doi.org/10.1137/1.9781611971224
Burger, M., Gerdts, M.: A survey on numerical methods for the simulation of initial value problems with sDAEs. In: Ilchmann, A., Reis, T. (eds.) Surveys in Differential-Algebraic Equations IV. Differential-Algebraic Equations Forum. Springer, Berlin (2017). https://doi.org/10.1007/978-3-319-46618-7_5
Dontchev, A.L., Hager, W.W., Poore, A.B., Yang, B.: Optimality, stability, and convergence in nonlinear control. Appl. Math. Optim. 31, 297–326 (1995). https://doi.org/10.1007/BF01215994
Dontchev, A.L., Hager, W.W., Malanowski, K.: Error bounds for Euler approximation of a state and control constrained optimal control problem. Numer. Funct. Anal. Optim. 21(5–6), 653–682 (2000). https://doi.org/10.1080/01630560008816979
Dontchev, A.L., Hager, W.W., Veliov, V.M.: Second-order Runge–Kutta approximations in control constrained optimal control. SIAM J. Numer. Anal. 38(1), 202–226 (2000). https://doi.org/10.1137/S0036142999351765
Dontchev, A.L., Hager, W.W.: The Euler approximation in state constrained optimal control. Math. Comput. 70(233), 173–203 (2001). https://doi.org/10.1090/S0025-5718-00-01184-4
Dunn, J.C., Tian, T.: Variants of the Kuhn–Tucker sufficient conditions in cones of nonnegative functions. SIAM J. Contr. Optim. 30(6), 1361–1384 (1992). https://doi.org/10.1137/0330072
Gerdts, M.: Optimal Control of ODEs and DAEs. De Gruyter, Berlin (2012). https://doi.org/10.1515/9783110249996
Gerdts, M., Kunkel, M.: Convergence analysis of Euler discretization of control-state constrained optimal control problems with controls of bounded variation. J. Ind. Manag. Optim. 10(1), 311–336 (2014). https://doi.org/10.3934/JIMO.2014.10.311
Hager, W.W.: Runge–Kutta methods in optimal control and the transformed adjoint system. Numer. Math. 87(2), 247–282 (2000). https://doi.org/10.1007/s002110000178
Hairer, E., Lubich, C., Roche, M.: The Numerical Solution of Differential-Algebraic Systems by Runge–Kutta Methods, vol. 1409. Springer, Berlin (1989). https://doi.org/10.1007/BFb0093947
Hairer, E., Wanner, G.: Solving Ordinary Differential Equations II: Stiff and Differential-Algebraic Problems, Springer Series in Computational Mathematics, vol. 14, 2nd edn. Springer, Berlin (1996). https://doi.org/10.1007/978-3-662-09947-6
Haunschmied, J.L., Pietrus, A., Veliov, V.M.: The Euler method for linear control systems revisited. In: Proceedings of the 9th International Conference on Large-Scale Scientific Computations, Sozopol, pp. 90–97 (2013). https://doi.org/10.1007/978-3-662-43880-0_9
Ioffe, A.D., Tihomirov, V.M.: Theory of Extremal Problems. Studies in Mathematics and Its Applications, vol. 6. North-Holland Publishing Company, Amsterdam (1979)
Kraft, D.: FORTRAN-Programme zur numerischen Lösung optimaler Steuerungsprobleme. DFVLR-Mitteilung, vol. 80. DFVLR (1980)
Kunkel, P., Mehrmann, V.: Differential-Algebraic Equations: Analysis and Numerical Solution. EMS Textbooks in Mathematics. European Mathematical Society (2006). https://doi.org/10.4171/017
Malanowski, K., Büskens, C., Maurer, H.: Convergence of Approximations to Nonlinear Optimal Control Problems. Lecture Notes in Pure and Applied Mathematics. In: Fiacco, A.V. (ed.) Mathematical programming with data perturbations. CRC Press, Boca Raton (1997). https://doi.org/10.1201/9781003072119-12
Martens, B., Gerdts, M.: Convergence analysis of the implicit Euler-discretization and sufficient conditions for optimal control problems subject to index-one differential-algebraic equations. Set-Valued Var. Anal. 27, 405–431 (2019). https://doi.org/10.1007/S11228-018-0471-X
Martens, B.: Necessary conditions, sufficient conditions, and convergence analysis for optimal control problems with differential-algebraic equations. Ph.D. thesis, Universität der Bundeswehr München (2019)
Martens, B., Gerdts, M.: Convergence analysis for approximations of optimal control problems subject to higher index differential-algebraic equations and mixed control-state constraints. SIAM J. Contr. Optim. 58(1), 1–33 (2020). https://doi.org/10.1137/18m1219382
Martens, B., Gerdts, M.: Error analysis for the implicit Euler discretization of linear-quadratic control problems with higher index DAEs and bang-bang solutions. In: Reis, T., Grundel, S., Schöps, S. (eds.) Progress in differential-algebraic Equations II. Differential-algebraic equations forum. Springer, Berlin (2020). https://doi.org/10.1007/978-3-030-53905-4_10
Martens, B., Gerdts, M.: Convergence analysis for approximations of optimal control problems subject to higher index differential-algebraic equations and pure state constraints. SIAM J. Contr. Optim. 59(3), 1903–1926 (2021). https://doi.org/10.1137/20M1353952
Martens, B., Schneider, C.: Error analysis for the implicit Euler discretization of affine optimal control problems with index two DAEs. Pure Appl. Funct. Anal. 6(6), 1383–1414 (2021)
Osmolovskii, N.P., Veliov, V.M.: Metric sub-regularity in optimal control of affine problems with free end state. ESAIM: COCV (2020). https://doi.org/10.1051/COCV/2019046
Pietrus, A., Scarinci, T., Veliov, V.M.: High order discrete approximations to Mayer’s problems for linear systems. SIAM J. Contr. Optim. 56(1), 102–119 (2018). https://doi.org/10.1137/16M1079142
Scarinci, T., Veliov, V.M.: Higher-order numerical scheme for linear-quadratic problems with bang-bang controls. Comput. Optim. Appl. 69, 403–422 (2018). https://doi.org/10.1007/s10589-017-9948-z
Schneider, C., Wachsmuth, G.: Regularization and discretization error estimates for optimal control of ODEs with group sparsity. ESAIM: COCV 24(2), 811–834 (2018). https://doi.org/10.1051/COCV/2017049
Seydenschwanz, M.: Convergence results for the discrete regularization of linear-quadratic control problems with bang–bang solutions. Comput. Optim. Appl. 61, 731–760 (2015). https://doi.org/10.1007/s10589-015-9730-z
von Stryk, O.: Numerische Lösung optimaler Steuerungsprobleme: Diskretisierung, Parameteroptimierung und Berechnung der adjungierten Variablen. Ph.D. thesis, Technische Universität München (1994)
Veliov, V.M.: Error analysis of discrete approximations to bang–bang optimal control problems: the linear case. Contr. Cybern. 34(3), 967–982 (2005)
Acknowledgements
This work was supported by the German Research Foundation DFG under the contract GE 1163/8-2.
Funding
Open Access funding enabled and organized by Projekt DEAL.
Ethics declarations
Statements and Declarations
The author has no relevant financial or non-financial interests to disclose. All data generated or analyzed during this study are included in this published article.
Appendix 1: Taylor expansions
In order to estimate the consistency error (32), we apply Taylor expansions to \(\frac{{\bar{X}}_{i+1} - {\bar{X}}_i}{h}\) and \(F(Z^j(\bar{\textbf{X}}_i),Y^j(\textbf{Z}(\bar{\textbf{X}}_i)))\) with \({\bar{X}}_i = \hat{X}(t_i)\) for \(i = 0,\ldots ,N-1\). This gives us
where the right hand side is evaluated at \(t = t_i\), i.e., \(\hat{F} = F[t_i]\) etc. We abbreviate \({\widetilde{Z}}^j = Z^j(\bar{\textbf{X}}_i)\) and \({\widetilde{Y}}^j = Y^j(\textbf{Z}(\bar{\textbf{X}}_i))\). Then, we have
since \(Y^j(\cdot )\) is Lipschitz continuous. Furthermore, expanding \(F({\widetilde{Z}}^j,{\widetilde{Y}}^j)\) and \(G({\widetilde{Z}}^j,{\widetilde{Y}}^j)\) in \((\hat{X}(t_i),\hat{Y}(t_i)) =: (\hat{X},\hat{Y})\) yields
Hence, we get
with \(\Upsilon _j = \sum \limits _{k = 1}^{s} \Lambda _{jk}\). This implies
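For orientation, the generic shape of the first expansion in this appendix can be sketched as follows. This is the standard Taylor formula for the difference quotient and not a reproduction of the displayed formulas of this appendix; it assumes that \(\hat{X}\) is sufficiently smooth (for instance four times continuously differentiable) on \([t_i,t_{i+1}]\):

\[
\frac{\hat{X}(t_{i+1}) - \hat{X}(t_i)}{h} = \hat{X}'(t_i) + \frac{h}{2} \hat{X}''(t_i) + \frac{h^2}{6} \hat{X}'''(t_i) + O(h^3).
\]

Assuming that \(\hat{X}\) satisfies \(\hat{X}'(t) = F[t]\), the derivatives on the right hand side can be expressed through \(F\) and its partial derivatives by the chain rule. Comparing the resulting terms with the expansion of the stage values of the scheme is what determines the order of the consistency error.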
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Martens, B. Error estimates for Runge–Kutta schemes of optimal control problems with index 1 DAEs. Comput Optim Appl 86, 1299–1325 (2023). https://doi.org/10.1007/s10589-023-00484-1
DOI: https://doi.org/10.1007/s10589-023-00484-1
Keywords
- Optimal control
- Differential–algebraic equation
- Discrete approximations
- Convergence analysis
- Runge–Kutta schemes