1 Introduction

In this paper we study an inverse problem in which we aim to identify finitely many parameters of an optimal control problem with a linear partial differential equation. This results in an infinite-dimensional bilevel optimal control problem. The concept of bilevel optimization is discussed in [1,2,3,4], while [5,6,7,8] present a comprehensive introduction to optimal control. Bilevel optimal control problems are also studied in [9,10,11,12], for example. To be more precise, we consider the parametric optimization problem

$$\begin{aligned} \begin{aligned} \min _{y,u} \quad&f(\beta , y, u ) \\ \text {s.t.} \quad&A y = B u, \\ \quad&(y,u) \in Y \times U_{\textrm{ad}}, \end{aligned} \end{aligned}$$
(LL\((\beta )\))

where \(\beta \in Q\subset {\mathbb {R}}^n\) is a parameter, and the sets \(Q\), \(U_{\textrm{ad}}\), the linear operators A, B, the spaces U, Y, and the function f are such that Assumption 2.1 is satisfied. Here \(u\in U_{\textrm{ad}}\) is the control, \(y\in Y\) is the state, and \(Ay=Bu\) describes an elliptic PDE. Assumption 2.1 guarantees that the solution of (LL\((\beta )\)) is unique for each \(\beta \in Q\), see Lemma 2.2.

The problem (LL\((\beta )\)) is also called the lower-level problem. The upper-level problem under investigation is

$$\begin{aligned} \begin{aligned} \min _{\beta \in {\mathbb {R}}^n} \quad&F(\beta , y, u ) \\ \text {s.t.} \quad&\beta \in Q, \\ \quad&(y,u) = \Psi (\beta ), \end{aligned} \end{aligned}$$
(UL)

where \(\Psi (\beta )\) describes the unique solution of (LL\((\beta )\)). Our main motivation for studying (UL) is the identification of an unknown parameter \(\beta \) from some (possibly perturbed) measurements of \(\Psi (\beta )\), see also Sect. 5.

Together, the problems (LL\((\beta )\)) and (UL) constitute the bilevel optimization problem. Necessary optimality conditions for bilevel optimal control problems, i.e., hierarchical optimization problems with two decision layers in which at least one decision maker has to solve an optimal control problem, are derived in [13,14,15,16,17,18]. Recently, a solution theory for inverse optimal control problems with partial differential equations was developed in [19, 20]. We also note that optimal control problems with variational inequality constraints, such as optimal control of the obstacle problem (see [21]), can be viewed as bilevel optimal control problems. Regarding the numerical solution of the presented problem type, there mainly exist (to the best of our knowledge) methods for inverse optimal control problems with ordinary differential equations, see [10, 22,23,24]. The corresponding algorithms tend to replace the lower-level problem by its optimality conditions. A different approach was introduced in [25], where the authors solved a special class of inverse problems for partial differential equations by exploiting the optimal-value function of the parametric optimal control problem. The optimal-value function \(\varphi :Q\rightarrow {\mathbb {R}}\) of (LL\((\beta )\)) is defined by

$$\begin{aligned} \varphi (\beta ):= \inf \big \{ f(\beta , y, u) \big |(y,u) \in Y \times U_{\textrm{ad}}, A y = B u \big \} = f(\beta , \Psi (\beta )). \end{aligned}$$
(1)

The idea of using the optimal-value function in bilevel optimization problems can be traced back to [26]. With the help of the optimal-value function, the hierarchical problem (UL) can be transformed into the single-level problem

$$\begin{aligned} \begin{aligned} \min _{\beta ,y,u} \quad&F(\beta , y, u ) \\ \text {s.t.} \quad&\beta \in Q, \\&f(\beta , y, u) \le \varphi (\beta ), \\&A y - B u = 0, \\ \quad&u \in U_{\textrm{ad}}. \end{aligned} \end{aligned}$$
(OVR)

We call this optimization problem the optimal-value reformulation of (UL). This resulting nonconvex surrogate problem does not satisfy standard constraint qualifications such as Robinson’s CQ. However, in [25, Theorem 5.12] the authors were able to prove necessary optimality conditions of Clarke-stationary type via a relaxation approach. Furthermore, [25, Algorithm 1] introduces a solution algorithm using a piecewise affine approximation \(\xi \) of the optimal-value function \(\varphi \) with \(\xi \ge \varphi \), which leads to the relaxed optimization problem

$$\begin{aligned} \begin{aligned} \min _{\beta ,y,u} \quad&F(\beta , y, u ) \\ \text {s.t.} \quad&\beta \in Q, \\&f(\beta , y, u) \le \xi (\beta ), \\&A y - B u = 0, \\ \quad&u \in U_{\textrm{ad}}. \end{aligned} \end{aligned}$$
(OVR\((\xi )\))

If f and F are convex, this problem can be split into finitely many convex subproblems for which a global solution can be obtained. The original problem can then be solved by iteratively improving the approximation \(\xi \) of the optimal-value function, see [25, Theorem 6.5]. In this paper we start with the same approach to derive a global solution scheme. We deviate slightly in the construction of the piecewise affine approximation by starting with a triangulation of the admissible set of the upper-level variable and subsequently enforcing some regularity on further subdivisions. In addition to proving convergence of the global solution scheme in Theorem 3.2, this allows us to link the convergence speed to the size of the elements of the partition (see Theorem 3.6). In order to solve (OVR\((\xi )\)), we also consider the penalty problem

$$\begin{aligned} \begin{aligned} \min _{\beta ,y,u} \quad&F(\beta , y, u ) + \gamma \, P\big ( f(\beta , y, u) - \xi (\beta ) \big ) \\ \text {s.t.} \quad&\beta \in Q, \\&A y - B u = 0, \\ \quad&u \in U_{\textrm{ad}}. \end{aligned} \end{aligned}$$
(OVRP\((\xi )\))

Here, \(P :{\mathbb {R}}\rightarrow {\mathbb {R}}\) is a penalty function and \(\gamma > 0\). Interestingly, we will see that it is possible to choose the identity \(P(x) = x\) as a penalty function. This has several benefits. On the one hand, we show in Lemma 4.7 that a finite penalty parameter can be chosen such that one obtains the solution of (OVR\((\xi )\)). On the other hand, the choice of the identity results in much simpler derivatives of the objective of (OVRP\((\xi )\)) and this enables us to use a semismooth Newton method to solve the subproblems efficiently, see Sect. 5.3.

Solving nonconvex problems to global optimality is an intricate issue and, hence, we expect difficulties. Indeed, our approach has some limitations concerning the obtained convergence speed, see Remark 3.7. Especially in a practical setting, the convergence speed deteriorates with an increasing dimension of the upper-level variable (curse of dimensionality).

Let us describe the structure of this paper. In Sect. 2 we present the notation used throughout, the main governing assumption, and some preliminary theory related to optimal control problems. We proceed by introducing a global solution algorithm (Algorithm 1) in Sect. 3 and prove its convergence in Theorem 3.2. Further, we present convergence speed estimates in Theorem 3.6 related to the size and regularity of the elements in the partition. To ensure this property, we derive a simple method for refining the partition in arbitrary finite dimensions while keeping some regularity properties of the elements, see Lemma 3.3. On top of this foundation we introduce our penalty approach (Algorithm 2) in Sect. 4. We show that there exists a choice of the penalty parameter (see Lemma 4.7) for which one can expect to find the solution to the subproblems from Algorithm 1. A method for solving the penalty subproblems by means of a semismooth Newton method is presented in Sect. 5. We show its superlinear convergence in Theorem 5.8. The corresponding implementation of our algorithm for solving the inverse optimal control problem and a numerical example are covered in Sect. 6.

2 Preliminaries

2.1 Notation

The norm in a (real) Banach space X is denoted by \(\Vert \cdot \Vert _{X}\). Let \(B_{X}^{\varepsilon }(x)\) denote the closed \(\varepsilon \)-ball centered at \(x\in X\) with respect to \(\Vert \cdot \Vert _{X}\). Furthermore, \(X^{\star }\) is the topological dual of X and \(\langle \cdot ,\cdot \rangle _{X}: X^{\star }\times X \rightarrow {\mathbb {R}}\) denotes the corresponding dual pairing. For a set \(A\subset X\) we denote by \({\text {conv}} A\), \({\text {cone}} A\), \({\text {cl}} A\), \({\text {int}} A\) and \(\partial A\) the convex hull, the conical hull, the closure, the interior and the boundary of A, respectively. For a Banach space Y, the space of all bounded linear operators from X to Y is denoted by L[X,Y] and for an operator \(F\in L[X,Y]\) the adjoint is denoted by \(F^{\star }\in L[Y^{\star },X^{\star }]\). For a convex set \(C\subset X\) and a point \(x\in C\) we denote by

$$\begin{aligned} {\mathcal {R}}_C(x)&:= {\text {cone}}(C-x), \\ {\mathcal {N}}_C(x)&:= \{x^{\star }\in X^{\star }\big |{\langle }{x^{\star },y-x}{\rangle }_X\le 0,\;\forall y\in C\} \end{aligned}$$

the radial cone and the normal cone to the set C at the point \(x \in C\), respectively. For \(x \not \in C\), we set \({\mathcal {N}}_C(x):= \emptyset \).

The set \({\mathbb {R}}^{n}\) denotes the usual n-dimensional real vector space, equipped with the Euclidean norm \(\Vert \cdot \Vert _{{\mathbb {R}}^{n}}\). The sets \({\mathbb {R}}_{+},{\mathbb {R}}_{-}\) represent the nonnegative and nonpositive real numbers, respectively. For an arbitrary bounded and open set \(\Omega \subset {\mathbb {R}}^{d}\), the space of equivalence classes of measurable, p-integrable functions is given by \(L^{p}(\Omega )\), \(p \in [1,\infty )\). Similarly, \(L^\infty (\Omega )\) denotes the space of essentially bounded (equivalence classes of) measurable functions. Furthermore, we use the notations \(H_0^1(\Omega )\) and \(H^{-1}(\Omega ) {:}{=}H_{0}^{1}(\Omega )^{\star }\) for the Sobolev space with first-order derivatives and homogeneous boundary conditions and for its dual space.

A mapping \(J:X\rightarrow Y\) is called Fréchet differentiable at \(x\in X\) if there exists an operator \(J'(x)\in L[X,Y]\) such that

$$\begin{aligned} \lim _{\Vert d\Vert _{X}\rightarrow 0} \frac{\Vert J(x+d)-J(x)-J'(x)d\Vert _{Y}}{\Vert d\Vert _{X}} = 0. \end{aligned}$$
(2)

In this case, \(J'(x)\) is called the Fréchet derivative of J at x. If \(X\ni x\mapsto J'(x)\in L[X,Y]\) is well defined and continuous in a neighborhood of x, then J is said to be continuously Fréchet differentiable at x.

2.2 Assumptions

Throughout this work we utilize the following standing assumption.

Assumption 2.1

(Standing assumption)

  1. (a)

    The spaces Y and U are (real) Hilbert spaces.

  2. (b)

    The set \(Q\subset {\mathbb {R}}^{n}\) is a nonempty bounded polyhedron, i.e., a nonempty and bounded intersection of finitely many closed halfspaces. We assume that \(Q\) possesses a nonempty interior.

  3. (c)

    The set \(U_{\textrm{ad}}\subset U\) is nonempty, closed and convex.

  4. (d)

    The operator \(A\in L[Y,Y^{\star }]\) is an isomorphism and \(B\in L[U,Y^{\star }]\) is a linear bounded operator. We denote by \(S:= A^{-1} B \in L[U, Y]\) the control-to-state map.

  5. (e)

    The functionals \(F:Q\times Y \times U\rightarrow {\mathbb {R}}\) and \(f:Q\times Y \times U\rightarrow {\mathbb {R}}\) are assumed to be bounded from below, convex and continuously Fréchet differentiable.

  6. (f)

    The functional F and the partial derivatives of f satisfy some specific Lipschitz-like properties on bounded sets, i.e. for every \(M \ge 0\) there exists a constant \(L_M \ge 0\), such that

    $$\begin{aligned} \begin{aligned} {|} F(\beta , y_1, u_1) - F(\beta , y_2, u_2) {|}&\le L_M \, ( {\Vert }y_1 - y_2{\Vert }_Y + {\Vert }u_1 - u_2{\Vert }_U )\\ {\Vert } f'_\beta (\beta _1, y_1, u_1) - f'_\beta (\beta _2, y_2, u_2) {\Vert }_{{\mathbb {R}}^n}&\le L_M \, ( {\Vert }\beta _1 - \beta _2{\Vert }_{{\mathbb {R}}^n} + {\Vert }y_1 - y_2{\Vert }_Y + {\Vert }u_1 - u_2{\Vert }_U )\\ {\Vert } f'_u(\beta _1, S(u), u) - f'_u(\beta _2, S(u), u) {\Vert }_{U^{\star }}&\le L_M \, {\Vert } \beta _1 - \beta _2 {\Vert }_{{\mathbb {R}}^n} \\ {\Vert }f'_y(\beta _1,S(u),u) - f'_y(\beta _2,S(u),u){\Vert }_{Y^{\star }}&\le L_M{\Vert }\beta _1-\beta _2{\Vert }_{{\mathbb {R}}^n} \end{aligned} \end{aligned}$$

    hold for all \(\beta ,\beta _1,\beta _2 \in Q\), \(y_1, y_2 \in B^M_Y(0)\) and \(u, u_1, u_2 \in U_{\textrm{ad}}\cap B^M_U(0)\).

  7. (g)

    The reduced lower-level objective \(u \mapsto f(\beta , S(u), u)\) is assumed to be strongly convex with respect to the control with constant \(\mu >0\) independent of \(\beta \in Q\), i.e.,

    $$\begin{aligned} f(\beta ,S(u_{2}),u_{2}) \ge f(\beta ,S(u_{1}),u_{1}) + {\langle }{f_{y}'(\cdot ),S(u_{2} - u_{1})}{\rangle } + {\langle }{ f_{u}'(\cdot ),u_{2} - u_{1} }{\rangle } + \frac{\mu }{2} {\Vert } u_{2} - u_{1} {\Vert }^{2}_{U} \end{aligned}$$

    holds for all \(\beta \in Q\) and \(u_{1},u_{2}\in U_{\textrm{ad}}\). Here, \(f_y'(\cdot )\) and \(f_u'(\cdot )\) denote the partial derivatives of f w.r.t. y and u at the point \((\beta ,S(u_{1}),u_{1})\).
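As an illustration of items (e)–(g) (the specific functional below is merely an example of our own and not part of the general setting), one may think of the tracking-type lower-level objective

$$\begin{aligned} f(\beta ,y,u) = \frac{1}{2} \Big \Vert y - \sum _{i=1}^{n} \beta _i\, y_d^i \Big \Vert _{L^2(\Omega )}^2 + \frac{\alpha }{2} \Vert u\Vert _{L^2(\Omega )}^2 \end{aligned}$$

with fixed functions \(y_d^1,\ldots ,y_d^n \in L^2(\Omega )\) and a fixed regularization parameter \(\alpha > 0\). This functional is bounded from below, jointly convex and continuously Fréchet differentiable, the reduced objective \(u \mapsto f(\beta ,S(u),u)\) is strongly convex with \(\mu = \alpha \) independently of \(\beta \), and the partial derivatives \(f'_\beta \), \(f'_y\) and \(f'_u\) satisfy the Lipschitz estimates of item (f) on bounded sets.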

2.3 Preliminary results

Let the optimization problem

$$\begin{aligned} \begin{aligned} \min _{x\in X} \quad&J(x)\nonumber \\ \text {s.t.} \quad&g(x)\in C \end{aligned} \end{aligned}$$
(OP)

be given, with continuously Fréchet differentiable mappings \(J:X\rightarrow {\mathbb {R}}\), \(g:X\rightarrow Y\) between Banach spaces X, Y and \(C\subset Y\) being nonempty, closed and convex. A feasible point \(x\in X\) of (OP) satisfies the Karush-Kuhn-Tucker (KKT) conditions if

$$\begin{aligned} \exists \lambda \in {\mathcal {N}}_{C}(g(x)): \qquad J'(x) + g'(x)^{\star }\lambda = 0. \end{aligned}$$
(3)

If x is a local solution of (OP) which satisfies Robinson’s constraint qualification

$$\begin{aligned} g'(x)X - {\mathcal {R}}_{C}(g(x)) = Y, \end{aligned}$$
(4)

then the KKT conditions hold, see [27] and [28, Theorem 3.9]. Due to Assumption 2.1, the lower-level problem fits into the setting of (OP). For a parameter \({\tilde{\beta }}\), the KKT system of the lower-level problem at a solution \(({\tilde{y}},{\tilde{u}})\) then reads

$$\begin{aligned} \begin{aligned} 0&= f'_{y}(\tilde{\beta },\tilde{y},\tilde{u})+ A^{\star }{\tilde{p}}, \\ 0&= f'_{u}(\tilde{\beta },\tilde{y},\tilde{u})- B^{\star }{\tilde{p}}+{\tilde{\nu }}, \\ 0&= A{\tilde{y}} - B{\tilde{u}}, \\ {\tilde{\nu }}&\in {\mathcal {N}}_{U_{\textrm{ad}}}({\tilde{u}}), \end{aligned} \end{aligned}$$
(5)

where \({\tilde{p}}\in Y\) (we identify \(Y^{\star \star }\) with Y), \({\tilde{\nu }}\in U^{\star }\) are multipliers. Note that Robinson’s CQ is satisfied due to the surjectivity of A. Thus, for a minimizer of the lower-level problem there exist multipliers such that the KKT system (5) is satisfied.
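For orientation (the concrete data below are again our own illustrative choices), take \(Y = H_0^1(\Omega )\), \(U = L^2(\Omega )\), \(A = -\Delta \in L[H_0^1(\Omega ), H^{-1}(\Omega )]\), B the embedding \(L^2(\Omega ) \hookrightarrow H^{-1}(\Omega )\) and the tracking-type functional \(f(\beta ,y,u) = \frac{1}{2}\Vert y - \sum _{i} \beta _i y_d^i\Vert _{L^2(\Omega )}^2 + \frac{\alpha }{2}\Vert u\Vert _{L^2(\Omega )}^2\) used as illustration after Assumption 2.1. Then (5) becomes

$$\begin{aligned} -\Delta {\tilde{p}} = -\Big ({\tilde{y}} - \sum _{i=1}^{n} {\tilde{\beta }}_i\, y_d^i\Big ), \qquad \alpha {\tilde{u}} - {\tilde{p}} + {\tilde{\nu }} = 0, \qquad -\Delta {\tilde{y}} = {\tilde{u}}, \qquad {\tilde{\nu }}\in {\mathcal {N}}_{U_{\textrm{ad}}}({\tilde{u}}), \end{aligned}$$

i.e., the familiar state-adjoint system of linear-quadratic optimal control with homogeneous Dirichlet boundary conditions.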

We can now prove that the assumption of strong convexity for the lower level implies a quadratic growth condition in the solution.

Lemma 2.2

For every \(\beta \in Q\), the lower-level problem (LL\((\beta )\)) has a unique solution \((y_\beta , u_\beta )\). Moreover, the quadratic growth condition

$$\begin{aligned} f(\beta ,S(u),u) \ge f(\beta , y_\beta , u_\beta ) + \frac{\mu }{2} {\Vert } u - u_\beta {\Vert }^{2}_{U} \qquad \forall u\in U_{\textrm{ad}}\end{aligned}$$
(6)

is satisfied with the parameter \(\mu > 0\) from Assumption 2.1(g).

Proof

Existence of a solution follows from the direct method of calculus of variations. Note that the boundedness of the minimizing sequence follows from the strong convexity.

Let \((y_\beta , u_\beta )\) denote a solution of (LL\((\beta )\)). Utilizing the strong convexity in the solution \((\beta ,y_{\beta },u_{\beta })\) yields

$$\begin{aligned} f(\beta ,S(u),u) \ge f(\beta ,y_{\beta },u_{\beta })+ {\langle }{ f_{u}'(\cdot ),u - u_\beta }{\rangle } + {\langle }{f'_{y}(\cdot ),S(u - u_\beta )}{\rangle } + \frac{\mu }{2} {\Vert } u - u_\beta {\Vert }^{2}_{U} \end{aligned}$$

for all \(u \in U_{\textrm{ad}}\), where \(f_u'(\cdot )\) and \(f_y'(\cdot )\) denote the partial derivatives of f in \((\beta ,y_{\beta },u_{\beta })\). By using the KKT conditions with multipliers p, \(\nu \) we obtain

$$\begin{aligned} \begin{aligned} {\langle }{f'_{u}(\cdot ),u - u_\beta }{\rangle } + {\langle }{f'_{y}(\cdot ),S(u - u_\beta )}{\rangle }&= {\langle }{f'_{u}(\cdot ) + S^{\star }f'_{y}(\cdot ),u-u_\beta }{\rangle }\\&= {\langle }{f'_{u}(\cdot ) - S^{\star }A^{\star }p,u-u_\beta }{\rangle }\\&= {\langle }{f'_{u}(\cdot ) - B^{\star }p,u-u_\beta }{\rangle } \\&= {\langle }{-\nu ,u-u_\beta }{\rangle } \ge 0 \qquad \forall u \in U_{\textrm{ad}}. \end{aligned} \end{aligned}$$

The last inequality holds since \(\nu \in {\mathcal {N}}_{U_{\textrm{ad}}}(u_\beta )\) and \(u \in U_{\textrm{ad}}\). Hence, one gets the quadratic growth condition (6). This also yields uniqueness of the solution. \(\square \)

Next, we introduce the solution operator for (LL\((\beta )\)).

Definition 2.3

We denote by \(\Psi : Q\rightarrow Y \times U\) the solution mapping of the lower-level problem which maps \(\beta \in Q\) to the corresponding unique solution \((y_\beta ,u_\beta )\) given in Lemma 2.2. We further denote by \(\psi ^y(\beta ) \in Y\) and \(\psi ^u(\beta ) \in U\) the components of \(\Psi (\beta )\). As an abbreviated notation we introduce \(y_\beta {:}{=}\psi ^y(\beta )\) and \(u_\beta {:}{=}\psi ^u(\beta )\).

We will now prove that the function \(\Psi \) is globally Lipschitz continuous. Local Lipschitz continuity already follows from [15, Lemma 3.1.6]. However, Assumption 2.1(f) provides a stronger assumption on the derivative of f. Thus, we can adapt the arguments from [15, Lemma 3.1.6] to obtain global Lipschitz continuity.

Lemma 2.4

Let X, V be Banach spaces, and let \(C\subset X, {\hat{Q}}\subset V\) be nonempty, closed and convex sets. Further, let \(J:X\times V\rightarrow {\mathbb {R}}\) and \(\mu > 0\) be given such that for all \(p \in {\hat{Q}}\), the function \(J(\cdot ,p)\) is strongly convex with parameter \(\mu \) on the feasible set C and Fréchet differentiable. Then, the solution operator \(\psi :{\hat{Q}}\rightarrow X\) for the parametrized optimization problem

$$\begin{aligned} \begin{aligned} \min _{x} \quad&J(x,p)\nonumber \\ \text {s.t.} \quad&x\in C \end{aligned} \end{aligned}$$

exists and we have the estimate

$$\begin{aligned} {\Vert }\psi (p_2) - \psi (p_1){\Vert }_X \le \mu ^{-1}{\Vert }J'_x(\psi (p_2),p_1) - J'_x(\psi (p_2),p_2){\Vert }_{X^{\star }} \qquad \forall p_1, p_2 \in {\hat{Q}}. \end{aligned}$$

Proof

The existence of \(\psi \) follows by standard arguments for convex optimization problems with strongly convex objectives.

We now consider fixed elements \(p_{1},p_{2}\in {\hat{Q}}\) and their corresponding unique minimizers \(\psi (p_{i}) = x_{i}\in C\), \(i\in \{1,2\}\). The associated optimality conditions are

$$\begin{aligned} {\langle }{J_{x}'(x_{i},p_{i}),{\hat{x}}-x_{i}}{\rangle } \ge 0 \quad \forall {\hat{x}}\in C. \end{aligned}$$
(7)

If we now add these inequalities with the special choices \({{\hat{x}}} = x_{3-i}\), we obtain the estimate

$$\begin{aligned} 0&\le {\langle }{J_{x}'(x_{1},p_{1}) - J_{x}'(x_{2},p_{2}),x_2-x_{1}}{\rangle } \\&\le {\langle }{ J_{x}'(x_{1},p_{1}) - J_{x}'(x_{2},p_{1}) + J_{x}'(x_{2},p_{1}) - J_{x}'(x_{2},p_{2}) ,x_2-x_{1}}{\rangle } \\&\le -\mu {\Vert } x_2 - x_1 {\Vert }_X^2 + {\Vert }J_{x}'(x_{2},p_{1}) - J_{x}'(x_{2},p_{2}) {\Vert }_{X^{\star }} {\Vert }x_2 - x_1{\Vert }_X. \end{aligned}$$

In the last step, we have used the strong convexity of \(J(\cdot , p_1)\). Dividing the last inequality by \(\mu {\Vert }x_2 - x_1{\Vert }_X\) yields the claim. \(\square \)

Corollary 2.5

The function \(\Psi \) from Definition 2.3 is Lipschitz continuous on \(Q\). Moreover, there exists a constant \(M_\Psi \ge 0\) such that

$$\begin{aligned} {\Vert }\beta {\Vert }_{{\mathbb {R}}^n}, {\Vert }\psi ^y(\beta ){\Vert }_{Y}, {\Vert }\psi ^u(\beta ){\Vert }_U \le M_\Psi \qquad \forall \beta \in Q. \end{aligned}$$

Proof

We start by proving the boundedness. From Lemma 2.2, we get

$$\begin{aligned} f(\beta ,y_{\beta },u_{\beta })+ \frac{\mu }{2} {\Vert }{{\hat{u}}} - u_\beta {\Vert }_U^2 \le f(\beta ,S({{\hat{u}}}),{{\hat{u}}}) \qquad \forall \beta \in Q\end{aligned}$$

for a fixed \({{\hat{u}}}\in U_{\textrm{ad}}\). Further, \(f(\cdot ,S({{\hat{u}}}),{{\hat{u}}}): {\mathbb {R}}^n \rightarrow {\mathbb {R}}\) is continuous, thus it is bounded on the compact set \(Q\). Hence, one has

$$\begin{aligned} f(\beta ,y_{\beta },u_{\beta })+ \frac{\mu }{2} {\Vert }{{\hat{u}}} - u_\beta {\Vert }_U^2 \le C \qquad \forall \beta \in Q\end{aligned}$$

for some constant \(C \in {\mathbb {R}}\). Together with the assumption that f is bounded from below (see Assumption 2.1(e)) we get an upper bound for \({\Vert }\psi ^u(\beta ){\Vert }_U = {\Vert }u_\beta {\Vert }_U\). This also allows us to bound \({\Vert }\psi ^y(\beta ){\Vert }_Y={\Vert }S(\psi ^u(\beta )){\Vert }_Y\le {\Vert }S{\Vert }{\Vert }\psi ^u(\beta ){\Vert }_U\), since S is a linear bounded operator by assumption. Since \(Q\) is bounded, \(\beta \in Q\) is bounded as well. We choose \(M_\Psi \) to be the largest of the previously discussed bounds for \({\Vert }\beta {\Vert }_{{\mathbb {R}}^n}, {\Vert }\psi ^y(\beta ){\Vert }_Y\) and \({\Vert }\psi ^u(\beta ){\Vert }_U\).

In order to prove the Lipschitz continuity of \(\Psi \), we want to apply Lemma 2.4 to the state-reduced lower-level problem, i.e., with the setting

$$\begin{aligned} x = u, \quad C = U_{\textrm{ad}}, \quad p = \beta , \quad {\hat{Q}} = Q, \quad J(x,p) = J(u,\beta ):= f(\beta , S(u), u). \end{aligned}$$

Assumption 2.1 yields that the assumptions of Lemma 2.4 are satisfied. From the chain rule, we get

$$\begin{aligned} J'_x(u, \beta ) = f'_u(\beta , S(u), u) + S^{\star }f'_y(\beta , S(u), u). \end{aligned}$$

Now, Lemma 2.4 yields

$$\begin{aligned} \begin{aligned} {\Vert }\psi ^u(\beta _1) - \psi ^u(\beta _2){\Vert }&\le \mu ^{-1}\bigg (\left\| f'_u\big (\beta _1,\psi ^y(\beta _1),\psi ^u(\beta _1)\big ) - f'_u\big (\beta _2,\psi ^y(\beta _1),\psi ^u(\beta _1)\big )\right\| _{U^{\star }}\\&\qquad + {\Vert }S^{\star }{\Vert }\left\| f'_y\big (\beta _1,\psi ^y(\beta _1),\psi ^u(\beta _1)\big ) - f'_y\big (\beta _2,\psi ^y(\beta _1),\psi ^u(\beta _1)\big )\right\| _{Y^{\star }}\bigg ). \end{aligned} \end{aligned}$$

Owing to Assumption 2.1(f) with \(M = M_\Psi \), this yields the desired Lipschitz continuity of \(\psi ^u\). Consequently, the Lipschitz continuity of \(\psi ^y\) follows due to the continuity of S. \(\square \)

We can use this property to prove the existence of solutions for (OVR).

Theorem 2.6

There exists a solution for (OVR).

Proof

The lower-level problem admits a unique solution. Therefore the solution operator \(\Psi \) of the lower-level optimization problem can be used to reduce (UL) to an optimization problem in \({\mathbb {R}}^{n}\):

$$\begin{aligned} \begin{aligned} \min _{\beta } \quad&F(\beta ,\psi ^y(\beta ),\psi ^u(\beta )) \\ \text {s.t.} \quad&\beta \in Q. \end{aligned} \end{aligned}$$

By Assumption 2.1(e) F is continuous. Thus with the Lipschitz continuity of \(\Psi \) it follows that \(\beta \mapsto F(\beta ,\psi ^y(\beta ),\psi ^u(\beta ))\) is continuous. Moreover, \(Q\subset {\mathbb {R}}^{n}\) is compact by Assumption 2.1(b). The existence of a solution follows from the celebrated Weierstraß theorem. \(\square \)

We finally mention that more general results on the existence of solutions for bilevel optimal control problems are given in [29]. In particular, our result is covered by the second part of [29, Theorem 16.3.5].

In order to use interpolation error estimates, we prove regularity of the optimal-value function \(\varphi \).

Corollary 2.7

The optimal-value function is Fréchet differentiable on the interior of \(Q\) and the derivative is Lipschitz continuous.

Proof

The differentiability of \(\varphi \) can be shown as in [15, Theorem 3.2.6]. This also yields the expression \(\varphi '(\beta ) = f'_{\beta }(\beta ,\psi ^{y}(\beta ),\psi ^{u}(\beta ))\) for the derivative. By combining this with the Lipschitz continuity of \(\Psi \) (see Corollary 2.5) and Assumption 2.1(f), we get the Lipschitz continuity of \(\varphi '\) on the interior of \(Q\). \(\square \)

3 Algorithm

In this section, we present an algorithm to solve (OVR) under the given Assumption 2.1. The algorithm is similar to [25, Algorithm 1], with the main difference being the choice of the function \(\xi \) which approximates the value function \(\varphi \). In that reference, the functions \(\xi _k\) were defined via

$$\begin{aligned} \xi _k(x):= \min \left\{ \sum _{i=1}^m\mu _i\varphi (x^i) \big |0 \le \mu , \; \sum _{i = 1}^m \mu _i = 1, \; \sum _{i=1}^m\mu _i x^i=x \right\} , \end{aligned}$$

where \(X_k = \{x^1, \ldots , x^m\} \subset {\mathbb {R}}^n\) is a finite set. The sets \(X_k\) are assumed to be increasing w.r.t. k and in order to achieve a uniform Lipschitz bound of \(\xi _k\) on \(Q\), one has to require \(Q\subset {\text {int}}{\text {conv}}X_1\), see [25, Lemma 6.1, Example 6.1]. The reason for this extra assumption is that it is not possible to a priori control the shape and size of the simplices on which \(\xi _k\) is affine.

We use a different method and choose a subdivision \({\mathcal {T}}_k\) of \(Q\) (recall that Q is a bounded polyhedron) into closed simplices. On each simplex \(T \in {\mathcal {T}}_k\), we define \(\xi _T :T \rightarrow {\mathbb {R}}\) as the affine interpolant of \(\varphi \) in the vertices of T. The function \(\xi _{{\mathcal {T}}_k}\) is obtained by combining \(\xi _T\) for all \(T \in {\mathcal {T}}_k\), see (8) below. The advantage of this approach is that the interpolation error \({\Vert }\xi _T-\varphi {\Vert }_{L^\infty (T)}\) can be controlled directly by refining the subdivision \({\mathcal {T}}_k\) such that the diameters of the new simplices decrease sufficiently.

We mention that our approach does not require continuity of \(\xi _{{\mathcal {T}}_k}\). Therefore, we do not need any special assumptions on the subdivision, in particular, we allow for hanging nodes. In fact, it is enough to require

$$\begin{aligned} \bigcup _{T \in {\mathcal {T}}_k} T = Q. \end{aligned}$$

Therefore, if we have two elements \(T,S \in {\mathcal {T}}_k\) with \(T \cap S \ne \emptyset \), the values of \(\xi _T\) and \(\xi _S\) may not agree on \(T \cap S\). For the definition of \(\xi _{{\mathcal {T}}_k} :Q\rightarrow {\mathbb {R}}\), we choose

$$\begin{aligned} \xi _{{\mathcal {T}}_k}(\beta ):= \max _{T\in {\mathcal {T}}_{k},\, \beta \in T} \xi _{T}(\beta ). \end{aligned}$$
(8)

This definition of \(\xi _{{\mathcal {T}}_{k}}\) ensures upper semicontinuity.
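The following small Python sketch illustrates the construction in (8) (the sample function \(\varphi \), the toy triangulation and all identifiers are our own illustrative choices, not part of the method's implementation):

```python
import numpy as np

def barycentric(vertices, beta):
    # Barycentric coordinates of beta w.r.t. the simplex `vertices` ((n+1) x n).
    n = vertices.shape[1]
    A = np.vstack([vertices.T, np.ones(n + 1)])       # columns [v_i; 1]
    return np.linalg.solve(A, np.append(np.asarray(beta, float), 1.0))

def xi(beta, simplices, phi, tol=1e-12):
    # Definition (8): maximum of the per-simplex affine interpolants of phi,
    # taken over the simplices that contain beta.
    vals = []
    for T in simplices:
        lam = barycentric(T, beta)
        if np.all(lam >= -tol):                       # beta lies in this simplex
            vals.append(float(lam @ np.array([phi(v) for v in T])))
    return max(vals)

# Toy example: a convex phi on Q = [0, 1]^2 split into two triangles.
phi = lambda b: float(b[0] ** 2 + b[1] ** 2)
tris = [np.array([[0, 0], [1, 0], [1, 1]], float),
        np.array([[0, 0], [1, 1], [0, 1]], float)]
print(xi([0.5, 0.5], tris, phi))   # 1.0 >= phi([0.5, 0.5]) = 0.5, as xi >= phi
```

Since \(\varphi \) is convex in this toy example, the printed value overestimates \(\varphi \), in line with the relation \(\xi _{{\mathcal {T}}_k} \ge \varphi \) used below.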

Lemma 3.1

Let a simplex \(T \subset Q\) be given. Then, the interpolant \(\xi _T\) of \(\varphi \) satisfies the interpolation error estimate

$$\begin{aligned} {\Vert }\varphi - \xi _T{\Vert }_{L^\infty (T)} \le \frac{C_\varphi }{2} {\text {diam}}(T)^2, \end{aligned}$$

where \(C_\varphi \) is the Lipschitz constant of \(\varphi '\) on the interior of \(Q\), see Corollary 2.7.

Proof

We define the error \(e:= \varphi - \xi _T\). Since e is continuous, there exists \(z \in T\) such that \({|}e(z){|} = {\Vert }e{\Vert }_{L^\infty (T)}\). If z coincides with one of the vertices, we are done since e vanishes at the vertices of T. Otherwise, we can find a vertex p of T and \(\varepsilon > 0\) such that \(z \pm \varepsilon (p - z) \in T\). Since e attains its maximum or minimum in z, this gives \(e'(z) (p - z) = 0\). Now, we use the fundamental theorem of calculus to obtain

$$\begin{aligned} {\Vert }e{\Vert }_{L^\infty (T)}&= {|}e(z){|} = {|}e(p) - e(z){|} = \left|\int _0^1 e'(z + t ( p - z )) ( p - z ) \textrm{d}t \right|\\ {}&= \left|\int _0^1 \left[ e'(z + t ( p - z )) - e'(z)\right] ( p - z ) \textrm{d}t \right|. \end{aligned}$$

By using the linearity of \(\xi _T\), we can continue with

$$\begin{aligned} {\Vert }e{\Vert }_{L^\infty (T)}&= \left|\int _0^1 \left[ \varphi '(z + t ( p - z )) - \varphi '(z)\right] ( p - z ) \textrm{d}t \right|\\ {}&\le \int _0^1 \left|\left[ \varphi '(z + t ( p - z )) - \varphi '(z)\right] ( p - z ) \right|\textrm{d}t \le \int _0^1 C_\varphi t {\Vert }p - z{\Vert }^2 \textrm{d}t = \frac{C_\varphi }{2} {\Vert }p - z{\Vert }^2 . \end{aligned}$$

Using \({\Vert }p - z{\Vert } \le {\text {diam}}(T)\) finishes the proof. \(\square \)
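A quick one-dimensional sanity check of this estimate (with a convex toy function of our own choosing, \(\varphi (\beta ) = \beta ^2\), so that \(C_\varphi = 2\)) might look as follows:

```python
import numpy as np

# Affine interpolant of phi(b) = b**2 on the "simplex" T = [a, a + h]; the exact
# maximal error is h**2 / 4, which respects the bound C_phi / 2 * diam(T)**2 = h**2.
a, h = 0.3, 0.1
b = np.linspace(a, a + h, 1001)
phi = b ** 2
xi_T = phi[0] + (phi[-1] - phi[0]) * (b - a) / h
print(np.max(np.abs(phi - xi_T)), h ** 2)   # 0.0025 <= 0.01
```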

The main idea in Algorithm 1 is to solve (OVR\((\xi )\)) with \(\xi = \xi _{{\mathcal {T}}_k}\) and to successively refine a simplex on which a solution is found. In order for Algorithm 1 to be well-defined, we need to guarantee the existence of global minimizers of (OVR\((\xi ,T)\)). This can be shown by the direct method of calculus of variations. The boundedness of \(\beta \) follows from \(\beta \in T\) and the boundedness of (yu) follows from \(f(\beta , y, u) \le \xi _T(\beta )\), cf. Assumption 2.1(g).

Algorithm 1 (Adaptive solution of (OVR) by refinement of the subdivision; its steps (S1)–(S3) are referenced in the text)
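For readers who prefer pseudocode, the following Python sketch shows one possible realization of the refinement loop described above (solve_subproblem, refine and phi are placeholders of our own; this is a schematic rendering, not a verbatim transcription of Algorithm 1):

```python
def refinement_loop(initial_subdivision, solve_subproblem, phi, refine, tol=1e-8):
    # solve_subproblem(T) is assumed to return (F_value, beta, y, u, f_value) for
    # the relaxed problem on the simplex T; refine(T) returns children with
    # diam <= q * diam(T); phi evaluates the lower-level optimal-value function.
    cells = [(T, solve_subproblem(T)) for T in initial_subdivision]
    while True:
        # pick the simplex with the smallest relaxed objective value
        idx = min(range(len(cells)), key=lambda i: cells[i][1][0])
        T_bar, (F_val, beta, y, u, f_val) = cells.pop(idx)
        if f_val <= phi(beta) + tol:       # (beta, y, u) is feasible for (OVR)
            return beta, y, u              # hence globally optimal (Theorem 3.2)
        # otherwise refine the selected simplex and solve only on the new cells
        cells += [(T, solve_subproblem(T)) for T in refine(T_bar)]
```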

Under very mild assumptions we can show the convergence towards global minimizers using Lemma 3.1.

Theorem 3.2

Algorithm 1 either stops at a global optimal solution of (OVR) or the computed sequence \((\beta _{k},y_{k},u_{k})\) is bounded and contains a subsequence that converges weakly in \({\mathbb {R}}^n \times Y \times U\) to a global optimal solution of (OVR). Every weakly convergent subsequence of \((\beta _{k},y_{k},u_{k})\) converges strongly. If (OVR) has a unique global solution \((\bar{\beta },\bar{y},\bar{u})\), then the entire sequence \((\beta _{k},y_{k},u_{k})\) converges strongly to \((\bar{\beta },\bar{y},\bar{u})\).

Proof

The value function \(\varphi \) is convex and therefore \(\xi _{{\mathcal {T}}_k}(\beta ) \ge \varphi (\beta )\). Thus, the feasible set of (OVR\((\xi _{{\mathcal {T}}_k})\)) contains the feasible set of (OVR). If the solution \((\beta _{k},y_{k},u_{k})\) of (OVR\((\xi _{{\mathcal {T}}_k})\)) is feasible for (OVR), it is globally optimal for (OVR). Hence, the stopping criterion of the algorithm ensures that \((\beta _{k},y_{k},u_{k})\) is globally optimal for (OVR). It remains to discuss the case where Algorithm 1 does not terminate. We denote by \((\bar{\beta },\bar{y},\bar{u})\) a global solution of (OVR). Then

$$\begin{aligned} F(\beta _{k},y_{k},u_{k})\le F(\bar{\beta },\bar{y},\bar{u}) \end{aligned}$$
(9)

by the same argument. The feasible set \(Q\) is compact by Assumption 2.1(b). This implies the existence of \(N \in {\mathbb {R}}\) with \(\varphi (\beta ) \le N\) for all \(\beta \in Q\). Therefore, the estimate

$$\begin{aligned} N \ge \xi _{{\mathcal {T}}_k}(\beta _k) \ge f(\beta _{k},y_{k},u_{k})\ge f(\beta _k, y_{\beta _k}, u_{\beta _k}) + \frac{\mu }{2} {\Vert } u_{\beta _k} - u_k {\Vert }_U^2 \end{aligned}$$

(where we used (6) in the last step) together with the boundedness of \(u_{\beta _k}\) shows the boundedness of \(u_k\) in U. The boundedness of \(y_k\) in Y follows from the properties of the linear operators A and B. Therefore the sequence \((\beta _{k},y_{k},u_{k})\) is bounded by a constant \(M \ge 0\) and contains a weakly convergent subsequence (without relabeling) \((\beta _{k},y_{k},u_{k})\rightharpoonup (\hat{\beta },\hat{y},\hat{u})\) in \({\mathbb {R}}^n \times Y \times U\). In particular, one has the strong convergence \(\beta _k \rightarrow {\hat{\beta }}\), since \({\mathbb {R}}^n\) is finite dimensional.

In order to estimate the distance between \(\varphi \) and its interpolant \(\xi _{{\mathcal {T}}_k}\), we use the interpolation error estimate Lemma 3.1 on each simplex \(T \in {\mathcal {T}}_k\). To this end, we need to show that the last step of Algorithm 1 ensures \({\text {diam}}({{\bar{T}}}_k) \rightarrow 0\). We argue by contradiction and assume \(v{:}{=}\limsup _{k\rightarrow \infty }{\text {diam}}({\bar{T}}_{k})>0\). Thus, the set \({{\bar{{\mathcal {T}}}}}_0:= \{ {{\bar{T}}}_k \big |k \in {\mathbb {N}}, {\text {diam}}({{\bar{T}}}_k) \ge v/2\}\) is infinite. Now there has to be at least one simplex \(T_{0} \in {\mathcal {T}}_1\) that strictly contains infinitely many simplices from \({{\bar{{\mathcal {T}}}}}_0\), i.e., the set \({\bar{{\mathcal {T}}}}_1 {:}{=}\{T \in {{\bar{{\mathcal {T}}}}}_0 \big |T \subsetneq T_{0}\}\) is infinite. These simplices have been refined at least once and thus we have \({\text {diam}}(T) \le q {\text {diam}}(T_0)\) for all \(T \in {{\bar{{\mathcal {T}}}}}_1\). Again, one simplex in \({{\bar{{\mathcal {T}}}}}_1\) has to strictly contain infinitely many of the simplices from \({{\bar{{\mathcal {T}}}}}_1\) and we can repeat the above argument. This leads to a contradiction: after l repetitions, the diameters of the simplices under consideration are bounded from above by \(q^{l} {\text {diam}}(T_0)\), which eventually drops below the lower bound \(v/2 > 0\).

The interpolation error estimate in combination with \({\text {diam}}(\bar{T}_k)\rightarrow 0\) yields

$$\begin{aligned} \begin{aligned} \varphi (\hat{\beta }) \le f(\hat{\beta },\hat{y},\hat{u})&\le \liminf _{k\rightarrow \infty } f(\beta _{k},y_{k},u_{k})\le \limsup _{k\rightarrow \infty } f(\beta _{k},y_{k},u_{k})\le \limsup _{k\rightarrow \infty } \xi _{{\bar{T}}_{k}}(\beta _{k})\\&\le \limsup _{k\rightarrow \infty } \left( \varphi (\beta _{k}) + \frac{C_\varphi }{2} {\text {diam}}({\bar{T}}_{k})^{2} \right) = \varphi (\hat{\beta }). \end{aligned} \end{aligned}$$
(10)

Note that we have used the sequential weak lower semicontinuity of f which follows from convexity and continuity in Assumption 2.1(e). Thus, (10) yields feasibility of \((\hat{\beta },\hat{y},\hat{u})\) for (OVR). Similarly, F is sequentially weakly lower semicontinuous. Therefore, we can pass to the limit \(k \rightarrow \infty \) in (9) and obtain

$$\begin{aligned} F(\hat{\beta },\hat{y},\hat{u})\le \liminf _{k\rightarrow \infty } F(\beta _{k},y_{k},u_{k})\le F(\bar{\beta },\bar{y},\bar{u}). \end{aligned}$$
(11)

This shows that \((\hat{\beta },\hat{y},\hat{u})\) is a global solution for (OVR).

Next, we prove the strong convergence of \(y_k\) and \(u_k\). Strong convergence of the control \(u_k\) can be obtained by exploiting the quadratic growth condition from Lemma 2.2: Note that \(y_{k} = S(u_{k})\) by feasibility of \((\beta _{k},y_{k},u_{k})\) for (OVR(\(\xi ,{{\bar{T}}}_k\))). Thus, Lemma 2.2 and the Lipschitz continuity of \(f'_\beta ({\hat{\beta }},\cdot ,\cdot )\) from Assumption 2.1(f) yield

$$\begin{aligned} \nonumber f(\beta _{k},y_{k},u_{k})&\ge f(\hat{\beta },y_{k},u_{k})+{\langle }{f'_{\beta }(\hat{\beta },y_{k},u_{k}),\beta _{k}-\hat{\beta }}{\rangle } \\ \nonumber&\ge f(\hat{\beta },y_{k},u_{k}) -{\Vert }f'_{\beta }(\hat{\beta },y_{k},u_{k}){\Vert }_{{\mathbb {R}}^n}{\Vert }\beta _{k}-{\hat{\beta }}{\Vert }_{{\mathbb {R}}^n}\\ \nonumber&\ge f(\hat{\beta },y_{k},u_{k}) \\\nonumber&\qquad - \left( {\Vert }f'_{\beta }(\hat{\beta },\hat{y},\hat{u}){\Vert }_{{\mathbb {R}}^n} + L_M {\Vert } y_{k}-{\hat{y}} {\Vert }_Y + L_M {\Vert } u_{k}-{\hat{u}} {\Vert }_U \right) {\Vert }\beta _{k}-\hat{\beta }{\Vert }_{{\mathbb {R}}^n}\\&\ge f(\hat{\beta },\hat{y},\hat{u})+ \frac{\mu }{2}{\Vert }u_{k}-{\hat{u}}{\Vert }^{2}_{U} -C{\Vert }\beta _{k}-\hat{\beta }{\Vert }_{{\mathbb {R}}^n} . \end{aligned}$$
(12)

Since (10) implies \(f(\beta _{k},y_{k},u_{k})\rightarrow f(\hat{\beta },\hat{y},\hat{u})\) and since \(\beta _k \rightarrow {\hat{\beta }}\), this inequality yields the strong convergence \(u_k \rightarrow {{\hat{u}}}\) in U. The continuity of the solution operator S now implies strong convergence of the states.

If the solution to (OVR) is unique, the convergence of the entire sequence follows from a usual subsequence-subsequence argument. \(\square \)

An important ingredient of Algorithm 1 is the refinement of the simplices in (S3) such that the property \({\text {diam}}(T) \le q{\text {diam}}({{\bar{T}}}_k)\) is obtained. In the two-dimensional case \(Q\subset {\mathbb {R}}^{2}\) this can be done by splitting the triangle \({{\bar{T}}}_k\) into 4 similar triangles by using the midpoints of the edges. However, already in three dimensions this is not straightforward since a general tetrahedron cannot be divided into similar tetrahedra. In particular, a regular tetrahedron cannot be split into smaller regular tetrahedra. One can, however, use hypercubes to construct a method of refinement that ensures that the diameter decreases sufficiently.

Lemma 3.3

For every (finite) subdivision \({\mathcal {T}}_1\), there exists a constant \(q \in (0,1)\) such that the refinement in (S3) of Algorithm 1 is always possible.

Proof

We first describe the splitting of reference simplices and then use a linear transformation to apply the procedure to an arbitrary simplex.

Let \(S_n\) denote the set of permutations of \(\{1,2,\ldots , n\}\). We consider the hypercube \([0,1]^{n}\) and a permutation \(\pi \in S_n\). Then \(T_{\pi } {:}{=}\{x\in {\mathbb {R}}^{n} \big |0\le x_{\pi (1)} \le \dots \le x_{\pi (n)} \le 1\}\) describes a simplex. For each point x in the hypercube there exists at least one permutation \(\pi \) for which the definition of \(T_{\pi }\) is consistent with the “\(\le \)”-ordering of the components of x, i.e., \(x \in T_\pi \). Therefore \(\bigcup _{\pi \in S_{n}}T_{\pi } = [0,1]^{n}\). If we consider a point \(x\in [0,1]^{n}\) with \(x_{i} \ne x_{j}\) for all \(i\ne j\), then there exists only one permutation \(\pi \) such that \(x\in T_{\pi }\) since the components of x have a uniquely determined order. Furthermore, those points are dense in \([0,1]^n\) and this implies that two simplices constructed with two different permutations cannot have an n-dimensional intersection. Moreover, different simplices \(T_\pi \) can be mapped onto each other by a permutation of the coordinates.

The hypercube can be split into \(2^{n}\) smaller cubes. By dividing these smaller cubes again into simplices, we arrive at

$$\begin{aligned} T_{\pi }^{t} {:}{=}\{x\in {\mathbb {R}}^{n} \big |0 \le x_{\pi (1)}-t_{\pi (1)} \le \dots \le x_{\pi (n)}-t_{\pi (n)} \le 0.5 \} \subset t + [0, 0.5]^n, \end{aligned}$$
(13)

where we consider all possible \(t \in \{0, 0.5\}^n\) and \(\pi \in S_n\). We observe that these simplices are the translated and scaled versions of \(T_\pi \). In particular, we have \(T_\pi ^t = \frac{1}{2} T_\pi + t\).

We argue that for all \(\pi \in S_n\) and \(t \in \{0, 0.5\}^n\), there exists \({\hat{\pi }} \in S_n\) with \(T_\pi ^t \subset T_{{\hat{\pi }}}\). Indeed, for \(x \in T_\pi ^t\), the coordinates \(x_i\) with \(t_i = 0\) are smaller than or equal to the coordinates \(x_j\) with \(t_j = 0.5\). Further, we have \(x_{\pi (i_1)} \le x_{\pi (i_2)}\) if \(t_{\pi (i_1)} = t_{\pi (i_2)}\) and \(i_1 \le i_2\). Thus, we can construct \({\hat{\pi }}\) by first taking the indices \(\pi (i)\) with \(t_{\pi (i)} = 0\) and afterwards the indices \(\pi (j)\) with \(t_{\pi (j)} = 0.5\). This implies that every \(T_\pi \) can be divided into \(2^n\) smaller simplices \(T^{t^{(i)}}_{\pi ^{(i)}}\) with \(i = 1,\ldots , 2^n\).

Each vertex of \(T_\pi \) is a vertex of the original hypercube \([0,1]^n\) and each vertex of \(T^{t^{(i)}}_{\pi ^{(i)}}\) is a vertex of the smaller cube \(t^{(i)} + [0,0.5]^n\). Since these two cubes share exactly one vertex, at most one vertex of the divided simplex \(T^{t^{(i)}}_{\pi ^{(i)}}\) is a vertex of \(T_\pi \).

Finally, we map each simplex \(T \in {\mathcal {T}}_1\) to \(T_\pi \) for some fixed \(\pi \in S_n\) by an (invertible) affine transformation \(a_T:T\rightarrow T_\pi \). The first part of the proof shows that \(T_\pi \) can be divided into smaller simplices. By applying the inverse transformation \(a_T^{-1}\), we get a subdivision \(\{T_i \big |i = 1,\ldots , 2^n\}\) of T. For the children of T, we reuse the transformation \(a_T\). Thus, only finitely many shapes appear for all the descendants of T. Since the diameter of a simplex is only attained at pairs of vertices and since each child \(T_i\) shares at most one vertex with T we find \({{\bar{q}}}_T \in (0,1)\) with \({\text {diam}}(T_i) \le {{\bar{q}}}_T {\text {diam}}(T)\) for all \(i = 1,\ldots ,2^n\). In subsequent iterations the affine transformations applied to the simplices of the initial subdivision can be reused for the respective children. The number of affine transformations and reference shapes is finite. Thus, the procedure introduces at most \(2^n{|}{\mathcal {T}}_1{|}\) pairings of shapes and corresponding transformations, each with its own scaling factor \(q_i \in (0,1)\). We can set \(q:= \max _{i} q_i \in (0,1)\) and the described refinement strategy complies with step (S3) of Algorithm 1. \(\square \)
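The construction in the proof can be checked numerically. The following sketch (reference coordinates only; all identifiers are ours) enumerates, by brute force, the sub-simplices \(T_\pi ^t\) of scale \(1/2\) contained in the reference simplex and verifies that there are \(2^n\) of them with half the diameter:

```python
import itertools
import numpy as np

def kuhn_vertices(pi, scale=1.0, offset=None):
    # Vertices of the simplex {0 <= x_{pi[0]} <= ... <= x_{pi[-1]} <= scale} + offset.
    n = len(pi)
    v = np.zeros(n) if offset is None else np.asarray(offset, float).copy()
    verts = [v.copy()]
    for j in reversed(range(n)):          # raise the largest coordinates first
        v = v.copy(); v[pi[j]] += scale
        verts.append(v)
    return np.array(verts)

def in_reference_simplex(x, tol=1e-12):
    # Membership in T_id = {0 <= x_1 <= ... <= x_n <= 1}.
    return x[0] >= -tol and x[-1] <= 1 + tol and np.all(np.diff(x) >= -tol)

def children_of_reference(n):
    # Keep the scaled and translated simplices T_pi^t that are contained in T_id.
    kids = []
    for t in itertools.product([0.0, 0.5], repeat=n):
        for pi in itertools.permutations(range(n)):
            verts = kuhn_vertices(pi, scale=0.5, offset=t)
            if all(in_reference_simplex(v) for v in verts):
                kids.append(verts)
    return kids

def diam(verts):
    return max(np.linalg.norm(a - b) for a in verts for b in verts)

n = 3
parent = kuhn_vertices(tuple(range(n)))
kids = children_of_reference(n)
q = max(diam(k) for k in kids) / diam(parent)
print(len(kids), q)   # 8 children (= 2**n) with diameter ratio q = 0.5
```

For a general simplex of the initial subdivision, the ratio obtained after applying \(a_T^{-1}\) depends on the shape of that simplex, which is why the proof works with finitely many reference shapes and takes the maximum of the resulting factors.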

Remark 3.4

The refinement technique of Lemma 3.3 always generates hanging nodes. The presented method is consistent with splitting a triangle into 4 similar parts using the midpoints of the edges. As a side effect the method from Lemma 3.3 also provides a lower bound on the aspect ratio and maintains some shape regularity of the initial subdivision. In higher dimensions there might exist more advanced methods of refinement.

Having proven the convergence of Algorithm 1, we now want to estimate the convergence speed. We establish a preliminary result on the error in the upper-level objective induced by the approximation \(\xi _T\) of \(\varphi \).

Lemma 3.5

Let \({\mathcal {T}}\) be a subdivision of \(Q\). For \(T\in {\mathcal {T}}\) and any feasible point \((\beta ,y,u)\) of (OVR\((\xi ,T)\)) we have

$$\begin{aligned} {|}F(\beta ,y,u)- F(\beta ,y_{\beta },u_{\beta }){|} \le L_M(1+\Vert S\Vert ) \sqrt{\frac{C_{\varphi }}{\mu } } {\text {diam}}(T), \end{aligned}$$
(14)

where \((y_\beta , u_\beta )\) is the solution of the lower-level problem associated with the parameter \(\beta \), see Definition 2.3, and \(C_\varphi \) is the Lipschitz constant of \(\varphi '\), see Corollary 2.7. Moreover, the constant \(M > 0\) (defined in the proof) is large enough, such that the norms of \(\beta \), y, u, \(y_\beta \) and \(u_\beta \) are bounded by M.

Proof

We use the quadratic growth condition from Lemma 2.2 to obtain

$$\begin{aligned} \xi _T(\beta ) \ge f(\beta ,y,u)\ge f(\beta ,y_{\beta },u_{\beta }) + \frac{\mu }{2} \Vert u-u_{\beta }\Vert _{U}^{2} = \varphi (\beta ) + \frac{\mu }{2} \Vert u-u_{\beta }\Vert _{U}^{2}. \end{aligned}$$

Next, we apply the interpolation estimate Lemma 3.1 to get

$$\begin{aligned} {\Vert }u-u_{\beta }{\Vert }^{2}_{U} \le \frac{C_\varphi {\text {diam}}(T)^{2}}{\mu }. \end{aligned}$$
(15)

In order to apply the Lipschitz assumption from Assumption 2.1, we define \(M:= M_\Psi + \max \{1,{\Vert }S{\Vert }\} \sqrt{C_\varphi / \mu } {\text {diam}}(Q)\) where \(M_\Psi \) is given in Corollary 2.5. Due to (15), all quantities are bounded by M. Thus,

$$\begin{aligned} {|}F(\beta ,y,u)- F(\beta ,y_{\beta },u_{\beta }){|}&\le L_M ({\Vert }S u - S u_\beta {\Vert }_{Y} + {\Vert }u - u_\beta {\Vert }_{U})\\&\le L_M (1+\Vert S\Vert ){\Vert }u - u_\beta {\Vert }_{U}\\&\le L_M (1+\Vert S\Vert ) \sqrt{\frac{C_{\varphi }}{\mu } } {\text {diam}}(T) . \end{aligned}$$

\(\square \)

Theorem 3.6

Let \({\mathcal {T}}\) be a subdivision of \(Q\) and suppose that the upper-level objective functional satisfies a quadratic growth condition for a solution \((\bar{\beta },\bar{y},\bar{u})\) of (OVR) in the sense that

$$\begin{aligned} F(\beta , y_\beta , u_\beta ) \ge F(\bar{\beta },\bar{y},\bar{u})+ G {\Vert }\beta - {{\bar{\beta }}}{\Vert }_{{\mathbb {R}}^n}^{2} \qquad \forall \beta \in Q\end{aligned}$$
(16)

holds for some constant \(G > 0\). Let \(T\in {\mathcal {T}}\) be an element satisfying the condition

$$\begin{aligned} {\text {diam}}(T) < \frac{G}{L_{M} (1+\Vert S\Vert ) \sqrt{\frac{C_{\varphi }}{\mu }} } {\text {dist}}(T, {{\bar{\beta }}})^{2}. \end{aligned}$$
(17)

Then, for any feasible point \((\beta ,y,u)\) of the relaxed problem (OVR\((\xi ,T)\)) we have

$$\begin{aligned} F(\beta ,y,u)> F(\bar{\beta },\bar{y},\bar{u}). \end{aligned}$$

The constants appearing in (17) have the same meaning as in Lemma 3.5.

Proof

Let \(T \in {\mathcal {T}}\) satisfy (17) and let \((\beta ,y,u)\) be feasible to (OVR\((\xi ,T)\)). By using the quadratic growth condition (16) and Lemma 3.5 we obtain

$$\begin{aligned} \begin{aligned} F(\beta ,y,u)- F(\bar{\beta },\bar{y},\bar{u})&= F(\beta ,y_{\beta },u_{\beta })- F(\bar{\beta },\bar{y},\bar{u})+ F(\beta ,y,u)- F(\beta ,y_{\beta },u_{\beta })\\&\ge G {\Vert }\beta - {{\bar{\beta }}}{\Vert }_{{\mathbb {R}}^n}^{2} -L_M (1+{\Vert }S{\Vert }) \sqrt{\frac{C_{\varphi }}{\mu } } {\text {diam}}(T) \\&> G {\Vert }\beta - {{\bar{\beta }}}{\Vert }_{{\mathbb {R}}^n}^{2} - G {\text {dist}}(T, {{\bar{\beta }}})^2 \ge 0. \end{aligned} \end{aligned}$$
(18)

This shows the claim. \(\square \)

Remark 3.7

We give some interpretation of Theorem 3.6. Let \((\bar{\beta },\bar{y},\bar{u})\) be a solution to (OVR) satisfying the growth condition (16). Let \(T \in {\mathcal {T}}\) satisfy (17) and let \((\beta ,y,u)\) be a feasible point of (OVR\((\xi ,T)\)). Further, let \({{\bar{T}}} \in {\mathcal {T}}\) be a simplex with \({{\bar{\beta }}} \in {{\bar{T}}}\). Then, a solution \((\beta _{{{\bar{T}}}}, y_{{{\bar{T}}}}, u_{{{\bar{T}}}})\) of (OVR\((\xi ,{{\bar{T}}})\)) satisfies

$$\begin{aligned} F(\beta ,y,u)> F(\bar{\beta },\bar{y},\bar{u})\ge F(\beta _{{{\bar{T}}}}, y_{{{\bar{T}}}}, u_{{{\bar{T}}}}). \end{aligned}$$

Hence, Algorithm 1 will never refine the simplex T and, consequently, this simplex will be ignored in the subsequent iterations of the algorithm.

Theorem 3.6 also has a quantitative implication. We consider a subdivision of Q into simplices of diameter h. According to (17), simplices T with \(h < C {\text {dist}}(T, {{\bar{\beta }}})^2\), where \(C > 0\) is some constant, will never be refined. That is, we only have to consider simplices with \({\text {dist}}(T, {{\bar{\beta }}}) \le \sqrt{h/C}\). Since a neighborhood of \({{\bar{\beta }}}\) of radius \(\sqrt{h/C}\) has volume of order \(h^{n/2}\) and each simplex has volume of order \(h^n\), the number of simplices satisfying this condition is roughly of the order \(h^{n/2 - n} = h^{-n/2}\).

If we are able to improve (17) to \({\text {diam}}(T) < C {\text {dist}}(T, {{\bar{\beta }}})^\alpha \) for some \(\alpha \in [1,2)\), see the discussion below, this number of simplices improves to \(h^{-n (1-1/\alpha )}\). In particular, in the case \(\alpha = 1\), we expect a constant number of simplices.

Remark 3.8

There are two possibilities to improve condition (17). First, if one has a stronger growth condition for the upper-level objective functional, i.e.,

$$\begin{aligned} F(\beta , y_\beta , u_\beta ) \ge F(\bar{\beta },\bar{y},\bar{u})+ G {\Vert }\beta - {{\bar{\beta }}}{\Vert }_{{\mathbb {R}}^n}^{\alpha } \qquad \forall \beta \in Q\end{aligned}$$
(19)

for some \(\alpha \in [1, 2)\), then we can use \({\text {dist}}(T,{{\bar{\beta }}})^\alpha \) instead of \({\text {dist}}(T,{{\bar{\beta }}})^2\) in (17), cf. (18). In particular, \(\alpha = 1\) might be possible if \({{\bar{\beta }}}\) is located on the boundary of \(Q\) or if the reduced objective is non-smooth at \({{\bar{\beta }}}\).

Second, we can improve Theorem 3.6 if \(F'(\bar{\beta },\bar{y},\bar{u})= 0\). For simplicity, we discuss the case that F is quadratic, i.e.,

$$\begin{aligned} \begin{aligned} F(\beta ,y,u)&= F(\beta ,y_{\beta },u_{\beta })+ F'(\beta ,y_{\beta },u_{\beta })((\beta ,y,u)- (\beta ,y_{\beta },u_{\beta }))\\&\qquad + \frac{1}{2} F''(\beta ,y_{\beta },u_{\beta })[(\beta ,y,u)- (\beta ,y_{\beta },u_{\beta })]^2. \end{aligned} \end{aligned}$$
(20)

In particular, the second derivative is constant. Together with the Lipschitz continuity of \(F'\) and \(\Psi \) (see Corollary 2.5), we readily obtain

$$\begin{aligned} {\Vert } F'( \beta , y_\beta , u_\beta ) {\Vert }_{{\mathbb {R}}^n \times Y^{\star }\times U^{\star }} = {\Vert } F'( \beta , y_\beta , u_\beta ) - F'(\bar{\beta },\bar{y},\bar{u}){\Vert }_{{\mathbb {R}}^n \times Y^{\star }\times U^{\star }} \le C {\Vert }\beta - {{\bar{\beta }}}{\Vert }_{{\mathbb {R}}^n}. \end{aligned}$$

Using this estimate and (15) in (20), we find

$$\begin{aligned} {|}F(\beta ,y,u)- F(\beta ,y_{\beta },u_{\beta }){|}&\le C {\Vert }\beta - {{\bar{\beta }}}{\Vert }_{{\mathbb {R}}^n}{\text {diam}}(T) + C {\text {diam}}(T)^2 \\ {}&\le C {\text {dist}}(T,{{\bar{\beta }}}){\text {diam}}(T) + C {\text {diam}}(T)^2 . \end{aligned}$$

By using this estimate in (18), we see that (17) can be replaced by \({\text {diam}}(T) < c {\text {dist}}(T, {{\bar{\beta }}})\) for some \(c > 0\). Note that \(F'(\bar{\beta },\bar{y},\bar{u})= 0\) is highly restrictive. However, the positive influence on the convergence speed can already be expected if the first derivative of F is close to zero in the solution. The approach can be applied to non-quadratic objective functionals F by replacing (20) by a Taylor expansion and requiring that \({\Vert }F''{\Vert }\) is bounded on bounded subsets.

Algorithm 1 can still be sped up substantially without additional restrictions. In (S3), we have to evaluate \(\varphi (\beta _k)\), and for this purpose we calculate the lower-level solutions \((y_{\beta _k}, u_{\beta _k})\). Therefore \((\beta _k, y_{\beta _k}, u_{\beta _k})\) is a feasible point of (OVR) and, thus, \(F(\beta _k, y_{\beta _k}, u_{\beta _k})\) is an upper bound for the minimal objective value of (OVR). On the other hand, the computed values \(F(\beta _T, y_T, u_T)\) for \(T \in {\mathcal {T}}\) are lower bounds for the possible objective value of (OVR) restricted to T. Hence, all elements \(T \in {\mathcal {T}}\) with \(F(\beta _T, y_T, u_T) > F(\beta _k, y_{\beta _k}, u_{\beta _k})\) cannot contain a solution of (OVR) and can be ignored in later iterations. Furthermore, the simplices can be sorted by \(F(\beta _T, y_T, u_T)\) and multiple simplices may be refined in each iteration. This results in a larger number of auxiliary problems which have to be solved in the next iteration (recall that (OVR\((\xi )\)) has to be solved on refined elements only). These problems are independent of each other and can be solved in parallel.
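In terms of the loop sketched after Algorithm 1, this pruning and multi-refinement strategy could be realized as follows (again a schematic sketch with placeholder names; `incumbent_value` stands for \(F(\beta _k, y_{\beta _k}, u_{\beta _k})\)):

```python
def prune_and_select(cells, incumbent_value, n_refine=4):
    # cells: list of (T, (F_value, beta, y, u, f_value)) with F_value the lower
    # bound obtained from the relaxed problem on T.  Cells whose lower bound
    # exceeds the incumbent upper bound cannot contain a solution of (OVR).
    kept = [c for c in cells if c[1][0] <= incumbent_value]
    kept.sort(key=lambda c: c[1][0])          # sort by lower bound
    return kept[:n_refine], kept[n_refine:]   # refine the best few in parallel
```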

Finally, we demonstrate that in most cases, the value-function constraint in (OVR\((\xi )\)) will be satisfied with equality. To study the issue we introduce the problem

$$\begin{aligned} \begin{aligned} \min _{\beta ,y,u} \quad&F(\beta , y, u ) \\ \text {s.t.} \quad&A y - B u = 0, \\ \quad&\beta \in Q,\quad u\in U_{\textrm{ad}}. \end{aligned} \end{aligned}$$
(21)

This problem is a relaxation of (OVR), since we neglected the optimality of (yu) for the lower level. We expect that this problem has a smaller optimal value than (OVR).

Lemma 3.9

Suppose that the infimal value of (21) is smaller than the infimal value of (OVR). Let \((\beta _{k},y_{k},u_{k})\) be defined as in Algorithm 1(S2). Then, the constraint \(f(\beta _{k},y_{k},u_{k})\le \xi _{{{\bar{T}}}_k}(\beta _k)\) is satisfied with equality whenever k is sufficiently large and \(\xi _{{\mathcal {T}}_k}\) is continuous at \(\beta _k\).

Proof

Let \((\tilde{\beta },\tilde{y},\tilde{u})\) be a global solution for (21). Note that global solutions \((\bar{\beta },\bar{y},\bar{u})\) to (OVR) are not globally optimal for (21). The construction of the sequence \((\beta _{k},y_{k},u_{k})\) according to Algorithm 1 yields a monotonically increasing sequence \(F(\beta _{k},y_{k},u_{k})\). By Theorem 3.2 one gets \(F(\beta _{k},y_{k},u_{k})\rightarrow F(\hat{\beta },\hat{y},\hat{u})= F(\bar{\beta },\bar{y},\bar{u})\). Due to \(F(\tilde{\beta },\tilde{y},\tilde{u})< F(\bar{\beta },\bar{y},\bar{u})\), we have \(F(\tilde{\beta },\tilde{y},\tilde{u})< F(\beta _{k},y_{k},u_{k})\) for sufficiently large k.

We argue by contradiction and assume that \(f(\beta _{k},y_{k},u_{k})< \xi _{{{\bar{T}}}_k}(\beta _k)\) for some large k for which \(\xi _{{\mathcal {T}}_k}\) is continuous at \(\beta _k\). We consider a convex combination \((1-s) (\beta _{k},y_{k},u_{k})+ s (\tilde{\beta },\tilde{y},\tilde{u})\), \(s \in (0,1)\), and check that it is a feasible point of (OVR\((\xi _{{\mathcal {T}}_k})\)) for s small enough. The constraint \(Ay = Bu\) is linear and the admissible sets \(Q\) and \(U_{\textrm{ad}}\) are convex. Moreover, since f is continuous (see Assumption 2.1) and since \(\xi _{{\mathcal {T}}_k}\) is continuous by assumption, we have

$$\begin{aligned} f((1-s) (\beta _{k},y_{k},u_{k})+ s (\tilde{\beta },\tilde{y},\tilde{u})) < \xi _{{\mathcal {T}}_k}( (1-s) \beta _k + s {\tilde{\beta }}) \qquad \forall s\in (0,\varepsilon ] \end{aligned}$$

for some \(\varepsilon > 0\). Now the convexity of the upper-level objective functional F (see Assumption 2.1(e)) implies

$$\begin{aligned} F((1-s) (\beta _{k},y_{k},u_{k})+ s (\tilde{\beta },\tilde{y},\tilde{u})) \le (1 - s) F(\beta _{k},y_{k},u_{k})+ s F(\tilde{\beta },\tilde{y},\tilde{u})< F(\beta _{k},y_{k},u_{k})\end{aligned}$$

for all \(s \in (0,\varepsilon ]\). This contradicts the optimality of \((\beta _{k},y_{k},u_{k})\) from Algorithm 1 (S2). \(\square \)

Note that the piecewise linear function \(\xi _{{\mathcal {T}}_k}\) is continuous if the triangulation \({\mathcal {T}}_k\) does not possess hanging nodes. Otherwise, it might be discontinuous at all facets containing hanging nodes.

4 Penalty approach

The subproblems (OVR\((\xi ,T)\)) presented in Algorithm 1 are already subject to convex constraints; however, the nonlinear inequality constraint \(f(\beta ,y,u)\le \xi (\beta )\) may still introduce difficulties when implementing the solution algorithm. In particular, this constraint is of a rather unusual form in an optimal control context, see Sect. 5. By using a penalty method for this complicated constraint, the treatment of the subproblems (OVR\((\xi ,T)\)) can be simplified, since the inequality constraint is incorporated into the objective functional. Any additional error that is introduced by the penalty approach has to be compared to the error induced by the relaxation of the problem with the affine interpolation of the optimal-value function.

By replacing the subproblems in Algorithm 1 with a penalty approach, we arrive at Algorithm 2 for which we now provide some further comments. In a classical penalty method the penalty parameter depends only on the iteration counter k. In Algorithm 2, we allow an additional dependence on the simplex T. Indeed, if \(\gamma _{k,T}\) is independent of k, it is sufficient to solve the auxiliary problems (OVRP\((T,\gamma _{k,T})\)) only on the new cells \(T \in {\mathcal {T}}_{k+1} {\setminus } {\mathcal {T}}_k\). Otherwise, we would need to solve these problems on all cells in each iteration. The stopping criterion in (S3) is justified in the first part of the proof of the upcoming Theorem 4.2.

Algorithm 2 (Penalty variant of Algorithm 1; its steps (S1)–(S3) are referenced in the text)
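Compared with the sketch given after Algorithm 1, only the per-cell subproblem changes: the value-function constraint is moved into the objective. A schematic version (solve_convex, F, f, xi_T and gamma are placeholders of our own) reads:

```python
def solve_penalized_subproblem(T, solve_convex, F, f, xi_T, gamma, P=lambda s: s):
    # Penalized objective of (OVRP(T, gamma)); with the default P = Id this is
    # simply F + gamma(T) * (f - xi_T), cf. Sect. 4.2.  solve_convex is assumed
    # to minimize the given functional over beta in T, Ay = Bu, u in U_ad.
    def objective(beta, y, u):
        return F(beta, y, u) + gamma(T) * P(f(beta, y, u) - xi_T(beta))
    return solve_convex(objective, T)
```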

Lemma 4.1

Let the penalty function \(P:{\mathbb {R}}\rightarrow {\mathbb {R}}\) be non-constant, non-decreasing and convex. Then, for every simplex \(T \subset Q\) and \(\gamma _{k,T} > 0\), the problem (OVRP\((T, \gamma _{k,T})\)) possesses a solution.

Proof

From the monotonicity and convexity of P, we get \(P(s) \rightarrow \infty \) for \(s \rightarrow \infty \). For a minimizing sequence \((\beta _{k},y_{k},u_{k})\), the boundedness of \(\beta _k\) follows from \(\beta _k \in T\). Since F is bounded from below by Assumption 2.1(e) and since \(\gamma _{k,T} > 0\), the expression \(P(f(\beta _{k},y_{k},u_{k})- \xi _T(\beta _k) )\) is bounded from above. Due to the properties of \(P\), the sequence \(f(\beta _{k},y_{k},u_{k})\) is bounded from above. Thus, the boundedness of \((y_k,u_k)\) follows from Assumption 2.1(g). Now, the remaining part of the proof is clear since the objective is continuous and convex, hence, weakly sequentially lower semicontinuous. \(\square \)

4.1 Standard penalization

We first prove the convergence of Algorithm 2 for a typical penalty function \(P\).

Theorem 4.2

Let the penalty function \(P:{\mathbb {R}}\rightarrow {\mathbb {R}}\) be monotone and convex, such that \(P(s) = 0\) for all \(s\le 0\) and \(P(s) > 0\) for all \(s > 0\). If \(\gamma _{k, {{\bar{T}}}_k}\rightarrow \infty \), Algorithm 2 either stops at a global optimal solution of (OVR) or the computed sequence \((\beta _{k},y_{k},u_{k})\) is bounded and contains a subsequence that converges weakly in \({\mathbb {R}}^n \times Y \times U\) to a global optimal solution of (OVR). Every weakly convergent subsequence of \((\beta _{k},y_{k},u_{k})\) converges strongly. If (OVR) has a unique global solution \((\bar{\beta },\bar{y},\bar{u})\), then the entire sequence \((\beta _{k},y_{k},u_{k})\) converges strongly to \((\bar{\beta },\bar{y},\bar{u})\).

Proof

A global solution \((\bar{\beta },\bar{y},\bar{u})\) to (OVR) is feasible for (OVRP\((T, \gamma _{k,T})\)) if \({\bar{\beta }}\in T\). By definition of \((\beta _{k},y_{k},u_{k})\) and the assumed properties for the penalty function \(P\) one obtains the estimate

$$\begin{aligned} \begin{aligned} F(\beta _{k},y_{k},u_{k})&\le F(\beta _{k},y_{k},u_{k})+ \gamma _{k,{{\bar{T}}}_k}P\big (f(\beta _{k},y_{k},u_{k})- \xi _{{{\bar{T}}}_k}(\beta _{k})\big )\\&\le F(\beta _{k,T},y_{k,T},u_{k,T}) + \gamma _{k,T}P\big ( f(\beta _{k,T},y_{k,T},u_{k,T}) - \xi _{T}(\beta _{k,T}) \big )\\&\le F(\bar{\beta },\bar{y},\bar{u}). \end{aligned} \end{aligned}$$
(22)

If Algorithm 2 terminates in (S3), then the condition \(f(\beta _{k},y_{k},u_{k})= \varphi (\beta _k)\) implies feasibility of \((\beta _{k},y_{k},u_{k})\) for (OVR) while (22) ensures global optimality.

It remains to check the case that Algorithm 2 does not terminate. From (22) and Assumption 2.1(e) we get a constant \(C \ge 0\) such that

$$\begin{aligned} P\big (f(\beta _{k},y_{k},u_{k})- \xi _{{{\bar{T}}}_k}(\beta _{k})\big ) \le \frac{F(\bar{\beta },\bar{y},\bar{u})- F(\beta _{k},y_{k},u_{k})}{\gamma _{k,{{\bar{T}}}_k}} \le \frac{C}{\gamma _{k,{{\bar{T}}}_k}} \rightarrow 0. \end{aligned}$$
(23)

Using that P is non-decreasing and that \(\xi _{{{\bar{T}}}_k}(\beta _k)\) is bounded from above (since \(\varphi \) is bounded on the compact set Q and \(\xi _{{{\bar{T}}}_k}\) interpolates \(\varphi \) at the vertices of \({{\bar{T}}}_k\)), we get that \(f(\beta _{k},y_{k},u_{k})\) is bounded from above. From Lemma 2.2 we get

$$\begin{aligned} f(\beta _{k},y_{k},u_{k})\ge f( \beta _k, y_{\beta _k}, u_{\beta _k} ) + \frac{\mu }{2} {\Vert } u_k - u_{\beta _k}{\Vert }_U^2. \end{aligned}$$

Since f is bounded from below and since \(u_{\beta _k}\) is bounded by Corollary 2.5, we obtain the boundedness of \(u_k\) in U. The boundedness of the solution operator S then implies boundedness of the state \(y_k = S u_k\) in Y. Thus, the sequence \((\beta _{k},y_{k},u_{k})\) is bounded and contains a weakly convergent subsequence (without relabeling), \((\beta _{k},y_{k},u_{k})\rightharpoonup (\hat{\beta },\hat{y},\hat{u})\). The parameter \(\beta _{k}\) converges strongly because \(\beta \in Q\subset {\mathbb {R}}^{n}\) is finite dimensional. It remains to check optimality of the weak limit \((\hat{\beta },\hat{y},\hat{u})\) and the strong convergence.

From (23) we obtain \(\limsup _{k\rightarrow \infty } f(\beta _{k},y_{k},u_{k})-\xi _{{{\bar{T}}}_k}(\beta _k) \le 0\).

Arguing as in Theorem 3.2, we have \({\text {diam}}({{\bar{T}}}_k)\rightarrow 0\); together with the interpolation error estimate from Lemma 3.1 this yields

$$\begin{aligned} \begin{aligned} 0&\le \liminf _{k\rightarrow \infty } \big (f(\beta _{k},y_{k},u_{k})-\varphi (\beta _k)\big ) \le \limsup _{k\rightarrow \infty } \big (f(\beta _{k},y_{k},u_{k})-\varphi (\beta _k)\big ) \\ {}&\le \limsup _{k\rightarrow \infty } \big (f(\beta _{k},y_{k},u_{k})-\xi _{{{\bar{T}}}_k}(\beta _k) + \frac{C_{\varphi }}{2} {\text {diam}}({\bar{T}}_{k})^{2}\big ) =0. \end{aligned} \end{aligned}$$

In particular, we have \(f(\beta _{k},y_{k},u_{k})-\varphi (\beta _k)\rightarrow 0\). This implies

$$\begin{aligned} 0 \le f(\hat{\beta },\hat{y},\hat{u})- \varphi ({\hat{\beta }}) \le \lim _{k\rightarrow \infty } \big (f(\beta _{k},y_{k},u_{k})-\varphi (\beta _k)\big ) =0. \end{aligned}$$

Therefore, \((\hat{\beta },\hat{y},\hat{u})\) is feasible for (OVR) and \(f(\beta _{k},y_{k},u_{k})\rightarrow f(\hat{\beta },\hat{y},\hat{u})\) holds. Then we can argue as in (12) and obtain strong convergence for the control u. Since the solution operator S is continuous, this proves the strong convergence of the subsequence \((\beta _{k},y_{k},u_{k})\rightarrow (\hat{\beta },\hat{y},\hat{u})\). Finally, due to

$$\begin{aligned} F(\hat{\beta },\hat{y},\hat{u})= \lim _{k\rightarrow \infty } F(\beta _{k},y_{k},u_{k})\le F(\bar{\beta },\bar{y},\bar{u})\end{aligned}$$

we know that \((\hat{\beta },\hat{y},\hat{u})\) is a global minimizer of (OVR).

Analogous to Theorem 3.2, the usual subsequence-subsequence argument can be used to obtain strong convergence of the entire sequence if the solution to (OVR) is unique. \(\square \)

Remark 4.3

Theorem 4.2 shows that it suffices to let the penalty parameter \(\gamma _{k,T}\) depend only on the simplex T. One possible choice is \(\gamma _{k,T} = \upsilon ({\text {diam}}(T))\) with a function \(\upsilon \) satisfying \(\upsilon (t) \rightarrow \infty \) for \(t \rightarrow 0\). A direct benefit is that the subproblem on a fixed simplex then no longer depends on the iteration and needs to be solved only once, as in Algorithm 1. In analogy to Algorithm 1, a refinement strategy that ensures the validity of step (S3) in Algorithm 2 is given in Lemma 3.3.

4.2 Direct penalization

The problem (OVRP\((T,\gamma _{k,T})\)) can be simplified further by using the direct penalization \(P= {\text {Id}}\). For a general optimization problem this could render the objective unbounded from below. In our setting, however, the constraint function \(f(\beta ,y,u)- \xi _T(\beta )\) cannot become arbitrarily negative, since \(f(\beta ,y,u)\) is bounded from below by Assumption 2.1(e) and \(\xi _T(\beta )\) is bounded from above by the largest value of \(\varphi \) on the bounded set \(Q\). Nevertheless, the direct penalization has implications for the choice of the penalty parameter. Since the difference \(f(\beta ,y,u)- \xi _T(\beta )\) between the lower-level objective functional and the interpolation of the optimal-value function can be negative, simply driving the penalty parameter to infinity does not work: a penalty parameter that is too large merely encourages choosing a point at which the approximation quality of \(\xi _{T}\) is close to its worst. The penalty parameter \(\gamma \) therefore needs to be chosen specifically for each simplex.

Corollary 4.4

We consider Algorithm 2 with \(P= {\text {Id}}\) and we assume that the penalty parameters satisfy

$$\begin{aligned} \gamma _{{{\bar{T}}}_k} \rightarrow \infty , \quad \gamma _{{{\bar{T}}}_k} {\text {diam}}({{\bar{T}}}_k)^2 \rightarrow 0 \end{aligned}$$

as \(k \rightarrow \infty \). Then, \((\beta _{k},y_{k},u_{k})\) contains a strongly convergent subsequence and all accumulation points are globally optimal for (OVR). If (OVR) admits a unique global minimizer \((\bar{\beta },\bar{y},\bar{u})\) then the entire sequence \((\beta _{k},y_{k},u_{k})\) converges strongly towards this minimizer.

Proof

The argumentation follows the lines of the proof of Theorem 4.2; therefore, we only comment on the differences. The interpolation error estimate from Lemma 3.1 provides a lower bound on the constraint violation, i.e.

$$\begin{aligned} f(\beta _{k},y_{k},u_{k})- \xi _{{{\bar{T}}}_k}(\beta _k) \ge \varphi (\beta _k) - \xi _{{{\bar{T}}}_k}(\beta _k) \ge - \frac{C_{\varphi }}{2} {\text {diam}}({{\bar{T}}}_k)^{2}. \end{aligned}$$
(24)

When using \(P={\text {Id}}\), an upper bound follows as in (23) and we have

$$\begin{aligned} f(\beta _k,y_k,u_k) - \xi _{{{\bar{T}}}_k}(\beta _{k}) \le \frac{F(\bar{\beta },\bar{y},\bar{u})- F(\beta _{k},y_{k},u_{k})}{\gamma _{{\bar{T}}_{k}}} \le \frac{C_{F}}{\gamma _{{\bar{T}}_{k}}}. \end{aligned}$$
(25)

We can now argue as in Theorem 4.2 and obtain \((\beta _{k},y_{k},u_{k})\rightarrow (\hat{\beta },\hat{y},\hat{u})\) along a subsequence, where \((\hat{\beta },\hat{y},\hat{u})\) is a feasible point of (OVR). In order to achieve optimality of \((\hat{\beta },\hat{y},\hat{u})\), we combine (24) and (25) and obtain

$$\begin{aligned} F(\beta _{k},y_{k},u_{k})\le F(\bar{\beta },\bar{y},\bar{u})+ \frac{C_\varphi }{2} \gamma _{{{\bar{T}}}_k} {\text {diam}}({{\bar{T}}}_k)^2 \rightarrow F(\bar{\beta },\bar{y},\bar{u})+ 0, \end{aligned}$$

which implies \(F(\hat{\beta },\hat{y},\hat{u})=\lim _{k\rightarrow \infty } F(\beta _{k},y_{k},u_{k})\le F(\bar{\beta },\bar{y},\bar{u})\). The remaining part of the proof follows the proof of Theorem 4.2. \(\square \)
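As an illustration (one admissible choice among many, not prescribed by Corollary 4.4), the penalty parameter can be coupled to the simplex diameter: since \({\text {diam}}({{\bar{T}}}_k) \rightarrow 0\) as in the proof of Theorem 4.2, the choice

$$\begin{aligned} \gamma _{{{\bar{T}}}_k} := {\text {diam}}({{\bar{T}}}_k)^{-1} \end{aligned}$$

yields \(\gamma _{{{\bar{T}}}_k} \rightarrow \infty \) and \(\gamma _{{{\bar{T}}}_k} {\text {diam}}({{\bar{T}}}_k)^2 = {\text {diam}}({{\bar{T}}}_k) \rightarrow 0\), so both requirements of Corollary 4.4 are met.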

The next lemma addresses the continuous dependence of the solution on the penalty parameter.

Lemma 4.5

We suppose that \(F(\cdot , S(u), u)\) is strongly convex (w.r.t. \(\beta \)) with constant \(\mu _\beta > 0\), independent of the control u. Then, (OVRP(\(T,\gamma \))) with \(P= {\text {Id}}\) has a unique solution \((\beta _{\gamma },y_{\gamma },u_{\gamma })\) for all \(\gamma > 0\). Further, let \(0<\gamma _a\le \gamma _T<\infty \) and \(\gamma _a \le {\hat{\gamma }}\). Then,

$$\begin{aligned} {\Vert }\beta _{\gamma _T} - \beta _{{\hat{\gamma }}}{\Vert }_{{\mathbb {R}}^n} +{\Vert }u_{\gamma _T} - u_{{\hat{\gamma }}}{\Vert }_{U} \le C_{\mu _\beta ,\gamma _a,\gamma _T}{|}\gamma _T-{\hat{\gamma }}{|}. \end{aligned}$$

Proof

The existence of a solution to (OVRP(\(T,\gamma \))) follows from Lemma 4.1. For \(\gamma _a\le \gamma \), the strong convexity of f implies that the reduced objective of (OVRP(\(T,\gamma \))) is strongly convex w.r.t. u with constant \(\gamma _a \mu \) on the feasible set. This gives uniqueness of the state \(y_\gamma =S(u_\gamma )\) and of the control \(u_\gamma \). With the additional assumption on F, we get the uniqueness of \(\beta _\gamma \).

Next, we want to apply Lemma 2.4 to the state reduced variant of (OVRP(\(T,\gamma \))), i.e., we apply the setting

$$\begin{aligned} x&= (\beta , u), \quad C = T \times U_{\textrm{ad}}, \quad p = \gamma , \quad {\hat{Q}} = [\gamma _a,\infty ), \quad \\ J(x,p)&= J( (\beta , u),\gamma ) := F(\beta , S(u), u) + \gamma ( f(\beta , S(u), u) - \xi _T(\beta ) ). \end{aligned}$$

Assumption 2.1 ensures that the assumptions of Lemma 2.4 are satisfied. Thus, Lemma 2.4 implies

$$\begin{aligned} {\Vert }\beta _{{\hat{\gamma }}} - \beta _{\gamma _T}{\Vert }_{{\mathbb {R}}^n} + {\Vert }u_{{\hat{\gamma }}}-u_{\gamma _T}{\Vert }_{U} \le C_{\mu _\beta , \gamma _a} {\Vert }J'_x( (\beta _{\gamma _T}, u_{\gamma _T}),{\hat{\gamma }}) - J'_x( (\beta _{\gamma _T}, u_{\gamma _T}), \gamma _T){\Vert }_{{\mathbb {R}}^n \times U^{\star }}. \end{aligned}$$

Now, the derivative \(J'_x( (\beta , u), \gamma ) \) contains the two components

$$\begin{aligned}&F_\beta '(\beta , S(u), u) + \gamma ( f_\beta '(\beta , S(u), u) - \xi _T'(\beta )), \\&F_u'(\beta , S(u), u) + S^{\star }F_y'(\beta , S(u), u) + \gamma ( f_u'(\beta , S(u), u) + S^{\star }f_y'(\beta , S(u), u)). \end{aligned}$$

Thus, the above estimate implies

$$\begin{aligned} {\Vert }\beta _{{\hat{\gamma }}} - \beta _{\gamma _T}{\Vert }_{{\mathbb {R}}^n} + {\Vert }u_{{\hat{\gamma }}}-u_{\gamma _T}{\Vert }_U \le C_{\mu _\beta , \gamma _a} {|}{\hat{\gamma }} - \gamma _T{|} \left( C_{1,\gamma _T} + C_{2,\gamma _T} \right) , \end{aligned}$$

with

$$\begin{aligned} C_{1, \gamma _T}&= {\Vert } f_\beta '(\beta _{\gamma _T}, S(u_{\gamma _T}), u_{\gamma _T}) - \xi _T'(\beta _{\gamma _T} ) {\Vert }_{{\mathbb {R}}^n} , \\ C_{2, \gamma _T}&= {\Vert } f_u'(\beta _{\gamma _T}, S(u_{\gamma _T}), u_{\gamma _T}) + S^{\star }f_y'(\beta _{\gamma _T}, S(u_{\gamma _T}), u_{\gamma _T}) {\Vert }_{U^{\star }} . \end{aligned}$$

This shows the claim. \(\square \)

The problem (OVRP(\(T,\gamma _T\))) is a relaxation of (OVR\((\xi ,T)\)); consequently, its minimal objective value is not larger than that of (OVR\((\xi ,T)\)) and thus provides a lower bound for it. Since this lower bound depends on the chosen penalty parameter \(\gamma _T\), we try to adjust this parameter to obtain the largest possible lower bound. We will now show that it is reasonable to aim for a choice of the penalty parameter such that the equality \(f(\beta _{\gamma _T},y_{\gamma _T},u_{\gamma _T})= \xi _T(\beta _{\gamma _T})\) holds for the solution \((\beta _{\gamma _T},y_{\gamma _T},u_{\gamma _T})\) of (OVRP(\(T,\gamma _T\))). In the expected case where no solution to (21) is feasible for (OVR), this specific penalty parameter results in the largest possible minimal objective value for (OVRP(\(T,\gamma _T\))).

Lemma 4.6

Let the state reduced functional F be strongly convex with respect to \(\beta \) with constant \(\mu _\beta \) independent of the control u. Let a simplex T be given and, again, \(P= {\text {Id}}\). Further, we assume the existence of \(\beta \in T\) with \(\varphi (\beta ) < \xi _T(\beta )\). For \(\gamma \ge 0\), we denote a solution to (OVRP(\(T,\gamma \))) by \((\beta _{\gamma },y_{\gamma },u_{\gamma })\).

  1. (a)

    If \(f(\tilde{\beta },\tilde{y},\tilde{u})\le \xi _T(\tilde{\beta })\) for one global solution \((\tilde{\beta },\tilde{y},\tilde{u})\) to (OVRP(T, 0)) then the choice \(\gamma _T = 0\) yields the largest minimal objective value for (OVRP(\(T,\gamma _{T}\))).

  2. (b)

    If \(f(\tilde{\beta },\tilde{y},\tilde{u})> \xi _T(\tilde{\beta })\) for all global solutions \((\tilde{\beta },\tilde{y},\tilde{u})\) to (OVRP(T, 0)) then there exists \(\gamma _T > 0\) such that \(f(\beta _{\gamma _T},y_{\gamma _T},u_{\gamma _T})= \xi _T(\beta _{\gamma _T})\) and this choice of \(\gamma _T\) results in the largest minimal objective value for (OVRP(\(T,\gamma _{T}\))).

The existence of \(\beta \in T\) with \(\varphi (\beta ) < \xi _T(\beta )\) is equivalent to \(\varphi \) being not affine on T. Thus, this assumption is not very restrictive.

Proof

   

  1. (a)

    For any \(\gamma \ge 0\) we have

    $$\begin{aligned} F(\beta _{\gamma },y_{\gamma },u_{\gamma })+ \gamma (f(\beta _{\gamma },y_{\gamma },u_{\gamma })- \xi _T(\beta _{\gamma }))&\le F(\tilde{\beta },\tilde{y},\tilde{u})+ \gamma (f(\tilde{\beta },\tilde{y},\tilde{u})- \xi _T(\tilde{\beta })) \\&\le F(\tilde{\beta },\tilde{y},\tilde{u}). \end{aligned}$$

    Hence, the infimal value of (OVRP(\(T,\gamma _{T}\))) is maximized for \(\gamma _T = 0\).

  2. (b)

    We prove the existence of \(\gamma _T > 0\) with \(f(\beta _{\gamma _T},y_{\gamma _T},u_{\gamma _T})- \xi _T(\beta _{\gamma _T}) = 0\) by the intermediate value theorem. Therefore, we have to provide penalty parameters \(\underline{\gamma }_T\) and \({{\bar{\gamma }}}_T\) with \(f(\beta _{\underline{\gamma }_T},y_{\underline{\gamma }_T},u_{\underline{\gamma }_T})- \xi _T(\beta _{\underline{\gamma }_T}) \ge 0\) and \(f(\beta _{\bar{\gamma }_T},y_{\bar{\gamma }_T},u_{\bar{\gamma }_T})- \xi _T(\beta _{{{\bar{\gamma }}}_T}) \le 0\). The required continuous dependence w.r.t. \(\gamma > 0\) follows from Lemma 4.5.

    We first construct \({{\bar{\gamma }}}_T\). By assumption F is bounded from below by a constant \(C \in {\mathbb {R}}\) and there exists a \(\beta \in T\), such that \(\varphi (\beta )=f(\beta ,y_{\beta },u_{\beta })<\xi _T(\beta )\). Thus, we can choose \({{\bar{\gamma }}}_T > 0\) such that

    $$\begin{aligned} F(\beta ,y_{\beta },u_{\beta })+ {{\bar{\gamma }}}_T(f(\beta ,y_{\beta },u_{\beta })- \xi _T(\beta )) \le C. \end{aligned}$$
    (26)

    It follows that \(f(\beta _{\bar{\gamma }_T},y_{\bar{\gamma }_T},u_{\bar{\gamma }_T})- \xi _T(\beta _{{{\bar{\gamma }}}_T}) \le 0\).

    The existence of \(\underline{\gamma }_T\) is proven by contradiction. Assume that there is no \(\gamma > 0\) with \(f(\beta _{\gamma },y_{\gamma },u_{\gamma })- \xi _T(\beta _{\gamma }) \ge 0\). For \(\gamma \searrow 0\), the bound \(f(\beta _{\gamma },y_{\gamma },u_{\gamma })< \xi _T(\beta _\gamma )\) and the quadratic growth condition from Lemma 2.2 imply boundedness of the control \(u_\gamma \), while the continuity of the solution operator yields boundedness of the state \(y_\gamma = Su_\gamma \). The parameter \(\beta _\gamma \in T\) is bounded as well. Thus, one obtains the existence of a weak accumulation point \((\bar{\beta },\bar{y},\bar{u})\) for \(\gamma \searrow 0\). It is clear that \((\bar{\beta },\bar{y},\bar{u})\) is feasible for (OVRP(T, 0)) and we show that it is even a solution. By optimality, we get the inequality

    $$\begin{aligned} F(\beta _{\gamma },y_{\gamma },u_{\gamma })+ \gamma (f(\beta _{\gamma },y_{\gamma },u_{\gamma })- \xi _{T}(\beta _\gamma )) \le F(\tilde{\beta },\tilde{y},\tilde{u})+ \gamma (f(\tilde{\beta },\tilde{y},\tilde{u})- \xi _{T}({\tilde{\beta }})) \end{aligned}$$

    and

    $$\begin{aligned} \lim _{\gamma \searrow 0} \gamma (f(\beta _{\gamma },y_{\gamma },u_{\gamma })- \xi _{T}(\beta _\gamma )) = \lim _{\gamma \searrow 0} \gamma (f(\tilde{\beta },\tilde{y},\tilde{u})- \xi _{T}({\tilde{\beta }})) = 0 \end{aligned}$$

    follows by boundedness of \(f(\beta _{\gamma },y_{\gamma },u_{\gamma })\). Thus,

    $$\begin{aligned} F(\bar{\beta },\bar{y},\bar{u})\le \liminf _{\gamma \searrow 0} F(\beta _{\gamma },y_{\gamma },u_{\gamma })\le F(\tilde{\beta },\tilde{y},\tilde{u}), \end{aligned}$$

    where we take the limit inferior along the weakly convergent subsequence. Thus, \((\bar{\beta },\bar{y},\bar{u})\) is a solution to (OVRP(T, 0)). Similarly, passing to the limit inferior in \(f(\beta _{\gamma },y_{\gamma },u_{\gamma })- \xi _T(\beta _{\gamma }) < 0\) yields \(f(\bar{\beta },\bar{y},\bar{u})- \xi _T({{\bar{\beta }}}) \le 0\). This contradicts the assumption and yields the existence of \(\underline{\gamma }_T\).

    By the intermediate value theorem, we conclude the existence of \(\gamma _T > 0\) with \(f(\beta _{\gamma _T},y_{\gamma _T},u_{\gamma _T})- \xi _T(\beta _{\gamma _T}) = 0\).

    It remains to prove that this choice of \(\gamma _T\) results in the largest infimal objective value for (OVRP(\(T,\gamma _{T}\))). It is clear that \(f(\beta _{\gamma },y_{\gamma },u_{\gamma })- \xi _T(\beta _\gamma )\) is non-increasing w.r.t. \(\gamma \). Thus, it follows with Lemma 4.5 that

    $$\begin{aligned} \big \{ \gamma _T \in [0,\infty ) \big |f(\beta _{\gamma _T},y_{\gamma _T},u_{\gamma _T})- \xi _T(\beta _{\gamma _T}) = 0 \big \} = [\gamma _a,\gamma _b ] \subset {\mathbb {R}}_{+}. \end{aligned}$$

    For \(\gamma _b< \gamma _1 < \gamma _2\), we have \(f(\beta _{\gamma _{1}},y_{\gamma _{1}},u_{\gamma _{1}})-\xi _T(\beta _{\gamma _{1}}) < 0\) and, thus, the optimality of \((\beta _{\gamma _{2}},y_{\gamma _{2}},u_{\gamma _{2}})\) for (OVRP(\(T,\gamma _{2}\))) implies

    $$\begin{aligned}&F(\beta _{\gamma _{1}},y_{\gamma _{1}},u_{\gamma _{1}})+ \gamma _{1} (f(\beta _{\gamma _{1}},y_{\gamma _{1}},u_{\gamma _{1}})-\xi _T({\beta _{\gamma _{1}}})) \\&\quad > F(\beta _{\gamma _{1}},y_{\gamma _{1}},u_{\gamma _{1}})+ \gamma _{2} (f(\beta _{\gamma _{1}},y_{\gamma _{1}},u_{\gamma _{1}})-\xi _T({\beta _{\gamma _{1}}})) \\&\quad \ge F(\beta _{\gamma _{2}},y_{\gamma _{2}},u_{\gamma _{2}})+ \gamma _{2} (f(\beta _{\gamma _{2}},y_{\gamma _{2}},u_{\gamma _{2}})-\xi _T({\beta _{\gamma _{2}}})). \end{aligned}$$

    It follows that the objective value of (OVRP(\(T,\gamma \))) is monotonically decreasing for \(\gamma > \gamma _b\) and, similarly, one can show that it is monotonically increasing for \(\gamma < \gamma _a\) and constant on \([\gamma _a, \gamma _b]\). Thus, all \(\gamma _T\in [\gamma _a,\gamma _b]\) maximize the minimal objective value of (OVRP(\(T,\gamma _{T}\))).\(\square \)

In general it is not possible to check which case of Lemma 4.6 applies. However, the proof suggests that, after solving (OVRP(\(T,\gamma \))), the value \(f(\beta _{\gamma },y_{\gamma },u_{\gamma })- \xi _T(\beta _\gamma )\) can be checked to infer whether the chosen penalty parameter \(\gamma \) was adequate, too small, or too large. Furthermore, when the simplices are split in Algorithm 2, the approximation \(\xi _T\) of the optimal-value function \(\varphi \) cannot increase at any point \(\beta \in Q\). Together with the fact that solutions of the refined subproblems are feasible for the problem on the original simplex T, this implies that the minimal objective value can only remain constant or increase if the same penalty parameter \(\gamma _T\) is reused for a refined subproblem. We therefore suggest starting with \(\gamma = 0\) and then using a heuristic to find a suitable \(\gamma _T\); the refined problems can inherit this \(\gamma _T\) as a starting value instead of zero. This approach covers both cases of Lemma 4.6 without the need to compute all solutions of (21). Once a \(\gamma _T\) with \(f(\beta _{\gamma _T},y_{\gamma _T},u_{\gamma _T})- \xi _T(\beta _{\gamma _T}) > 0\) has been found, one can be sure that all subsequent subproblems fall under case (b) of Lemma 4.6, because \(\xi \) does not increase under further refinement of the simplices.
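A possible realization of this heuristic, not spelled out above, is a simple bracketing and bisection in \(\gamma \): by the proof of Lemma 4.6 the residual \(f(\beta _{\gamma },y_{\gamma },u_{\gamma })- \xi _T(\beta _\gamma )\) is non-increasing in \(\gamma \), and by Lemma 4.5 it depends continuously on \(\gamma > 0\). The following sketch assumes the situation of Lemma 4.6(b) once the initial residual is positive; solve_penalized_subproblem and xi_T are hypothetical placeholders.

```python
# Sketch of a bracketing/bisection heuristic for the penalty parameter gamma_T
# (direct penalization P = Id).  The residual
#     r(gamma) = f(beta_gamma, y_gamma, u_gamma) - xi_T(beta_gamma)
# is non-increasing in gamma and depends continuously on gamma > 0.
# solve_penalized_subproblem and xi_T are hypothetical placeholders.

def find_gamma(T, gamma_start=0.0, tol=1e-8, max_bisect=50):
    def residual(gamma):
        sol = solve_penalized_subproblem(T, gamma)   # hypothetical solver
        return sol.f_value - xi_T(T, sol.beta)

    # If the inherited (or zero) starting parameter already gives a
    # non-positive residual, keep it; this covers case (a) of Lemma 4.6.
    if residual(gamma_start) <= 0.0:
        return gamma_start

    # Otherwise, find an upper bracket with non-positive residual ...
    lo, hi = gamma_start, max(1.0, 2.0 * gamma_start)
    while residual(hi) > 0.0:
        lo, hi = hi, 2.0 * hi

    # ... and bisect until the residual (approximately) vanishes.
    for _ in range(max_bisect):
        mid = 0.5 * (lo + hi)
        r = residual(mid)
        if abs(r) <= tol:
            return mid
        if r > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Refined subproblems can then call this routine with the parent simplex's \(\gamma _T\) as starting value, as suggested above.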

Lemma 4.7

Let the state reduced functional F be strongly convex with respect to \(\beta \) with constant \(\mu _\beta \) independent of the control u. Let a simplex T be given and, again, \(P= {\text {Id}}\). Further, we assume the existence of \(\beta \in T\) with \(\varphi (\beta ) < \xi _T(\beta )\). For \(\gamma \ge 0\), we denote a solution of (OVRP(\(T,\gamma \))) by \((\beta _{\gamma },y_{\gamma },u_{\gamma })\). Let the penalty parameter \(\gamma _T\) be chosen as described in Lemma 4.6, i.e., we have one of the following cases:

  1. (a)

    \(\gamma _T = 0\) and \(f(\tilde{\beta },\tilde{y},\tilde{u})\le \xi _T(\tilde{\beta })\) for one global solution \((\tilde{\beta },\tilde{y},\tilde{u})\) of (OVRP(T, 0)),

  2. (b)

    \(\gamma _T > 0\) and \(f(\beta _{\gamma _T},y_{\gamma _T},u_{\gamma _T})= \xi _T(\beta _{\gamma _T})\).

Then, the point \((\tilde{\beta },\tilde{y},\tilde{u})\) or \((\beta _{\gamma _T},y_{\gamma _T},u_{\gamma _T})\), respectively, is a solution of (OVR\((\xi ,T)\)) and \(\gamma _T\) is a multiplier corresponding to the constraint \(f(\beta ,y,u)\le \xi _T(\beta )\) in the optimality system for (OVR\((\xi ,T)\)).

Proof

First, we consider the case \(\gamma _T > 0\). Note that \((\beta _{\gamma _T},y_{\gamma _T},u_{\gamma _T})\) is feasible for (OVR\((\xi ,T)\)). We denote by \((\beta _{T},y_{T},u_{T})\) a solution of (OVR\((\xi ,T)\)). Then, the optimality of both points, \(f(\beta _{T},y_{T},u_{T})\le \xi (\beta _T)\) and \(f(\beta _{\gamma _T},y_{\gamma _T},u_{\gamma _T})= \xi (\beta _{\gamma _T})\) yield

$$\begin{aligned} F(\beta _{\gamma _T},y_{\gamma _T},u_{\gamma _T})\ge F(\beta _{T},y_{T},u_{T})&\ge F(\beta _{T},y_{T},u_{T})+ \gamma _T ( f(\beta _{T},y_{T},u_{T})- \xi (\beta _T)) \\&\ge F(\beta _{\gamma _T},y_{\gamma _T},u_{\gamma _T})+ \gamma _T ( f(\beta _{\gamma _T},y_{\gamma _T},u_{\gamma _T})- \xi (\beta _{\gamma _T})) \\ {}&= F(\beta _{\gamma _T},y_{\gamma _T},u_{\gamma _T}). \end{aligned}$$

This shows \(f(\beta _{T},y_{T},u_{T})= \xi (\beta _{T})\) and \(F(\beta _{\gamma _T},y_{\gamma _T},u_{\gamma _T})= F(\beta _{T},y_{T},u_{T})\). Hence, the triple \((\beta _{T},y_{T},u_{T})\) solves (OVRP(\(T,\gamma _T\))) and, by the uniqueness of the solution, the solution is \((\beta _{T},y_{T},u_{T})= (\beta _{\gamma _T},y_{\gamma _T},u_{\gamma _T})\).

Thus, \((\beta _{\gamma _T},y_{\gamma _T},u_{\gamma _T})\) is globally optimal for (OVR\((\xi ,T)\)). The optimality system of (OVRP(\(T,\gamma _T\))) can be interpreted as the KKT system of (OVR\((\xi ,T)\)) and the parameter \(\gamma _T\) in (OVRP(\(T,\gamma _T\))) becomes a Lagrange multiplier in the KKT system of (OVR\((\xi ,T)\)). Note that Lagrange multipliers for (OVRP(\(T,\gamma _T\))) exist since the CQ by [27, 30] is satisfied.

Finally, we consider the case \(\gamma _T = 0\). Due to \(f(\tilde{\beta },\tilde{y},\tilde{u})\le \xi _T({\tilde{\beta }})\), the point \((\tilde{\beta },\tilde{y},\tilde{u})\) is feasible for (OVR\((\xi ,T)\)). Since (OVRP(T, 0)) is a relaxation of (OVR\((\xi ,T)\)), this shows that \((\tilde{\beta },\tilde{y},\tilde{u})\) is a solution of (OVR\((\xi ,T)\)). The interpretation of \(\gamma _T\) as a multiplier is analogous to the case \(\gamma _T > 0\). \(\square \)

This lemma shows that the problem (OVR\((\xi ,T)\)) is equivalent (in some sense) to (OVRP(\(T,\gamma _T\))) for the “optimal” value of \(\gamma _T\), cf. Lemma 4.6. In the application we have in mind, the structure of (OVRP(\(T,\gamma _T\))) is much nicer, since the “complicated” function f appears in the objective and not in the constraints.

5 Parameter identification in an optimal control problem

In the previous section we discussed how a globally optimal solution of (OVR) can be found using Algorithm 2. However, so far we have not introduced a solution scheme for the subproblems (OVRP\((T, \gamma _{k,T})\)). In this section we show that, for a special problem class, the direct penalization (see Sect. 4.2) has the advantage that the semismooth Newton method can be used to solve the subproblems.

5.1 Problem formulation and properties

We consider the bilevel optimization problem with the lower-level problem

$$\begin{aligned} \begin{aligned} \min _{y\in H_0^1(\Omega ),\,u\in L^2(\Omega )} \quad&{\hat{f}}(\alpha ,y,u) {:}{=}\sum _{i=1}^{n} \frac{\alpha _{i}}{2} \Vert C_{i}y-y_{d,i}\Vert ^{2}_{L^{2}(\Omega )} + \frac{\sigma _{l}}{2} \Vert u\Vert _{L^{2}(\Omega )}^{2} \\ \text {s.t.} \quad&A y = B u, \\ \quad&u \in U_{\textrm{ad}}, \end{aligned} \end{aligned}$$
(LL\((\alpha )\))

and upper-level problem

$$\begin{aligned} \begin{aligned} \min _{\alpha \in {\mathbb {R}}^n,\,y\in H_0^1(\Omega ),\,u\in L^2(\Omega )} \quad&{\hat{F}}(\alpha ,y,u) {:}{=}\frac{1}{2} \Vert y-y_{m}\Vert _{L^{2}(\Omega )}^{2} +\frac{\sigma _{u}}{2} \Vert u-u_{m}\Vert _{L^{2}(\Omega )}^{2} +\frac{\sigma _{\alpha }}{2} \Vert \alpha \Vert _{{\mathbb {R}}^{n}}^{2} \\ \text {s.t.} \quad&\alpha \in Q_\alpha , \\ \quad&(y,u)\;\text {solves}\;(\hbox {LL}(\alpha )). \end{aligned} \end{aligned}$$
(UL)

As an underlying assumption, let \(\sigma _u,\sigma _l,\sigma _{\alpha } > 0\) and \(y_m,y_{d,i},u_m\in L^2(\Omega )\), where \(\Omega \subset {\mathbb {R}}^l\) is an open and bounded set. Moreover, let \(Q_\alpha := [a_1,b_1]\times \dots \times [a_n,b_n]\) constitute a box constraint on \(\alpha \), where \(a_i,b_i\in {\mathbb {R}}\) satisfy \(0<a_i<b_i\) for all \(i\in \{1,\ldots ,n\}\). We also require that the admissible set \(U_{\textrm{ad}}\) has the structure \(U_{\textrm{ad}}=\{v \in L^2(\Omega ) \mid u_a \le v \le u_b \; \text {a.e. in}\;\Omega \}\), where \(u_a,u_b \in L^2(\Omega )\) are functions such that \(U_{\textrm{ad}}\) is nonempty. Further, let \(A:H_0^1(\Omega ) \rightarrow H^{-1}(\Omega )\), \(B:L^2(\Omega ) \rightarrow H^{-1}(\Omega )\), \(C_i:H_0^1(\Omega ) \rightarrow L^2(\Omega )\) be bounded linear operators such that A is bijective.

We also assume that B can be extended to an operator \(B \in L[L^q(\Omega ), H^{-1}(\Omega )]\) for some \(q\in (1,2)\). Additionally, we require \(u_m,u_a,u_b\in L^{q'}(\Omega )\), where \(q'>2\) satisfies \(1/q+1/q' = 1\).

We observe that the lower-level objective functional \({{\hat{f}}}\) is not convex with respect to all variables. In particular, Assumption 2.1(e) is not satisfied. Additionally, the corresponding optimal-value function is usually not convex either. Since Algorithm 1 depends on the convexity of the optimal-value function, the problem first has to be transformed in such a way that the new lower-level objective functional is convex. For this purpose, we consider the simple substitution \(\beta _i=1/\alpha _i\). We also define \(\sigma _\beta := \sigma _\alpha \).

$$\begin{aligned} F(\beta ,y,u){:}{=}\frac{1}{2}{\Vert }y-y_m{\Vert }_{L^2(\Omega )}^2 + \frac{\sigma _u}{2} {\Vert }u-u_m{\Vert }_{L^2(\Omega )}^2 + \frac{\sigma _\beta }{2} \sum _{i=1}^n \left( \frac{1}{\beta _i}\right) ^2. \end{aligned}$$

The constraint \(\alpha \in Q_\alpha \) has to be transformed to \(\beta \in Q:= [b_1^{-1}, a_1^{-1}] \times \cdots \times [b_n^{-1}, a_n^{-1}]\). Observe that Q is a compact subset of \((0,\infty )^n\) because \(Q_\alpha \) is a compact subset of \((0,\infty )^n\).

One can check that F is convex on \(Q\times H_0^1(\Omega ) \times L^2(\Omega )\) due to \(\beta > 0\) for \(\beta \in Q\). The transformed lower-level objective is

$$\begin{aligned} f(\beta ,y,u){:}{=}\sum _{i=1}^{n} \frac{1}{2\beta _{i}} \Vert C_{i}y-y_{d,i}\Vert ^{2}_{L^{2}(\Omega )} + \frac{\sigma _{l}}{2} \Vert u\Vert _{L^{2}(\Omega )}^{2}. \end{aligned}$$
(27)

We check that this f is indeed convex on \(Q\times H_0^1(\Omega )\times L^2(\Omega )\). Here we use that for a Banach space Y, the function \(g:Y\rightarrow {\mathbb {R}}\), \(y\mapsto \tfrac{1}{2}{\Vert }y{\Vert }^2_Y\) is convex and for \(\lambda > 0\) the so-called perspective of g is given by

$$\begin{aligned} Y \times ( 0,\infty ) \ni (y,\lambda ) \mapsto \lambda g(y / \lambda ) = \frac{1}{2 \lambda }{\Vert }y{\Vert }^{2}_{Y}. \end{aligned}$$
(28)

It is known that the perspective of a convex function is convex (e.g. one can simply generalize the proof of [31, Lemma 2] to Banach spaces). Now convexity is preserved under composition with an affine function \(y\mapsto Cy - y_d\). Thus, the function \((\beta _i,y)\mapsto \frac{1}{2\beta _i}{\Vert }C_i y - y_{d,i}{\Vert }_{L^2(\Omega )}^2\) is convex. The convexity of f follows.
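Since only a reference is given for the convexity of the perspective, we briefly record the short argument: for \((y_1,\lambda _1),(y_2,\lambda _2)\in Y\times (0,\infty )\), \(t\in [0,1]\) and \(\lambda _t := t\lambda _1+(1-t)\lambda _2\), applying the convexity of g with the nonnegative weights \(t\lambda _1/\lambda _t\) and \((1-t)\lambda _2/\lambda _t\) (which sum to one) gives

$$\begin{aligned} \lambda _t\, g\Big (\frac{t y_1+(1-t)y_2}{\lambda _t}\Big ) = \lambda _t\, g\Big (\frac{t\lambda _1}{\lambda _t}\,\frac{y_1}{\lambda _1}+\frac{(1-t)\lambda _2}{\lambda _t}\,\frac{y_2}{\lambda _2}\Big ) \le t\lambda _1\, g\Big (\frac{y_1}{\lambda _1}\Big )+(1-t)\lambda _2\, g\Big (\frac{y_2}{\lambda _2}\Big ), \end{aligned}$$

which is precisely the convexity of the map in (28).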

With the above setting and observations, one can show that the transformed problem satisfies Assumption 2.1.

5.2 Stationarity system for the direct penalization

Classic choices of the penalty function for (OVRP\((T, \gamma _{k,T})\)), e.g., \(P = \max (0,\cdot )^2\), will result in subproblems that are difficult to handle. In particular, the optimality system cannot be reformulated as a simple projection formula. We will see that the direct penalization \(P = {\text {Id}}\) results in an easy to implement solution algorithm for (OVRP\((T, \gamma _{k,T})\)). Computing the solution of (OVRP\((T, \gamma _{k,T})\)) requires the construction of \(\xi \) and thereby the evaluation of \(\varphi (\beta )\) at certain points. This equates to solving single-level optimal control problems.

In order to state the stationarity conditions, we first reformulate the condition \(\beta \in T\). Recall that T is a (non-degenerate) simplex. Thus, T can be written as the intersection of \(n+1\) half-spaces, \(T = \{\beta \in {\mathbb {R}}^n \big |K_T \beta \le b_T\}\), where \(K_T\in {\mathbb {R}}^{(n+1)\times n}\) is a suitable matrix and \(b_T\in {\mathbb {R}}^{n+1}\) a suitable vector. Clearly, at most n of these constraints may hold with equality simultaneously, and all constraints that are satisfied with equality are linearly independent. Thus, (OVRP\((T, \gamma _{k,T})\)) with \(P = {\text {Id}}\) takes the form

$$\begin{aligned} \min _{\beta ,y,u} \quad&F(\beta ,y,u) + \gamma _{k,T}\big (f(\beta , y, u) - \xi _T(\beta )\big )\\ \text {s.t.} \quad&K_T\beta -b_T \le 0, \\&A y - B u = 0, \\ \quad&u \in U_{\textrm{ad}}. \end{aligned}$$
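For the implementation, the half-space data \((K_T, b_T)\) can be assembled directly from the \(n+1\) vertices of the non-degenerate simplex T. The following NumPy sketch (the routine and its name are not taken from the paper) computes a normal vector for each facet via an SVD-based null-space computation and orients the inequality so that the omitted vertex, and hence T, satisfies it.

```python
import numpy as np

def simplex_halfspaces(vertices):
    """Sketch: half-space description {beta : K beta <= b} of a non-degenerate
    simplex T given by its n+1 vertices (the rows of `vertices`)."""
    V = np.asarray(vertices, dtype=float)   # shape (n+1, n)
    n = V.shape[1]
    K = np.zeros((n + 1, n))
    b = np.zeros(n + 1)
    for i in range(n + 1):
        facet = np.delete(V, i, axis=0)     # the n vertices spanning the facet opposite V[i]
        D = facet[1:] - facet[0]            # (n-1) x n matrix of edge directions
        _, _, Vt = np.linalg.svd(D)         # last right-singular vector spans the null space
        normal = Vt[-1]
        offset = float(normal @ facet[0])
        if normal @ V[i] > offset:          # orient so that the omitted vertex lies inside
            normal, offset = -normal, -offset
        K[i], b[i] = normal, offset
    return K, b
```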

The KKT system for (OVRP\((T, \gamma _{k,T})\)) with direct penalization (\(P = {\text {Id}}\)) is given by

$$\begin{aligned} 0&= F'_\beta (\beta ,y,u)+ \gamma _{k,T} (f'_\beta (\beta ,y,u)- a_T) + K_T^\top z, \end{aligned}$$
(29a)
$$\begin{aligned} 0&= F'_y(\beta ,y,u)+ \gamma _{k,T} f'_y(\beta ,y,u)+ A^{\star }p, \end{aligned}$$
(29b)
$$\begin{aligned} 0&= F'_u(\beta ,y,u)+ \gamma _{k,T} f'_u(\beta ,y,u)- B^{\star }p + \nu , \end{aligned}$$
(29c)
$$\begin{aligned} 0&= Ay - Bu, \end{aligned}$$
(29d)
$$\begin{aligned} z&\ge 0 \wedge K_T\beta - b_T\le 0\wedge z^\top (K_T\beta - b_T) = 0, \end{aligned}$$
(29e)
$$\begin{aligned} \nu&\in {\mathcal {N}}_{U_{\textrm{ad}}}(u),\qquad u\in U_{\textrm{ad}}, \end{aligned}$$
(29f)

where \(p\in {H^{1}_{0}(\Omega )}\), \(z\in {\mathbb {R}}^{n+1}\), and \(\nu \in L^2(\Omega )\) are the Lagrange multipliers. The vector \(a_{T}\) refers to the derivative of the affine function \(\xi _T\) on the simplex T.

Lemma 5.1

The feasible point \((\beta ,y,u)\) is a local/global solution to (OVRP\((T, \gamma _{k,T})\)) if and only if there exist multipliers \(p\in {H^{1}_{0}(\Omega )}\), \(z\in {\mathbb {R}}^{n+1}\), and \(\nu \in L^2(\Omega )\) such that (29) holds.

The solution and the corresponding multipliers are unique.

Proof

“\(\Rightarrow \)”: We check that the Robinson regularity condition for the reformulated problem is satisfied. This condition reads

$$\begin{aligned} \begin{bmatrix} A&{}\quad -B&{}\quad 0\\ 0&{}\quad 0&{}\quad K_T \end{bmatrix} \begin{pmatrix} H^{1}_{0}(\Omega )\\ {\mathcal {R}}_{U_{\textrm{ad}}}(u)\\ {\mathbb {R}}^{n} \end{pmatrix} - \begin{pmatrix} \{0\}\\ {\text {cone}}( {\mathbb {R}}_{-}^{n+1} - (K_T\beta - b_T)) \end{pmatrix} = \begin{pmatrix} H^{-1}(\Omega )\\ {\mathbb {R}}^{n+1} \end{pmatrix}. \end{aligned}$$

The two lines of the equation are independent of each other. By assumption, A is bijective, i.e., \(A(H^{1}_{0}(\Omega )) = H^{-1}(\Omega )\). For the second line we recall that the Robinson regularity condition is equivalent to the Mangasarian–Fromovitz condition for standard nonlinear optimization problems, see [28, p. 71]. Thus, the second line is satisfied since we have assumed that the simplex T is non-degenerate, i.e., we even have the linear-independence constraint qualification for the system \(K_T \beta \le b_T\). This shows the existence of multipliers, see [28, Theorem 3.9].

“\(\Leftarrow \)”: This is clear since (OVRP\((T, \gamma _{k,T})\)) is a convex problem.

It remains to address the uniqueness. The uniqueness of the solution follows from the strict convexity of the objective. The second line of the KKT system gives uniqueness of the adjoint p, since A is an isomorphism. Similarly, one gets uniqueness of \(\nu \) from the third line. Regarding uniqueness of z, we observe that the matrix \(K_T\) describing a non-degenerate simplex has rank n, even after removing an arbitrary row. Additionally, there exists at least one inactive constraint, so that the corresponding component of z vanishes. After removing this component from z and the respective column from \(K^\top _T\) in the first line of (29), z is obtained by inverting a square matrix of full rank. Thus, z is unique. \(\square \)

We introduce two auxiliary functions \(h,{{\hat{h}}}:(0,\infty )^n\times H_0^1(\Omega )\rightarrow {\mathbb {R}}\) via

$$\begin{aligned} \begin{aligned} {{\hat{h}}}(\beta ,y)&{:}{=}\frac{1}{2}\Vert y-y_m\Vert ^2_{L^2(\Omega )} + \gamma _{k,T}\left( \sum _{i=1}^{n} \frac{1}{2\beta _{i}} \Vert C_{i}y-y_{d,i}\Vert ^{2}_{L^{2}(\Omega )} -\xi _T(\beta ) \right) ,\\ h(\beta ,y)&{:}{=}{{\hat{h}}}(\beta ,y) +\frac{\sigma _\beta }{2} \sum _{i=1}^n \left( \frac{1}{\beta _i} \right) ^2. \end{aligned} \end{aligned}$$
(30)

Note that h represents the part of the objective function of (OVRP\((T, \gamma _{k,T})\)) that does not depend on u.

Recall that \(K_T\in {\mathbb {R}}^{(n+1)\times n}\), \(A:H_0^1(\Omega )\rightarrow H^{-1}(\Omega )\), \(B:L^2(\Omega )\rightarrow H^{-1}(\Omega )\) are bounded linear operators and that A is invertible. We define the function

\(W:(0,\infty )^n \times H_0^1(\Omega ) \times L^2(\Omega ) \times {\mathbb {R}}^{n+1} \times H_0^1(\Omega ) \rightarrow {\mathbb {R}}^{n} \times H^{-1}(\Omega ) \times L^2(\Omega ) \times {\mathbb {R}}^{n+1} \times H^{-1}(\Omega ) \) via

$$\begin{aligned} W(\beta ,y,u,z,p) {:}{=}\begin{pmatrix} h'_\beta (\beta ,y) + K^\top _T z \\ h'_y(\beta ,y) + A^{\star }p \\ u - \min \bigl (\max ((B^{\star }p+\sigma _{u}u_{m}) / {\hat{\sigma }},u_a),u_b\bigr ) \\ \max (K_T\beta -b_T,-z) \\ Ay-Bu \end{pmatrix} \end{aligned}$$
(31)

with \({\hat{\sigma }} {:}{=}\sigma _u + \gamma _{k,T}\sigma _l\). Now we discuss the relation between the roots of W and the optimality system.

Lemma 5.2

Let \(\beta \in T\), \(y\in H_0^1(\Omega )\), \(u\in L^2(\Omega )\) be given. Then \((\beta ,y,u)\) is the solution of (OVRP\((T, \gamma _{k,T})\)) if and only if there exist \(z\in {\mathbb {R}}^{n+1}\), \(p\in H_0^1(\Omega )\) such that \(W(\beta ,y,u,z,p)=0\) with h as defined in (30).

Proof

In view of Lemma 5.1, we have to check that (29) is equivalent to \(W(\beta ,y,u,z,p)=0\).

It is clear that (29a), (29b) and (29d) are equivalent to lines 1, 2 and 5 in (31). The complementarity conditions (29e) on z and \(b_T - K_T \beta \) can be reformulated via

$$\begin{aligned} z&\ge 0,\quad b_T-K_T\beta \ge 0,\quad z^\top (b_T-K_T\beta )=0\\ \Longleftrightarrow \quad 0&=\min (z,b_T-K_T\beta ) \quad \Longleftrightarrow \quad 0 =\max (-z,K_T\beta -b_T). \end{aligned}$$

A similar reformulation is standard for treating the gradient equation (29c) in combination with the inclusion (29f), see [7, Theorem 2.28]. These two equations are equivalent to the projection formula

$$\begin{aligned} u = {\text {Proj}}_{U_{\textrm{ad}}}\big ((B^{\star }p+\sigma _{u}u_{m}) / {\hat{\sigma }}\big ) = \min \bigl (\max ((B^{\star }p+\sigma _{u}u_{m}) / {\hat{\sigma }},u_a),u_b\bigr ), \end{aligned}$$

i.e., line 3 in (31). Note that \(\nu \) does not appear in (31), but it is uniquely determined by (29c). This shows that the KKT system is equivalent to \(W(\beta ,y,u,z,p)=0\). This finishes the proof. \(\square \)

5.3 Semismooth Newton method for the subproblems

We have shown in Lemma 5.2 that we can characterize the solution of the subproblem (OVRP\((T, \gamma _{k,T})\)) via the nonlinear operator W. An established way to solve problems with this structure is the semismooth Newton method, cf. [32]. To this end, we verify the Newton differentiability of W and the invertibility of the Newton derivative. In order to state the Newton derivative of W, we need to define some index sets and corresponding operators. We define

$$\begin{aligned} \begin{aligned} {\mathcal {A}}_1(\beta ,z)&:= \{i\in \{1,\ldots ,n+1\}\big |(K_T\beta -b_T)_i \ge -z_i\}, \\ {\mathcal {A}}_2(\beta ,z)&:= \{i\in \{1,\ldots ,n+1\}\big |(K_T\beta -b_T)_i < -z_i\}, \\ {\mathcal {A}}_3(p)&:= \{u_a \le (B^{\star }p+\sigma _{u}u_{m} )/{\hat{\sigma }} \le u_b\} \subset \Omega \end{aligned} \end{aligned}$$

and for \(i\in \{1,2\}\) we write \(\chi _{{\mathcal {A}}_i(\beta ,z)} \in {\mathbb {R}}^{(n+1)\times (n+1)}\) for the diagonal matrix where the k-th diagonal entry is 1 if \(k\in {\mathcal {A}}_i(\beta ,z)\) and 0 otherwise. Similarly, we write \(\chi _{{\mathcal {A}}_3(p)}: L^2(\Omega ) \rightarrow L^2(\Omega )\) for the multiplication operator corresponding to the characteristic function of \({\mathcal {A}}_3(p)\) on the space \(L^2(\Omega )\).

Lemma 5.3

The mapping W is Newton differentiable and a Newton derivative of W at a point \((\beta ,y,u,z,p)\) is given by the block operator

$$\begin{aligned} W'(\beta ,y,u,z,p) = \begin{bmatrix} h''_{\beta \beta }(\beta ,y) &{}\quad h''_{\beta y}(\beta ,y) &{}\quad 0 &{}\quad K_T^\top &{}\quad 0 \\ h''_{y\beta }(\beta ,y) &{}\quad h''_{yy}(\beta ,y) &{} \quad 0 &{}\quad 0 &{}\quad A^{\star }\\ 0 &{}\quad 0 &{}\quad {\text {Id}}&{}\quad 0 &{}\quad -{\hat{\sigma }}^{-1}\chi _{{\mathcal {A}}_3(p)}B^{\star }\\ \chi _{{\mathcal {A}}_1(\beta ,z)}K_T &{}\quad 0 &{} \quad 0 &{}\quad -\chi _{{\mathcal {A}}_2(\beta ,z)} &{} \quad 0 \\ 0 &{}\quad A &{}\quad -B &{}\quad 0 &{}\quad 0 \end{bmatrix}. \end{aligned}$$

Proof

To show Newton differentiability of W, one has to pay attention only to the third and fourth line as the others are Fréchet differentiable. For the fourth line one can use that in finite dimensions the composition of Newton differentiable functions is Newton differentiable cf. [33, Proposition 2.9] and combine this with the fact that \(\max (\cdot ,\cdot )\) is Newton differentiable (see [33, Proposition 2.26]). Furthermore, [33, Theorem 3.49] can be used to show the Newton differentiability of the third line: If we use \(m=3\), \(\psi (s)=\min (\max (s_1,s_2),s_3)\), \(r=r_i=2\), \(G(p)= ((B^{\star }p+\sigma _u u_m)/{\hat{\sigma }}, u_a,u_b)\) in the setting of [33, Section 3.3], then the required [33, Assumption 3.32] is satisfied with \(q_i=q'>2\), by the higher regularity \(B^{\star }\in L[H_0^1(\Omega ), L^{q'}(\Omega )]\).

Consequently, the function \(H_0^1(\Omega )\ni p\mapsto \min (\max ((B^{\star }p+\sigma _u u_m)/{\hat{\sigma }}, u_a),u_b)\) is Newton differentiable.

Now a Newton derivative can be obtained using direct calculations and utilizing the index sets that are introduced above. \(\square \)

The proof required a norm gap, which is ensured by the higher regularity \(B^{\star }\in L[H_0^1(\Omega ), L^{q'}(\Omega )]\) with \(q' > 2\) that is intrinsic to our problem setting. This allowed us to prove the Newton differentiability of W in the spaces where W is defined. In particular, when adapting the algorithm from [33, Algorithm 3.10] (see Algorithm 3), the smoothing step can be skipped. This smoothing step is designed to treat the more general case in which Newton differentiability can only be shown by artificially introducing a norm gap, while the boundedness of the inverse of the derivative can only be shown in the original setting (cf. [33, Introduction to Section 3]). Note that step (S3) is only well defined as long as all \(\beta _i\) are positive, since the function W is only defined for positive \(\beta \). This restriction, however, does not affect the local convergence of Algorithm 3.

[Algorithm 3]
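The listing of Algorithm 3 is not reproduced here; a minimal sketch of a discretized semismooth Newton iteration for \(W(\beta ,y,u,z,p)=0\) reads as follows, where assemble_W and assemble_newton_derivative are hypothetical placeholders for discretizations of (31) and of the Newton derivative from Lemma 5.3.

```python
import numpy as np

def semismooth_newton(x0, assemble_W, assemble_newton_derivative,
                      tol=1e-10, max_iter=50):
    """Sketch of a semismooth Newton iteration for a discretization of
    W(beta, y, u, z, p) = 0; both assemble_* callbacks are hypothetical."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(max_iter):
        Wx = assemble_W(x)                    # residual, discretization of (31)
        if np.linalg.norm(Wx) <= tol:
            return x                          # (approximate) root found
        DW = assemble_newton_derivative(x)    # active sets A_1, A_2, A_3 enter here
        x = x - np.linalg.solve(DW, Wx)       # Newton step
    return x
```

Theorem 5.8 below guarantees local q-superlinear convergence of such an iteration for initial values sufficiently close to the solution.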

To prove fast convergence of the semismooth Newton method, the uniform invertibility of the Newton derivative \(W'(\beta ,y,u,z,p)\) is needed. To this end, we convert the Newton derivative \(W'(\beta ,y,u,z,p)\) into a self-adjoint operator, since this type of operator is easier to handle. We fix a point \((\beta ,y,u,z,p)\). We use the notation \(I_1\in {\mathbb {R}}^{(n+1)\times l_1}\), \(I_2\in {\mathbb {R}}^{(n+1)\times (n+1 - l_1)}\), \(I_3: L^2({\mathcal {A}}_3(p))\rightarrow L^2(\Omega )\), \(I_4: L^2(\Omega {\setminus } {\mathcal {A}}_3(p))\rightarrow L^2(\Omega )\) to refer to the canonical embedding operators that correspond to the index sets \({\mathcal {A}}_1(\beta ,z)\), \({\mathcal {A}}_2(\beta ,z)\), \({\mathcal {A}}_3(p)\), \(\Omega \setminus {\mathcal {A}}_3(p)\). Here \(l_1\) denotes the cardinality of \({\mathcal {A}}_1(\beta , z)\). We mention that \(I_1^\top , I_2^\top , I_3^{\star }, I_4^{\star }\) are the corresponding restriction operators and, consequently,

$$\begin{aligned} \begin{aligned} \chi _{{\mathcal {A}}_1(\beta ,z)}&= I_1I_1^\top ,&\chi _{{\mathcal {A}}_2(\beta ,z)}&= I_2I_2^\top ,&\chi _{{\mathcal {A}}_3(p)}&= I_3I_3^{\star }, \\ {\text {Id}}_{{\mathbb {R}}^{n+1}}&=I_1I_1^\top + I_2I_2^\top ,&{\text {Id}}_{L^2(\Omega )}&=I_3I_3^{\star }+ I_4I_4^{\star }. \end{aligned} \end{aligned}$$

We define the linear operator \( {{\hat{W}}}'\) from \( {\mathbb {R}}^n \times H_0^1(\Omega ) \times L^2({\mathcal {A}}_3(p)) \times {\mathbb {R}}^{l_1} \times H^1_0(\Omega ) \) to \( {\mathbb {R}}^n \times H^{-1}(\Omega ) \times L^2({\mathcal {A}}_3(p)) \times {\mathbb {R}}^{l_1} \times H^{-1}(\Omega ) \) via

$$\begin{aligned} {{\hat{W}}}':= \begin{bmatrix} h''_{\beta \beta }(\beta ,y) &{}\quad h''_{\beta y}(\beta ,y) &{}\quad 0 &{}\quad K_T^\top I_1 &{}\quad 0 \\ h''_{y\beta }(\beta ,y) &{}\quad h''_{yy}(\beta ,y) &{}\quad 0 &{}\quad 0 &{}\quad A^{\star }\\ 0 &{}\quad 0 &{}\quad {\hat{\sigma }} {\text {Id}}&{}\quad 0 &{} \quad -(B I_3)^{\star }\\ I_1^\top K_T &{} \quad 0 &{}\quad 0 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad A &{}\quad -B I_3 &{}\quad 0 &{}\quad 0 \end{bmatrix}. \end{aligned}$$

It can be seen that \({{\hat{W}}}'\) is self-adjoint. Note that the spaces on which \({{\hat{W}}}'\) operates depend on \(\beta ,z,p\). The next lemma gives us a relation between \({{\hat{W}}}'\) and \(W'(\beta ,y,u,z,p)\).

Lemma 5.4

Let \((\beta ,y,u,z,p) \in (0,\infty )^n \times H_0^1(\Omega ) \times L^2(\Omega ) \times {\mathbb {R}}^{n+1} \times H_0^1(\Omega )\) be fixed. Furthermore, let two points \((\beta _1,y_1,u_1,z_1,p_1) \in {\mathbb {R}}^n \times H_0^1(\Omega ) \times L^2(\Omega ) \times {\mathbb {R}}^{n+1} \times H_0^1(\Omega )\) and \((\beta _2,y_2,u_2,z_2,p_2) \in {\mathbb {R}}^n \times H^{-1}(\Omega ) \times L^2(\Omega ) \times {\mathbb {R}}^{n+1} \times H^{-1}(\Omega )\) be given. Then

$$\begin{aligned} W'(\beta ,y,u,z,p) \begin{pmatrix} \beta _1 \\ y_1 \\ u_1 \\ z_1 \\ p_1 \end{pmatrix} = \begin{pmatrix} \beta _2 \\ y_2 \\ u_2 \\ z_2 \\ p_2 \end{pmatrix} \end{aligned}$$
(32)

holds if and only if

$$\begin{aligned} \begin{aligned} {{\hat{W}}}' \begin{pmatrix} \beta _1 \\ y_1 \\ I_3^{\star }u_1 \\ I_1^\top z_1 \\ p_1 \end{pmatrix}&= \begin{pmatrix} \beta _2 - K_T^\top I_2I_2^\top z_2 \\ y_2 \\ {\hat{\sigma }} I_3^{\star }u_2 \\ I_1^\top z_2 \\ p_2 + BI_4I_4^{\star }u_2 \end{pmatrix}, \qquad \begin{aligned} I_4^{\star }u_1&= I_4^{\star }u_2, \\ -I_2^\top z_1&= I_2^\top z_2 \end{aligned} \end{aligned} \end{aligned}$$
(33)

hold.

Proof

The proof can be carried out by direct calculation. We first assume (32) to be valid. Computing the application of \({{\hat{W}}}'\) yields

$$\begin{aligned} {{\hat{W}}}' \begin{pmatrix} \beta _1 \\ y_1 \\ I_3^{\star }u_1 \\ I_1^\top z_1 \\ p_1 \end{pmatrix} = \begin{pmatrix} h''_{\beta \beta }(\beta ,y)\beta _{1} + h''_{\beta y}(\beta ,y)y_{1} + K_T^\top I_{1}I_{1}^\top z_{1} \\ h''_{y\beta }(\beta ,y)\beta _1 + h''_{y y}(\beta ,y)y_{1} + A^{\star }p_1 \\ I^{\star }_{3}({\hat{\sigma }} u_{1} -B^{\star }p_{1}) \\ I_{1}^\top K_T\beta _{1} \\ Ay_{1} -BI_{3}I^{\star }_{3}u_{1} \end{pmatrix}. \end{aligned}$$

We use the definition of the index sets and receive the equivalent expression

$$\begin{aligned} {{\hat{W}}}' \begin{pmatrix} \beta _1 \\ y_1 \\ I_3^{\star }u_1 \\ I_1^\top z_1 \\ p_1 \end{pmatrix} = \begin{pmatrix} h''_{\beta \beta }(\beta ,y)\beta _{1} + h''_{\beta y}(\beta ,y)y_{1} + K_T^\top z_{1} - K_T^\top I_{2}I_{2}^\top z_{1} \\ y_2\\ {\hat{\sigma }} I_{3}^{\star }(u_{1} - {\hat{\sigma }}^{-1}\chi _{{\mathcal {A}}_3(p)}B^{\star }p_{1}) \\ I^\top _{1}( \chi _{{\mathcal {A}}_1(\beta ,z)} K_T\beta _{1} - \chi _{{\mathcal {A}}_2(\beta ,z)} z_1)\\ Ay_{1} -Bu_{1} + BI_{4}I^{\star }_{4}u_{1} \end{pmatrix}, \end{aligned}$$
(34)

where we used \(I_1I_1^\top = {\text {Id}}_{{\mathbb {R}}^{n+1}} - I_2I_2^\top \), \(I_3^{\star }= I_3^{\star }\chi _{{\mathcal {A}}_3(p)}\), \(I_1^{\top }= I_1^{\top }\chi _{{\mathcal {A}}_1(\beta ,z)}\), \(I^\top _1 \chi _{{\mathcal {A}}_2(\beta ,z)} = 0\), and \(I_3I_3^{\star } = {\text {Id}}_{L^2(\Omega )} - I_4I_4^{\star }\). Using the description of \(W'(\beta ,y,u,z,p)\) yields

$$\begin{aligned} {{\hat{W}}}' \begin{pmatrix} \beta _1 \\ y_1 \\ I_3^{\star }u_1 \\ I_1^\top z_1 \\ p_1 \end{pmatrix} = \begin{pmatrix} \beta _{2} - K_T^\top I_{2}I_{2}^\top z_{1} \\ y_{2} \\ {\hat{\sigma }} I^{\star }_{3}u_{2}\\ I_{1}^\top z_{2}\\ p_{2} + BI_{4}I^{\star }_{4}u_{1} \end{pmatrix}. \end{aligned}$$
(35)

Note that the claimed relations \(I_4^{\star }u_1 = I_4^{\star }u_2\) and \(-I_2^\top z_1 = I_2^\top z_2\) follow from the equations \(u_1 - {\hat{\sigma }}^{-1} \chi _{{\mathcal {A}}_3(p)} B^{\star }p_1 = u_2\) and \(\chi _{{\mathcal {A}}_1(\beta ,z)} K_T\beta _1 - \chi _{{\mathcal {A}}_2(\beta ,z)} z_1 = z_2\) (which are part of (32)). With these relations, we directly get (33) from (35).

For the other direction, we first get (35) directly from (33). Then, a comparison with (34) yields the equations for \(\beta _2\), \(y_2\), \(p_2\), and \(I_3^{\star }(u_1 - {\hat{\sigma }}^{-1} \chi _{{\mathcal {A}}_3(p)} B^{\star }p_1) = I_3^{\star }u_2\), \(I_1^\top (\chi _{{\mathcal {A}}_1(\beta ,z)} K_T\beta _1 - \chi _{{\mathcal {A}}_2(\beta ,z)} z_1) = I_1^\top z_2\). The final expression (32) follows by utilizing \(I_4^{\star }u_1 = I_4^{\star }u_2\) and \(-I_2^\top z_1 = I_2^\top z_2\) again. \(\square \)

In order to ensure the uniform invertibility of the operators \({{\hat{W}}}'\), we state an auxiliary lemma.

Lemma 5.5

Let X, Y be Hilbert spaces and \({{\hat{A}}}:X\rightarrow X^{\star }\), \({{\hat{B}}}:X\rightarrow Y^{\star }\) be bounded linear operators. Let the bounded linear operator \({{\hat{D}}} :X\times Y\rightarrow X^{\star }\times Y^{\star }\) be defined via

$$\begin{aligned} {{\hat{D}}}= \begin{bmatrix} {{\hat{A}}} &{} \quad {{\hat{B}}}^{\star }\\ {{\hat{B}}} &{}\quad 0 \end{bmatrix}. \end{aligned}$$

Suppose that \({{\hat{B}}}\) is surjective and that \({{\hat{A}}}\) is coercive on \(\ker {{\hat{B}}}\), i.e. there exists a constant \({\hat{\gamma }} > 0\) such that \({\langle }{{{\hat{A}}}x,x}{\rangle }\ge {\hat{\gamma }}{\Vert }x{\Vert }_X^2\) for all \(x\in \ker {{\hat{B}}}\).

Then \({{\hat{D}}}\) is continuously invertible. Moreover, the estimate

$$\begin{aligned} {\Vert }{{\hat{D}}}^{-1}{\Vert }\le 4c^5 \end{aligned}$$

holds, where \(c:=\max (1,{\hat{\gamma }}^{-1},\alpha ,{\Vert }{{\hat{A}}}{\Vert })\), \(\alpha >0\) is a constant such that \(B^1_{Y^{\star }}(0)\subset {{\hat{B}}}(B^\alpha _X(0))\), and \({\hat{\gamma }}>0\) is the coercivity constant from above.

Proof

This result follows from [34, Proposition II.1.3]. Note that we have \({{\hat{B}}}(X)=Y^{\star }\) and \(\ker {{\hat{B}}}^{\star }=\{0\}\). \(\square \)

Lemma 5.6

Let \((\beta ,y,u,z,p)\in (0,\infty )^n\times H_0^1(\Omega )\times L^2(\Omega ) \times {\mathbb {R}}^{n+1}\times H_0^1(\Omega )\) be fixed. Suppose that \(I_1^\top K_T\in {\mathbb {R}}^{l_1\times n}\) is surjective, i.e. that the rows of \(K_T\) which correspond to the index set \({\mathcal {A}}_1(\beta ,z)\) are linearly independent. Then, the operator \(W'(\beta ,y,u,z,p)\) is continuously invertible. Moreover, we have \({\Vert }W'(\beta ,y,u,z,p)^{-1}{\Vert }\le C\) for a constant \(C>0\), which does not depend on \(\beta ,y,u,z,p\) but can depend on an upper bound of \({\Vert }y{\Vert }\), on the upper and lower bounds of \(\beta \), and on \(K_T,A,B,h,{\hat{\sigma }},\sigma _r\).

Proof

We start with showing that \({{\hat{W}}}'\) is continuously invertible, which we will do using Lemma 5.5. We notice that the operator \({{\hat{W}}}'\) has the required block structure if we set

$$\begin{aligned} \begin{aligned} {{\hat{A}}}&:= \begin{bmatrix} h''_{\beta \beta }(\beta ,y) &{}\quad h''_{\beta y}(\beta ,y) &{}\quad 0 \\ h''_{y\beta }(\beta ,y) &{}\quad h''_{yy}(\beta ,y) &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad {\hat{\sigma }} I \end{bmatrix}, \\ {{\hat{A}}}&:{\mathbb {R}}^n\times H_0^1(\Omega )\times L^2({\mathcal {A}}_3(p)) \rightarrow {\mathbb {R}}^n\times H^{-1}(\Omega )\times L^2({\mathcal {A}}_3(p)), \\ {{\hat{B}}}&:= \begin{bmatrix} I_1^\top K_T &{} \quad 0 &{}\quad 0 \\ 0 &{} \quad A &{} \quad -B I_3 \end{bmatrix}, \\ {{\hat{B}}}&:{\mathbb {R}}^n\times H_0^1(\Omega )\times L^2({\mathcal {A}}_3(p)) \rightarrow {\mathbb {R}}^{l_1}\times H^{-1}(\Omega ). \end{aligned} \end{aligned}$$

Since A is invertible and \(I_1^\top K_T\) is surjective by assumption, it follows that \({{\hat{B}}}\) is surjective. In order to show that \({{\hat{W}}}'\) is continuously invertible, it remains to show that \({{\hat{A}}}\) is coercive on \(\ker {{\hat{B}}}\).

Let \(({\hat{\beta }},{{\hat{y}}},{{\hat{u}}})\in \ker {{\hat{B}}}\) be given. Then

$$\begin{aligned} {\Vert }({{\hat{y}}},{{\hat{u}}}){\Vert }_{H_0^1(\Omega ) \times L^2({\mathcal {A}}_3(p))} = {\Vert }(A^{-1}BI_3{{\hat{u}}},{{\hat{u}}}){\Vert }_{H_0^1(\Omega ) \times L^2({\mathcal {A}}_3(p))} \le (1+{\Vert }A^{-1}B{\Vert }){\Vert }{{\hat{u}}}{\Vert }_{L^2({\mathcal {A}}_3(p))} \end{aligned}$$

holds. Recall from (30) that \(h(\beta ,y)={{\hat{h}}}(\beta ,y) +\frac{\sigma _\beta }{2}\sum _{i=1}^n \left( \frac{1}{\beta _i}\right) ^2\) and that \({{\hat{h}}}\) is convex, and that for \(\frac{\sigma _\beta }{2}\sum _{i=1}^n \left( \frac{1}{\beta _i}\right) ^2\) we can directly calculate the second derivative, which is a diagonal matrix with strictly positive entries, if \(\beta _i > 0\). Therefore, there exists a constant \(\sigma _r>0\) for which

$$\begin{aligned} \left( h''(\beta ,y)({\hat{\beta }},{{\hat{y}}})\right) ({\hat{\beta }},{{\hat{y}}}) \ge \sigma _r {\hat{\beta }}^\top {\hat{\beta }} \end{aligned}$$
(36)

holds, where \(\sigma _r\) depends on the upper bound of \(\beta _i\). This implies

$$\begin{aligned} \begin{aligned} {\langle }{ {{\hat{A}}} ({\hat{\beta }},{{\hat{y}}},{{\hat{u}}}),({\hat{\beta }},{{\hat{y}}},{{\hat{u}}})}{\rangle }&\ge \sigma _r{\Vert }{\hat{\beta }}{\Vert }^2_{{\mathbb {R}}^n}+{\hat{\sigma }}{\Vert }{{\hat{u}}}{\Vert }_{L^2({\mathcal {A}}_3(p))}^2 \\ {}&\ge \sigma _r{\Vert }{\hat{\beta }}{\Vert }^2_{{\mathbb {R}}^n} + {\hat{\sigma }}(1+{\Vert }A^{-1}B{\Vert })^{-2} {\Vert }({{\hat{y}}},{{\hat{u}}}){\Vert }^2_{H_0^1(\Omega ) \times L^2({\mathcal {A}}_3(p))} \\ {}&\ge {\hat{\gamma }} {\Vert }({\hat{\beta }},{{\hat{y}}},{{\hat{u}}}){\Vert }^2_{{\mathbb {R}}^n \times H_0^1(\Omega ) \times L^2({\mathcal {A}}_3(p))}, \end{aligned} \end{aligned}$$

where \({\hat{\gamma }}>0\) is a suitable constant. Thus \({{\hat{A}}}\) is coercive on \(\ker {{\hat{B}}}\). It follows from Lemma 5.5 that \({{\hat{W}}}'\) is continuously invertible. Because \({{\hat{B}}}\) is surjective, there exists a constant \(\alpha >0\) such that \(B^1(0)\subset {{\hat{B}}}(B^\alpha (0))\). Since there are only finitely many possibilities for \(I_1\) and \(I_3\) is not needed for surjectivity, the constant \(\alpha \) can be chosen such that it is independent of \(I_1\) and \(I_3\). For \({\Vert }{{\hat{A}}}{\Vert }\) we note that it can be bounded by a constant which can depend on an upper bound on \({\Vert }y{\Vert }_{H_0^1(\Omega )}\) and a lower bound on \(\beta _i\).

It follows from Lemma 5.5 that the estimate \({\Vert }{{\hat{W}}}'^{-1}{\Vert }\le 4c^5\) holds for a suitable constant \( c>0\) which does not depend on \(\beta ,y,u,z,p\) but can depend on an upper bound of \({\Vert }y{\Vert }_{H_0^1(\Omega )}\), the lower bound of \(\beta _i\) and on \(K_T,A,B,h,{\hat{\sigma }},\sigma _r\).

Next, we combine this result with Lemma 5.4 to show the invertibility of \(W'(\beta ,y,u,z,p)\). Let \((\beta _2,y_2,u_2,z_2,p_2)\) be a right-hand side as in (32). Since \({{\hat{W}}}'\) is invertible, by Lemma 5.4 there exists a unique solution \((\beta _1,y_1,u_1,z_1,p_1)\) of (32). Using the estimate \({\Vert }{{\hat{W}}}'^{-1}{\Vert }\le 4c^5\) and (33), one gets an estimate of the form \({\Vert }(\beta _1,y_1,u_1,z_1,p_1){\Vert }\le C{\Vert }(\beta _2,y_2,u_2,z_2,p_2){\Vert }\), where \(C>0\) is a suitable constant that can depend on \(c,{\hat{\sigma }}, K_T, B\), the upper bound of \({\Vert }y{\Vert }_{H_0^1(\Omega )}\) and the bounds of \(\beta \). The constant C however, does not depend on \((\beta ,y,u,z,p)\) or any of the embedding operators \(I_1,I_2,I_3,I_4\). Since we can estimate the norm of the unique solution in (32) by the norm of the right-hand side, the claimed invertibility and estimate \({\Vert }W'(\beta ,y,u,z,p)^{-1}{\Vert }\le C\) follow. \(\square \)

Lemma 5.7

Let \((\beta ,y,u,z,p)\in Q\times H_0^1(\Omega )\times L^2(\Omega ) \times {\mathbb {R}}^{n+1}\times H_0^1(\Omega )\) be a point such that \(W(\beta ,y,u,z,p)=0\). Then the Newton derivative \(W'\) is uniformly continuously invertible in a neighborhood of \((\beta ,y,u,z,p)\).

Proof

We want to apply Lemma 5.6. We need to verify that \(I_1^\top K_T\) (which can depend on \(\beta \) and z) is surjective in a neighborhood.

From the definition of W, we get \(z\ge 0\), \(K_T\beta -b_T\le 0\) and \(z^\top (K_T\beta - b_T) =0\). In particular, \(\beta \in T\). Recall that T is a non-degenerate simplex. Thus, at most n constraints in the system \(K_T\beta \le b_T\) are active, and these active constraints are linearly independent. Furthermore, if \(i\in \{1,\ldots ,n+1\}\) is an index of an inactive constraint, we have \(z_i=0\) due to the complementarity condition, and therefore \(i\in {\mathcal {A}}_2(\beta ,z)\) and \(i\not \in {\mathcal {A}}_1(\beta ,z)\). Thus, \({\mathcal {A}}_1(\beta ,z)\) contains at most n elements. Therefore, the rows of \(K_T\) which correspond to the index set \({\mathcal {A}}_1(\beta ,z)\) are linearly independent, which yields that \(I_1^\top K_T\) is surjective for this particular \(\beta \), z.

If \(i\in {\mathcal {A}}_2(\beta ,z)\), then \(i\in {\mathcal {A}}_2({\hat{\beta }},{{\hat{z}}})\) holds also for \(({\hat{\beta }},{{\hat{z}}})\) that are sufficiently close to \((\beta ,z)\). Thus, \({\mathcal {A}}_1(\beta ,z)\) cannot get larger in a neighborhood of \((\beta ,z)\). Hence, the rows of \(K_T\) that correspond to \({\mathcal {A}}_1(\beta ,z)\) stay linearly independent in a neighborhood, i.e. \(I_1^\top K_T\) is surjective in a neighborhood of \((\beta ,z)\).

Now, to apply Lemma 5.6, we shrink the neighborhood, if necessary, such that \(\beta _i > \frac{1}{2b_i}\) holds for all i; this guarantees a positive lower bound on \(\beta _i\). The upper bound of \({\Vert }y{\Vert }_{H_0^1(\Omega )}\) is obtained from the coercivity of f with constant \(\gamma _{k,T}\mu \) (cf. Assumption 2.1(g)). Hence, with Lemma 5.6 there exists a constant \(C>0\) such that \({\Vert }W'(\beta ,y,u,z,p)^{-1}{\Vert }\le C\) in the considered neighborhood of \((\beta ,y,u,z,p)\). \(\square \)

Now we are ready to give our final theorem, which states that Algorithm 3 converges superlinearly.

Theorem 5.8

Let the function W be given as in (31). Further, we denote by \((\beta _{k,T},y_{k,T},u_{k,T})\) the unique global solution of (OVRP\((T, \gamma _{k,T})\)) and by \(z_{k,T}\), \(p_{k,T}\) the corresponding multipliers that satisfy (29). Then there exists a neighborhood of the point \((\beta _{k,T},y_{k,T},u_{k,T},z_{k,T},p_{k,T})\) such that for all initial values \((\beta _0,y_0,u_0,p_0,z_0)\) from this neighborhood, the semismooth Newton method from Algorithm 3 either terminates in the i-th step with \((\beta _i,y_i,u_i,z_i,p_i) = (\beta _{k,T},y_{k,T},u_{k,T},z_{k,T},p_{k,T})\) or generates a sequence that converges q-superlinearly to \((\beta _{k,T},y_{k,T},u_{k,T},z_{k,T},p_{k,T})\) in \({\mathbb {R}}^n\times H^1_0(\Omega )\times L^2(\Omega )\times {\mathbb {R}}^{n+1}\times H^1_0(\Omega )\).

Proof

We have already established that the function W is Newton differentiable at the solution of (OVRP\((T, \gamma _{k,T})\)) (see Lemma 5.3). We have proven in Lemma 5.7 that the Newton derivative from Lemma 5.3 is invertible and that the norm of its inverse is bounded on a neighborhood of the solution. The result is now a direct application of [33, Theorem 3.13]. In particular, we do not need a smoothing step, since the spaces in which W is Newton differentiable coincide with the spaces in which the Newton derivative is uniformly invertible, see Lemmas 5.3 and 5.6. \(\square \)

6 Numerical experiments

In this section we present an example for Algorithm 2 to illustrate the convergence behavior towards a global minimizer. To this end, we consider the parameter identification problem

$$\begin{aligned} \begin{aligned} \min _{\beta ,y,u} \qquad&\frac{1}{2}{\Vert }y-y_m{\Vert }^2_{L^2(\Omega )} + \frac{\sigma _u}{2}{\Vert }u-u_m{\Vert }^2_{L^2(\Omega )} +\frac{\sigma _\beta }{2}{\Vert }\beta - \beta _m{\Vert }^2_{{\mathbb {R}}^n} {=}{:}F_1(\beta ,y,u)\\\ \text {s.t.} \qquad&\beta \in Q,\\&(y,u)\in \Psi (\beta ), \end{aligned} \end{aligned}$$
(37)

where \(\Psi :{\mathbb {R}}^2\rightarrow H^1_0(\Omega ) \times L^2(\Omega )\) denotes the solution map that assigns to the parameter \(\beta \) the unique solution of the lower-level problem

$$\begin{aligned} \begin{aligned} \min _{y,u} \qquad&\frac{1}{2\beta _1}{\Vert }y-y_{d,1}{\Vert }^2_{L^2(\Omega )} + \frac{1}{2\beta _2}{\Vert }y-y_{d,2}{\Vert }^2_{L^2(\Omega )} + \frac{\sigma _l}{2}{\Vert }u{\Vert }^2_{L^2(\Omega )} {=}{:}f(\beta ,y,u)\\ \text {s.t.} \qquad&0 = -\Delta y - u\qquad \text {in}\ \Omega ,\\&0 = y \qquad \qquad \qquad \text {on}\ \partial \Omega ,\\&u \in U_{\textrm{ad}}. \end{aligned} \end{aligned}$$
(38)

Let us define the data present in this bilevel optimization problem. We use the sets \(Q{:}{=}[0.1, 1]^2\) and \(\Omega = ( -1,1 )^2\) (discretized with a mesh size of 0.2) and the two possible desired states

$$\begin{aligned} y_{d,1}&:\Omega \rightarrow {\mathbb {R}},\qquad y_{d,1}(x) = \sin (\pi x_1)\sin (\pi x_2),\\ y_{d,2}&:\Omega \rightarrow {\mathbb {R}},\qquad y_{d,2}(x) = (x_1+1)(x_1-1)(x_2+1)(x_2-1). \end{aligned}$$

The regularization parameter for the lower level is \(\sigma _l = 0.03\). Additionally, we introduce box constraints for the control via

$$\begin{aligned} u\in U_{\textrm{ad}}&{:}{=}\{u\in L^2(\Omega )\ |\ u_a\le u\le u_b \text { a.e. on } \Omega \},\\ u_a(x)&{:}{=}0, \qquad u_b(x) {:}{=}3. \end{aligned}$$

It turns out that these constraints are active on parts of the domain for the choice of the parameter \(\beta = (0.6,0.3)^\top \). For the upper level we fix the parameters \(\sigma _u = 0.05\) and \(\sigma _\beta = 10^{-5}\). We also choose \(\beta _m{:}{=}(0.6,0.3)^\top \) and \((y_m,u_m){:}{=}\Psi ((0.6,0.3)^\top )\), i.e. the objective value of \(F_1\) is zero at the solution of the lower-level problem with \(\beta =\beta _m\). We call this setting “fully reachable target state”. We mention that, in the implementation, the functions \(y_m\), \(u_m\) are not analytical solutions but are computed directly as finite element solutions of the lower-level problem.
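Although the reported experiments use finite element solutions and the semismooth Newton solver from Sect. 5.3, the following Python sketch illustrates how an approximation of \((y_m,u_m)=\Psi (\beta _m)\) can be obtained. It replaces the finite element discretization by a coarse finite-difference grid and the solver by a simple projected gradient method; the grid, step size and tolerances are our own simplifications and not part of the implementation used in this paper.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

h = 0.2
n1d = int(2 / h) - 1                        # interior grid points per direction on (-1, 1)
x = np.linspace(-1 + h, 1 - h, n1d)
X1, X2 = np.meshgrid(x, x, indexing="ij")

# 2D Dirichlet Laplacian (5-point stencil), so that A y = u discretizes -Δy = u, y = 0 on the boundary
T1 = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(n1d, n1d)) / h**2
I1 = sp.identity(n1d)
A = (sp.kron(T1, I1) + sp.kron(I1, T1)).tocsc()
solve_A = spla.factorized(A)                # reuse the factorization for S = A^{-1}

y_d1 = (np.sin(np.pi * X1) * np.sin(np.pi * X2)).ravel()
y_d2 = ((X1 + 1) * (X1 - 1) * (X2 + 1) * (X2 - 1)).ravel()
sigma_l, u_a, u_b = 0.03, 0.0, 3.0

def solve_lower_level(beta, max_iter=2000, step=1.0, tol=1e-10):
    """Projected gradient on the reduced objective j(u) = f(beta, S u, u) with u in [u_a, u_b]."""
    u = np.zeros(A.shape[0])
    for _ in range(max_iter):
        y = solve_A(u)                                    # state y = S u
        res = (y - y_d1) / beta[0] + (y - y_d2) / beta[1]
        p = solve_A(res)                                  # adjoint state (A is symmetric)
        u_new = np.clip(u - step * (sigma_l * u + p), u_a, u_b)
        if np.linalg.norm(u_new - u) <= tol:
            u = u_new
            break
        u = u_new
    return solve_A(u), u                                  # approximation of Psi(beta) = (y, u)

# rough stand-in for the target data (y_m, u_m) = Psi(beta_m)
y_m, u_m = solve_lower_level(np.array([0.6, 0.3]))
```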

For the setting of this section, Assumption 2.1 is valid. Additionally, for the chosen functionals and parameters we can apply the semismooth Newton method from Sect. 5.3 to solve the subproblems (OVRP\((T, \gamma _{k,T})\)). In order to illustrate some fundamental properties of the proposed solution algorithm, we consider two additional problems that only differ in the choice of the objective functional, i.e. the functions

$$\begin{aligned} F_2(\beta ,y,u)&{:}{=}\frac{1}{2}{\Vert }y-y_m{\Vert }^2_{L^2(\Omega )} + \frac{\sigma _u}{2}{\Vert }u-u_m{\Vert }^2_{L^2(\Omega )} +\frac{\sigma _\beta }{2}{\Vert }\beta {\Vert }^2_{{\mathbb {R}}^n},\\ F_3(\beta ,y,u)&{:}{=}\frac{1}{2}{\Vert }y-{{\hat{y}}}_m{\Vert }^2_{L^2(\Omega )} + \frac{\sigma _u}{2}{\Vert }u-{{\hat{u}}}_m{\Vert }^2_{L^2(\Omega )} +\frac{\sigma _\beta }{2}\sum _{i=1}^{2}\frac{1}{\beta _i^2} \end{aligned}$$

are used instead of \(F_1\). In the second objective functional \(F_2\), the \(\beta \) term is only introduced as a regularization. This will be called “reachable target state”. The functional \(F_3\) is set up with desired states \({{\hat{y}}}_m\) and \({{\hat{u}}}_m\) that are given by

$$\begin{aligned} {{\hat{y}}}_m&:\Omega \rightarrow {\mathbb {R}},\qquad {{\hat{y}}}_m(x) = (x_1-1)(x_1+1)\sin (\pi x_2),\\ {{\hat{u}}}_m&:\Omega \rightarrow {\mathbb {R}},\qquad {{\hat{u}}}_m(x) = 0. \end{aligned}$$

This state and control have the property that they do not arise as a solution of the lower-level problem. This setting is named “unreachable target state”. We expect a noticeable difference in the convergence speed for the introduced settings, see Remark 3.8.
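Continuing the illustrative sketch above, the discrete counterparts of the three upper-level objectives may be written as follows; the weight \(h^2\) approximates the \(L^2(\Omega )\) norm, and all grid objects (X1, X2, h, y_m, u_m) are taken from that simplified discretization rather than from the actual finite element implementation.

```python
# discrete targets for F_3 (reusing the grid X1, X2 from the sketch above)
y_hat_m = ((X1 - 1) * (X1 + 1) * np.sin(np.pi * X2)).ravel()
u_hat_m = np.zeros_like(y_hat_m)

beta_m = np.array([0.6, 0.3])
sigma_u, sigma_beta = 0.05, 1e-5
l2sq = lambda v: h**2 * np.dot(v, v)        # discrete L^2(Omega) norm squared

def F1(beta, y, u):
    return 0.5 * l2sq(y - y_m) + 0.5 * sigma_u * l2sq(u - u_m) \
           + 0.5 * sigma_beta * np.sum((beta - beta_m) ** 2)

def F2(beta, y, u):
    return 0.5 * l2sq(y - y_m) + 0.5 * sigma_u * l2sq(u - u_m) \
           + 0.5 * sigma_beta * np.sum(np.asarray(beta) ** 2)

def F3(beta, y, u):
    return 0.5 * l2sq(y - y_hat_m) + 0.5 * sigma_u * l2sq(u - u_hat_m) \
           + 0.5 * sigma_beta * np.sum(1.0 / np.asarray(beta) ** 2)
```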

The refinement of the subdivision will be implemented by splitting the triangles at the midpoints of their edges. This refinement procedure is the application of Lemma 3.3 to the two-dimensional case. However, in this special case we can even guarantee that the diameter of the simplices is halved in each refinement. We initialize Algorithm 2 with the domain \(Q\) split into two triangles.
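A possible implementation of this midpoint refinement and of the initial subdivision of \(Q\) into two triangles is sketched below; it is only meant to illustrate the geometric operation and makes no claim about the data structures used in the actual code.

```python
import numpy as np

def split_triangle(v):
    """Split a triangle (three vertices, shape (3, 2)) into four congruent children
    by connecting the edge midpoints; each child has half the diameter of the parent."""
    v = np.asarray(v, dtype=float)
    m01, m12, m20 = (v[0] + v[1]) / 2, (v[1] + v[2]) / 2, (v[2] + v[0]) / 2
    return [np.array([v[0], m01, m20]),
            np.array([m01, v[1], m12]),
            np.array([m20, m12, v[2]]),
            np.array([m01, m12, m20])]

# initial subdivision of Q = [0.1, 1]^2 into two triangles
T0 = [np.array([[0.1, 0.1], [1.0, 0.1], [1.0, 1.0]]),
      np.array([[0.1, 0.1], [1.0, 1.0], [0.1, 1.0]])]
```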

We use an implementation with the suggested improvements mentioned at the end of Sect. 3. In each iteration we obtain a lower bound on the optimal objective value from the element with the lowest objective value of the solution to (OVRP\((T, \gamma _{k,T})\)), and an upper bound from the vertex with the lowest upper-level objective value. Hence, every element whose relaxed optimal objective value lies above the upper bound can be dismissed, since the relaxed optimal objective value is smaller than or equal to the objective value of the original subproblem. Further, in each iteration we refine the best \(15\%\) of the active triangles with respect to the objective value of the solution to (OVRP\((T, \gamma _{k,T})\)), but at most 200; this is done to effectively utilize parallelization. Additionally, we refine elements whose generation lags a certain number of refinements behind in order to achieve a “clean-up of old triangles”. Otherwise, the algorithm might take a long time to refine triangles that are quite far from the actual solution but whose objective value happens to come very close to the upper bound. This only has a noticeable impact if the number of active elements steadily increases, i.e. in the case of \(F_3\); it was not used for the other cases. Lastly, the algorithm runs until a prescribed number of elements (\(4\cdot 10^5\)) is reached or the difference between lower and upper bound is sufficiently small; we chose a target bound difference of \(10^{-10}\). For the case of \(F_3\) the element limit was reached.
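A schematic version of one such selection and pruning step is sketched below, reusing split_triangle, F1 and solve_lower_level from the sketches above. The callable solve_relaxed is a mere placeholder for solving (OVRP\((T, \gamma _{k,T})\)) on an element, and the additional heuristics (parallelization, clean-up of old triangles, element limit) are omitted.

```python
def refinement_step(active, solve_relaxed, frac=0.15, cap=200):
    """One schematic selection/pruning step of the refinement loop.

    active        : list of active triangles (each a (3, 2) vertex array)
    solve_relaxed : placeholder returning the optimal value of the relaxed
                    subproblem (OVRP(T, gamma_{k,T})) on a triangle
    """
    relaxed = [solve_relaxed(T) for T in active]
    lower_bound = min(relaxed)                            # global lower bound
    # upper bound: best upper-level value among all vertices (here for F_1)
    upper_bound = min(F1(v, *solve_lower_level(v)) for T in active for v in T)
    # dismiss every element whose relaxed value already exceeds the upper bound
    kept = [(val, T) for val, T in zip(relaxed, active) if val <= upper_bound]
    kept.sort(key=lambda pair: pair[0])
    n_ref = min(max(1, int(frac * len(kept))), cap)       # refine best 15 %, at most 200
    refined = [child for _, T in kept[:n_ref] for child in split_triangle(T)]
    remaining = [T for _, T in kept[n_ref:]]
    return remaining + refined, lower_bound, upper_bound
```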

Fig. 1
figure 1

Upper bound (blue) and lower bound (red) for the setting of \(F_1\), \(F_2\) and \(F_3\) w.r.t. the number of solved subproblems. (Color figure online)

Fig. 2
figure 2

Distance between the calculated solution \({{\bar{\beta }}}\) and the best known vertex (blue) and the furthest active vertex (red) respectively for each iteration. (Color figure online)

We now visualize the convergence of Algorithm 2 in Figs. 1, 2, 3 and 4. These graphics indicate the convergence \(\beta _k\rightarrow {{\bar{\beta }}}\) predicted in Theorem 4.2, see in particular Fig. 2. In Fig. 1 we show the lower and upper bounds for all mentioned settings.

Fig. 3
figure 3

Difference of upper and lower bound for the settings of \(F_1\), \(F_2\) and \(F_3\) w.r.t. the number of solved subproblems. For the setting of \(F_3\) the results for two different regularization terms are displayed

Fig. 4
figure 4

From left to right: Progression of the splitting of the domain \(Q\) for (OVRP(\(\xi _{\mathcal {T}}\))) for the settings of \(F_1\), \(F_2\) and \(F_3\). Simplices are differentiated by the color of their outline: dismissed (blue), relevant (red), split in the last iteration (yellow), difference of lower and upper bound for the element within \(10^{-9}\) (green). The element with the current best objective value is marked with a pink dot. (Color figure online)

We observe a stark difference in convergence speed between the settings introduced in this section. Additionally, there is a noticeable difference between the vertex that provides the upper bound and the furthest active vertex. Note that only for the latter is the distance to \({{\bar{\beta }}}\) guaranteed to be nonincreasing, while the vertex providing the upper bound might be more interesting from a heuristic point of view if one considers a depth-first search. The splitting of the domain can be seen in Fig. 4. For the purpose of better visualization in the settings of \(F_1\) and \(F_2\), the algorithm was continued for Fig. 4 until every element either had a vertex whose upper-level objective value was close (within \(10^{-10}\)) to the upper bound or was dismissed. The difference of lower and upper bound for all discussed cases is shown in Fig. 3.

We want to comment on the relation between the number of subproblems and the difference of upper and lower bound. As discussed in Remark 3.7, a growth condition of the upper-level objective functional w.r.t. \(\beta \) at a solution has positive effects, and we can expect at least an inversely proportional relation. This is exactly the setting of \(F_1\). For \(F_2\) we have the second case from Remark 3.8, where the derivative of \(F_2\) is close to zero at the solution. This is because the term \({\Vert }\beta {\Vert }_{{\mathbb {R}}^n}^2\) enters the upper-level objective functional only as a regularization with a small parameter. The solution of the parameter estimation problem is still close to \((y_m,u_m)\). In Fig. 3 the cases of \(F_1\) and \(F_2\) show a behaviour that is slightly better than the prediction.

For \(F_3\), we no longer have a setting where the number of subproblems required to reach a certain accuracy follows a nice relation. In particular, the number of active subproblems might increase heavily during the runtime of Algorithm 2, as can be seen in Fig. 4. Finally, Fig. 3 indicates that the important property in the setting of \(F_3\) is that the solution is no longer close to \(({{\hat{y}}}_m,{{\hat{u}}}_m)\), i.e. that the target state is “unreachable”, and that the choice of the regularization term, \(\frac{\sigma _\beta }{2}{\Vert }\beta {\Vert }_{{\mathbb {R}}^n}^2\) or \(\frac{\sigma _\beta }{2}\sum _{i=1}^2\frac{1}{\beta _i^2}\), is of minor importance for the convergence speed in this case. As a final comment we emphasize that, whenever Remark 3.7 or Remark 3.8 applies, both are indicative of the general behaviour of the algorithm: they predict a relation between the number of subproblems and the difference of the bounds that is at least inversely proportional, and thus give some information about the speed of the algorithm.

Observe that a significant number of subproblems has to be solved. Hence, the computational times needed to evaluate the value function \(\varphi (\beta )\) (which amounts to solving (LL\((\beta )\))) and to solve the subproblem (OVRP\((T, \gamma _{k,T})\)) are critical. Thus, we report the computational times of the used algorithm in Table 1.

Table 1 Computational times for the evaluation of the value function and the solution of the relaxed penalty problems to reach a difference of upper and lower bound of at most \(10^{-10}\) or exceed \(4\cdot 10^5\) total subproblems in the next iteration