1 Introduction

Let X and Y be real Hilbert spaces and \(C \subseteq X\) a nonempty closed convex set. Let the nonlinear objective function \(\phi : X \rightarrow \mathbb {R}\) and the nonlinear constraint function \(c: X \rightarrow Y\) be twice continuously Fréchet differentiable. We consider the mathematical programming problem

$$\begin{aligned} \min _{x \in X} \phi (x) \quad \text {subject to } x \in C \text { and } c(x) = 0. \end{aligned}$$
(1)

This formulation is equivalent to a more prevalent formulation that allows \(c(x) \in C_c\) for some nonempty closed convex set \(C_c\) [by the use of slack variables \(s \in Y\) via \(c(x) - s = 0\) and \((x, s) \in C \times C_c\)]. Further restrictions on the overall setting are stated in Sect. 1.5 after we settle the notation in Sect. 1.4.

This setting naturally comprises finite dimensional problems (also known as Nonlinear Programming Problems, NLPs) of the form

$$\begin{aligned} \min _{x \in \mathbb {R}^n} \phi (x) \quad \text {subject to } x^{\mathrm{l}} \le x \le x^{\mathrm{u}} \text { and } c(x) = 0 \end{aligned}$$

with \(X = \mathbb {R}^n\), \(Y = \mathbb {R}^m\), and \(C = \{ x \in \mathbb {R}^n \mid x^{\mathrm{l}} \le x \le x^{\mathrm{u}} \}\), where some components of \(x^{\mathrm{u}}\) and \(x^{\mathrm{l}}\) may take on values of \(\pm \infty \).

Another popular example is partial differential equation (PDE) constrained optimization, where \(X = U \times Q\) is a product of the state and control space, C encodes pointwise constraints on the controls, and \(c(x) = c((u, q)) = 0\) is the PDE constraint, where we often assume that the state \(u \in U\) is locally uniquely determined by the control \(q \in Q\) as an implicit function u(q) via \(c((u(q), q)) = 0\).

1.1 Structure of the article

We give a concise overview of the results of this article in Sect. 1.2. We outline our contributions and connections to existing methods in Sect. 1.3. In the remainder of Sect. 1, we settle our notation, state the general assumptions, and provide the statements of important classical results. We give a short proof of the necessary optimality conditions we use and discuss two central constraint qualifications in Sect. 2. Our main results on projected gradient/antigradient flows for (1) follow in Sect. 3. The application of a projected backward Euler method on the projected gradient/antigradient flow results in a sequential homotopy method, which we describe in Sect. 4. We present numerical results for a local semismooth Newton method globalized by the sequential homotopy approach for a class of highly nonlinear and ill-conditioned elliptic PDE-constrained optimal control problems with control constraints in Sect. 5.

1.2 Overview: a novel solution approach based on a sequence of homotopies

We propose the following general solution approach in this paper: We construct a primal-dual projected gradient/antigradient flow for an augmented Lagrangian and analyze existence and uniqueness of its solutions. The equilibria of the flow are exactly the critical points of (1). Under reasonable assumptions, we prove that critical points that are not local minima cannot be asymptotically stable, so small perturbations make the flow escape these unwanted critical points. We then apply a projected version of backward Euler timestepping. We interpret the resulting backward Euler equations as the optimality conditions of a primal-dual proximally regularized counterpart to (1), which satisfies a strong constraint qualification, even though (1) might only satisfy the Guignard constraint qualification [26], the weakest of all constraint qualifications. This gives rise to a sequential homotopy method, in which a sequence of proximally regularized subproblems is solved by (possibly inexact) fast numerical methods that are only required to converge locally.
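To convey the flavor of this approach, the following NumPy sketch applies a projected backward Euler scheme to the primal-dual flow of an augmented Lagrangian for a toy convex instance of (1) (quadratic objective, one linear equality constraint, and C the nonnegative orthant). The data, stepsize, augmentation parameter, and the plain fixed-point inner solver are illustrative assumptions only, not the method analyzed in Sect. 4:

```python
import numpy as np

# Toy instance of (1): minimize phi(x) = 0.5*||x - a||^2
# subject to x in C = {x >= 0} and c(x) = x_0 + x_1 - 1 = 0.
a = np.array([2.0, -1.0])
c = lambda x: x[0] + x[1] - 1.0
c_grad = np.array([1.0, 1.0])            # gradient of the linear constraint

rho, dt = 1.0, 0.2                       # augmentation parameter, stepsize

def grad_x_L(x, y):
    """Gradient of the augmented Lagrangian w.r.t. x."""
    return (x - a) + (y + rho * c(x)) * c_grad

x, y = np.zeros(2), 0.0
for _ in range(500):                     # outer (homotopy) steps
    xk, yk = x.copy(), y
    for _ in range(200):                 # fixed-point solve of the implicit step
        x = np.maximum(xk - dt * grad_x_L(x, y), 0.0)  # projection onto C
        y = yk + dt * c(x)
# The iterates approach the critical point xbar = (1, 0), ybar = 1.
```

For this convex instance the implicit steps are solvable by simple fixed-point iteration because the stepsize is small; the actual method of Sect. 4 instead employs locally convergent solvers such as semismooth Newton methods.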

We invite the reader to read the supplementary material, in which we sketch without proofs the salient features of our approach with an illustrative example in finite dimensions without inequalities.

1.3 Related work and contributions

We advance and bridge several fields of optimization with this paper.

The field of globalized Newton methods based on differential equation methods applied to the Newton flow started in the early 1950s with Davidenko [20] and has continued to attract scientific interest over the decades [9, 15, 21, 23, 24, 34, 42, 51, 52], predominantly due to the affine invariance properties of the Newton flow [22]. By trading the affine invariance of the Newton flow for the stability properties of the gradient flow, we obtain from a dynamical systems point of view the advantage of being repelled from maxima or saddle points when solving nonlinear optimization problems.

The Newton method, which is equivalent to forward Euler timestepping on the Newton flow with stepsize \(\Delta t = 1\), has the prominent property of quadratic local convergence. Backward Euler timestepping on the gradient/antigradient flow can attain superlinear local convergence if the solution is sufficiently regular so that we can take the stepsize \(\Delta t\) to infinity or, equivalently, drive the proximal coefficient \(\lambda \) to zero, provided that we use a local solver in the numerical homotopy method with at least superlinear local convergence. Driving \(\lambda \) to zero is usually possible if the solution satisfies certain second order sufficient optimality conditions.

Three methods in the field of convex optimization are closely related to our approach. The first method is the proximal point algorithm for closed proper convex functions, which can be interpreted as a backward Euler timestepping on the gradient flow of the objective function, while the gradient descent method amounts to forward Euler timestepping on the gradient flow (see, e.g., [48, Sect. 4.1] and references therein). We extend this approach to nonconvex optimization problems with explicit handling of nonlinear equality constraints, as they appear for instance in optimal control. To this end, we extend a second method, the primal-dual projected gradient/antigradient flow of [8, Chap. 6, 7], from the finite-dimensional convex to the infinite-dimensional nonconvex setting with the help of an augmented Lagrangian technique in the framework of projected differential equations in Hilbert space [19]. The third method we extend is the closely related Arrow–Hurwicz gradient method [8, Chap. 10], which amounts to projected forward Euler timestepping on the projected gradient/antigradient flow of the Lagrangian without augmentation (\(\rho = 0\)). Our sequential homotopy method is equivalent to projected backward Euler timestepping. Hence, it bears the same connection with the Arrow–Hurwicz gradient method as the proximal point algorithm with gradient descent.
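The equivalence of the proximal point step with one backward Euler step on the gradient flow can be verified numerically on a one-dimensional quadratic; the following toy computation (data chosen arbitrarily) is only meant as an illustration:

```python
import numpy as np

# Gradient flow of f(z) = 0.5*q*z^2 is zdot = -f'(z) = -q*z.
q, x, dt = 2.0, 1.0, 0.5

# Backward Euler step: solve z = x - dt*f'(z), i.e. z = x/(1 + q*dt).
z_be = x / (1.0 + q * dt)

# Proximal point step: argmin_z f(z) + (z - x)^2/(2*dt), here by grid search.
grid = np.linspace(-2.0, 2.0, 400001)
z_pp = grid[np.argmin(0.5 * q * grid**2 + (grid - x)**2 / (2.0 * dt))]
# z_be and z_pp coincide up to the grid resolution of 1e-5.
```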

From a Sequential Quadratic Programming (SQP) perspective (see, e.g., [46]), our approach resolves all the numerical difficulties on the nonlinear level, such as subproblem infeasibility, degeneracy, and nonconvexity due to indefinite subproblem Hessians. Existing approaches often pass these difficulties on to the level of the quadratic subproblem solvers, which may fail to resolve them in a way that guarantees convergence of the overall nonlinear iteration. Our method can thus be used as a black-box globalization framework for any locally convergent optimization method that can be used within a continuation framework, e.g., structure-exploiting inexact SQP methods [28, 33, 49, 50, 53] or semismooth Newton methods [31,32,33, 35, 43, 54, 56, 57]. The local methods are even allowed to converge to maxima and saddle points, because these issues are taken care of by our sequential homotopy method. For the application of local SQP methods, we can guarantee that the quadratic subproblems are always feasible and that they satisfy a strong constraint qualification that implies unique subproblem Lagrange multipliers. In addition, they are convex if the augmentation parameter \(\rho \) is sufficiently large and the stepsize \(\Delta t\) is sufficiently small when we are still far away from a solution.

Our approach uses the theory of projected differential equations due to Cojocaru and Jonker [19], which has a tight connection to differential inclusions [10] and evolutionary/differential variational inequalities [18, 47]. We are mainly interested in their equilibrium points, which satisfy a variational inequality (VI). Other methods to compute solutions to VIs have been described in the literature (see, e.g., [13, 45]), based on semismooth iterations on reformulations using special Nonlinear Complementarity Problem (NCP) functions.

Projected gradient flows for constrained optimization problems in finite dimensions have also been considered with techniques from Riemannian geometry (see, e.g., [29, 30, 37, 55] and references therein), but the resulting methods produce only feasible iterates. It is often computationally wasteful to satisfy all constraints for iterates far away from an optimum and to force the iterates to follow a feasible manifold with possibly high curvature.

For an introduction to augmented Lagrangian approaches in Hilbert spaces we refer to [36] and references therein. We point out that our approach relies on the augmented Lagrangian mainly to remove negative curvature of the Lagrangian in the kernel of the constraints. In contrast to classical augmented Lagrangian methods, we do not alternate between updates of the primal and dual variables but rather update primal and dual variables simultaneously as in augmented Lagrangian-SQP methods [36, Chap. 6].

1.4 Notation

We denote the set of nonnegative real numbers by \(\mathbb {R}_{\ge 0}\). By \((x_k) \subset X\) we denote a sequence \(x_0, x_1, \dotsc \) of elements in X. By \(X^{*}\) we denote the topological dual of X, by \(\left( .,.\right) _{X}: X \times X \rightarrow \mathbb {R}\) the inner product, by \(\left\Vert .\right\Vert _{X}: X \rightarrow \mathbb {R}_{\ge 0}\) the norm, and by \(\left\langle .,.\right\rangle _{X^{*}, X}: X^{*} \times X \rightarrow \mathbb {R}\) the duality pairing. By \(R_X: X^{*} \rightarrow X\) we denote the Riesz isomorphism (see, e.g., [58, Sect. III.6]), which satisfies the identity

$$\begin{aligned} \left( R_X x^*, x\right) _{X} = \left\langle x^{*}, x\right\rangle _{X^{*}, X} \quad \text {for all } x^{*} \in X^{*}, x \in X \end{aligned}$$

and likewise for Y. As usual, \(\mathcal {L}(X, Y)\) denotes the Banach space of all continuous linear operators from X to Y. For \(A \in \mathcal {L}(X, Y)\), the (Banach space) dual operator \(A^{*} \in \mathcal {L}(Y^{*}, X^{*})\) and the (Hilbert space) adjoint operator \(A^{\star } \in \mathcal {L}(Y, X)\) are defined by

$$\begin{aligned} \left\langle A^{*} y^{*}, x\right\rangle _{X^{*}, X}&= \left\langle y^{*}, A x\right\rangle _{Y^{*}, Y}&\text {for all } x \in X, y^{*} \in Y^{*},\\ \left( A^{\star } y, x\right) _{X}&= \left( y, A x\right) _{Y}&\text {for all } x \in X, y \in Y, \end{aligned}$$

which implies \(A^{\star } R_{Y} = R_{X} A^{*}\). We denote the Fréchet derivative of c at x by \(c'(x) \in \mathcal {L}(X, Y)\). We denote the objective gradient by \(\nabla \phi (x) = R_{X} \phi '(x) \in X\) and the adjoint of the constraint derivative by \(\nabla c(x) = \left( c'(x) \right) ^{\star } \in \mathcal {L}(Y, X)\). For a linear operator \(A \in \mathcal {L}(X, Y)\), we denote its kernel by \({{\,\mathrm{ker}\,}}(A) = \{x \in X \mid A x = 0 \}\) and its range by \({{\,\mathrm{ran}\,}}(A) = \{ y \in Y \mid \exists x \in X: y = A x \}\). For an open set \(\Omega \subseteq \mathbb {R}^n\), we denote by \(L^2(\Omega )\) the standard Hilbert space of square Lebesgue-integrable functions on \(\Omega \), by \(H^1_0(\Omega )\) the Sobolev space of functions with square Lebesgue-integrable weak derivatives and zero trace at the boundary, and by \(H^{-1}(\Omega )\) its dual space. We denote the feasible set of (1) by \(\mathcal {F} = \{ x \in C \mid c(x) = 0 \}\).

1.5 General assumptions

A central role in this article is played by the augmented objective and augmented Lagrangian

$$\begin{aligned} \phi ^{\rho }(x) = \phi (x) + \frac{\rho }{2} \left\Vert c(x)\right\Vert _{Y}^2, \qquad L^{\rho }(x,y) = \phi ^{\rho }(x) + \left( y, c(x)\right) _{Y}, \end{aligned}$$
(2)

defined for some fixed \(\rho \in \mathbb {R}_{\ge 0}\) and arbitrary \(x \in C\) and \(y \in Y\). Throughout this article, we make the following assumptions:

Assumption 1

For all \(x \in \mathcal {F}\), \({{\,\mathrm{ran}\,}}(c'(x))\) is closed in Y.

Assumption 2

For some fixed \(\rho \in \mathbb {R}_{\ge 0}\) we have the coercivity condition

$$\begin{aligned} \phi ^{\rho }_{\mathrm{low}} = \inf _{x \in C} \phi ^{\rho }(x) > -\infty \quad \text {and} \quad \lim _{\left\Vert x\right\Vert _{X} \rightarrow \infty } \phi ^{\rho }(x) = \infty . \end{aligned}$$

Assumption 3

The functions c(x), \(L^{\rho }(x,y)\) and the gradient \(\nabla L^{\rho }(x,y)\) are locally Lipschitz continuous.
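For concreteness, the following NumPy fragment evaluates the augmented Lagrangian (2) on a made-up smooth toy instance and checks the gradient formula \(\nabla _{x} L^{\rho }(x, y) = \nabla \phi (x) + \nabla c(x) \left[ y + \rho c(x) \right] \) (which reappears in Sect. 3) by central finite differences; all problem data are illustrative assumptions:

```python
import numpy as np

# Made-up smooth toy data: phi(x) = x0^2 + 2*x1^2, c(x) = (x0*x1 - 1,).
phi  = lambda x: x[0]**2 + 2.0*x[1]**2
dphi = lambda x: np.array([2.0*x[0], 4.0*x[1]])
c    = lambda x: np.array([x[0]*x[1] - 1.0])
Jc   = lambda x: np.array([[x[1], x[0]]])        # c'(x)

rho = 10.0

def L(x, y):
    """Augmented Lagrangian (2)."""
    return phi(x) + 0.5 * rho * (c(x) @ c(x)) + y @ c(x)

def grad_x_L(x, y):
    """grad phi + (grad c)[y + rho*c], the formula to be checked."""
    return dphi(x) + Jc(x).T @ (y + rho * c(x))

x, y, h = np.array([1.5, -0.3]), np.array([0.7]), 1e-6
fd  = np.array([(L(x + h*e, y) - L(x - h*e, y)) / (2.0*h) for e in np.eye(2)])
err = np.max(np.abs(fd - grad_x_L(x, y)))  # small: formula matches
```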

1.6 Well-known results

Let us recall the following well-known definitions.

Definition 1

(Tangent cone) For \(\bar{x} \in X\) and a nonempty set \(M \subseteq X\), we call

$$\begin{aligned} T(M, \bar{x}) = \{&d \in X \mid ~\text {there exist sequences } (x_k) \subset M, (\lambda _k) \subset \mathbb {R}_{\ge 0}\\&\text {with } x_k \rightarrow \bar{x} \text { and } \lambda _k (x_k - \bar{x}) \rightarrow d \text { as } k \rightarrow \infty \} \end{aligned}$$

the tangent cone to M at \(\bar{x}\).
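For the box \(C = \{ x \mid x^{\mathrm{l}} \le x \le x^{\mathrm{u}} \}\) from the introductory NLP example, the tangent cone has an explicit componentwise description: \(\mathbb {R}\) for components strictly between the bounds, \(\mathbb {R}_{\ge 0}\) at an active lower bound, and \(\mathbb {R}_{\le 0}\) at an active upper bound. The projection onto it is then a componentwise clipping; the following NumPy sketch is illustrative only:

```python
import numpy as np

def proj_tangent_box(d, x, lo, hi):
    """Project d onto T(C, x) for the box C = {lo <= x <= hi}, with x in C."""
    p = d.copy()
    p[(x <= lo) & (d < 0.0)] = 0.0   # active lower bound: keep only d_i >= 0
    p[(x >= hi) & (d > 0.0)] = 0.0   # active upper bound: keep only d_i <= 0
    return p

x  = np.array([0.0, 0.5, 1.0])       # lower bound active, interior, upper active
lo, hi = np.zeros(3), np.ones(3)
p  = proj_tangent_box(np.array([-2.0, -2.0, 2.0]), x, lo, hi)
# p = [0., -2., 0.]: blocked at the active bounds, unchanged in the interior
```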

Definition 2

(Projection) For a nonempty closed convex set \(K \subseteq X\), we denote by \(P_{K}: X \rightarrow K\) the projection operator of X onto K, which is uniquely defined by

$$\begin{aligned} \left\Vert P_{K}(x) - x\right\Vert _{X} = \inf _{\tilde{x} \in K} \left\Vert \tilde{x} - x\right\Vert _{X} \quad \text {for all } x \in X. \end{aligned}$$

For properties of projection operators, we refer the reader to [59].

Definition 3

(Polar cone) For a cone \(K \subseteq X\), we call

$$\begin{aligned} K^{-} = \left\{ d \in X \mid \left( d, x\right) _{X} \le 0 \text { for all } x \in K \right\} \end{aligned}$$

the polar cone of K.

Remark 1

If \(K \subseteq X\) is a linear subspace, then \(x \in K\) implies \(-x \in K\) and thus equality holds in the definition of \(K^{-} = \left\{ d \in X \mid \left( d, x\right) _{X} = 0 \text { for all } x \in K \right\} = K^{\perp }.\)

We shall make use of the following classical results from convex analysis.

Lemma 1

(Moreau decomposition) If \(K \subseteq X\) is a nonempty closed convex cone, then every \(x \in X\) has a unique decomposition \(x = P_{K}(x) + P_{K^{-}}(x) =: x^{+} + x^{-}\), where \(\left( x^{-}, x^{+}\right) _{X} = 0\). A simple consequence is the identity

$$\begin{aligned} \left( x, P_{K}(x)\right) _{X} = \left( x^{+} + x^{-}, x^{+}\right) _{X} = \left\Vert P_{K}(x)\right\Vert _{X}^2. \end{aligned}$$

Proof

See [44] according to [59, Lemma 2.2 and Corollary 2]. \(\square \)
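For the closed convex cone \(K = \mathbb {R}^n_{\ge 0}\), whose polar cone is the nonpositive orthant, both projections in Lemma 1 act componentwise as \(\max \) and \(\min \) with zero, and the decomposition can be checked directly. The following small NumPy snippet is purely illustrative:

```python
import numpy as np

x  = np.array([1.5, -0.2, 0.0, -3.0])
xp = np.maximum(x, 0.0)      # P_K(x) for K the nonnegative orthant
xm = np.minimum(x, 0.0)      # P_{K^-}(x), K^- being the nonpositive orthant

ok_sum   = np.allclose(xp + xm, x)        # x = x^+ + x^-
ok_orth  = abs(xp @ xm) < 1e-15           # (x^-, x^+) = 0
ok_ident = np.isclose(x @ xp, xp @ xp)    # (x, P_K(x)) = ||P_K(x)||^2
```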

Lemma 2

Let \(K \subseteq X\) be a nonempty closed convex set and let \(\bar{x} \in K\). If \(x \in T^{-}(K, \bar{x}) + \bar{x}\), then \(P_{K}(x) = \bar{x}\).

Proof

Choose any \(y \in K\). Then, \(y - \bar{x} \in T(K, \bar{x})\), e.g., with \(\lambda _{k} = k + 1\) and \(x_{k} = (1-\lambda _{k}^{-1}) \bar{x} + \lambda _{k}^{-1} y \in K\). Because \(x - \bar{x} \in T^{-}(K, \bar{x})\), we obtain \(\left( x - \bar{x}, y - \bar{x}\right) _{X} \le 0.\) The result follows from [59, Lemma 1.1], because \(y \in K\) was chosen arbitrarily. \(\square \)

2 Necessary optimality conditions

The basis for the sequential homotopy method we propose in Sect. 4 is a necessary optimality condition due to Guignard [26]. Because the separation of the nonlinearities \(c(x) = 0\) and the inequalities \(x \in C\) in (1) allows for a much shorter proof, we state it here for the sake of convenience.

Lemma 3

If \(\bar{x} \in \mathcal {F}\) is a local optimum of (1), then \(-\nabla \phi (\bar{x}) \in T^{-}(\mathcal {F}, \bar{x})\).

Proof

Let \(d \in T(\mathcal {F}, \bar{x})\) with corresponding sequences \((x_k) \subset \mathcal {F}\) and \((\lambda _k) \subset \mathbb {R}_{\ge 0}\). Using the shorthand \(d_k = \lambda _k (x_k - \bar{x})\), we obtain the assertion by letting \(k \rightarrow \infty \) in

$$\begin{aligned} 0 \le \lambda _k \left[ \phi (x_k) - \phi (\bar{x}) \right] = \left\langle \phi '(\bar{x}), d_k\right\rangle _{X^{*}, X} + \left\Vert d_k\right\Vert _{X} \frac{o\left( \left\Vert x_k - \bar{x}\right\Vert _{X}\right) }{\left\Vert x_k - \bar{x}\right\Vert _{X}} \rightarrow \left( \nabla \phi (\bar{x}), d\right) _{X}. \end{aligned}$$

\(\square \)

Definition 4

(GCQ) We say that the Guignard Constraint Qualification (GCQ) holds at \(\bar{x} \in \mathcal {F}\) if

$$\begin{aligned} {{\,\mathrm{ker}\,}}^{\perp } (c'(\bar{x})) + T^{-} (C, \bar{x}) = T^{-}(\mathcal {F}, \bar{x}). \end{aligned}$$

Theorem 1

(Necessary optimality conditions) If \(\bar{x} \in \mathcal {F}\) is a local optimum of (1) that satisfies GCQ, then there exists a multiplier \(\bar{y} \in Y\) such that

$$\begin{aligned} -\nabla \phi (\bar{x}) - \nabla c(\bar{x}) \bar{y} \in T^{-}(C, \bar{x}). \end{aligned}$$
(3)

Proof

The proof is based on the Closed Range Theorem (see, e.g., [58, Sect. VII.5] with premultiplication by the Riesz isomorphism \(R_X\) to obtain the Hilbert space version), which states that Assumption 1 is equivalent to

$$\begin{aligned} {{\,\mathrm{ker}\,}}^{\perp }(c'(\bar{x})) = {{\,\mathrm{ran}\,}}(\nabla c(\bar{x})). \end{aligned}$$

Together with Lemma 3 and GCQ we obtain

$$\begin{aligned} -\nabla \phi (\bar{x}) \in T^{-}(\mathcal {F}, \bar{x}) = {{\,\mathrm{ker}\,}}^{\perp }(c'(\bar{x})) + T^{-}(C, \bar{x}) = {{\,\mathrm{ran}\,}}(\nabla c(\bar{x})) + T^{-}(C, \bar{x}). \end{aligned}$$

Thus, there exists a \(\bar{y} \in Y\) such that \(-\nabla \phi (\bar{x}) - \nabla c(\bar{x}) \bar{y} \in T^{-}(C, \bar{x}).\) \(\square \)

Definition 5

(Critical point) We call \((\bar{x}, \bar{y}) \in \mathcal {F} \times Y\) a critical point if (3) holds.

The method we propose below enjoys the benefit that its subproblems lift the original problem into a larger space with additional structural properties in X, C, and c, which ensure satisfaction of a constraint qualification that is much stronger than GCQ, even though problem (1) might satisfy only GCQ.

Lemma 4

Let \(X = U \times Q\), equipped with the canonical inner product derived from the Hilbert spaces U and Q, and let \(C = U \times C_{Q}\) for some nonempty closed convex set \(C_{Q} \subseteq Q\). Furthermore, assume there exists a continuously Fréchet-differentiable mapping \(S: C_{Q} \rightarrow U\) such that for all \(x = (u, q) \in C\)

$$\begin{aligned} { (\mathrm{a})} ~&~ c( (u, q)) = 0 ~ \text { iff } ~ u = S(q),&{ (\mathrm{b})} ~&~ {{\,\mathrm{ran}\,}}c'_{u}(x) = Y,&{ (\mathrm{c})} ~&~ {{\,\mathrm{ran}\,}}\nabla _{u} c(x) = U. \end{aligned}$$

Then, \(\mathcal {F}\) is nonempty, every \(\bar{x} \in \mathcal {F}\) satisfies GCQ, and the Lagrange multiplier \(\bar{y}\) in (3) is uniquely determined.

Proof

The feasible set \(\mathcal {F} = \{ (S(q), q) \mid q \in C_Q \}\) is nonempty because \(C_Q\) is nonempty. Let \(\bar{x} = (S(\bar{q}), \bar{q}) \in \mathcal {F}\) and choose some \(d \in T(C_{Q}, \bar{q})\). By definition, there exist sequences \((q_k) \subset C_{Q}\) and \((\lambda _k) \subset \mathbb {R}_{\ge 0}\) such that \(\lambda _k (q_k - \bar{q}) \rightarrow d\). Using (a), we choose a sequence \((x_k) \subset \mathcal {F}\) according to \(x_k = (S(q_k), q_k)\) to guarantee \(x_k \rightarrow \bar{x}\) and

$$\begin{aligned} \lambda _k (x_k - \bar{x})&= \lambda _k (S(q_k) - S(\bar{q}), q_k - \bar{q})\nonumber \\&= \lambda _k (S'(\bar{q}) (q_k - \bar{q}) + o(\left\Vert q_k - \bar{q}\right\Vert _{Q}), q_k - \bar{q}) \rightarrow (S'(\bar{q}) d, d), \end{aligned}$$
(4)

which shows that \(T(\mathcal {F}, \bar{x}) \supseteq \{ (S'(\bar{q}) d, d) \mid d \in T(C_{Q}, \bar{q}) \}.\) In order to show that equality holds between the two sets, we notice that if \((e, d) \in T(\mathcal {F}, \bar{x})\) then \(d \in T(C_{Q}, \bar{q})\) and (4) implies \(e = S'(\bar{q}) d\). Hence, we obtain

$$\begin{aligned} T(\mathcal {F}, \bar{x}) = \{ (S'(\bar{q}) d, d) \mid d \in T(C_{Q}, \bar{q}) \}. \end{aligned}$$

In order to compute its polar cone, let \(x = (u, \tilde{q}) \in X\) such that

$$\begin{aligned} 0 \ge \left( u, S'(\bar{q}) d\right) _{U} + \left( \tilde{q}, d\right) _{Q} = \left( \nabla S(\bar{q}) u + \tilde{q}, d\right) _{Q} \quad \text {for all } d \in T(C_{Q}, \bar{q}). \end{aligned}$$

We choose \(q = \nabla {S}(\bar{q}) u + \tilde{q}\) in order to obtain

$$\begin{aligned} T^{-}(\mathcal {F}, \bar{x})&= \left\{ (u, \tilde{q}) \in X \mid \left( u, e\right) _{U} + \left( \tilde{q},d\right) _{Q} \le 0 \text { for all } (e,d) \in T(\mathcal {F}, \bar{x}) \right\} \nonumber \\&= \left\{ (u, \tilde{q}) \in X \mid \left( \nabla S(\bar{q}) u + \tilde{q}, d\right) _{Q} \le 0 \text { for all } d \in T(C_{Q}, \bar{q}) \right\} \nonumber \\&= \left\{ (u, q - \nabla S(\bar{q}) u) \mid u \in U, q \in T^{-}(C_{Q}, \bar{q}) \right\} . \end{aligned}$$
(5)

For the other polar cone in the definition of GCQ, we get

$$\begin{aligned} T^{-}(C, \bar{x}) = T^{-}(U \times C_{Q}, (\bar{u}, \bar{q})) = \left( U \times T(C_{Q}, \bar{q}) \right) ^{-} = \{ 0 \} \times T^{-}(C_{Q}, \bar{q}). \end{aligned}$$
(6)

Taking the derivative of \(c(S(q), q) = 0\) with respect to q in direction \(d \in Q\) yields

$$\begin{aligned} c'_{u}(\bar{x}) S'(\bar{q}) d + c'_{q}(\bar{x}) d = 0. \end{aligned}$$

As a consequence of the Closed Range Theorem [58, Sect. VII.5, Corollary 1], (c) is equivalent to the existence of a continuous inverse of \(c'_{u}(\bar{x})\), from which we see that

$$\begin{aligned} {{\,\mathrm{ker}\,}}c'(\bar{x}) = \left\{ (e,d) \in X \mid c'_{u}(\bar{x}) e + c'_{q}(\bar{x}) d = 0 \right\} = \left\{ (S'(\bar{q}) d, d) \mid d \in Q \right\} . \end{aligned}$$

Thus, its orthogonal complement amounts to

$$\begin{aligned} {{\,\mathrm{ker}\,}}^{\perp } c'(\bar{x})&= \left\{ (u,q) \in X \mid \left( u, S'(\bar{q}) d\right) _{U} + \left( q,d\right) _{Q} = 0 \text { for all } d \in Q \right\} \nonumber \\&= \left\{ (u,q) \in X \mid \left( \nabla S(\bar{q}) u + q, d\right) _{Q} = 0 \text { for all } d \in Q \right\} \nonumber \\&= \left\{ (u, -\nabla S(\bar{q}) u) \mid u \in U \right\} . \end{aligned}$$
(7)

Hence, it follows from (6), (7), and (5) that

$$\begin{aligned} T^{-}(C, \bar{x}) + {{\,\mathrm{ker}\,}}^{\perp } c'(\bar{x}) = \left\{ (u, q - \nabla S(\bar{q}) u) \mid u \in U, q \in T^{-}(C_{Q}, \bar{q}) \right\} = T^{-}(\mathcal {F}, \bar{x}), \end{aligned}$$

which shows that GCQ holds at \(\bar{x}\). Regarding multiplier uniqueness, we take the U-components of (3) and (6) to deduce

$$\begin{aligned} \nabla _{u} \phi (\bar{x}) + \nabla _{u} c(\bar{x}) \bar{y} = 0, \end{aligned}$$

from which the uniqueness of \(\bar{y}\) follows from the existence of a continuous inverse of \(\nabla _{u} c(\bar{x})\) by virtue of (b) and [58, Sect. VII.5, Corollary 1]. \(\square \)

3 Projected gradient/antigradient flow

We study a primal-dual gradient/antigradient flow (from now on simply called gradient flow) of the augmented Lagrangian \(L^{\rho }\), defined in (2), projected on the closed convex set C in the framework of projected differential equations in Hilbert space [19] according to

$$\begin{aligned} \dot{x}(t) = P_{T(C,x(t))} \left( -\nabla _x L^{\rho }(x(t), y(t)) \right) , \qquad \dot{y}(t) = \nabla _y L^{\rho }(x(t), y(t)), \end{aligned}$$
(8)

where the gradients with respect to x and y evaluate to

$$\begin{aligned} \nabla _{x} L^{\rho }(x, y) = \nabla \phi (x) + \nabla c(x) \left[ y + \rho c(x) \right] , \qquad \nabla _{y} L^{\rho }(x, y) = c(x). \end{aligned}$$

The following existence theorem uses \(L^{\rho }\) and \(\frac{1}{2} \left\Vert c(.)\right\Vert _{Y}^2\) as Lyapunov-type functions. Due to Lemma 1, the t-derivative of \(L^{\rho }\) along the flow is given by

$$\begin{aligned} \frac{\mathrm {d}}{\mathrm {d}t} L^{\rho }(x(t), y(t))&= \left( \nabla _x L^{\rho }(x(t), y(t)), \dot{x}(t)\right) _{X} + \left( \nabla _y L^{\rho }(x(t), y(t)), \dot{y}(t)\right) _{Y}\nonumber \\&= -\left\Vert P_{T(C, x(t))}\left( -\nabla _x L^{\rho }(x(t), y(t))\right) \right\Vert _{X}^2 + \left\Vert c(x(t))\right\Vert _{Y}^2. \end{aligned}$$
(9)

The positive sign in front of the last term in (9) reflects the saddle point nature of the Lagrangian approach and complicates the use of Lyapunov arguments in comparison to the unconstrained case. We pursue the basic idea that by increasing \(\rho \), we can make the negative term overpower the \(\rho \)-independent positive term. That this is not always possible will be discussed after the following theorem.
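Both terms in (9) can be verified numerically at a given point: by Lemma 1 applied to the closed convex cone \(T(C, x)\), the first summand equals \(\left( \nabla _x L^{\rho }, \dot{x}\right) _{X}\). The following sketch checks the identity at a point with one active bound for \(C = \{x \ge 0\}\); the problem data are again purely illustrative:

```python
import numpy as np

# Illustrative data: phi(x) = 0.5*||x - a||^2, c(x) = x0 + x1 - 1, C = {x >= 0}.
a, rho = np.array([2.0, -1.0]), 1.0
c = lambda x: x[0] + x[1] - 1.0
grad_x_L = lambda x, y: (x - a) + (y + rho * c(x)) * np.array([1.0, 1.0])

x, y = np.array([0.5, 0.0]), 0.3     # second component sits at its lower bound
g = grad_x_L(x, y)

# xdot = projection of -g onto T(C, x): at an active lower bound only
# nonnegative directions are tangent; interior components pass through.
xdot = np.where((x <= 0.0) & (-g < 0.0), 0.0, -g)
ydot = c(x)

lhs = g @ xdot + ydot * c(x)         # (grad_x L, xdot)_X + (grad_y L, ydot)_Y
rhs = -(xdot @ xdot) + c(x)**2       # right-hand side of (9)
# lhs and rhs agree up to roundoff
```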

Theorem 2

(Unique existence of solutions) Let Assumptions 2 and 3 be satisfied. Then, there exists an interval \([0, t_{\mathrm{final}}]\) and a uniquely determined pair of absolutely continuous functions \((x, y): [0, t_{\mathrm{final}}] \rightarrow C \times Y\) that satisfy the projected gradient flow Eq. (8) and \((x(0), y(0)) = (x_0, y_0)\). The final time \(t_{\mathrm{final}}\) can be extended as long as the condition

$$\begin{aligned} \frac{\mathrm {d}}{\mathrm {d}t} L^{\rho }(x(t), y(t)) \le 0 \end{aligned}$$
(10)

holds almost everywhere on \([0, t_{\mathrm{final}}]\). In addition, if for some \(\gamma _1, \gamma _2 \in (0,1)\) the conditions (10) and

$$\begin{aligned} \gamma _1 \frac{\mathrm {d}}{\mathrm {d}t} \left( \frac{1}{2} \left\Vert c(x(t))\right\Vert _{Y}^2 \right) \le -\frac{\mathrm {d}}{\mathrm {d}t} L^{\rho }(x(t), y(t)) -\gamma _2 \left\Vert c(x(t))\right\Vert _{Y}^2, \end{aligned}$$
(11)

hold almost everywhere in \(\mathbb {R}_{\ge 0}\), we have

$$\begin{aligned} \int _{0}^{\infty } \left\Vert P_{T(C, x(t))}\left( -\nabla _x L^{\rho }(x(t), y(t)) \right) \right\Vert _{X}^2 \mathrm {d}t< \infty \quad \text {and} \quad \int _{0}^{\infty } \left\Vert c(x(t))\right\Vert _{Y}^2 \mathrm {d}t < \infty . \end{aligned}$$
(12)

Furthermore, if there is a set \(M \subseteq X \times Y\) such that \(\nabla L^{\rho }\) is (globally) Lipschitz continuous on M and \((x(t), y(t)) \in M\) for all \(t \in [0, \infty )\), we obtain

$$\begin{aligned} P_{T(C, x(t))}\left( -\nabla _x L^{\rho }(x(t), y(t)) \right) \rightarrow 0 \quad \text {and} \quad c(x(t)) \rightarrow 0 \quad \text {for } t \rightarrow \infty . \end{aligned}$$

Proof

By Assumption 3, \(\nabla L^{\rho }(x, y)\) is Lipschitz continuous in a neighborhood of \((x_0, y_0)\) with some Lipschitz constant \(b < \infty \). By virtue of [19, Theorem 3.1], there exists an \(l > 0\) and a uniquely determined pair of absolutely continuous functions \((x,y): [0, l] \rightarrow C \times Y\) that satisfy (8) for almost all \(t \in [0, l]\) and \(x(0) = x_0\), \(y(0) = y_0\). Without loss of generality, (10) is satisfied on [0, l] and we can repeatedly extend the local solution by the above arguments until (10) or (11) is violated for some \(t_{\mathrm{final}} > 0\). As long as (10) is satisfied, no blowup is possible in finite time. To see this, we first observe that

$$\begin{aligned} \frac{\mathrm {d}}{\mathrm {d}t} \left( \frac{1}{2} \left\Vert y(t)\right\Vert _{Y}^2 \right) = \left( y(t), c(x(t))\right) _{Y}, \end{aligned}$$
(13)

which implies in combination with (10) and Assumption 2 that

$$\begin{aligned} \frac{1}{2} \left\Vert y(t)\right\Vert _{Y}^2&= \frac{1}{2} \left\Vert y_0\right\Vert _{Y}^2 + \int _{0}^{t} \left( y(\tau ), c(x(\tau ))\right) _{Y} \mathrm {d}\tau \\&= \frac{1}{2} \left\Vert y_0\right\Vert _{Y}^2 + \int _{0}^{t} \left[ L^{\rho }(x(\tau ), y(\tau )) - \phi ^{\rho }(x(\tau )) \right] \mathrm {d}\tau \\&\le \frac{1}{2} \left\Vert y_0\right\Vert _{Y}^2 + t \left[ L^{\rho }(x_0, y_0) - \phi ^{\rho }_{\mathrm{low}} \right] . \end{aligned}$$

This establishes that there can be no blowup of y in finite time. In addition, x cannot blow up in finite time because then \(L^{\rho }(x(t), y(t))\) would tend to infinity by virtue of Assumption 2.

Hence, we can extend the local solutions to global solutions on the whole interval \(\mathbb {R}_{\ge 0}\) if the condition (10) holds almost everywhere. In this case, Eqs. (13), (2), and Assumption 2 imply that for \(t > 0\)

$$\begin{aligned} \frac{1}{t} \int _{0}^{t} L^{\rho }(x(\tau ), y(\tau )) \,\mathrm {d}\tau&= \frac{1}{t} \int _{0}^{t} \phi ^{\rho }(x(\tau )) \,\mathrm {d}\tau + \frac{1}{t} \left[ \frac{1}{2} \left\Vert y(t)\right\Vert _{Y}^2 - \frac{1}{2} \left\Vert y_0\right\Vert _{Y}^2 \right] \nonumber \\&\ge \phi ^{\rho }_{\mathrm{low}} - \frac{1}{2t} \left\Vert y_0\right\Vert _{Y}^2. \end{aligned}$$
(14)

Using the monotonicity \(L^{\rho }(x(\tau ), y(\tau )) \le L^{\rho }(x(s), y(s))\) for \(0 < s \le \tau \) implied by (10), we obtain for \(s \le t\) that

$$\begin{aligned} \frac{1}{t} \int _{0}^{t} L^{\rho }(x(\tau ), y(\tau )) \,\mathrm {d}\tau \le \frac{1}{t} \int _{0}^{s} L^{\rho }(x(\tau ), y(\tau )) \,\mathrm {d}\tau + \frac{t - s}{t} L^{\rho }\left( x(s), y(s) \right) . \end{aligned}$$
(15)

We combine (14) and (15) and let \(t \rightarrow \infty \), which yields

$$\begin{aligned} L^{\rho }(x(s), y(s)) \ge \phi ^{\rho }_{\mathrm{low}} \quad \text {for all } s \in \mathbb {R}_{\ge 0}. \end{aligned}$$

Hence, we obtain

$$\begin{aligned} 0 \ge \int _{0}^{t}\frac{\mathrm {d}}{\mathrm {d}\tau } L^{\rho }(x(\tau ), y(\tau )) \,\mathrm {d}\tau = L^{\rho }(x(t), y(t)) - L^{\rho }(x_0, y_0) \ge \phi ^{\rho }_{\mathrm{low}} - L^{\rho }(x_0, y_0). \end{aligned}$$
(16)

If condition (11) additionally holds, integration of (11) together with the boundedness of the integral in (16) implies that

$$\begin{aligned}&\gamma _2 \int _{0}^{t} \left\Vert c(x(\tau ))\right\Vert _{Y}^2 \, \mathrm {d}\tau \nonumber \\&\quad \le -\int _{0}^{t} \frac{\mathrm {d}}{\mathrm {d}\tau } L^{\rho }(x(\tau ), y(\tau )) \, \mathrm {d}\tau - \gamma _1 \int _{0}^{t} \frac{\mathrm {d}}{\mathrm {d}\tau } \left( \frac{1}{2} \left\Vert c(x(\tau ))\right\Vert _{Y}^2 \right) \mathrm {d}\tau \nonumber \\&\quad \le L^{\rho }(x_0, y_0) - L^{\rho }(x(t), y(t)) - \gamma _1 \left[ \frac{1}{2} \left\Vert c(x(t))\right\Vert _{Y}^2 - \frac{1}{2} \left\Vert c(x_0)\right\Vert _{Y}^2 \right] \nonumber \\&\quad \le L^{\rho }(x_0, y_0) - \phi ^{\rho }_{\mathrm{low}} + \gamma _1 \frac{1}{2} \left\Vert c(x_0)\right\Vert _{Y}^2. \end{aligned}$$
(17)

Hence, \(\int _{0}^{\infty } \left\Vert c(x(t))\right\Vert _{Y}^2 \mathrm {d}t < \infty \) and we can establish (12) by way of (16) and the representation (9).

If now there is a set \(M \subseteq X \times Y\) such that \(\nabla L^{\rho }\) is Lipschitz continuous on M and \((x(t), y(t)) \in M\) for all \(t \in [0, \infty )\), then the integrand in (17) is absolutely continuous (as a composition of Lipschitz continuous functions with an absolutely continuous function). This implies uniform continuity of the integrand, and we can deduce that \(\left\Vert c(x(t))\right\Vert _{Y}^2 \rightarrow 0\) for \(t \rightarrow \infty \). In combination with (16) and the representation (9), this implies that

$$\begin{aligned} \frac{\mathrm {d}}{\mathrm {d}t} L^{\rho }(x(t), y(t)) = -\left\Vert P_{T(C, x(t))}\left( -\nabla _x L^{\rho }(x(t), y(t))\right) \right\Vert _{X}^2 + \left\Vert c(x(t))\right\Vert _{Y}^2 \rightarrow 0 \end{aligned}$$

and finally \(P_{T(C, x(t))}\left( -\nabla _x L^{\rho }(x(t), y(t)) \right) \rightarrow 0\) for \(t \rightarrow \infty \). \(\square \)

Discussion of Theorem 2 If we do not obtain a solution up to \(t_{\mathrm{final}} = \infty \), it must be due to violation of (10) or (11). In this case, we may try to increase \(\rho \) in order for the negative term in (9) to overpower the positive one. To understand the behavior for \(\rho \rightarrow \infty \), we let \(\beta = 1 / (1+\rho ) \in (0, 1]\) and consider a reparametrization of the flow Eq. (8) via \(x_{\beta }(t) = x(\beta t), y_{\beta }(t) = y(\beta t)\), which leads to

$$\begin{aligned} \dot{x}_{\beta }(t)&= P_{T(C,x_{\beta }(t))} \left( -\beta \nabla _x L^{0}(x_{\beta }(t), y_{\beta }(t)) - (1 - \beta ) \nabla c(x_{\beta }(t)) c(x_{\beta }(t)) \right) , \\ \dot{y}_{\beta }(t)&= \beta \nabla _y L^{\rho }(x_{\beta }(t), y_{\beta }(t)). \end{aligned}$$

In the limit \(\beta = 0\) (i.e., \(\rho \rightarrow \infty \)), these flow equations reduce to the projected gradient flow for minimizing the constraint violation \(\left\Vert c(x)\right\Vert _{Y}^2\) over \(x \in C\) according to

$$\begin{aligned} \dot{x}_{\beta }(t) = P_{T(C,x_{\beta }(t))} \left( -\nabla c(x_{\beta }(t)) c(x_{\beta }(t)) \right) , \qquad \dot{y}_{\beta }(t) = 0. \end{aligned}$$

Hence, violation of (10) or (11) for large \(\rho \) can only occur if for \(\beta = 0\) we get stuck in a locally infeasible point \(\tilde{x}\) of problem (1), which means

$$\begin{aligned} P_{T(C, \tilde{x})}\left( -[\nabla c(\tilde{x})] c(\tilde{x}) \right) = 0 \quad \text {but} \quad c(\tilde{x}) \ne 0. \end{aligned}$$

This case must arise for instance if \(\mathcal {F} = \varnothing \) and it is reassuring that the theory provides room for this pathological case and that we at least obtain a point of (locally) minimal constraint violation.
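As a concrete one-dimensional illustration (our own example, not taken from the text above): with \(X = C = \mathbb {R}\) and \(c(x) = x^2 + 1\), the feasible set is empty, yet the feasibility flow comes to rest at the point of minimal constraint violation.

```python
# One-dimensional illustration (our own example): with X = C = R and
# c(x) = x^2 + 1 there is no feasible point, and the feasibility flow
# xdot = -c'(x) c(x) = -2 x (x^2 + 1) is stationary at x = 0,
# a point of (globally) minimal constraint violation ||c(x)||^2 = 1.
def c(x):
    return x**2 + 1.0

def feasibility_flow_rhs(x):
    return -2.0 * x * c(x)

# the flow pushes every x toward 0 but can never reach feasibility
stationary_point = 0.0
residual_at_rest = c(stationary_point)
```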

We also remark that boundedness of y(t) can for instance be ensured by the sufficient condition that for some \(\gamma _3 > 0\) we have (omitting t-arguments)

$$\begin{aligned} \frac{\mathrm {d}}{\mathrm {d}t} \left( \frac{1}{2} \left\Vert c(x)\right\Vert _{Y}^2 \right)&= \left( P_{T(C, x)} \left( -\nabla _x L^{0}(x, y) - \rho \nabla c(x) c(x) \right) , \nabla c(x) c(x)\right) _{X}\nonumber \\&\le -\gamma _3 \left\Vert c(x)\right\Vert _{Y}^2. \end{aligned}$$
(18)

In this case, Grönwall’s inequality (see, e.g., [5]) implies \(\left\Vert c(x(t))\right\Vert _{Y} \le \left\Vert c(x_0)\right\Vert _{Y} e^{-\gamma _3 t}\) and consequently

$$\begin{aligned} \left\Vert y(t) - y_0\right\Vert _{Y} \le \int _{0}^{t} \left\Vert c(x(\tau ))\right\Vert _{Y} \mathrm {d}\tau \le \gamma _3^{-1} \left\Vert c(x_0)\right\Vert _{Y}. \end{aligned}$$

Assumption (18) is obviously too restrictive for the case of a feasible initial guess \(c(x_0) = 0\), which would imply \(y(t) \equiv y_0\). Hence, we prefer the weaker assumption (11) in Theorem 2.

We next characterize equilibrium points of (8) assuming they exist.

Lemma 5

(Equilibria are critical) Equilibrium points \((\bar{x}, \bar{y}) \in C \times Y\) of (8) are critical points of (1) and vice versa.

Proof

Let \((\bar{x}, \bar{y}) \in C \times Y\) be an equilibrium point of (8), implying \(0 = \nabla _{y} L^{\rho }(\bar{x}, \bar{y}) = c(\bar{x})\) and consequently \(\bar{x} \in \mathcal {F}.\) From \(0 = P_{T(C, \bar{x})} ( -\nabla _{x} L^{\rho }(\bar{x}, \bar{y}))\), we can derive with Lemma 1 that \(- \nabla _{x} L^{\rho }(\bar{x}, \bar{y}) = P_{T^{-}(C, \bar{x})}(- \nabla _{x} L^{\rho }(\bar{x}, \bar{y})) \in T^{-}(C, \bar{x})\). Because \(c(\bar{x}) = 0\), we have \(\nabla _{x} L^{\rho }(\bar{x}, \bar{y}) = \nabla \phi (\bar{x}) + \nabla c(\bar{x}) \bar{y}\). Hence, \((\bar{x}, \bar{y})\) is a critical point.

Let now \((\bar{x}, \bar{y}) \in \mathcal {F} \times Y\) be a critical point of (1). Because \(\bar{x} \in \mathcal {F}\), the antigradient vanishes due to \(\nabla _{y} L^{\rho }(\bar{x}, \bar{y}) = c(\bar{x}) = 0\). By definition, we also have that

$$\begin{aligned} -\nabla _{x} L^{\rho }(\bar{x}, \bar{y}) = -\nabla \phi (\bar{x}) - \nabla c(\bar{x}) \left[ \bar{y} + \rho c(\bar{x}) \right] = -\nabla \phi (\bar{x}) - \nabla c(\bar{x}) \bar{y} \in T^{-}(C, \bar{x}). \end{aligned}$$

Moreau decomposition of \(-\nabla _{x} L^{\rho }(\bar{x}, \bar{y})\) then yields that \(P_{T(C, \bar{x})}(-\nabla _{x} L^{\rho }(\bar{x}, \bar{y})) = 0\). This shows that both right-hand sides of (8) vanish and that \((\bar{x}, \bar{y})\) is an equilibrium point. \(\square \)

Among the critical points, we are of course only interested in those that are minima of (1). For the finite-dimensional unconstrained case, we recall that asymptotically stable equilibria of the gradient flow are strict local minima of the objective function and that the converse is true if the objective is analytic in a neighborhood of the minimum [1]. This is of high practical relevance, because the gradient flow will be attracted to strict local minima and, conversely, small perturbations (for instance due to numerical round-off) will usually make the flow escape unwanted critical points such as saddle points or maxima.

For the constrained case, the situation is more complicated because the intrinsic saddle point structure of the Lagrangian requires a gradient/antigradient flow, for which to our knowledge no results on asymptotic stability exist so far. We show that critical points that admit an emanating feasible curve of descent are not asymptotically stable (under reasonable conditions). This implies that the projected gradient/antigradient flow will not be attracted to these undesired critical points. To prove this result, we need the following three definitions.

Definition 6

(Descent curve) We call a continuous function \(\bar{x}: [0, 1] \rightarrow \mathcal {F}\) a descent curve of (1), if \(\phi (\bar{x}(t_2)) < \phi (\bar{x}(t_1))\) for all \(0 \le t_1 < t_2 \le 1\).

Definition 7

(Stability) An equilibrium \((\bar{x}, \bar{y}) \in C \times Y\) of the projected gradient/antigradient flow (8) is stable if for every neighborhood \(U \times V \subset X \times Y\) of \((\bar{x}, \bar{y})\) there exists a smaller neighborhood \(U_1 \times V_1\) of \((\bar{x}, \bar{y})\) such that solutions \((x, y): [0, \infty ) \rightarrow (U \cap C) \times V\) of (8) exist for all initial values \((x_0, y_0) \in (U_1 \cap C) \times V_1\). If, in addition, it holds for all these solutions that \(\lim _{t \rightarrow \infty } (x(t), y(t)) = (\bar{x}, \bar{y})\), then \((\bar{x}, \bar{y})\) is asymptotically stable.

Definition 8

(Flow ribbon) For a continuous function \((\bar{x}, \bar{y}): [0, 1] \rightarrow C \times Y\) we denote by \(\mathcal {R}(\bar{x}, \bar{y}) \subseteq C \times Y\) the flow ribbon emanating from the curve \((\bar{x}, \bar{y})\), which we define as the union of the images of all curves \((x, y): \mathbb {R}_{\ge 0} \rightarrow C \times Y\) satisfying (8) with initial values \((x(0), y(0)) = (\bar{x}(l), \bar{y}(l))\) for some \(l \in [0, 1]\).

We can think of a flow ribbon as the trajectory of a curve under the gradient/antigradient flow (8): the curve at \(t=0\) is the first thread, and we weave the threads into a fabric as we move along the flow. This somewhat unusual definition is required to keep the set of points (x, y) on which assumption (19) in the following theorem must hold small (compare also Example 1 below).

Theorem 3

Let \(\bar{x}: [0, 1] \rightarrow \mathcal {F}\) be a descent curve and \(\bar{y}(t) \equiv \bar{y} \in Y\) such that \((\bar{x}(0), \bar{y})\) is a critical point of (1) and let there exist a neighborhood \(U \times V \subset X \times Y\) of \((\bar{x}(0), \bar{y})\) such that for all \((x, y) \in \mathcal {R}(\bar{x}, \bar{y}) \cap (U \times V)\) with \(L^{\rho }(x, y) < L^{\rho }(\bar{x}(0), \bar{y})\) it holds that

$$\begin{aligned} \left\Vert c(x)\right\Vert _{Y}^2 \le \left\Vert P_{T(C,x)}\left( -\nabla _x L^{\rho }(x, y) \right) \right\Vert _{X}^2. \end{aligned}$$
(19)

Then \((\bar{x}(0), \bar{y})\) is not asymptotically stable.

Proof by contradiction

Assume \((\bar{x}(0), \bar{y})\) is asymptotically stable. By Definition 7, there exists a neighborhood \(U_1 \times V_1 \subset U \times V\) of \((\bar{x}(0), \bar{y})\) such that a global solution to (8) exists for every initial value \((x_0, y_0) \in (U_1 \cap C) \times V_1\). We choose \(l \in (0, 1]\) such that \((x_0, y_0) := (\bar{x}(l), \bar{y}) \in U_1 \times V_1\). Because \(\bar{x}\) is a descent curve, we have that \(c(x_0) = 0\) and

$$\begin{aligned} L^{\rho }(\bar{x}(0), \bar{y}) - L^{\rho }(x_0, y_0) = \phi (\bar{x}(0)) - \phi (x_0) =: \varepsilon > 0. \end{aligned}$$
(20)

By Definition 7, a solution \((x, y): [0, \infty ) \rightarrow (U \cap C) \times V\) of (8) with \(x(0) = x_0\) and \(y(0) = y_0\) exists and converges to \((\bar{x}(0), \bar{y})\). Using assumption (19) and Eqs. (9) and (20), we observe that

$$\begin{aligned} L^{\rho }(x(t), y(t))&= L^{\rho }(x_0, y_0) + \int _{0}^{t} \frac{\mathrm {d}}{\mathrm {d}\tau } L^{\rho }(x(\tau ), y(\tau )) \,\mathrm {d}\tau \\&\le L^{\rho }(x_0, y_0) = L^{\rho }(\bar{x}(0), \bar{y}) - \varepsilon \end{aligned}$$

for all \(t \in \mathbb {R}_{\ge 0}\), which implies that (x, y) cannot converge to \((\bar{x}(0), \bar{y})\). Hence, \((\bar{x}(0), \bar{y})\) is not asymptotically stable. \(\square \)

In order to validate that assumption (19) does not reduce the assertion of Theorem 3 to one about the empty set, we provide a simple example.

Example 1

(Simple nonconvex quadratic program) We consider the problem

$$\begin{aligned} \min \tfrac{1}{2} \left( x_1^2 -x_2^2\right) \quad \text {over } x \in \mathbb {R} \times \mathbb {R}_{\ge 0} =: C \quad \text {subject to } x_1 = 0. \end{aligned}$$

It is easy to verify that \((x_1, x_2, y) = (0, 0, 0)\) is a critical point and that the objective is unbounded below along the feasible points with \(x_1 = 0\), \(x_2 \rightarrow \infty \). The augmented Lagrangian amounts to

$$\begin{aligned} L^{\rho }(x,y) = \tfrac{1}{2} (x_1^2 - x_2^2) + x_1 y + \tfrac{\rho }{2} x_1^2 = (1+\rho )\tfrac{1}{2} x_1^2 - \tfrac{1}{2} x_2^2 + x_1 y. \end{aligned}$$

The projected gradient flow equations then read (omitting (t)-arguments)

$$\begin{aligned} \dot{x} = P_{T(C,x)}(-\nabla _x L^{\rho }(x,y)) = \begin{pmatrix} -(1+\rho ) x_1 - y\\ \max (0, x_2) \end{pmatrix}, \qquad \dot{y} = \nabla _y L^{\rho }(x, y) = x_1. \end{aligned}$$

For the descent curve \(\bar{x}(l) = (0, l, 0)^T, l \in [0, 1],\) with corresponding \(\bar{y} \equiv 0\), we can easily solve the flow equations and obtain the flow ribbon

$$\begin{aligned} \mathcal {R}(\bar{x},\bar{y}) = \left\{ (0, l e^t, 0)^T \mid t \in \mathbb {R}_{\ge 0}, l \in [0, 1] \right\} = \{0\} \times \mathbb {R}_{\ge 0} \times \{0\}. \end{aligned}$$

Hence, we see that for \((x, y) \in \mathcal {R}(\bar{x}, \bar{y})\) it holds that

$$\begin{aligned} L^{\rho }(x, y) = -\tfrac{1}{2} x_2^2 \le 0 = L^{\rho }(\bar{x}(0), \bar{y}) \end{aligned}$$

with strict inequality for \(x_2 > 0\). Inequality (19) holds for all \((x, y) \in \mathcal {R}(\bar{x}, \bar{y})\) by virtue of

$$\begin{aligned} \left\Vert c(x)\right\Vert _{2}^2 = x_1^2 = 0 \le \max (0, x_2)^2 = \left\Vert P_{T(C, x)}(-\nabla _x L^{\rho }(x, y))\right\Vert _{2}^2. \end{aligned}$$

Hence, this example satisfies all assumptions of Theorem 3.
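A short numerical integration of the flow equations of Example 1 (a forward Euler sketch; step size and horizon are our own illustrative choices) confirms that the flow escapes the critical point along the descent curve.

```python
# Forward Euler integration of the projected flow of Example 1.
# Step size h and number of steps are illustrative choices.
rho, h, steps = 1.0, 1e-3, 5000
x1, x2, y = 0.0, 0.1, 0.0              # start on the descent curve, l = 0.1

for _ in range(steps):
    dx1 = -(1.0 + rho) * x1 - y        # first component of P_T(-grad_x L)
    dx2 = max(0.0, x2)                 # second component (projected)
    dy = x1                            # antigradient: ydot = c(x) = x1
    x1, x2, y = x1 + h * dx1, x2 + h * dx2, y + h * dy

# augmented Lagrangian along the trajectory
L = (1.0 + rho) * 0.5 * x1**2 - 0.5 * x2**2 + x1 * y
```

The iterate stays on the ribbon (\(x_1 = y = 0\)) while \(x_2\) grows like \(e^t\) and \(L^{\rho }\) decreases without bound, so the trajectory leaves every neighborhood of the critical point.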

4 Projected backward Euler: a sequential homotopy method

It is well-known that the projection in (8) is actually the derivative of the projection of the primal variable onto C in the direction of the negative primal gradient:

Lemma 6

For a nonempty closed convex set \(K \subseteq X\) and a point \(x \in K\), the Gâteaux derivative of the projection onto K at x in the direction \(\delta x \in X\) is the projection of \(\delta x\) onto the tangent cone T(K, x), i.e.,

$$\begin{aligned} \lim _{h \rightarrow 0^+} h^{-1} \left( P_{K}(x + h \delta x) - x \right) = P_{T(K,x)}(\delta x). \end{aligned}$$

Proof

See [59, Lemma 4.5]. \(\square \)

This motivates following the flow defined by (8) from \((\hat{x}, \hat{y}) \in C \times Y\) to \((x, y) \in C \times Y\) with a projected backward Euler step of stepsize \(\Delta t > 0\) by solving

$$\begin{aligned} x - P_{C}\left( \hat{x} - \Delta t \nabla _x L^{\rho }(x, y) \right) = 0, \qquad y - \hat{y} - \Delta t c(x) = 0, \end{aligned}$$
(21)

because Lemma 6 ensures consistency by virtue of

$$\begin{aligned} \lim _{\Delta t \rightarrow 0} \frac{x - \hat{x}}{\Delta t} = \lim _{\Delta t \rightarrow 0} \frac{P_{C}\left( \hat{x} - \Delta t \nabla _x L^{\rho }(x, y) \right) - \hat{x}}{\Delta t} = P_{T(C, \hat{x})} \left( -\nabla _x L^{\rho }(\hat{x}, \hat{y}) \right) . \end{aligned}$$

From a computational point of view, the projected backward Euler system (21) is an ideal candidate for the application of local (possibly inexact) semismooth Newton methods (see, e.g., [43, 54, 57]), which we will investigate in more detail in Sect. 5.

In addition, the projected backward Euler system (21) can be interpreted as necessary optimality conditions of a primal-dual proximally regularized version of the augmented form of (1). With \(\lambda = 1/\Delta t\), it reads

$$\begin{aligned} \begin{aligned}&\min ~ \phi ^{\rho }(x) + \lambda \left[ \tfrac{1}{2} \left\Vert x - \hat{x}\right\Vert _{X}^2 + \tfrac{1}{2} \left\Vert w - \hat{y}\right\Vert _{Y}^2 \right] ~~\text {over}~w \in Y, x \in C\\&\text {subject to}\,~ c(x) + \lambda w = 0. \end{aligned} \end{aligned}$$
(22)

Uniqueness of solutions to (22) can be guaranteed for sufficiently large \(\lambda \).

Theorem 4

The regularized problem (22) has the following properties for \(\lambda > 0\):

1. It satisfies the strong constraint qualification of Lemma 4.

2. Its primal-dual solutions \((\bar{w}, \bar{x}, \bar{y}) \in Y \times C \times Y\) satisfy (21) and \(\bar{w} = -\Delta t c(\bar{x})\).

3. For \(\lambda \rightarrow \infty \), i.e., \(\Delta t \rightarrow 0\), its unique primal-dual solution \((\bar{w}, \bar{x}, \bar{y})\) tends to \((0, \hat{x}, \hat{y})\) provided that \(\nabla L^{\rho }\) is globally Lipschitz continuous.
For \(\lambda = 0\), i.e., \(\Delta t = \infty \), its primal-dual solutions \(\bar{x}\) and \(\bar{y}\) coincide with those of problem (1) and arbitrary \(\bar{w} \in Y\).

Proof

With \(U = Y\), \(Q = X\), we can apply Lemma 4 with the solution mapping \(S(x) = -\Delta t c(x)\) and \({{\,\mathrm{ran}\,}}\lambda \mathrm {I}_{Y} = {{\,\mathrm{ran}\,}}\lambda \mathrm {I}_{Y}^{\star } = Y\). This shows assertion 1. We call the Lagrangian of (22) homotopy Lagrangian or proximal Lagrangian and denote it by

$$\begin{aligned} L^{\lambda ,\rho }(w, x, y) = L^{\rho }(x, y) + \lambda \left[ \tfrac{1}{2} \left\Vert x - \hat{x}\right\Vert _{X}^2 + \tfrac{1}{2} \left\Vert w - \hat{y}\right\Vert _{Y}^2 + \left( y, w\right) _{Y} \right] . \end{aligned}$$

By Lemma 4, GCQ holds at all feasible points and Theorem 1 yields that

$$\begin{aligned} \left( -\nabla _w L^{\lambda ,\rho }(\bar{w},\bar{x},\bar{y}), -\nabla _x L^{\lambda ,\rho }(\bar{w},\bar{x},\bar{y}) \right) \in \{0\} \times T^{-}(C, \bar{x}), \end{aligned}$$
(23)

from which we can deduce that \(\bar{w} = \hat{y} - \bar{y}\) because of

$$\begin{aligned} 0 = \nabla _w L^{\lambda , \rho } (\bar{w}, \bar{x}, \bar{y}) = \lambda (\bar{w} - \hat{y} + \bar{y}). \end{aligned}$$
(24)

Hence, the feasibility of \((\bar{w}, \bar{x})\) implies that

$$\begin{aligned} c(\bar{x}) + \lambda [\hat{y} - \bar{y}] = 0. \end{aligned}$$

Multiplication with \(\Delta t\) yields the second equation of (21). For the x-part of (23), we observe that

$$\begin{aligned} -\Delta t \nabla _x L^{\lambda ,\rho } (\bar{w}, \bar{x}, \bar{y}) = -\bar{x} + \hat{x} - \Delta t \nabla _x L^{\rho }(\bar{x}, \bar{y}) \in T^{-}(C, \bar{x}), \end{aligned}$$

implying \(\hat{x} - \Delta t \nabla _x L^{\rho }(\bar{x}, \bar{y}) \in T^{-}(C, \bar{x}) + \bar{x}\) and by Lemma 2 that therefore \(P_{C}(\hat{x} - \Delta t \nabla _x L^{\rho }(\bar{x}, \bar{y})) = \bar{x}\), which coincides with the first equation of (21). This shows assertion 2.

We can now use (21) to define a fixed point iteration \(z^{k+1} = \Phi (z^k)\) on \(C \times Y\) via

$$\begin{aligned} \Phi (z) = (P_C(\hat{x} - \Delta t \nabla _{x} L^{\rho }(z)), \hat{y} + \Delta t \nabla _{y} L^{\rho }(z)). \end{aligned}$$

Let \(\omega \) denote the Lipschitz constant of \(\nabla L^{\rho }\). For \(\Delta t < \frac{1}{\omega }\), the mapping \(\Phi \) is a contraction because \(P_C\) is Lipschitz continuous with modulus 1:

$$\begin{aligned} \left\Vert \Phi (z) - \Phi (\tilde{z})\right\Vert _{X \times Y}^2&= \left\Vert P_C(\hat{x} - \Delta t \nabla _{x} L^{\rho }(z)) - P_C(\hat{x} - \Delta t \nabla _{x} L^{\rho }(\tilde{z}))\right\Vert _{X}^2\\&\quad + \Delta t^2 \left\Vert \nabla _{y} L^{\rho }(z) - \nabla _{y} L^{\rho }(\tilde{z})\right\Vert _{Y}^2 \le (\omega \Delta t)^2 \left\Vert z - \tilde{z}\right\Vert _{X \times Y}^2. \end{aligned}$$

The Banach fixed point theorem yields existence and uniqueness of a fixed point \((\bar{x}, \bar{y})\), which together with \(\bar{w} = \hat{y} - \bar{y}\) is the unique solution of (22). For \(\Delta t = 0\), the fixed point and thus the solution is obviously \((0, \hat{x}, \hat{y})\). In order to prove convergence of \((\bar{w}, \bar{x}, \bar{y})\) to \((0, \hat{x}, \hat{y})\) for \(\Delta t \rightarrow 0\), we observe that

$$\begin{aligned} \begin{aligned} \left\Vert \bar{z} - \hat{z}\right\Vert _{X \times Y}&= \left\Vert \Phi (\bar{z}) - \Phi (\hat{z}) + \Phi (\hat{z}) - \hat{z}\right\Vert _{X \times Y} \\&\le \omega \Delta t \left\Vert \bar{z} - \hat{z}\right\Vert _{X \times Y} + \left\Vert \Phi (\hat{z}) - \hat{z}\right\Vert _{X \times Y}, \end{aligned} \end{aligned}$$

which implies (recall that \(\Phi (\hat{z})\) depends continuously on \(\Delta t\))

$$\begin{aligned} \left\Vert \bar{z} - \hat{z}\right\Vert _{X \times Y} \le \frac{1}{1 - \omega \Delta t} \left\Vert \Phi (\hat{z}) - \hat{z}\right\Vert _{X \times Y} \rightarrow 0 \quad \text {for } \Delta t \rightarrow 0. \end{aligned}$$

This finally proves assertion 3. \(\square \)
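The contraction \(\Phi\) from the proof can be implemented directly. The following sketch applies it to a small quadratic toy problem (all problem data are our own choices, with \(\Delta t < 1/\omega \)) and verifies that the resulting fixed point solves the projected backward Euler system (21).

```python
import numpy as np

# Toy data: phi(x) = 1/2 ||x - g||^2, c(x) = A x - b, C = nonnegative
# orthant (so P_C clips at zero); rho and dt are chosen with dt < 1/omega.
g = np.array([2.0, -1.0])
A = np.array([[1.0, 1.0]])
b = np.array([1.0])
rho, dt = 0.1, 0.2
xh, yh = np.zeros(2), np.zeros(1)      # reference point (xhat, yhat)

def grad_x_L(x, y):                    # nabla_x L^rho(x, y)
    return (x - g) + A.T @ (y + rho * (A @ x - b))

def Phi(x, y):                         # fixed-point map from the proof
    return np.maximum(0.0, xh - dt * grad_x_L(x, y)), yh + dt * (A @ x - b)

x, y = xh.copy(), yh.copy()
for _ in range(200):                   # linear convergence with rate omega*dt
    x, y = Phi(x, y)

# residuals of the projected backward Euler system (21)
r1 = x - np.maximum(0.0, xh - dt * grad_x_L(x, y))
r2 = y - yh - dt * (A @ x - b)
```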

The artificial introduction of the variable w in (22) allows a lifting of the dual regularization term \(y - \hat{y}\) in the backward Euler system (21) onto primal variables. From a linear algebra perspective, this can be understood as a Schur complement approach, as we see in the following example.

Example 2

(A quadratic program) Let \(C = X\), \(\rho = 0\), \(\Delta t > 0\), \(c(x) = A x - b\), and \(\phi (x) = \frac{1}{2} \left( x, H x\right) _{X} - \left( g, x\right) _{X}\) for \(A \in \mathcal {L}(X, Y)\), \(H = H^{\star } \in \mathcal {L}(X, X)\), \(g \in X\), and \(b \in Y\). The necessary optimality conditions of the homotopy problem (22) are then equivalent to the linear system

$$\begin{aligned} \begin{pmatrix} \lambda \mathrm {I}_{Y} &{}\quad 0 &{}\quad \lambda \mathrm {I}_{Y} \\ 0 &{} \quad H + \lambda \mathrm {I}_{X} &{}\quad A^{\star }\\ \lambda \mathrm {I}_{Y} &{}\quad A &{} \quad 0 \end{pmatrix} \begin{pmatrix} \bar{w}\\ \bar{x}\\ \bar{y} \end{pmatrix} = \begin{pmatrix} \lambda \hat{y}\\ \lambda \hat{x} + g\\ b \end{pmatrix}. \end{aligned}$$

If we eliminate \(\bar{w}\) with a Schur complement approach, we obtain the backward Euler system (21) as a primal-dual regularization of the original saddle point system for (1) according to

$$\begin{aligned} \left[ \begin{pmatrix} H &{}\quad A^{\star }\\ A &{}\quad 0 \end{pmatrix} + \lambda \begin{pmatrix} \mathrm {I}_{X} &{}\quad 0\\ 0 &{}\quad -\,\mathrm {I}_{Y} \end{pmatrix} \right] \begin{pmatrix} \bar{x}\\ \bar{y} \end{pmatrix} = \begin{pmatrix} g + \lambda \hat{x}\\ b - \lambda \hat{y} \end{pmatrix}. \end{aligned}$$
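The elimination can be verified numerically on a random instance (a sketch with data of our own choosing; \(H\) is made positive definite for simplicity so that both systems are uniquely solvable).

```python
import numpy as np

# Random instance of the QP in Example 2; H is SPD so that the saddle
# point systems are nonsingular for lam > 0.
rng = np.random.default_rng(0)
n, m, lam = 4, 2, 0.7
M = rng.standard_normal((n, n))
H = M @ M.T + np.eye(n)
A = rng.standard_normal((m, n))
g, b = rng.standard_normal(n), rng.standard_normal(m)
xh, yh = rng.standard_normal(n), rng.standard_normal(m)
In, Im = np.eye(n), np.eye(m)
Zmn, Znm, Zmm = np.zeros((m, n)), np.zeros((n, m)), np.zeros((m, m))

# full 3x3 block system in (w, x, y)
K3 = np.block([[lam * Im, Zmn, lam * Im],
               [Znm, H + lam * In, A.T],
               [lam * Im, A, Zmm]])
rhs3 = np.concatenate([lam * yh, lam * xh + g, b])
w, x, y = np.split(np.linalg.solve(K3, rhs3), [m, m + n])

# Schur complement in w: the regularized 2x2 saddle point system
K2 = np.block([[H + lam * In, A.T], [A, -lam * Im]])
rhs2 = np.concatenate([g + lam * xh, b - lam * yh])
x2, y2 = np.split(np.linalg.solve(K2, rhs2), [n])
```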

We can derive two interesting equivalent reformulations of (22). The first reformulation substitutes \(v = \sqrt{\lambda } w\), from which we obtain

$$\begin{aligned} \begin{aligned}&\min ~ \phi ^{\rho }(x) + \frac{\lambda }{2} \left\Vert x - \hat{x}\right\Vert _{X}^2 + \frac{1}{2} \left\Vert v - \sqrt{\lambda } \hat{y}\right\Vert _{Y}^2 ~~\text {over}~v \in Y, x \in C\\&\text {subject to}\,~ c(x) + \sqrt{\lambda } v = 0. \end{aligned} \end{aligned}$$
(25)

The advantage of (25) over (22) is that the optimal v is also uniquely determined for \(\lambda = 0\). The second reformulation completely eliminates \(w = -\Delta t c(x)\). This leads to the problem

$$\begin{aligned} \min ~ \phi ^{\rho }(x) + \frac{\lambda }{2} \left\Vert x - \hat{x}\right\Vert _{X}^2 + \frac{1}{2} \left\Vert \sqrt{\lambda } \hat{y} + \sqrt{1/\lambda } c(x)\right\Vert _{Y}^2 ~~\text {over}~ x \in C, \end{aligned}$$

which has no equality constraint and might allow for the application of projected Newton/gradient methods similar to, e.g., [14, 16, 38].

The homotopy problem (22) and Theorem 4 provide a complementary interpretation of using projected backward Euler steps (21) for the gradient flow Eq. (8): we trace the solutions of (22) from some primal-dual starting point \((0, \hat{x}, \hat{y})\) as a continuation in \(\lambda \) until the homotopy breaks down. The result yields an update for \((\hat{x}, \hat{y})\) and we can repeat the procedure. If, at some point, we are able to drive \(\lambda \) to zero, we can solve the original problem (1) with superlinear local convergence rate by means of a locally superlinearly convergent method for the homotopy problem (22), e.g., a semismooth Newton method. If it is never possible to drive \(\lambda \) to zero, we at least follow the gradient flow (8) with a projected backward Euler method with stepsize \(1/\lambda \). If we fix \(\lambda \) to some positive value, we obtain a locally linear convergence rate provided that the gradient flow converges exponentially.

5 Numerical case study in PDE constrained optimization

We apply the proposed method to the following benchmark problem adapted from [42]: Let \(\Omega \subset \mathbb {R}^2\) be a bounded domain with Lipschitz boundary and let constants \(a, b, \gamma > 0\), control bounds \(q_{\mathrm{l}}, q_{\mathrm{u}} \in L^{r}(\Omega ), r \in (2, \infty ]\), and a target function \(u_{\mathrm{d}} \in L^2(\Omega )\) be given. We solve the control-constrained quasilinear elliptic optimal control problem

$$\begin{aligned} \begin{aligned}&\min ~\frac{1}{2} \int _{\Omega } \left|u - u_{\mathrm{d}}\right|^2 + \frac{\gamma }{2} \int _{\Omega } \left|q\right|^2 \quad \text {over } u \in H^1_0(\Omega ), q \in L^2(\Omega )\\&\text {subject to} ~ \nabla \cdot \left( \left[ a + b \left|u\right|^2\right] \nabla u\right) = q,\\&q_{\mathrm{l}} \le q \le q_{\mathrm{u}}. \end{aligned} \end{aligned}$$
(26)

In contrast to [42], we additionally include pointwise control bounds. For smaller values of a and \(\gamma \), problem (26) becomes increasingly ill-conditioned, while the effects of nonlinearity become more challenging for larger values of b.

To transform problem (26) into the form (1), we use the variables \(x = (u, q) \in X = U \times Q = H^1_0(\Omega ) \times L^2(\Omega )\), \(y \in Y = U^{*} = H^{-1}(\Omega )\) and define the closed convex set

$$\begin{aligned} C = U \times \left\{ q \in Q \mid q_{\mathrm{l}} \le q \le q_{\mathrm{u}} \right\} =: U \times C_{Q} \subset U \times Q = X \end{aligned}$$

and the functions \(\phi : X \rightarrow \mathbb {R}\) and \(c: X \rightarrow Y\) via

$$\begin{aligned} \phi ((u,q))&= \frac{1}{2} \int _{\Omega } \left|u - u_{\mathrm{d}}\right|^2 + \frac{\gamma }{2} \int _{\Omega } \left|q\right|^2\\ \left\langle c((u,q)), \varphi \right\rangle _{U^{*}, U}&= \int _{\Omega } \nabla \varphi \cdot \left[ a + b \left|u\right|^2 \right] \nabla u - \int _{\Omega } \varphi q \quad \text {for all } \varphi \in U, \end{aligned}$$

where c is the weak form of the PDE in (26). The problem has a continuously Fréchet-differentiable solution operator \(S: Q \rightarrow U\) in the sense of Lemma 4 [17].

5.1 Implementation aspects

From an implementation point of view, the projected backward Euler system (21) with all its required derivatives can be conveniently generated by the use of the Unified Form Language [2, 4] in combination with Algorithmic Differentiation [25], as it is implemented in the DOLFIN/FEniCS project [3, 39,40,41].

When evaluating the augmented objective \(\phi ^{\rho }(x) = \phi (x) + \frac{\rho }{2} \left\Vert c(x)\right\Vert _{Y}^2\), the inner product \(\left( y, c(x)\right) _{Y}\), or the dual proximal term in (22), we face the problem of computing norms and inner products in \(Y=H^{-1}(\Omega )\), which we can facilitate computationally with the use of the Riesz isomorphism \(\left\Vert y\right\Vert _{Y} = \left\Vert R_{U} y\right\Vert _{U}\). If we choose the norm \(\left\Vert u\right\Vert _{U} = \left\Vert \nabla u\right\Vert _{L^2(\Omega )^2}\) on U, the evaluation of \(R_{U} y\) boils down to one solution of a Poisson problem with right-hand side y and homogeneous Dirichlet boundary conditions. The difficulty from a computational vantage point is that \(R_{U}\) is a large dense matrix in contrast to its inverse \(R_{U}^{-1}\), which is a sparse finite element stiffness matrix. For practical purposes, we always work with the Riesz representation of the dual variable \(y_{R} = R_{U} y\) directly, eliminating the need for evaluating the Riesz isomorphism for the dual variables.
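The Riesz strategy can be illustrated with a one-dimensional finite-difference sketch (the discretization and the test functional are our own choices, not those of Sect. 5.3): a single Poisson solve with the sparse stiffness matrix yields the representer, and the dual norm follows from a duality pairing without ever forming the dense \(R_{U}\).

```python
import numpy as np

# P1 stiffness matrix on (0,1) with homogeneous Dirichlet conditions,
# for the norm ||u||_U = ||u'||_{L2}; stored densely here only for brevity.
N = 99
h = 1.0 / (N + 1)
K = (2.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)) / h

# a dual element y given through its load vector <y, phi_i>; we take the
# (lumped) L2 functional induced by sin(pi x) as an illustration
xi = np.linspace(h, 1.0 - h, N)
load = h * np.sin(np.pi * xi)

u_R = np.linalg.solve(K, load)   # Riesz representer R_U y: one Poisson solve
norm_y = np.sqrt(load @ u_R)     # ||y||_{H^-1} = sqrt(<y, R_U y>)
```

For this functional the exact dual norm is \(1/(\pi \sqrt{2})\), which the discrete value matches up to the discretization error.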

From a linear algebra point of view, it is important to exploit the special structure of the augmentation term \(\frac{\rho }{2} \left\Vert c(x)\right\Vert _{Y}^2\). We extend a well-known argument for the special case of \(\lambda = 0\) (see, e.g., [36, p. 158f]) to the case \(\lambda \ge 0\): For fixed (xy), let us denote the gradients and the second derivative of the augmented Lagrangian \(L^{\rho }(x, y)\) by

$$\begin{aligned} \nabla _{x} L^{\rho }(x, y)&= \nabla _{x} L^{0}(x, y + \rho c(x)) =: F_1, \quad \nabla _{y} L^{\rho }(x, y) = c(x) =: F_2,\\ \nabla _{xx} L^{\rho }(x, y)&= \nabla _{xx} L^{0}(x, y + \rho c(x)) + \rho \nabla c(x) c'(x) =: H + \rho A^{\star } A. \end{aligned}$$

Disregarding inequalities for a moment, each Newton step for the (appropriately scaled) backward Euler equation (21) requires us to solve the linear system

$$\begin{aligned} \begin{pmatrix} \lambda \mathrm {I}_{X} + H + \rho A^{\star } A &{}\quad A^{\star }\\ A &{}\quad -\,\lambda \mathrm {I}_{Y} \end{pmatrix} \begin{pmatrix} \delta x\\ \delta y \end{pmatrix} = - \begin{pmatrix} F_1 + \lambda (x - \hat{x})\\ F_2 - \lambda (y - \hat{y}) \end{pmatrix}. \end{aligned}$$
(27)

The problem here is that \(A^{\star }A = R_{X} A^{*} R_{Y}^{-1} A = R_{X} A^{*} R_{U} A\) becomes a dense matrix after discretization by finite elements due to \(R_{U}\). Hence, we must avoid the formation of \(A^{\star } A\). Instead of (27) we solve the equivalent system

$$\begin{aligned} \begin{pmatrix} \lambda \mathrm {I}_{X} + H &{}\quad A^{\star }\\ A &{}\quad -\,(1 + \rho \lambda )^{-1} \lambda \mathrm {I}_{Y} \end{pmatrix} \begin{pmatrix} \delta x\\ \delta \tilde{y} \end{pmatrix} = - \begin{pmatrix} F_1 + \lambda (x - \hat{x})\\ (1 + \rho \lambda )^{-1} \left( F_2 - \lambda (y - \hat{y}) \right) \end{pmatrix} \end{aligned}$$
(28)

with the reconstruction \(\delta y = (1 + \rho \lambda )^{-1} (\delta \tilde{y} + \rho F_2)\). The equivalence can easily be checked. Because we work with \(y_{R} = R_{U} y\) directly, we need to compute the Riesz representation \(c_{R} = R_{U} c(x)\) first, evaluate the Lagrangian derivatives at \((x, y_{R} + \rho c_{R})\), solve the unaugmented Newton system (28) (reformulated for \(y_{R}\) instead of y) for \((\delta x, \delta \tilde{y}_{R})\), and finally reconstruct \(\delta y_{R} = (1 + \rho \lambda )^{-1} ( \delta \tilde{y}_{R} + \rho c_{R})\).
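The equivalence of (27) and (28) can be checked numerically on random data (our sketch; here \(\delta y\) is recovered directly from the defining substitution \(\delta \tilde{y} = \delta y + \rho A \delta x\)).

```python
import numpy as np

# Random instance; Hs plays the role of H, prx = lam*(x - xhat) and
# pry = lam*(y - yhat) are the proximal residual terms.
rng = np.random.default_rng(1)
n, m, lam, rho = 5, 2, 0.3, 0.7
M = rng.standard_normal((n, n))
Hs = M @ M.T                      # SPD so that both systems are nonsingular
A = rng.standard_normal((m, n))
F1, F2 = rng.standard_normal(n), rng.standard_normal(m)
prx, pry = rng.standard_normal(n), rng.standard_normal(m)
In, Im = np.eye(n), np.eye(m)

# augmented system (27)
K27 = np.block([[lam * In + Hs + rho * (A.T @ A), A.T], [A, -lam * Im]])
r27 = -np.concatenate([F1 + prx, F2 - pry])
dx, dy = np.split(np.linalg.solve(K27, r27), [n])

# unaugmented system (28)
mu = 1.0 / (1.0 + rho * lam)
K28 = np.block([[lam * In + Hs, A.T], [A, -mu * lam * Im]])
r28 = -np.concatenate([F1 + prx, mu * (F2 - pry)])
dx2, dyt = np.split(np.linalg.solve(K28, r28), [n])
dy2 = dyt - rho * (A @ dx2)       # delta_y_tilde = delta_y + rho*A*delta_x
```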

The enforcement of the projection onto C in (21) can be easily implemented on top of (28): Let us consider the block row corresponding to the gradient with respect to q in (21) scaled by \(\lambda \), which reads

$$\begin{aligned} 0&= \lambda q - \lambda P_{C_{Q}} \left( \hat{q} - \Delta t \nabla _{q} L^{\rho }( (u,q), y) \right) \\&= \lambda q - \lambda P_{C_{Q}} \left( \hat{q} - \Delta t \left[ \gamma q - R_{U} (y + \rho c(x)) \right] \right) . \end{aligned}$$

This nonsmooth equation together with the remaining smooth block rows of (21) scaled by \(\lambda \) can be solved efficiently with a semismooth Newton method. To this end, we need to address a norm gap for the pointwise defined projector

$$\begin{aligned} P_{C_{Q}}(q)(\xi ) = \max (q_{\mathrm{l}}(\xi ), \min (q(\xi ), q_{\mathrm{u}}(\xi ))) \quad \text {for } \xi \in \Omega , \end{aligned}$$

which is known to be semismooth only if it maps from \(L^{r}(\Omega ) \subsetneq Q\) to \(Q = L^2(\Omega )\) (see, e.g., [57, Sect. 3.3] or [31, Theorem 4.2]). Indeed, this higher regularity holds here if the initial guess satisfies \(q_0 \in L^{r}(\Omega )\): For problem (26), the Q part of the projected backward Euler equation (21) simplifies to \(q = P_{C_{Q}} \left( \hat{q} - \Delta t \left[ \gamma q - R_{U} (y + \rho c(x)) \right] \right) .\) By induction, we can assume that \(q, \hat{q} \in L^{r}(\Omega )\). Then, the argument of the projection operator \(P_{C_{Q}}\) also lies in \(L^{r}(\Omega )\), because \(R_{U} [y + \rho c(x)] \in H^{1}_{0}(\Omega )\), which is continuously embedded in \(L^{r}(\Omega )\). Because \(q_{\mathrm{l}}, q_{\mathrm{u}} \in L^{r}(\Omega )\), we obtain \(q = P_{C_{Q}}(.) \in L^{r}(\Omega )\), which completes the induction step.
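A discrete sketch of such a semismooth Newton step for the Q part alone (hypothetical one-dimensional data with constant bounds; the term \(R_{U}(y + \rho c(x))\) is frozen into a fixed vector g for illustration) shows the typical finite active-set convergence.

```python
import numpy as np

# Solve q = P_{C_Q}(qhat - dt*(gamma*q - g)) componentwise by a
# semismooth Newton method; np.clip plays the role of the projector.
n = 50
gamma, dt = 1e-2, 10.0
ql, qu = -1.0, 1.0
g = np.linspace(-2.0, 2.0, n)          # stand-in for R_U(y + rho*c(x))
qhat = np.zeros(n)

q = qhat.copy()
for _ in range(20):
    arg = qhat - dt * (gamma * q - g)
    F = q - np.clip(arg, ql, qu)       # semismooth residual
    if np.linalg.norm(F) < 1e-12:
        break
    inside = (arg > ql) & (arg < qu)   # generalized derivative of the clip
    J = 1.0 + dt * gamma * inside      # diagonal generalized Jacobian
    q -= F / J
```

Because the residual is piecewise linear here, the active set settles after a few steps and the iteration terminates with an (almost) exact root.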

5.2 Solution algorithm

Algorithm 1 (pseudocode; figure not reproduced)

We provide in Algorithm 1 pseudocode for a prototypical implementation of the sequential homotopy method with a classical continuation approach. It consists of an outer loop over the subsequent homotopies. In the inner loop, the reference point \(\hat{z} = (\hat{x}, \hat{y})\) is fixed and we trace the solution of (21) with one semismooth Newton step followed by one simplified semismooth Newton step.

The computationally heavy part is the computation of \(z^{+}\) in line 5 by one local semismooth Newton step at z and of \(z^{++}\) in line 6 by one local simplified semismooth Newton step at \(z^{+}\). Here, simplified means that the system matrix of the previous semismooth Newton system is reused, subject to modifications concerning the current active set guess derived from the residual evaluated at \(z^{+}\). We accept an iterate for the current value of \(\lambda \) if the following natural monotonicity test is satisfied in line 7: We require that the simplified semismooth Newton increment is smaller in norm than a contraction factor \(\Theta \in (0, 1)\) times the semismooth Newton increment.

If the monotonicity test fails, we enlarge \(\lambda \) by a constant factor to drive the solution of (21) closer to \(\hat{z}\) in order to eventually enter the region of local superlinear convergence of the semismooth Newton method.

If the monotonicity test is satisfied, we accept \(z^{++}\) as the new iterate. If \(\lambda \) and the norm of the outer loop increment \(z - \hat{z}\) are small enough, then we terminate with the solution z, otherwise we predict a new stepsize which should eventually drive \(\lambda \) close to zero. We then commence the next outer iteration.
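The loop structure just described can be sketched as follows (schematic only: the solver callbacks are user-supplied assumptions, the \(\lambda \) prediction is simplified to a plain halving in place of the PI controller, and the smooth quadratic toy problem at the end is our own).

```python
import numpy as np

# Schematic driver for the sequential homotopy method (Algorithm 1).
def sequential_homotopy(z0, newton_step, simplified_step, Theta=0.9,
                        lam_inc=2.0, lam_term=1e-8, tol=1e-8,
                        lam=1.0, max_outer=100):
    zhat = z0.copy()
    for _ in range(max_outer):
        z = zhat.copy()
        while True:
            z_p = newton_step(z, zhat, lam)          # one semismooth Newton step
            z_pp = simplified_step(z_p, zhat, lam)   # one simplified step
            if np.linalg.norm(z_pp - z_p) <= Theta * np.linalg.norm(z_p - z):
                break                                 # monotonicity test passed
            lam *= lam_inc                            # failed: stay closer to zhat
        if lam <= lam_term and np.linalg.norm(z_pp - zhat) <= tol:
            return z_pp                               # converged
        zhat = z_pp                                   # accept new reference point
        lam = max(lam / 2.0, 1e-12)                   # stand-in for PI prediction
    return zhat

# Smooth unconstrained toy problem: phi(z) = 1/2 z'Qz - g'z, no c, C = X,
# so the backward Euler step (21) is the linear solve below and the
# simplified step changes nothing.
Q = np.diag([2.0, 1.0])
gvec = np.array([1.0, 1.0])
newton = lambda z, zhat, lam: np.linalg.solve(np.eye(2) + Q / lam, zhat + gvec / lam)
zsol = sequential_homotopy(np.zeros(2), newton, lambda zp, zhat, lam: zp)
```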

There are many possibilities to predict the next \(\lambda \) after acceptance of the current iterate. For the numerical results below, we use a heuristic motivated by a discrete proportional-integral (PI) controller: We try to choose \(\lambda \) such that the contraction factor \(\theta = \left\Vert z^{++} - z^{+}\right\Vert _{Z} / \left\Vert z^{+} - z\right\Vert _{Z}\) is close to a given reference \(\theta _{\mathrm{ref}} \in (0, 1)\). We choose to predict \(\lambda \leftarrow \lambda / \lambda _{\mathrm{mod}}\), where \(\log \lambda _{\mathrm{mod}}\) is the manipulated variable. To this end, let \(e = \log \theta _{\mathrm{ref}} - \log \theta \) and let I denote the sum of all previous values of e over the last successful outer loops. We then set with some constants \(K_{P}\) and \(K_{I}\)

$$\begin{aligned} \log \lambda _{\mathrm{mod}} \leftarrow K_{P} e + K_{I} I. \end{aligned}$$

In each accepted iteration, we have the simple update \(I \leftarrow I + e\). In case the monotonicity test fails, we possibly reset the integral term \(I \leftarrow \min (I, 0)\). We can also clip \(\lambda \) at a lower bound \(\lambda _{\mathrm{min}}\). For a related concept in the stepsize control of one-step methods for ordinary differential equations we refer to [27, p. 28ff].
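A minimal sketch of this PI update (with the constants from Sect. 5.3; the surrounding bookkeeping is our own):

```python
import math

# PI controller for the next homotopy parameter lambda; constants as in
# Sect. 5.3. state["I"] accumulates the control error over accepted loops.
K_P, K_I, theta_ref, lam_min = 0.2, 0.005, 0.5, 1e-12

def predict_lambda(lam, theta, state):
    e = math.log(theta_ref) - math.log(theta)   # control error
    state["I"] += e                             # integral term update
    lam_mod = math.exp(K_P * e + K_I * state["I"])
    return max(lam / lam_mod, lam_min)

def on_monotonicity_failure(state):
    state["I"] = min(state["I"], 0.0)           # possibly reset integral term

state = {"I": 0.0}
# contraction faster than theta_ref = 0.5, so lambda is driven further down
lam = predict_lambda(1.0, 0.25, state)
```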

It is also possible to keep all iterates inside C with an additional projection in the local semismooth Newton step (see, e.g., [57]). We found the method to require fewer iterations on (26) without projection steps, even though we are aware that if \(z \not \in C\), we might run into problems with the monotonicity test in line 7 of Algorithm 1 because \(\left\Vert z^{+} - z\right\Vert _{Z}\) might not tend to 0 for \(\lambda \rightarrow \infty \).

As an alternative to Algorithm 1, it is conceivable to update the reference point \(\hat{z}\) less frequently and to trace each homotopy leg until it nearly breaks down in a singularity. In our experience, this approach of long homotopy legs leads to a more complicated algorithm and requires the solution of more, and worse conditioned, linear systems. We prefer the sequential homotopy method with short homotopy legs in the form of Algorithm 1.

5.3 Numerical results

We apply Algorithm 1 to problem (26) on \(\Omega = (0,1)^2\) with the target state \(u_{\mathrm{d}}(\xi ) = 12 (1-\xi _1) \xi _1 (1-\xi _2) \xi _2\) from [42] and control bounds

$$\begin{aligned} q_{\mathrm{l}}(\xi ) = -50, \qquad q_{\mathrm{u}}(\xi ) = \min \left( 50, 800 \max \left( \left( \xi _1-\tfrac{1}{2} \right) ^2, \left( \xi _2-\tfrac{1}{2} \right) ^2 \right) \right) \end{aligned}$$

for the parameters \(a = 10^{-p}\), \(b = 10^{p}\) with \(p = 0, \dotsc , 5\), using continuous piecewise linear (P1) finite elements on regular triangular grids with \(N = 64, 128, 256, 512\) elements along each side of the unit square.
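For illustration, the problem data can be evaluated on a nodal grid as in the following sketch. The function name and the plain nodal grid layout are our assumptions; the actual experiments use a P1 finite element discretization.

```python
import numpy as np

def problem_data(N, p):
    """Target state and control bounds of problem (26) on Omega = (0,1)^2,
    evaluated at the (N+1) x (N+1) grid nodes, plus the parameters a, b."""
    xi1, xi2 = np.meshgrid(np.linspace(0.0, 1.0, N + 1),
                           np.linspace(0.0, 1.0, N + 1), indexing="ij")
    u_d = 12.0 * (1.0 - xi1) * xi1 * (1.0 - xi2) * xi2     # target state
    q_l = np.full_like(u_d, -50.0)                         # lower control bound
    q_u = np.minimum(50.0, 800.0 * np.maximum((xi1 - 0.5) ** 2,
                                              (xi2 - 0.5) ** 2))
    a, b = 10.0 ** (-p), 10.0 ** p                         # problem parameters
    return u_d, q_l, q_u, a, b
```

The upper bound \(q_{\mathrm{u}}\) vanishes at the center of the square and saturates at 50 towards the boundary, which makes the constraint active in a neighborhood of the center.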

We perform Algorithm 1 with the initial guess \(z_0 = 0\) and the parameters \(\Theta = 0.9\), \(\lambda _{\mathrm{term}} = 10^{-8}\), \(\lambda _{\mathrm{inc}} = 2\), and \(\mathrm {TOL} = 10^{-8}\). We fix the choice of the penalty parameter to \(\rho = 0.1\). For the stepsize PI controller, we set \(\theta _{\mathrm{ref}} = 0.5\), \(K_{P} = 0.2\), \(K_{I} = 0.005\), and \(\lambda _{\mathrm{min}} = 10^{-12}\). Figures 1 and 2 depict the resulting optimal controls and states.

Fig. 1

Optimal controls for problem (26) with \(a = 10^{-p}\) and \(b = 10^{p}\) for \(p = 0, 1, \dotsc , 5\) from left to right. The lower bound at \(-50\) is only active for \(p = 2\) (deep blue) (color figure online)

Fig. 2

Optimal states for problem (26) with \(a = 10^{-p}\) and \(b = 10^{p}\) for \(p = 0, 1, \dotsc , 5\) from left to right (color figure online)

Table 1 Comparison of the sequential homotopy method of Algorithm 1 with a nonlinear VI solver with backtracking (bt) and error-oriented monotonicity test (nleqerr) for different instances of problem (26) and varying discretizations (N)

We compare the sequential homotopy method of Algorithm 1 with a nonlinear VI solver described in [13, 45] and implemented in the production quality software package PETSc [11, 12]. For a fair comparison, we use the direct solver MUMPS [6, 7] for the solution of the linear systems in both approaches. The use of inexact linear algebra solvers poses no conceptual problem, as long as they yield a locally convergent nonlinear iteration. The efficiency of iterative linear algebra methods, however, depends crucially on the use of suitable structure-exploiting preconditioners. This topic exceeds the scope of this paper and is the subject of future research.

For the VI solver, we consider two implemented globalization strategies, a backtracking line search (bt) and an error-oriented monotonicity test (nleqerr). As it turns out, the VI solver did not solve any of the problem instances when started from the initial guess \(z_0 = 0\), failing either by raising an error or by reaching the limit of 5000 residual evaluations, even for a reduced termination tolerance of \(10^{-5}\) on the \(l^{\infty }\)-norm of the residuals. Some problem instances could be solved successfully after dropping the lower control bound, which is only active for \(a = 10^{-2}\), \(b = 10^{2}\). In some of these instances, the residual norm stalled between \(10^{-5}\) and \(10^{-8}\).

We compare in Table 1 the sequential homotopy method of Algorithm 1 (with a sharper termination tolerance of \(10^{-8}\) on the Z-norm of the homotopy increment and both upper and lower bounds) to the VI approach with the reduced termination tolerance as above and only upper bounds. We observe that the sequential homotopy method solves all problem instances with mesh-independent convergence (subject to some fluctuation for the worse conditioned problems). The VI approach with backtracking is faster for the less demanding instances but fails for the more demanding ones. The VI approach with error-oriented monotonicity test solves at least two of the more demanding instances successfully, although only one with an efficiency comparable to the sequential homotopy method.

Fig. 3

The projected backward Euler step norms \(\left\Vert z - \hat{z}\right\Vert _{Z}\) for (26) with \(a = 10^{-2}\), \(b = 10^{2}\) on different meshes plotted with respect to the flow time t, which is the sum of all accepted step sizes \(\Delta t = 1/\lambda \) (color figure online)

In Fig. 3, we see that even though slightly different numbers of iterations (depicted with markers) are performed on different meshes for the case \(a = 10^{-2}\), \(b=10^{2}\), roughly the same flow time of \(10^{11}\) has to be traversed to reach the required tolerance of \(\mathrm {TOL}=10^{-8}\). We also see that the stepsizes \(\Delta t\) eventually become very large and lead to superlinear convergence. This is the typical numerical behavior of the sequential homotopy method on all considered instances. For \(N=128\) some extra steps are carried out around \(t=10^{5}\) and \(t=10^{7}\).
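The flow time on the abscissa of Fig. 3 is simple bookkeeping over the accepted steps; a minimal sketch (function and variable names are ours):

```python
def flow_times(accepted_lambdas):
    """Cumulative flow time t_k = sum of accepted stepsizes Delta t = 1/lambda,
    given the lambda values of the accepted outer iterations in order."""
    t, times = 0.0, []
    for lam in accepted_lambdas:
        t += 1.0 / lam          # each accepted step advances the flow by 1/lambda
        times.append(t)
    return times
```

A shrinking sequence of accepted \(\lambda \) values thus produces the rapidly growing flow times visible at the right end of Fig. 3.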

6 Summary

We provided sufficient conditions for the existence of global solutions to the projected gradient/antigradient flow (8) and showed that critical points with emanating descent curves cannot be asymptotically stable and are thus not attracting for the flow. We applied projected backward Euler timestepping to derive the necessary optimality conditions of a primal-dual proximally regularized counterpart (22) of (1). The regularized problem can be solved by a homotopy method, giving rise to a sequence of homotopy problems. The sequential homotopy method can be used to globalize any locally convergent optimization method that can be employed efficiently in a homotopy framework. The sequential homotopy method with a local semismooth Newton solver outperforms state-of-the-art VI solvers on a challenging class of PDE-constrained optimization problems with control constraints.