Abstract
In this paper, we consider nonlinear optimization problems with nonlinear equality constraints and bound constraints on the variables. Many augmented Lagrangian methods have been proposed in the literature for the solution of such problems. Here, we propose to modify one of these algorithms, namely ALGENCAN by Andreani et al., so as to incorporate second-order information into the augmented Lagrangian framework, using an active-set strategy. We show that the overall algorithm retains the convergence properties of ALGENCAN and attains an asymptotic quadratic convergence rate under suitable assumptions. The numerical results confirm that the proposed algorithm is a viable alternative to ALGENCAN, with greater robustness.
1 Introduction
In this paper, we are interested in the solution of smooth constrained optimization problems of the type:
where \(x,\ell ,u\in \Re ^n\), \(\ell _i < u_i\) for all \(i=1,\dots ,n\), and \(f :\Re ^n\rightarrow \Re \), \(h :\Re ^n\rightarrow \Re ^p\) are twice continuously differentiable functions. Note that the structure of Problem (1) is general enough to also capture, through reformulation, problems with nonlinear inequality constraints. Problem (1) has been studied for decades, and many optimization methods have been proposed for its solution. Solution algorithms for (1) belong to different classes, e.g., sequential penalty [18], augmented Lagrangian [4] and sequential quadratic programming [21] methods.
Among the algorithms based on augmented Lagrangian functions, the one implemented in the ALGENCAN [2, 3] software package is one of the most recent and efficient. The computationally heavy part of ALGENCAN is the solution (at every outer iteration) of the subproblem, i.e., the minimization of the augmented Lagrangian merit function for given values of the penalty parameter and of the estimated Lagrange multipliers. This minimization is carried out by the inner solver GENCAN [5].
It is worth noticing that, besides the above methods, efficient local algorithms have been proposed in the literature that exploit second-order information to define superlinearly convergent Newton-like methods [4, 13, 16]. The so-called acceleration strategy of ALGENCAN is an attempt to exploit second-order information by means of such locally convergent methods to improve the convergence rate of the overall algorithm.
The idea that we develop in this paper is twofold. On the one hand, we propose an alternative and possibly more extensive way to use second-order information within the framework of an augmented Lagrangian algorithm. Basically, we propose a Newton-type direction to be used even when potentially far away from solution points. The use of such a Newton direction is combined with an appropriate active-set strategy. In particular, after estimating the active and nonactive variables with respect to the bound constraints, we compute the Newton direction with respect to only the variables estimated as nonactive, while the ones estimated as active are set to their bounds.
On the other hand, when the Newton-type direction cannot be computed or does not satisfy a proper condition, we propose to resort to the minimization of the augmented Lagrangian function, but using an efficient active-set method for bound-constrained problems [11].
The paper is organized as follows. In Sect. 2, we report some preliminary results that will be useful in the paper. In Sect. 3, we describe the procedure to compute the Newton-type direction and we study its theoretical properties. Section 4 is devoted to the description of the proposed augmented Lagrangian algorithm and to its convergence analysis. In Sect. 5, we are concerned with the analysis of the convergence rate of the proposed method. In Sect. 6, we report some numerical experiments and comparisons with existing software. Finally, in Sect. 7 we draw some conclusions.
2 Notation and Preliminary Results
Given a vector \(x \in \mathbb R^n\), we denote by \(x_i\) its i-th entry and, given an index set \(T \subseteq \{1,\ldots ,n\}\), we denote by \(x_T\) the subvector obtained from x by discarding the components not belonging to T. The gradient of a function f(x) is denoted by \(\nabla f(x)\), while the Hessian matrix is denoted by \(\nabla ^2 f(x)\). We indicate by \(\nabla _{x_i} f(x)\) the i-th entry of \(\nabla f(x)\). The Euclidean norm of a vector x is indicated by \(\Vert x\Vert \), while \(\Vert x\Vert _{\infty }\) denotes the sup-norm of x. Given a matrix M, we indicate by \(\Vert M\Vert \) the matrix norm induced by the Euclidean vector norm. The projection of a vector x onto a box [a, b] is denoted by \(\mathcal{P}_{[a,b]}(x)\). The i-th column of the identity matrix is indicated by \(e_i\).
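As a concrete illustration of this notation, the projection and the subvector operation can be mirrored in a few lines of NumPy (an illustrative sketch of our own; note that code indices are 0-based while the text uses 1-based indices):

```python
import numpy as np

def project_box(x, a, b):
    """Projection P_[a,b](x): clip each component of x to [a_i, b_i]."""
    return np.minimum(np.maximum(x, a), b)

x = np.array([3.0, -1.0, 0.5, 2.0])
T = [0, 2]                                  # index set T (0-based here)
x_T = x[T]                                  # subvector x_T
p = project_box(x, np.zeros(4), np.ones(4)) # projection onto [0, 1]^4
```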
With reference to Problem (1), we define the Lagrangian function \(L(x,\mu )\) with respect to the equality constraints as follows:
where \(\mu \in \Re ^p\) is the Lagrange multiplier.
Denoting the gradient of \(L(x,\mu )\) with respect to x as \(\nabla _x L(x,\mu ) = \nabla f(x) + \nabla h(x) \mu \), we say that \((x,\mu ,\sigma ,\rho )\in \Re ^{3n+p}\) is a KKT tuple for Problem (1) if
If \(x^*\) is a local minimum of Problem (1) that satisfies some constraint qualification, then there exist KKT multipliers \(\mu ^*,\sigma ^*,\rho ^*\) such that \((x^*,\mu ^*,\sigma ^*,\rho ^*)\) is a KKT tuple. Note that the KKT conditions (2) can be rewritten as follows:
For a KKT tuple \((x^*,\mu ^*,\sigma ^*,\rho ^*)\), we say that strict complementarity holds if \(x^*_i = \ell _i \Rightarrow \sigma ^*_i > 0\) and \(x^*_i = u_i \Rightarrow \rho ^*_i > 0\), that is, \(x^*_i = \ell _i \Rightarrow \nabla _{x_i} L(x^*,\mu ^*) > 0\) and \(x^*_i = u_i \Rightarrow \nabla _{x_i} L(x^*,\mu ^*) < 0\).
Now, let us define the multiplier functions \(\sigma (x,\mu )\) and \(\rho (x,\mu )\), which provide estimates of the KKT multipliers \(\sigma \) and \(\rho \), respectively, associated with the box constraints of Problem (1). Following the same approach used in [11, 12] for bound-constrained problems, we can first express \(\sigma (x,\mu ) = \nabla _x L(x,\mu ) + \rho (x,\mu )\) from (2a), and then compute \(\rho (x,\mu )\) by minimizing the error over (2c)–(2d) (see [12] for more details), obtaining
These multiplier functions will be employed later for defining an activeset strategy to be used in the proposed algorithm.
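The closed-form expressions (4)–(5) are not reproduced in this excerpt. Carrying out the least-squares step described above (set \(\sigma = \nabla _x L + \rho \) from (2a) and minimize the squared complementarity error in (2c)–(2d) over \(\rho \)) yields the following candidate formulas, which we state as our own reconstruction rather than as the paper's exact definitions:

```python
import numpy as np

def multiplier_functions(grad_L, x, l, u):
    """Candidate multiplier functions sigma(x, mu), rho(x, mu).

    Reconstruction (assumption): with sigma = grad_L + rho from (2a), rho is
    chosen to minimize ((grad_L + rho)*(x - l))**2 + (rho*(u - x))**2
    componentwise, which gives the closed forms below.
    grad_L stands for the gradient nabla_x L(x, mu).
    """
    d = (x - l) ** 2 + (u - x) ** 2        # positive, since l_i < u_i
    rho = -((x - l) ** 2) * grad_L / d     # estimate for the upper bounds
    sigma = ((u - x) ** 2) * grad_L / d    # equals grad_L + rho
    return sigma, rho
```

Note that at \(x_i = \ell _i\) this gives \(\sigma _i = \nabla _{x_i} L\) and \(\rho _i = 0\), and at \(x_i = u_i\) it gives \(\rho _i = -\nabla _{x_i} L\), consistent with the strict-complementarity signs stated above.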
Moreover, now we can say that \((x^*,\mu ^*)\in \Re ^{n+p}\) is a KKT pair for Problem (1) when \((x^*,\mu ^*,\sigma (x^*,\mu ^*),\rho (x^*,\mu ^*))\) is a KKT tuple.
2.1 The Augmented Lagrangian Method
The algorithm we propose here builds upon the augmented Lagrangian method described in [3], where an augmented Lagrangian function is defined with respect to a subset of constraints and iteratively minimized over x subject to the remaining constraints. In our case, we define the augmented Lagrangian function for Problem (1) with respect to the equality constraints as
where \(\epsilon > 0\) is a parameter that penalizes violation of the equality constraints. Given an estimate \((x_k,\bar{\mu }_k)\) of a KKT pair and a value \(\epsilon _k\) for the penalty parameter, the new iterate \(x_{k+1}\) can thus be computed by approximately solving the following bound-constrained subproblem:
Then, according to [3], we can set
and update the Lagrange multiplier \(\bar{\mu }_{k+1}\) by projecting \((\mu _{k+1})_i\) in a suitable interval \([\bar{\mu }_{\text {min}},\bar{\mu }_{\text {max}}]\), \(i = 1,\ldots ,p\), that is,
Finally, we decrease the penalty parameter \(\epsilon _{k+1}\) if the constraint violation is not sufficiently reduced and start a new iteration. We can summarize the method proposed in [3] as in the following scheme.
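The outer iteration (6)–(8) can be sketched as follows. This is an illustrative reading, not the authors' implementation: the augmented Lagrangian is taken as \(L_a(x,\mu ;\epsilon ) = f(x) + \mu ^T h(x) + \Vert h(x)\Vert ^2/\epsilon \) (consistent with the gradient appearing later in Sect. 4), so the first-order update (7) reads \(\mu _{k+1} = \bar{\mu }_k + (2/\epsilon _k)\, h(x_{k+1})\); the inner solver is abstracted as a callable, and `eta`, `gamma` and the safeguard interval are illustrative values:

```python
import numpy as np

def alm_outer(h, solve_subproblem, mu0, eps0=1.0, eta=0.5, gamma=0.1,
              mu_min=-1e6, mu_max=1e6, tol=1e-8, max_iter=200):
    """Outer loop of the augmented Lagrangian scheme (6)-(8) (sketch).

    solve_subproblem(mu_bar, eps) must return an (approximate) minimizer
    of L_a(., mu_bar; eps) over [l, u]; it stands in for the inner solver.
    """
    mu_bar, eps = np.asarray(mu0, dtype=float), float(eps0)
    viol_prev, x = np.inf, None
    for _ in range(max_iter):
        x = solve_subproblem(mu_bar, eps)        # subproblem (6)
        viol = np.linalg.norm(h(x), np.inf)
        # First-order multiplier update (7), consistent with
        # L_a = f + mu^T h + ||h||^2 / eps.
        mu = mu_bar + 2.0 * h(x) / eps
        mu_bar = np.clip(mu, mu_min, mu_max)     # safeguard projection (8)
        if viol <= tol:
            break
        if viol > eta * viol_prev:               # violation not reduced
            eps *= gamma                         # enough: shrink penalty
        viol_prev = viol
    return x, mu_bar
```

On the toy problem \(\min \, x^2\) s.t. \(x = 1\), whose subproblem minimizer is available in closed form, the iterates converge to \(x^* = 1\), \(\mu ^* = -2\).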
In the next section, we will describe how to incorporate the use of a proper second-order direction into this augmented Lagrangian framework.
3 Direction Computation
In this section, we introduce and analyze the procedure for computing a second-order direction, employing a proper active-set estimate.
3.1 Active-Set Estimate
Taking inspiration from the strategy proposed in [16], for any \(x \in [\ell ,u]\) and any \(\mu \in \mathbb R^p\), we can estimate the active constraints in a KKT point by the following sets:
where \(\nu >0\) is a given parameter and the multiplier functions \(\sigma (x,\mu )\), \(\rho (x,\mu )\) are defined in (4) and (5), respectively.
In particular, at a given pair \((x,\mu )\), the sets \(\mathcal{L}(x,\mu )\) and \(\mathcal{U}(x,\mu )\) contain the indices of the variables that are estimated to be active at the lower bound \(\ell _i\) and at the upper bound \(u_i\), respectively, at a KKT point. As will be shown later, at each iteration of the proposed algorithm, these sets are used to compute a Newton direction with respect to only the variables that are estimated as nonactive, while the variables estimated as active are set to their bounds.
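The explicit form of (9)–(10) is not reproduced in this excerpt; the inequalities used later in the proof of Proposition 4.1 suggest the form \(\mathcal{L}(x,\mu ) = \{i : x_i \le \ell _i + \nu \,\sigma _i(x,\mu )\}\) and \(\mathcal{U}(x,\mu ) = \{i : x_i \ge u_i - \nu \,\rho _i(x,\mu )\}\), which we adopt here as an assumption in the following sketch:

```python
import numpy as np

def estimate_active_sets(x, l, u, sigma, rho, nu=1e-6):
    """Active-set estimate in the spirit of (9)-(10) (reconstructed form).

    sigma, rho are the multiplier-function values sigma(x, mu), rho(x, mu);
    nu > 0 is the parameter of the estimate.  Returns index arrays for the
    lower-active set L, the upper-active set U and the nonactive set N.
    """
    L = np.where(x <= l + nu * sigma)[0]
    U = np.where(x >= u - nu * rho)[0]
    mask = np.ones(x.size, dtype=bool)
    mask[L] = False
    mask[U] = False
    N = np.where(mask)[0]
    return L, U, N
```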
Using results from [16], the following identification property of the active-set estimate (9)–(10) holds.
Proposition 3.1
If \((x^*,\mu ^*,\sigma ^*,\rho ^*)\) satisfies the KKT conditions 2, then there exists a neighborhood of \((x^*,\mu ^*)\) such that, for each \((x,\mu )\) in this neighborhood, we have
In particular, if the strict complementarity holds at \((x^*,\mu ^*,\sigma ^*,\rho ^*)\), for each \((x,\mu )\) in this neighborhood we have
The result stated in the above proposition holds for an unknown neighborhood of the optimal solution. It would be of great interest and importance to characterize that neighborhood, in order to bound the maximum number of iterations required by the algorithm to identify the active set. Currently, this is an open problem, and we think it may represent a possible line of future research, for example by adapting the complexity results given for ALGENCAN in [7], or by extending some results on finite active-set identification given in the literature for specific classes of algorithms [8, 10, 22].
3.2 Step Computation
In the proposed algorithm, at the beginning of every iteration k, we have a point \(x_k \in [\ell ,u]\) and Lagrange multiplier estimates \((\bar{\mu }_k)_i \in [\bar{\mu }_{\text {min}},\bar{\mu }_{\text {max}}]\), \(i=1,\ldots ,p\).
Using (9)–(10), we estimate the active and nonactive sets at \((x_k,\bar{\mu }_k)\). Denoting
we can thus partition the vector \(x_k\) as \(x_k = (x_{\mathcal{B}_k}, x_{\mathcal{N}_k})\), reordering its entries if necessary. Let us also denote
while \(\nabla ^2_{xx} L_k\) denotes the Hessian matrix of \(L_k\) with respect to x, and \(\nabla ^2_{\mathcal{N}_k} L_k\) denotes the submatrix obtained from \(\nabla ^2_{xx} L_k\) by discarding the rows and columns whose indices do not belong to \(\mathcal{N}_k\).
Now, consider the following system of equations with unknowns \(x_{\mathcal{N}_k}\) and \(\mu \):
The nonlinear system (12a)–(12b) can be solved iteratively by the Newton method, where the Newton direction is computed by solving the following linear system:
Hence, if a solution \((d_{x_{\mathcal{N}_k}},d_{\mu })\) of (13) exists, we can set
and move from \(((x_k)_{\mathcal{N}_k},\bar{\mu }_k)\) along \(d_k\), then projecting \((x_k)_{\mathcal{N}_k} + d_{x_{\mathcal{N}_k}}\) onto the box \([\ell _{\mathcal{N}_k},u_{\mathcal{N}_k}]\). In particular, we define
and
As for the variables \((x_k)_{\mathcal{B}_k}\), since they are estimated as active, we set them to the bounds. Namely, we define \((\tilde{x}_k)_{\mathcal{B}_k}\) as follows:
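The two steps above — solving a linear system of the form (13) and forming \(\tilde{x}_k\) as in (14)–(15) — can be sketched as follows; the block layout of the system (Hessian block of the nonactive variables coupled with the corresponding rows of \(\nabla h\)) is our reading of (12)–(13):

```python
import numpy as np

def newton_step(H_NN, grad_L_N, Jh_N, h_val):
    """Solve a linear system of the form (13) for (d_xN, d_mu) (sketch).

    H_NN     : Hessian block of L for the nonactive variables, |N| x |N|
    grad_L_N : gradient of L w.r.t. the nonactive variables
    Jh_N     : rows of nabla h(x) with indices in N, |N| x p
    h_val    : constraint values h(x)
    Returns None when the system is singular (no Newton step available).
    """
    nN, p = Jh_N.shape
    K = np.block([[H_NN, Jh_N],
                  [Jh_N.T, np.zeros((p, p))]])
    rhs = -np.concatenate([grad_L_N, h_val])
    try:
        d = np.linalg.solve(K, rhs)
    except np.linalg.LinAlgError:
        return None
    return d[:nN], d[nN:]

def candidate_point(x, L, U, N, d_xN, l, u):
    """Form x_tilde as in (14)-(15): project the Newton step for the
    nonactive variables, set the estimated active ones to their bounds."""
    xt = x.copy()
    xt[N] = np.clip(x[N] + d_xN, l[N], u[N])
    xt[L] = l[L]
    xt[U] = u[U]
    return xt
```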
The following result holds.
Proposition 3.2
If the solution \(d_k\) of system (13) exists, then \((x_k,\bar{\mu }_k,\sigma _k,\rho _k)\) is a KKT tuple with \(\sigma _k = \sigma (x_k,\bar{\mu }_k)\) and \(\rho _k = \rho (x_k,\bar{\mu }_k)\) if and only if \(d_k=0\) and \((\tilde{x}_k)_{\mathcal{B}_k} = (x_k)_{\mathcal{B}_k}\).
Proof
First, assume that \(d_k=0\) and \((\tilde{x}_k)_{\mathcal{B}_k} = (x_k)_{\mathcal{B}_k}\). From (13), we have
Using the expression of \(\mathcal{L}(x_k,\bar{\mu }_k)\) and \(\mathcal{U}(x_k,\bar{\mu }_k)\) given in (9)–(10), and recalling the definition of \(\rho (x,\mu )\) and \(\sigma (x,\mu )\) given in (4)–(5), we also have
It follows that the KKT conditions (2) are satisfied.
Now, assume that \((x_k,\bar{\mu }_k,\sigma _k,\rho _k)\) is a KKT tuple. Since \(\nabla _{x_{\mathcal{N}_k}} L(x_k,\bar{\mu }_k) = 0\) and \(h(x_k) = 0\), from (13) we have \(d_k=0\). Finally, using the KKT conditions written as in (3), and recalling the definition of \(\rho (x,\mu )\) and \(\sigma (x,\mu )\) given in (4)–(5), we also have \((x_k)_i = \ell _i = (\tilde{x}_k)_i\) for all \(i \in \mathcal{L}_k\) and \((x_k)_i = u_i = (\tilde{x}_k)_i\) for all \(i \in \mathcal{U}_k\). \(\square \)
4 The Algorithm
In this section, we use the above-described active-set estimate and Newton strategy to design a primal-dual augmented Lagrangian method.
At the beginning of each iteration k, we have a pair \((x_k,\bar{\mu }_k)\). We first estimate the active set \(\mathcal{L}_k \cup \mathcal{U}_k\) and the nonactive set \(\mathcal{N}_k\) as in (11). If possible, we calculate a direction \(d_k = (d_{x_{\mathcal{N}_k}},d_{\mu })\) by solving the Newton system (13) and we compute \(\tilde{x}_k\) as in (14)–(15). This point is accepted and set as \(x_{k+1}\) only if \(\Vert (d_k,(\tilde{x}_k - x_k)_{\mathcal{B}_k})\Vert \le \Delta _k\), where \(\Delta _k\) is iteratively decreased through the iterations by a factor \(\beta \in (0,1)\).
If this is not the case, we compute \(x_{k+1}\) as an approximate minimizer of the bound-constrained subproblem (6), such that
with \(\{\tau _k\} \rightarrow 0\). Then, we update the multiplier estimate \(\mu _{k+1}\) by (7) and decrease the penalty parameter \(\epsilon _{k+1}\) if the constraint violation is not sufficiently reduced.
We finally terminate the iteration by setting \(\bar{\mu }_{k+1}\) as the projection of \(\mu _{k+1}\) onto a prefixed box, according to (8).
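The inexactness test (16) used above is, in view of the form appearing later in (23), a projected-gradient residual; a minimal sketch:

```python
import numpy as np

def projected_gradient_residual(x, grad_La, l, u):
    """Sup-norm criticality measure of the bound-constrained subproblem,
    || x - P_[l,u](x - grad L_a) ||_inf, as used in the test (16)/(23);
    it vanishes exactly at stationary points of the subproblem."""
    return np.linalg.norm(x - np.clip(x - grad_La, l, u), np.inf)
```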
The proposed method, named Primal-Dual Augmented Lagrangian Method (PD-ALM), is reported in the following algorithmic scheme. As specified later (see Sect. 6), in the practical implementation of the algorithm we use a stricter test to accept the point \(\tilde{x}_k\), also requiring a decrease of the feasibility violation at the new point \(\tilde{x}_k\). For the sake of generality, the theoretical analysis is carried out by considering only the condition \(\Vert (d_k,(\tilde{x}_k - x_k)_{\mathcal{B}_k})\Vert \le \Delta _k\).
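The acceptance test can be sketched as follows; whether \(\Delta _k\) is shrunk at every iteration or only upon acceptance is our reading of the scheme (the proof of Proposition 4.1 uses \(\Delta _{k+1} = \beta \Delta _k\) on the accepted iterations, which is what we implement here):

```python
import numpy as np

def accept_newton_point(d_k, x_tilde, x_k, B_k, Delta_k, beta=0.5):
    """Acceptance test for the Newton candidate in PD-ALM (sketch).

    Accepts x_tilde when ||(d_k, (x_tilde - x_k)_{B_k})|| <= Delta_k and,
    in that case, shrinks Delta by beta in (0, 1).
    Returns (accepted, new_Delta).
    """
    step = np.concatenate([d_k, (x_tilde - x_k)[B_k]])
    if np.linalg.norm(step) <= Delta_k:
        return True, beta * Delta_k     # accept and tighten the test
    return False, Delta_k               # fall back to subproblem (6)
```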
The next result shows that a KKT point is obtained, as a limit point, whenever the Newton direction is accepted for infinitely many iterations.
Proposition 4.1
Let \(\{x_k\}\) be a sequence generated by the Primal-Dual Augmented Lagrangian Method and let \(\{x_k\}_K\) be a subsequence such that \(\tilde{x}_k\) is accepted (i.e., \(d_k\) is computed and \(\Vert (d_k,(\tilde{x}_k - x_k)_{\mathcal{B}_k})\Vert \le \Delta _k\)) for infinitely many iterations \(k \in K\) and
Then, \(x^*\) is a KKT point.
Proof
Since \(\{\bar{\mu }_k\}\) is a bounded sequence and \(\mathcal{L}_k\), \(\mathcal{U}_k\), \(\mathcal{N}_k\) are subsets of a finite set of indices, without loss of generality we can assume that \(\lim _{k \rightarrow \infty , \, k \in K} \bar{\mu }_{k+1} = \mu ^*\), \(\mathcal{L}_k = \mathcal L\), \(\mathcal{U}_k = \mathcal U\) and \(\mathcal{N}_k = \mathcal N\) (passing to a further subsequence if necessary). Moreover, since \(d_k\) is accepted for infinitely many iterations \(k \in K\), without loss of generality we can also assume that \(d_k\) is accepted for all \(k \in K\) (passing again to a further subsequence if necessary).
Since the projection is nonexpansive, for all \(k \in K\) we have
Moreover, since \(\Delta _{k+1} = \beta \Delta _k\), with \(\beta \in (0,1)\), for all \(k \in K\),
and
Then,
Since \(\Vert d_k\Vert \le \Delta _k\) for all \(k \in K\), from (17) we also have that
Using again the fact that the Newton direction is accepted at every iteration \(k \in K\), we can write
Taking the limits for \(k \rightarrow \infty \), \(k \in K\), and using (18), we have
Taking into account (19) and (20), we can write
To conclude the proof, we have to show that the KKT conditions are satisfied with respect to \(\nabla _{x_\mathcal{L}} L(x^*,\mu ^*)\) and \(\nabla _{x_\mathcal{U}} L(x^*,\mu ^*)\) as well. From the instructions of the algorithm, \((x_{k+1})_\mathcal{L} = (\tilde{x}_k)_\mathcal{L} = \ell _\mathcal{L}\) and \((x_{k+1})_\mathcal{U} = (\tilde{x}_k)_\mathcal{U} = u_\mathcal{U}\) for all \(k \in K\). Consequently,
So, using (3), the KKT conditions with respect to \(\nabla _{x_\mathcal{L}} L(x^*,\mu ^*)\) and \(\nabla _{x_\mathcal{U}} L(x^*,\mu ^*)\) hold if and only if
For any index \(i\in \mathcal L\), from the active-set estimate (9) we have \(0 \ge (d_k)_i = \ell _i - (x_k)_i \ge -\nu \sigma _i(x_k,\bar{\mu }_k)\) and, using the definition of \(\sigma _i(x,\mu )\) given in (4), we get
Similarly, for any index \(i\in \mathcal U\) we have \(0 \le (d_k)_i = u_i - (x_k)_i \le \nu \rho _i(x_k,\bar{\mu }_k)\) and then
Taking the limits for \(k \rightarrow \infty \), \(k \in K\), and using (18)–(19), we obtain (21). \(\square \)
In the following result, we show that any limit point of the sequence \(\{x_k\}\) is either feasible for Problem (1) or stationary for the penalty term \(\Vert h(x)\Vert ^2\) of the augmented Lagrangian function, measuring the violation with respect to the equality constraints.
Proposition 4.2
Let \(\{x_k\}\) be a sequence generated by the Primal-Dual Augmented Lagrangian Method and let \(\{x_k\}_K\) be a subsequence such that
The following holds:

if \(\lim _{k \rightarrow \infty } \epsilon _k > 0\), then \(x^*\) is feasible;

if \(\tilde{x}_k\) is accepted (i.e., \(d_k\) is computed and \(\Vert (d_k,(\tilde{x}_k - x_k)_{\mathcal{B}_k})\Vert \le \Delta _k\)) for infinitely many iterations \(k \in K\), then \(x^*\) is feasible (indeed, it is a KKT point);

in all other cases, \(x^*\) is a KKT point of the problem \(\min _{\ell \le x \le u}\Vert h(x)\Vert ^2\).
Proof
Let us analyze the three cases separately.

If \(\lim _{k \rightarrow \infty } \epsilon _k > 0\), from the instructions of the algorithm there exists an iteration \(\hat{k}\) such that \(\epsilon _{k+1} = \epsilon _k\) for all \(k\ge \hat{k}\). Therefore, \(\Vert h(x_{k+1})\Vert _{\infty } \le \eta \Vert h(x_k)\Vert _{\infty }\), with \(\eta \in (0,1)\), for all \(k \ge \hat{k}\), and then \(\{h(x_k)\} \rightarrow 0\), implying that \(x^*\) is feasible.

If \(\tilde{x}_k\) is accepted for infinitely many iterations \(k \in K\), from Proposition 4.1 we have that \(x^*\) is a KKT point, and thus, it is feasible.

In all the other cases, we want to show that
$$\begin{aligned}{}[\nabla h(x^*) h(x^*)]_i {\left\{ \begin{array}{ll} \ge 0, &{} \quad \text {if } x^*_i = \ell _i, \\ = 0, &{} \quad \text {if } x^*_i \in (\ell _i,u_i), \\ \le 0, &{} \quad \text {if } x^*_i = u_i. \end{array}\right. } \end{aligned}$$ (22)

Since \(\{\bar{\mu }_k\}\) is a bounded sequence, without loss of generality we can assume that \(\lim _{k \rightarrow \infty , \, k \in K} \bar{\mu }_{k+1} = \mu ^*\) (passing to a further subsequence if necessary). Moreover, note that there exists an iteration \(\hat{k} \in K\) such that, for all \(k \ge \hat{k}\), \(k \in K\), the Newton direction \(d_k\) is not accepted, that is, we compute \(x_{k+1}\) such that (16) holds. Since \(\{\tau _k\} \rightarrow 0\), it follows that
$$\begin{aligned} \lim _{k \rightarrow \infty } \Vert x_{k+1} - \mathcal{P}_{[\ell ,u]}(x_{k+1} - \nabla _x L_a(x_{k+1},\bar{\mu }_k;\epsilon _k))\Vert _{\infty } = 0. \end{aligned}$$ (23)

Now, we distinguish three subcases.

(i)
\(x^*_i \in (\ell _i,u_i)\). Since \(\{x_{k+1}\}_K \rightarrow x^*\), there exists an iteration \(\hat{k} \in K\) such that \((x_{k+1})_i \in (\ell _i,u_i)\) for all \(k \ge \hat{k}\), \(k \in K\). In view of (23), it follows that
$$\begin{aligned} \lim _{k \rightarrow \infty , \, k \in K} (\nabla _x L_a(x_{k+1},\bar{\mu }_k;\epsilon _k))_i = 0 \end{aligned}$$

(otherwise, we would have \(\limsup _{k \rightarrow \infty , \, k \in K} |(x_{k+1} - \mathcal{P}_{[\ell ,u]}(x_{k+1} - \nabla _x L_a(x_{k+1},\bar{\mu }_k;\epsilon _k)))_i| > 0\), contradicting (23)). So, there exists an iteration, which we still denote by \(\hat{k} \in K\) without loss of generality, such that \((x_{k+1} - \nabla _x L_a(x_{k+1},\bar{\mu }_k;\epsilon _k))_i \in [\ell _i,u_i]\) for all \(k \ge \hat{k}\), \(k \in K\). Hence, for all \(k \ge \hat{k}\), \(k \in K\), we can write
$$\begin{aligned} \tau _k&\ge \Vert x_{k+1} - \mathcal{P}_{[\ell ,u]}(x_{k+1} - \nabla _x L_a(x_{k+1},\bar{\mu }_k;\epsilon _k))\Vert _{\infty } \\&\ge \bigl |(x_{k+1} - \mathcal{P}_{[\ell ,u]}(x_{k+1} - \nabla _x L_a(x_{k+1},\bar{\mu }_k;\epsilon _k)))_i\bigr | \\&= \bigl |(\nabla _x L_a(x_{k+1},\bar{\mu }_k;\epsilon _k))_i\bigr | \\&= \biggl |\biggl (\nabla f(x_{k+1}) + \nabla h(x_{k+1}) \bar{\mu }_k + \frac{2}{\epsilon _k} \nabla h(x_{k+1}) h(x_{k+1})\biggr )_i\biggr |. \end{aligned}$$

Multiplying the first and the last terms in the above chain of inequalities by \(\epsilon _k\), we get
$$\begin{aligned} \epsilon _k \tau _k \ge |(\epsilon _k \nabla f(x_{k+1}) + \epsilon _k \nabla h(x_{k+1}) \bar{\mu }_k + 2 \nabla h(x_{k+1}) h(x_{k+1}))_i|, \end{aligned}$$

for all \(k \ge \hat{k}\), \(k \in K\). Taking the limits in the above inequality for \(k \rightarrow \infty \), \(k \in K\), the left-hand side converges to zero, since both \(\{\epsilon _k\}\) and \(\{\tau _k\}\) converge to zero, while the right-hand side converges to \(2 |(\nabla h(x^*) h(x^*))_i|\), since \(\{\epsilon _k\} \rightarrow 0\), \(\{\nabla f(x_{k+1})\}_K \rightarrow \nabla f(x^*)\), \(\{\nabla h(x_{k+1})\}_K \rightarrow \nabla h (x^*)\), \(\{h(x_{k+1})\}_K \rightarrow h(x^*)\) and \(\{\bar{\mu }_k\}_K \rightarrow \mu ^*\). We thus conclude that \((\nabla h(x^*) h(x^*))_i=0\).

(ii)
\(x^*_i = \ell _i\). Since \(\{x_{k+1}\}_K \rightarrow x^*\), there exists an iteration \(\hat{k} \in K\) such that \((x_{k+1})_i \in [\ell _i,u_i)\) for all \(k \ge \hat{k}\), \(k \in K\). In view of (23), it follows that
$$\begin{aligned} \liminf _{k \rightarrow \infty , \, k \in K} (\nabla _x L_a(x_{k+1},\bar{\mu }_k;\epsilon _k))_i \ge 0 \end{aligned}$$

(otherwise, we would have \(\limsup _{k \rightarrow \infty , \, k \in K} |(x_{k+1} - \mathcal{P}_{[\ell ,u]}(x_{k+1} - \nabla _x L_a(x_{k+1},\bar{\mu }_k;\epsilon _k)))_i| > 0\), contradicting (23)). So, we can write
$$\begin{aligned} \liminf _{k \rightarrow \infty , \, k \in K} \biggl (\nabla f(x_{k+1}) + \nabla h(x_{k+1}) \bar{\mu }_k + \frac{2}{\epsilon _k} \nabla h(x_{k+1}) h(x_{k+1})\biggr )_i \ge 0. \end{aligned}$$

Multiplying the terms of the above inequality by \(\epsilon _k\), and taking into account that \(\{\epsilon _k\} \rightarrow 0\), \(\{\nabla f(x_{k+1})\}_K \rightarrow \nabla f(x^*)\), \(\{\nabla h(x_{k+1})\}_K \rightarrow \nabla h (x^*)\), \(\{h(x_{k+1})\}_K \rightarrow h(x^*)\) and \(\{\bar{\mu }_k\}_K \rightarrow \mu ^*\), we get
$$\begin{aligned} \begin{aligned}&\liminf _{k \rightarrow \infty , \, k \in K} (\epsilon _k \nabla f(x_{k+1}) + \epsilon _k \nabla h(x_{k+1}) \bar{\mu }_k + 2\nabla h(x_{k+1}) h(x_{k+1}))_i =\\&\qquad \qquad = 2(\nabla h(x^*) h(x^*))_i \ge 0. \end{aligned} \end{aligned}$$ 
(iii)
\(x^*_i = u_i\). We obtain \((\nabla h(x^*) h(x^*))_i \le 0\) using the same arguments as in the previous case.\(\square \)

In order to show convergence of the algorithm to KKT points, we need to point out some properties of the approximate minimizers of the augmented Lagrangian function. In particular, in the next lemma we show that, when we cannot use the Newton direction, the approximate minimizers of the augmented Lagrangian function computed as in (16), with \(\{\tau _k\} \rightarrow 0\), satisfy the conditions stated in [3] for the solutions of the subproblems (see Step 2 of Algorithm 3.1 in [3]).
Lemma 4.1
Let \(\{x_k\}\) be a sequence generated by the Primal-Dual Augmented Lagrangian Method, and let \(\{x_k\}_K\) be a subsequence such that
with \(x^*\) feasible and, for all \(k \in K\), either the Newton direction \(d_k\) cannot be computed (i.e., system (13) does not have solutions) or \(\tilde{x}_k\) is not accepted (i.e., \(\Vert (d_k,(\tilde{x}_k - x_k)_{\mathcal{B}_k})\Vert > \Delta _k\)). Then, for all \(k \in K\) there exist \(\tau _{k,1} \ge 0\), \(\tau _{k,2} \ge 0\), \((v_k)_i\), \((w_k)_i\), \(i=1,\ldots ,n\), such that
Proof
First, note that the conditions on \(x_k\) in (25) are satisfied for any \(\tau _{k,2} \ge 0\), since we maintain feasibility with respect to the constraints \(\ell \le x \le u\). Without loss of generality, we can limit ourselves to proving that an iteration \(\hat{k} \in K\) exists such that (24)–(27) hold for all \(k \ge \hat{k}\), \(k \in K\), and (28) is satisfied (for the iterations \(k < \hat{k}\), \(k \in K\), we can choose arbitrary \(\tau _{k,1} \ge 0\), \(\tau _{k,2} \ge 0\), \((v_k)_i\), \((w_k)_i\), \(i=1,\ldots ,n\), with \(\tau _{k,1}\) sufficiently large, satisfying (24)–(27)).
From the instructions of the algorithm, at every iteration \(k \in K\) we compute \(x_{k+1}\) such that (16) holds, with \(\{\tau _k\} \rightarrow 0\). So, we can choose \(\hat{k}\) as the first iteration such that
Since the index set \(\{1,\ldots ,n\}\) is finite, without loss of generality we can define the subsets \(I_1\), \(I_2\), \(I_3\) and \(I_4\) (passing to a further subsequence if necessary) such that:
From (16) and (29), for all \(k \ge \hat{k}\), \(k \in K\), we can write
For every variable \((x_k)_i\) with \(i \in I_1\), we also have that
for all sufficiently large \(k \in K\) (this follows from the fact that \(\{(x_k)_i\}_K \rightarrow x^*_i \in (\ell _i,u_i)\) and \(\tau _k \rightarrow 0\)) . So, without loss of generality we can also assume that \(\hat{k}\) is large enough to satisfy
Let us rewrite the quantities within the absolute value in the second inequality as follows:
where \((y'_k)_i, (y''_k)_i \ge 0\) are proper scalars. In more detail, if \(p := (x_{k+1} - \nabla _x L_a(x_{k+1},\bar{\mu }_k;\epsilon _k))_i\) lies in \([\ell _i,u_i]\), then \((y'_k)_i = (y''_k)_i = 0\). On the other hand, if \(p - \mathcal{P}_{[\ell _i,u_i]}(p) < 0\), then \((y'_k)_i > 0\) and \((y''_k)_i = 0\); otherwise, i.e., if \(p - \mathcal{P}_{[\ell _i,u_i]}(p) > 0\), then \((y'_k)_i = 0\) and \((y''_k)_i > 0\). Therefore, we obtain
We conclude that (24)–(27) hold for all \(k \ge \hat{k}\), \(k \in K\), with
and, from the above definitions, also (28) is satisfied. \(\square \)
Combining the above results with those stated in [3], we can finally show the convergence of the proposed algorithm to stationary points. In particular, as in [3], we use the constant positive linear dependence (CPLD) condition as constraint qualification.
Definition 4.1
A point x is said to satisfy the CPLD condition for Problem (1) if the existence of scalars \(\lambda _1,\ldots ,\lambda _p\), \(\pi _i \ge 0\), \(i \in \mathcal{L}(x)\), \(\varphi _j \ge 0\), \(j \in \mathcal{U}(x)\), not all zero, such that \(\sum _{t=1}^p \lambda _t \nabla h_t(x) - \sum _{i \in \mathcal{L}(x)} \pi _i e_i + \sum _{j \in \mathcal{U}(x)} \varphi _j e_j = 0\) implies that, for all z in a neighborhood of x, the vectors \(\nabla h_1(z), \ldots , \nabla h_p(z)\), \(e_i\), \(i \in \mathcal{L}(x)\), \(e_j\), \(j \in \mathcal{U}(x)\) are linearly dependent, where \(\mathcal{L}(x) := \{i :x_i = \ell _i\}\), \(\mathcal{U}(x) := \{i :x_i = u_i\}\) and \(\mathcal{N}(x) := \{1,\ldots ,n\} \setminus (\mathcal{L}(x) \cup \mathcal{U}(x))\).
For more details on CPLD and the relations with other constraint qualification conditions, see also [1, 23].
Theorem 4.1
Let \(\{x_k\}\) be a sequence generated by the Primal-Dual Augmented Lagrangian Method and let \(\{x_k\}_K\) be a subsequence such that
The following holds:

if \(\tilde{x}_k\) is accepted (i.e., \(d_k\) is computed and \(\Vert (d_k,(\tilde{x}_k - x_k)_{\mathcal{B}_k})\Vert \le \Delta _k\)) for infinitely many iterations \(k \in K\), then \(x^*\) is a KKT point;

else, if \(x^*\) satisfies the CPLD constraint qualification, then \(x^*\) is a KKT point.
Proof
If \(\tilde{x}_k\) is accepted for infinitely many iterations \(k \in K\), then \(x^*\) is a KKT point by Proposition 4.1. Otherwise, there exists an iteration \(\hat{k} \in K\) such that \(d_k\) is not accepted for any \(k \ge \hat{k}\), \(k \in K\), and the algorithm reduces to a classical augmented Lagrangian method. Then, using Lemma 4.1, the conditions stated in [3] for the solutions of the subproblems are satisfied, and the result follows by the same arguments given in the proof of Theorem 4.2 in [3]. \(\square \)
5 Convergence Rate Analysis
In this section, we analyze the convergence rate of the proposed algorithm. We will show that, for sufficiently large k, the primal-dual sequence \(\{(x_k,\bar{\mu }_k)\}\) converges to an optimal solution \((x^*,\mu ^*)\) at a quadratic rate.
In the literature, standard assumptions for proving the convergence rate of an augmented Lagrangian scheme are the linear independence constraint qualification (LICQ), strict complementarity and the second-order sufficient condition (SOSC). For Problem (1), let us denote by \(\sigma ^*\) and \(\rho ^*\) the KKT multipliers at \(x^*\) associated with the bound constraints \(x \ge \ell \) and \(x \le u\), respectively, and
Then,

LICQ means that the vectors \(\nabla h_1(x^*),\ldots ,\nabla h_p(x^*)\), \(e_i\), \(i\in \mathcal{L}^*\), \(e_j\), \(j\in \mathcal{U}^*\), are linearly independent;

SOSC means that \(y^T \nabla ^2_{xx} L(x^*,\mu ^*) y > 0\) for all \(y \in T(x^*) \setminus \{0\}\), where
$$\begin{aligned} \begin{aligned}&T(x^*) := \{y \in \mathbb R^n :\nabla h(x^*)^T y = 0, \\&e_i^T y = 0, \quad i \in I_0(x^*), \\&e_i^T y \le 0, \quad i \in I_1(x^*)\}, \end{aligned} \end{aligned}$$

with \(I_0(x^*) := (\mathcal{L}^*\cap \{i :\sigma _i^*> 0\}) \cup (\mathcal{U}^*\cap \{i :\rho _i^* > 0\})\) and \(I_1(x^*) := (\mathcal{L}^* \cup \mathcal{U}^*) \setminus I_0(x^*)\).
Under LICQ, strict complementarity and SOSC, if the penalty parameter \(\epsilon _k \rightarrow 0\), it is usually possible to show a superlinear convergence rate for augmented Lagrangian methods (see, e.g., [4, 17] and the references therein). Moreover, a superlinear convergence rate is proved in [17], when \(\epsilon _k \rightarrow 0\), even without any constraint qualification, but requiring the starting multiplier to be in a neighborhood of a KKT multiplier satisfying SOSC.
Here, a quadratic convergence rate is obtained by assuming that \(\mu ^*_i\in [\bar{\mu }_{\text {min}},\bar{\mu }_{\text {max}}]\) for all \(i = 1,\ldots ,p\), under LICQ and the strong second-order sufficient condition (SSOSC), where the latter means that
with
Interestingly, our results do not need the convergence of \(\{\epsilon _k\}\) to 0.
First, we state an intermediate result ensuring that, if a sequence converges to a point where the conditions for superlinear convergence rate of the Newton direction are satisfied, then the direction is eventually accepted by the algorithm.
Proposition 5.1
Let \(\{(z_k,\bar{d}_k)\}\) be a sequence of vectors such that
with \(\{\alpha _k\} \rightarrow 0\). Then, for k sufficiently large,
for given \(\beta \in (0,1)\) and \(\Delta _0>0\).
Proof
Let \(\bar{k}\) and \(\bar{\alpha }\) be such that, for all \(k\ge \bar{k}\),
Therefore, we can write
from which we obtain:
By using (30), we can set
Then, we have
Since \(\rho \in (0,1)\), we can conclude that, for sufficiently large k,
Now, (31) and (32) conclude the proof. \(\square \)
Finally, we are ready to show the asymptotic quadratic rate of the primal-dual sequence \(\{(x_k,\bar{\mu }_k)\}\), under LICQ and SSOSC, provided that \(\mu ^*_i\in [\bar{\mu }_{\text {min}},\bar{\mu }_{\text {max}}]\) for all \(i = 1,\ldots ,p\).
Theorem 5.1
Let \(\{x_k\}\) and \(\{\bar{\mu }_k\}\) be the sequences generated by the Primal-Dual Augmented Lagrangian Method and assume that
with \(\mu ^*_i\in [\bar{\mu }_{\text {min}},\bar{\mu }_{\text {max}}]\) for all \(i = 1,\ldots ,p\). Also assume that the LICQ and SSOSC hold at \((x^*,\mu ^*)\). Then, \(\{(x_k,\bar{\mu }_k)\}\) converges to \((x^*,\mu ^*)\) with a quadratic rate asymptotically, i.e.,
for all sufficiently large k and some constant K.
Proof
Since LICQ and SSOSC hold at \((x^*,\mu ^*)\), it follows from [16, Proposition 3.1] that the following matrix is invertible for k sufficiently large:
where \(I_{\mathcal{L}_k}\) and \(I_{\mathcal{U}_k}\) denote the submatrices obtained from the identity matrix by discarding the columns whose indices do not belong to \(\mathcal{L}_k\) and \(\mathcal{U}_k\), respectively. Consequently, for all sufficiently large k, the Newton direction can be computed.
Let us define \(\bar{d}_k\) as the Newton direction \(d_k = ((d_x)_k,(d_{\mu })_k)\) augmented with the components in \(\mathcal{B}_k\). Namely, \(\bar{d}_k := ((\bar{d}_x)_k,(\bar{d}_{\mu })_k)\), where
(by properly reordering the entries of \(x_k\)). We note that
By the instructions of the algorithm, when a Newton direction is used, we have
and
So, using Proposition 3.1, for all sufficiently large k we have that
and then,
For all sufficiently large k, by the same arguments given in the proof of [13, Proposition 4], there exists a constant K such that
The above relation implies that \(\bar{d}_k\) satisfies the assumptions of Proposition 5.1 (with \(z_k = (x_k,\bar{\mu }_k)\) and \(\alpha _k = K \Vert (x_k,\bar{\mu }_k) - (x^*,\mu ^*)\Vert \)). Since \(\Vert d_k\Vert \le \Vert \bar{d}_k\Vert \), by the instructions of the algorithm, the Newton direction \(d_k\) is accepted for all sufficiently large k, so that
and
Using (33), we get
where the first inequality follows from the fact that the projection operator is nonexpansive and that, for all sufficiently large k, from Proposition 3.1 we have \(\mathcal{N}^* \subseteq \mathcal{N}_k\), implying that \((x^*)_{\mathcal{N}_k} \in (\ell _{\mathcal{N}_k},u_{\mathcal{N}_k})\). Similarly, using again the nonexpansivity of the projection operator and the assumption that \(\mu ^*_i\in [\bar{\mu }_{\text {min}},\bar{\mu }_{\text {max}}]\) for all \(i = 1,\ldots ,p\), we have
where \(\mathbf {1}\) denotes the vector of all ones (of appropriate dimensions). Combining these relations with (34), for all sufficiently large k we obtain
concluding the proof. \(\square \)
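The property of the projection operator invoked twice in the above proof is the standard nonexpansivity of projections onto nonempty closed convex sets, which for completeness reads:

```latex
\Vert \mathcal{P}_{\Omega}(z) - \mathcal{P}_{\Omega}(w) \Vert \le \Vert z - w \Vert
\qquad \text{for all } z, w \in \Re^n,
```

for any nonempty closed convex set \(\Omega \subseteq \Re ^n\). Here it is applied with \(\Omega \) equal to the box \([\ell _{\mathcal{N}_k},u_{\mathcal{N}_k}]\) for the primal variables and to \([\bar{\mu }_{\text {min}}\mathbf {1},\bar{\mu }_{\text {max}}\mathbf {1}]\) for the multipliers, together with the fact that the reference point lies in the respective set, so that the projection leaves it unchanged.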
6 Numerical Experiments
This section is devoted to the description of our numerical experience with the proposed algorithm and to its comparison with other publicly available algorithms. All the numerical experiments were carried out on an Intel Xeon CPU E5-1650 v2 @ 3.50GHz with 12 cores and 64 GB of RAM.
Problem set description. We considered a set of 362 general constrained problems from the CUTEst collection [20], with number of variables \(n \in [90, 906]\) and number of general constraints (equalities and inequalities) \(m \in [1,8958]\). In particular, among the whole CUTEst problems collection, we selected all constrained problems (i.e., with at least one constraint besides bound constraints on the variables) having:

(i) number of variables and constraints “user modifiable”, or
(ii) number of variables “user modifiable” and a fixed number of constraints, or
(iii) at least 100 variables.
Figure 1 describes the distribution of the number of variables and number of general constraints of the considered problems.
Algorithms used in the comparison. We used the following algorithms:
- the augmented Lagrangian method implemented in the ALGENCAN (v.3.1.1) software package [2, 3];
- the augmented Lagrangian method implemented in LANCELOT (rev. B) [9, 19];
- our proposed primal-dual augmented Lagrangian method PD-ALM (as described in Sect. 4).
Both ALGENCAN and LANCELOT were run using their default parameters. Note that, in its default setting, ALGENCAN uses second-order information by exploiting a so-called acceleration strategy, which is activated when the current primal-dual pair is sufficiently close to a KKT pair of the problem.
Our method has been implemented by modifying the code of ALGENCAN in two points:
- at the beginning of each iteration k, we inserted the computation of the active-set estimate and of the Newton direction \(d_k\), according to the algorithmic scheme reported in Sect. 4;
- the approximate minimization of the augmented Lagrangian function is carried out by means of the ASA-BCP method proposed in [11], in place of GENCAN [5].
In more detail, for every iteration k, in (15) we set \(\nu = \min \{10^{-6}, \Vert x_k - \mathcal{P}_{[\ell ,u]}(x_k - \nabla _x L(x_k,\bar{\mu }_k)) \Vert ^{3} \}\), and the linear system (13) was solved by means of the MA57 library [15]. Note that we used the same library also in ALGENCAN. As for the inner solver ASA-BCP, it is an active-set method where, at each iteration, the variables estimated as active are set to their bounds, while those estimated as nonactive are moved along a truncated-Newton direction. In ASA-BCP, we here employed a monotone line search and, to compute the truncated-Newton direction by conjugate gradient, we used the preconditioning technique described in [6], based on quasi-Newton formulas.
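As an illustration, the forcing parameter \(\nu \) can be computed as in the following sketch. The function name and the pure-Python data layout are ours, and we read the cap in the formula as \(10^{-6}\); this is a sketch of the rule, not the paper's implementation.

```python
import math

def forcing_parameter(x, grad_L, lower, upper, cap=1e-6):
    # nu = min(cap, ||x - P_[l,u](x - grad_L)||^3), where the projected-
    # gradient norm is the usual criticality measure for bound constraints.
    proj = [min(max(xi - gi, li), ui)
            for xi, gi, li, ui in zip(x, grad_L, lower, upper)]
    residual = math.sqrt(sum((xi - pi) ** 2 for xi, pi in zip(x, proj)))
    return min(cap, residual ** 3)
```

Cubing the criticality measure makes the tolerance shrink rapidly as the iterates approach a stationary point, so the linear system (13) is solved more and more accurately exactly where accuracy matters for the local rate.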
It is worth noticing that, in our implementation of PD-ALM, the test for accepting the point \(\tilde{x}_k\) consists of two conditions, both of which must be satisfied for acceptance. The first condition is that reported in Sect. 4, i.e., \(\Vert (d_k,(\tilde{x}_k-x_k)_{\mathcal{B}_k})\Vert \le \Delta _k\), while the second condition is that \(\Vert h(\tilde{x}_k)\Vert _{\infty } \le \eta \Vert h(x_k)\Vert _{\infty }\), i.e., the feasibility violation at \(\tilde{x}_k\) must be sufficiently smaller than at \(x_k\). In our experience, adding this second condition leads to better results in practice.
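A minimal sketch of this two-condition acceptance test follows; the function name and the value of \(\eta \) are placeholders of ours, since the paper's setting of \(\eta \) is not reported here.

```python
def accept_trial_point(d_norm, Delta_k, h_trial_inf, h_current_inf, eta=0.5):
    # Condition 1: the augmented direction is short enough (||.|| <= Delta_k).
    # Condition 2: the constraint violation (sup-norm) at the trial point
    # decreased by at least a factor eta with respect to the current point.
    return d_norm <= Delta_k and h_trial_inf <= eta * h_current_inf
```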
In our experiments, for all the considered methods we used the same stopping conditions. Namely, the algorithms were stopped when the following two conditions were both satisfied:
where \(x_0\) is the initial point and \((x_k,\bar{\mu }_k)\) is the primal-dual pair at iteration k, with \(\epsilon _{\text {opt}} = \epsilon _{\text {feas}} = 10^{-6}\). Moreover, we set a maximum number of (outer) iterations equal to 400 and a time limit of 3600 s.
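The two displayed stopping conditions are not reproduced in the text above. As a sketch under our own assumptions (the scaling by the residuals at \(x_0\) is a guess suggested by the mention of the initial point; the actual conditions may differ), such a test could look like:

```python
def stopped(opt_res_inf, feas_inf, opt_res0_inf, feas0_inf,
            eps_opt=1e-6, eps_feas=1e-6):
    # Hypothetical scaled KKT test: optimality and feasibility residuals
    # (sup-norm) measured relative to their values at the initial point x_0.
    return (opt_res_inf <= eps_opt * max(1.0, opt_res0_inf)
            and feas_inf <= eps_feas * max(1.0, feas0_inf))
```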
In Fig. 2, we start by comparing PD-ALM against ALGENCAN with and without the acceleration phase (note that the acceleration phase in ALGENCAN is where second-order information comes into play), using the performance profiles [14] with respect to CPU time. Note that the performance profiles are obtained on the subset of problems where at least one solver requires more than 10 s of CPU time. As can be seen, ALGENCAN (using second-order information) is the most efficient solver but the least robust one. On the other hand, PD-ALM is considerably more robust than both versions of ALGENCAN. One possible reason for PD-ALM being less efficient than ALGENCAN may be the following: in PD-ALM we try to use the second-order direction as much as possible, whereas ALGENCAN uses second-order information only when the current primal-dual point is sufficiently close to a KKT pair. This could explain our larger computational times and the behavior of the reported performance profiles.
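For reference, the performance-profile values of [14] used in these comparisons can be computed as in the following minimal sketch (the function name and data layout are ours; failures are encoded as infinite times):

```python
def performance_profile(times, alpha):
    # times[s][p]: CPU time of solver s on problem p (float('inf') = failure).
    # Returns, for each solver, the fraction of problems solved within a
    # factor alpha of the best time over all solvers (Dolan-More profiles).
    n_solvers, n_probs = len(times), len(times[0])
    best = [min(times[s][p] for s in range(n_solvers)) for p in range(n_probs)]
    return [sum(1 for p in range(n_probs) if times[s][p] <= alpha * best[p])
            / n_probs for s in range(n_solvers)]
```

Plotting these values as a function of \(\alpha \) gives the curves of Figs. 2–4: the value at \(\alpha = 1\) measures efficiency, while the limit for large \(\alpha \) measures robustness.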
In Fig. 3a, we report the comparison between ALGENCAN and PD-ALM. We note that, even though ALGENCAN is slightly better than PD-ALM in terms of efficiency, it is outperformed by our proposed method in terms of robustness. Furthermore, we note that the two performance profiles intersect approximately at \(\alpha \simeq 5\), i.e., both algorithms solve the same percentage of problems in at most 5 times the CPU time of the best performing solver.
In Fig. 3b, we report the comparison between PD-ALM and LANCELOT (rev. B). In this case, PD-ALM is clearly the best performing solver both in terms of efficiency and robustness.
Finally, we notice that ALGENCAN, LANCELOT and PD-ALM solve, respectively, 272, 232 and 290 problems out of 362. The comparison among the three solvers is reported in Fig. 4.
7 Conclusions
In this paper, we presented a new method for nonlinear optimization problems with equality constraints and bound constraints. Starting from the augmented Lagrangian scheme implemented in ALGENCAN, we used a tailored active-set strategy to compute a Newton-type direction with respect to the variables estimated as nonactive, while the variables estimated as active are set to their bounds. If this direction satisfies a proper test, an augmented Lagrangian function is minimized by means of an efficient solver recently proposed in the literature. We proved convergence to stationary points and, under standard assumptions, an asymptotic quadratic convergence rate. The numerical results show the effectiveness of the proposed method.
References
Andreani, R., Martínez, J., Schuverdt, M.: On the relation between constant positive linear dependence condition and quasinormality constraint qualification. J. Optim. Theory Appl. 125(2), 473–483 (2005)
Andreani, R., Birgin, E., Martínez, J., Schuverdt, M.: Augmented Lagrangian methods under the constant positive linear dependence constraint qualification. Math. Program. 111, 5–32 (2008a)
Andreani, R., Birgin, E.G., Martínez, J.M., Schuverdt, M.L.: On augmented Lagrangian methods with general lowerlevel constraints. SIAM J. Optim. 18(4), 1286–1309 (2008b)
Bertsekas, D.P.: Constrained Optimization and Lagrange Multiplier Methods. Academic Press (2014)
Birgin, E.G., Martínez, J.M.: Large-scale active-set box-constrained optimization method with spectral projected gradients. Comput. Optim. Appl. 23(1), 101–125 (2002)
Birgin, E.G., Martínez, J.M.: Practical Augmented Lagrangian Methods for Constrained Optimization. SIAM (2014)
Birgin, E.G., Martínez, J.M.: Complexity and performance of an augmented Lagrangian algorithm. Optim. Methods Softw. 35(5), 885–920 (2020)
Bomze, I.M., Rinaldi, F., Zeffiro, D.: Active set complexity of the away-step Frank–Wolfe algorithm. SIAM J. Optim. 30(3), 2470–2500 (2020)
Conn, A.R., Gould, N.I., Toint, P.L.: LANCELOT: A Fortran Package for Large-Scale Nonlinear Optimization (Release A), vol. 17. Springer (2013)
Cristofari, A.: Active-set identification with complexity guarantees of an almost cyclic 2-coordinate descent method with Armijo line search. SIAM J. Optim. (To appear)
Cristofari, A., De Santis, M., Lucidi, S., Rinaldi, F.: A two-stage active-set algorithm for bound-constrained optimization. J. Optim. Theory Appl. 172(2), 369–401 (2017)
De Santis, M., Di Pillo, G., Lucidi, S.: An active set feasible method for large-scale minimization problems with bound constraints. Comput. Optim. Appl. 53(2), 395–423 (2012)
Di Pillo, G., Lucidi, S., Palagi, L.: A superlinearly convergent primal-dual algorithm model for constrained optimization problems with bounded variables. Optim. Methods Softw. 14(1–2), 49–73 (2000)
Dolan, E.D., Moré, J.J.: Benchmarking optimization software with performance profiles. Math. Program. 91(2), 201–213 (2002)
Duff, I.S.: MA57—a code for the solution of sparse symmetric definite and indefinite systems. ACM Trans. Math. Softw. 30(2), 118–144 (2004)
Facchinei, F., Lucidi, S.: Quadratically and superlinearly convergent algorithms for the solution of inequality constrained minimization problems. J. Optim. Theory Appl. 85(2), 265–289 (1995)
Fernández, D., Solodov, M.V.: Local convergence of exact and inexact augmented Lagrangian methods under the second-order sufficient optimality condition. SIAM J. Optim. 22(2), 384–407 (2012)
Fiacco, A.V., McCormick, G.P.: Nonlinear Programming: Sequential Unconstrained Minimization Techniques. SIAM (1990)
Gould, N.I., Orban, D., Toint, P.L.: GALAHAD, a library of threadsafe Fortran 90 packages for largescale nonlinear optimization. ACM Trans. Math. Softw. 29(4), 353–372 (2003)
Gould, N.I., Orban, D., Toint, P.L.: CUTEst: a constrained and unconstrained testing environment with safe threads for mathematical optimization. Comput. Optim. Appl. 60(3), 545–557 (2015)
Nocedal, J., Wright, S.: Numerical Optimization. Springer (2006)
Nutini, J., Schmidt, M., Hare, W.: “Active-set complexity” of proximal gradient: how long does it take to find the sparsity pattern? Optim. Lett. 13(4), 645–655 (2019)
Qi, L., Wei, Z.: On the constant positive linear dependence condition and its application to SQP methods. SIAM J. Optim. 10(4), 963–981 (2000)
Communicated by Massimo Pappalardo.
Cristofari, A., Di Pillo, G., Liuzzi, G. et al. An Augmented Lagrangian Method Exploiting an Active-Set Strategy and Second-Order Information. J Optim Theory Appl 193, 300–323 (2022). https://doi.org/10.1007/s10957-022-02003-4
Keywords
- Constrained optimization
- Augmented Lagrangian methods
- Nonlinear programming algorithms
- Large-scale optimization