An Active Set Trust-Region Method for Bound-Constrained Optimization

This paper discusses an active set trust-region algorithm for bound-constrained optimization problems. A sufficient descent condition is used as a computational measure to decide whether the function value has been sufficiently reduced. To obtain our complexity result, a critical measure is used that is computationally better behaved than other known critical measures. Under the positive definiteness of the approximated Hessian matrices restricted to the subspace of non-active variables, it is shown that unlimited zigzagging cannot occur. Finally, it is shown that our algorithm is competitive with state-of-the-art solvers on an ill-conditioned bound-constrained least-squares problem.


Introduction
We consider the bound-constrained optimization problem (1.1).

Related Work
Two efficient classes of globalization methods are line search methods and trust-region methods:
• Line search methods search in each iteration for a point with a better function value along a descent direction by generating a sequence of step sizes. Two inexact line search conditions are the Wolfe conditions (Wolfe [31]) and the Goldstein conditions (Goldstein [15]).
• Trust-region methods approximate the objective function by a linear/quadratic model restricted to a region centered at the current point, called the trust region, and compute a trust-region step that enforces a sufficient decrease in the model function. The reduction achieved in the function value relative to the reduction predicted by the model is measured by a trust-region ratio. If this ratio is sufficiently positive, the trial point with a better function value is accepted as the new point and the trust region either remains fixed or is expanded. Otherwise, the trust region is reduced until such a point is found; cf. Conn et al. [10].
There has been considerable interest in constructing active set methods to solve the bound-constrained optimization problem (1.1). These methods alternately identify active bound constraints and then approximately solve an unconstrained subproblem until a local minimizer is found. They can be combined with either line search methods or trust-region methods.
There are several efficient and robust algorithms for solving the bound-constrained optimization problem (1.1):
• BLBFGS by Byrd et al. [5] identifies active constraints by computing a generalized Cauchy point, uses a line search method to find step sizes, and approximates the Hessian matrix by a limited memory quasi-Newton method.
• ASACG by Hager and Zhang [19,21] finds active constraints using a nonmonotone gradient projection algorithm and then optimizes the objective function f with a conjugate gradient algorithm with respect to the active constraints found.
• ASABCP by Cristofari et al. [11] estimates the active set by applying a nonmonotone line search algorithm in each iteration and uses a truncated Newton method over the free variables.
• SPG by Birgin et al. [3] is a spectral projected gradient algorithm that uses a nonmonotone line search method.
• LMBOPT by Kimiaei et al. [25] finds active constraints by applying a gradient-free line search along a bent search path and approximates the Hessian matrix by a non-traditional limited memory quasi-Newton method.
Global convergence has been proven for all of the solvers mentioned above. The best convergence result for active set methods was given by Neumaier and Azmi [28]: they showed that all strongly active variables are identified and fixed after finitely many iterations, after which the reduced gradient norm falls below an arbitrarily tiny threshold. In fact, LMBOPT is an improved version of the active set algorithm of Neumaier and Azmi [28] with many practical enhancements. In the unconstrained case, CGDESCENT by Hager and Zhang [20] is a well-known solver based on a conjugate gradient algorithm. Burdakov et al. [4] provided a package, called LMBFG, whose best solver is LMBFG-EIG-MS (see [25]); it includes both line search and trust-region methods using a traditional limited memory quasi-Newton method to estimate the Hessian matrix. An extensive comparison among all of the algorithms mentioned can be found in [25], where the three best solvers on the CUTEst collection by Gould et al. [16] are LMBOPT, LMBFG-EIG-MS, and ASACG, respectively.
One of the main reasons for the inefficiency of bound-constrained optimization methods is zigzagging. It arises through inefficient search directions, such as the steepest descent direction, in the unconstrained case, and through changing active constraints in the bound-constrained case. Efficient directions, such as conjugate gradient directions, can remove zigzagging in the unconstrained case, but their adaptations restricted to the subspace of non-active variables may not remove zigzagging in the bound-constrained case (see, e.g., the second and third examples in [28, Section 6]). Neumaier and Azmi [28] showed that their active set method overcomes the effects of zigzagging both theoretically and numerically.
For iterative bound-constrained optimization methods, the reduced gradient, whose vanishing is equivalent to the KKT conditions for the problem (1.1), is zero at a local minimizer in exact arithmetic. For finite termination, however, a critical measure is needed, both theoretically and numerically, so that the reduced gradient norm at an approximate local minimizer is below a given threshold, while the function value at this approximate local minimizer is as small as possible compared with the function value at the initial point. If such a critical measure can distinguish, numerically and theoretically, spurious apparent local minimizers from good local minimizers, we call it a reasonable critical measure. A complexity bound for an iterative bound-constrained optimization method uses a critical measure to derive an upper bound on the number of iterations needed to terminate at an approximate local minimizer in finite precision arithmetic; see, e.g., Birgin et al. [2], Cartis et al. [7-9], Curtis [12,13], Grapiglia et al. [17], Gratton et al. [18], Nesterov [26], and Nesterov and Polyak [27].
Our algorithm combines the trust-region method with the active set method of Neumaier and Azmi [28]. It uses a computational measure, instead of the trust-region ratio, to decide whether the agreement between the objective function and the model function is good; this is useful in finite precision arithmetic. To prove our complexity bound, the trust-region radius is updated based on the reduced gradient. Because of cancellation in the calculation of known critical measures in double-precision arithmetic, spurious apparent local minimizers may be accepted as local minimizers. To overcome this, the reduced gradient is used as a reasonable critical measure. All strongly active variables are found and fixed after finitely many iterations under the positive definiteness of the approximated Hessian matrices restricted to the subspace of non-active variables, so that unlimited zigzagging cannot occur. It is shown that our algorithm is competitive with state-of-the-art solvers on an ill-conditioned bound-constrained least-squares problem.

Active Set Strategy
We recall some notation from Neumaier and Azmi [28].
Let x ∈ x be a feasible point and let i be an index. Then (i) a lower or upper bound is called active if x_i equals it; in this case, the index i and the component x_i are called active as well; (ii) an index that is not active is called free; (iii) a point all of whose components are active is called a corner of the box x.
If the condition holds, x ∈ x is a stationary point of (1.1). The first-order necessary optimality conditions for the bound-constrained problem (1.1) state that x is a stationary point of (1.1). Equivalently, at any local minimizer x of (1.1), the reduced gradient g_red(x) at x vanishes. At a nondegenerate stationary point, all components are strongly active, while at a degenerate stationary point, at least one component is not strongly active.

An active set algorithm for the bound-constrained optimization problem (1.1) proceeds by changing a subset of the variables. We denote by I ⊆ {1, 2, . . . , n} the working set, so that each search direction p satisfies the corresponding condition. Denote g := g(x) and g_red := g_red(x). B_II denotes the submatrix of the Hessian matrix B with row and column indices taken from I, i.e., B_II = (B_ij)_{i∈I, j∈I}, and x_I denotes the subvector of x with indices taken from the set I. The goal is to solve the unconstrained quadratic problem restricted to the subspace of the working set I. Following [28], two choices for the working set I are the minimal set I_−(x) of free indices of x and the maximal set I_+(x) of free or freeable indices of x. Any iteration in which the working set is enlarged is called a freeing iteration. In such an iteration, a decrease in the function value is guaranteed by moving slightly in the feasible direction; the problem (1.4) restricted to the new working set is then solved.

Neumaier and Azmi [28] proposed an active set algorithm, called BOPT, for bound-constrained optimization problems. Under some assumptions on the search directions restricted to a fixed working set I, they proved the local linear convergence of BOPT. To solve the problem (1.4), they designed a gradient-free line search method, called CLS, which generates feasible points by projecting the ray x + αp (α ≥ 0) into the box x. CLS is cheaper than the Wolfe line search method [29], since CLS does not need to compute the gradient at each trial point.
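As an illustration, the reduced gradient can be computed componentwise by zeroing gradient components that point out of the box at active bounds. The following Python sketch uses this common convention; the function name and the exact activity test are our own illustrative choices, not taken from [28]:

```python
import numpy as np

def reduced_gradient(x, g, lo, hi):
    """Reduced gradient for min f(x) s.t. lo <= x <= hi: gradient components
    pointing out of the box at an active bound are set to zero."""
    g_red = g.copy()
    at_lo = (x <= lo) & (g > 0)   # step along -g would leave the box at the lower bound
    at_hi = (x >= hi) & (g < 0)   # step along -g would leave the box at the upper bound
    g_red[at_lo] = 0.0
    g_red[at_hi] = 0.0
    return g_red

x  = np.array([0.0, 0.5, 1.0])
g  = np.array([2.0, -1.0, -3.0])
lo, hi = np.zeros(3), np.ones(3)
print(reduced_gradient(x, g, lo, hi))  # components 0 and 2 are zeroed
```

With this convention, the reduced gradient vanishes exactly at stationary points of (1.1), while ordinary gradient components at active bounds that point into the box are kept.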
They defined a measure μ(α) based on the Goldstein quotient for α > 0 (1.5), extended it to a continuous function on [0, ∞] by defining μ(0) := 1, and arrived at the sufficient descent condition (1.6) with fixed β ∈ ]0, 1/4[. In Theorem 2.2.1, they proved that if the condition (1.6) is satisfied, a significant decrease in the objective function is obtained. Moreover, they showed in Theorem 3.2 that if f is bounded below, a step size satisfying (1.6) can be found adaptively, such that a step size α with μ(α) ≈ 0.5 is found.
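The Goldstein quotient underlying (1.5) compares the achieved decrease with the first-order prediction. A minimal Python sketch, assuming the standard form μ(α) = (f(x + αp) − f(x))/(α gᵀp):

```python
import numpy as np

def goldstein_quotient(f, x, p, alpha, gTp, f0):
    """mu(alpha) = (f(x + alpha*p) - f(x)) / (alpha * g^T p), with mu(0) := 1."""
    if alpha == 0.0:
        return 1.0
    return (f(x + alpha * p) - f0) / (alpha * gTp)

# quadratic test function f(x) = 0.5 x^T x, steepest descent direction p = -g
f = lambda x: 0.5 * float(x @ x)
x = np.array([1.0, 2.0])
g = x.copy()            # gradient of f at x
p = -g
gTp = float(g @ p)
mu = goldstein_quotient(f, x, p, 1.0, gTp, f(x))
print(mu)  # for a quadratic, the exact minimizer along p gives mu = 0.5
```

For a quadratic, the exact line minimizer yields μ = 0.5, which matches the target μ(α) ≈ 0.5 mentioned above; the sufficient descent condition (1.6) accepts any α with μ(α) ≥ β.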
BOPT first computes the reduced gradient g_red at the initial point x^0 and initializes the working set I_0 as I_+(x^0). Then, as long as g_red(x^ℓ) for ℓ ≥ 0 is nonzero, BOPT computes a descent search direction restricted to the working set I_ℓ, along which CLS is performed until a point with a better function value satisfying (1.6) is found. The new working set I is then the minimal set I_−(x^ℓ); it is expanded to the maximal set I_+(x^ℓ) whenever the corresponding condition is violated, in the hope of decreasing zigzagging.

Trust-Region Method
In the classical trust-region method, a trust region restricted to the subspace of free variables is defined with a trust-region radius Δ > 0. A trust-region method approximately computes a direction vector p as a minimizer of the quadratic model function. The agreement between the objective function f and the model function Q is measured by the trust-region ratio (1.9), whose numerator and denominator are called the actual and predicted reduction, respectively. If a good agreement between the objective function and the model is found, Δ is expanded and the ℓth iteration is called successful; otherwise, Δ is reduced and the ℓth iteration is called unsuccessful; for more details, see [10,30].
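The classical acceptance test and radius update described above can be sketched as follows; the thresholds eta, shrink, and expand are illustrative placeholders, not the paper's actual parameters:

```python
def tr_ratio(f_x, f_trial, q_0, q_p):
    """rho = actual reduction / predicted reduction, cf. (1.9)."""
    return (f_x - f_trial) / (q_0 - q_p)

def update_radius(rho, Delta, eta=0.1, shrink=0.5, expand=2.0):
    """Accept and expand on good agreement, reject and shrink otherwise."""
    if rho >= eta:
        return True, expand * Delta    # successful iteration
    return False, shrink * Delta       # unsuccessful iteration

rho = tr_ratio(f_x=10.0, f_trial=6.0, q_0=10.0, q_p=5.0)  # rho = 0.8
print(update_radius(rho, Delta=1.0))                       # (True, 2.0)
```

The new method replaces exactly this ratio test by a sufficient descent condition, as described in the overview below.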

Overview of the New Method
We construct an active set trust-region algorithm for the bound-constrained optimization problem (1.1):
• The trust-region ratio (1.9) is replaced by a variant of the sufficient descent condition (1.6), which is very useful in finite precision arithmetic whenever the function is highly nonlinear.
• To obtain a complexity bound for our algorithm, the trust-region radius is updated based on the reduced gradient.
• The reduced gradient is used as a critical measure to distinguish a good local minimizer from a spurious apparent local minimizer.
• Under the positive definiteness of the approximated Hessian matrices restricted to the subspace of free variables, it is shown that unlimited zigzagging cannot occur; hence, all strongly active variables are found and fixed after finitely many iterations.
Section 2 describes the assumptions on the problem and the algorithm required to prove an improved complexity result for bound-constrained optimization. In Sect. 3, a sufficient descent condition is used instead of the trust-region ratio. In Sect. 4, the new algorithm is described. The complexity result is proven in Sect. 5. Some implementation details are discussed in Sect. 6. Numerical results are given in Sect. 7, and conclusions are drawn in Sect. 8.

Complexity
In this section, we discuss the state-of-the-art complexity results in Sect. 2.1 and then our complexity results with a reasonable critical measure in Sect. 2.2.

Known Complexity
This subsection discusses why the two known critical measures may numerically accept points as local minimizers even though these points are far from any local minimizer.
The projection π(x) of x ∈ R^n into the feasible set x is defined componentwise: π(x)_i equals the lower bound if x_i lies below it, the upper bound if x_i lies above it, and x_i otherwise. (2.1) Two common critical measures for the problem (1.1) are built from this projection π. They were used to prove global convergence by Hager and Zhang [19,21] and Byrd et al. [5] and complexity bounds by Cartis et al. [7,8].
Cartis et al. [8] showed that their algorithm, starting from a point x^0 ∈ x, finds a point x ∈ x such that g^(1)(x) ≤ ε using at most N(ε) iterations. For the one-dimensional problem [28] min_{0≤x≤∞} x, (2.2) whose objective function has gradient g(x) ≡ 1, cancellation in the calculation of g^(1)(x) in double-precision arithmetic leads to acceptance of the point x = 10^17 as a minimizer, even though this point is far from the minimizer. Cartis et al. [7] used another critical measure, |χ(x)| ≤ ε. For the problem (2.2), this measure is χ(x) = −1, and this critical measure fails, since ε is assumed to be small.
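The cancellation effect for g^(1) on the problem (2.2) is easy to reproduce in double precision; the following sketch assumes g^(1)(x) = |x − π(x − g(x))| with π the projection onto [0, ∞):

```python
def g1_measure(x):
    """|x - proj_[0,inf)(x - g(x))| for f(x) = x on [0, inf), where g(x) = 1."""
    return abs(x - max(0.0, x - 1.0))

print(g1_measure(1.0))   # exact value 1.0
print(g1_measure(1e17))  # cancellation: x - 1.0 rounds back to x, so the measure is 0.0
```

At x = 10^17 the spacing between adjacent doubles is about 16, so x − 1 rounds back to x and the measure vanishes spuriously, exactly the failure described above.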

Some Auxiliary Results
This subsection introduces our reasonable critical measure and states the assumptions on the problem and the algorithm needed to achieve our complexity bound in the nonconvex case.
To overcome the shortcoming of the known critical measures, we seek an algorithm that needs at most N(ε) iterations to find a point x_best satisfying the corresponding criterion. This measure clearly overcomes the drawbacks of the two previous critical measures: applied to the problem (2.2), it yields g_red(0) = 0 and g_red(x) = 1 for x > 0. However, the reduced gradient is not continuous. Nevertheless, Neumaier and Azmi [28, Theorem 8.2] showed that if the sequence x^ℓ (ℓ ≥ 0) generated by BOPT converges to a point x̂ and g_red(x^ℓ) goes to zero, then g_red(x̂) = 0 and all strongly active variables are ultimately found at the nondegenerate stationary point x̂. As a result, if g_red(x^ℓ) goes to zero for a bounded sequence x^ℓ ∈ x, then some subsequence converges to a point x̂ ∈ x with g_red(x̂) = 0 (see [28, Corollary 8.3]). Accordingly, we prove the main convergence result in Sect. 5 (see Theorem 5.6).
We prove that the complexity bound O(ε^−2) holds in both the unconstrained and bound-constrained cases. In the unconstrained case, this complexity bound matches the known complexity bounds of Curtis [12,13], Grapiglia et al. [17], and Gratton et al. [18]. In the bound-constrained case, our complexity bound matches the known complexity results of Cartis et al. [7,8] for trust-region algorithms using a quadratic model, with the difference that our measure is numerically better behaved.
As in [10], we need to make some assumptions on the problem and the algorithm to determine the complexity results. We abbreviate
• the upper bound on the norm of the model's Hessian by "umh",
• the upper bound on the norm of the function's element Hessians by "ufh",
• the uniform norm equivalence by "une",
• the upper bound on the Hessians by "ubh".
Assumptions on the problem: (A1) The objective function f(x) is twice continuously differentiable and bounded below on the level set L(x^0). (A2) The Hessian G of the objective function is uniformly bounded. Assumptions on the algorithm: the following result, which uses (A4) and the corresponding definition, plays a key role in proving our complexity analysis. Similar results have been obtained for bound-constrained nonlinear systems; see [14,22-24].
Proof By the convexity of x, with p_I := ζ g_I, a minimizer of the one-dimensional model function (2.5) is considered. In the second case, (p_I)^T B_II p_I < −(g_I)^T p_I yields the stated estimate; from the projection theorem and the properties of the Euclidean norm, together with (2.4), (2.5), and (2.6), and since ℓ ≥ 1, the claim follows. Case 2: (p_I)^T B_II p_I ≤ 0. Then χ(t) ≤ ψ(t) := (g_I)^T p_I t, so that the function ψ(t) is linear. Using (2.4), the feasibility of x^ℓ, and the monotonicity of the projection operator, we obtain the desired bound; otherwise, Algorithm 1 discussed in Sect. 4 stops due to Proposition 2.4(iii). Hence, p is a descent direction, and t* := argmin_{t∈[0,1]} ψ(t) = 1 is the unique minimizer of ψ(t) on [0, 1], since −(g_I)^T p_I > 0 ≥ (p_I)^T B_II p_I. Finally, from (2.4) and (2.6), and since ℓ ≥ 1, we conclude the result. A Cauchy point is a minimizer of the model function Q along the projected steepest descent direction. Since the Cauchy point is not unique in the constrained case, the following assumption is stated for any global minimizer of (1.8).
(A5) The reduction of the model function Q is at least as large as that achieved at the Cauchy point, where p is a solution of the trust-region subproblem (1.8).
(A6) There exists a constant une ≥ 1 such that the norms used at the iterates x^k are uniformly equivalent to the Euclidean norm. Inspired by Algorithm 2 in [24], the following result is valid.
and with p ∈ R^n defined accordingly, we have p^q(α) = p for sufficiently small α > 0.
We use the following result to prove Proposition 2.5, below, which helps us to get our complexity bound in Sect. 5.

Proposition 2.4 The following three statements hold:
(i) For α ∈ (0, 1). The following is the final result of this section, which will be used to obtain Corollary 5.3 in Sect. 5. It gives a lower bound on the reduction of the model function Q.

Enforcing a Good Agreement
Our key contribution is to use the sufficient descent condition (1.6) in the trust-region framework instead of (1.9) to enforce a sensible decrease in the function value, provided that (g_I)^T p_I < 0 holds. This leads to the condition (3.1).

The Improved Trust-Region Algorithm
This section describes our improved trust-region algorithm, whose radius is updated based on the reduced gradient norm.
To update the trust-region radius Δ_ℓ for all ℓ ≥ 0, the parameter λ_ℓ is restricted to the interval [λ̲, λ̄], where 0 < λ̲ < λ̄ < ∞. This is done by projecting λ_ℓ into the interval: P_λ(λ_ℓ) := min(λ̄, max(λ_ℓ, λ̲)). (4.1) Before the trust-region radius is introduced, the associated factor is defined in terms of suitable constants 0 < σ_1 < 1 < σ_2, where P_λ(·) is computed by (4.1). The trust-region radius is then updated according to the reduced gradient.

We now describe our bound-constrained active set trust-region algorithm, called BCASTR. It tries to determine which constraints are likely active at the solution by enforcing the sufficient descent condition (3.1). If this condition holds, the iteration is successful and the trust-region radius is updated according to the reduced gradient norm, while the corresponding factor is increased. If the condition does not hold, the iteration is unsuccessful and the trust-region radius is reduced. Moreover, the trust-region subproblem in the subspace of non-active variables is solved in the hope of approaching the minimizer. Once the reduced gradient is below a given threshold, BCASTR terminates.
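A minimal sketch of the projection (4.1); the radius rule shown is a hypothetical choice of the form Δ = λ·‖g_red‖, since the text only states that the radius is updated based on the reduced gradient:

```python
def project_lambda(lam, lam_lo, lam_hi):
    """P_lambda(lam) := min(lam_hi, max(lam, lam_lo)), cf. (4.1)."""
    return min(lam_hi, max(lam, lam_lo))

def radius(lam, g_red_norm):
    """Hypothetical radius rule Delta = lam * ||g_red||, one way to tie the
    radius to the reduced gradient as the text describes."""
    return lam * g_red_norm

lam = project_lambda(5.0, 0.1, 2.0)   # clamped to the upper end of [0.1, 2.0]
print(lam, radius(lam, 0.5))          # 2.0 1.0
```

Tying the radius to ‖g_red‖ makes the trust region shrink automatically as a stationary point is approached, which is what enables the complexity argument of Sect. 5.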

Complexity Analysis and Limit Accuracy
In this section, under the assumptions (A1)-(A6), we prove the complexity bound for our algorithm and show that all strongly active variables are found ultimately at the nondegenerate stationary point.

Proposition 5.2 Suppose that (A1)-(A6) hold and let p be the ℓth inexact solution of (1.8).
Then Proof The proof is by induction. With the choice of λ_0 in the first line of Algorithm 1, the claim holds for ℓ = 0, since λ_0 ≥ λ̲ ≥ λ_min. Assume that the induction hypothesis holds, i.e., λ_ℓ ≥ λ_min. We show that it is valid for λ_{ℓ+1} by considering two cases:

Algorithm 1 BCASTR -bound constrained active set trust region method
Purpose: BCASTR solves the bound-constrained optimization problem (1.1).

Input:
The initial point x 0 ∈ x and the feasible set x.
Output: An estimated minimum point x best ∈ x and its function value f best .

Corollary 5.3 Let p be the ℓth inexact solution of (1.8) and suppose that (A3)-(A6) hold. Then
where λ min comes from Proposition 5.2.
To set up the complexity analysis, let us define the index sets S(ε) and U(ε). Here, S(ε) contains all successful iterations generated by (3.1), and U(ε) contains the unsuccessful iterations.
The following result gives an upper bound on the number of unsuccessful iterations.
Proof By the rule for updating λ_ℓ and Proposition 5.2, we get the claimed bound. The following result is our main complexity result for the nonconvex case.

Theorem 5.5 Suppose that (A1)-(A6) hold and let p be the ℓth inexact solution of (1.8).
Moreover, let f_0 be the initial function value and let μ_min be the minimum of the two fixed parameters μ. Given any ε ∈ (0, 1), BCASTR uses at most the stated number of iterations, where inf_ℓ f_ℓ is finite due to (A1) and λ_min comes from Proposition 5.2.
Proof (i) Let g_I := g_I(x^ℓ) and g_red := g_red(x^ℓ). Then we conclude the bound from Corollary 5.3. According to [28, Theorem 8.2], we show that when inf_ℓ ‖g_red(x^ℓ)‖ = 0 and the sequence generated by our method converges to a point x̂ ∈ x, then g_red(x̂) = 0, all strongly active variables are ultimately found at the nondegenerate stationary point x̂, and unlimited zigzagging cannot occur.
A point x̂ ∈ R^n is called a strong local minimizer of f provided that f is twice continuously differentiable in a neighborhood of x̂, the gradient g(x̂) of f at x̂ vanishes, and the true Hessian G(x̂) of f at x̂ is positive definite. If our algorithm converges to a strong local minimizer, then the true Hessian matrix G_II restricted to the subspace of free variables at this point is positive definite, and its approximation B_II is likely to be positive definite as well. In fact, G_II can be estimated by various quasi-Newton methods, most of which preserve the positive definiteness of B_II; see [5,6]. Hence, in the following theorem, it is assumed that B_II is a positive definite matrix.

Proof For sufficiently large ℓ, inf_ℓ f_ℓ is attained and is finite due to (A1). For sufficiently large ℓ, we obtain the estimate by Corollary 5.3, so that (5.2) holds. We define freeing iterations with respect to any index set I by the set F_I. Whether or not zigzagging is limited splits the proof into two cases.

Case 1: Unlimited zigzagging. In this case, F_I is infinite. Since I is changed infinitely often, (1.8) is solved infinitely often for different I. By Proposition 5.4, the number of unsuccessful iterations is finite; hence, the number of successful iterations is infinite. For I ⊂ I_+(x^ℓ), we have ‖g_I(x^ℓ)‖_∞ < ‖g_{I_+(x^ℓ)}(x^ℓ)‖_∞ = ‖g_red(x^ℓ)‖_∞ ≠ 0 for sufficiently large ℓ ∈ F_I. Hence, g_I = g_I(x^ℓ) ≠ 0 and g_I^T p_I ≠ 0 for sufficiently large ℓ. By the positive definiteness of B_II and Corollary 5.3, we get a contradiction. Hence, unlimited zigzagging cannot occur at the nondegenerate stationary point x̂.

Case 2: Limited zigzagging. In this case, F_I is finite, meaning that the number of freeing iterations is finite; i.e., there exists a number ℓ̂ such that I_ℓ := I := I_−(x^ℓ) for ℓ > ℓ̂. This means that the working set is fixed from then on.
In other words, for ℓ > ℓ̂, the trust-region algorithm solves the subproblem (1.8) in the fixed subspace I = I_−(x^ℓ), as in the unconstrained case, so that there are successful iterations once the optimal activities are identified. Hence, (5.2) holds for ℓ > ℓ̂. According to [28, Theorem 8.2], when inf_ℓ ‖g_red(x^ℓ)‖ = 0 and the sequence generated by our method converges to a point x̂ ∈ x, then g_red(x̂) = 0 and all strongly active variables are ultimately found at the nondegenerate stationary point x̂.

Some Implementation Details
This section discusses how to solve the trust-region subproblem (1.8) and presents three practical enhancements that increase the efficiency and robustness of our algorithm. A subspace technique, different from the subspace of free variables, is used to convert the n-dimensional bound-constrained problem (1.8) into a low-dimensional linearly constrained problem, which can then be solved by any bound-constrained solver.
We denote by s the difference between the current point and the previous point and by y the difference between the gradients at the current and previous points. Let J be a list of basis indices of the subspace spanned by the columns (the differences of the previous m points) of the matrix S. Here, m is the current subspace dimension, bounded by m_max. The gradient differences are saved as the columns of the matrix Y, and the subspace gradient is defined by c := S_IJ^T g_I. If the matrix B_II is symmetric, then the matrix H_JJ is symmetric. Moreover, if B_II is positive definite and the columns of S_IJ are linearly independent, then m ≤ n and H_JJ is positive definite as well. In practice, we do not compute the matrix B_II restricted to the subspace of free variables, since this would be costly for large-scale problems. Instead, the matrix H_JJ, restricted to the subspace spanned by the columns of S_IJ, is computed with low memory.
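The subspace reduction described above can be illustrated as follows; the dense matrices and the direct solve are for illustration only, since in practice H_JJ is formed with low memory rather than from an explicit B_II:

```python
import numpy as np

# Illustrative reduction of an n-dimensional quadratic model to the
# m-dimensional subspace spanned by the columns of S (m << n).
n, m = 6, 2
rng = np.random.default_rng(0)
B = rng.standard_normal((n, n))
B = B @ B.T + n * np.eye(n)        # symmetric positive definite model Hessian
g = rng.standard_normal(n)
S = rng.standard_normal((n, m))    # subspace basis (full column rank a.s.)

H = S.T @ B @ S                    # reduced Hessian: symmetric, SPD for full-rank S
c = S.T @ g                        # reduced gradient
z = np.linalg.solve(H, -c)         # minimizer of the reduced quadratic model
p = S @ z                          # lift back: subspace search direction
print(np.allclose(H, H.T))         # True
```

Note that p satisfies the reduced stationarity condition Sᵀ(g + Bp) = 0, i.e., the model gradient at p is orthogonal to the subspace.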
We now rewrite the trust-region subproblem (1.8) in the subspace spanned by the columns of S_IJ, which yields a low-dimensional linearly constrained problem. We first reformulate the constraint S_IJ z ∈ T_r as l ≤ S_IJ z ≤ u. Then, the subspace search direction is computed by p = S_IJ z. We now enrich the generic version of our algorithm by the following three practical enhancements:
• When the activity changes (the size of the working set is smaller than the previous one), the subspace needs to be restarted. In this case, we cannot form the problem (6.1). Instead, the descent direction against zigzagging by Kimiaei et al. [25] is used and projected into the box of free variables.
• When the trust-region algorithm is unsuccessful, a gradient-free line search, such as the improved curved line search method by Kimiaei et al. [25], is used; this does not change the order of our complexity bound and only increases the efficiency and robustness of our algorithm.
• The initial search direction is computed based on the gradient sign, in the same way as in Kimiaei et al. [25].
If there is very little progress, regardless of whether the activity changes, i.e., y ≈ 0, the subspace ingredients cannot be updated while the gradient is still large; this corresponds to a new pair (s, y) violating the condition |g^T y| ≥ Δ_po g^T g (6.2) with a tiny tuning parameter Δ_po ∈ (0, 1). We initially set m = 0. If a new pair (s, y) satisfies (6.2), m is increased by one and s and y are appended as new columns to S and Y, respectively. Once m reaches its limit m_max, it is kept fixed, and the oldest columns of S and Y are replaced by s and y, respectively.
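The pair-acceptance test (6.2) and the replacement of the oldest columns can be sketched as follows (the function name and the default value of Δ_po are illustrative):

```python
import numpy as np

def update_pairs(S, Y, s, y, g, m_max, delta_po=1e-12):
    """Append (s, y) only if |g^T y| >= delta_po * g^T g, cf. (6.2);
    once m_max columns are stored, overwrite the oldest one."""
    if abs(g @ y) < delta_po * (g @ g):
        return S, Y                  # too little progress: skip the update
    if S.shape[1] < m_max:
        return np.column_stack([S, s]), np.column_stack([Y, y])
    # drop the oldest column, keep the newest m_max - 1, append the new pair
    return (np.column_stack([S[:, 1:], s]),
            np.column_stack([Y[:, 1:], y]))

n = 3
S, Y = np.empty((n, 0)), np.empty((n, 0))
g = np.ones(n)
S, Y = update_pairs(S, Y, np.arange(3.0), np.ones(3), g, m_max=2)
print(S.shape[1])  # 1
```

Rejecting pairs with |gᵀy| below the threshold prevents near-zero gradient differences from degrading the conditioning of the stored subspace basis.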

Numerical Results
This section compares our algorithm (BCASTR) with LMBOPT by Kimiaei et al. [25], LMTR-EIG-MS by Burdakov et al. [4], and FISTA (the fast iterative shrinkage-thresholding algorithm) by Beck and Teboulle [1] on the ill-conditioned bound-constrained least-squares problem (7.1). Here, randn(m, n) creates an m × n random matrix whose entries are drawn from the standard normal distribution, qr computes a QR factorization of the m × n matrix A, and diag creates a diagonal matrix from the vector logspace(5, 0, m), whose m entries are logarithmically spaced between 10^5 and 10^0. The Hessian of the objective function (7.1) has condition number 10^10. The initial point for the problem (7.1) is a random vector. In Table 1, we denote by nf the number of function evaluations, by ng the number of gradient evaluations, by ‖g_red‖_∞ the infinity norm of the reduced gradient at the best point x_best found by a solver, and by f_best the function value at x_best. Moreover, F indicates that an algorithm failed for any reason. The algorithms are terminated when either ‖g_red‖_∞ ≤ 10^−6 or nf + 2 × ng ≥ 10^4.
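A hypothetical reconstruction of the test problem generator for (7.1), assuming m = n and A = Q D with Q the orthogonal factor of a QR factorization of a random matrix; the bound constraints and the exact MATLAB calls are not reproduced here:

```python
import numpy as np

# Ill-conditioned least-squares objective 0.5 * ||A x - b||^2 with cond(A) = 1e5,
# so the Hessian A^T A has condition number 1e10, as stated in the text.
m, n = 100, 100
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((m, n)))  # orthogonal factor
D = np.diag(np.logspace(5, 0, m))                 # singular values 1e5 .. 1e0
A = Q @ D
b = rng.standard_normal(m)

f = lambda x: 0.5 * np.linalg.norm(A @ x - b) ** 2
cond = np.linalg.cond(A.T @ A)
print(np.log10(cond))  # approximately 10
```

Since Q is orthogonal, AᵀA = D², so the Hessian condition number is exactly (10^5)² = 10^10 up to rounding, matching the statement above.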

Conclusion
This paper described an active set trust-region method for the bound-constrained optimization problem (1.1) with exact function values and gradients. This method
• replaced the traditional trust-region ratio by a variant of the sufficient descent condition (1.6), which is useful in finite precision arithmetic and in severely nonconvex regions;
• used the reduced gradient as a critical measure to obtain a point that is never a spurious apparent local minimizer arising from cancellation in the calculation of the other known critical measures in double-precision arithmetic;
• updated the trust-region radius according to the reduced gradient; as a result, under the positive definiteness of the approximated Hessian matrices restricted to the subspace of free variables, unlimited zigzagging could not occur, and hence all strongly active variables were found and fixed after finitely many iterations.
Numerical results showed that our algorithm is very competitive with the state-of-the-art solvers on the ill-conditioned bound-constrained least-squares problem (7.1).
Funding Open access funding provided by University of Vienna.

Conflicts of interest
The author declares that he has no conflict of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.