Abstract
In this paper, a semi-local convergence analysis of the Gauss-Newton method for convex composite optimization is presented using the concept of quasi-regularity; the analysis also applies to the iterative computation of fixed points in optimization. Our convergence analysis is presented first under L-average Lipschitz conditions and then under generalized convex majorant conditions. The results extend the applicability of the Gauss-Newton method under the same computational cost as in earlier studies such as Li and Ng (SIAM J. Optim. 18:613-642, 2007), Moldovan and Pellegrini (J. Optim. Theory Appl. 142:147-163, 2009), Moldovan and Pellegrini (J. Optim. Theory Appl. 142:165-183, 2009), Wang (Math. Comput. 68:169-186, 1999) and Wang (IMA J. Numer. Anal. 20:123-134, 2000).
1 Introduction
In this paper, we are concerned with the convex composite optimization problem. Many problems in mathematical programming, such as convex inclusion problems, minimax problems, penalization methods, goal programming problems, and constrained optimization problems, can be formulated as convex composite optimization problems (see, for example, [1–6]).
Recently, in the elegant study by Li and Ng [7], the notion of quasi-regularity for \(x_{0} \in \mathbb{R}^{l}\) with respect to the inclusion problem was used. This notion generalizes the case of regularity studied in the seminal paper by Burke and Ferris [3], as well as the case when the mapping \(d \longmapsto F'(x_{0}) d - \mathcal {C}\) is surjective, a condition introduced by Robinson in [8, 9] (see also [1, 10, 11]).
In this paper, we present a convergence analysis of the Gauss-Newton method (GNM) (see the method (GNA) in Section 2). In [7], the convergence of the method (GNA) is based on the generalized Lipschitz conditions introduced by Wang [12, 13] (made precise in Section 2). In [11], we presented, in the setting of Banach spaces, a convergence analysis for the method (GNM) finer than the ones in [12–16], with the advantages \((\mathcal{A})\): tighter error estimates on the distances involved and at least as precise information on the location of the solution. These advantages were obtained (under the same computational cost) using the same or weaker hypotheses. Here, we provide the same advantages \((\mathcal{A})\), but for the method (GNA).
The rest of the study is organized as follows: Section 2 contains the notions of generalized Lipschitz conditions and the majorizing sequences for the method (GNA). In order for us to make the paper as self-contained as possible, the notion of quasi-regularity is re-introduced (see, for example, [7]) in Section 3. Semi-local convergence analysis of the method (GNA) using L-average conditions is presented in Section 4. In Section 5, some convex majorant conditions are used for the semi-local convergence of the method (GNA).
2 Generalized Lipschitz conditions and majorizing sequences
The purpose of this paper is to study the convex composite optimization problem:
$$\min_{x \in \mathbb{R}^{l}} f(x) := h\bigl(F(x)\bigr), \tag{2.1}$$
where \(h : \mathbb{R}^{m} \rightarrow \mathbb{R}\) is a convex operator, \(F : \mathbb{R}^{l} \rightarrow \mathbb{R}^{m}\) is a Fréchet-differentiable operator and \(m , l \in \mathbb{N}^{\star}\).
The study of the problem (2.1) is very important. On the one hand, the problem (2.1) provides a unified framework for the development and analysis of algorithmic methods; on the other hand, it is a powerful tool for the study of first- and second-order optimality conditions in constrained optimization (see, for example, [1–7]).
We assume that the minimum \(h_{\min}\) of the function h is attained. The problem (2.1) is related to the following inclusion:
$$F(x) \in \mathcal{C},$$
where
$$\mathcal{C} = \bigl\{ y \in \mathbb{R}^{m} : h(y) = h_{\min} \bigr\}$$
is the set of all minimum points of h.
A semi-local convergence analysis for the Gauss-Newton method (GNM) is presented using the following popular algorithm (see, for example, [1, 7, 17]):
Algorithm
(GNA): \((\xi, \Delta, x_{0})\)
Let \(\xi\in[1, \infty[\), \(\Delta\in\,] 0, \infty]\) and, for each \(x \in \mathbb{R}^{l}\), define \(\mathcal {D}_{\Delta}(x)\) by
$$\mathcal {D}_{\Delta}(x) := \Bigl\{ d \in \mathbb{R}^{l} : \| d \| \leq\Delta \text{ and } h\bigl(F(x) + F'(x) d\bigr) \leq h\bigl(F(x) + F'(x) d'\bigr) \text{ for all } d' \in \mathbb{R}^{l} \text{ with } \| d' \| \leq\Delta \Bigr\}.$$
Let also \(x_{0} \in \mathbb{R}^{l}\) be given. Having \(x_{0}, x_{1}, \ldots, x_{k}\) (\(k \geq0\)), determine \(x_{k+1}\) by the following.
If \(0 \in \mathcal {D}_{\Delta}(x_{k})\), then STOP;
If \(0 \notin \mathcal {D}_{\Delta}(x_{k})\), choose \(d_{k}\) such that \(d_{k} \in \mathcal {D}_{\Delta}(x_{k})\) and
$$\| d_{k} \| \leq\xi\, d\bigl(0, \mathcal {D}_{\Delta}(x_{k})\bigr).$$
Then set \(x_{k+1} = x_{k} + d_{k}\).
Here, \(d(x, W)\) denotes the distance from x to W in the finite dimensional Banach space containing W. Note that the set \(\mathcal {D}_{\Delta}(x)\) (\(x \in \mathbb{R}^{l}\)) is nonempty and is the solution set of the following convex optimization problem:
$$\min_{d \in \mathbb{R}^{l},\ \| d \| \leq\Delta} h\bigl(F(x) + F'(x) d\bigr),$$
which can be solved by well-known methods such as the subgradient, cutting plane or bundle methods (see, for example, [18, 19]).
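For readers who want to experiment, the following Python sketch implements one natural reading of (GNA) for the choice \(\xi=1\) (the inner minimizer is accepted as \(d_{k}\)). All names (gauss_newton_gna, F, Fprime, h) are our own, and the inner convex subproblem is solved here with a general-purpose routine purely for illustration.

```python
import numpy as np
from scipy.optimize import minimize

def gauss_newton_gna(F, Fprime, h, x0, Delta=1.0, tol=1e-10, max_iter=50):
    """Sketch of (GNA) with xi = 1: at each iterate solve the convex
    subproblem min_{||d|| <= Delta} h(F(x) + F'(x) d), then set x <- x + d."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        Fx, Jx = np.asarray(F(x)), np.asarray(Fprime(x))
        obj = lambda d, Fx=Fx, Jx=Jx: h(Fx + Jx @ d)   # linearized objective
        ball = [{"type": "ineq", "fun": lambda d: Delta - np.linalg.norm(d)}]
        d = minimize(obj, np.zeros_like(x), constraints=ball).x
        if np.linalg.norm(d) <= tol:    # 0 in D_Delta(x_k): STOP
            return x
        x = x + d                       # x_{k+1} = x_k + d_k
    return x
```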
Notice that, in the special case when \(l=m\) and \(F(x)=H(x)-x\), the results obtained in this paper can be used to iteratively compute fixed points of the operator \(H:\mathbb {R}^{m} \rightarrow \mathbb {R}^{m}\). Therefore, the results obtained in this paper are useful in fixed point theory and its applications in optimization.
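For instance, with the hypothetical helper gauss_newton_gna above, the fixed point of \(H(x) = \cos x\) can be computed by taking \(F(x)=H(x)-x\) and the smooth convex choice \(h(y) = \| y \|^{2}\), whose minimum set is \(\mathcal{C} = \{0\}\):

```python
H = lambda x: np.cos(x)                            # operator with a fixed point
F = lambda x: H(x) - x                             # F(x) in C = {0} iff H(x) = x
Fp = lambda x: np.array([[-np.sin(x[0]) - 1.0]])   # Frechet derivative of F
h = lambda y: float(y @ y)                         # convex, minimized exactly on C

x_star = gauss_newton_gna(F, Fp, h, np.array([0.5]))
print(x_star)   # ~ [0.739085]: the unique solution of cos(x) = x
```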
Let \(U(x,r)\) denote the open ball in \(\mathbb {R}^{l}\) (or \(\mathbb {R}^{m}\)) centered at x and of radius \(r>0\). By \(\overline{U} (x,r)\) we denote its closure. Let W be a closed convex subset of \(\mathbb {R}^{l}\) (or \(\mathbb {R}^{m}\)). The negative polar of W, denoted by \(W ^{\circleddash}\), is defined as
$$W ^{\circleddash} := \bigl\{ z : \langle z, w \rangle\leq0 \text{ for all } w \in W \bigr\}.$$
We need the following notion of the generalized Lipschitz condition due to Wang in [12, 13] (see also [7]). From now on, \(L : [0, \infty[\longrightarrow\,]0, \infty[\) (or \(L_{0}\)) denotes a nondecreasing and absolutely continuous function. Moreover, η and α denote given positive numbers.
Definition 2.1
Let \(\mathcal {Y}\) be a Banach space and let \(x_{0} \in \mathbb{R}^{l}\). Let \(G : \mathbb{R}^{l} \longrightarrow \mathcal {Y}\). Then, G is said to satisfy:
(1) the center \(L_{0}\)-average condition on \(U (x_{0} , r)\) if
$$\bigl\| G(x) - G(x_{0}) \bigr\| \leq \int_{0}^{\| x - x_{0} \|} L_{0}(u) \,du \tag{2.8}$$
for all \(x \in U(x_{0} , r)\);
(2) the L-average Lipschitz condition on \(U (x_{0} , r)\) if
$$\bigl\| G(x) - G(y) \bigr\| \leq \int_{\| y - x_{0} \|}^{\| x- y \|+ \| y - x_{0} \|} L(u) \,du \tag{2.9}$$
for all \(x,y \in U(x_{0} , r)\) with \(\| x- y \|+ \| y - x_{0} \|\leq r\).
Remark 2.2
It follows from (2.8) and (2.9) that, if G satisfies the L-average Lipschitz condition, then it satisfies the center \(L_{0}\)-average condition, but not necessarily vice versa. Indeed,
$$L_{0}(u) \leq L(u) \tag{2.10}$$
for each \(u \in[0 , r]\) holds in general, and \({L}/{L_{0}}\) can be arbitrarily large (see [1, 2, 10]).
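As a simple illustration of this last point (our own example, with constant choices of the functions \(L_{0}\) and L), take \(G = F'\) with \(F'(x) = e^{x}\) on \(U(x_{0}, r) \subset\mathbb{R}\). The smallest admissible constants are then
$$L = e^{x_{0} + r}, \qquad L_{0} = \frac{e^{x_{0}} (e^{r} - 1)}{r}, \qquad \frac{L}{L_{0}} = \frac{r e^{r}}{e^{r} - 1} \longrightarrow\infty \quad\text{as } r \longrightarrow\infty,$$
so the ratio \(L/L_{0}\) is indeed unbounded.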
Definition 2.3
Define a majorizing function \(\psi_{\alpha}\) on \([0, + \infty)\) by
$$\psi_{\alpha}(t) = \eta- t + \alpha \int_{0}^{t} L(u) (t - u) \,du \tag{2.12}$$
for each \(t \geq0\) and a majorizing sequence \(\{ t_{\alpha, n } \}\) by
$$t_{\alpha, 0 } = 0, \qquad t_{\alpha, n+1 } = t_{\alpha, n } - \frac{\psi_{\alpha}(t_{\alpha, n })}{\psi'_{\alpha}(t_{\alpha, n })} \tag{2.13}$$
for each \(n\geq0\). The sequence \(\{ t_{\alpha, n } \}\) was used in [7] as a majorizing sequence for \(\{x_{n} \}\) generated by the algorithm (GNA).
The sequence \(\{ t_{\alpha, n } \}\) can also be written, equivalently, for each \(n\geq1\) and \(t_{\alpha, 1 } =\eta\) as
$$t_{\alpha, n+1 } = t_{\alpha, n } + \alpha_{\alpha, n} ( t_{\alpha, n } - t_{\alpha, n-1 } ),$$
where
$$\alpha_{\alpha, n} = \frac{\alpha \int_{t_{\alpha, n-1}}^{t_{\alpha, n}} L(u) ( t_{\alpha, n} - u ) \,du}{ ( 1 - \alpha \int_{0}^{t_{\alpha, n}} L(u) \,du ) ( t_{\alpha, n } - t_{\alpha, n-1 } )},$$
since (see (4.20) in [7])
$$\psi_{\alpha}(t_{\alpha, n}) = \alpha \int_{t_{\alpha, n-1}}^{t_{\alpha, n}} L(u) ( t_{\alpha, n} - u ) \,du$$
for each \(n\geq1\).
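The iteration (2.13) is straightforward to carry out numerically; the sketch below (helper name and quadrature routine are our own choices) generates \(\{ t_{\alpha, n} \}\) for a user-supplied function L.

```python
from scipy.integrate import quad

def majorizing_t(L, eta, alpha, n_steps):
    """t_{a,0} = 0 and t_{a,n+1} = t_n - psi_a(t_n)/psi_a'(t_n), where
    psi_a(t) = eta - t + alpha * int_0^t L(u) (t - u) du   (cf. (2.12)-(2.13))."""
    psi = lambda t: eta - t + alpha * quad(lambda u: L(u) * (t - u), 0, t)[0]
    dpsi = lambda t: -1.0 + alpha * quad(L, 0, t)[0]
    t, seq = 0.0, [0.0]
    for _ in range(n_steps):
        t -= psi(t) / dpsi(t)   # Newton step on psi_a; the first step gives t_1 = eta
        seq.append(t)
    return seq
```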
From now on, we show how our convergence analysis for the algorithm (GNA) is finer than the one in [7]. Define a supplementary majorizing function \(\psi_{\alpha, 0}\) on \([0, + \infty)\) by
$$\psi_{\alpha, 0}(t) = \eta- t + \alpha \int_{0}^{t} L_{0}(u) (t - u) \,du$$
for each \(t \geq0\) and the corresponding majorizing sequence \(\{ s _{\alpha, n } \}\) by
$$s_{\alpha, 0 } = 0, \qquad s_{\alpha, 1 } = \eta, \qquad s_{\alpha, n+1 } = s_{\alpha, n } + \beta_{\alpha, n} ( s_{\alpha, n } - s_{\alpha, n-1 } ) \tag{2.17}$$
for each \(n\geq1\), where \(\beta_{\alpha, {n}}\) is defined as \(\alpha_{\alpha, {n}}\) with \(s_{\alpha, {n-1}}\), \(s_{\alpha, {n}}\) replacing \(t_{\alpha, {n-1}}\), \(t_{\alpha, {n}}\), respectively, and with \(\psi'_{\alpha, 0}\) replacing \(\psi'_{\alpha}\) (i.e., \(L_{0}\) replacing L in the denominator).
The results concerning \(\{ t_{\alpha, n} \}\) are already in the literature (see, for example, [1, 7, 11]), whereas the corresponding ones for the sequence \(\{ s_{\alpha, n} \}\) can be derived in an analogous way by simply using \(\psi' _{\alpha, 0}\) instead of \(\psi' _{\alpha}\).
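Under this reading of (2.17) (numerator integrals with L, denominator with the center function \(L_{0}\); an interpretation we adopt based on the quoted prescription), the sequence \(\{ s_{\alpha, n} \}\) can be generated as follows, reusing quad from the previous sketch:

```python
def majorizing_s(L, L0, eta, alpha, n_steps):
    """s_0 = 0, s_1 = eta, and
    s_{n+1} = s_n + alpha*int_{s_{n-1}}^{s_n} L(u)(s_n - u) du
                    / (1 - alpha*int_0^{s_n} L0(u) du)."""
    prev, s, seq = 0.0, eta, [0.0, eta]
    for _ in range(n_steps):
        num = alpha * quad(lambda u: L(u) * (s - u), prev, s)[0]
        den = 1.0 - alpha * quad(L0, 0, s)[0]   # = -psi_{a,0}'(s_n)
        prev, s = s, s + num / den
        seq.append(s)
    return seq
```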
First, we need some auxiliary results for the properties of functions \(\psi_{\alpha}\), \(\psi_{\alpha, 0}\) and the relationship between sequences \(\{ s_{\alpha, n } \}\) and \(\{ t_{\alpha, n } \}\). The proofs of the next four lemmas involving the \(\psi_{\alpha}\) function can be found in [7], whereas the proofs for the function \(\psi_{\alpha, 0 }\) are analogously obtained by simply replacing L by \(L_{0}\).
Let \(r_{\alpha}>0\), \(b_{\alpha}>0\), \(r_{\alpha, 0} >0 \), and \(b_{\alpha, 0} >0\) be such that
$$\alpha \int_{0}^{r_{\alpha}} L(u) \,du = 1, \qquad b_{\alpha}= \alpha \int_{0}^{r_{\alpha}} L(u)\, u \,du \tag{2.18}$$
and
$$\alpha \int_{0}^{r_{\alpha, 0}} L_{0}(u) \,du = 1, \qquad b_{\alpha, 0}= \alpha \int_{0}^{r_{\alpha, 0}} L_{0}(u)\, u \,du. \tag{2.19}$$
Clearly, we have
$$\psi'_{\alpha}(t) = - 1 + \alpha \int_{0}^{t} L(u) \,du$$
and
$$\psi'_{\alpha, 0}(t) = - 1 + \alpha \int_{0}^{t} L_{0}(u) \,du.$$
In view of (2.10), (2.18), and (2.19), we get
$$r_{\alpha}\leq r_{\alpha, 0}$$
and
$$b_{\alpha}\leq b_{\alpha, 0}.$$
Lemma 2.4
Suppose that \(0 < \eta\leq b_{\alpha}\). Then \(b_{\alpha}< r_{\alpha}\) and the following assertions hold:
(1) \(\psi_{\alpha}\) is strictly decreasing on \([0, r_{\alpha}]\) and strictly increasing on \([r_{\alpha}, \infty)\) with \(\psi_{\alpha}(\eta) >0\), \(\psi_{\alpha}(r_{\alpha}) = \eta- b_{\alpha}\leq0\), \(\psi_{\alpha}(+ \infty) \geq\eta>0\);
(2) \(\psi_{\alpha, 0}\) is strictly decreasing on \([0, r_{\alpha, 0}]\) and strictly increasing on \([r_{\alpha, 0} , \infty)\) with \(\psi_{\alpha, 0} (\eta) >0\), \(\psi_{\alpha, 0}(r_{\alpha, 0}) = \eta- b_{\alpha, 0} \leq0\), \(\psi_{ \alpha, 0} (+ \infty) \geq\eta>0\).
Moreover, if \(\eta< b_{\alpha}\), then \(\psi_{\alpha}\) has two zeros, denoted by \(r _{\alpha}^{\star} \) and \(r _{\alpha}^{\star\star} \), such that
and, if \(\eta=b_{\alpha}\), then \(\psi_{\alpha}\) has a unique zero \(r _{\alpha}^{\star} =r_{\alpha}\) in \((\eta, \infty)\);
similarly, if \(\eta< b_{\alpha,0}\), then \(\psi_{\alpha,0}\) has two zeros, denoted by \(r _{\alpha, 0}^{\star} \) and \(r _{\alpha, 0}^{\star\star} \), such that
and, if \(\eta=b_{\alpha,0} \), then \(\psi_{\alpha,0}\) has a unique zero \(r _{\alpha, 0}^{\star} =r_{\alpha, 0} \) in \((\eta, \infty)\);
(3) \(\{ t _{\alpha, n} \}\) is strictly monotonically increasing and converges to \(r _{\alpha}^{\star} \);
(4) \(\{ s _{\alpha, n} \}\) is strictly monotonically increasing and converges to its unique least upper bound \(s _{\alpha }^{\star} \leq r_{\alpha, 0}^{\star}\);
(5) The convergence of \(\{ t _{\alpha, n} \}\) is quadratic if \(\eta< b_{\alpha}\) and linear if \(\eta= b_{\alpha}\).
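In the Kantorovich case of a constant function \(L(u) \equiv L\), all the quantities in (2.18) and Lemma 2.4 have closed forms, since \(\psi_{\alpha}(t) = \eta - t + (\alpha L/2) t^{2}\); the short sketch below (our own arithmetic) computes them.

```python
import numpy as np

def kantorovich_data(L, eta, alpha=1.0):
    """Closed forms for constant L: psi_a(t) = eta - t + (alpha*L/2) t**2."""
    r_a = 1.0 / (alpha * L)              # stationary point: psi_a'(r_a) = 0
    b_a = 1.0 / (2.0 * alpha * L)        # psi_a(r_a) = eta - b_a
    disc = 1.0 - 2.0 * alpha * L * eta   # zeros exist iff eta <= b_a
    if disc < 0:
        return r_a, b_a, None, None
    r_star = (1.0 - np.sqrt(disc)) / (alpha * L)    # smaller zero r*_a
    r_star2 = (1.0 + np.sqrt(disc)) / (alpha * L)   # larger zero r**_a
    return r_a, b_a, r_star, r_star2
```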
Lemma 2.5
Let \(r_{\alpha}\), \(r_{\alpha,0}\), \(b_{\alpha}\), \(b_{\alpha,0}\), \(\psi _{\alpha}\), \(\psi_{\alpha,0}\) be as defined above. Let \(\overline{\alpha} > \alpha\). Then the following assertions hold:
(1) the functions \(\alpha\rightarrow r_{\alpha}\), \(\alpha\rightarrow r_{\alpha, 0}\), \(\alpha\rightarrow b_{\alpha}\), \(\alpha\rightarrow b_{\alpha, 0}\) are strictly decreasing on \([0, \infty)\);
(2) \(\psi_{\alpha}< \psi_{\overline{\alpha}}\) and \(\psi_{\alpha, 0} < \psi_{\overline{\alpha} , 0}\) on \([0, \infty)\);
(3) the function \(\alpha\rightarrow r _{\alpha}^{\star}\) is strictly increasing on \(I(\eta)\), where
$$I(\eta) = \{ \alpha>0 :\eta\leq b_{\alpha}\}; $$
(4) the function \(\alpha\rightarrow r _{\alpha, 0}^{\star}\) is strictly increasing on \(I(\eta)\).
Lemma 2.6
Let \(0 \leq\lambda< \infty\). Define the functions
for all \(t \geq0\) and
for all \(t \geq0\). Then the functions χ and \(\chi_{0}\) are increasing on \([0 , \infty)\).
Lemma 2.7
Define the function
for all \(t \in[0, r _{\alpha}^{\star} )\). Suppose that \(0 < \eta\leq b_{\alpha}\). Then the function \(g_{\alpha}\) is increasing on \([0, r _{\alpha}^{\star} )\).
Next, we show that the sequence \(\{ s _{\alpha, n} \}\) is tighter than \(\{ t _{\alpha, n} \}\).
Lemma 2.8
Suppose that the hypotheses of Lemma 2.4 hold and the sequences \(\{ s _{\alpha, n} \}\), \(\{ t _{\alpha, n} \}\) are well defined for each \(n\geq0\). Then the following assertions hold: for all \(n\geq0\),
and
Moreover, if the strict inequality holds in (2.10), then it also holds in (2.29) and (2.30) for all \(n >1\). Furthermore, the convergence of \(\{ s _{\alpha, n} \}\) is quadratic if \(\eta< b_{\alpha}\) and linear if \(L_{0}=L\) and \(\eta= b_{\alpha}\).
Proof
First, we show, using induction, that (2.29) and (2.30) are satisfied for each \(n\geq0\). These estimates hold true for \(n=0,1\) since \(s _{\alpha, 0}=t _{\alpha, 0}=0\) and \(s _{\alpha, 1}=t _{\alpha, 1}=\eta\). Using (2.10), (2.13), and (2.17) for \(n=1\), we have
and
since
for each \(s \leq t\). Hence the estimate (2.29) holds true for \(n=0,1,2\) and (2.30) holds true for \(n=0,1\). Suppose that
for each \(m=0,1,2, \ldots, k+1\) and
for each \(m=0,1,2, \ldots, k\). Then we have
and
The induction for (2.29) and (2.30) is complete.
Finally, the estimate (2.31) follows from (2.30) by letting \(n\rightarrow\infty\). The convergence order part for the sequence \(\{ s_{\alpha, n} \}\) follows from (2.30) and Lemma 2.4(5). This completes the proof. □
Remark 2.9
If \(L_{0} =L\), the results in Lemmas 2.4-2.8 reduce to the corresponding ones in [7]. Otherwise (i.e., if \(L_{0} < L\)), our results constitute an improvement (see also (2.22)-(2.26)).
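A quick numerical experiment makes Remark 2.9 concrete; it uses the hypothetical helpers majorizing_t and majorizing_s sketched earlier, with constant functions \(L_{0} < L\):

```python
L_fun = lambda u: 2.0    # constant L
L0_fun = lambda u: 1.0   # constant center function, L0 < L
eta, alpha = 0.2, 1.0    # here 2*alpha*L*eta = 0.8 <= 1, so eta <= b_alpha

t_seq = majorizing_t(L_fun, eta, alpha, 6)
s_seq = majorizing_s(L_fun, L0_fun, eta, alpha, 6)
print([round(t, 6) for t in t_seq])   # t_2 ~ 0.266667, ...
print([round(s, 6) for s in s_seq])   # s_2 = 0.25 <= t_2, as in Lemma 2.8
```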
3 Background on regularities
In order for us to make the study as self-contained as possible, we mention some concepts and results on regularities which can be found in [7] (see, also, [1, 10, 12, 15, 20–22]).
For a set-valued mapping \(T : \mathbb{R}^{l} \rightrightarrows\mathbb {R}^{m}\) and for a set A in \(\mathbb{R}^{l}\) or \(\mathbb{R}^{m} \), we denote by
$$D(T) := \bigl\{ x \in\mathbb{R}^{l} : Tx \neq\emptyset\bigr\}, \qquad T(A) := \bigcup_{x \in A} Tx, \qquad T^{-1}(y) := \bigl\{ x \in\mathbb{R}^{l} : y \in Tx \bigr\}$$
the domain of T, the image of A under T, and the inverse of T, respectively.
Consider the inclusion
$$F(x) \in C, \tag{3.1}$$
where C is a closed convex set in \(\mathbb{R}^{m}\). Let \(x\in\mathbb {R}^{l} \) and
$$\mathcal {D}(x) := \bigl\{ d \in\mathbb{R}^{l} : F(x) + F'(x) d \in C \bigr\}.$$
Definition 3.1
Let \(x_{0} \in\mathbb{R}^{l}\).
(1) \(x_{0}\) is called a quasi-regular point of the inclusion (3.1) if there exist \(R \in\,]0, +\infty[\) and an increasing positive function β on \([0,R[\) such that
$$\mathcal {D}(x) \neq\emptyset \quad\text{and}\quad d\bigl(0, \mathcal {D}(x)\bigr) \leq\beta\bigl(\| x - x_{0} \|\bigr)\, d\bigl(F(x), C\bigr) \tag{3.3}$$
for all \(x \in U(x_{0} , R)\); here \(\beta(\| x - x_{0} \|)\) is an 'error bound' determining how far the origin is from the solution set \(\mathcal {D}(x)\) of the linearized inclusion associated with (3.1).
(2) \(x_{0}\) is called a regular point of the inclusion (3.1) if
Proposition 3.2
(see [3])
Let \(x_{0}\) be a regular point of (3.1). Then there are constants \(R>0\) and \(\beta>0\) such that (3.3) holds for R and \(\beta( \cdot) = \beta\). Therefore, \(x_{0}\) is a quasi-regular point with the quasi-regular radius \(R_{x_{0}} \geq R\) and the quasi-regular bound function \(\beta_{x_{0}} \leq\beta\) on \([0, R]\).
Remark 3.3
(1) \(\mathcal {D}(x) \) can be considered as the solution set of the linearized problem associated to (3.1):
$$F(x) + F'(x) d \in C.$$
(2) If C defined in (3.1) is the set of all minimum points of h and there exists \(d_{0} \in \mathcal {D}(x) \) with \(\| d_{0} \|\leq\Delta\), then \(d_{0} \in \mathcal {D}_{\Delta}(x) \) and, for each \(d \in\mathbb{R}^{l}\), we have the following equivalence:
(3) Let \(R_{x_{0}}\) denote the supremum of R such that (3.3) holds for some function β as in Definition 3.1. Let \(R \in[0, R _{x_{0}}]\) and let \(\mathcal{B} _{R} ( x_{0} )\) denote the set of functions β defined on \([0, R)\) such that (3.3) holds. Define
$$\beta_{x_{0}}(t) := \inf\bigl\{ \beta(t) : \beta\in\mathcal{B} _{R_{x_{0}}} ( x_{0} ) \bigr\} \tag{3.7}$$
for each \(t \in[0, R_{x_{0}})\). Every function \(\beta\in\mathcal{B} _{R} ( x_{0} )\) with \(\lim_{t \rightarrow R^{-}} \beta(t) < + \infty \) can be extended to an element of \(\mathcal{B} _{R_{x_{0}}} ( x_{0} ) \), and we have
$$\beta_{x_{0}}(t) \leq\beta(t)$$
for each \(t \in[0, R)\). Here, \(R_{x_{0}}\) and \(\beta_{x_{0}}\) are called the quasi-regular radius and the quasi-regular bound function of the quasi-regular point \(x_{0}\), respectively.
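For intuition, the error bound (3.3) can be probed numerically on a toy affine inclusion; everything below (the data A, b, the choice \(C = \,]{-}\infty, 0]^{3}\), and the helper names) is an assumption made purely for illustration.

```python
import numpy as np
from scipy.optimize import minimize

A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])    # F'(x) = A for all x
b = np.array([0.2, 0.3, 0.4])
F = lambda x: A @ x - b                                # inclusion: F(x) in C
dist_C = lambda y: np.linalg.norm(np.maximum(y, 0.0))  # d(y, C) for C = (-inf, 0]^3

def dist_0_D(x):
    """d(0, D(x)), where D(x) = {d : F(x) + A d in C}, i.e. F(x) + A d <= 0."""
    cons = [{"type": "ineq", "fun": lambda d: -(F(x) + A @ d)}]
    res = minimize(lambda d: d @ d, np.zeros(2), constraints=cons)
    return np.linalg.norm(res.x)

x0, x = np.zeros(2), np.array([0.5, 0.5])
ratio = dist_0_D(x) / dist_C(F(x))   # an empirical candidate for beta(||x - x0||)
print(ratio)                         # ~ 0.606 for this data
```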
Definition 3.4
(1) A set-valued mapping \(T : \mathbb{R}^{l} \rightrightarrows\mathbb{R}^{m}\) is said to be convex if the following items hold:
(a) \(Tx+ T y \subseteq T (x+y)\) for all \(x, y \in\mathbb{R}^{l}\);
(b) \(T \lambda x=\lambda Tx\) for all \(\lambda>0\) and \(x\in\mathbb{R}^{l}\);
(c) \(0 \in T0\).
(2) Let \(T : \mathbb{R}^{l} \rightrightarrows\mathbb{R}^{m}\) be a convex set-valued mapping. The norm of T is defined by
If \(\| T \|< \infty\), we say that T is normed.
(3) For two convex set-valued mappings T and \(S : \mathbb{R}^{l} \rightrightarrows\mathbb{R}^{m}\), addition and multiplication are defined by
for all \(x\in\mathbb{R}^{l} \) and \(\lambda\in\mathbb{R}\), respectively.
(4) Let \(T : \mathbb{R}^{l} \rightrightarrows\mathbb{R}^{m}\) be a mapping, C be closed convex in \(\mathbb{R}^{m}\) and \(x \in\mathbb{R}^{l}\). We define \(T_{x}\) by
$$T_{x} d := F'(x) d - C \tag{3.9}$$
for all \(d \in\mathbb{R}^{l}\) and its inverse by
$$T_{x}^{-1} y := \bigl\{ d \in\mathbb{R}^{l} : y \in T_{x} d \bigr\}$$
for all \(y \in\mathbb{R}^{m}\).
Note that, if C is a cone, then \(T_{x}\) is convex. For any \(x_{0} \in \mathbb{R}^{l}\), if the Robinson condition (see [8, 9])
$$T_{x_{0}} \bigl(\mathbb{R}^{l}\bigr) = \mathbb{R}^{m} \tag{3.11}$$
is satisfied, then \(D(T_{x})= \mathbb{R}^{l} \) for each \(x\in\mathbb {R}^{l}\) and \(D(T_{x_{0}}^{-1} ) = \mathbb{R}^{m}\).
Remark 3.5
Let \(T : \mathbb{R}^{l} \rightrightarrows\mathbb{R}^{m}\) be a mapping.
(1) T is convex ⟺ the graph \(Gr(T)\) is a convex cone in \(\mathbb{R}^{l} \times\mathbb{R}^{m}\).
(2) T is convex \(\Longrightarrow T^{-1} \) is convex from \(\mathbb {R}^{m}\) to \(\mathbb{R}^{l}\).
Lemma 3.6
(see [8])
Let C be a closed convex cone in \(\mathbb{R}^{m}\). Suppose that \(x_{0} \in\mathbb{R}^{l}\) satisfies the Robinson condition (3.11). Then we have the following assertions:
(1) \(T_{x_{0}}^{-1}\) is normed.
(2) If S is a linear operator from \(\mathbb{R}^{l}\) to \(\mathbb {R}^{m}\) such that \(\| T_{x_{0}}^{-1}\| \| S\| < 1\), then the convex set-valued mapping \(\bar{T} =T_{x_{0}} + S \) carries \(\mathbb{R}^{l}\) onto \(\mathbb{R}^{m}\). Furthermore, \(\bar{T}^{-1}\) is normed and
The following proposition shows that the condition (3.11) implies that \(x_{0}\) is a regular point of (3.1). Using the center \(L_{0}\)-average Lipschitz condition, we also estimate in Proposition 3.7 the quasi-regular bound function. The proof is given in an analogous way to the corresponding result in [7] by simply using \(L_{0}\) instead of L.
Proposition 3.7
Let C be a closed convex cone in \(\mathbb{R}^{m}\), \(x_{0} \in\mathbb {R}^{l}\), and define \(T_{x_{0}}\) as in (3.9). Suppose that \(x_{0}\) satisfies the Robinson condition (3.11). Then we have the following assertions:
(1) \(x_{0}\) is a regular point of (3.1).
(2) Suppose that \(F'\) satisfies the center \(L_{0}\)-average condition (2.8) on \(U(x_{0}, R)\) for some \(R >0\). Let \(\beta_{0} = \| T_{x_{0}}^{-1}\|\) and let \(R_{\beta_{0}}\) be such that
Then the quasi-regular radius \(R_{x_{0}}\), the quasi-regular bound function \(\beta_{x_{0}}\) satisfy \(R_{x_{0}} \geq\min\{R, R_{\beta_{0}} \}\) and
for each \(0 \leq t < \min\{R, R_{\beta_{0}} \}\).
Remark 3.8
If \(L_{0} = L\), Proposition 3.7 reduces to the corresponding one in [7]. Otherwise, it constitutes an improvement (see (2.20)-(2.26)).
4 Semi-local convergence analysis for (GNA)
Assume that the set \(\mathcal {C}\) satisfies (2.3). Let \(x_{0} \in \mathbb {R}^{l}\) be a quasi-regular point of (2.3) with the quasi-regular radius \(R_{x_{0}}\) and the quasi-regular bound function \(\beta_{x_{0}}\) (i.e., see (3.7)). Let \(\xi\in[1, +\infty)\) and let
For all \(R \in(0, R_{x_{0}}]\), we define
Theorem 4.1
Let \(\xi\in[1, +\infty)\) and \(\Delta\in(0, +\infty] \). Let \(x_{0} \in \mathbb {R}^{l}\) be a quasi-regular point of (2.3) with the quasi-regular radius \(R_{x_{0}}\) and the quasi-regular bound function \(\beta_{x_{0}}\). Let \(\eta>0\) and \(\alpha_{0} (R)\) be given in (4.1) and (4.2), respectively. Let \(0 < R < R_{x_{0}}\), \(\alpha \geq\alpha_{0} (R)\) be a positive constant, and let \(b_{\alpha}\), \(r_{\alpha}\) be as defined in (2.18). Let \(\{ s_{\alpha,n} \}\) (\(n \geq0\)) and \(s_{\alpha}^{\star}\) be given by (2.17) and (2.31), respectively. Suppose that \(F'\) satisfies the L-average Lipschitz and the center \(L_{0}\)-average Lipschitz conditions on \(U(x_{0} ,s_{\alpha}^{\star})\). Suppose that
Then the sequence \(\{ x_{n} \}\) generated by (GNA) is well defined, remains in \(\overline{U}(x _{0}, s_{\alpha}^{\star})\) for all \(n \geq0\) and converges to some \(x^{\star}\) such that \(F(x^{\star}) \in \mathcal {C}\). Moreover, the following estimates hold: for each \(n\geq1\),
and
Proof
By (4.3), (4.4), and Lemma 2.4, we have
Using the quasi-regularity property of \(x_{0}\), we have
for all \(x \in U(x_{0} , R)\).
First, we prove that the following assertion holds.
Set \(x_{k}^{\theta}:= \theta x_{k} + (1- \theta) x_{k-1} \) for all \(\theta\in[0,1]\). Using (4.8), we have
for all \(\theta\in[0,1]\). Hence, for \(x=x_{k}\), (4.9) holds, i.e.,
We have also
and
Now, we prove that
We show the first inequality in (4.13). We denote \(A_{k} = \| x _{k-1}- x_{0} \|\) and \(B_{k}=\| x_{k} - x_{k-1} \|\). We have the following identity:
Then, by the L-average condition on \(U(x_{0}, s_{\alpha}^{\star})\), (4.6) for \(n=k-1\) and (4.10)-(4.14), we get
For simplicity, we denote \(\Xi_{\alpha, k}:=s_{\alpha, {k}} - s_{\alpha, {k-1}}\). By (4.4) for \(n=k\) and Lemma 2.6, we have in turn
Thus we deduce that
Using (4.2) and (4.8), we obtain
Note that \(\alpha\geq\alpha_{0} (R)\). By (2.9), we have
By (2.12), (4.17)-(4.19), we deduce that the first inequality in (4.13) holds. The second inequality of (4.13) follows from (4.4). Moreover, by (4.3) and Lemma 2.8, we have
Hence (4.13) implies that \(d(0, \mathcal {D}(x_{k})) \leq\Delta\) and there exists \(d_{0} \in \mathbb {R}^{l}\) with \(\| d_{0} \|\leq\Delta\) such that \(F(x_{k}) + F'(x_{k}) d_{0} \in \mathcal {C}\). By Remark 3.3, we have
and
We deduce that (4.6) holds for \(n=k\) since \(d_{k} = x_{k+1} - x_{k} \in \mathcal {D}(x_{k})\). We also have
Hence (4.7) holds for \(n=k\) and the assertion (\(\mathcal{T} \)) holds. It follows from (4.4) that \(\{ x_{k} \}\) is a Cauchy sequence in a Banach space and as such it converges to some \(x^{\star}\in\overline{U} (x_{0} , s_{\alpha}^{\star}) \) (since \(\overline{U} (x_{0} , s_{\alpha}^{\star})\) is a closed set).
Now, we use mathematical induction to prove that (4.4), (4.5), and (4.6) hold. By (4.1), (4.3), and (4.9), it follows that \(\mathcal {D}(x_{0} ) \neq\emptyset\) and
We also have
and (4.4) holds for \(n=1\). By an induction argument, we get
The induction is completed. This completes the proof. □
Remark 4.2
(1) If \(L=L_{0}\), then Theorem 4.1 reduces to the corresponding result in [7]. Otherwise, in view of (2.29)-(2.31), our results constitute an improvement. The rest of the results in [7] are improved as well, since they are corollaries of Theorem 4.1; we leave the details to the motivated reader.
(2) In view of the proof of our Theorem 4.1, we see that the sequence \(\{ r_{\alpha, n} \}\) given by
for each \(n\geq2\) is also a majorizing sequence for the method (GNA). Following the proof of Lemma 2.8 and under the hypotheses of Theorem 4.1, we get
and
Hence \(\{ r _{\alpha, {n}} \}\) and \(\{ s_{\alpha, {n}} \}\) are tighter majorizing sequences for \(\{ x_{n} \}\) than \(\{ t _{\alpha, {n}} \}\) used by Li and Ng in [7]. The sequences \(\{ r _{\alpha, {n}} \}\) and \(\{ s _{\alpha, {n}} \}\) can converge under hypotheses weaker than the ones given in Theorem 4.1. Such conditions have already been given by us for more general functions ψ and in the more general setting of Banach spaces in [1, 2, 10, 11, 23]. Therefore, here, we only refer to the popular Kantorovich case as an illustration. Choose \(\alpha=1\), \(L(u)=L\), and \(L_{0} (u)= L_{0}\) for all \(u \geq0\). Then the sequence \(\{ t _{\alpha, {n}} \}\) converges under the Newton-Kantorovich hypothesis, famous for its simplicity and clarity (see [1, 24]),
$$2 L \eta\leq1. \tag{4.24}$$
The sequence \(\{ r_{\alpha, {n}} \}\) converges provided that (see, for example, [23])
where
and the sequence \(\{ s _{\alpha, {n}} \}\) converges if (see, for example, [23])
where
It follows from (4.24)-(4.26) that
but not vice versa unless \(L_{0}=L\). Moreover, we get
as \(\frac{L_{0}}{L}\longrightarrow0\).
(3) There are cases when the sufficient convergence conditions developed in the preceding work are not satisfied. Then one can use the modified Gauss-Newton method (MGNM). In this case, the majorizing sequence proposed in [7] is given by
for each \(n\geq0\). This sequence clearly converges under the hypotheses of Theorem 4.1, so that the estimates (4.4)-(4.7) hold with the sequence \(\{ q_{\alpha, {n}} \}\) replacing \(\{ s_{\alpha, {n}} \}\). However, according to the proof of Theorem 4.1, the hypotheses on \(\psi_{\alpha, 0}\) can replace the corresponding ones on \(\psi _{\alpha}\). Moreover, the majorizing sequence is given by
for each \(n\geq0\). Furthermore, we have
for each \(s \geq0\). Hence clearly it follows that, for each \(n\geq0\),
and
(Notice also the advantages of (2.20)-(2.26).)
In the special case when the functions \(L_{0}\) and L are constants and \(\alpha=1\), we find that the conditions on the function \(\psi_{\alpha}\) reduce to (4.24), whereas using \(\psi_{\alpha, 0} \) we obtain
Notice that
as \(\frac{L_{0}}{L}\longrightarrow0\). Therefore, one can use (MGNM) as a predictor until a certain iterate \(x_{N}\) is reached for which the sufficient convergence conditions for (GNM) are satisfied. Then we use \(x_{N}\) as the starting iterate for the method (GNM), which is faster than (MGNM). Such an approach was used by the author in [25].
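A sketch of that predictor-corrector idea is given below, reusing the hypothetical gauss_newton_gna from Section 2; the switching test is a crude step-size proxy for the actual sufficient convergence conditions and is our simplification.

```python
def hybrid_mgnm_gnm(F, Fprime, h, x0, Delta=1.0, switch_tol=1e-2, **kw):
    """(MGNM) predictor with the derivative frozen at x0, then restart the
    full (GNA) iteration from the last predictor iterate x_N."""
    x, J0 = np.asarray(x0, dtype=float), np.asarray(Fprime(x0))
    ball = [{"type": "ineq", "fun": lambda d: Delta - np.linalg.norm(d)}]
    for _ in range(100):
        Fx = np.asarray(F(x))
        d = minimize(lambda d, Fx=Fx: h(Fx + J0 @ d),
                     np.zeros_like(x), constraints=ball).x
        x = x + d
        if np.linalg.norm(d) <= switch_tol:   # treat x as the warm start x_N
            break
    return gauss_newton_gna(F, Fprime, h, x, Delta=Delta, **kw)
```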
5 General majorant conditions
In this section, we provide a semilocal convergence analysis for (GNA) using more general majorant conditions than (2.8) and (2.9).
Definition 5.1
Let \(\mathcal {Y}\) be a Banach space, \(x_{0} \in \mathbb{R}^{l}\) and \(\alpha>0\). Let \(G : \mathbb{R}^{l} \longrightarrow \mathcal {Y}\) and \(f_{\alpha}: [0,r [ \longrightarrow\,]{-}\infty, + \infty[\) be continuously differentiable. Then G is said to satisfy:
(1) the center-majorant condition on \(U (x_{0} , r)\) if
$$\alpha \bigl\| G(x) - G(x_{0}) \bigr\| \leq f'_{\alpha}\bigl(\| x - x_{0} \|\bigr) - f'_{\alpha}(0) \tag{5.1}$$
for all \(x \in U(x_{0} , r)\);
(2) the majorant condition on \(U (x_{0} , r)\) if
$$\alpha \bigl\| G(x) - G(y) \bigr\| \leq f'_{\alpha}\bigl(\| x- y \|+ \| y - x_{0} \|\bigr) - f'_{\alpha}\bigl(\| y - x_{0} \|\bigr) \tag{5.2}$$
for all \(x,y \in U(x_{0} , r)\) with \(\| x- y \|+ \| y - x_{0} \|\leq r\).
Clearly, the conditions (5.1) and (5.2) generalize (2.8) and (2.9), respectively (see [20] and also [1, 2, 10, 11, 23, 25]), for \(G=F'\) and \(\alpha=1\). Define the majorizing sequence \(\{ t_{\alpha, n} \}\) by
$$t_{\alpha, 0} = 0, \qquad t_{\alpha, n+1} = t_{\alpha, n} - \frac{f_{\alpha}(t_{\alpha, n})}{f'_{\alpha}(t_{\alpha, n})}. \tag{5.3}$$
Moreover, as in (4.2) and for \(R>0\), define (implicitly):
Next, we provide sufficient conditions for the convergence of the sequence \(\{ t_{\alpha, n} \}\) corresponding to the ones given in Lemma 2.4.
Lemma 5.2
(see, for example, [2, 10, 20])
Let \(r>0\), \(\alpha>0\), and \(f_{\alpha}: [0,r) \longrightarrow (-\infty, + \infty)\) be continuously differentiable. Suppose that:
(1) \(f_{\alpha}(0) >0\), \(f'_{\alpha}(0) = -1\);
(2) \(f'_{\alpha}\) is convex and strictly increasing;
(3) the equation \(f_{\alpha}(t)=0\) has positive zeros. Denote by \(r _{\alpha}^{\star} \) the smallest zero. Define \(r _{\alpha}^{\star\star} \) by
Then the sequence \(\{ t_{\alpha, n} \}\) is strictly increasing and converges to \(r _{\alpha}^{\star} \). Moreover, the following estimates hold:
for each \(n\geq1\), where \(D^{-} f'_{\alpha}\) denotes the left directional derivative of \(f'_{\alpha}\) (see, for example, [1, 2, 10, 20]).
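A standard function satisfying the hypotheses of Lemma 5.2 is the Smale-Wang majorant \(f_{\alpha}(t) = \eta - t + \alpha\gamma t^{2}/(1- \gamma t)\) on \([0, 1/\gamma)\), which corresponds to \(L(u) = 2 \gamma/(1- \gamma u)^{3}\); the sketch below (our own, with α = γ = 1 and η = 0.1 chosen so that positive zeros exist) generates the Newton iterates (5.3).

```python
def majorant_newton(f, fprime, n_steps):
    """Iterates (5.3): t_0 = 0 and t_{n+1} = t_n - f(t_n)/f'(t_n)."""
    t, seq = 0.0, [0.0]
    for _ in range(n_steps):
        t -= f(t) / fprime(t)
        seq.append(t)
    return seq

eta, gamma = 0.1, 1.0
f = lambda t: eta - t + gamma * t**2 / (1.0 - gamma * t)
fp = lambda t: -1.0 + gamma * t * (2.0 - gamma * t) / (1.0 - gamma * t)**2
print(majorant_newton(f, fp, 8))   # increases to the smallest zero r* ~ 0.1149
```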
Now, we show the following semilocal convergence result for the method (GNA) using the generalized majorant conditions (5.1) and (5.2).
Theorem 5.3
Let \(\xi\in[1, +\infty)\) and \(\Delta\in(0, +\infty] \). Let \(x_{0} \in \mathbb {R}^{l}\) be a quasi-regular point of (2.3) with the quasi-regular radius \(R_{x_{0}}\) and the quasi-regular bound function \(\beta_{x_{0}}\). Let \(\eta>0\) and \(\alpha_{0} (R)\) be given by (4.1) and (5.4), respectively. Let \(0 < R < R_{x_{0}}\), let \(\alpha\geq\alpha _{0} (R)\) be a positive constant, and let \(r_{\alpha}^{\star}\), \(r_{\alpha}^{\star\star}\) be as defined in Lemma 5.2. Suppose that \(F'\) satisfies the majorant condition on \(U(x_{0} ,r_{\alpha}^{\star})\) and that the conditions
hold. Then the sequence \(\{ x_{n} \}\) generated by (GNA) is well defined, remains in \(\overline{U}(x _{0}, r _{\alpha}^{\star})\) for all \(n \geq0\) and converges to some \(x^{\star}\) such that \(F(x^{\star}) \in \mathcal {C}\). Moreover, the following estimates hold: for each \(n\geq1\),
and
where the sequence \(\{ t_{\alpha, n } \}\) is given by (5.3).
Proof
We use the same notations as in Theorem 4.1. We follow the proof of Theorem 4.1 until (4.13). Then, using (4.10), (5.2) (for \(G=F'\)), (5.3), (5.4), and the hypothesis \(\alpha\geq\alpha_{0} (R)\), we get in turn
where \(t_{\alpha, k }^{\theta}=\theta t_{\alpha, k } + (1-\theta) t_{\alpha, k-1 } \) for all \(\theta\in[0,1]\). The rest follows as in the proof of Theorem 4.1. This completes the proof. □
Remark 5.4
In view of the condition (5.2), there exists \(f_{\alpha,0} : [0,r) \longrightarrow(-\infty, + \infty)\) continuously differentiable such that
for all \(x \in U(x_{0} , r)\) and \(r \leq R\). Moreover,
for all \(t \in[0, r]\) holds in general and \(\frac{f'_{\alpha}}{f'_{\alpha, 0}}\) can be arbitrarily large (see, for example, [1, 2, 10, 11, 23, 25]). These observations motivate us to introduce the tighter majorizing sequences \(\{ r_{\alpha, n } \}\), \(\{ s_{\alpha, n } \}\) by
for each \(n\geq2\) and
for each \(n\geq0\).
6 Conclusion
Using a combination of average and center-average type conditions, we presented a semilocal convergence analysis for the method (GNA) to approximate a solution or a fixed point of a convex composite optimization problem in the setting of finite dimensional spaces. Our analysis extends the applicability of the method (GNA) under the same computational cost as in earlier studies, such as [4, 5, 7, 12, 13, 26–35].
References
Argyros, IK: Convergence and Applications of Newton-Type Iterations. Springer, New York (2009)
Argyros, IK, Hilout, S: Computational Methods in Nonlinear Analysis: Efficient Algorithms, Fixed Point Theory and Applications. World Scientific, Singapore (2013)
Burke, JV, Ferris, MC: A Gauss-Newton method for convex composite optimization. Math. Program., Ser. A 71, 179-194 (1995)
Moldovan, A, Pellegrini, L: On regularity for constrained extremum problems, I. Sufficient optimality conditions. J. Optim. Theory Appl. 142, 147-163 (2009)
Moldovan, A, Pellegrini, L: On regularity for constrained extremum problems, II. Necessary optimality conditions. J. Optim. Theory Appl. 142, 165-183 (2009)
Rockafellar, RT: Convex Analysis. Princeton Mathematical Series, vol. 28. Princeton University Press, Princeton (1970)
Li, C, Ng, KF: Majorizing functions and convergence of the Gauss-Newton method for convex composite optimization. SIAM J. Optim. 18, 613-642 (2007)
Robinson, SM: Extension of Newton’s method to nonlinear functions with values in a cone. Numer. Math. 19, 341-347 (1972)
Robinson, SM: Stability theory for systems of inequalities, I. Linear systems. SIAM J. Numer. Anal. 12, 754-769 (1975)
Argyros, IK, Cho, YJ, Hilout, S: Numerical Methods for Equations and Its Applications. CRC Press, New York (2012)
Argyros, IK, Hilout, S: Extending the applicability of the Gauss-Newton method under average Lipschitz-type conditions. Numer. Algorithms 58, 23-52 (2011)
Wang, XH: Convergence of Newton’s method and inverse function theorem in Banach space. Math. Comput. 68, 169-186 (1999)
Wang, XH: Convergence of Newton’s method and uniqueness of the solution of equations in Banach space. IMA J. Numer. Anal. 20, 123-134 (2000)
Xu, XB, Li, C: Convergence of Newton’s method for systems of equations with constant rank derivatives. J. Comput. Math. 25, 705-718 (2007)
Xu, XB, Li, C: Convergence criterion of Newton’s method for singular systems with constant rank derivatives. J. Math. Anal. Appl. 345, 689-701 (2008)
Zabrejko, PP, Nguen, DF: The majorant method in the theory of Newton-Kantorovich approximations and the Pták error estimates. Numer. Funct. Anal. Optim. 9, 671-684 (1987)
Häußler, WM: A Kantorovich-type convergence analysis for the Gauss-Newton method. Numer. Math. 48, 119-125 (1986)
Hiriart-Urruty, JB, Lemaréchal, C: Convex Analysis and Minimization Algorithms I: Fundamentals, vol. 305. Springer, Berlin (1993)
Hiriart-Urruty, JB, Lemaréchal, C: Convex Analysis and Minimization Algorithms II: Advanced Theory and Bundle Methods, vol. 306. Springer, Berlin (1993)
Ferreira, OP, Svaiter, BF: Kantorovich’s majorants principle for Newton’s method. Comput. Optim. Appl. 42, 213-229 (2009)
Li, C, Wang, XH: On convergence of the Gauss-Newton method for convex composite optimization. Math. Program., Ser. A 91, 349-356 (2002)
Ng, KF, Zheng, XY: Characterizations of error bounds for convex multifunctions on Banach spaces. Math. Oper. Res. 29, 45-63 (2004)
Argyros, IK, Hilout, S: Weaker conditions for the convergence of Newton’s method. J. Complex. 28, 364-387 (2012)
Kantorovich, LV, Akilov, GP: Functional Analysis. Pergamon Press, Oxford (1982)
Argyros, IK: Approximating solutions of equations using Newton’s method with a modified Newton’s method iterate as a starting point. Rev. Anal. Numér. Théor. Approx. 36, 123-138 (2007)
Argyros, IK, Cho, YJ, George, S: On the ‘Terra incognita’ for the Newton-Kantorovich method. J. Korean Math. Soc. 51, 251-266 (2014)
Argyros, IK, Cho, YJ, Khattri, SK: On a new semilocal convergence analysis for the Jarratt method. J. Inequal. Appl. 2013, 194 (2013)
Argyros, IK, Cho, YJ, Ren, HM: Convergence of Halley’s method for operators with the bounded second derivative in Banach spaces. J. Inequal. Appl. 2013, 260 (2013)
Argyros, IK, Hilout, S: Local convergence analysis for a certain class of inexact methods. J. Nonlinear Sci. Appl. 1, 244-253 (2008)
Argyros, IK, Hilout, S: Local convergence analysis of inexact Newton-like methods. J. Nonlinear Sci. Appl. 2, 11-18 (2009)
Argyros, IK, Hilout, S: Multipoint iterative processes of efficiency index higher than Newton’s method. J. Nonlinear Sci. Appl. 2, 195-203 (2009)
Sahu, DR, Cho, YJ, Agarwal, RP, Argyros, IK: Accessibility of solutions of operator equations by Newton-like method. J. Complex. 31, 637-657 (2015)
Li, C, Hu, N, Wang, J: Convergence behavior of Gauss-Newton’s method and extensions to the Smale point estimate theory. J. Complex. 26, 268-295 (2010)
Li, C, Zhang, WH, Jin, XQ: Convergence and uniqueness properties of Gauss-Newton’s method. Comput. Math. Appl. 47, 1057-1067 (2004)
Chen, X, Yamamoto, T: Convergence domains of certain iterative methods for solving nonlinear equations. Numer. Funct. Anal. Optim. 10, 37-48 (1989)
Acknowledgements
The second author was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT and Future Planning (2014R1A2A2A01002100).
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
All authors contributed equally to the writing of this paper. All authors read and approved the final manuscript.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Argyros, I.K., Cho, Y.J. & Hilout, S. On iterative computation of fixed points and optimization. Fixed Point Theory Appl 2015, 128 (2015). https://doi.org/10.1186/s13663-015-0372-8