Appendix A
A.1 Proof of Lemma 2
(1) Since \(x^{k+1}\) is the optimal solution of the proximal operation (1.3) with \(a = x^{k} - \gamma \nabla f(x^{k})\), we have
$$ \begin{array}{@{}rcl@{}} && g(x^{k+1}) + \frac{1}{2\gamma}\left\| x^{k+1} - \left( x^{k} - \gamma \nabla f(x^{k}) \right) \right\|^{2} \le g(x^{k}) + \frac{1}{2\gamma}\left\| \gamma \nabla f(x^{k}) \right\|^{2}, \end{array} $$
which can be reformulated as
$$ g(x^{k+1}) + \frac{1}{2\gamma}\left\| x^{k+1} - x^{k} \right\|^{2} + \left\langle \nabla f(x^{k}), x^{k+1} - x^{k} \right\rangle - g(x^{k}) \le 0. $$
(A.1)
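For completeness, the reformulation leading to (A.1) can be sketched as follows: expand the squared norm on the left-hand side, divide by \(2\gamma\), and cancel the term \(\frac{\gamma}{2}\left\| \nabla f(x^{k}) \right\|^{2}\), which appears on both sides.
$$ \frac{1}{2\gamma}\left\| x^{k+1} - x^{k} + \gamma \nabla f(x^{k}) \right\|^{2} = \frac{1}{2\gamma}\left\| x^{k+1} - x^{k} \right\|^{2} + \left\langle \nabla f(x^{k}), x^{k+1} - x^{k} \right\rangle + \frac{\gamma}{2}\left\| \nabla f(x^{k}) \right\|^{2}, \qquad \frac{1}{2\gamma}\left\| \gamma \nabla f(x^{k}) \right\|^{2} = \frac{\gamma}{2}\left\| \nabla f(x^{k}) \right\|^{2}. $$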
Furthermore, since ∇f(x) is globally Lipschitz continuous with the Lipschitz constant L, we have
$$ f(x^{k+1})\le f(x^{k}) + \left\langle \nabla f(x^{k}), x^{k+1} - x^{k} \right\rangle + \frac{L}{2}\left\| x^{k+1} - x^{k} \right\|^{2}. $$
Adding the above inequality to (A.1) we obtain
$$ F(x^{k+1}) - F(x^{k}) \le \left( \frac{L}{2} - \frac{1}{2\gamma} \right) \left\| x^{k+1} - x^{k} \right\|^{2}. $$
As a result, if \(\gamma < \frac {1}{L}\), we have (3.1) with \(\kappa _{1} := \frac {1}{2\gamma } - \frac {L}{2} > 0\).
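As a small illustration, take the hypothetical step size \(\gamma = \frac{1}{2L}\), which is chosen here only as an example and is not prescribed by the text. It gives
$$ \kappa_{1} = \frac{1}{2\gamma} - \frac{L}{2} = L - \frac{L}{2} = \frac{L}{2} > 0, $$
so each iteration decreases \(F\) by at least \(\frac{L}{2}\left\| x^{k+1} - x^{k} \right\|^{2}\). As \(\gamma\) approaches \(\frac{1}{L}\) from below, \(\kappa_{1}\) tends to zero and the guaranteed decrease degenerates.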
(2) By the optimality of \(x^{k+1}\), we have that for any \(x\),
$$ g(x^{k+1}) + \frac{1}{2\gamma}\left\| x^{k+1} - x^{k} + \gamma \nabla f(x^{k}) \right\|^{2} \le g(x) + \frac{1}{2\gamma}\left\| x - x^{k} + \gamma \nabla f(x^{k}) \right\|^{2}, $$
which can be reformulated as
$$ g(x^{k+1}) - g(x) \le \frac{1}{2\gamma}\left\| x - x^{k} \right\|^{2} - \frac{1}{2\gamma}\left\| x^{k+1} - x^{k} \right\|^{2} + \left\langle \nabla f(x^{k}), x - x^{k+1} \right\rangle. $$
By the Lipschitz continuity of ∇f(x),
$$ f(x) \ge f(x^{k+1}) + \left\langle \nabla f(x^{k+1}), x - x^{k+1} \right\rangle - \frac{L}{2}\left\| x - x^{k+1} \right\|^{2}. $$
By the above two inequalities we obtain
$$ \begin{array}{@{}rcl@{}} F(x^{k+1}) - F(x)&\le&\frac{1}{2\gamma}\left\| x - x^{k} \right\|^{2} - \frac{1}{2\gamma}\left\| x^{k+1} - x^{k} \right\|^{2} + \left\langle \nabla f(x^{k}), x - x^{k+1} \right\rangle \\ && - \left\langle \nabla f(x^{k+1}), x - x^{k+1} \right\rangle + \frac{L}{2} \left\| x - x^{k+1} \right\|^{2}\\ &\le&\frac{1}{\gamma}\left\| x - x^{k+1}\right\|^{2} + \frac{1}{\gamma} \left\| x^{k+1} - x^{k} \right\|^{2} - \frac{1}{2\gamma}\left\| x^{k+1} - x^{k} \right\|^{2} \\ && +\left\langle \nabla f(x^{k}) - \nabla f(x^{k+1}), x - x^{k+1} \right\rangle + \frac{L}{2} \left\| x - x^{k+1} \right\|^{2}\\ &\le&\frac{1}{\gamma}\left\| x - x^{k+1}\right\|^{2}+ \frac{1}{\gamma} \left\| x^{k+1} - x^{k} \right\|^{2}-\frac{1}{2\gamma} \left\| x^{k+1} - x^{k} \right\|^{2} \\ && + \frac{L}{2} \left\| x^{k+1} - x^{k} \right\|^{2} + \frac{1}{2} \left\| x - x^{k+1} \right\|^{2} + \frac{L}{2} \left\| x - x^{k+1} \right\|^{2}\\ &=& \left( \frac{1}{\gamma} + \frac{L+1}{2} \right) \left\| x - x^{k+1} \right\|^{2} + \left( \frac{L}{2} + \frac{1}{2\gamma} \right) \left\| x^{k} - x^{k+1} \right\|^{2}, \end{array} $$
from which we can obtain (3.2) with \(\kappa _{2} := \max \limits \left \{ \left (\frac {1}{\gamma } + \frac {L+1}{2} \right ), \left (\frac {L}{2} + \frac {1}{2\gamma } \right ) \right \}\).
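The second inequality in the chain above uses an elementary estimate, spelled out here for the reader: splitting \(x - x^{k}\) through \(x^{k+1}\) and applying \(\|u+v\|^{2} \le 2\|u\|^{2} + 2\|v\|^{2}\) gives
$$ \frac{1}{2\gamma}\left\| x - x^{k} \right\|^{2} = \frac{1}{2\gamma}\left\| (x - x^{k+1}) + (x^{k+1} - x^{k}) \right\|^{2} \le \frac{1}{\gamma}\left\| x - x^{k+1} \right\|^{2} + \frac{1}{\gamma}\left\| x^{k+1} - x^{k} \right\|^{2}. $$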
A.2 Proof of Theorem 5
In the proof, we write \(\zeta :=F(\bar x)\) for brevity. Recall that the proper separation of stationary value condition holds at \(\bar {x} \in {\mathcal {X}}^{\pi }\), i.e., there exists \(\delta > 0\) such that
$$ x \in {\mathcal{X}}^{\pi}\cap {\mathbb{B}} (\bar{x},\delta )\quad \Longrightarrow \quad F(x) = F(\bar{x}). $$
(A.2)
Without loss of generality, we assume that \(\epsilon < \delta /(\kappa + 1)\) throughout the proof.
Step 1. We prove that \(\bar x\) is a stationary point and
$$ \lim_{k\rightarrow \infty} \|x^{k+1} -x^{k}\|=0. $$
(A.3)
Summing the inequalities (3.1) from \(k = 0\) to an arbitrary positive integer \(K\), we obtain
$$ \sum\limits_{k=0}^{K} \left\| x^{k+1} - x^{k} \right\|^{2} \le \frac{1}{\kappa_{1}} \left( F(x^{0}) - F(x^{K+1}) \right)\le \frac{1}{\kappa_{1}} \left( F(x^{0}) - F_{\min} \right) < \infty. $$
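The first inequality here is a telescoping sum. As a brief sketch, assuming (3.1) has the form \(F(x^{k+1}) - F(x^{k}) \le -\kappa_{1}\left\| x^{k+1} - x^{k} \right\|^{2}\) recalled later in this appendix and that \(F_{\min}\) denotes a finite lower bound of \(F\):
$$ \kappa_{1}\sum\limits_{k=0}^{K} \left\| x^{k+1} - x^{k} \right\|^{2} \le \sum\limits_{k=0}^{K} \left( F(x^{k}) - F(x^{k+1}) \right) = F(x^{0}) - F(x^{K+1}). $$
The right-hand bound does not depend on \(K\), so the partial sums are uniformly bounded.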
It follows that \({\sum }_{k=0}^{\infty } \left \| x^{k+1} - x^{k} \right \|^{2} < \infty ,\) and consequently (A.3) holds. Let \(\{ x^{k_{i}} \}_{i=1}^{\infty }\) be a convergent subsequence of \(\left \{ x^{k} \right \}\) such that \(x^{k_{i}}\rightarrow \bar {x}\) as \(i\rightarrow \infty \). Then by (A.3), we have
$$ \lim_{i\rightarrow \infty} x^{k_{i}}=\lim_{i\rightarrow \infty} x^{k_{i}-1}=\bar x. $$
(A.4)
Since
$$ x^{k_{i}} \in\text{Prox}_{g}^{\gamma} \left (x^{k_{i}-1} -\gamma \nabla f(x^{k_{i}-1}) \right ), $$
(A.5)
letting \(i\rightarrow \infty \) in (A.5) and using the outer semicontinuity of \(\text {Prox}_{g}^{\gamma } (\cdot )\) (see [50, Theorem 1.25]) together with the continuity of \(\nabla f\), we obtain
$$ \bar{x}\in \text{Prox}_{g}^{\gamma} \left( \bar{x} -\gamma \nabla f(\bar{x}) \right ). $$
By the definition of the proximal operator and the corresponding optimality condition, we have
$$ 0\in \nabla f \left( \bar{x}\right) + \partial^{\pi} g \left( \bar{x}\right) , $$
and so \(\bar x\in \mathcal {X}^{\pi }\).
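This last step can be sketched as follows, under the assumptions that \(\partial^{\pi}\) denotes the proximal subdifferential and that the proximal operation (1.3) minimizes \(g(x) + \frac{1}{2\gamma}\|x - a\|^{2}\) over \(x\). Since \(\bar{x}\) minimizes \(x \mapsto g(x) + \frac{1}{2\gamma}\left\| x - \left( \bar{x} - \gamma \nabla f(\bar{x}) \right) \right\|^{2}\) and the quadratic term is smooth, the first-order condition at \(\bar{x}\) reads
$$ 0 \in \partial^{\pi} g(\bar{x}) + \frac{1}{\gamma}\left( x - \bar{x} + \gamma \nabla f(\bar{x}) \right)\Big|_{x = \bar{x}} = \partial^{\pi} g(\bar{x}) + \nabla f(\bar{x}). $$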
Step 2. Given \(\hat {\epsilon } > 0\) such that \(\hat {\epsilon } < \delta /\epsilon - \kappa -1\), for each k > 0, we can find \({{\bar {x}^{k} \in \mathcal {X}^{\pi }}}\) such that
$$\left\| \bar{x}^{k} - x^{k} \right\| \le \min\left\{\sqrt{d\left( x^{k},{{{\mathcal{X}}^{\pi}}} \right)^{2} + \hat{\epsilon}\|x^{k} - x^{k-1}\|^{2}}, d\left( x^{k}, {{{\mathcal{X}}^{\pi}}} \right) + \hat{\epsilon}\|x^{k} - x^{k-1}\|\right\}.$$
It follows from the cost-to-estimate condition (3.2) that
$$ F(x^{k}) - F(\bar{x}^{k})\le \hat{ \kappa}_{2} \left( \text{dist} \left( x^{k}, {\mathcal{X}}^{\pi} \right)^{2} + \left\| x^{k} - x^{k-1} \right\|^{2} \right), $$
(A.6)
with \(\hat { \kappa }_{2} = \kappa _{2}(1+\hat {\epsilon })\). We now prove by induction that there exists \(k_{\ell} > 0\) such that for all \(j \ge k_{\ell}\),
$$ \begin{array}{@{}rcl@{}} && x^{j}\in \mathbb{B} \left( \bar{x}, \epsilon \right),\quad x^{j+1}\in \mathbb{B} \left( \bar{x}, \epsilon \right),\quad F(\bar{x}^{j}) = \zeta ,\quad {{F(\bar{x}^{j+1}) = \zeta}}, \end{array} $$
(A.7)
$$ \begin{array}{@{}rcl@{}} && F(x^{j+1}) - \zeta \le \hat{ \kappa}_{2} \left( \text{dist} \left( x^{j+1}, {\mathcal{X}}^{\pi} \right)^{2} + \left\| x^{j+1} - x^{j} \right\|^{2} \right), \end{array} $$
(A.8)
$$ \sum\limits_{i=k_{\ell}}^{j} \left\| x^{i} - x^{i+1} \right\| \le \frac{\left\| x^{k_{\ell}-1} - x^{k_{\ell}} \right\| - \left\| x^{j} - x^{j+1} \right\|}{2} + c\left[\sqrt{F(x^{k_{\ell}}) - \zeta} - \sqrt{F(x^{j+1}) - \zeta}\right], $$
(A.9)
where the constant \(c:=\frac {2\sqrt {\hat { \kappa }_{2}(\kappa ^{2}+1)}}{\kappa _{1}}> 0\).
By (A.4) and the fact that \(F\) is continuous on its domain, there exists \(k_{\ell} > 0\) such that \(x^{k_{\ell }}\in \mathbb {B} \left (\bar {x},\epsilon \right )\), \(x^{k_{\ell }+1}\in \mathbb {B} \left (\bar {x},\epsilon \right )\),
$$ \begin{array}{@{}rcl@{}} &&{\left\| x^{k_{\ell}} - \bar{x} \right\| + \frac{\left\| x^{k_{\ell}-1} - x^{k_{\ell}} \right\| }{2} + c\left[\sqrt{F(x^{k_{\ell}}) - \zeta}\right]\le \frac{\epsilon}{2}}, \end{array} $$
(A.10)
$$ \begin{array}{@{}rcl@{}} &&{\left\| x^{k+1} - x^{k} \right\| < \frac{\epsilon}{2} , \quad \forall k \ge k_{\ell} - 1}, \end{array} $$
(A.11)
and consequently, by the triangle inequality, the choice of \(\bar{x}^{k_{\ell}}\) and (3.4),
$$ \begin{aligned} \left\|\bar{x}^{k_{\ell}} - \bar{x}\right\| &\le \left\|\bar{x}^{k_{\ell}} - x^{k_{\ell}}\right\| + \left\|x^{k_{\ell}} - \bar{x}\right\| \\ &\overset{{(3.4)}}{\le} (\kappa+\hat{\epsilon}) \left\|x^{k_{\ell}} - x^{k_{\ell} - 1}\right\| + \left\|x^{k_{\ell}} - \bar{x}\right\| < {(\kappa +\hat{\epsilon} + 2)\epsilon/2} < \delta, \end{aligned} $$
which indicates \(\bar {x}^{k_{\ell }}\in {{{\mathcal {X}}^{\pi }}} \cap {\mathbb {B}}\left (\bar {x},\delta \right )\). It follows by the proper separation of the stationary value condition (A.2) that \(F\left (\bar {x}^{k_{\ell }} \right ) = \zeta \).
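The final bound \((\kappa + \hat{\epsilon} + 2)\epsilon/2 < \delta\) used above can be checked directly from the choice of \(\hat{\epsilon}\) in Step 2. The short verification below assumes \(\kappa \ge 0\) and \(\epsilon > 0\):
$$ \hat{\epsilon} < \frac{\delta}{\epsilon} - \kappa - 1 \quad \Longleftrightarrow \quad (\kappa + \hat{\epsilon} + 1)\epsilon < \delta, \qquad\text{and}\qquad \frac{\kappa + \hat{\epsilon} + 2}{2} \le \kappa + \hat{\epsilon} + 1 \quad \Longleftrightarrow \quad \kappa + \hat{\epsilon} \ge 0, $$
so that \((\kappa + \hat{\epsilon} + 2)\epsilon/2 \le (\kappa + \hat{\epsilon} + 1)\epsilon < \delta\).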
Before carrying out the induction for (A.7)–(A.9), we first show that for \(j \ge k_{\ell}\), if (A.7) and (A.8) hold, then
$$ 2\left\| x^{j} - x^{j+1} \right\| \le c\left[\sqrt{F(x^{j}) - \zeta} - \sqrt{F(x^{j+1}) - \zeta}\right] + \frac{\left\| x^{j} - x^{j+1} \right\| + \left\| x^{j-1} - x^{j} \right\|}{2}. $$
(A.12)
Firstly, since \(x^{j}\in \mathbb {B} \left (\bar {x}, \epsilon \right )\), \(F(\bar {x}^{j}) = \zeta \) and (A.6) holds, it follows from (3.4) that
$$ F(x^{j})-\zeta\leq \hat{ \kappa}_{2}(\kappa^{2} \|x^{j}-x^{j-1}\|^{2}+ \|x^{j}-x^{j-1}\|^{2}) ={\kappa_{3}^{2}}\|x^{j}-x^{j-1}\|^{2}, $$
(A.13)
where \(\kappa _{3} := \sqrt {\hat { \kappa }_{2} \left (\kappa ^{2} + 1 \right )}\). Similarly, since \(x^{j+1}\in \mathbb {B} \left (\bar {x}, \epsilon \right )\) and \(F(\bar {x}^{j+1}) = \zeta \), by (A.6) and condition (3.4), we have
$$ \begin{array}{@{}rcl@{}} F(x^{j+1}) - \zeta &\le&{\kappa_{3}^{2}} \left\| x^{j+1} - x^{j} \right\|^{2}. \end{array} $$
(A.14)
As a result, we can obtain
$$ \begin{array}{@{}rcl@{}} \sqrt{F(x^{j}) - \zeta} - \sqrt{F(x^{j+1}) - \zeta}&=&\frac{\left( F(x^{j}) - \zeta\right) - \left( F(x^{j+1}) - \zeta\right)}{\sqrt{F(x^{j}) - \zeta} + \sqrt{F(x^{j+1}) - \zeta}}\\ &=& \frac{F(x^{j}) -F(x^{j+1})}{\sqrt{F(x^{j}) - \zeta} + \sqrt{F(x^{j+1}) - \zeta}}\\ &\overset{\text{(3.1)(A.13)(A.14)}}{\ge} & \frac{\kappa_{1} \left\| x^{j+1} - x^{j} \right\|^{2}}{\kappa_{3}\left( \left\| x^{j} - x^{j-1} \right\| + \left\| x^{j+1} - x^{j} \right\| \right)}. \end{array} $$
After defining \(c: = \frac {2\kappa _{3}}{\kappa _{1}}\), we have
$$ \left( c \left[\sqrt{F(x^{j}) - \zeta} - \sqrt{F(x^{j+1}) - \zeta}\right] \right)\left( \frac{\left\| x^{j} - x^{j+1} \right\| + \left\| x^{j-1} - x^{j} \right\|}{2} \right)\ge \left\| x^{j+1} - x^{j} \right\|^{2}, $$
from which by applying \(ab\le \left (\frac {a+b}{2} \right )^{2}\) we establish (A.12).
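To make this last step explicit, set \(a := c\left[\sqrt{F(x^{j}) - \zeta} - \sqrt{F(x^{j+1}) - \zeta}\right]\) and \(b := \frac{1}{2}\left( \left\| x^{j} - x^{j+1} \right\| + \left\| x^{j-1} - x^{j} \right\| \right)\), where \(a\) and \(b\) are shorthand introduced only for this remark. Both are nonnegative, since \(F(x^{j}) \ge F(x^{j+1})\) by (3.1). Then
$$ \left\| x^{j+1} - x^{j} \right\|^{2} \le ab \le \left( \frac{a+b}{2} \right)^{2} \quad \Longrightarrow \quad 2\left\| x^{j+1} - x^{j} \right\| \le a + b, $$
which is exactly (A.12).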
Next we proceed to prove the three properties (A.7)–(A.9) by induction on \(j\). For \(j = k_{\ell}\), we have
$$ x^{k_{\ell}}\in \mathbb{B} \left( \bar{x}, \epsilon \right),\quad x^{k_{\ell}+1}\in \mathbb{B} \left( \bar{x}, \epsilon \right),\quad F(\bar{x}^{k_{\ell}}) = \zeta, $$
and similar to the estimate of \(\left \|\bar {x}^{k_{\ell }} - \bar {x}\right \|\), we can show
$$ \left\| \bar{x}^{k_{\ell}+1} - \bar{x} \right\| \le \delta. $$
It follows by (A.2) that \(F(\bar {x}^{k_{\ell }+1}) = \zeta \), and hence by (A.6),
$$ F(x^{k_{\ell}+1}) - \zeta \le \hat{ \kappa}_{2} \left( \text{dist} \left( x^{k_{\ell}+1}, {{{\mathcal{X}}^{\pi}}} \right)^{2} + \left\| x^{k_{\ell}+1} - x^{k_{\ell}} \right\|^{2} \right), $$
which is (A.8) with \(j = k_{\ell}\). Note that property (A.9) for \(j = k_{\ell}\) can be obtained directly through (A.12).
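For the reader's convenience, the base case of (A.9) can be written out: taking \(j = k_{\ell}\) in (A.12) and subtracting \(\left\| x^{k_{\ell}} - x^{k_{\ell}+1} \right\|\) from both sides yields
$$ \left\| x^{k_{\ell}} - x^{k_{\ell}+1} \right\| \le \frac{\left\| x^{k_{\ell}-1} - x^{k_{\ell}} \right\| - \left\| x^{k_{\ell}} - x^{k_{\ell}+1} \right\|}{2} + c\left[\sqrt{F(x^{k_{\ell}}) - \zeta} - \sqrt{F(x^{k_{\ell}+1}) - \zeta}\right], $$
whose left-hand side is the single-term sum in (A.9).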
Now suppose that (A.7), (A.8) and (A.9) hold for some \(j \ge k_{\ell}\). We show that they also hold for \(j + 1\). We have
$$ \begin{array}{@{}rcl@{}} \left\| x^{j+2} - \bar{x} \right\|&\le& \left\| x^{k_{\ell}} - \bar{x} \right\| + {\sum}_{i={k_{\ell}}}^{j} \left\| x^{i} - x^{i+1} \right\| + \left\| x^{j+1} - x^{j+2} \right\| \\ &<& \left\| x^{k_{\ell}} - \bar{x} \right\| + \frac{\left\| x^{k_{\ell}-1} - x^{k_{\ell}} \right\| - \left\| x^{j} - x^{j+1} \right\|}{2}\\ && + c\left[\sqrt{F(x^{k_{\ell}}) - \zeta} - \sqrt{F(x^{j+1}) - \zeta}\right] + \frac{\epsilon}{2} \\ &\le& \left\| x^{k_{\ell}} - \bar{x} \right\| + \frac{\left\| x^{k_{\ell}-1} - x^{k_{\ell}} \right\| }{2} + c\left[\sqrt{F(x^{k_{\ell}}) - \zeta}\right] + \frac{\epsilon}{2} \le \epsilon, \end{array} $$
where the second inequality follows from (A.9) and (A.11) and the last inequality follows from (A.10). Since \(x^{j+2}\in \mathbb {B}(\bar x, \epsilon )\), by the definition of \(\bar {x}^{j+2}\) and (3.4), we have
$$\begin{aligned} \left\| \bar{x}^{j+2} - \bar{x} \right\| &\le \| \bar{x}^{j+2} - x^{j+2} \| + \| x^{j+2} - \bar{x} \| \\ &\le (\kappa+\hat{\epsilon})\| x^{j+2} - x^{j+1} \| + \epsilon \\ & < (\kappa +\hat{\epsilon} +2)\epsilon/2 < \delta, \end{aligned}$$
where the third inequality follows from (A.11). It follows from the proper separation of stationary value assumption (A.2) that \(F(\bar {x}^{j+2}) = \zeta \). Consequently by (A.6), we have
$$ F(x^{j+2}) - \zeta \le \hat{ \kappa}_{2} \left( \text{dist} \left( x^{j+2}, {{{\mathcal{X}}^{\pi}}} \right)^{2} + \left\| x^{j+2} - x^{j+1} \right\|^{2} \right). $$
So far we have shown that (A.7)–(A.8) hold for \(j + 1\). Moreover,
$$ \begin{array}{@{}rcl@{}} && \sum\limits_{i=k_{\ell}}^{j+1} \left\| x^{i} - x^{i+1} \right\| \\ &\overset{\text{(A.9)}}{\le}& \frac{\left\| x^{k_{\ell}-1} - x^{k_{\ell}} \right\| - \left\| x^{j} - x^{j+1} \right\|}{2} + c\left[\sqrt{F(x^{k_{\ell}}) - \zeta} - \sqrt{F(x^{j+1}) - \zeta}\right] \\ && + \left\| x^{j+1} - x^{j+2} \right\|\\ &\overset{\text{(A.12)}}{\le}& \frac{\left\| x^{k_{\ell}-1} - x^{k_{\ell}} \right\| - \left\| x^{j} - x^{j+1} \right\|}{2} + c\left[\sqrt{F(x^{k_{\ell}}) - \zeta} - \sqrt{F(x^{j+1}) - \zeta}\right] \\ && + c\left[\sqrt{F(x^{j+1}) - \zeta} - \sqrt{F(x^{j+2}) - \zeta}\right] + \frac{\left\| x^{j+1} - x^{j+2} \right\| + \left\| x^{j} - x^{j+1} \right\|}{2}\\ && - \left\| x^{j+1} - x^{j+2} \right\|\\ &=& \frac{\left\| x^{k_{\ell}-1} - x^{k_{\ell}} \right\| - \left\| x^{j+1} - x^{j+2} \right\|}{2} + c\left[\sqrt{F(x^{k_{\ell}}) - \zeta} - \sqrt{F(x^{j+2}) - \zeta}\right], \end{array} $$
from which we obtain (A.9) for j + 1. The desired induction on j is now complete. In summary, we have now proved the properties (A.7)–(A.9).
Step 3. We prove that the whole sequence \(\{x^{k}\}\) converges to \(\bar x\) and (3.5)–(3.6) hold.
By (A.9), for all \(j \ge k_{\ell}\),
$$ \begin{aligned} \sum\limits_{i=k_{\ell}}^{j} \left\| x^{i} - x^{i+1} \right\| &\le \frac{\left\| x^{k_{\ell}-1} - x^{k_{\ell}} \right\| - \left\| x^{j} - x^{j+1} \right\|}{2} + c\left[\sqrt{F(x^{k_{\ell}}) - \zeta} - \sqrt{F(x^{j+1}) - \zeta}\right]\\ &\le \frac{\left\| x^{k_{\ell}-1} - x^{k_{\ell}} \right\|}{2} + c \sqrt{F(x^{k_{\ell}}) - \zeta} < \infty, \end{aligned} $$
which indicates that \(\left \{ x^{k} \right \}\) is a Cauchy sequence. It follows that the whole sequence converges to the stationary point \(\bar {x}\). Furthermore, for all \(k \ge k_{\ell}\), we have \(x^{k}\in \mathbb {B}(\bar {x},\epsilon )\). As a result, the PG-iteration-based error bound condition (3.4) holds on all the iteration points \(\left \{ x^{k} \right \}_{k > k_{\ell }}\). Recall that by (3.1) and (A.14), we have
$$ \begin{array}{@{}rcl@{}} F(x^{k+1}) - F(x^{k}) &\le & -\kappa_{1}\left\| x^{k+1} - x^{k} \right\|^{2}, \\ F(x^{k+1}) - \zeta &\le & \hat{ \kappa}_{2} \left( \kappa^{2} + 1 \right)\left\| x^{k+1} - x^{k} \right\|^{2} , \end{array} $$
which implies that
$$ F(x^{k}) - F(x^{k+1})\ge \frac{\kappa_{1}}{\hat{ \kappa}_{2}\left( \kappa^{2} + 1 \right)} \left( F(x^{k+1}) - \zeta\right). $$
We can observe easily that
$$ F(x^{k}) - \zeta + \zeta - F(x^{k+1})\ge \frac{\kappa_{1}}{\hat{ \kappa}_{2}\left( \kappa^{2} + 1 \right)} \left( F(x^{k+1}) - \zeta\right). $$
Thus we have
$$ F(x^{k+1}) - \zeta \le \sigma \left( F(x^{k}) - \zeta\right), \text{with } \sigma := \frac{1}{1 + \frac{\kappa_{1}}{\hat{ \kappa}_{2}\left( \kappa^{2} + 1 \right)}} < 1, $$
which completes the proof of (3.5).
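The passage from the preceding inequality to the contraction factor \(\sigma\) is a one-line rearrangement, sketched here: moving \(\zeta - F(x^{k+1})\) to the right-hand side gives
$$ F(x^{k}) - \zeta \ge \left( 1 + \frac{\kappa_{1}}{\hat{ \kappa}_{2}\left( \kappa^{2} + 1 \right)} \right) \left( F(x^{k+1}) - \zeta\right), $$
and dividing by the positive factor in parentheses yields the stated bound. Applying it repeatedly from index \(k_{\ell}\) gives \(F(x^{k}) - \zeta \le \sigma^{k - k_{\ell}}\left( F(x^{k_{\ell}}) - \zeta \right)\) for all \(k \ge k_{\ell}\).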
Inspired by [6], we obtain the following linear convergence result for the sequence \(\{x^{k}\}\). Recall the sufficient descent property (3.1),
$$F(x^{k+1}) - F(x^{k}) \le -\kappa_{1}\left\| x^{k+1} - x^{k} \right\|^{2},$$
which indicates that there exists a constant C such that
$$ \left\| x^{k+1} - x^{k} \right\| \le \sqrt{ \frac{1}{\kappa_{1}} \left( F(x^{k}) - F(x^{k+1}) \right) }\le \sqrt{ \frac{1}{\kappa_{1}} \left( F(x^{k}) - \zeta \right) }\le C \sqrt{\sigma}^{k}. $$
In addition, we have that
$$ \left\| x^{k} - \bar{x} \right\| \le {\sum}_{i=k}^{\infty} \left\| x^{i} - x^{i+1} \right\|\le {\sum}_{i=k}^{\infty} C\sqrt{\sigma}^{i} \le \frac{C}{1-\sqrt{\sigma}} \sqrt{\sigma}^{k}, $$
which implies (3.6) with \(\rho _{0} = \frac {C}{1-\sqrt {\sigma }}\).
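One admissible choice of the constant \(C\) can be written out explicitly. This is a sketch for \(k \ge k_{\ell}\), assuming the linear rate for the function values is applied from index \(k_{\ell}\) onward, i.e. \(F(x^{k}) - \zeta \le \sigma^{k-k_{\ell}}\left( F(x^{k_{\ell}}) - \zeta \right)\):
$$ \sqrt{ \frac{1}{\kappa_{1}} \left( F(x^{k}) - \zeta \right) } \le \sqrt{ \frac{F(x^{k_{\ell}}) - \zeta}{\kappa_{1}\sigma^{k_{\ell}}} }\, \sqrt{\sigma}^{k} =: C \sqrt{\sigma}^{k}. $$
The last estimate for \(\left\| x^{k} - \bar{x} \right\|\) is then the geometric series
$$ {\sum}_{i=k}^{\infty} \sqrt{\sigma}^{i} = \frac{\sqrt{\sigma}^{k}}{1-\sqrt{\sigma}}, \qquad 0 < \sigma < 1. $$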