The projected polar proximal point algorithm converges globally

Lindstrom, Scott B.

doi:10.1007/s10898-022-01136-0

Open access
Published: 13 April 2022

Volume 84, pages 177–203, (2022)

Journal of Global Optimization

Scott B. Lindstrom ORCID: orcid.org/0000-0003-4287-4788¹

Abstract

Friedlander, Macêdo, and Pong recently introduced the projected polar proximal point algorithm (P4A) for solving optimization problems by using the closed perspective transforms of convex objectives. We analyse a generalization (GP4A) which replaces the closed perspective transform with a more general closed gauge. We decompose GP4A into the iterative application of two separate operators, and analyse it as a splitting method. By showing that GP4A and its under-relaxations exhibit global convergence whenever a fixed point exists, we obtain convergence guarantees for P4A by letting the gauge specify to the closed perspective transform for a convex function. We then provide easy-to-verify sufficient conditions for the existence of fixed points for the GP4A, using the Minkowski function representation of the gauge. Conveniently, the approach reveals that global minimizers of the objective function for P4A form an exposed face of the dilated fundamental set of the closed perspective transform.

1 Introduction

Friedlander, Macêdo, and Pong introduced the projected polar proximal point algorithm (${\textbf {P}}^\mathbf{4}{} {\textbf {A}}$, Definition 4) as the first proximal-point-like algorithm based on the polar envelope of a gauge [6]. The motivation to study such algorithms stems from the polar envelope’s relationship to infimal max convolution; Friedlander, Macêdo, and Pong showed that this relationship is analogous to the connection between the Moreau envelope and infimal convolution. They also illuminated useful variational properties of a duality framework admitted by such problems [5]. It is these special properties of the algorithm, and the rich associated theoretical framework that motivated its construction, that make its global convergence an interesting question.

The method makes use of the closed perspective transform for a proper convex function $f:X \rightarrow {{\mathbb {R}}}$:

$$\begin{aligned} f^\pi :X\times {{\mathbb {R}}}_+\rightarrow {{\mathbb {R}}}:\;(x,\lambda )\mapsto {\left\{ \begin{array}{ll} \lambda f(\lambda ^{-1}x) &{} \text {if}\;\;\lambda >0;\\ f_\infty (x) &{} \text {if}\;\;\lambda =0;\\ \infty &{} \text {if}\;\; \lambda < 0. \end{array}\right. } \end{aligned}$$

Here $f_\infty $ denotes the recession function of f [1, Definition 2.5.1], which satisfies

$$\begin{aligned} \mathrm{epi}(f_\infty ) = (\mathrm{epi}f)_\infty , \end{aligned}$$

and $\mathrm{epi}h$ denotes the epigraph of a proper convex function h and $C_\infty $ denotes the recession cone of a set C (see, for example, [8, Chapter 6]). The perspective $f^\pi $ is proper closed convex [7, Page 67] and is characterized by

$$\begin{aligned} \mathrm{epi}f^\pi = \mathrm{cl}\left( \mathrm{cone}(\mathrm{epi}f \times \{1\})\right) . \end{aligned}$$
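The three-case definition of $f^\pi $ can be made concrete with a small numerical sketch. The objective $f(x)=x^2+1$ on $X={{\mathbb {R}}}$ is a hypothetical example chosen here for illustration (it does not appear in the original text); its recession function is $0$ at the origin and $+\infty $ elsewhere. The function names are likewise illustrative assumptions.

```python
import math

def f(x):
    # Hypothetical example objective (an assumption): f(x) = x^2 + 1 on X = R.
    return x * x + 1.0

def f_recession(x):
    # Recession function of f(x) = x^2 + 1: zero at the origin, +inf elsewhere.
    return 0.0 if x == 0.0 else math.inf

def perspective(x, lam):
    # Closed perspective transform f^pi(x, lam), following the three cases.
    if lam > 0:
        return lam * f(x / lam)
    if lam == 0:
        return f_recession(x)
    return math.inf

# Positive homogeneity: f^pi(t*x, t*lam) should equal t * f^pi(x, lam) for t > 0.
x, lam, t = 3.0, 2.0, 5.0
lhs = perspective(t * x, t * lam)
rhs = t * perspective(x, lam)
print(lhs, rhs)
```

The slice $\lambda = 1$ recovers $f$ itself, which is the reason the algorithm studied below repeatedly returns to the set $X \times \{1\}$.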

Friedlander, Macêdo, and Pong also provided a result [6, Theorem 7.5] showing convergence under an assumption of strong convexity of $(f^\pi )^2$; however, the question of convergence more generally has remained open until now.

1.1 Outline and contributions

In Sect. 2, we recall familiar notation and concepts from convex analysis. In Sect. 3, we recall and analyse the polar proximity operator, a fundamental component of ${\textbf {P}}^\mathbf{4}{} {\textbf {A}}$. In particular, we show that it is firmly quasinonexpansive (Theorem 3.4).

In Sect. 4, we recall the algorithm ${\textbf {P}}^\mathbf{4}{} {\textbf {A}}$ and introduce a generalization thereof, ${\textbf {GP}}^\mathbf{4}{} {\textbf {A}}$ (Definition 5). Our motivation in so doing is that ${\textbf {GP}}^\mathbf{4}{} {\textbf {A}}$ may be described and studied as a 2-operator splitting method, a flexibility afforded by its definition on the lifted space $X \times {{\mathbb {R}}}$. In Sect. 4.1, we exploit this flexibility to show that, when the operator associated to ${\textbf {GP}}^\mathbf{4}{} {\textbf {A}}$ has a nonempty fixed point set, its fixed points all share a special property (Proposition 4.4), on which our analysis depends. In Sect. 4.2, we use this property to show that the operator is strictly quasinonexpansive (Theorem 4.8) and admits global convergence of sequences to a fixed point, whenever one exists (Theorem 4.11). We show similar results for the algorithm’s under-relaxed variants (Theorem 4.10). In Sect. 4.3, we provide convergence results for the associated shadow sequences. In Sect. 4.4, we provide an example that shows the operator associated with ${\textbf {GP}}^\mathbf{4}{} {\textbf {A}}$ is not, generically, firmly quasinonexpansive.

In Sect. 5, we provide sufficient conditions to guarantee fixed points of ${\textbf {GP}}^\mathbf{4}{} {\textbf {A}}$ (Theorem 5.2). Moreover, when ${\textbf {GP}}^\mathbf{4}{} {\textbf {A}}$ specializes to ${\textbf {P}}^\mathbf{4}{} {\textbf {A}}$, we show that the set of global minimizers of f defines an exposed face of the fundamental set for the perspective function $f^\pi $ (Theorem 5.3). The latter results connect to known results ([6, Theorem 7.4]) about fixed points of ${\textbf {P}}^\mathbf{4}{} {\textbf {A}}$, and we explain how (Remark 2).

In Sect. 6, we connect our sufficient conditions for fixed point existence, with our convergence results that depend on that existence, to state a simple global convergence guarantee (Theorem 6.1). It shows that ${\textbf {P}}^\mathbf{4}{} {\textbf {A}}$ is globally convergent in the full generality of [6].

2 Preliminaries

Throughout, X is a finite dimensional Euclidean space with the Euclidean norm, and we will work extensively with the space $X \times {{\mathbb {R}}}$. For clarity, when we work with a 2-tuple $(y,\lambda ) \in X \times {{\mathbb {R}}}$, it should be understood that $y \in X$ and $\lambda \in {{\mathbb {R}}}$. In such a case, the variable y will not be bolded. In order to be succinct, we will sometimes forego the use of a 2-tuple and simply use a single variable $\mathbf {x} \in X \times {{\mathbb {R}}}$. In such a case, the bolded variable $\mathbf {x}$ serves as a reminder that $\mathbf {x} \in X \times {{\mathbb {R}}}$. Throughout, $\kappa :X \times {{\mathbb {R}}}\rightarrow {{\mathbb {R}}}$ is a closed gauge in the sense of [5]. In other words, $\kappa $ is convex, and

$$\begin{aligned} (\forall \mathbf {x} \in X\times {{\mathbb {R}}})\; (\forall \lambda \in \left[ 0,\infty \right[ )\;0 \le \kappa (\lambda \mathbf {x}) = \lambda \kappa (\mathbf {x}). \end{aligned}$$

For an operator $U:X\times {{\mathbb {R}}}\rightarrow X\times {{\mathbb {R}}}$, $\mathrm{Fix}U:=\{\mathbf {x} \in X \times {{\mathbb {R}}}\;|\; U(\mathbf {x})=\mathbf {x}\}$ is its fixed point set. For a function $f:X \rightarrow {{\mathbb {R}}}_+$, $\mathrm{argmin}f:=\{x \;|\; f(x)=\inf f(X) \}$ is its set of global minimizers, $\mathrm{dom}f := \{x\;|\; f(x)< \infty \}$ is its domain, $\mathrm{lev}_{\le r}(f):=\{x \;|\; f(x)\le r \}$ is its r-lower level set, and $\mathrm{zer}f := \{x \;|\; f(x)= 0 \}$ is its zero set. For a gauge $\kappa : X \times {{\mathbb {R}}}\rightarrow {{\mathbb {R}}}_+$, the sets $\mathrm{argmin}\kappa $, $\mathrm{dom}\kappa $, $\mathrm{lev}_{\le r}(\kappa )$, and $\mathrm{zer}\kappa $ are the respectively analogous subsets of $X \times {{\mathbb {R}}}$. For a closed, convex subset $C \subset X \times {{\mathbb {R}}}$, $\mathrm{cone}C := \{\lambda \mathbf {x} \;|\; (\mathbf {x},\lambda ) \in (C \times [0,\infty [) \}$ is the cone of C, $P_C:\mathbf {x} \mapsto \mathrm{argmin}_{\mathbf {y} \in C}\Vert \mathbf {y}-\mathbf {x}\Vert $ is the projection operator associated with C, and $N_C(\mathbf {x})$ is the normal cone to C at a point $\mathbf {x} \in C$. For projection operators associated with (closed, convex) lower level sets, we use the shorthand: $P_{f \le r}:=P_{\mathrm{lev}_{\le r}(f)}$.

We will make use of various notions of nonexpansivity, which we now introduce; more information may be found in [2], and a comparison of what may be shown through different cutter and projection methods is found in [4]. The following definition may be found in either of these.

Definition 1

(Properties of operators). Let $D\subset X\times {{\mathbb {R}}}$ be nonempty and let $U:D\rightarrow X\times {{\mathbb {R}}}$. Assume that $\mathrm{Fix}U \ne \emptyset $. Then U is said to be

1.
firmly nonexpansive if
$$\begin{aligned} \Vert U(\mathbf {x})-U(\mathbf {y})\Vert ^2 + \Vert (\mathrm{Id}-U)(\mathbf {x})-(\mathrm{Id}-U)(\mathbf {y})\Vert ^2 \le \Vert \mathbf {x}-\mathbf {y}\Vert ^2 \;\; \forall \mathbf {x} \in D, \;\; \forall \mathbf {y} \in D; \end{aligned}$$
2.
nonexpansive if it is Lipschitz continuous with constant 1,
$$\begin{aligned} \Vert U(\mathbf {x})-U(\mathbf {y})\Vert \le \Vert \mathbf {x}-\mathbf {y}\Vert \qquad \forall \mathbf {x} \in D, \quad \forall \mathbf {y} \in D; \end{aligned}$$
3.
quasinonexpansive (QNE) if
$$\begin{aligned} \qquad \Vert U(\mathbf {x})-\mathbf {y}\Vert \le \Vert \mathbf {x}-\mathbf {y}\Vert \qquad \forall \mathbf {x} \in D, \quad \forall \mathbf {y} \in \mathrm{Fix}U \end{aligned}$$
(an operator that is both strictly quasinonexpansive, in the sense of item 5 below, and continuous is called paracontracting);
4.
firmly quasinonexpansive (FQNE) (or a cutter) if
$$\begin{aligned} \Vert U\mathbf {x}-\mathbf {y}\Vert ^2 + \Vert U\mathbf {x}-\mathbf {x}\Vert ^2 \le \Vert \mathbf {x}-\mathbf {y}\Vert ^2\quad \forall \mathbf {x} \in D, \quad \forall \mathbf {y} \in \mathrm{Fix}U; \end{aligned}$$
5.
strictly quasinonexpansive (SQNE) if
$$\begin{aligned} \Vert U(\mathbf {x})-\mathbf {y}\Vert < \Vert \mathbf {x}-\mathbf {y}\Vert \qquad \forall \mathbf {x} \in D \setminus \mathrm{Fix}U,\quad \forall \mathbf {y} \in \mathrm{Fix}U; \end{aligned}$$
6.
$\rho $-strongly quasinonexpansive for $\rho > 0$ if
$$\begin{aligned} \Vert U\mathbf {x}-\mathbf {y}\Vert ^2 \le \Vert \mathbf {x}-\mathbf {y}\Vert ^2 - \rho \Vert U\mathbf {x} -\mathbf {x}\Vert ^2 \qquad \forall \mathbf {x} \in D \setminus \mathrm{Fix}U,\quad \forall \mathbf {y} \in \mathrm{Fix}U. \end{aligned}$$

Lemma 2.1

[2, Proposition 4.4] Let $D \subset X \times {{\mathbb {R}}}$ be nonempty. Let $U:D \rightarrow X\times {{\mathbb {R}}}$. The following are equivalent:

1.
U is firmly quasinonexpansive;
2.
$2U-\mathrm{Id}$ is quasinonexpansive;
3.
$(\forall \mathbf {x} \in D)(\forall \mathbf {y} \in \mathrm{Fix}U)\quad \Vert U\mathbf {x}-\mathbf {y}\Vert ^2 \le \langle \mathbf {x}-\mathbf {y}, U\mathbf {x}-\mathbf {y}\rangle $;
4.
$(\forall \mathbf {x} \in D)(\forall \mathbf {y} \in \mathrm{Fix}U)\quad \langle \mathbf {y}-U\mathbf {x}, \mathbf {x}-U\mathbf {x}\rangle \le 0$;
5.
$(\forall \mathbf {x} \in D)(\forall \mathbf {y} \in \mathrm{Fix}U)\quad \Vert U\mathbf {x}-\mathbf {x}\Vert ^2 \le \langle \mathbf {y}-\mathbf {x},U\mathbf {x}-\mathbf {x}\rangle $.
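The equivalences of Lemma 2.1 are easy to probe numerically for a familiar cutter. The sketch below is an illustration under stated assumptions, not part of the original analysis: it takes U to be the projection onto the closed unit ball of ${{\mathbb {R}}}^3$ (so that $\mathrm{Fix}U$ is the ball itself) and samples random points to check both the defining inequality of Definition 1(4) and the inner-product form in item 4. The helper names are hypothetical.

```python
import math
import random

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def sub(a, b):
    return [ai - bi for ai, bi in zip(a, b)]

def project_ball(x, radius=1.0):
    # Euclidean projection onto the closed ball of the given radius centred at 0.
    n = math.sqrt(dot(x, x))
    return list(x) if n <= radius else [radius / n * xi for xi in x]

random.seed(0)
worst_item4 = -math.inf       # largest observed <y - Ux, x - Ux>
worst_cutter_gap = -math.inf  # largest observed ||Ux-y||^2 + ||Ux-x||^2 - ||x-y||^2
for _ in range(1000):
    x = [random.uniform(-3.0, 3.0) for _ in range(3)]
    # y is forced into Fix U (the unit ball) by projecting a random point.
    y = project_ball([random.uniform(-3.0, 3.0) for _ in range(3)])
    Ux = project_ball(x)
    worst_item4 = max(worst_item4, dot(sub(y, Ux), sub(x, Ux)))
    gap = (dot(sub(Ux, y), sub(Ux, y)) + dot(sub(Ux, x), sub(Ux, x))
           - dot(sub(x, y), sub(x, y)))
    worst_cutter_gap = max(worst_cutter_gap, gap)

print(worst_item4, worst_cutter_gap)
```

Both recorded quantities should be nonpositive up to rounding error. Expanding $\Vert \mathbf {x}-\mathbf {y}\Vert ^2$ around $U\mathbf {x}$ shows that the second quantity is exactly twice the first, which is the algebra behind the equivalence of the definition with item 4.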

Definition 2

(Fejér monotonicity [2, 5.1]). A sequence $(\mathbf {x}_n)_{n\in {{\mathbb {N}}}}$ is Fejér monotone with respect to a closed convex set $C \subset X \times {{\mathbb {R}}}$ if

$$\begin{aligned} \Vert \mathbf {x}_{n+1}-\mathbf {x}\Vert \le \Vert \mathbf {x}_n - \mathbf {x}\Vert \qquad \forall \mathbf {x}\in C, \quad \forall n \in {{\mathbb {N}}}. \end{aligned}$$

A Fejér monotone sequence with respect to a closed convex set C may be thought of as a sequence defined by $\mathbf {x}_n := U^n \mathbf {x}_0$ where U is QNE with respect to $C=\mathrm{Fix}U$. We will make use of the fact that a Fejér monotone sequence with respect to a non-empty set is always bounded. We will also make use of the following convergence result.

Theorem 2.2

[2, Theorem 5.11] Let $(\mathbf {x}_n)_{n\in \mathbb {N}}$ be a sequence in $X\times {{\mathbb {R}}}$ and let C be a nonempty closed convex subset of $X\times {{\mathbb {R}}}$. Suppose that $(\mathbf {x}_n)_{n\in \mathbb {N}}$ is Fejér monotone with respect to C. Then the following are equivalent:

1.
the sequence $(\mathbf {x}_n)_{n\in \mathbb {N}}$ converges strongly (i.e. in norm) to a point in C;
2.
$(\mathbf {x}_n)_{n\in \mathbb {N}}$ possesses a strong sequential cluster point in C;
3.
$\underset{n\rightarrow \infty }{\mathrm{\liminf }}\, d(\mathbf {x}_{n},C)=0.$
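Fejér monotonicity and the liminf criterion of Theorem 2.2 can be observed on a toy iteration. The following sketch rests on assumptions chosen for illustration only: C is the closed unit ball of ${{\mathbb {R}}}^2$, and U is the average of the identity and the projection onto C, an under-relaxed projection whose fixed point set is C. The sample points of C and all names are hypothetical.

```python
import math

def norm(x):
    return math.sqrt(sum(xi * xi for xi in x))

def dist(a, b):
    return norm([ai - bi for ai, bi in zip(a, b)])

def project_ball(x):
    # Projection onto the closed unit ball C.
    n = norm(x)
    return list(x) if n <= 1.0 else [xi / n for xi in x]

def U(x):
    # Under-relaxed projection: average of the identity and the projection onto C.
    p = project_ball(x)
    return [0.5 * (xi + pi) for xi, pi in zip(x, p)]

anchors = [[0.0, 0.0], [1.0, 0.0], [-0.6, 0.8], [0.3, -0.4]]  # sample points of C
x = [4.0, 3.0]
fejer_ok = True
distances_to_C = []
for _ in range(60):
    x_next = U(x)
    for c in anchors:
        # Fejer monotonicity: the distance to every point of C never increases.
        if dist(x_next, c) > dist(x, c) + 1e-12:
            fejer_ok = False
    x = x_next
    distances_to_C.append(max(norm(x) - 1.0, 0.0))

print(fejer_ok, distances_to_C[-1], x)
```

In this example the distance to C halves at every step, so condition 3 of Theorem 2.2 holds and the iterates converge to the boundary point of C in the direction of the starting point.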

We will make use of the following result, which may be recognized as a simplified version of [4, Theorem 4] and variants of which may be found in [3].

Lemma 2.3

Let $U: X\times {{\mathbb {R}}}\rightarrow X \times {{\mathbb {R}}}$ be FQNE. Then the operator given by

$$\begin{aligned} (2-\gamma )(U-\mathrm{Id})+\mathrm{Id}\end{aligned}$$

is QNE for all $\gamma \in \left[ 0,2\right] $. Moreover, if $\gamma \in \left]0,2 \right[$ then U is SQNE and the sequence $(\mathbf {x}_n)_{n \in {{\mathbb {N}}}}$ given by $\mathbf {x}_n := U^n \mathbf {x}_0$ satisfies

$$\begin{aligned} \underset{n \rightarrow \infty }{\lim }\Vert \mathbf {x}_n - U(\mathbf {x}_n)\Vert = 0. \end{aligned}$$

3 The polar envelope and proximity operator

Friedlander, Macêdo, and Pong introduced the polar envelope and its associated polar proximity operator [6], which we now recall.

Definition 3

(Polar envelope and polar proximal map [6]). For any closed gauge $\kappa : X\times {{\mathbb {R}}}\rightarrow {{\mathbb {R}}}$ and positive scalar $\alpha $, the function

$$\begin{aligned} \kappa _\alpha : X \times {{\mathbb {R}}}\rightarrow {{\mathbb {R}}}: \mathbf {x} \mapsto \underset{\mathbf {u}}{\inf }\max \{\kappa (\mathbf {u}),(1/\alpha )\Vert \mathbf {x}-\mathbf {u}\Vert \} \end{aligned}$$

is the polar envelope of $\kappa $. The corresponding polar proximal map

$$\begin{aligned} T_{\kappa , \alpha }: X\times {{\mathbb {R}}}\rightarrow X \times {{\mathbb {R}}}: \mathbf {x} \mapsto \underset{\mathbf {u}}{\mathrm{argmin}}\max \{\kappa (\mathbf {u}),(1/\alpha )\Vert \mathbf {x}-\mathbf {u}\Vert \} \end{aligned}$$

sends a point $\mathbf {x}$ to the minimizing set that defines $\kappa _\alpha (\mathbf {x})$. Naturally,

$$\begin{aligned} f_\alpha ^\pi : X \times {{\mathbb {R}}}\rightarrow {{\mathbb {R}}}: \mathbf {x} \mapsto \underset{\mathbf {u}}{\inf }\max \{f^\pi (\mathbf {u}),(1/\alpha )\Vert \mathbf {x}-\mathbf {u}\Vert \} \end{aligned}$$

denotes the polar envelope of the closed perspective transform $f^\pi $ for a proper convex function $f:X \rightarrow {{\mathbb {R}}}$.

Figure 1 shows the construction of the polar envelope and its proximity operator for $\kappa =\Vert \cdot \Vert _\infty $. At top and at bottom left, we take three choices of $\mathbf {x}$ and plot the functions $\Vert \cdot - \mathbf {x}\Vert $ in yellow, red, and orange respectively. The domain points for which each of these epigraphs intersects the epigraph of $\Vert \cdot \Vert _\infty $ at lowest height are the respective proximal points. The height at the point of intersection determines the envelope value. For points $\mathbf {x}$ in the white regions, such as the points for which the functions $\Vert \cdot -\mathbf {x}\Vert $ are orange and yellow respectively, the envelope value is simply $\Vert \mathbf {x}\Vert _\infty /2$. For points lying in the red regions, the proximal point lies on the diagonals; for points in the interiors of the red regions, the envelope values are strictly greater than $\Vert \mathbf {x}\Vert _\infty /2$. This results in the smoothing apparent in the red regions for the envelope shown at bottom right.

We devote the remainder of this section to showing that the polar proximity operator $T_{\kappa ,\alpha }$ is firmly quasinonexpansive. We have the symmetry of vertical rescaling,

$$\begin{aligned} T_{\kappa ,\alpha }(\mathbf {x})&=\underset{\mathbf {u}}{\mathrm{argmin}} \max \left\{ \kappa (\mathbf {u}),(1/\alpha )\Vert \mathbf {x}-\mathbf {u}\Vert \right\} \\&=\underset{\mathbf {u}}{\mathrm{argmin}} \max \left\{ \alpha \kappa (\mathbf {u}),\Vert \mathbf {x}-\mathbf {u}\Vert \right\} =T_{\alpha \kappa ,1}(\mathbf {x}), \end{aligned}$$

by which the proximity operators satisfy $T_{\kappa ,\alpha } = T_{\alpha \kappa ,1}$, while the envelopes satisfy $(\alpha \kappa )_1 = \alpha (\kappa _\alpha )$. Thus, by working with a general $\kappa $, we can, and do, let $\alpha =1$ without loss of generality. For simplicity, we also write T instead of $T_{\kappa ,\alpha }$. Note that we still need the notation $\kappa _\alpha $ to distinguish the polar envelope from the gauge $\kappa $ itself.
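As a quick sanity check of this rescaling, consider a worked example that is not taken from the original text: let $\kappa =\Vert \cdot \Vert $ be the Euclidean norm on $X \times {{\mathbb {R}}}$ and keep a general $\alpha >0$. For any $\mathbf {u}$, write $m:=\max \{\Vert \mathbf {u}\Vert ,(1/\alpha )\Vert \mathbf {x}-\mathbf {u}\Vert \}$. The triangle inequality gives

$$\begin{aligned} \Vert \mathbf {x}\Vert \le \Vert \mathbf {u}\Vert + \Vert \mathbf {x}-\mathbf {u}\Vert \le (1+\alpha )m, \end{aligned}$$

with equality throughout precisely when $\mathbf {u} = \mathbf {x}/(1+\alpha )$. Hence

$$\begin{aligned} T_{\kappa ,\alpha }(\mathbf {x}) = \frac{\mathbf {x}}{1+\alpha } \quad \text {and}\quad \kappa _\alpha (\mathbf {x}) = \frac{\Vert \mathbf {x}\Vert }{1+\alpha }. \end{aligned}$$

Repeating the argument for the gauge $\alpha \kappa $ with parameter 1, where now $m':=\max \{\alpha \Vert \mathbf {u}\Vert ,\Vert \mathbf {x}-\mathbf {u}\Vert \}$ satisfies $\Vert \mathbf {x}\Vert \le (1/\alpha + 1)m'$, yields

$$\begin{aligned} T_{\alpha \kappa ,1}(\mathbf {x}) = \frac{\mathbf {x}}{1+\alpha } = T_{\kappa ,\alpha }(\mathbf {x}) \quad \text {and}\quad (\alpha \kappa )_1(\mathbf {x}) = \frac{\alpha \Vert \mathbf {x}\Vert }{1+\alpha } = \alpha \kappa _\alpha (\mathbf {x}), \end{aligned}$$

in agreement with both identities.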

Lemma 3.1

For a closed gauge $\kappa $, it holds that

$$\begin{aligned} \mathrm{Fix}T = \mathrm{zer}\kappa . \end{aligned}$$

Proof

The fact that $\mathrm{zer}\kappa \subset \mathrm{Fix}T$ is obvious. We will show the reverse inclusion. Let $\mathbf {x} \in \mathrm{Fix}T$. Then

$$\begin{aligned} \mathbf {x} = T\mathbf {x} = \underset{\mathbf {u}\in X\times {{\mathbb {R}}}}{\mathrm{argmin}}\max \{\kappa (\mathbf {u}) , \Vert \mathbf {x}-\mathbf {u}\Vert \}, \end{aligned}$$

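and so

$$\begin{aligned} \underset{\mathbf {u} \in X \times {{\mathbb {R}}}}{\inf }\max \{\kappa (\mathbf {u}) , \Vert \mathbf {x}-\mathbf {u}\Vert \} = \max \{\kappa (\mathbf {x}) , \Vert \mathbf {x}-\mathbf {x}\Vert \} = \kappa (\mathbf {x}). \end{aligned}$$

Comparing with the choice $\mathbf {u}=\mathbf {0}$ shows that $\kappa (\mathbf {x}) \le \Vert \mathbf {x}\Vert < \infty $. Suppose, for a contradiction, that $\kappa (\mathbf {x}) > 0$. Then for $t \in \left]0,1\right[$ sufficiently close to 1, positive homogeneity gives $\max \{\kappa (t\mathbf {x}) , \Vert \mathbf {x}-t\mathbf {x}\Vert \} = \max \{t\kappa (\mathbf {x}) , (1-t)\Vert \mathbf {x}\Vert \} < \kappa (\mathbf {x})$, which contradicts the minimality of $\mathbf {x}$. Thus $\kappa (\mathbf {x}) \le 0$. Combining with the fact $\kappa (\mathbf {x})\ge 0$, we have $\kappa (\mathbf {x}) = 0$. $\square $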

Lemma 3.2

Let $\kappa $ be a closed gauge and $\kappa _\alpha $ its polar envelope. Then

$$\begin{aligned} (\kappa (\mathbf {x})=0) \iff (\kappa _\alpha (\mathbf {x}) = 0). \end{aligned}$$

Proof

Let $\kappa (\mathbf {x}) = 0$. Then

$$\begin{aligned} \kappa _\alpha (\mathbf {x})= \underset{\mathbf {u} \in X \times {{\mathbb {R}}}}{\inf }\max \{\kappa (\mathbf {u}),\Vert \mathbf {x}-\mathbf {u}\Vert \} \le \max \{ \kappa (\mathbf {x}),\Vert \mathbf {x}-\mathbf {x}\Vert \} = 0. \end{aligned}$$

Thus $\kappa _\alpha (\mathbf {x}) = 0$.

Now let $\kappa _\alpha (\mathbf {x}) = 0$. Then there exists a sequence $(\mathbf {u}_n)_{n \in {{\mathbb {N}}}}$ such that

$$\begin{aligned} \max \{ \kappa (\mathbf {u}_n), (1/\alpha )\Vert \mathbf {x}-\mathbf {u}_n\Vert \} \rightarrow 0. \end{aligned}$$

Then we have that $\kappa (\mathbf {u}_n) \rightarrow 0$ and $\Vert \mathbf {x}-\mathbf {u}_n\Vert \rightarrow 0$, and the latter gives $\mathbf {u}_n \rightarrow \mathbf {x}$. Combining with the fact that $\kappa $ is lower semicontinuous, we have that $0 \le \kappa (\mathbf {x}) \le \liminf _{n \rightarrow \infty } \kappa (\mathbf {u}_n) = 0$. This concludes the result. $\square $

Lemma 3.3

Let $\kappa $ be a closed gauge and $\kappa (\mathbf {x}) > 0$. Then one of the following holds:

(i)
$\kappa (T\mathbf {x}) <\frac{1}{\alpha }\Vert T\mathbf {x}-\mathbf {x}\Vert $, in which case $0 \in \frac{1}{\alpha } \frac{T\mathbf {x}-\mathbf {x}}{\Vert T\mathbf {x}-\mathbf {x}\Vert } + N_{\mathrm{dom}\kappa }(T\mathbf {x})$ and $T\mathbf {x} = P_{\mathrm{dom}\kappa }\mathbf {x}$.
(ii)
We have that
$$\begin{aligned} r :=&\kappa (T\mathbf {x}) = \frac{1}{\alpha }\Vert T\mathbf {x}-\mathbf {x}\Vert \\ \text {where}\quad T\mathbf {x} =&P_{\mathrm{lev}_{\le r}(\kappa )}\mathbf {x}, \end{aligned}$$
and there exists $\lambda \in [0,1[$ such that
$$\begin{aligned} 0&\in \frac{1-\lambda }{\alpha } \frac{T\mathbf {x}-\mathbf {x}}{\Vert T\mathbf {x}-\mathbf {x}\Vert } + \partial (\lambda ^+ \kappa )(T\mathbf {x}) \nonumber \\&= \frac{1-\lambda }{\alpha } \frac{T\mathbf {x}-\mathbf {x}}{\Vert T\mathbf {x}-\mathbf {x}\Vert } + {\left\{ \begin{array}{ll} \left\{ \lambda z \; | \; z \in \partial \kappa (T\mathbf {x}) \right\} &{} \text {if}\; \lambda > 0 \\ N_{\mathrm{dom}\kappa }(T\mathbf {x}) &{} \text {if}\; \lambda = 0 \end{array}\right. }. \end{aligned}$$

Proof

Let $\kappa (\mathbf {x})>0$. From Lemma 3.2, $\kappa _\alpha (\mathbf {x})>0$. We have from [6, Section 5] that the condition $\kappa _\alpha (\mathbf {x})>0$ guarantees that one of (i) or (ii) must hold. $\square $
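Lemma 3.3(ii) suggests a simple numerical recipe whenever projections onto the level sets of $\kappa $ are available: search for the radius r that satisfies $r = \frac{1}{\alpha }\Vert P_{\mathrm{lev}_{\le r}(\kappa )}\mathbf {x}-\mathbf {x}\Vert $. The sketch below is a minimal illustration under assumptions that are not in the original text: $\kappa $ is taken to be the Euclidean norm, so that level sets are balls, $\mathrm{dom}\kappa $ is the whole space, and case (i) cannot occur when $\kappa (\mathbf {x})>0$. The use of bisection and the names `polar_prox` and `project_level_set` are choices made here for illustration.

```python
import math

def norm(x):
    return math.sqrt(sum(xi * xi for xi in x))

def project_level_set(x, r):
    # Projection onto lev_{<= r}(kappa) for kappa = Euclidean norm (assumption):
    # the level set is the closed ball of radius r.
    n = norm(x)
    return list(x) if n <= r else [r / n * xi for xi in x]

def polar_prox(x, alpha=1.0, iters=200):
    # Bisection on r for the equation r = (1/alpha) * ||P_r(x) - x||.
    # The residual below is increasing in r, negative at 0, nonnegative at ||x||.
    lo, hi = 0.0, norm(x)
    for _ in range(iters):
        r = 0.5 * (lo + hi)
        p = project_level_set(x, r)
        residual = r - norm([pi - xi for pi, xi in zip(p, x)]) / alpha
        if residual < 0:
            lo = r
        else:
            hi = r
    r = 0.5 * (lo + hi)
    return project_level_set(x, r), r

x = [3.0, 4.0]
alpha = 1.5
Tx, r = polar_prox(x, alpha)
# For the Euclidean norm the minimizer is x / (1 + alpha), by the triangle inequality.
closed_form = [xi / (1.0 + alpha) for xi in x]
print(Tx, r, closed_form)
```

For a general gauge only `project_level_set` would need to change, and case (i), in which the minimizer is the projection onto the closure of the domain, would have to be handled separately.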

We will use the characterization of T from Lemma 3.3 often; hence the shorthand $P_{\kappa \le r}:=P_{\mathrm{lev}_{\le r}(\kappa )}$. Now we have the principal result of this section, which establishes that the polar proximity operator T is firmly quasinonexpansive.

Theorem 3.4

T is firmly quasinonexpansive.

Proof

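Let $\mathbf {y} \in X \times {{\mathbb {R}}}$ and $\mathbf {x} \in \mathrm{Fix}T$. If $\kappa (\mathbf {y}) = 0$ then $\mathbf {y} \in \mathrm{Fix}T$ by Lemma 3.1, and the inequality in item 4 of Lemma 2.1 holds trivially because $\mathbf {y}-T\mathbf {y}=\mathbf {0}$; so let $\kappa (\mathbf {y}) > 0$. By Lemma 3.3, we need only consider two cases.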

Case 1 If Lemma 3.3(i) holds, then we have that

$$\begin{aligned} -\frac{1}{\alpha }\frac{T\mathbf {y}-\mathbf {y}}{\Vert T\mathbf {y}-\mathbf {y}\Vert } \in N_{\mathrm{dom}\kappa } (T\mathbf {y}), \end{aligned}$$

(3.1)

and so $\mathbf {y} - T\mathbf {y} \in N_{\mathrm{dom}\kappa }(T\mathbf {y})$. Thus $T\mathbf {y} = P_{{\overline{\mathrm{dom}\kappa }}}(\mathbf {y})$. The operator $P_{{\overline{\mathrm{dom}\kappa }}}$ is FQNE with $\mathrm{Fix}P_{{\overline{\mathrm{dom}\kappa }}} = {\overline{\mathrm{dom}\kappa }}$. Thus by Lemma 2.1,

$$\begin{aligned} (\forall \mathbf {v} \in X \times {{\mathbb {R}}})\left( \forall \mathbf {u} \in \mathrm{Fix}P_{{\overline{\mathrm{dom}\kappa }}}\right) \quad \left\langle \mathbf {u}-P_{{\overline{\mathrm{dom}\kappa }}}\mathbf {v},\mathbf {v}-P_{{\overline{\mathrm{dom}\kappa }}}\mathbf {v} \right\rangle \le 0. \end{aligned}$$

(3.2)

Since $\mathbf {x} \in \mathrm{Fix}T \subset {\overline{\mathrm{dom}\kappa }}= \mathrm{Fix}P_{{\overline{\mathrm{dom}\kappa }}}$, we have that (3.2) is true, in particular, for $\mathbf {u}=\mathbf {x}$ and $\mathbf {v}=\mathbf {y}$. Thus we obtain

$$\begin{aligned} \left\langle \mathbf {x}-P_{{\overline{\mathrm{dom}\kappa }}}\mathbf {y},\mathbf {y}-P_{{\overline{\mathrm{dom}\kappa }}}\mathbf {y} \right\rangle \le 0. \end{aligned}$$

This is just

$$\begin{aligned} \langle \mathbf {x}-T\mathbf {y},\mathbf {y}-T\mathbf {y} \rangle \le 0. \end{aligned}$$

By Lemma 2.1, this is what we needed to show.

Case 2 If Lemma 3.3(ii) holds with $\lambda = 0$, then we again obtain (3.1) and proceed as in Case 1, obtaining what we needed to show. If Lemma 3.3(ii) holds with $\lambda > 0$ then there exists $r >0$ such that

$$\begin{aligned} T\mathbf {y} = P_{\kappa \le r}\mathbf {y}. \end{aligned}$$

Now since $P_{\kappa \le r}$ is FQNE, we have that

$$\begin{aligned} (\forall \mathbf {v} \in X \times {{\mathbb {R}}})\left( \forall \mathbf {u} \in \mathrm{Fix}P_{\kappa \le r}\right) \quad \left\langle \mathbf {u}-P_{\kappa \le r}\mathbf {v} , \mathbf {v}-P_{\kappa \le r}\mathbf {v} \right\rangle \le 0. \end{aligned}$$

(3.3)

Since $\mathbf {x} \in \mathrm{Fix}T =\mathrm{lev}_{\le 0}\kappa \subset \mathrm{lev}_{\le r}\kappa = \mathrm{Fix}P_{\kappa \le r}$, we have that (3.3) is true, in particular, for $\mathbf {u}=\mathbf {x}$ and $\mathbf {v}=\mathbf {y}$. Thus we obtain

$$\begin{aligned} \langle \mathbf {x}-P_{\kappa \le r}\mathbf {y},\mathbf {y}-P_{\kappa \le r}\mathbf {y} \rangle \le 0. \end{aligned}$$

This is just

$$\begin{aligned} \langle \mathbf {x}-T\mathbf {y},\mathbf {y}-T\mathbf {y} \rangle \le 0. \end{aligned}$$

By Lemma 2.1, this is what we needed to show. $\square $

4 The projected polar proximal point algorithm

We now recall the projected polar proximal point algorithm.

Definition 4

(Projected polar proximal point algorithm ${\textbf {P}}^\mathbf{4}{} {\textbf {A}}$ [6, 7.2]). Fix $\alpha >0$ and set

$$\begin{aligned} \mathfrak {P}_{\alpha ,f}: X \rightarrow X \times {{\mathbb {R}}}: v \mapsto&T_{f^\pi ,\alpha }(v,1). \end{aligned}$$

The projected polar proximal point algorithm is to begin with any $v_0$ and update by

$$\begin{aligned} (v_{k+1},\lambda _{k+1})=\mathfrak {P}_{\alpha ,f}(v_k). \end{aligned}$$

(4.1)

Intuitively, the motivation of ${\textbf {P}}^\mathbf{4}{} {\textbf {A}}$ is to minimize a function f by attacking the gauge given by its closed perspective transform $f^\pi $. Its polar envelope $f_\alpha ^\pi $ serves a role analogous to the role played by the Moreau envelope in the construction of the traditional proximal point algorithm; see the remarks in [6, 7.2]. The use of the gauge allows problems to be reformulated using gauge duality [5].

In addition to explaining these connections, Friedlander, Macêdo, and Pong also showed that ${\textbf {P}}^\mathbf{4}{} {\textbf {A}}$ has a useful fixed point property, which we now recall.

Theorem 4.1

(Fixed points of ${\textbf {P}}^\mathbf{4}{} {\textbf {A}}$ [6, Theorem 7.4]). Let $f:X \rightarrow {{\mathbb {R}}}_+ \cup \{+\infty \}$ be a proper closed nonnegative convex function with $\inf f>0$ and $\mathrm{argmin}f \ne \emptyset $. The following hold

(i)
If $(v,\lambda _*)=\mathfrak {P}_{1,f}(v)$, then $\lambda _* > 0$ and $\lambda _*^{-1}v \in \mathrm{argmin}f$.
(ii)
If $v \in \mathrm{argmin}f$, then there exists $\lambda _* > 0$ so that $(\tau v, \lambda _*) = \mathfrak {P}_{1,f}(\tau v)$ where $\tau := \left[ 1+f(v) \right] ^{-1}$.

To show convergence of ${\textbf {P}}^\mathbf{4}{} {\textbf {A}}$, we will analyse a generalization of it, which we now introduce.

Definition 5

(Generalized projected polar proximal point algorithm (${\textbf {GP}}^\mathbf{4}{} {\textbf {A}}$)). For a gauge $\kappa : X\times {{\mathbb {R}}}\rightarrow {{\mathbb {R}}}$ and fixed $\alpha >0$, choose a starting point $\mathbf {x}_0 \in X\times {{\mathbb {R}}}$ and iterate by

$$\begin{aligned} \mathbf {x}_{k+1}&= P_S \circ T \mathbf {x}_k,\\ \text {where}\quad S&= X \times \{1\}, \nonumber \\ \text {and}\quad P_S&: X\times {{\mathbb {R}}}\rightarrow X \times {{\mathbb {R}}}: (y,\lambda ) \mapsto (y,1) \nonumber \end{aligned}$$

(4.2)

is simply the projection operator for S.
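The iteration (4.2) can be sketched in a few lines once T is available. The example below is an illustration under assumptions that are not part of the original text: the objective is $f(v)=\sqrt{1+\Vert v\Vert ^2}$ on $X={{\mathbb {R}}}^2$, whose closed perspective transform is the Euclidean norm on the half-space $\lambda \ge 0$. For points with positive last coordinate, the polar proximal map of this gauge is $\mathbf {x}\mapsto \mathbf {x}/(1+\alpha )$, which follows from the triangle inequality. The function names are hypothetical.

```python
import math

def f(v):
    # Hypothetical objective (an assumption): f(v) = sqrt(1 + ||v||^2),
    # minimised at v = 0 with inf f = 1 > 0.
    return math.sqrt(1.0 + sum(vi * vi for vi in v))

def T(x, alpha):
    # Polar proximal map for kappa = f^pi at points with positive last
    # coordinate; closed form valid for this particular gauge only.
    return [xi / (1.0 + alpha) for xi in x]

def P_S(x):
    # Projection onto S = X x {1}: reset the last coordinate to 1.
    return x[:-1] + [1.0]

def gp4a(v0, alpha=1.0, iters=60):
    # Start on S at (v0, 1) and alternate T and P_S as in (4.2).
    x = list(v0) + [1.0]
    for _ in range(iters):
        x = P_S(T(x, alpha))
    return x

x_star = gp4a([2.0, -1.0])
v_star = x_star[:-1]           # shadow of the limit on X
lam_star = T(x_star, 1.0)[-1]  # last coordinate of T at the limit
print(v_star, lam_star, 1.0 / (1.0 + f([0.0, 0.0])))
```

In this example the iterates contract towards $(0,1)$, the recovered point $v=0$ is the minimizer of f, and with $\alpha =1$ the last coordinate of T at the limit agrees with $\tau = [1+f(0)]^{-1}$ from Theorem 4.1(ii).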

Note that ${\textbf {P}}^\mathbf{4}{} {\textbf {A}}$ and ${\textbf {GP}}^\mathbf{4}{} {\textbf {A}}$ are defined on X and $X \times {{\mathbb {R}}}$ respectively. However, when $\kappa = f^\pi $ the sequences $(v_k)$ from (4.1) and $(\mathbf {x}_k)_k$ from (4.2) clearly satisfy $(v_k,1) = \mathbf {x}_k$, and so the algorithms ${\textbf {P}}^\mathbf{4}{} {\textbf {A}}$ and ${\textbf {GP}}^\mathbf{4}{} {\textbf {A}}$ generate the same sequence on the non-lifted space. This means that we can study ${\textbf {P}}^\mathbf{4}{} {\textbf {A}}$ by studying ${\textbf {GP}}^\mathbf{4}{} {\textbf {A}}$, because their performance for $\kappa = f^\pi $ is the same.

For our purposes, the advantage of ${\textbf {GP}}^\mathbf{4}{} {\textbf {A}}$ is that it is defined on the lifted space $X \times {{\mathbb {R}}}$, which lets us decompose the method into the iterative application of two separate operators, T and $P_S$. We can therefore study ${\textbf {GP}}^\mathbf{4}{} {\textbf {A}}$ as a splitting method, in which the two operators are applied in succession. This decomposition allows us to build new characterizations of fixed points, and these characterizations, in turn, allow us to show global convergence. Thus, by analysing ${\textbf {GP}}^\mathbf{4}{} {\textbf {A}}$, we are able to prove the desired convergence of ${\textbf {P}}^\mathbf{4}{} {\textbf {A}}$ without assuming strong convexity of $(f^\pi )^2$. Whether the added flexibility of ${\textbf {GP}}^\mathbf{4}{} {\textbf {A}}$ has benefits beyond its utility for learning about ${\textbf {P}}^\mathbf{4}{} {\textbf {A}}$ is a natural question for future research.

4.1 Alternative fixed point characterization

We next establish a useful characterization of the fixed points (Proposition 4.4). For this purpose, we need the following two lemmas.

Lemma 4.2

Let $(x,1) \in \mathrm{Fix}P_S \circ T$. Then the following hold.

(i)
$T(x,1) = (x,\lambda )$ for some $\lambda \in \left[ 0,1 \right] $.
(ii)
Moreover, if $(x,1) \notin \mathrm{Fix}T$, then $\lambda \in [0,1[$.

Proof

(i): Since $(x,1) \in \mathrm{Fix}P_S \circ T$, we have that

$$\begin{aligned} P_S \circ T(x,1) = (x,1). \end{aligned}$$

Since $P_S^{-1}(w,1) = \{(w,\lambda )\; | \lambda \in {{\mathbb {R}}}\}$ for all $w \in X$, we have that $T(x,1) = (x,\lambda )$ for some $\lambda \in {{\mathbb {R}}}$. We now show that $\lambda \in \left[ 0,1\right] $. Since T is FQNE,

$$\begin{aligned} (\forall \mathbf {u} \in X\times {{\mathbb {R}}})(\forall \mathbf {v}\in \mathrm{Fix}T)\quad \langle \mathbf {v}-T\mathbf {u},\mathbf {u}-T\mathbf {u} \rangle \le 0. \end{aligned}$$

(4.3)

In particular, $\mathbf {0} \in \mathrm{Fix}T$ and so (4.3) yields

$$\begin{aligned} \langle -T\mathbf {u},\mathbf {u}-T\mathbf {u} \rangle \le 0. \end{aligned}$$

(4.4)

In particular (4.4) holds for $\mathbf {u}=(x,1)$, and so we have

$$\begin{aligned} \langle -T(x,1) , (x,1) - T(x,1) \rangle \le 0 \end{aligned}$$

This is just

$$\begin{aligned} \langle -(x,\lambda ),(x,1)-(x,\lambda ) \rangle&= \langle -(x,\lambda ),(0,1-\lambda )\rangle \\&= -\lambda (1-\lambda ) \le 0 \end{aligned}$$

Thus $\lambda \in \left[ 0,1 \right] $.

(ii): Since T is FQNE, it is SQNE [2], and so we have that

$$\begin{aligned} (\forall \mathbf {u} \in X\times {{\mathbb {R}}}\setminus \mathrm{Fix}T)(\forall \mathbf {v} \in \mathrm{Fix}T)\quad \Vert T\mathbf {u}-\mathbf {v}\Vert ^2 < \Vert \mathbf {u}-\mathbf {v}\Vert ^2. \end{aligned}$$

(4.5)

Specifically, (4.5) holds for $\mathbf {u} = (x,1)$ and $\mathbf {v}=\mathbf {0}$, and so we obtain

$$\begin{aligned} \Vert (x,\lambda )\Vert ^2 < \Vert (x,1)\Vert ^2. \end{aligned}$$

This is just

$$\begin{aligned} \Vert x\Vert ^2 + \lambda ^2 < \Vert x\Vert ^2 + 1^2, \end{aligned}$$

and so we conclude that $\lambda < 1$. This concludes the result. $\square $

Lemma 4.3

Let $(x,1) \in \mathrm{Fix}P_S \circ T$ and $(x,\lambda ) := T(x,1)$ (a representation that always holds by Lemma 4.2). Then for any $(y,1) \in S$ the following hold:

(i)
$\Vert T(y,1)-(y,1)\Vert \ge \Vert T(x,1)-(x,1) \Vert $;
(ii)
Additionally, if $T(x,1) = P_{{\overline{\mathrm{dom}\kappa }}}(x,1)$ or $\lambda =1$ then
(a)
$\Vert T(y,1)-P_S\circ T(y,1)\Vert \ge \Vert T(x,1)-(x,1)\Vert $;
(b)
$T(y,1) = (w,\mu )$ for some $\mu \in {{\mathbb {R}}}$ satisfying $|1-\lambda | \le |1-\mu |$;
(c)
If $\lambda < 1$, then $T(y,1) = (w,\mu )$ for some $\mu \le \lambda $.

Proof

Let $(y,1) \in S$ and $(w,\mu ) = T(y,1)$. Set $r':=\Vert T(x,1)-(x,1) \Vert $ and $r_2:=\Vert T(y,1)-(y,1)\Vert $. We have by Lemma 4.2 that

$$\begin{aligned} \lambda \in \left[ 0,1 \right] . \end{aligned}$$

We will consider two cases: when $\lambda = 1$ and when $\lambda \in [0,1[$.

Case $\lambda =1$ Then $|1-\lambda | = 0$, and so (ii)b clearly holds. Additionally, by Lemma 4.2, we have that $(x,1) \in \mathrm{Fix}T$, and so $\Vert T(x,1)-(x,1)\Vert = 0$, and so (i) and (ii)a both clearly hold, while (ii)c does not apply. This concludes what we needed to show in the case $\lambda =1$.

Case $\lambda <1$ Let

$$\begin{aligned} \lambda \in [0,1[ \end{aligned}$$

(4.6)

By Lemma 3.3, we may further consider two subcases, namely when $T(x,1) = P_{{\overline{\mathrm{dom}\kappa }}}(x,1)$ and when $T(x,1) = P_{\kappa \le r'}(x,1)$.

Subcase 1 Let $T(x,1) = P_{{\overline{\mathrm{dom}\kappa }}}(x,1)$. Since $P_{{\overline{\mathrm{dom}\kappa }}}$ is FQNE with $\mathrm{Fix}P_{{\overline{\mathrm{dom}\kappa }}} = {\overline{\mathrm{dom}\kappa }}$, we have that

$$\begin{aligned} (\forall \mathbf {u} \in X\times {{\mathbb {R}}})(\forall \mathbf {v} \in {\overline{\mathrm{dom}\kappa }}) \quad \langle P_{{\overline{\mathrm{dom}\kappa }}}\mathbf {u}-\mathbf {u},P_{{\overline{\mathrm{dom}\kappa }}}\mathbf {u}-\mathbf {v} \rangle \le 0. \end{aligned}$$

(4.7)

In particular, (4.7) holds for $\mathbf {u} = (x,1)$ and $\mathbf {v} = T(y,1)$. Thus

$$\begin{aligned} \langle P_{{\overline{\mathrm{dom}\kappa }}}(x,1)-(x,1) , P_{{\overline{\mathrm{dom}\kappa }}}(x,1) - T(y,1)\rangle \le 0. \end{aligned}$$

(4.8)

Since $P_{{\overline{\mathrm{dom}\kappa }}}(x,1) = T(x,1)$, we have $P_{{\overline{\mathrm{dom}\kappa }}}(x,1) = (x,\lambda )$. Combining with (4.8), we obtain

$$\begin{aligned} \langle (x,\lambda )-(x,1) , (x,\lambda )-T(y,1) \rangle&\le 0 \end{aligned}$$

(4.9)

$$\begin{aligned} \langle (0,\lambda -1) , (x,\lambda )-T(y,1) \rangle&\le 0\nonumber \\ (\lambda -1)(\lambda -\mu )&\le 0. \end{aligned}$$

(4.10)

Recall that by (4.6), we have $\lambda \in [0 ,1[$. Thus we have $(\lambda -1) < 0$. Combining this fact with (4.10), we have $\lambda - \mu \ge 0$, and so $\lambda \ge \mu $. This shows (ii)b and (ii)c. Thus $(1-\mu ) \ge (1-\lambda ) > 0$, where the final inequality is because $\lambda < 1$. Thus we have

$$\begin{aligned} (\mu -1)^2 \ge (\lambda - 1)^2. \end{aligned}$$

(4.11)

We also have that

$$\begin{aligned} \Vert T(y,1)-(y,1)\Vert ^2 = \Vert (w,\mu )-(y,1) \Vert ^2 = \Vert w-y\Vert ^2+(\mu -1)^2 \ge (\mu -1)^2, \end{aligned}$$

(4.12)

and that

$$\begin{aligned} (\lambda - 1)^2 = \Vert (x,1)-(x,\lambda )\Vert ^2 = \Vert T(x,1)-(x,1)\Vert ^2 = r'^2. \end{aligned}$$

(4.13)

Combining (4.11), (4.12), and (4.13), we obtain $\Vert T(y,1) - (y,1)\Vert \ge r'$. Thus $r_2 \ge r'$, which shows (i).

We also have that

$$\begin{aligned} \Vert T(x,1)-P_S\circ T(x,1)\Vert&= \Vert (x,\lambda )-(x,1)\Vert = 1-\lambda \le 1-\mu \\ \text {and}\quad 1-\mu&= \Vert (w,\mu )-(w,1)\Vert = \Vert T(y,1)-P_S\circ T(y,1)\Vert , \end{aligned}$$

which shows (ii)a. Thus we have shown everything we needed to show in the case when $T(x,1) = P_{{\overline{\mathrm{dom}\kappa }}}(x,1)$.

Subcase 2 Let $T(x,1) = P_{\kappa \le r'}(x,1)$. Suppose for a contradiction that $r_2 < r'$. For the sake of simplicity, define $P_{r'}:=P_{\kappa \le r'}$. We have that

$$\begin{aligned} \kappa (T(y,1)) \le r_2 < r', \end{aligned}$$

where the first inequality is because $\kappa (T(y,1)) \le \Vert T(y,1)-(y,1)\Vert $ [6, Theorem 4.4] and the second is our contradiction assumption. Thus

$$\begin{aligned} T(y,1) \in \mathrm{lev}_{\le r_2} \kappa \subset \mathrm{lev}_{< r'}\kappa \subset \mathrm{lev}_{\le r'}\kappa . \end{aligned}$$

As $\mathrm{lev}_{\le r'}\kappa $ is closed and convex, the operator $P_{r'}$ is FQNE with $\mathrm{Fix}P_{r'} = \mathrm{lev}_{\le r'}\kappa $. Using Lemma 2.1, we have that

$$\begin{aligned} (\forall \mathbf {u} \in X \times {{\mathbb {R}}})(\forall \mathbf {v} \in \mathrm{lev}_{\le r'}\kappa )\quad \langle P_{r'}\mathbf {u}-\mathbf {u} , P_{r'}\mathbf {u}-\mathbf {v} \rangle \le 0. \end{aligned}$$

(4.14)

In particular, $T(y,1) \in \mathrm{Fix}P_{\kappa \le r'}$, and so we can apply (4.14) with $\mathbf {u} = (x,1)$ and $\mathbf {v} = T(y,1)$, obtaining

$$\begin{aligned} \langle P_{r'}(x,1)-(x,1) , P_{r'}(x,1) - T(y,1)\rangle \le 0. \end{aligned}$$

(4.15)

Since $P_{r'}(x,1) = T(x,1)$, we have $P_{r'}(x,1) = (x,\lambda )$. Recall that $(w,\mu ) = T(y,1)$. Combining with (4.15), we again obtain (4.9), and we proceed as we did in Subcase 1 to obtain $r_2 \ge r'$, a contradiction. Thus $r_2 \ge r'$, which shows (i). $\square $

We now establish the useful alternative characterization of the set $\mathrm{Fix}P_S \circ T$.

Proposition 4.4

(Fixed points of $P_S\circ T$). Let $(x,1) \in \mathrm{Fix}P_S \circ T$ and set $r' := \Vert (x,1)-T(x,1)\Vert $. It holds that

$$\begin{aligned} \mathrm{Fix}P_S \circ T = \{(u,1) \;|\; \Vert T(u,1)-(u,1)\Vert = r' \}. \end{aligned}$$

(4.16)

Proof

The first inclusion $\{(u,1) \; | \; \Vert T(u,1)-(u,1)\Vert = r' \} \supset \mathrm{Fix}P_S \circ T$ is a consequence of Lemma 4.3. Indeed, let $(x,1),(y,1) \in \mathrm{Fix}P_S \circ T$. Applying Lemma 4.3(i) twice, with the roles of $(x,1)$ and $(y,1)$ exchanged, we have that

$$\begin{aligned} \Vert T(y,1)-(y,1)\Vert&\ge \Vert T(x,1)-(x,1) \Vert ,\\ \text {and}\quad \Vert T(y,1)-(y,1)\Vert&\le \Vert T(x,1)-(x,1)\Vert , \end{aligned}$$

and so $\Vert T(y,1)-(y,1)\Vert = \Vert T(x,1)-(x,1)\Vert = r'$.

Now we will show the reverse inclusion. Let $(y,1) \in S$ and let

$$\begin{aligned} \Vert (y,1)-T(y,1)\Vert = r' = \Vert (x,1)-T(x,1)\Vert . \end{aligned}$$

By Lemma 4.2 we have that

$$\begin{aligned} \exists \lambda \in \left[ 0,1\right] \quad \text {such that}\quad (x,\lambda )=T(x,1). \end{aligned}$$

Case 1 $\lambda =1$. Suppose $\lambda =1$. Then we have that $(x,1) \in \mathrm{Fix}T$ and so $r' = 0$. Since $\Vert T(y,1)-(y,1)\Vert =r'$, we have $\Vert T(y,1)-(y,1)\Vert = 0$. Thus $(y,1) \in \mathrm{Fix}T$, and so $(y,1) \in \mathrm{Fix}P_S \circ T$. This concludes the case when $\lambda =1$.

Case 2 $\lambda \in [0,1[$. Let $\lambda \in [0,1[$. Then $(x,1) \notin \mathrm{Fix}T$ and so $\kappa (x,1) > 0$. Since $\kappa (x,1)>0$, we have by Lemma 3.3 that either $T(x,1) = P_{{\overline{\mathrm{dom}\kappa }}}(x,1)$ or $T(x,1) = P_{\kappa \le r'}(x,1)$.

Case 2(a) $T(x,1) = P_{{\overline{\mathrm{dom}\kappa }}}(x,1)$. Since ${\overline{\mathrm{dom}\kappa }}$ is closed and convex, $P_{{\overline{\mathrm{dom}\kappa }}}$ is FQNE with $\mathrm{Fix}P_{{\overline{\mathrm{dom}\kappa }}} = {\overline{\mathrm{dom}\kappa }}$, and so we have that

$$\begin{aligned} (\forall \mathbf {v} \in X\times {{\mathbb {R}}})(\forall \mathbf {u} \in {\overline{\mathrm{dom}\kappa }}) \quad \langle \mathbf {u}-P_{{\overline{\mathrm{dom}\kappa }}}\mathbf {v},\mathbf {v}-P_{{\overline{\mathrm{dom}\kappa }}}\mathbf {v} \rangle \le 0. \end{aligned}$$

(4.17)

In particular, $T(y,1) \in {\overline{\mathrm{dom}\kappa }}$, and so we may apply (4.17) with $\mathbf {u} = T(y,1)$ and $\mathbf {v} = (x,1)$, obtaining

$$\begin{aligned} \langle T(y,1)-P_{{\overline{\mathrm{dom}\kappa }}}(x,1), (x,1)-P_{{\overline{\mathrm{dom}\kappa }}}(x,1) \rangle \le 0. \end{aligned}$$

(4.18)

Using the fact that $T(y,1) = (w,\mu )$ and $P_{{\overline{\mathrm{dom}\kappa }}}(x,1) = T(x,1) = (x,\lambda )$, (4.18) becomes

$$\begin{aligned} \langle (w,\mu )-(x,\lambda ),(x,1)-(x,\lambda ) \rangle \le 0. \end{aligned}$$

(4.19)

From (4.19) we have that

$$\begin{aligned} (\mu - \lambda )(1-\lambda ) \le 0. \end{aligned}$$

(4.20)

Using the fact that $\lambda < 1$, (4.20) implies that $\mu \le \lambda $. Thus we have that $1-\mu \ge 1-\lambda \ge 0$. Thus we have that

$$\begin{aligned} (1-\mu )^2 \ge (1-\lambda )^2. \end{aligned}$$

(4.21)

Now since

$$\begin{aligned} (r')^2&= \Vert (x,1)-T(x,1)\Vert ^2 = \Vert (x,1)-(x,\lambda )\Vert ^2 = (1-\lambda )^2\quad \text {and}\\ (r')^2&= \Vert (y,1)-T(y,1)\Vert ^2 = \Vert (y,1)-(w,\mu )\Vert ^2 = \Vert y-w\Vert ^2 + (1-\mu )^2, \end{aligned}$$

we have that

$$\begin{aligned} (1-\lambda )^2 = \Vert y-w\Vert ^2 + (1-\mu )^2. \end{aligned}$$

(4.22)

Combining (4.21) and (4.22), we obtain

$$\begin{aligned} 0\ge (1-\lambda )^2 - (1-\mu )^2 = \Vert w-y\Vert ^2. \end{aligned}$$

Thus we have that $w=y$, and so $T(y,1) = (y,\mu )$. Thus

$$\begin{aligned} P_S\circ T(y,1) = P_S (y,\mu ) = (y,1), \end{aligned}$$

and so $(y,1) \in \mathrm{Fix}P_S \circ T$.

Case 2(b) $T(x,1) = P_{\kappa \le r'}(x,1)$. Let $T(x,1) = P_{\kappa \le r'}(x,1)$.

Since $P_{\kappa \le r'}$ is FQNE with $\mathrm{Fix}P_{\kappa \le r'} = \mathrm{lev}_{\le r'}\kappa $, we have that

$$\begin{aligned} (\forall \mathbf {v} \in X\times {{\mathbb {R}}})(\forall \mathbf {u} \in \mathrm{lev}_{\le r'} \kappa ) \quad \langle \mathbf {u}-P_{\kappa \le r'}\mathbf {v},\mathbf {v}-P_{\kappa \le r'}\mathbf {v} \rangle \le 0. \end{aligned}$$

(4.23)

In particular, since $\kappa (T(y,1))\le \Vert T(y,1)-(y,1)\Vert =r'$, we have that $T(y,1) \in \mathrm{Fix}P_{\kappa \le r'}$, and so we may apply (4.23) with $\mathbf {u} = T(y,1)$ and $\mathbf {v} = (x,1)$, obtaining

$$\begin{aligned} \langle T(y,1)-P_{\kappa \le r'}(x,1), (x,1)-P_{\kappa \le r'}(x,1) \rangle \le 0. \end{aligned}$$

(4.24)

Using the fact that $T(y,1) = (w,\mu )$ and $P_{\kappa \le r'}(x,1) = T(x,1) = (x,\lambda )$, (4.24) again yields (4.19), and we proceed as in Case 2(a).

This shows the desired result. $\square $

The following shorthand will simplify notation in the results that follow. Whenever $\mathrm{Fix}P_S \circ T \ne \emptyset $, we define

$$\begin{aligned} E: (\mathrm{Fix}P_S \circ T) \times S \rightarrow {{\mathbb {R}}}: \quad (\mathbf {x},\mathbf {y}) \mapsto \Vert T\mathbf {y}-\mathbf {y}\Vert ^2 - \Vert T\mathbf {x}-\mathbf {x}\Vert ^2. \end{aligned}$$

(4.25)

Our previous results admit the following important property.

Lemma 4.5

Whenever $\mathrm{Fix}P_S \circ T \ne \emptyset $, we have that

$$\begin{aligned} (\forall \mathbf {x} \in \mathrm{Fix}P_S \circ T)(\forall \mathbf {y} \in S)\quad E(\mathbf {x},\mathbf {y}) \ge 0, \end{aligned}$$

and equality holds if and only if $\mathbf {y} \in \mathrm{Fix}P_S \circ T$.

Proof

The inequality is an immediate consequence of Lemma 4.3(i). The fact that equality holds if and only if $\mathbf {y} \in \mathrm{Fix}P_S \circ T$ is an immediate consequence of Proposition 4.4. $\square $
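As a numerical illustration of Lemma 4.5 and Proposition 4.4, the following minimal sketch evaluates $E$ for a toy gauge. It assumes that $T$ is the polar proximal map $\mathbf{y} \mapsto \mathrm{argmin}_{\mathbf{u}} \max \{\kappa (\mathbf{u}), \Vert \mathbf{u}-\mathbf{y}\Vert \}$ (consistent with the way Lemma 3.3 is used above), and it takes $\kappa = 4\Vert \cdot \Vert _\infty $ on ${{\mathbb {R}}}\times {{\mathbb {R}}}$, a choice made here only for the example. The names `proj_level`, `T` and `E` are hypothetical helpers, not part of the paper.

```python
import math

SCALE = 4.0  # toy gauge kappa(z) = 4 * max(|z_1|, |z_2|); an assumption of this sketch

def proj_level(z, r):
    """Project z onto the level set {kappa <= r}, a box of half-width r / SCALE."""
    h = r / SCALE
    return (min(max(z[0], -h), h), min(max(z[1], -h), h))

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def T(z, iters=100):
    """Assumed polar proximal map: argmin_u max(kappa(u), ||u - z||).

    Bisects on r for the root of dist(z, {kappa <= r}) = r, then projects.
    """
    lo, hi = 0.0, math.hypot(z[0], z[1])
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if dist(z, proj_level(z, mid)) > mid:
            lo = mid
        else:
            hi = mid
    return proj_level(z, hi)

def E(x, y):
    """E(x, y) = ||T y - y||^2 - ||T x - x||^2, following (4.25)."""
    return dist(T(y), y) ** 2 - dist(T(x), x) ** 2

x_fix = (0.0, 1.0)  # T(0, 1) = (0, 1/5), so P_S T(0, 1) = (0, 1)
values = {a: E(x_fix, (a, 1.0)) for a in (-0.5, -0.2, 0.0, 0.1, 0.2, 0.5, 2.0)}
for a, e in values.items():
    print(f"y = ({a:+.1f}, 1):  E = {e:.6f}")
```

For this gauge the computed values of $E$ vanish (up to rounding) exactly for $|y| \le 1/5$ and are positive otherwise, which matches the characterization of $\mathrm{Fix}P_S \circ T$ as a level set of the displacement.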

4.2 Convergence

In this subsection, we will show that sequences admitted by ${\textbf {GP}}^\mathbf{4}{} {\textbf {A}}$ are globally convergent to a point in $\mathrm{Fix}P_S \circ T$, whenever the latter is nonempty. The key result, Theorem 4.8, uses the following auxiliary lemma.

Lemma 4.6

Let $\mathrm{Fix}P_S \circ T \ne \emptyset $. The following holds:

$$\begin{aligned} (\forall \mathbf {x} \in \mathrm{Fix}P_S \circ T) (\forall \mathbf {y} \in S)\quad \langle T\mathbf {x}-T\mathbf {y},\mathbf {y}-T\mathbf {y} \rangle \le 0. \end{aligned}$$

Proof

Let $\mathbf {x} \in \mathrm{Fix}P_S \circ T$ and $\mathbf {y} \in S$. By Lemma 3.3, we may consider two cases: when $T\mathbf {y} = P_{{\overline{\mathrm{dom}\kappa }}}\mathbf {y}$ and when $T\mathbf {y} = P_{\kappa \le r_2}\mathbf {y}$ where $r_2 = \Vert T\mathbf {y}-\mathbf {y}\Vert $.

Case 1 Let $T\mathbf {y} = P_{{\overline{\mathrm{dom}\kappa }}}\mathbf {y}$. Since $P_{{\overline{\mathrm{dom}\kappa }}}$ is FQNE, we have from Lemma 2.1 that

$$\begin{aligned} (\forall \mathbf {u} \in \mathrm{Fix}P_{{\overline{\mathrm{dom}\kappa }}})(\forall \mathbf {v} \in V)\quad \langle \mathbf {u}-P_{{\overline{\mathrm{dom}\kappa }}}\mathbf {v},\mathbf {v}-P_{{\overline{\mathrm{dom}\kappa }}}\mathbf {v} \rangle \le 0. \end{aligned}$$

(4.26)

In particular $T\mathbf {x} \in \mathrm{Fix}P_{{\overline{\mathrm{dom}\kappa }}}$, and so we can apply (4.26) with $\mathbf {u} = T\mathbf {x}$ and $\mathbf {v} = \mathbf {y}$, obtaining

$$\begin{aligned} \langle T\mathbf {x}-P_{{\overline{\mathrm{dom}\kappa }}}\mathbf {y},\mathbf {y}-P_{{\overline{\mathrm{dom}\kappa }}}\mathbf {y} \rangle \le 0. \end{aligned}$$

(4.27)

Finally, substituting in (4.27) using the fact that $P_{{\overline{\mathrm{dom}\kappa }}}\mathbf {y} = T\mathbf {y}$, we obtain

$$\begin{aligned} \langle T\mathbf {x}-T\mathbf {y},\mathbf {y}-T\mathbf {y} \rangle \le 0. \end{aligned}$$

This shows the result in Case 1.

Case 2 Let $T\mathbf {y} = P_{\kappa \le r_2}\mathbf {y}$ where $r_2 = \Vert T\mathbf {y}-\mathbf {y}\Vert $. As $P_{\kappa \le r_2}$ is FQNE, we have that

$$\begin{aligned} (\forall \mathbf {u} \in \mathrm{Fix}P_{\kappa \le r_2})(\forall \mathbf {v} \in V)\quad \langle \mathbf {u}-P_{\kappa \le r_2}\mathbf {v},\mathbf {v}-P_{\kappa \le r_2}\mathbf {v} \rangle \le 0. \end{aligned}$$

(4.28)

We have by Lemma 4.3 that $r_2 \ge r' = \Vert T\mathbf {x}-\mathbf {x}\Vert $. Combining this with the fact from [6, Theorem 4.4(ii)] that $\kappa (T\mathbf {x}) \le \Vert T\mathbf {x}-\mathbf {x}\Vert $, we have that $T\mathbf {x} \in \mathrm{lev}_{\le r'} \kappa \subset \mathrm{lev}_{\le r_2}\kappa $. Thus $T\mathbf {x} \in \mathrm{Fix}P_{\kappa \le r_2}$, and so we can apply (4.28) with $\mathbf {u}=T\mathbf {x}$ and $\mathbf {v}=\mathbf {y}$, obtaining

$$\begin{aligned} \langle T\mathbf {x}-P_{\kappa \le r_2}\mathbf {y},\mathbf {y}-P_{\kappa \le r_2}\mathbf {y} \rangle \le 0. \end{aligned}$$

(4.29)

Finally, since $P_{\kappa \le r_2}\mathbf {y} = T\mathbf {y}$, we may substitute in (4.29) to obtain

$$\begin{aligned} \langle T\mathbf {x}-T\mathbf {y}, \mathbf {y}-T\mathbf {y} \rangle \le 0. \end{aligned}$$

This concludes the result. $\square $

Fact 4.7

Since S is an affine subspace, it holds that

$$\begin{aligned} (\forall \mathbf {u} \in X\times {{\mathbb {R}}})(\forall \mathbf {v} \in S)\quad \Vert \mathbf {u}-\mathbf {v}\Vert ^2 = \Vert P_S(\mathbf {u})-\mathbf {v}\Vert ^2+\Vert \mathbf {u}-P_S(\mathbf {u})\Vert ^2. \end{aligned}$$

(4.30)

Proof

This follows immediately from the Pythagorean theorem. $\square $
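To make the identity concrete, here is a short worked instance. The coordinates are introduced only for this illustration: write $\mathbf {u} = (a,t)$ and $\mathbf {v} = (b,1)$ with $a,b \in X$ and $t \in {{\mathbb {R}}}$, and use that $P_S$ replaces the last coordinate by $1$, so that $P_S(\mathbf {u}) = (a,1)$. Then (4.30) reads

$$\begin{aligned} \Vert (a,t)-(b,1)\Vert ^2 = \Vert a-b\Vert ^2 + (t-1)^2 = \Vert (a,1)-(b,1)\Vert ^2 + \Vert (a,t)-(a,1)\Vert ^2. \end{aligned}$$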

Figure 2 illustrates the strategy of the following theorem, which brings together all of the different results we have established so far.

Theorem 4.8

Let $\mathrm{Fix}P_S \circ T \ne \emptyset $. Let $\mathbf {y}_0 \in S$. The operator $P_S \circ T: S \rightarrow S$ is SQNE, and so the sequence $(\mathbf {y}_n)_{n \in {{\mathbb {N}}}} \subset S$ given by

$$\begin{aligned} \mathbf {y}_{n+1}:=P_S \circ T \mathbf {y}_n \end{aligned}$$

is Fejér monotone with respect to $\mathrm{Fix}(P_{S} \circ T)$. More specifically,

$$\begin{aligned} (\forall \mathbf {x} \in \mathrm{Fix}P_S \circ T)(\forall \mathbf {y} \in S)\quad \Vert P_S \circ T\mathbf {y} -\mathbf {x}\Vert ^2 \le \Vert \mathbf {y}-\mathbf {x}\Vert ^2 -E(\mathbf {x},\mathbf {y}), \end{aligned}$$

where E is as in (4.25).

Proof

Let $\mathbf {x} \in \mathrm{Fix}P_S \circ T$ and $\mathbf {y} \in S$. There exists $\beta \in {{\mathbb {R}}}$ such that

$$\begin{aligned} T\mathbf {x}-\mathbf {y} = (1+\beta )(T\mathbf {y}-\mathbf {y})+\mathbf {u}\quad \text {for some}\; \mathbf {u} \in (\mathrm{span} \{T\mathbf {y}-\mathbf {y}\})^\perp . \end{aligned}$$

(4.31)

We will first show that $\beta \ge 0$. From Lemma 4.6 we have that

$$\begin{aligned} \langle T\mathbf {x}-T\mathbf {y},\mathbf {y}-T\mathbf {y} \rangle \le 0. \end{aligned}$$

(4.32)

Adding $\langle \mathbf {y}-T\mathbf {x},\mathbf {y}-T\mathbf {y} \rangle $ to both sides of (4.32) yields

$$\begin{aligned} \Vert T\mathbf {y}-\mathbf {y}\Vert ^2 \le \langle T\mathbf {x}-\mathbf {y},T\mathbf {y}-\mathbf {y} \rangle . \end{aligned}$$

(4.33)

Combining (4.31) and (4.33), we obtain

$$\begin{aligned} \Vert T\mathbf {y}-\mathbf {y}\Vert ^2&\le \langle (1+\beta )(T\mathbf {y}-\mathbf {y})+\mathbf {u},T\mathbf {y}-\mathbf {y} \rangle \nonumber \\&=(1+\beta )\Vert T\mathbf {y}-\mathbf {y}\Vert ^2. \end{aligned}$$

(4.34)

Now (4.34) implies that $\beta \ge 0$ or $\Vert T\mathbf {y}-\mathbf {y}\Vert = 0$. If $\Vert T\mathbf {y}-\mathbf {y}\Vert = 0$, then $T\mathbf {y}=\mathbf {y} \in S$ and so $P_S\circ T \mathbf {y} = \mathbf {y}$, and so $\mathbf {y} \in \mathrm{Fix}(P_S \circ T)$, in which case we are done. Thus we may restrict to considering the case when $\Vert T\mathbf {y}-\mathbf {y}\Vert > 0$. This, together with (4.34), yields

$$\begin{aligned} \beta \ge 0. \end{aligned}$$

Now applying the Pythagorean Theorem to (4.31) yields

$$\begin{aligned} \Vert \mathbf {u}\Vert ^2 = \Vert T\mathbf {x}-\mathbf {y}\Vert ^2- \Vert (1+\beta )(T\mathbf {y}-\mathbf {y})\Vert ^2. \end{aligned}$$

(4.35)

Moreover, we may rearrange (4.31) to obtain

$$\begin{aligned} T\mathbf {x}-T\mathbf {y} = \beta (T\mathbf {y}-\mathbf {y})+\mathbf {u}. \end{aligned}$$

(4.36)

Applying the Pythagorean Theorem to (4.36), we obtain

$$\begin{aligned} \Vert T\mathbf {x}-T\mathbf {y}\Vert ^2 = \Vert \beta (T\mathbf {y}-\mathbf {y})\Vert ^2 + \Vert \mathbf {u}\Vert ^2. \end{aligned}$$

(4.37)

Using (4.35) to substitute for $\Vert \mathbf {u}\Vert ^2$ in (4.37), we obtain

$$\begin{aligned} \Vert T\mathbf {x}-T\mathbf {y}\Vert ^2&= \Vert \beta (T\mathbf {y}-\mathbf {y})\Vert ^2 + \Vert T\mathbf {x}-\mathbf {y}\Vert ^2- \Vert (1+\beta )(T\mathbf {y}-\mathbf {y})\Vert ^2 \nonumber \\&= \Vert T\mathbf {x}-\mathbf {y}\Vert ^2-(1+2\beta )\Vert T\mathbf {y}-\mathbf {y}\Vert ^2. \end{aligned}$$

(4.38)

Now from Fact 4.7 we have that (4.30) holds for $\mathbf {u}=T\mathbf {x}$ and $\mathbf {v}=\mathbf {y}$, and so we have

$$\begin{aligned} \Vert T\mathbf {x}-\mathbf {y}\Vert ^2&= \Vert T\mathbf {x}-P_S\circ T(\mathbf {x})\Vert ^2 + \Vert \mathbf {y}-P_S\circ T(\mathbf {x})\Vert ^2\nonumber \\ \text {which is}\quad \Vert T\mathbf {x}-\mathbf {y}\Vert ^2&= \Vert T\mathbf {x}-\mathbf {x}\Vert ^2 + \Vert \mathbf {y}-\mathbf {x}\Vert ^2. \end{aligned}$$

(4.39)

Now we may use (4.39) to substitute for $\Vert T\mathbf {x}-\mathbf {y}\Vert ^2$ in (4.38) and obtain

$$\begin{aligned} \Vert T\mathbf {x}-T\mathbf {y}\Vert ^2 = \Vert T\mathbf {x}-\mathbf {x}\Vert ^2 + \Vert \mathbf {y}-\mathbf {x}\Vert ^2 -(1+2\beta )\Vert T\mathbf {y}-\mathbf {y}\Vert ^2. \end{aligned}$$

(4.40)

Now we have that

$$\begin{aligned} \Vert T\mathbf {y}-\mathbf {y}\Vert ^2 = \Vert T\mathbf {x}-\mathbf {x}\Vert ^2 + \Vert T\mathbf {y}-\mathbf {y}\Vert ^2-\Vert T\mathbf {x}-\mathbf {x}\Vert ^2 = \Vert T\mathbf {x}-\mathbf {x}\Vert ^2 + E(\mathbf {x},\mathbf {y}), \end{aligned}$$

(4.41)

where E is as defined in (4.25). Multiplying both sides of (4.41) by $-(1+2\beta )$, we obtain

$$\begin{aligned} -(1+2\beta )\Vert T\mathbf {y}-\mathbf {y}\Vert ^2 = -(1+2\beta )\Vert T\mathbf {x}-\mathbf {x}\Vert ^2 -(1+2\beta )E(\mathbf {x},\mathbf {y}). \end{aligned}$$

(4.42)

Using (4.42) to make the appropriate substitution for $-(1+2\beta )\Vert T\mathbf {y}-\mathbf {y}\Vert ^2$ in (4.40), we obtain

$$\begin{aligned} \Vert T\mathbf {x}-T\mathbf {y}\Vert ^2&= \Vert T\mathbf {x}-\mathbf {x}\Vert ^2 + \Vert \mathbf {y}-\mathbf {x}\Vert ^2 -(1+2\beta )\Vert T\mathbf {x}-\mathbf {x}\Vert ^2 -(1+2\beta )E(\mathbf {x},\mathbf {y})\nonumber \\&= \Vert \mathbf {y}-\mathbf {x}\Vert ^2 -2\beta \Vert T\mathbf {x}-\mathbf {x}\Vert ^2 -(1+2\beta )E(\mathbf {x},\mathbf {y}). \end{aligned}$$

(4.43)

Since S is closed and convex, $P_S$ is nonexpansive ([2, Proposition 4.16]). Thus we have that

$$\begin{aligned} \Vert P_S \circ T\mathbf {y} - P_S \circ T\mathbf {x}\Vert ^2 \le \Vert T\mathbf {y}-T\mathbf {x}\Vert ^2. \end{aligned}$$

(4.44)

Since $\mathbf {x} \in \mathrm{Fix}P_S \circ T$, we have that $P_S \circ T \mathbf {x} = \mathbf {x}$. Making this substitution in (4.44), we obtain

$$\begin{aligned} \Vert P_S \circ T\mathbf {y} -\mathbf {x} \Vert ^2 \le \Vert T\mathbf {y}-T\mathbf {x}\Vert ^2. \end{aligned}$$

(4.45)

Together, (4.43) and (4.45) yield

$$\begin{aligned} \Vert P_S \circ T\mathbf {y} -\mathbf {x} \Vert ^2&\le \Vert \mathbf {y}-\mathbf {x}\Vert ^2 -2\beta \Vert T\mathbf {x}-\mathbf {x}\Vert ^2 -(1+2\beta )E(\mathbf {x},\mathbf {y}) \end{aligned}$$

(4.46)

$$\begin{aligned}&\le \Vert \mathbf {y}-\mathbf {x}\Vert ^2-E(\mathbf {x},\mathbf {y}). \end{aligned}$$

(4.47)

where the second inequality uses the fact that $\beta \ge 0$ and $E(\mathbf {x},\mathbf {y}) \ge 0$. This shows the desired result. $\square $
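The inequality of Theorem 4.8 can be checked by hand on a gauge for which $T$ has a closed form. The sketch below assumes that $T$ is the polar proximal map $\mathbf{y} \mapsto \mathrm{argmin}_{\mathbf{u}} \max \{\kappa (\mathbf{u}), \Vert \mathbf{u}-\mathbf{y}\Vert \}$ and takes $\kappa = c\Vert \cdot \Vert _2$ on ${{\mathbb {R}}}\times {{\mathbb {R}}}$, for which the triangle inequality gives $T\mathbf{z} = \mathbf{z}/(1+c)$. The gauge, the constant `c`, and the helper `sqne_gap` are illustrative assumptions, not objects from the paper.

```python
def T(z, c):
    """Assumed polar proximal map for the toy gauge kappa = c * ||.||_2.

    argmin_u max(c * ||u||, ||u - z||) is attained at u = z / (1 + c).
    """
    return (z[0] / (1.0 + c), z[1] / (1.0 + c))

def P_S(z):
    """Projection onto S = R x {1}."""
    return (z[0], 1.0)

def sq(a, b):
    """Squared Euclidean distance."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

def sqne_gap(x, y, c):
    """Right side minus left side of the inequality in Theorem 4.8."""
    E = sq(T(y, c), y) - sq(T(x, c), x)
    return sq(y, x) - E - sq(P_S(T(y, c)), x)

x_fix = (0.0, 1.0)  # P_S T(0, 1) = (0, 1) for every c > 0
gaps = [sqne_gap(x_fix, (a, 1.0), c)
        for c in (0.5, 1.0, 4.0)
        for a in (-3.0, 0.0, 0.25, 2.0)]
print(min(gaps))
```

For this family the gap works out to $2c\,a^2/(1+c)^2$ at $\mathbf{y}=(a,1)$, so it is nonnegative and vanishes only at the fixed point.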

Theorem 4.8 admits the following corollary.

Corollary 4.9

(Averaged variant). Let $\mathrm{Fix}P_S \circ T \ne \emptyset $. The operator given by

$$\begin{aligned} \frac{1}{2} P_S \circ T + \frac{1}{2} \mathrm{Id}\end{aligned}$$

is FQNE.

Proof

By Theorem 4.8, we have that $P_S \circ T$ is QNE. By Lemma 2.1, an operator U is FQNE if and only if $2U-\mathrm{Id}$ is QNE. Letting

$$\begin{aligned} 2U-\mathrm{Id}= P_S \circ T, \end{aligned}$$

we have that U is FQNE and that $U=\frac{1}{2} P_S \circ T + \frac{1}{2} \mathrm{Id}$. $\square $

Now having the key results of Theorem 4.8, we are ready to show convergence for both ${\textbf {GP}}^\mathbf{4}{} {\textbf {A}}$ and its under-relaxed variants.

Theorem 4.10

(Convergence of under-relaxed variants of ${\textbf {GP}}^\mathbf{4}{} {\textbf {A}}$). Let $\mathrm{Fix}P_S \circ T \ne \emptyset $. Let $\gamma \in \left]0,1 \right[$ and $\mathbf {y}_0 \in S$. The sequence given by

$$\begin{aligned} \mathbf {y}_{n+1}:=&\;\mathcal {U}_\gamma \mathbf {y}_n,\\ \text {where}\quad \mathcal {U}_\gamma :=&\;(1-\gamma )P_S \circ T +\gamma \mathrm{Id}\end{aligned}$$

is strongly convergent to some $\mathbf {y} \in \mathrm{Fix}P_S \circ T$.

Proof

Notice that

$$\begin{aligned} \mathcal {U}_\gamma&=((1-\gamma )(P_S \circ T -\mathrm{Id})+\mathrm{Id}) \\&=\left( 1-\frac{2\gamma }{2}\right) (P_S \circ T-\mathrm{Id})+\mathrm{Id}\\&=(2-2\gamma )\left( \frac{1}{2}P_S\circ T - \frac{1}{2} \mathrm{Id}\right) +\mathrm{Id}\\&=(2-2\gamma )\left( \left( \frac{1}{2} P_S\circ T+\frac{1}{2} \mathrm{Id}\right) -\mathrm{Id}\right) + \mathrm{Id}\\&=(2-2\gamma )(U-\mathrm{Id})+\mathrm{Id}, \end{aligned}$$

where $2-2\gamma \in \left]0,2\right[$ (because $\gamma \in \left]0,1\right[$) and

$$\begin{aligned} U:=\frac{1}{2} P_S \circ T + \frac{1}{2} \mathrm{Id}\end{aligned}$$

is the FQNE operator from Corollary 4.9. Thus, applying Lemma 2.3 for the operator U, we have that $\mathcal {U}_\gamma $ is QNE and that

$$\begin{aligned} \Vert P_S \circ T\mathbf {y}_n - \mathbf {y}_n \Vert&= 2\left\| \frac{1}{2} P_S \circ T\mathbf {y}_n - \frac{1}{2} \mathbf {y}_n \right\| \nonumber \\&= 2\left\| \mathbf {y}_n-\left( \frac{1}{2} P_S \circ T + \frac{1}{2} \mathrm{Id}\right) \mathbf {y}_{n}\right\| \nonumber \\&=2\Vert \mathbf {y}_n-U(\mathbf {y}_n)\Vert \rightarrow 0 \quad \text {as}\quad n \rightarrow \infty . \end{aligned}$$

(4.48)

Since $(\mathbf {y}_n)_{n \in {{\mathbb {N}}}}$ is Fejér monotone, it is bounded. Since S is finite dimensional and $(\mathbf {y}_n)_{n \in {{\mathbb {N}}}} \subset S$ is bounded, we may take a convergent subsequence $(\mathbf {y}_j)_{j \in J \subset {{\mathbb {N}}}}$ such that

$$\begin{aligned} \Vert \mathbf {y}_{j} - \mathbf {y}\Vert \rightarrow 0 \quad \text {as}\; j \rightarrow \infty \end{aligned}$$

(4.49)

for some $\mathbf {y} \in S$. Since T is continuous and $P_S$ is continuous, $P_S\circ T$ is continuous. Thus we have

$$\begin{aligned} \underset{j \rightarrow \infty }{\lim }P_S \circ T(\mathbf {y}_{j}) = P_S \circ T(\underset{j \rightarrow \infty }{\lim }\mathbf {y}_{j}) = P_S \circ T (\mathbf {y}). \end{aligned}$$

(4.50)

The triangle inequality yields

$$\begin{aligned} \Vert \mathbf {y}-P_S\circ T(\mathbf {y})\Vert \le \Vert \mathbf {y}-\mathbf {y}_{j}\Vert + \Vert \mathbf {y}_{j}-P_S\circ T(\mathbf {y}_{j})\Vert +\Vert P_S\circ T(\mathbf {y}_{j})-P_S\circ T(\mathbf {y})\Vert . \end{aligned}$$

(4.51)

Taking the limit as $j \rightarrow \infty $, each of the terms on the right-hand side of (4.51) goes to zero by (4.49), (4.48), and (4.50) respectively. Thus $\Vert \mathbf {y}-P_S \circ T(\mathbf {y})\Vert = 0$, and so $\mathbf {y} = P_S \circ T(\mathbf {y})$. Thus we have that $\mathbf {y} \in \mathrm{Fix}P_S \circ T$.

Since $(\mathbf {y}_n)_{n \in {{\mathbb {N}}}}$ is Fejér monotone with respect to $\mathrm{Fix}P_S \circ T$ and possesses a sequential cluster point $\mathbf {y} \in \mathrm{Fix}P_S \circ T$, we conclude by Theorem 2.2 that $\mathbf {y}_n \rightarrow \mathbf {y}$ as $n \rightarrow \infty $. $\square $
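The under-relaxed iteration of Theorem 4.10 can be sketched on a toy problem as follows. The sketch assumes that $T$ is the polar proximal map $\mathbf{y} \mapsto \mathrm{argmin}_{\mathbf{u}} \max \{\kappa (\mathbf{u}), \Vert \mathbf{u}-\mathbf{y}\Vert \}$ and uses the gauge $\kappa = 4\Vert \cdot \Vert _\infty $ on ${{\mathbb {R}}}\times {{\mathbb {R}}}$, chosen here only because its level sets are boxes. The helper names and the step count are hypothetical.

```python
import math

SCALE = 4.0  # toy gauge kappa(z) = 4 * max(|z_1|, |z_2|); an assumption of this sketch

def proj_level(z, r):
    """Project z onto the level set {kappa <= r}, a box of half-width r / SCALE."""
    h = r / SCALE
    return (min(max(z[0], -h), h), min(max(z[1], -h), h))

def T(z, iters=100):
    """Assumed polar proximal map, found by bisection on dist(z, {kappa <= r}) = r."""
    lo, hi = 0.0, math.hypot(z[0], z[1])
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        p = proj_level(z, mid)
        if math.hypot(z[0] - p[0], z[1] - p[1]) > mid:
            lo = mid
        else:
            hi = mid
    return proj_level(z, hi)

def P_S(z):
    """Projection onto S = R x {1}."""
    return (z[0], 1.0)

def relaxed_step(y, gamma):
    """One step of U_gamma = (1 - gamma) * P_S o T + gamma * Id on S."""
    p = P_S(T(y))
    # Both p and y lie in S, so the last coordinate of the average is 1.
    return ((1.0 - gamma) * p[0] + gamma * y[0], 1.0)

def iterate(y0, gamma, n_steps=200):
    y = y0
    for _ in range(n_steps):
        y = relaxed_step(y, gamma)
    return y

for gamma in (0.25, 0.5, 0.75):
    y = iterate((2.0, 1.0), gamma)
    residual = abs(P_S(T(y))[0] - y[0])
    print(gamma, y, residual)
```

In this toy setting the fixed point set is $\{(y,1) \;|\; |y| \le 1/5\}$, and the iterates started at $(2,1)$ approach its boundary point $(1/5,1)$ for each tested $\gamma $.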

Having proven the convergence for the under-relaxed variants of ${\textbf {GP}}^\mathbf{4}{} {\textbf {A}}$, we now show the convergence of its non-relaxed version.

Theorem 4.11

(Convergence of ${\textbf {GP}}^\mathbf{4}{} {\textbf {A}}$). Let $\mathrm{Fix}P_S \circ T \ne \emptyset $. Let $\mathbf {y}_0 \in S$. The sequence $(\mathbf {y}_n)_{n \in {{\mathbb {N}}}}$ given by

$$\begin{aligned} \mathbf {y}_{n+1}:= P_S \circ T \mathbf {y}_n \end{aligned}$$

converges to a point $\mathbf {y} \in \mathrm{Fix}P_S \circ T$.

Proof

Fix $\mathbf {x} \in \mathrm{Fix}P_S \circ T$. Applying Theorem 4.8, we have that

$$\begin{aligned} \Vert \mathbf {y}_{n+1}-\mathbf {x}\Vert ^2 \le \Vert \mathbf {y}_n-\mathbf {x}\Vert ^2 - E(\mathbf {x},\mathbf {y}_n), \end{aligned}$$

and so we have that

$$\begin{aligned} 0 \le \Vert \mathbf {y}_{n+1}-\mathbf {x}\Vert ^2 \le \Vert \mathbf {y}_0-\mathbf {x}\Vert ^2 -\sum _{i=0}^n E(\mathbf {x},\mathbf {y}_i). \end{aligned}$$

Thus we obtain

$$\begin{aligned} \sum _{i=0}^n E(\mathbf {x},\mathbf {y}_i) \le \Vert \mathbf {y}_0-\mathbf {x}\Vert ^2, \end{aligned}$$

which shows that

$$\begin{aligned} E(\mathbf {x},\mathbf {y}_n) \rightarrow 0 \quad \text {as}\; n \rightarrow \infty . \end{aligned}$$

(4.52)

From the definition of E, (4.52) implies that

$$\begin{aligned} \Vert T\mathbf {y}_n - \mathbf {y}_n\Vert \downarrow \Vert T\mathbf {x}-\mathbf {x}\Vert \quad \text {as}\; n\rightarrow \infty . \end{aligned}$$

(4.53)

As $(\mathbf {y}_n)_{n \in {{\mathbb {N}}}}$ is Fejér monotone, it is bounded. Thus we may take a convergent subsequence $(\mathbf {y}_j)_{j \in J \subset {{\mathbb {N}}}}$. Therefore, let

$$\begin{aligned} \mathbf {y}_j \rightarrow \mathbf {y} \quad \text {as}\quad j \rightarrow \infty . \end{aligned}$$

Combining with (4.53), we obtain

$$\begin{aligned} \Vert T\mathbf {y}-\mathbf {y}\Vert = \underset{j \rightarrow \infty }{\lim }\Vert T\mathbf {y}_j - \mathbf {y}_j \Vert = \Vert T\mathbf {x}-\mathbf {x}\Vert , \end{aligned}$$

where the first equality follows from the continuity of $T-\mathrm{Id}$ and the second equality is from (4.53). Now since $\Vert T\mathbf {y}-\mathbf {y}\Vert = \Vert T\mathbf {x}-\mathbf {x}\Vert $ with $\mathbf {x} \in \mathrm{Fix}P_S \circ T$, we have by Proposition 4.4 that $\mathbf {y} \in \mathrm{Fix}P_S \circ T$. Since $(\mathbf {y}_n)_{n \in {{\mathbb {N}}}}$ is Fejér monotone with respect to $\mathrm{Fix}P_S \circ T$ and possesses a sequential cluster point $\mathbf {y} \in \mathrm{Fix}P_S \circ T$, we conclude by Theorem 2.2 that $\mathbf {y}_n \rightarrow \mathbf {y}$ as $n \rightarrow \infty $. $\square $
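The telescoping bound used in this proof can be observed numerically. The following sketch assumes that $T$ is the polar proximal map $\mathbf{y} \mapsto \mathrm{argmin}_{\mathbf{u}} \max \{\kappa (\mathbf{u}), \Vert \mathbf{u}-\mathbf{y}\Vert \}$ and uses the toy gauge $\kappa = \Vert \cdot \Vert _2$ on ${{\mathbb {R}}}\times {{\mathbb {R}}}$, for which $T\mathbf{z} = \mathbf{z}/2$. The gauge and the starting point are illustrative choices only.

```python
def T(z):
    """Assumed polar proximal map for the toy gauge kappa = ||.||_2: T(z) = z / 2."""
    return (z[0] / 2.0, z[1] / 2.0)

def P_S(z):
    """Projection onto S = R x {1}."""
    return (z[0], 1.0)

def sq(a, b):
    """Squared Euclidean distance."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

x_fix = (0.0, 1.0)        # the fixed point of P_S o T for this gauge
y = (3.0, 1.0)            # starting point y_0 in S
bound = sq(y, x_fix)      # ||y_0 - x||^2
partial_sum = 0.0
for n in range(40):
    partial_sum += sq(T(y), y) - sq(T(x_fix), x_fix)   # E(x, y_n)
    y = P_S(T(y))                                      # unrelaxed step
print(partial_sum, bound, y)
```

Here $E(\mathbf{x},\mathbf{y}_n) = 9/4^{n+1}$, so the partial sums stay below $\Vert \mathbf{y}_0-\mathbf{x}\Vert ^2 = 9$ and the iterates approach $(0,1)$.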

When $\mathrm{Fix}P_S \circ T \ne \emptyset $, Theorem 4.11 guarantees the convergence of ${\textbf {GP}}^\mathbf{4}{} {\textbf {A}}$, and Theorem 4.10 does the same for its under-relaxed variants. The next corollary simply formalizes this by including both cases.

Corollary 4.12

Let $\mathrm{Fix}P_S \circ T \ne \emptyset $. Let $\gamma \in [0,1[$ and $\mathbf {y}_0 \in S$. Then the sequence given by

$$\begin{aligned} \mathbf {y}_{n+1}:=&\mathcal {U}_\gamma \mathbf {y}_n \nonumber \\ \text {where}\quad \mathcal {U}_\gamma :=&(1-\gamma )P_S \circ T +\gamma \mathrm{Id} \end{aligned}$$

(4.54)

is convergent to some $\mathbf {y} \in \mathrm{Fix}P_S \circ T$, and the operator $\mathcal {U}_\gamma $ is paracontracting.

Proof

The convergence when $\gamma \in \left]0,1\right[$ is shown by Theorem 4.10, and the convergence when $\gamma = 0$ is dealt with by Theorem 4.11. These theorems also show that $\mathcal {U}_\gamma $ is QNE in these two cases respectively. Combining with the fact that $\mathcal {U}_\gamma $ is obviously a weighted average of continuous operators, the paracontracting property is clear. $\square $

4.3 Shadow sequence behaviour

Having established convergence for the governing sequence, we also have the following result that describes the behaviour of the sequence of shadows of the proximity operator: $T\circ (\mathcal {U}_\gamma )^{n-1}(y_0,1)$.

Corollary 4.13

Let $\mathrm{Fix}P_S \circ T \ne \emptyset $. Let $(y_0,1) \in S$ and $\gamma \in [0,1[$. The sequence $(\lambda _n)_{n \in {{\mathbb {N}}}} \subset {{\mathbb {R}}}$ given by

$$\begin{aligned} (y_{n+1},\lambda _{n+1}):=T\circ (\mathcal {U}_\gamma )^{n}(y_0,1), \end{aligned}$$

where $\mathcal {U}_\gamma $ is as specified in (4.54), satisfies

$$\begin{aligned} \lambda _n \rightarrow 1-r' \in \left[ 0,1\right] \quad \text {as}\; n \rightarrow \infty , \end{aligned}$$

where $r'$ is as specified in (4.16).

Proof

From Corollary 4.12 we have that

$$\begin{aligned} \mathbf {x}_n:=(y_n,1) \rightarrow \mathbf {x} \;\;\text {for some}\;\; \mathbf {x} \in \mathrm{Fix}P_S \circ T. \end{aligned}$$

Given that $\mathbf {x} =(y,1)$ for some $y \in X$, we clearly have

$$\begin{aligned} (y_n,1) \rightarrow (y,1). \end{aligned}$$

From Proposition 4.4, we have that

$$\begin{aligned} \Vert T(y,1)-(y,1)\Vert = r', \end{aligned}$$

(4.55)

where $r'$ is as characterized in (4.16). Using Lemma 4.2, we have that $T(y,1)=(y,\lambda )$ for some $\lambda \in \left[ 0,1\right] $. This yields

$$\begin{aligned} \Vert T(y,1)-(y,1)\Vert = \Vert (y,\lambda )-(y,1)\Vert = 1-\lambda . \end{aligned}$$

(4.56)

Combining (4.55) and (4.56) we have that $r' = 1-\lambda $, and so

$$\begin{aligned} \lambda = 1-r' \in \left[ 0,1\right] . \end{aligned}$$

Since $(y_n,1) \rightarrow (y,1)$ and T is continuous, we have that

$$\begin{aligned} (y_{n+1},\lambda _{n+1}) = T(y_n,1) \rightarrow T(y,1) = (y,\lambda ). \end{aligned}$$

Thus $\lambda _n \rightarrow \lambda = 1-r'$. This concludes the result. $\square $
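The behaviour of the shadow sequence can be illustrated on a toy gauge. The sketch below assumes that $T$ is the polar proximal map $\mathbf{y} \mapsto \mathrm{argmin}_{\mathbf{u}} \max \{\kappa (\mathbf{u}), \Vert \mathbf{u}-\mathbf{y}\Vert \}$ and takes $\kappa = 4\Vert \cdot \Vert _\infty $ on ${{\mathbb {R}}}\times {{\mathbb {R}}}$, a choice made only for this example. The function `shadows` and its parameters are hypothetical.

```python
import math

SCALE = 4.0  # toy gauge kappa(z) = 4 * max(|z_1|, |z_2|); an assumption of this sketch

def proj_level(z, r):
    """Project z onto the level set {kappa <= r}, a box of half-width r / SCALE."""
    h = r / SCALE
    return (min(max(z[0], -h), h), min(max(z[1], -h), h))

def T(z, iters=100):
    """Assumed polar proximal map, found by bisection on dist(z, {kappa <= r}) = r."""
    lo, hi = 0.0, math.hypot(z[0], z[1])
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        p = proj_level(z, mid)
        if math.hypot(z[0] - p[0], z[1] - p[1]) > mid:
            lo = mid
        else:
            hi = mid
    return proj_level(z, hi)

def shadows(y0, gamma, n_steps=60):
    """Last coordinates lambda_n of T applied along the iterates of U_gamma."""
    y, lams = y0, []
    for _ in range(n_steps):
        t = T(y)
        lams.append(t[1])
        # U_gamma y = (1 - gamma) * P_S(T y) + gamma * y, with P_S(t) = (t[0], 1).
        y = ((1.0 - gamma) * t[0] + gamma * y[0], 1.0)
    return lams

x_fix = (0.0, 1.0)                      # a fixed point of P_S o T for this gauge
t_fix = T(x_fix)
r_prime = math.hypot(t_fix[0] - x_fix[0], t_fix[1] - x_fix[1])
lams = shadows((2.0, 1.0), gamma=0.0)
print(lams[:4], lams[-1], 1.0 - r_prime)
```

For this gauge $r' = 4/5$, and the computed $\lambda _n$ settle at $1-r' = 1/5$, in line with the corollary.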

4.4 The operator $P_S \circ T$ is not, generically, FQNE

The property of firm quasinonexpansivity is especially important in the analysis of algorithms. In this section, we discuss the conditions under which the operator $P_S \circ T$ may or may not exhibit this property. In particular, we provide an example illustrating that it is not, generically, FQNE. First we show, in Proposition 4.14, that failure to be an FQNE operator implies some specific conditions.

Proposition 4.14

Let $(y,1) \in S$, and let $(x,1) \in \mathrm{Fix}P_S \circ T$. Let $(x,\lambda ) = T(x,1)$ and $(w,\mu )=T(y,1)$. If

$$\begin{aligned} 0 < \langle (x,1)-P_S \circ T(y,1),(y,1)-P_S \circ T(y,1) \rangle , \end{aligned}$$

(4.57)

then the following hold:

(i)
$\lambda< \mu < 1$;
(ii)
$T(x,1) \ne P_{{\overline{\mathrm{dom}\kappa }}}(x,1)$.

Proof

Suppose that (4.57) holds. Then we have that

$$\begin{aligned} 0&< \langle (x,1)-P_S \circ T(y,1),(y,1)-P_S \circ T(y,1) \rangle \end{aligned}$$

(4.58)

$$\begin{aligned}&=\langle (x,1) - (w,1), (y,1)-(w,1) \rangle \nonumber \\&= \langle (x-w,0),(y-w,0) \rangle \nonumber \\&= \langle x-w,y-w \rangle . \end{aligned}$$

(4.59)

We will first show (i).

From Lemma 4.6 we have that

$$\begin{aligned} \langle T(x,1)-T(y,1), (y,1)-T(y,1) \rangle \le 0. \end{aligned}$$

(4.60)

Let $(w,\mu ) = T(y,1)$ and $(x,\lambda )=T(x,1)$ for $\lambda \le 1$. Then (4.60) becomes

$$\begin{aligned} \langle (x,\lambda )-(w,\mu ),(y,1)-(w,\mu ) \rangle&\le 0\nonumber \\ \langle (x-w,\lambda -\mu ), (y-w,1-\mu ) \rangle&\le 0\nonumber \\ \langle x-w,y-w \rangle + (\lambda -\mu )(1-\mu )&\le 0. \end{aligned}$$

(4.61)

By Lemma 4.2(i), $\lambda \in \left[ 0,1\right] $ and so there are only four possibilities: when $\lambda =1$, when $\mu \le \lambda <1$, when $\lambda < 1 \le \mu $, and when $\lambda< \mu < 1$. We will show that any case other than $\lambda< \mu < 1$ implies a contradiction.

Case $\lambda =1$ Suppose $\lambda =1$. Then $(\lambda -\mu )(1-\mu ) = (1-\mu )^2 \ge 0$.

Combining this fact with (4.61), we obtain

$$\begin{aligned} \langle x-w,y-w \rangle \le 0, \end{aligned}$$

(4.62)

which contradicts (4.59). This concludes the case $\lambda =1$.

Case $\mu \le \lambda < 1$ Suppose $\mu \le \lambda < 1$. We have that $(\lambda -\mu ) \ge 0$ and $(1-\mu )\ge 0$, and so clearly $(\lambda -\mu )(1-\mu ) \ge 0$. Combining this with (4.61), we again obtain (4.62), which is a contradiction. This concludes the case when $\mu \le \lambda < 1$.

Case $\lambda < 1 \le \mu $ Suppose $\lambda < 1 \le \mu $. Then $(\lambda -\mu ) \le 0$ and $(1-\mu ) \le 0$, and so $(\lambda -\mu )(1-\mu ) \ge 0$. Combining this with (4.61), we again obtain (4.62), a contradiction. This concludes the case when $\lambda < 1 \le \mu $.

We are left with only one possibility, $\lambda< \mu < 1$, and so (i) holds.

We next show (ii). We have established that $\lambda < \mu $ and $\lambda < 1$. If $T(x,1) = P_{{\overline{\mathrm{dom}\kappa }}}(x,1)$ held, then Lemma 4.3(ii)c would give $\mu \le \lambda $, a contradiction. Thus $T(x,1) \ne P_{{\overline{\mathrm{dom}\kappa }}}(x,1)$, which is (ii). $\square $

Proposition 4.14 shows that any example of a gauge for which $P_S \circ T$ fails to be firmly quasi-nonexpansive must satisfy both (i) and (ii). Now we will see an example that satisfies both of these properties and serves as a counter-example to the tempting idea that $P_S \circ T$ is generically FQNE.

Example 1

($P_S \circ T$ is generically not a cutter). Let

$$\begin{aligned} \kappa : {{\mathbb {R}}}^1 \times {{\mathbb {R}}}\rightarrow {{\mathbb {R}}}: \mathbf {x} \mapsto 4\Vert \mathbf {x} \Vert _\infty + \iota _C \mathbf {x}, \end{aligned}$$

where $C:= \{(v,u)\;|\; v\le -u/4 \}$. Then

$$\begin{aligned} T_\gamma (1,-1/20) = P_{\kappa \le 1/5}(1,-1/20) = (1/5,-1/20), \end{aligned}$$

and so $(1,-1/20) \in \mathrm{Fix}P_S \circ T_\gamma $. Additionally,

$$\begin{aligned} T_\gamma (2,1) = P_{{\overline{\mathrm{dom}\kappa }}}(2,1) = (-2/17,8/17), \end{aligned}$$

and so $(2,1) \notin \mathrm{Fix}P_S \circ T_{\gamma }$. We have that

$$\begin{aligned} \langle P_S \circ T_{\gamma }(2,1) - (-1/20,1) , (-1/20,1) - (2,1) \rangle = (-2/17+1/20)(-1/20-2) > 0. \end{aligned}$$

This example is illustrated in Fig. 3.

5 Fundamental set and existence of fixed points

In the previous section, we consistently assumed that $\mathrm{Fix}P_S \circ T \ne \emptyset $. This condition may fail for a general gauge, although it does hold for ${\textbf {P}}^\mathbf{4}{} {\textbf {A}}$ under the conditions in Theorem 4.1. We will provide sufficient conditions for the more general ${\textbf {GP}}^\mathbf{4}{} {\textbf {A}}$, which allow us to describe the solutions to ${\textbf {P}}^\mathbf{4}{} {\textbf {A}}$ as lying on an exposed face of a dilated fundamental set. For this purpose, we make use of the Minkowski function representation of the gauge:

$$\begin{aligned} \kappa = \gamma _D :X \times {{\mathbb {R}}}\rightarrow {{\mathbb {R}}}: \mathbf {x}\mapsto \inf \{\mu \ge 0 \;|\; \mathbf {x} \in \mu D \}. \end{aligned}$$

(5.1)

Such a representation always holds by choosing $D = \{\mathbf {x} \in X\times {{\mathbb {R}}}\;|\; \kappa (\mathbf {x}) \le 1\}$ [6].
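To make the representation (5.1) concrete, here is a minimal numerical sketch that evaluates $\gamma _D$ by bisection from a membership test for $D$. It assumes that $D$ is closed and convex and contains the origin, so that membership of $\mathbf {x}/\mu $ in $D$ is monotone in $\mu $. The names `minkowski_gauge` and `in_l1_ball`, the tolerance, and the cap `mu_max` are hypothetical choices, and the $\ell _1$ unit ball is used only as a test set for which the gauge is known to be the $\ell _1$ norm.

```python
import math


def minkowski_gauge(x, in_D, tol=1e-10, mu_max=1e12):
    """Approximate inf{mu >= 0 : x in mu*D} by bisection.

    Assumes D is closed, convex and contains the origin, so that
    x/mu in D implies x/mu2 in D for every mu2 >= mu.
    `in_D` is a user-supplied membership test for D.
    """
    if all(c == 0 for c in x):
        return 0.0                       # the origin lies in 0*D
    hi = 1.0
    while not in_D([c / hi for c in x]):
        hi *= 2.0                        # grow until x lies in hi*D
        if hi > mu_max:
            return math.inf              # treat x as outside the domain
    lo = 0.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if in_D([c / mid for c in x]):
            hi = mid                     # x in mid*D: the infimum is at most mid
        else:
            lo = mid                     # x not in mid*D: the infimum is at least mid
    return hi


def in_l1_ball(d):
    """Membership test for the unit ball of the l1 norm (a test set only)."""
    return sum(abs(c) for c in d) <= 1.0


gauge_value = minkowski_gauge([3.0, -4.0], in_l1_ball)
```

For this test set the routine should return approximately $|3|+|-4| = 7$, up to the bisection tolerance.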

The following lemma will be instrumental to our main result, Theorem 5.2.

Lemma 5.1

Let $D= \mathrm{lev}_{\kappa \le 1}$. The following hold.

(i)
$\mathrm{cone}D = \mathrm{dom}\kappa $.
(ii)
If there exists $\lambda ' > 0$ such that
$$\begin{aligned} \lambda ' = \max _{\lambda \in {{\mathbb {R}}}}\{\lambda \;\;|\;\; \exists y' \in X \;\text {so that}\;(y',\lambda ) \in D \}, \end{aligned}$$
then for any $(y,\beta ) \in \mathrm{cone}D \cap (X\times {{\mathbb {R}}}_{>0})$ there exists a minimal $0<r<\infty $ so that $(y,\beta ) = r \mathbf {d}$ for some $\mathbf {d} \in D$.

Proof

(i): Let $\mathbf {x} \in \mathrm{cone}D$. Then there exist $0 \le \lambda <\infty $ and $\mathbf {d} \in D$ such that $\mathbf {x} = \lambda \mathbf {d}$. By positive homogeneity, $\kappa (\lambda \mathbf {d})= \lambda \kappa (\mathbf {d}) \le \lambda < \infty $, and so $\mathbf {x} \in \mathrm{dom}\kappa $. Thus $\mathrm{cone}D \subset \mathrm{dom}\kappa $. Now let $\mathbf {x} \in \mathrm{dom}\kappa $ and set $r:=\kappa (\mathbf {x})<\infty $. If $r=0$, then $\kappa (\mathbf {x}) \le 1$ and so $\mathbf {x} \in D \subset \mathrm{cone}D$. If $r>0$, then by homogeneity $\kappa (\mathbf {x}/r) = 1$, so $\mathbf {x}/r \in D$ and $\mathbf {x}= r(\mathbf {x}/r) \in \mathrm{cone}D$. Thus $\mathrm{dom}\kappa \subset \mathrm{cone}D$.

(ii): Let $(y,\beta ) \in \mathrm{cone}D \cap (X\times {{\mathbb {R}}}_{>0})$. First of all, notice that by (i), $(y,\beta ) \in \mathrm{dom}\kappa $ and so

$$\begin{aligned} r:=\inf \{\lambda \;|\; (y,\beta ) \in \lambda D \}< \infty . \end{aligned}$$

Next we show $r>0$. Since $(y,\beta ) \in \mathrm{cone}D$, there exists some $\lambda \ge 0, (d_y,\mu )\in D$ such that $\lambda (d_y,\mu ) = (y,\beta )$. Now any such $\lambda $ that satisfies this equality clearly satisfies $\lambda \mu = \beta $ with $\beta >0$, and so all three constants are greater than zero. Moreover, any such constant $\lambda $ that satisfies this equality satisfies $\lambda = \beta /\mu \ge \beta / \lambda ' > 0$. Thus we have that

$$\begin{aligned} r=\inf \{\lambda \;|\; (y,\beta ) \in \lambda D \} \ge \beta /\lambda ' > 0. \end{aligned}$$

Now let $(\lambda _n)_n$ satisfy $\lambda _n \downarrow r$ as $n \rightarrow \infty $. Since $\lambda _n >r$ for all n, $(y,\beta ) \in \lambda _n D$ for all n and so there exists $(\mathbf {d}_n)_n$ such that $(y,\beta )=\lambda _n \mathbf {d}_n$ for all n. Notice that

$$\begin{aligned} \Vert \mathbf {d}_n\Vert = \frac{\Vert (y,\beta )\Vert }{\lambda _n} \le \frac{\Vert (y,\beta )\Vert }{r}< \infty , \end{aligned}$$

and so the sequence $(\mathbf {d}_n)_n$ is bounded. Thus we can pass to a convergent subsequence if need be and have $\mathbf {d}_n \rightarrow \mathbf {d}$ as $n \rightarrow \infty $. Since D is closed and $\mathbf {d}_n \in D$ for all n, we have $\mathbf {d} \in D$. Taking the limit of both sides of

$$\begin{aligned} (y,\beta )= \lambda _n \mathbf {d}_n, \end{aligned}$$

as $n \rightarrow \infty $, we have $(y,\beta ) = r\mathbf {d}$ with $\mathbf {d} \in D$, and so the infimum r of all values $\lambda $ with $(y,\beta ) \in \lambda D$ is attained. This shows the desired result. $\square $

The following theorem provides conditions that guarantee nonemptiness of the fixed point set. The strategy is to relate an exposed face of the fundamental set D to the fixed points of the algorithm.

Theorem 5.2

(Existence of fixed points of ${\textbf {GP}}^\mathbf{4}{} {\textbf {A}}$). Let D be the (closed) fundamental set of $\kappa $ as in (5.1). The following hold.

(i)
If there exists $\lambda ' \ge 0$ such that
$$\begin{aligned} \lambda ' = \max _{\lambda \in {{\mathbb {R}}}}\{\lambda \;\;|\;\; \exists y \in X \text {so that}\;(y,\lambda ) \in D \}, \end{aligned}$$
then $F=D \cap \{(y,\lambda ')\; |\; y \in X\}$ is an exposed face of D and
(a)
$\left( \frac{1}{1+\lambda '} \right) F = T(\mathrm{Fix}P_S \circ T);$ and
(b)
Any $(x,1) \in \mathrm{Fix}P_S \circ T$ satisfies $T(x,1)=\left( x,\frac{\lambda '}{1+\lambda '}\right) $.
For example, this is always the case when D is bounded.
(ii)
If such a $\lambda '$ does not exist and there exists a sequence $(y_n,\lambda _n)_{n \in {{\mathbb {N}}}}$ in D such that $\lambda _n\rightarrow \infty $ and $\lambda _n/\Vert y_n\Vert \rightarrow m>0$, then $T(\mathrm{Fix}P_S \circ T) = \mathrm{Fix}P_S \circ T = \mathrm{zer}\kappa \cap S$.

Proof

(i) Suppose $\lambda '$ exists as described. To understand why F is an exposed face of D, see Remark 1 below.

Case 1: $\lambda '=0$. Then any point in F is of the form (x, 0) for some $x \in X$. Moreover, (0, 1) is a fixed point and satisfies $\Vert T(0,1)-(0,0)\Vert =1$. It is then a straightforward consequence of Proposition 4.4 that $T(x,1)=(x,0)$ if and only if (x, 1) is a fixed point of $P_S\circ T$. This is all we needed to show in this case.

Case 2: $\lambda '>0$. Since $\lambda '>0$, the set $\left( \cup _{r\ge 0}rD\right) \cap (X\times \{\theta \})$ is nonempty for each $\theta >0$. For any $(x,\theta ) \in \mathrm{cone}D$ with $\theta >0$ we have from Lemma 5.1(ii) that there exists a minimal $r_{(x,\theta )}>0$ such that $(x,\theta ) \in r_{(x,\theta )} D.$ By the Minkowski definition of the gauge, this means

$$\begin{aligned} \kappa (x,\theta )=r_{(x,\theta )}. \end{aligned}$$

There must also exist $(d_x,\mu _\theta ) \in D$ such that $r_{(x,\theta )}(d_x,\mu _\theta ) = (x,\theta )$. Thus $r_{(x,\theta )}\mu _\theta =\theta $ and $r_{(x,\theta )} =\theta /\mu _\theta $. Furthermore, our choice of $\lambda '$ guarantees that $\mu _\theta \le \lambda '$. Using positive homogeneity we have

$$\begin{aligned} r_{(x,\theta )} = \kappa (x,\theta ) = \kappa (r_{(x,\theta )}(d_x,\mu _\theta )) = r_{(x,\theta )}\kappa (d_x,\mu _\theta )\quad \text {and so}\quad \kappa (d_x,\mu _\theta )=1. \end{aligned}$$

Again using positive homogeneity, we obtain

$$\begin{aligned} \kappa (x,\theta ) = \kappa ((\theta /\mu _\theta )(d_x,\mu _\theta ))=(\theta /\mu _\theta )\kappa ((d_x,\mu _\theta )) =\theta /\mu _\theta , \end{aligned}$$

(5.2)

where the final equality is because we just showed $\kappa (d_x,\mu _\theta )=1$.

Additionally, for any point $(y',\lambda ') \in F$, homogeneity assures that

$$\begin{aligned} \theta /\lambda ' \ge (\theta /\lambda ') \kappa (y',\lambda ')=\kappa \left( (\theta /\lambda ') (y',\lambda ')\right) = \kappa ((p_\theta y',\theta )) \quad \text {where}\quad p_\theta :=\theta /\lambda '. \end{aligned}$$

(5.3)

Now let

$$\begin{aligned} \theta ':= \underset{\theta \in {{\mathbb {R}}}}{\mathrm{argmin}}\max \{\theta /\lambda ',|\theta -1| \}=\frac{\lambda '}{1+\lambda '}. \end{aligned}$$

(5.4)
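For the reader's convenience, here is a short verification of the closed form in (5.4), using only $\lambda '>0$. The two branches cross at the stated point:

$$\begin{aligned} \frac{\theta '}{\lambda '} = 1-\theta ' \iff \theta ' = \frac{\lambda '}{1+\lambda '}, \qquad \text {with common value}\quad \frac{\theta '}{\lambda '} = |\theta '-1| = \frac{1}{1+\lambda '}. \end{aligned}$$

If $\theta < \theta '$, then $|\theta -1| \ge 1-\theta > 1-\theta ' = \frac{1}{1+\lambda '}$, while if $\theta > \theta '$, then $\theta /\lambda ' > \theta '/\lambda ' = \frac{1}{1+\lambda '}$. In either case $\max \{\theta /\lambda ',|\theta -1|\} > \frac{1}{1+\lambda '}$, so $\theta '$ is the unique minimizer.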

Now we show that $(p_{\theta '} y',1)$ is in $\mathrm{Fix}P_S \circ T$. Remember that $\kappa _1$ is the polar envelope of $\kappa $ from Definition 3. We have the following.

$$\begin{aligned} \kappa _1((p_{\theta '} y',1))&=\underset{(x,\theta )\in X\times {{\mathbb {R}}}}{\inf }\max \{\kappa (x,\theta ),\Vert (x,\theta )-(p_{\theta '} y',1)\Vert \} \nonumber \\&= \underset{(x,\theta ) \in \mathrm{cone}D}{\inf }\max \{\kappa (x,\theta ),\Vert (x,\theta )-(p_{\theta '} y',1)\Vert \} \end{aligned}$$

(5.5a)

$$\begin{aligned}&= \underset{(x,\theta ) \in \mathrm{cone}D}{\inf }\max \{\theta /\mu _\theta ,\Vert (x,\theta )-(p_{\theta '} y',1)\Vert \} \end{aligned}$$

(5.5b)

$$\begin{aligned}&\ge \underset{\theta \in {{\mathbb {R}}}}{\min }\max \{\theta /\lambda ',|\theta -1| \} \end{aligned}$$

(5.5c)

$$\begin{aligned}&=\max \{\theta '/\lambda ',|\theta '-1| \} \end{aligned}$$

(5.5d)

$$\begin{aligned}&\ge \max \{\kappa (p_{\theta '} y',\theta '),\Vert (p_{\theta '} y',\theta ')-(p_{\theta '} y',1)\Vert \}. \end{aligned}$$

(5.5e)

Here (5.5a) is true by Lemma 5.1(i), (5.5b) holds by (5.2), (5.5c) holds because $\mu _{\theta }\le \lambda '$, (5.5d) holds by (5.4), and (5.5e) is obtained by applying (5.3) with $\theta = \theta '$. Altogether (5.5) shows that $(p_{\theta '}y',\theta ')=T(p_{\theta '}y',1)$, and so $(p_{\theta '}y',1)\in \mathrm{Fix}P_S \circ T$.

Notice that $\theta ' \in \left]0,1\right[$ and is nearer to 1 for larger $\lambda '$ and nearer to 0 for smaller $\lambda '$, exactly as we would expect. Notice also that we have shown that any point

$$\begin{aligned} (\theta '/\lambda ')(y',\lambda ') = \frac{1}{1+\lambda '}(y',\lambda ')\in \frac{1}{1+\lambda '}F, \end{aligned}$$

admits a corresponding point $(p_{\theta '}y',1) \in \mathrm{Fix}P_S \circ T$ whose proximal image is

$$\begin{aligned} T(p_{\theta '}y',1) = (p_{\theta '}y',\theta ')=(\theta '/\lambda ')(y',\lambda '). \end{aligned}$$

This shows that

$$\begin{aligned} \frac{1}{1+\lambda '}F \subset T\left( \mathrm{Fix}P_S \circ T\right) . \end{aligned}$$

Now let $(x,1) \in \mathrm{Fix}P_S \circ T$. By Proposition 4.4, $T(x,1)=(x,\lambda )$ for some $\lambda \in \left[ 0,1\right] $, and

$$\begin{aligned} \Vert (x,1) - (x,\lambda )\Vert = \Vert (p_{\theta '} y',\theta ')-(p_{\theta '} y',1)\Vert =r' = |1-\theta '|, \end{aligned}$$

which forces $\lambda = \theta '$. This shows that (i)b is true.

By Lemma 5.1(i), $(x,\theta ') \in \mathrm{cone}D$, since $(x,\theta ') \in \mathrm{dom}\kappa $. Following the same reasoning as we used above to obtain (5.2), there exist a value $r_{(x,\theta ')}$ and a point $(d_x,\mu _{\theta '}) \in D$ with $\mu _{\theta '} \le \lambda '$ such that $r_{(x,{\theta '})}(d_x,\mu _{\theta '}) = (x,{\theta '})$ and $\kappa (x,{\theta '})=({\theta '}/\mu _{\theta '})$. Since $(x,1),(p_{\theta '}y',1) \in \mathrm{Fix}P_S \circ T$, we have from Proposition 4.4 that

$$\begin{aligned} |{\theta '}-1|=\Vert (x,{\theta '})-(x,1) \Vert = \Vert (p_{\theta '} y',\theta ')-(p_{\theta '} y',1)\Vert =r'. \end{aligned}$$

Moreover, ${\theta '}/\mu _{\theta '}=\kappa (x,{\theta '}) \le \Vert (x,{\theta '})-(x,1)\Vert =r'$, and so

$$\begin{aligned} r'&=\max \{{\theta '}/\mu _{\theta '},|1-{\theta '}|\}, \nonumber \\&\ge \underset{(\theta ,\mu ) \in {{\mathbb {R}}}_{+} \times ]0,\lambda ']}{\min }\max \{\theta /\mu ,|1-\theta |\}, \nonumber \\&=\max \{\theta '/\lambda ',|1-\theta '|\},\nonumber \\&=r'. \end{aligned}$$

(5.6)

The equality throughout (5.6) forces $\mu _{\theta '} = \lambda '$. Finally,

$$\begin{aligned} (d_x,\mu _{\theta '})&= (d_x,\lambda ') \in F,\\ \text {and so}\quad T(x,1)&= (x,\theta ') = (\theta '/\lambda ')(d_x,\lambda ') \in (\theta '/\lambda ')F = \frac{1}{1+\lambda '}F. \end{aligned}$$

This shows that

$$\begin{aligned} \frac{1}{1+\lambda '}F \supset T\left( \mathrm{Fix}P_S \circ T\right) . \end{aligned}$$

This concludes the proof of (i)a.

(ii): Let the sequence $(y_n,\lambda _n)_{n \in {{\mathbb {N}}}}$ exist as described. By compactness of the unit ball in Euclidean space and by appealing to a subsequence if necessary, the sequence $y_n/\Vert y_n\Vert $ converges to some y in the unit ball in X. Now since $(y_n,\lambda _n) \in D$ for all n,

$$\begin{aligned} \kappa (y_n,\lambda _n) \le 1\quad (\forall n), \end{aligned}$$

and by the Minkowski function representation of $\kappa $,

$$\begin{aligned} \kappa \left( \frac{y_n}{m\Vert y_n\Vert },\frac{\lambda _n}{m\Vert y_n\Vert }\right) = \frac{1}{m\Vert y_n\Vert }\kappa \left( y_n,\lambda _n\right) . \end{aligned}$$

Taking the limits of both sides as $n\rightarrow \infty $ and using the lower semicontinuity of $\kappa $, we obtain

$$\begin{aligned} \kappa \left( \frac{y}{m},1\right) = 0. \end{aligned}$$

The point $\left( \frac{y}{m},1\right) \in \mathrm{zer}\kappa \cap S$ is clearly a fixed point of T since

$$\begin{aligned} \max \left\{ \kappa \left( \frac{y}{m},1\right) ,\left\| \left( \frac{y}{m},1\right) -\left( \frac{y}{m},1\right) \right\| \right\} = 0. \end{aligned}$$

Thereafter appealing to Lemma 3.1, Proposition 4.4, and the fact that $\kappa (T(x,1)) \le \Vert (x,1)-T(x,1)\Vert =r'=0$ for any $(x,1) \in \mathrm{Fix}P_S \circ T$, the result (ii) is clear. $\square $

Remark 1

(What do we mean by an exposed face?). Let us explain what we mean in Theorem 5.2 when we say that F is an exposed face of D. F is an exposed face of a closed, convex set D if there exists a supporting hyperplane H to D with $F = D \cap H$. In our case, $H = X \times \{\lambda ' \}$. The hyperplane H is a supporting hyperplane because D lies entirely in the affine half space $X \times {{\mathbb {R}}}_{\le \lambda '}$ defined by H.

The following example showcases a situation when $\mathrm{Fix}P_S \circ T$ may be empty. In so-doing, it illustrates the importance of the condition $m>0$ in Theorem 5.2(ii).

Example 2

Let $D= \{(y,\lambda )\; |\; y \ge \lambda ^2 \} \subset {{\mathbb {R}}}^2$. Then for any $(y,1) \in S$, $T(y,1)=(u,1)$ with $u>y$, and so $\mathrm{Fix}P_S \circ T = \emptyset $.
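To see which hypothesis of Theorem 5.2 fails in Example 2, note first that $(\lambda ^2,\lambda ) \in D$ for every $\lambda \in {{\mathbb {R}}}$, so no maximal $\lambda '$ exists. Moreover, any sequence $(y_n,\lambda _n)_{n \in {{\mathbb {N}}}}$ in D with $\lambda _n \rightarrow \infty $ has $y_n \ge \lambda _n^2 > 0$ for large n, and therefore

$$\begin{aligned} 0 \le \frac{\lambda _n}{|y_n|} \le \frac{\lambda _n}{\lambda _n^2} = \frac{1}{\lambda _n} \rightarrow 0 \quad \text {as}\quad n \rightarrow \infty . \end{aligned}$$

Hence every admissible sequence has limiting ratio $m=0$, and neither (i) nor (ii) of Theorem 5.2 applies.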

5.1 Fixed points of ${\textbf {P}}^\mathbf{4}{} {\textbf {A}}$: facial characterization

When we take the results of Theorem 5.2 and specify from $\kappa $ back to the perspective transform $f^\pi $, we recover the following characterization of the fixed points of ${\textbf {P}}^\mathbf{4}{} {\textbf {A}}$.

Theorem 5.3

(Facial characterization of fixed points of ${\textbf {P}}^\mathbf{4}{} {\textbf {A}}$). Let $f:X \rightarrow {{\mathbb {R}}}_+ \cup \{+\infty \}$ be a proper closed nonnegative convex function with $\inf f>0$ and $\mathrm{argmin}f \ne \emptyset $. Let $\kappa = f^\pi $. The following hold:

(i)
$\lambda ' = \frac{1}{\min _{u \in X}f(u)},$ where $\lambda '$ is as in Theorem 5.2(i);
(ii)
Where F is as in Theorem 5.2(i), $F=\frac{1}{\min _{u \in X}f(u)}(\mathrm{argmin}f \times \{1\})$;
(iii)
Any $(x,1) \in \mathrm{Fix}P_S \circ T$ satisfies $\frac{1+\lambda '}{\lambda '}x \in \mathrm{argmin}f$;
(iv)
Any $y \in \mathrm{argmin}f$ satisfies $(\frac{\lambda '}{1+\lambda '}y,1) \in \mathrm{Fix}P_S \circ T$.

Proof

(i): For simplicity, let $\eta :=\min _{u \in X}f(u)$. We will first show that

$$\begin{aligned} \frac{1}{\eta } = \max _{\lambda \in {{\mathbb {R}}}} \{\lambda \;\;|\;\; \exists y \in X\; \text {so that}\;(y,\lambda ) \in D \}. \end{aligned}$$

Let $y \in \mathrm{argmin}f$. Then

$$\begin{aligned} f^\pi (y/\eta ,1/\eta ) = \frac{1}{\eta } f^\pi (y,1) =\frac{1}{\eta }f(y)=\frac{\eta }{\eta } = 1, \end{aligned}$$

and so $(y/\eta ,1/\eta ) \in D$. To see that $1/\eta $ is maximal, suppose for a contradiction that there exists $(y_0,\lambda _0) \in \left( X \times \left]1/\eta ,\infty \right[\right) \cap D.$ Then

$$\begin{aligned} 1 \ge f^\pi (y_0,\lambda _0) \ge \lambda _0 f^\pi (y_0/\lambda _0,1) > \frac{1}{\eta }f^\pi (y_0/\lambda _0,1) = \frac{1}{\eta }f(y_0/\lambda _0) \ge 1, \end{aligned}$$

a contradiction.

(ii): Having shown (i), we have from the definition of F that $F=D \cap \left\{ (x,1/\eta )\;|\; x \in X \right\} $. Let $(y,1/\eta ) \in F$. Then

$$\begin{aligned} 1 \ge f^\pi (y,1/\eta )=(1/\eta ) f^\pi (y\eta ,1) = (1/\eta ) f(y\eta ) \ge 1, \end{aligned}$$

and the equality throughout forces $f(y\eta )=\eta $. Thus $y\eta \in \mathrm{argmin}f$ and so $y \in (1/\eta )\mathrm{argmin}f$. Thus $F \subset (1/\eta )(\mathrm{argmin}f \times \{1\})$. The reverse inclusion is similar.

(iii) & (iv): By Theorem 5.2(i)a, $(x,1) \in \mathrm{Fix}P_S \circ T$ is equivalent to

$$\begin{aligned} \left( x, \frac{\lambda '}{1+\lambda '} \right) = \left( \frac{1}{1+\lambda '} \right) \left( y,\frac{1}{\eta }\right) \;\;\text {for some}\;\;(y,1/\eta ) \in F. \end{aligned}$$

(5.7)

Having shown (ii), we have that the latter inclusion is equivalent to $y \eta \in \mathrm{argmin}f$. Combining with (5.7),

$$\begin{aligned} y\eta \in \mathrm{argmin}f \iff \left( \frac{1+\lambda '}{1}x\right) \eta \in \mathrm{argmin}f. \end{aligned}$$

Having shown (i), this is equivalent to

$$\begin{aligned} \frac{1+\lambda '}{\lambda '}x \in \mathrm{argmin}f, \end{aligned}$$

which shows both (iii) and (iv). $\square $
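The characterization can be checked numerically on a toy instance. The sketch below takes $X={{\mathbb {R}}}$ and $f(x)=|x-2|+1$ (our own illustrative choice, not an example from the source), for which $\eta =1$, $\mathrm{argmin}f=\{2\}$ and $\lambda '=1$, so Theorem 5.3 predicts the fixed point $(1,1)$ with $T(1,1)=(1,1/2)$. The closed perspective used is $\kappa (u,\theta )=|u-2\theta |+\theta $ for $\theta \ge 0$ and $+\infty $ otherwise. The operator T is approximated by a crude grid search over the max-objective appearing in (5.5); the grid bounds, resolutions, and the function names are assumptions.

```python
import math


def kappa(u, theta):
    """Closed perspective of the toy function f(x) = |x - 2| + 1.

    For theta > 0 this is theta * f(u / theta); at theta = 0 the same
    formula gives the recession value |u|; it is +inf for theta < 0.
    """
    if theta < 0:
        return math.inf
    return abs(u - 2.0 * theta) + theta


def polar_prox_on_grid(y, n_u=401, n_theta=301):
    """Grid approximation of T(y, 1) = argmin max{kappa(u, theta), ||(u, theta) - (y, 1)||}.

    Searches u in [0, 2] and theta in [0, 1.5]; returns (value, u, theta).
    """
    best = (math.inf, None, None)
    for i in range(n_u):
        u = 2.0 * i / (n_u - 1)
        for j in range(n_theta):
            theta = 1.5 * j / (n_theta - 1)
            value = max(kappa(u, theta), math.hypot(u - y, theta - 1.0))
            if value < best[0]:
                best = (value, u, theta)
    return best


eta = 1.0                      # min f for the toy function
lam_prime = 1.0 / eta          # Theorem 5.3(i)
value, u_star, theta_star = polar_prox_on_grid(1.0)
recovered_minimizer = (1.0 + lam_prime) / lam_prime * u_star   # Theorem 5.3(iii)
```

With these grids the search returns the point $(1,1/2)$ with objective value $1/2$, so the first coordinate is unchanged by T, the second equals $\lambda '/(1+\lambda ')$, and rescaling by $(1+\lambda ')/\lambda '$ recovers the minimizer 2 of f.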

In the following remark, we compare the facial characterization of fixed points of ${\textbf {P}}^\mathbf{4}{} {\textbf {A}}$ from Theorem 5.3 with the closely related results from [6].

Remark 2

(On synchronicity between Theorems 4.1 and 5.3). Theorem 5.3 subsumes and is closely connected with the original results of [6, Theorem 7.4], which we recalled as Theorem 4.1. To see why, notice that items (i), (iii), (iv) of Theorem 5.3 have the following characterizations.

(iii) Applying Theorem 4.1(i), we have that $(x,\lambda _*) = T(x,1)$ satisfies $\lambda _*^{-1}x \in \mathrm{argmin}f$. Theorem 5.2(i) guarantees that $\lambda _* = \frac{\lambda '}{1+\lambda '}$.

(iv) From Theorem 4.1(ii), $((1+\eta )^{-1} y,1) \in \mathrm{Fix}P_S \circ T$. From Theorem 5.3(i), $(1+\eta )^{-1} = \frac{\lambda '}{1+\lambda '}$.

(i) The condition $1/(1+\eta ) = \frac{\lambda '}{1+\lambda '}$ then yields $\lambda '=1/\eta $.

Theorem 5.3 essentially uses the more general results from Theorem 5.2 to show that the minimizers of f form an exposed face of $(\min _{u \in X}f(u))D$: namely the face that is $(\min _{u \in X}f(u))F$. The other items are all a natural consequence of this.

6 Conclusion

We now state our eponymous convergence result, which shows global convergence of ${\textbf {P}}^\mathbf{4}{} {\textbf {A}}$ in the full generality of [6]. It also, under sufficient conditions to guarantee existence of a fixed point, shows convergence of ${\textbf {GP}}^\mathbf{4}{} {\textbf {A}}$.

Theorem 6.1

(Convergence of ${\textbf {P}}^\mathbf{4}{} {\textbf {A}}$ and ${\textbf {GP}}^\mathbf{4}{} {\textbf {A}}$). Let D be the (closed) fundamental set of $\kappa $ as in (5.1). Suppose one of the following holds.

(i)
$\kappa = f^\pi $ for $f:X \rightarrow {{\mathbb {R}}}_+ \cup \left\{ +\infty \right\} $ a proper closed nonnegative convex function with $\inf f=0$ and $\mathrm{argmin}f \ne \emptyset $;
(ii)
$\kappa = f^\pi $ for $f:X \rightarrow {{\mathbb {R}}}_+ \cup \left\{ +\infty \right\} $ a proper closed nonnegative convex function with $\inf f>0$ and $\mathrm{argmin}f \ne \emptyset $;
(iii)
There exists $\lambda ' \ge 0$ such that
$$\begin{aligned} \lambda ' = \max _{\lambda \in {{\mathbb {R}}}}\{\lambda \;\;|\;\; \exists y \in X \text {so that}\;(y,\lambda ) \in D \}; \end{aligned}$$
(iv)
Such a $\lambda '$ does not exist and there exists a sequence $(y_n,\lambda _n)_{n \in {{\mathbb {N}}}}$ in D such that $\lambda _n\rightarrow \infty $ and $\lambda _n/\Vert y_n\Vert \rightarrow m>0$.

Let $\gamma \in [0,1[$ and $(y_0,1) \in S$. Then the following hold.

1.
The sequence given by
$$\begin{aligned} (y_{n+1},1):=&\;\mathcal {U}_\gamma (y_{n},1), \nonumber \\ \text {where}\quad \mathcal {U}_\gamma :=&\;(1-\gamma )P_S \circ T +\gamma \mathrm{Id}\end{aligned}$$
is convergent to some $(y,1) \in \mathrm{Fix}P_S \circ T$;
2.
The shadow sequences $(y_{n+1},\lambda _{n+1})= T(y_n,1)$ satisfy $\lambda _n \rightarrow \lambda $ for some $\lambda \in \left[ 0,1\right] $;
3.
When (ii) or (iii) holds, $\lambda = \frac{\lambda '}{1+\lambda '}$;
4.
When (ii) holds, $\lambda ' = 1/(\inf f)$ and $\left( \frac{1}{\lambda _n}\right) y_n \rightarrow \left( \frac{1+\lambda '}{\lambda '}\right) y \in \mathrm{argmin}f$;
5.
When (i) holds, $y_n \rightarrow y \in \mathrm{argmin}f$ and $\lambda _n \rightarrow 0$.
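The under-relaxed iteration in item 1 can be sketched on a toy gauge for which T is available in closed form. Here we assume $X={{\mathbb {R}}}$ and take $\kappa $ to be the Euclidean norm on $X \times {{\mathbb {R}}}$, so that D is the unit ball and $\lambda '=1$. For this gauge the triangle inequality gives $\max \{\Vert \mathbf {u}\Vert ,\Vert \mathbf {u}-\mathbf {z}\Vert \} \ge \Vert \mathbf {z}\Vert /2$ with equality only at $\mathbf {u}=\mathbf {z}/2$, so $T(\mathbf {z})=\mathbf {z}/2$. This closed form is our own computation for the toy gauge, and the function names, starting point, and iteration count are hypothetical.

```python
def polar_prox_euclidean(y, lam):
    """T for the toy gauge kappa = Euclidean norm: the midpoint of 0 and (y, lam)."""
    return 0.5 * y, 0.5 * lam


def project_onto_S(y, lam):
    """P_S: projection onto the affine set S = X x {1}."""
    return y, 1.0


def relaxed_iteration(y0, gamma, n_iter):
    """Run y_{n+1} = (1 - gamma) * P_S(T(y_n, 1)) + gamma * y_n on the first coordinate.

    Returns the final iterate y_n and the last shadow value lambda_n.
    """
    y, lam = y0, None
    for _ in range(n_iter):
        shadow_y, lam = polar_prox_euclidean(y, 1.0)   # shadow point T(y_n, 1)
        projected_y, _ = project_onto_S(shadow_y, lam)
        y = (1.0 - gamma) * projected_y + gamma * y
    return y, lam


y_limit, lam_limit = relaxed_iteration(3.0, 0.5, 200)
```

For this gauge each step multiplies $y_n$ by $(1+\gamma )/2<1$, so the iterates tend to the unique fixed point $(0,1)$, and every shadow value equals $1/2=\lambda '/(1+\lambda ')$, consistent with item 3.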

Proof

Fixed points: (i): Since $\inf f=0$, any $x \in \mathrm{argmin}f$ satisfies

$$\begin{aligned} (f^\pi (x,1)=0) \underset{(Lemma~3.1)}{\implies } ((x,1) \in \mathrm{Fix}T) \implies ((x,1) \in \mathrm{Fix}P_S \circ T). \end{aligned}$$

By Theorem 5.3, we have that (ii) $\implies $ (iii). Either of the assumptions (iii) or (iv) guarantees existence of a fixed point of $P_S \circ T$ by Theorem 5.2.

Convergence: Having shown that a fixed point exists, convergence of $(y_n)_n$ is assured by Corollary 4.12, and the convergence of $(\lambda _n)_n$ is guaranteed by Corollary 4.13. The characterization of $\lambda '$ in cases (ii) and (iii) is due to Theorems 5.2 and 5.3. $\square $

6.1 Further research

We suggest three further avenues of inquiry. Firstly, results on faces of fundamental sets (e.g. Theorem 5.3) are of interest in the development of more general theory. Secondly, Friedlander, Macêdo, and Pong also introduced a second algorithm, $\mathbf {EMA}$, which is not addressed here [6]. A natural question is whether $\mathbf {EMA}$ possesses similar properties to ${\textbf {P}}^\mathbf{4}{} {\textbf {A}}$. Finally, a motivating question is whether or not algorithms such as ${\textbf {P}}^\mathbf{4}{} {\textbf {A}}$ may have computational advantages for certain problems.

Data Availability Statement

Data availability considerations are not applicable to this research.

References

1. Auslender, A., Teboulle, M.: Asymptotic Cones and Functions in Optimization and Variational Inequalities: Springer Monographs in Mathematics. Springer, New York (2003)
2. Bauschke, H.H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces. CMS Books in Mathematics/Ouvrages de Mathématiques de la SMC, 2nd edn. Springer, Cham (2011)
3. Cegielski, A.: Iterative Methods for Fixed Point Problems in Hilbert Spaces, Volume 2057 of Lecture Notes in Mathematics. Springer, Heidelberg (2012)
4. Díaz Millán, R., Lindstrom, S.B., Roshchina, V.: Comparing averaged relaxed cutters and projection methods: Theory and examples. In: Bailey, D.H., Borwein, N., Brent, R.P., Burachik, R.S., Osborn, J.-A., Sims, B., Zhu, Q. (eds.) From Analysis to Visualization: A Celebration of the Life and Legacy of Jonathan M Borwein, Callaghan, Australia, September 2017, Springer Proceedings in Mathematics and Statistics, pp. 75–98. Springer (2020)
5. Friedlander, M.P., Macêdo, I., Pong, T.K.: Gauge optimization and duality. SIAM J. Optim. 24(4), 1999–2022 (2014)
6. Friedlander, M.P., Macêdo, I., Pong, T.K.: Polar convolution. SIAM J. Optim. 29(2), 1366–1391 (2019)
7. Rockafellar, R.T.: Convex Analysis. Princeton University Press (1970)
8. Rockafellar, R.T., Wets, R.J.-B.: Variational Analysis. Springer (1998)

Acknowledgements

The author was supported by Hong Kong Research Grants Council PolyU153085/16p. The author thanks Ting Kei Pong and Michael P. Friedlander for their useful suggestions on this manuscript.

Funding

Open Access funding enabled and organized by CAUL and its Member Institutions.

Author information

Authors and Affiliations

Department of Applied Mathematics, Hong Kong Polytechnic University, Hung Hom, Hong Kong
Scott B. Lindstrom

Corresponding author

Correspondence to Scott B. Lindstrom.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article has been corrected: Funding note has been updated.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Lindstrom, S.B. The projected polar proximal point algorithm converges globally. J Glob Optim 84, 177–203 (2022). https://doi.org/10.1007/s10898-022-01136-0

Received: 23 February 2021
Accepted: 21 January 2022
Published: 13 April 2022
Issue Date: September 2022
DOI: https://doi.org/10.1007/s10898-022-01136-0
