Complexity of the relaxed Peaceman-Rachford splitting method for the sum of two maximal strongly monotone operators

This paper considers the relaxed Peaceman-Rachford (PR) splitting method for finding an approximate solution of a monotone inclusion whose underlying operator consists of the sum of two maximal strongly monotone operators. Using general results obtained in the setting of a non-Euclidean hybrid proximal extragradient framework, we extend a previous convergence result on the iterates generated by the relaxed PR splitting method, as well as establish new pointwise and ergodic convergence rate results for the method whenever an associated relaxation parameter is within a certain interval. An example is also discussed to demonstrate that the iterates may not converge when the relaxation parameter is outside this interval.


Introduction
In this paper, we consider the relaxed Peaceman-Rachford (PR) splitting method for solving the monotone inclusion 0 ∈ (A + B)(u) (1) where A : X ⇒ X and B : X ⇒ X are maximal β-strongly monotone (point-to-set) operators for some β ≥ 0 (with the convention that 0-strongly monotone means simply monotone, and β-strongly monotone with β > 0 means strongly monotone in the usual sense). Recall that the relaxed PR splitting method is given by

x_k = x_{k−1} + θ [J_B(2J_A(x_{k−1}) − x_{k−1}) − J_A(x_{k−1})], k ≥ 1, (2)

where θ > 0 is a fixed relaxation parameter and J_T := (I + T)^{−1}. The special case of the relaxed PR splitting method in which θ = 2 is known as the Peaceman-Rachford (PR) splitting method and the one with θ = 1 is the widely-studied Douglas-Rachford (DR) splitting method. Convergence results for them are studied, for example, in [1,2,3,4,8,13,14,22]. The analysis of the relaxed PR splitting method for the case in which β = 0 has been undertaken in a number of papers which are discussed in this paragraph. Convergence of the sequence of iterates generated by the relaxed PR splitting method is well-known when θ < 2 (see for example [1,7,14]) and, according to [16], its limiting behavior for the case in which θ ≥ 2 is not known.
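Although no implementation accompanies the paper, the recursion above is easy to prototype. The sketch below applies it to a hypothetical one-dimensional instance, A = ∂|·| and B(u) = u − c, chosen for illustration only; it assumes the standard form of the relaxed PR recursion x_k = x_{k−1} + θ[J_B(2J_A(x_{k−1}) − x_{k−1}) − J_A(x_{k−1})].

```python
# Minimal sketch of the relaxed Peaceman-Rachford (PR) splitting method (2)
# for 0 in (A + B)(u), on a hypothetical 1-D example:
#   A = subdifferential of |u|   (resolvent J_A = unit soft-thresholding),
#   B(u) = u - c                 (resolvent J_B(x) = (x + c)/2).
# The solution of 0 in A(u) + B(u) is u* = c - 1 whenever c > 1.

def J_A(x):
    # resolvent (I + A)^{-1} of A = d|.|: soft-threshold at level 1
    if x > 1:
        return x - 1
    if x < -1:
        return x + 1
    return 0.0

def make_J_B(c):
    # resolvent (I + B)^{-1} of B(u) = u - c
    return lambda x: (x + c) / 2.0

def relaxed_pr(J_A, J_B, x0, theta, iters=200):
    x = x0
    for _ in range(iters):
        u = J_A(x)                  # u_k = J_A(x_{k-1})
        v = J_B(2 * u - x)          # v_k = J_B(2 u_k - x_{k-1})
        x = x + theta * (v - u)     # relaxed PR update
    return x, J_A(x)

c = 3.0
x_dr, u_dr = relaxed_pr(J_A, make_J_B(c), x0=0.0, theta=1.0)  # theta = 1: DR
print(round(u_dr, 6))   # approx u* = 2.0
```

With θ = 1 (the DR case) the iterates contract toward the fixed point x* = 3, whose shadow u* = J_A(x*) = 2 solves 0 ∈ A(u) + B(u).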

Basic concepts and notation
This section presents some definitions, notation and terminology which will be used in the paper.
We denote the set of real numbers by R and the set of non-negative real numbers by R_+. Let f and g be functions with the same domain and whose values are in R_+. We write f(·) = Ω(g(·)) if there exists a constant K > 0 such that f(·) ≥ Kg(·). Also, we write f(·) = Θ(g(·)) if f(·) = Ω(g(·)) and g(·) = Ω(f(·)).
Let Z be a finite-dimensional real vector space with inner product denoted by ⟨·, ·⟩ (an example of Z is R^n endowed with the standard inner product) and let ||·|| denote an arbitrary seminorm in Z. Its dual (extended) seminorm, denoted by ||·||_*, is defined as ||z'||_* := sup{⟨z', z⟩ : ||z|| ≤ 1}. It is easy to see that

⟨z, z'⟩ ≤ ||z||_* ||z'|| for every z, z' ∈ Z. (3)

The following straightforward result states some basic properties of the dual seminorm associated with a matrix seminorm. Its proof can be found, for example, in Lemma A.1(b) of [23].

A non-Euclidean hybrid proximal extragradient framework
This section discusses the non-Euclidean hybrid proximal extragradient (NE-HPE) framework and describes its associated convergence and iteration complexity results. The results of the section will be used in Sections 4 and 5 to study the convergence and iteration complexity properties of the relaxed PR splitting method (2). It contains two subsections. The first one describes a class of distance generating functions introduced in [17] and derives some of its basic properties. The second one describes the NE-HPE framework and its corresponding convergence and iteration complexity results.

A class of distance generating functions
We start by introducing a class of distance generating functions (and its corresponding Bregman distances) which is needed for the presentation of the NE-HPE framework in Subsection 3.2.
Definition 3.1 For a given convex set Z ⊂ Z, a seminorm ||·|| in Z and scalars 0 < m ≤ M, we let D_Z(m, M) denote the class of real-valued functions w which are differentiable on Z and satisfy

⟨∇w(z') − ∇w(z), z' − z⟩ ≥ m ||z' − z||², (5)
||∇w(z') − ∇w(z)||_* ≤ M ||z' − z||, (6)

for every z, z' ∈ Z. A function w ∈ D_Z(m, M) is referred to as a distance generating function with respect to the seminorm ||·||, and its associated Bregman distance dw : Z × Z → R is defined as

(dw)(z'; z) = (dw)_z(z') := w(z') − w(z) − ⟨∇w(z), z' − z⟩ for every z, z' ∈ Z. (7)

Throughout our presentation, we use the second notation (dw)_z(z') instead of the first one (dw)(z'; z), although the latter makes it clear that dw is a function of two arguments, namely z and z'. Clearly, it follows from (5) that w is a convex function on Z which is in fact m-strongly convex on Z whenever ||·|| is a norm.
The following simple result summarizes the main identities about the Bregman distance (dw).
Lemma 3.2 For some convex set Z ⊂ Z and scalars 0 < m ≤ M, let w ∈ D_Z(m, M) be given. Then, the following identities and inequalities hold for every z, z' ∈ Z: Proof: Identities (8) and (9) follow straightforwardly from the definition of the Bregman distance in (7). The first inequality in (10) follows easily from (5) and the definition of (dw)_z(z') in (7). The second inequality in (10) follows from (3), (6), the definition of (dw)_z(z') in (7), and an elementary identity. It is easy to see that (11) immediately follows from (6), (8) and (10). Note that if the seminorm in Definition 3.1 is a norm, then (5) implies that w is strongly convex on Z, in which case the corresponding dw is said to be nondegenerate on Z. However, since Definition 3.1 does not necessarily assume that ||·|| is a norm, it admits the possibility of w not being strongly convex on Z, or equivalently, of dw being degenerate on Z.
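The displayed relations (8)-(11) are not reproduced in this excerpt, but they are standard consequences of definition (7). As a sanity check, the following sketch verifies, for a hypothetical quadratic distance generating function w(z) = ½ zᵀQz (our choice, not the paper's), the classical three-point identity and the two-sided quadratic bound of the kind asserted in (10).

```python
import numpy as np

# Numerical check of the Bregman distance definition (7) and two standard
# relations consistent with it, for a hypothetical quadratic distance
# generating function w(z) = 0.5 z'Qz with Q symmetric positive definite.
rng = np.random.default_rng(0)
Q = np.array([[2.0, 0.5], [0.5, 1.0]])   # SPD, so m = lambda_min(Q), M = lambda_max(Q)

w = lambda z: 0.5 * z @ Q @ z
grad_w = lambda z: Q @ z
dw = lambda z, zp: w(zp) - w(z) - grad_w(z) @ (zp - z)   # (dw)_z(z') as in (7)

z, zp, zpp = rng.standard_normal((3, 2))

# Three-point identity: (dw)_z(z') + (dw)_{z'}(z'') - (dw)_z(z'')
#                       = <grad w(z) - grad w(z'), z'' - z'>
lhs = dw(z, zp) + dw(zp, zpp) - dw(z, zpp)
rhs = (grad_w(z) - grad_w(zp)) @ (zpp - zp)
assert abs(lhs - rhs) < 1e-12

# Two-sided quadratic bound: (m/2)||z'-z||^2 <= (dw)_z(z') <= (M/2)||z'-z||^2
m, M = np.linalg.eigvalsh(Q)[[0, -1]]
gap = np.linalg.norm(zp - z) ** 2
assert 0.5 * m * gap - 1e-12 <= dw(z, zp) <= 0.5 * M * gap + 1e-12
print("identities verified")
```

The three-point identity holds exactly for any differentiable w, not just quadratics; the quadratic w is only used so that m and M are explicitly computable.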
The following result gives some useful properties of distance generating functions.
Lemma 3.3 For some convex set Z ⊂ Z and scalars 0 < m ≤ M, let w ∈ D_Z(m, M) be given. Then, for every l ≥ 1 and z_0, z_1, . . . , z_l ∈ Z, we have

(dw)_{z_0}(z_l) ≤ (Ml/m) Σ_{i=1}^{l} (dw)_{z_{i−1}}(z_i). (12)

Proof: By (10), the triangle inequality for norms and the fact that the 1-norm of an l-vector is bounded by √l times its 2-norm, we have

(dw)_{z_0}(z_l) ≤ (M/2) ||z_l − z_0||² ≤ (M/2) (Σ_{i=1}^{l} ||z_i − z_{i−1}||)² ≤ (Ml/2) Σ_{i=1}^{l} ||z_i − z_{i−1}||²,

which clearly implies (12) due to the first inequality in (10).

The NE-HPE framework
This subsection describes the NE-HPE framework and its corresponding convergence and iteration complexity results. Throughout this subsection, we assume that scalars 0 < m ≤ M, a convex set Z ⊂ Z, a seminorm ||·|| and a distance generating function w ∈ D_Z(m, M) with respect to ||·|| are given. Our problem of interest in this section is the monotone inclusion problem (MIP) 0 ∈ T(z) (13) where T : Z ⇒ Z is a maximal monotone operator satisfying the following condition: A1) the solution set T^{−1}(0) of (13) is nonempty.

Framework 1 (NE-HPE framework for solving (13))
(0) Let z_0 ∈ Z and σ ∈ [0, 1] be given, and set k = 1;
(1) choose λ_k > 0 and find (z̃_k, z_k, ε_k) ∈ Z × Z × R_+ such that
    r_k := (1/λ_k) [∇w(z_{k−1}) − ∇w(z_k)] ∈ T^{[ε_k]}(z̃_k), (14)
    (dw)_{z_k}(z̃_k) + λ_k ε_k ≤ σ (dw)_{z_{k−1}}(z̃_k); (15)
(2) set k ← k + 1 and go to step (1).
end
We now make some remarks about Framework 1. First, it does not specify how to find λ_k and (z̃_k, z_k, ε_k) satisfying (14) and (15). The particular scheme for computing λ_k and (z̃_k, z_k, ε_k) will depend on the instance of the framework under consideration and the properties of the operator T. Second, if w is strongly convex on Z and σ = 0, then (15) implies that ε_k = 0 and z_k = z̃_k for every k, and hence that r_k ∈ T(z_k) in view of (14). Therefore, the HPE error conditions (14)-(15) can be viewed as a relaxation of an iteration of the exact non-Euclidean proximal point method, namely, the iteration in which z_k = z̃_k and ∇w(z_{k−1}) − ∇w(z_k) ∈ λ_k T(z_k). We observe that NE-HPE frameworks have already been studied in [17], [21] and [30]. The approach presented in this section differs from these three papers as follows. Assuming that Z is an open convex set, w is continuously differentiable on Z and continuous on its closure, [30] studies a special case of the NE-HPE framework in which ε_k = 0 for every k, and presents results on convergence of sequences rather than iteration complexity. Paper [21] deals with distance generating functions w which do not necessarily satisfy conditions (5) and (6), and as a consequence obtains results which are more limited in scope, i.e., only an ergodic convergence rate result is obtained for operators with bounded feasible domains (or, more generally, for the case in which the sequence generated by the HPE framework is bounded). Paper [17] introduces the class of distance generating functions D_Z(m, M) but only analyzes the behavior of an HPE framework for solving inclusions whose operators are strongly monotone with respect to a fixed w ∈ D_Z(m, M) (see condition A1 in Section 2 of [17]). This section, on the other hand, assumes that w ∈ D_Z(m, M) but does not assume any strong monotonicity of T with respect to w.
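To make the exact case in the remark above concrete: with the Euclidean choice w = ½||·||² (so that ∇w(z_{k−1}) − ∇w(z_k) = λ_k r_k) and σ = 0, the NE-HPE step collapses to the classical proximal point iteration. The sketch below runs it on a hypothetical merely monotone (skew) operator; the operator and step size are our choices for illustration.

```python
import numpy as np

# Exact (Euclidean) proximal point iteration, i.e. the step that the NE-HPE
# framework reduces to when sigma = 0 and w = 0.5||.||^2 is strongly convex.
# T is a hypothetical merely monotone (skew) operator with T^{-1}(0) = {0}.
S = np.array([[0.0, -1.0], [1.0, 0.0]])     # skew-symmetric => monotone
T = lambda z: S @ z
lam = 1.0                                   # constant step size lambda_k

z = np.array([1.0, 0.0])
for k in range(50):
    z_new = np.linalg.solve(np.eye(2) + lam * S, z)  # z_k = (I + lam T)^{-1} z_{k-1}
    r = (z - z_new) / lam                            # residual of the step
    assert np.allclose(r, T(z_new))                  # exact step: r_k in T(z_k)
    z = z_new

print(np.linalg.norm(z) < 1e-7)   # iterates approach the solution z* = 0
```

Here ||z_k|| shrinks by the factor 1/√(1 + λ²) per step, so the exact method converges even though T is only monotone (not strongly monotone).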
Before presenting the main results about the NE-HPE framework, namely, Theorems 3.8 and 3.9 establishing its pointwise and ergodic iteration complexities, respectively, and Propositions 3.10 and 3.11 showing that {z_k} and/or {z̃_k} approach T^{−1}(0) in terms of the Bregman distance dw, we first establish a few preliminary technical results.
Proposition 3.5 For every k ≥ 1 and z* ∈ T^{−1}(0), we have As a consequence, the following statements hold: Proof: Let z* ∈ T^{−1}(0) be given. The first inequality in (19) follows from (17) with z = z*, and the last inequality in (19) follows from the fact that 0 ∈ T(z*) and r_k ∈ T^{[ε_k]}(z̃_k), and the definition of T^{[ε]}(·). Finally, statements (a) and (b) follow immediately from (19), while (c) follows by adding (19) over i = 1, . . . , k and using the fact that (dw)_{z_k}(z*) ≥ 0 for every k.
For the purpose of stating the convergence rate results below, define Lemma 3.6 For every i ≥ 1, define Proof: For every i ≥ 1, it follows from (14), (8), (11), (15), the triangle inequality for norms and the above definition of τ that The last inequality, (15) and the definition of θ_i then imply that θ_i ≤ (dw)_{z_{i−1}}(z_i) for every i ≥ 1. Hence, if z* ∈ T^{−1}(0), it follows that where the last inequality follows from Proposition 3.5(c). The lemma now follows from the latter relation and the definition of (dw)_0 in (20).
Lemma 3.7 Let (dw) 0 be as in (20) and τ be as in (21), and assume that σ < 1. Then, for every α ∈ R and every k ≥ 1, there exists an i ≤ k such that Proof: It follows from Lemma 3.6 that which, in view of the definition of θ i in (21), can be easily seen to be equivalent to the conclusion of the lemma.
The following pointwise convergence rate result describes the convergence rate of the sequence {(r_k, ε_k)} of residual pairs associated with the sequence {z̃_k}. Note that its convergence rate bounds are derived for the best residual pair among (r_i, ε_i), i = 1, . . . , k, rather than for the last residual pair (r_k, ε_k).
Theorem 3.8 (Pointwise convergence) Let (dw)_0 be as in (20) and τ be as in (21), and assume that σ < 1. Then, the following statements hold: Proof: Statement (a) (resp., (b)) follows from Lemma 3.7 with α = 1 (resp., α = 2).

From now on, we focus on the ergodic convergence rate of the NE-HPE framework. For k ≥ 1, define Λ_k := Σ_{i=1}^{k} λ_i and the ergodic sequences

z̃^a_k := (1/Λ_k) Σ_{i=1}^{k} λ_i z̃_i, r^a_k := (1/Λ_k) Σ_{i=1}^{k} λ_i r_i, ε^a_k := (1/Λ_k) Σ_{i=1}^{k} λ_i (ε_i + ⟨r_i, z̃_i − z̃^a_k⟩). (24)

The following ergodic convergence result describes the association between the ergodic iterate z̃^a_k and the residual pair (r^a_k, ε^a_k), and gives a convergence rate bound on the latter residual pair.
Theorem 3.9 (Ergodic convergence) Let (dw)_0 be as in (20) and τ be as in (21). Then, for Moreover, the sequence {ρ_k} is bounded under either one of the following situations: where D := sup{min{(dw)_y(y'), (dw)_{y'}(y)} : y, y' ∈ Dom T} is the diameter of Dom T with respect to dw.
Proof: The inequality ε^a_k ≥ 0 and the inclusion r^a_k ∈ T^{[ε^a_k]}(z̃^a_k) follow from (24) and the transportation formula (see [5, Theorem 2.3]). Now, let z* ∈ T^{−1}(0) be given. Using (8), (14) and (24), we easily see that Hence, in view of Proposition 3.5(a), and relations (11) and (21), we have This inequality together with the definition of (dw)_0 clearly implies the bound on ||r^a_k||_*. We now establish the bound on ε^a_k. Using inequality (18) with z = z̃^a_k, noting (24), and using the fact that (dw)_{z_0}(·) is convex and σ ≤ 1, we conclude that On the other hand, (12) with l = 3 implies that for every i ≥ 1 and z* ∈ T^{−1}(0), where the last inequality is due to Proposition 3.5(a). Combining the above two relations and using the definitions of ρ_k and (dw)_0, we then conclude that the bound on ε^a_k holds. We now establish the bounds on ρ_k under either one of the conditions (a) or (b). First, if σ < 1, then it follows from (15) and Proposition 3.5 that for every i ≥ 1 and z* ∈ T^{−1}(0). Noting (20) and (25), we then conclude that (26) holds. Assume now that Dom T is bounded. Using (12) with l = 2 and Proposition 3.5(a), and noting the definition of D in (b), we conclude that for every i ≥ 1 and z* ∈ T^{−1}(0). Hence, noting (20) and (25), we conclude that (27) holds.
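The role of the ergodic residual r^a_k can be illustrated numerically: under the Euclidean choice w = ½||·||², relation (14) gives λ_i r_i = z_{i−1} − z_i, so r^a_k telescopes to (z_0 − z_k)/Λ_k, which is O(1/k) whenever {z_k} is bounded. The sketch below checks this on a hypothetical merely monotone (skew) operator chosen for illustration.

```python
import numpy as np

# Telescoping of the ergodic residual: with w = 0.5||.||^2 and exact steps,
# lambda_i r_i = z_{i-1} - z_i, hence r^a_k = (z_0 - z_k)/Lambda_k = O(1/k)
# for bounded iterates. T(z) = Sz is a hypothetical skew (merely monotone)
# operator with unique zero at the origin.
S = np.array([[0.0, -1.0], [1.0, 0.0]])
lam = 1.0
z0 = np.array([1.0, 0.0])

z, rs = z0.copy(), []
for k in range(1, 201):
    z_new = np.linalg.solve(np.eye(2) + lam * S, z)  # exact proximal step
    rs.append((z - z_new) / lam)                     # r_i in T(z_i), z~_i = z_i
    z = z_new

Lambda = lam * len(rs)
r_erg = lam * np.sum(rs, axis=0) / Lambda            # ergodic residual r^a_k
assert np.allclose(r_erg, (z0 - z) / Lambda)         # telescoping identity
print(np.linalg.norm(r_erg) <= np.linalg.norm(z0) / Lambda + 1e-12)  # O(1/k)
```

The assertion confirms that the averaged residual depends only on the endpoints z_0 and z_k, which is exactly why boundedness of the iterates yields the O(1/k) ergodic rate.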
In the remaining part of this subsection, we state some results about the sequence generated by an instance of the NE-HPE framework. We assume from now on that such an instance generates an infinite sequence of iterates, i.e., the instance does not terminate in a finite number of steps and no termination criterion is checked. Since we are not assuming that the distance generating function w is nondegenerate on Z, it is not possible to establish convergence of the sequence {z_k} generated by the NE-HPE framework to a solution of (13). However, under some mild assumptions, it is possible to establish that {z_k} approaches a point z̄ ∈ T^{−1}(0) if the proximity measure used is the actual Bregman distance. Proposition 3.10 Assume that for some infinite index set K and some z̄ ∈ Z, we have Proof: Using the two limits in (28), the fact that every maximal monotone operator is closed, and the inclusion r_k ∈ T^{[ε_k]}(z̃_k) for every k ∈ K, we conclude that 0 ∈ T^{[0]}(z̄) = T(z̄). This conclusion together with Assumption A0 then implies that the first assertion of the proposition holds and that {(dw)_{z_k}(z̄)} is non-increasing in view of Proposition 3.5(a). To show the second assertion, assume in addition that lim_{k∈K} (dw)_{z_k}(z̃_k) = 0; since the second limit in (28) clearly implies that lim_{k∈K} (dw)_{z̃_k}(z̄) = 0, we then conclude that lim_{k∈K} (dw)_{z_k}(z̄) = 0. Clearly, since {(dw)_{z_k}(z̄)} is non-increasing, we have that lim_{k→∞} (dw)_{z_k}(z̄) = 0, and hence that the second assertion holds. Proof: The assumption that σ < 1 and Σ_{k=1}^{∞} λ_k² = ∞ together with Theorem 3.8(b) imply that there exists a subsequence {(r_k, ε_k)}_{k∈K} converging to zero. Since {z̃_k}_{k∈K} is bounded, we may assume without loss of generality (by passing to a subsequence if necessary) that {z̃_k}_{k∈K} converges to some z̄ ∈ Z. Hence, by the first part of Proposition 3.10, we conclude that z̄ ∈ T^{−1}(0) ⊂ Z. Thus, Proposition 3.5(c) with z* = z̄ and the assumption that σ < 1 imply that lim_{k→∞} (dw)_{z_{k−1}}(z_k) = 0, and hence that lim_{k→∞} (dw)_{z_k}(z̃_k) = 0 in view of (15).
This conclusion together with our previous conclusion that (28) holds and the second part of Proposition 3.10 then imply that lim_{k→∞} (dw)_{z_k}(z̄) = 0. The latter conclusion together with the fact that lim_{k→∞} (dw)_{z_k}(z̃_k) = 0 and Lemma 3.3 with l = 2 easily imply that lim_{k→∞} (dw)_{z̃_k}(z̄) = 0.
Clearly, if w is a nondegenerate distance generating function, then the results above give sufficient conditions for the sequences {z_k} and {z̃_k} to converge to some z̄ ∈ T^{−1}(0).

The relaxed Peaceman-Rachford splitting method
This section derives convergence rate bounds for the relaxed Peaceman-Rachford (PR) splitting method for solving the monotone inclusion (1) under the assumption that A and B are maximal β-strongly monotone operators for any β ≥ 0. More specifically, its pointwise iteration-complexity is obtained in Theorem 4.6 and its ergodic iteration-complexity is derived in Theorem 4.8. These results are obtained as by-products of the corresponding ones (i.e., Theorem 3.8 and Theorem 3.9) in Subsection 3.2 and the fact that the relaxed Peaceman-Rachford (PR) splitting method can be viewed as a special instance of the NE-HPE framework.
Throughout this section, we assume that X is a finite-dimensional real vector space with inner product and associated inner product norm denoted by ⟨·, ·⟩_X and ||·||_X, respectively. For a given β ≥ 0, an operator T : X ⇒ X is said to be β-strongly monotone if

⟨v − v', u − u'⟩_X ≥ β ||u − u'||_X² for every u, u' ∈ X, v ∈ T(u) and v' ∈ T(u').

In what follows, we refer to monotone operators as 0-strongly monotone operators. This terminology has the benefit of allowing us to treat both the monotone and the strongly monotone case simultaneously.
Throughout this section, we consider the monotone inclusion (1) where A, B : X ⇒ X satisfy the following assumptions: B0) for some β ≥ 0, A and B are maximal β-strongly monotone operators. We start by observing that (1) is equivalent to solving the following augmented system of inclusions/equation

x − u ∈ γA(u), (2u − x) − v ∈ γB(v), v − u = 0, (29)

where γ > 0 is an arbitrary scalar. Another way of writing the above system is as

0 ∈ γA(u) + u − x, 0 ∈ γB(v) + v − (2u − x), v − u = 0.

Note that the first and second inclusions are equivalent to

u = u(x) := J_{γA}(x), v = v(x) := J_{γB}(2u(x) − x), (30)

so that the third equation reduces to v(x) − u(x) = 0. The Douglas-Rachford (DR) splitting method is the iterative procedure x_k = x_{k−1} + v(x_{k−1}) − u(x_{k−1}), k ≥ 1, started from some x_0 ∈ X. It is known that the DR splitting method is an exact proximal point method for some maximal monotone operator [13,14]. Hence, convergence of its sequence of iterates is guaranteed. This section is concerned with a natural generalization of the DR splitting method, namely, the relaxed Peaceman-Rachford (PR) splitting method with relaxation parameter θ > 0, which iterates as

u_k = u(x_{k−1}), v_k = v(x_{k−1}), x_k = x_{k−1} + θ(v_k − u_k), k ≥ 1. (31)

We now make a few remarks about the above method. First, it reduces to the DR splitting method when θ = 1, and to the PR splitting method when θ = 2. Second, it reduces to (2) when γ = 1, but it is not more general than (2) since (31) is equivalent to (2) with (A, B) = (γA, γB). Third, as presented in (31), it can be viewed as an iterative process in the (u, v, x)-space rather than only in the x-space as suggested by (2). Our analysis of the relaxed PR splitting method is based on further exploring the last remark above, i.e., viewing it as an iterative method in the (u, v, x)-space. We start by introducing an inclusion which plays an important role in our analysis.
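As a quick sanity check of the equivalence just described, the sketch below runs iteration (31) with γ = 1 on a hypothetical one-dimensional instance (A = ∂|·|, B(u) = u − c, both our choices for illustration) and verifies that the limit of (u_k, v_k, x_k) satisfies the augmented system (29).

```python
# Sanity check (hypothetical 1-D instance, gamma = 1): a fixed point of the
# (u, v, x)-iteration (31) solves the augmented system (29).
#   A = subdifferential of |u|, B(u) = u - c  (so u* = c - 1 when c > 1).

def J_A(x):                      # resolvent of A = d|.|
    return x - 1 if x > 1 else (x + 1 if x < -1 else 0.0)

c = 3.0
J_B = lambda x: (x + c) / 2.0    # resolvent of B(u) = u - c

x, theta = 0.0, 1.0              # theta = 1: the Douglas-Rachford case
for _ in range(200):
    u = J_A(x)                   # u_k = u(x_{k-1}) = J_{gamma A}(x_{k-1})
    v = J_B(2 * u - x)           # v_k = v(x_{k-1}) = J_{gamma B}(2 u_k - x_{k-1})
    x = x + theta * (v - u)      # x_k = x_{k-1} + theta (v_k - u_k)

# Third equation of (29): v = u at the fixed point
assert abs(v - u) < 1e-6
# First inclusion: x - u in gamma*A(u); here A(u) = {sign(u)} = {1} and x - u -> 1
assert abs((x - u) - 1.0) < 1e-6
# Second inclusion: (2u - x) - v in gamma*B(v); B(v) = v - c, both sides -> -1
assert abs(((2 * u - x) - v) - (v - c)) < 1e-6
print("fixed point solves (29)")
```

The second and third assertions hold essentially by construction of the resolvents; the first assertion is what singles out the fixed point, mirroring the reduction of (29) to v(x) − u(x) = 0.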
For a fixed θ̄ > 0 and γ > 0, consider the inclusion 0 ∈ (L_{θ̄} + γC)(z) (32) where L_{θ̄} : X × X × X → X × X × X is the linear map defined as in (33) and C : X × X × X ⇒ X × X × X is the maximal monotone operator defined as in (34). It is easy to verify that the inclusion (32) is equivalent to the two systems of inclusions/equation above under conditions B0 and B1. Hence, it suffices to solve (32) in order to solve (1). The following simple but useful result explicitly shows the relationship between the solution sets of (32) and (1).
Proof: The conclusion of the lemma follows immediately from the definitions of L_{θ̄} and C in (33) and (34), respectively, and some simple algebraic manipulations.
The key idea of our analysis is to show that the relaxed PR splitting method is actually a special instance of the NE-HPE framework for solving inclusion (32) and then use the results discussed in Subsection 3.2 to derive convergence and iteration-complexity results for it. With this goal in mind, the next result gives a sufficient condition for (32) to be a maximal monotone inclusion.
The following technical result states some useful identities and inclusions needed to analyze the sequence generated by the relaxed PR splitting method. Lemma 4.3 For a given x_{k−1} ∈ X and θ̄ > 0, define where u_k, v_k are as in (31), and set Then, we have: As a consequence, we have where c_k := (a_k, b_k, 0).
Proof: Using the definition of (u(·), v(·)) in (30), the definition of (u_k, v_k, x_k) in (31), and the definitions of a_k and b_k in (39), we easily see that (40) holds. The following result shows that the relaxed PR splitting method with θ ∈ (0, 2θ_0] can be viewed as an inexact instance of the NE-HPE framework for solving (32), where from now on we assume that θ̄ := min{θ, θ_0}.

The last conclusion follows from statements (b) and (c), and Proposition 4.2(b).
We now make a remark about the special case of Proposition 4.4 in which θ ∈ (0, θ_0]. Indeed, in this case, θ̄ = θ, and hence σ = 0 and z̃_k = z_k for every k ≥ 1. Thus, the relaxed PR splitting method with θ ∈ (0, θ_0] can be viewed as an exact non-Euclidean proximal point method, with distance generating function w as in (46), applied to the monotone inclusion 0 ∈ T(z) := (L_θ + γC)(z). Note also that the latter inclusion depends on θ.
As a consequence of Proposition 4.4, we are now ready to describe the pointwise and ergodic convergence rates for the relaxed PR splitting method. We first endow the space Z := X × X × X with the seminorm ||(u, v, x)|| := ||x||_X and hence Proposition 2.1 implies that It is also easy to see that the distance generating function w defined in (46) is in D_Z(m, M) with respect to ||·|| where M = m = 1/θ̄ (see Definition 3.1). Our next goal is to state a pointwise convergence rate bound for the relaxed PR splitting method. We start by stating a technical result which is well-known for the case where β = 0 (see for example Lemma 2.4 of [18]). The proof for the general case, i.e., β ≥ 0, is similar and is given in the Appendix for the sake of completeness.
We now state the pointwise convergence rate result for the relaxed PR splitting method. Theorem 4.6 Consider the sequence {z_k = (u_k, v_k, x_k)} generated by the relaxed PR splitting method with θ ∈ (0, 2θ_0). Then, for every k ≥ 1 and z* = (u*, u*, x*) ∈ (L_{θ̄} + γC)^{−1}(0), Proof: The inclusion and the equality in the theorem follow from (42). Since, by Proposition 4.4, the relaxed PR splitting method with θ ∈ (0, 2θ_0) is an NE-HPE instance for solving the monotone inclusion 0 ∈ (L_{θ̄} + γC)(z) in which σ = (θ/θ̄ − 1)² < 1, ε_k = 0 and λ_k = 1 for all k ≥ 1, it follows from Lemma 4.5, Theorem 3.8, the fact that M = m = 1/θ̄, and relation (20) that The inequality of the theorem then follows by Proposition 4.4(a) and relation (48). Our main goal in the remaining part of this section is to derive ergodic convergence rate bounds for the relaxed PR splitting method for any θ ∈ (0, 2θ_0]. We start by stating the following variation of the transportation lemma for maximal β-strongly monotone operators.
Before proving this claim, we will use it to complete the proof of the theorem. Indeed, using the definition of w in (46), relations (20), (48), (52) and (54), the conclusion of Proposition 4.4, and Theorem 3.9 with T = L_{θ̄} + γC, M = m = 1/θ̄ and λ_k = 1 for all k, we conclude that where ρ_k is defined in (25). Moreover, using (25), the definition of w in (46), the definitions of x_i and x̃_i in (31) and (38), respectively, the triangle inequality, and Proposition 3.5(a), we conclude that The inequalities of the theorem now follow from the above three relations.
In the remaining part of the proof, we establish our previous claim (54). By Proposition 4.4(a) and relations (43) and (53), we have where c_i is defined in (44). Moreover, we have where the second equality follows from (35) and the definitions of z̃^a_k in (24), z̃_k in (38), and ū_k and v̄_k in (50), and the inequality follows from (37) and the fact that θ̄ ≤ θ_0 in view of (45). Finally, using the definitions of z̃^a_k in (24), and z̃_i and c_i in Lemma 4.3, and the straightforward relation we conclude from (56) and (57) that and hence that the claim holds in view of (51). We now make some remarks about the convergence rate bounds obtained in Theorem 4.8. In view of Lemma 4.1, x* depends on γ according to Hence, letting and assuming that S < ∞, it is easy to see that Theorem 4.8 and (37) imply that the relaxed PR splitting method with θ = 2θ_0 satisfies When S/β ≥ d_0, then γ = d_0/S minimizes both C_1(·) and C_2(·) up to a multiplicative constant, in which case C*_1 = Θ(d_0), C*_1/γ = Θ(S) and C*_2 = Θ(S d_0) where Note that this case includes the case in which β = 0. On the other hand, when S/β < d_0, then both C_1 and C_2 are minimized up to a multiplicative constant by any γ ≥ d_0/S, in which case C*_1 = Θ(S/β) and C*_2 = Θ(S²/β). Clearly, in this case, C*_1/γ converges to zero as γ tends to infinity.
Indeed, assume first that S/β ≥ d_0. Then, up to some multiplicative constants, we have and hence that C*_1 = Ω(d_0) and C*_2 = Ω(S d_0). Moreover, if γ = d_0/S, then the assumption S/β ≥ d_0 implies that βγ ≤ 1, and hence that C*_1 = Θ(d_0) and C*_2 = Θ(S d_0). Assume now that S/β < d_0. Then, up to multiplicative constants, it is easy to see that and hence that C*_1 = Ω(S/β) and C*_2 = Ω(S²/β). Moreover, if γ ≥ d_0/S, then it is easy to see that C*_1 = Θ(S/β) and C*_2 = Θ(S²/β). Based on the above discussion, the choice γ = d_0/S is optimal but has the disadvantage that d_0 is generally difficult to compute. One possibility around this difficulty is to use γ = D_0/S where D_0 is an upper bound on d_0.

On the convergence of the relaxed PR splitting method
This section discusses some new convergence results for the sequence generated by the relaxed PR splitting method for the case in which β > 0. It contains two subsections. As observed in the Introduction, [12] already establishes the convergence of the relaxed PR sequence for the case in which β ≥ 0 and θ < 2θ_0. The first subsection establishes convergence of the relaxed PR sequence for the case in which β > 0 and θ = 2θ_0. The second subsection describes an instance showing that the relaxed PR splitting method may diverge when β ≥ 0 and θ ≥ min{2(1 + γβ), 2 + γβ + 1/(γβ)} (here, we use the convention that 1/0 = ∞). Note that this instance, specialized to the case β = 0, shows that the sequence {z_k = (u_k, v_k, x_k)} generated by the relaxed PR splitting method with β = 0 may diverge for any θ ≥ 2, and hence that the convergence result obtained for any θ ∈ (0, 2) in [12] cannot be improved.

Convergence result about the relaxed PR sequence
It is known that the sequence {z k = (u k , v k , x k )} generated by the relaxed PR splitting method with θ ∈ (0, 2θ 0 ) and β ≥ 0 converges [12]. The main result of this subsection, namely Theorem 5.2, establishes convergence of this sequence for θ = 2θ 0 when β > 0.
We start by giving a lemma which is used in the proof of Theorem 5.2. Proof: The assumption that θ ∈ (0, 2θ_0] together with the last conclusion of Proposition 4.4 implies that the relaxed PR splitting method is an NE-HPE instance with σ ≤ 1. Hence, for any z* ∈ (L_{θ̄} + γC)^{−1}(0), it follows from Proposition 3.5(a) that the sequence {(dw)_{z_k}(z*)} is nonincreasing, where w is the distance generating function given by (46). Clearly, this observation implies that {x_k} is bounded. This conclusion together with (30) and the nonexpansiveness of J_{γA} and J_{γB} implies that {u_k} and {v_k} are also bounded. Finally, {x̃_k} is bounded due to the definition of x̃_k in (38) and the boundedness of {x_k}, {u_k} and {v_k}.
As mentioned at the beginning of this subsection, the convergence of {(u k , v k )} to some pair (u * , u * ) where u * ∈ (A + B) −1 (0) has been established in [12] for the case in which β > 0 and θ < 2θ 0 . The following result shows that the latter conclusion can also be extended to θ = 2θ 0 .
Note that when β = β̃, the above inequality reduces to θ ≥ min{2(1 + β), 2 + β + 1/β}. Before ending this subsection, we make two remarks. First, when β = 0, and hence A is not strongly monotone, the sequence {x_k} for the above example diverges for any θ ≥ 2 even if B is strongly monotone, i.e., β̃ > 0. Second, the above example specialized to the case in which β = β̃ easily shows that the sequence generated by the relaxed PR splitting method does not necessarily converge for any (β, θ) ∈ R(τ) if τ > 1, where R(τ) is defined at the end of Subsection 5.1.
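The divergence instance itself is not reproduced in this excerpt; the following sketch illustrates the same phenomenon for β = 0 on a hypothetical two-dimensional skew (merely monotone) instance with B = 0 and γ = 1, for which one relaxed PR step is the linear map x ↦ (1 − θ)x + θ(I + S)^{−1}x, so the behavior of the iterates is governed by its spectral radius.

```python
import numpy as np

# Hypothetical beta = 0 instance: A(z) = Sz with S skew-symmetric (merely
# monotone, solution set {0}) and B = 0, gamma = 1. Then J_B = I, so
# v = 2u - x and one relaxed PR step is x -> (1 - theta) x + theta (I + S)^{-1} x.
S = np.array([[0.0, -1.0], [1.0, 0.0]])
J_A = np.linalg.inv(np.eye(2) + S)       # resolvent of A
I = np.eye(2)

def spectral_radius(theta):
    M = (1 - theta) * I + theta * J_A    # linear map of one relaxed PR step
    return np.abs(np.linalg.eigvals(M)).max()

print(spectral_radius(1.5) < 1.0)                 # theta < 2: contraction
print(round(float(spectral_radius(2.0)), 6))      # theta = 2 = 2*theta_0: radius 1
print(spectral_radius(2.5) > 1.0)                 # theta > 2: iterates can diverge
```

For β = 0 one has θ_0 = 1, so the boundary θ = 2θ_0 = 2 is exactly where the spectral radius hits 1: below it the iterates converge linearly, at it they merely rotate, and above it they blow up, consistent with the claim that the θ < 2 convergence result cannot be improved.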

Numerical study
This section illustrates the behavior of the relaxed PR splitting method for solving the weighted Lasso minimization problem [16] (see also [6])

min_{u ∈ R^n} f(u) + g(u), (67)

where f(u) := (1/2)||Cu − b||²_X and g(u) := ||Wu||_1 for every u ∈ R^n. Our numerical experiments consider instances where n = 200, b ∈ R^{300} and C ∈ R^{300×200} is a sparse data matrix with an average of 10 nonzero entries per row. Each component of b and each nonzero element of C is drawn from a Gaussian distribution with zero mean and unit variance, while W ∈ R^{200×200} is a diagonal matrix with positive diagonal elements drawn from a uniform distribution on the interval [0, 1]. This setup follows that of [16]. Note that X = R^{300} and ||·||_1 is the 1-norm on R^{200}. Observe that f is α-strongly convex on R^{200} where α = λ_min(C^T C) is the minimum eigenvalue of C^T C. Also, f is differentiable and its gradient is κ-Lipschitz continuous on R^{200} where κ = λ_max(C^T C) is the maximum eigenvalue of C^T C. The function g is clearly convex on R^{200}.
We consider solving (67) by applying the relaxed PR splitting method (65) to solve the inclusion (1) with

(A, B) = (∂f − α′I, ∂g + α′I),

where 0 ≤ α′ ≤ α = λ_min(C^T C). Since A (resp., B) is (α − α′)-strongly (resp., α′-strongly) maximal monotone, the results developed in Sections 4 and 5 for the relaxed PR splitting method with (A, B) as above apply with β = min{α − α′, α′}. Our goal in this section is to gain some intuition about how the relaxed PR splitting method performs as α′ (and hence β), γ and θ change. In our numerical experiments, we start the relaxed PR splitting algorithm with x_0 = 0 and terminate it when ||x_{k+1} − x_k||_X ≤ 10^{−5}. The paragraphs below report the results of three experiments.
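A small-scale sketch of this setup can be written as follows. The dimensions are much smaller than the paper's 300 × 200 sparse instances and C is dense, so the numbers are illustrative only; the choices γ = 1/√(ακ) and α′ = α/2 mirror two of the settings used in the experiments.

```python
import numpy as np

# Illustrative sketch: relaxed PR splitting applied to the weighted Lasso (67)
# with (A, B) = (grad f - a'I, dg + a'I), f(u) = 0.5||Cu - b||^2, g(u) = ||Wu||_1.
rng = np.random.default_rng(1)
n, p = 30, 20
C = rng.standard_normal((n, p))
b = rng.standard_normal(n)
w = rng.uniform(0.0, 1.0, p)                  # diagonal of W

evals = np.linalg.eigvalsh(C.T @ C)
alpha, kappa = evals[0], evals[-1]            # strong convexity / smoothness of f
ap = alpha / 2                                # a' in (0, alpha): beta = alpha/2 > 0
gamma, theta = 1 / np.sqrt(alpha * kappa), 1.0

# Resolvent of gamma*A:  ((1 - gamma a')I + gamma C'C) u = x + gamma C'b
H = (1 - gamma * ap) * np.eye(p) + gamma * C.T @ C
J_gA = lambda x: np.linalg.solve(H, x + gamma * C.T @ b)

# Resolvent of gamma*B: scaled weighted soft-thresholding
def J_gB(x):
    return np.sign(x) * np.maximum(np.abs(x) - gamma * w, 0.0) / (1 + gamma * ap)

x = np.zeros(p)
for _ in range(20000):
    u = J_gA(x)
    v = J_gB(2 * u - x)
    step = theta * (v - u)
    x = x + step
    if np.linalg.norm(step) <= 1e-12:
        break

# Optimality check for min f + g: -grad f(u) must be a subgradient of g at u,
# i.e. |grad f(u)_i| <= w_i componentwise (with equality where u_i != 0)
u = J_gA(x)
gradf = C.T @ (C @ u - b)
print(bool(np.all(np.abs(gradf) <= w + 1e-6)))
```

The resolvent of γA reduces to a positive definite linear solve because f is quadratic, and the resolvent of γB is a rescaled weighted soft-thresholding; both follow directly from the definitions of A and B above.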
In the first experiment, we generated 100 random instances of (C, W, b) and observed that the condition λ_min(C^T C) > 0 holds for all instances. The relaxed PR splitting method is used to solve these instances of (67) for various values of θ and with the pair (γ, α′) taking on the values (1, 0), (1, α/2), (1/√(ακ), 0) and (1/√(ακ), α/2). Note that it follows from Proposition 3 of [16] that when α′ = 0 and θ = 2, the choice γ = 1/√(ακ) is optimal for the relaxed PR splitting method. Our results are shown in Table 1. Except when θ = 2 or θ = 2 + γα/2, the average number of iterations for α′ = 0 and α′ = α/2 are similar. However, when θ = 2 or θ = 2 + γα/2, the choice α′ = α/2 outperforms the one with α′ = 0. One possible explanation for this behavior is the fact that when θ = 2 or θ = 2 + γα/2, the relaxed PR sequence converges when both operators are strongly monotone, while it does not necessarily converge when either one of the operators is only monotone. Note also that the results in the last row of the table confirm the convergence result for the relaxed PR splitting method (see Theorem 5.2) for the case in which A and B are β-strongly maximal monotone operators with β > 0 and θ = 2 + γβ. Finally, our results (the last two rows of the table) suggest that, if A is maximal α-strongly monotone and B is only maximal monotone, it might be advantageous to use the relaxed PR splitting method with 0 < α′ < α (and hence β > 0) instead of α′ = 0 (and hence β = 0). In our second experiment, we use the relaxed PR splitting method with (θ, γ) equal to (2, 1) and (2, 1/√(ακ)), and with α′ varying from 0 to α, to solve (67) for a randomly generated (C, W, b). In this instance, α = λ_min(C^T C) = 0.3792 and κ = λ_max(C^T C) = 57.6624. Our results are shown in Figure 1. We see from Figure 1 that the number of iterations decreases as α′ increases in both cases. These graphs again suggest that it might be advantageous to have A and B maximal β-strongly monotone with β > 0.
We also observe that as α′ approaches α, the number of iterations does not increase even though the operator A is losing its strong monotonicity.
In our third experiment, we performed the same numerical experiments as the ones mentioned above but with (A, B) = (∂g + α′I, ∂f − α′I) instead of (A, B) = (∂f − α′I, ∂g + α′I), and note that the results obtained were very similar to the ones reported above. Hence, interchanging A and B in the implementation of the relaxed PR splitting method has little impact on its performance.

This paper establishes convergence of the sequence of iterates and an O(1/k) ergodic convergence rate bound for the relaxed PR splitting method for any θ ∈ (0, 2 + γβ] by viewing it as an instance of a non-Euclidean HPE framework. It also establishes an O(1/√k) pointwise convergence rate bound for it for any θ ∈ (0, 2 + γβ). Furthermore, an example showing that the PR iterates do not necessarily converge for β ≥ 0 and θ ≥ min{2(1 + γβ), 2 + γβ + 1/(γβ)} is given. Table 2 (resp., Table 3) for the case in which β = 0 (resp., β > 0) provides a summary of the convergence rate results known so far for the relaxed PR splitting method when (A, B) = (∂f, ∂g) for some convex functions f and g. However, we observe that some of these results also hold for pairs (A, B) of maximal monotone operators which are not subdifferentials. The term "R-linear" in the tables below stands for linear convergence of the sequence {x_k} generated by the relaxed PR splitting method.