Abstract
In this work, we propose a new algorithm for finding a zero of the sum of two monotone operators where one is assumed to be single-valued and Lipschitz continuous. This algorithm naturally arises from a nonstandard discretization of a continuous dynamical system associated with the Douglas–Rachford splitting algorithm. More precisely, it is obtained by performing an explicit, rather than implicit, discretization with respect to one of the operators involved. Each iteration of the proposed algorithm requires the evaluation of one forward and one backward operator.
Introduction
The study of continuous time dynamical systems associated with iterative algorithms for solving optimization problems has a long history which can be traced back at least to the 1950s [4, 14]. The relationship between the continuous and discrete versions of an algorithm provides a unifying perspective which gives insights into their behavior and properties. As we will see in this work, this includes suggesting new algorithmic schemes as well as appropriate Lyapunov functions for analyzing their convergence properties. The interplay between continuous and discrete dynamical systems has been studied by many authors including [1, 2, 3, 5, 6, 9, 10, 23, 24].
The following well-known idea will help to motivate the approach used in this work. Let \({\mathcal {H}}\) be a real Hilbert space and suppose \({B:{\mathcal {H}}\rightarrow {\mathcal {H}}}\) is a maximal monotone operator. Consider the monotone equation
to which the following continuous time dynamical system can be attached
Let \(\lambda >0\). We now devise two iterative algorithms for solving (1) by using different discretizations of \({\dot{x}}(t)\) in (2). To this end, let us first approximate the trajectory x(t) in (2) by discretizing at the points \((k\lambda )_{k\in {\mathbb {Z}}_+}\), and denote the discretized trajectory by \(x_k := x(k\lambda )\).
Now, on one hand, using the forward discretization \({\dot{x}}(t) \approx \frac{x_{k+1}-x_k}{\lambda }\) gives
In the particular case when B is the gradient of a function, (3) is nothing more than the classical gradient descent method. On the other hand, using the backward discretization \({\dot{x}}(t) \approx \frac{x_{k}-x_{k-1}}{\lambda }\) gives
where \(J_{A}:=({{\,\mathrm{Id}\,}}+A)^{-1}\) denotes the resolvent of a (potentially multivalued) maximal monotone operator \(A:{\mathcal {H}}\rightrightarrows {\mathcal {H}}\). This iteration is precisely the proximal point algorithm for the monotone inclusion (1). It is worth emphasizing that (3) and (4) are different iterative algorithms which, in general, do not converge under the same conditions. In particular, if B is monotone but not cocoercive, then (4) converges to a solution for any \(\lambda >0\) whereas (3) does not. Nevertheless, both algorithms correspond to the same continuous dynamical system (2).
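This contrast is easy to observe numerically. The following sketch (a toy example of our choosing, with B a skew rotation, which is monotone and 1-Lipschitz but not cocoercive) shows the forward iteration (3) diverging while the backward iteration (4) converges:

```python
import numpy as np

# B(x, y) = (-y, x) is maximally monotone (skew) and 1-Lipschitz, but not
# cocoercive; its unique zero is the origin.
B = np.array([[0.0, -1.0], [1.0, 0.0]])
lam = 0.1
J = np.linalg.inv(np.eye(2) + lam * B)   # resolvent J_{lam B} = (Id + lam B)^{-1}

x_fwd = np.array([1.0, 0.0])
x_bwd = np.array([1.0, 0.0])
for _ in range(500):
    x_fwd = x_fwd - lam * (B @ x_fwd)    # forward discretization (3): explicit Euler
    x_bwd = J @ x_bwd                    # backward discretization (4): proximal point

# Since B is skew, ||x - lam*B(x)||^2 = (1 + lam^2)||x||^2, so the forward
# iterates spiral outward, while the backward iterates spiral into the zero.
```

The same behavior occurs for any \(\lambda >0\): the proximal point iterates converge for every stepsize, exactly as stated above.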
In this work, we exploit the same type of relationship between continuous and discrete dynamical systems to discover a new algorithm for monotone inclusions of the form
where \(A:{\mathcal {H}}\rightrightarrows {\mathcal {H}}\) and \(B:{\mathcal {H}}\rightarrow {\mathcal {H}}\) are (maximally) monotone operators with \(B\) \(L\)-Lipschitz continuous (but not necessarily cocoercive). More precisely, by using a nonstandard discretization of the continuous time Douglas–Rachford algorithm, we obtain
which, as we will show, converges weakly to a solution of (5) whenever \(\lambda \in (0,\frac{1}{3L})\). Note also that, by choosing the operators A and B appropriately, the setting of (5) covers smooth–nonsmooth convex minimization, monotone inclusions through duality, and saddle point problems with smooth convex–concave couplings. For further details, see [22].
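For concreteness, here is a small numerical sketch of an iteration of this type, under the assumption that the update (6) reads \(x_{k+1} = J_{\lambda A}(x_k - \lambda B(x_k)) - \lambda (B(x_k) - B(x_{k-1}))\); the operators \(A={{\,\mathrm{Id}\,}}\) and the skew rotation B are our illustrative choices:

```python
import numpy as np

# Hypothetical toy instance: A = Id, so J_{lam A}(z) = z / (1 + lam), and
# B(x, y) = (-y, x), monotone and 1-Lipschitz but not cocoercive.
# zer(A + B) = {0} since Id + B is invertible.
R = np.array([[0.0, -1.0], [1.0, 0.0]])
lam = 0.2                                # within (0, 1/(3L)) with L = 1

x_prev = np.array([1.0, 0.5])            # x_{-1}
x = np.array([1.0, 0.5])                 # x_0
for _ in range(300):
    # x_{k+1} = J_{lam A}(x_k - lam*B(x_k)) - lam*(B(x_k) - B(x_{k-1}))
    x, x_prev = (x - lam * R @ x) / (1 + lam) - lam * (R @ x - R @ x_prev), x

# x is now numerically at the unique zero of A + B (the origin); mathematically,
# each iteration needs only one new evaluation of B (B(x_{k-1}) can be cached)
# and one resolvent of A.
```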
Despite substantial progress in monotone operator theory, there are not so many original splitting algorithms for solving monotone inclusions of form (5) which use forward evaluations of B. Tseng’s forward-backward-forward algorithm [25], published in 2000, was the first such method capable of solving (5). Until recently, this was the only known method with these properties; however, there has since been progress in the area with the discovery of further methods having this property [16, 17, 22]. In this connection, see also [8, 12].
The remainder of this work is organized as follows. In Sect. 2, we discuss the classical Douglas–Rachford algorithm and study an alternative form of its continuous time dynamical system. In Sect. 3, we discretize this alternative form to obtain (6) and prove its convergence. In Sect. 4, we briefly show how the same idea can be applied to recover a primal–dual algorithm which was recently proposed in [19, Algorithm 1] and [18]. Section 5 concludes our work by suggesting avenues for further investigation.
From the Discrete to the Continuous
The Douglas–Rachford method is an algorithm for finding a zero of the sum of maximally monotone operators, A and B. This popular splitting method works by only requiring the evaluation of the resolvents of each of the operators individually, rather than the resolvent of their sum. The method was first formulated for solving linear equations in [13] and later generalized to monotone inclusions in [20].
The method can be compactly described as the fixed point iteration
where \(R_{\lambda B}=2J_{\lambda B}-{{\,\mathrm{Id}\,}}\) denotes the reflected resolvent of a monotone operator \(\lambda B\). Its behavior is summarized in the following theorem.
Theorem 1
[7, Theorem 25.6] Let \(A:{\mathcal {H}}\rightrightarrows {\mathcal {H}}\) and \(B:{\mathcal {H}}\rightrightarrows {\mathcal {H}}\) be maximally monotone operators with \({{\,\mathrm{zer}\,}}(A+B) \ne \varnothing \). Let \(\lambda >0\) and \(z_0\in {\mathcal {H}}\). Then the sequence \((z_k)\), generated by (7), satisfies

(i)
\((z_k)\) converges weakly to a point \(z\in {{\,\mathrm{Fix}\,}}(R_{\lambda A} R_{\lambda B})\).

(ii)
\((J_{\lambda B}z_k)\) converges weakly to \(J_{\lambda B}z \in {{\,\mathrm{zer}\,}}(A+B)\).
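Written with both resolvents, one common way to implement the iteration (7) is the update \(z_{k+1} = z_k + J_{\lambda A}(2J_{\lambda B}z_k - z_k) - J_{\lambda B}z_k\), i.e., the averaged map \(\tfrac{1}{2}({{\,\mathrm{Id}\,}}+R_{\lambda A}R_{\lambda B})\) (our choice of presentation). A toy run illustrating both conclusions of Theorem 1:

```python
import numpy as np

# Toy instance (our choice): B = grad g with g(x) = ||x - b||^2 / 2, so
# J_{lam B}(z) = (z + lam*b) / (1 + lam), and A = N_C with C the nonnegative
# orthant, so J_{lam A} = max(., 0).  Then zer(A + B) = {max(b, 0)}, the
# solution of min g(x) subject to x >= 0.
b = np.array([1.0, -2.0])
z = np.zeros(2)                            # z_0
for _ in range(100):
    x = (z + b) / 2.0                      # x_k = J_{lam B}(z_k), with lam = 1
    z = z + np.maximum(2 * x - z, 0) - x   # z_{k+1} = z_k + J_{lam A}(2x_k - z_k) - x_k

x = (z + b) / 2.0
# The shadow sequence J_{lam B}(z_k) reaches zer(A + B) = {(1, 0)}, while
# (z_k) itself converges to a fixed point of R_{lam A} R_{lam B}.
```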
The iteration (7) can be viewed as a discretization of the continuous time dynamical system
where the discretizations \({\dot{z}}(t)\approx z_{k+1}-z_k\) and \(z(t)\approx z_k\) are used. Since the operator \(R_{\lambda A} R_{\lambda B}\) is nonexpansive (i.e., 1-Lipschitz), the Picard–Lindelöf theorem [15, Theorem 2.2] implies that, for any \(z_0\in {\mathcal {H}}\), there exists a unique trajectory z(t) satisfying (8) and the initial condition \(z(0) = z_0\).
Let us now express this dynamical system in an alternative form. First, by using the definition of the reflected resolvent, we observe that (8) can be written as
Denote \(x(t) = J_{\lambda B}(z(t))\) and \(y(t) = z(t) - x(t)\). From the definition of the resolvent, \(y(t)\in \lambda B(x(t))\) and we therefore have
By using these identities to eliminate z from (9), we obtain
This system can be viewed as the continuous dynamical system associated with the shadow trajectories, x(t), of the Douglas–Rachford system (8) specified by z(t). In particular, this fact implies the existence of the trajectories x(t) and y(t). In a later section, we will use a discretization of this system to obtain a new splitting algorithm. Note also that, by using the definition of the resolvent \(J_{\lambda A}\), (11) can be equivalently expressed as
We begin with a theorem concerning the asymptotic behavior of (11). Although this result can be obtained, with some work, from [10, Theorem 6], we give a more direct proof which serves the additional purpose of providing insights useful for the analysis of the discrete case. We require the following two preparatory lemmas.
Lemma 1
Let \(\lambda >0\). Suppose \(A:{\mathcal {H}}\rightrightarrows {\mathcal {H}}\) and \(B:{\mathcal {H}}\rightrightarrows {\mathcal {H}}\) are maximally monotone operators. Then the set-valued operator on \({\mathcal {H}}\times {\mathcal {H}}\) defined by
is demiclosed. That is, its graph is a sequentially closed set in the weak-strong topology.
Proof
Note that the operator in (13) is maximally monotone as it is the sum of two maximally monotone operators, the latter having full domain [7, Corollary 24.4(i)]. Since maximally monotone operators are demiclosed [7, Proposition 20.32], the result follows. \(\square \)
Although the following lemma is a direct consequence of [1, Lemma 5.2], we include its explicit statement for the convenience of the reader.
Lemma 2
Suppose \(T:{\mathcal {H}}\rightarrow {\mathcal {H}}\) is L-Lipschitz continuous. If \({\dot{z}}(t)=T(z(t))\) and \(\int _0^\infty \left\| {\dot{z}}(t) \right\| ^2\,dt<+\infty \), then \({\dot{z}}(t)\rightarrow 0\) as \(t\rightarrow +\infty \).
Proof
Since T is L-Lipschitz continuous, [10, Remark 1] implies that \(\ddot{z}\) exists almost everywhere and that \( \left\| \ddot{z}(t) \right\| = \left\| \frac{d}{dt}Tz(t) \right\| \le L\left\| {\dot{z}}(t) \right\| \) for almost all \(t\ge 0\). From this it follows that \(\int _0^\infty \left\| \ddot{z}(t) \right\| ^2\,dt<+\infty \). We also have
Since the right-hand side is integrable, [1, Lemma 5.2] yields the result. \(\square \)
The following theorem is our main result regarding the asymptotic behavior of (11).
Theorem 2
Let \(A:{\mathcal {H}}\rightrightarrows {\mathcal {H}}\) and \(B:{\mathcal {H}}\rightrightarrows {\mathcal {H}}\) be maximally monotone operators with \({{\,\mathrm{zer}\,}}(A+B) \ne \varnothing \). Let \(\lambda >0\) and \(x_0\in {\mathcal {H}}\). Then the trajectories, x(t) and y(t), generated by (11) with initial condition \(x(0)=x_0\), satisfy

(i)
x(t) converges weakly to a point \({\bar{x}}\in {{\,\mathrm{zer}\,}}(A+B)\).

(ii)
y(t) converges weakly to a point \({\bar{y}}\in \lambda B({\bar{x}})\cap (-\lambda A({\bar{x}}))\).
Proof
Let \({\bar{x}}\in {{\,\mathrm{zer}\,}}(A+B)\) and \({\bar{y}}\in \lambda B({\bar{x}})\cap (-\lambda A({\bar{x}}))\). Denote \({\bar{z}}={\bar{x}}+{\bar{y}}\) and \(z(t)=x(t)+y(t)\). By using monotonicity of \(\lambda A\) with (12) followed by monotonicity of \(\lambda B\), we obtain
In particular, this shows that \(\left\| z(t)-{\bar{z}} \right\| ^2\) is decreasing, hence \(\lim _{t\rightarrow \infty }\left\| z(t)-{\bar{z}} \right\| \) exists, and that \(\int _0^\infty \left\| {\dot{z}}(t) \right\| ^2\, dt<+\infty \). The latter combined with Lemma 2 implies that \({\dot{z}}(t)\rightarrow 0\) as \(t\rightarrow \infty \). Monotonicity of \(\lambda B\) then yields
from which it follows that x(t) is bounded.
By eliminating y(t) from (12) and rearranging the resulting system, we may express (11) in the form
Let (x, z) be a weak sequential cluster point of the bounded trajectory (x(t), z(t)). Taking the limit along this subsequence in (15), using Lemma 1, and unraveling the resulting expression gives
In particular, by combining (14) with (16), we deduce that \(\lim _{t\rightarrow +\infty }\left\| z(t)-z \right\| ^2\) exists. Applying Opial’s lemma [10, Lemma 4] then shows that z(t) converges weakly to a point \({\bar{z}}\in {\bar{x}}+\lambda B({\bar{x}})\) where \({\bar{x}}\) is a weak sequential cluster point of x(t). The definition of \(J_{\lambda B}\) then yields \({\bar{x}}=J_{\lambda B}({\bar{z}})\), which implies that \(J_{\lambda B}({\bar{z}})\) is the unique cluster point of x(t). The trajectory x(t) therefore converges weakly to a point \({\bar{x}}\in {{\,\mathrm{zer}\,}}(A+B)\). To complete the proof, simply note that \(y(t)=z(t)-x(t)\rightharpoonup {\bar{z}}-{\bar{x}}\in \lambda B({\bar{x}})\cap (-\lambda A({\bar{x}}))\) as \(t\rightarrow +\infty \). \(\square \)
From the Continuous to the Discrete
In this section, we devise a new splitting algorithm by considering different discretizations of the dynamical system (11). For the remainder of this work, we will suppose that B is a single-valued operator. In this case, the system (11) simplifies to
In order to discretize this system, let us replace \(x(t)\approx x_k\) and \(y(t)\approx y_k\). As two derivatives appear in (17), there are many combinations of possible discretizations. One involves using forward discretizations of both \({\dot{x}}(t)\) and \({\dot{y}}(t)\), that is,
Under this discretization, (17) becomes
As written, this expression does not give rise to a useful algorithm, since \(x_{k+1}\) appears on both sides of the equation. However, we note that by taking \(z_k=x_k+y_k=({{\,\mathrm{Id}\,}}+\lambda B)(x_k)\) and rearranging, we obtain
which is precisely the usual Douglas–Rachford algorithm given in (7).
To derive a new algorithm, we consider a different discretization of (17). To this end, we perform a forward discretization of \({\dot{x}}(t)\) and a backward discretization of \({\dot{y}}(t)\), that is,
Under this discretization, (17) becomes
Although not surprising, it is interesting to note that (19) and (21) only differ in the indices which appear in the last two terms. In particular, in this expression, \(x_{k+1}\) does not appear on the right-hand side.
Remark 1
(Timestep in the discretization) In the above derivation, we assumed for simplicity of exposition that the discretizations of \({\dot{x}}(t)\) and \({\dot{y}}(t)\) were performed with respect to a unit timestep. However, if a timestep \(\gamma >0\) is used, then (20) becomes
Under this discretization, (17) becomes
In other words, for timesteps \(\gamma \) in (0, 1), the resolvent term in (21) becomes a convex combination with the previous point.
Before turning our attention to the convergence properties of this iteration, we make the following remark.
Remark 2
Backward/forward discretizations of a derivative usually correspond to steps of the same type (backward/forward) in the discrete counterpart of the algorithm. This is, for instance, the case for the forward-backward method, which includes the discussion from Sect. 2 as a special case. It is curious to note, however, that here forward (resp. backward) discretizations gave rise to backward (resp. forward) operators in the discrete counterparts. In particular, two forward discretizations of (17) gave rise to the Douglas–Rachford algorithm, which has two backward steps, whereas one forward and one backward discretization produced a method also having one forward and one backward step.
We now prove the following lemma, which might be interesting in its own right due to the very general form of the recurrence relation.
Lemma 3
Let \(A:{\mathcal {H}}\rightrightarrows {\mathcal {H}}\) be a maximal monotone operator and let \((y_k)\subset {\mathcal {H}}\) be an arbitrary sequence. Let \(x_0\in {\mathcal {H}}\) and consider \((x_k)\) defined by
Then, for all \(x\in {\mathcal {H}}\) and \(y \in -A(x)\), we have
Proof
By the definition of the resolvent and (22), it follows that
Since \(-y \in A(x)\) and A is monotone, we have
which is equivalent to
To simplify (25), we note that
Now, using the above three identities in (25), we obtain
The equivalence between the last inequality and (23) now follows. \(\square \)
Since (21) is of the form specified by Lemma 3, this lemma suggests one possible way to prove convergence of (21): the quantity \(\left\| x_k+y_{k-1} - x - y \right\| ^2\) will be decreasing if the other terms on the right-hand side of (23) can be estimated appropriately. The following theorem, which is our main result regarding convergence of (21), makes use of this observation.
Theorem 3
Let \(A:{\mathcal {H}}\rightrightarrows {\mathcal {H}}\) be maximally monotone and \(B:{\mathcal {H}}\rightarrow {\mathcal {H}}\) be monotone and L-Lipschitz with \({{\,\mathrm{zer}\,}}(A+B)\ne \varnothing \). Let \(\varepsilon >0\), \(\lambda \in \left[ \varepsilon ,\frac{1-3\varepsilon }{3L}\right] \) and let \(x_0,x_{-1}\in {\mathcal {H}}\). Then the sequence \((x_k)\), generated by (21), satisfies

(i)
\((x_k)\) converges weakly to a point \({\overline{x}}\in {{\,\mathrm{zer}\,}}(A+B)\).

(ii)
\((B(x_k))\) converges weakly to \(B({\bar{x}})\).
Proof
Let \(x\in {{\,\mathrm{zer}\,}}(A+B)\) and set \(y=\lambda B(x)\in -\lambda A(x)\). Since (21) is of the form specified by (22), we apply Lemma 3 to the monotone operator \(\lambda A\) with \(y_k=\lambda B(x_k)\) to deduce that the inequality (23) holds. Now, using that B is monotone, we have \(\left\langle y_k-y, x_k-x\right\rangle \ge 0\) and hence
Next, we estimate the inner product in the last line of (27). To this end, note that Young’s inequality gives
and that Lipschitzness of B yields
Combining these two estimates with (27) gives the inequality
By denoting \(z_{k}=x_k+y_{k-1}\) and \(z=x+y\), the previous inequality implies
which telescopes to yield
From this, it follows that \((z_k)\) is bounded and that \({\left\| x_k-x_{k-1} \right\| \rightarrow 0}\). The latter, together with Lipschitz continuity of B, implies \({\left\| y_{k}-y_{k-1} \right\| \rightarrow 0}\) and, consequently, we also have that \(\left\| z_k-z_{k-1} \right\| \rightarrow 0\). Since \(z_k = ({{\,\mathrm{Id}\,}}+\lambda B)(x_k) + (y_{k-1}-y_k)\), we have
Since \((z_k)\) is bounded, \(\left\| y_k-y_{k-1} \right\| \rightarrow 0\) and \(J_{\lambda B}\) is nonexpansive, it then follows that the sequence \((x_k)\) is also bounded. Also, due to (30), we see that the following limit exists
Now, by using the definition of the resolvent \(J_{\lambda A}\), we can express (24) in the form
Let (x, z) be a weak cluster point of the bounded sequence \((x_k,z_k)\). Taking the limit along this subsequence in (31), using Lemma 1, and unravelling the resulting expression gives
Applying Opial’s Lemma [7, Lemma 2.39], it then follows that \((z_k)\) converges weakly to a point \({\bar{z}}={\bar{x}}+\lambda B({\bar{x}})\) where \({\bar{x}}\) is a weak cluster point of \((x_k)\). But then the definition of \(J_{\lambda B}\) yields that \({\bar{x}}=J_{\lambda B}({\bar{z}})\), which implies that \(J_{\lambda B}({\bar{z}})\) is the unique cluster point of \((x_k)\). The sequence \((x_k)\) therefore converges weakly to a point \({\bar{x}}\in {{\,\mathrm{zer}\,}}(A+B)\). To complete the proof, simply note that \(y_{k-1}=z_{k}-x_k\rightharpoonup {\bar{z}}-{\bar{x}} = \lambda B({\bar{x}})\) as \(k\rightarrow \infty \). \(\square \)
Some remarks regarding Theorem 3 and its proof are in order.
Remark 3
(Continuous and discrete proofs) The sequence \(z_{k}=x_k+y_{k-1}\) plays a similar role in Theorem 3 to the trajectory \(z(t)=x(t)+y(t)\) in Theorem 2. This does however highlight a subtle difference between the two proofs—in the discrete case, we have \(x_k=J_{\lambda B}\bigl (z_k+(y_k-y_{k-1})\bigr )\) whereas, in the continuous case, we have \(x(t)=J_{\lambda B}(z(t))\).
Remark 4
In the original submission of this manuscript, we conjectured that the interval in which \(\lambda \) lies could be extended to \(\lambda \in (0,\frac{1}{2L})\). Later, in a private communication, Sebastian Banert constructed a counterexample to show that this is not the case. Indeed, consider the setting with \({\mathcal {H}}={\mathbb {R}}^2\), \(A=3B\) and \(B(x,y)=(-y,x)\). Then choosing \(\lambda =1/3\) and initializing with \(x_0=-B(x_{-1})\) yields that (21) satisfies \(x_{k+2}=-x_k\) for all \(k\in {\mathbb {N}}\). In particular, the sequence \((x_k)\) does not converge when \(x_0\) is nonzero.
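A numerical check of this counterexample (with sign conventions as needed for B to be monotone: \(B(x,y)=(-y,x)\), initialization \(x_0=-B(x_{-1})\), and the update (21) assumed to read as in the comment below):

```python
import numpy as np

# Banert's example: B(x, y) = (-y, x) (monotone, 1-Lipschitz), A = 3B,
# lam = 1/3, and the update (21) assumed to read
#   x_{k+1} = J_{lam A}(x_k - lam*B(x_k)) - lam*(B(x_k) - B(x_{k-1})).
R = np.array([[0.0, -1.0], [1.0, 0.0]])
lam = 1.0 / 3.0
J = np.linalg.inv(np.eye(2) + lam * 3 * R)   # J_{lam A} = (Id + lam*A)^{-1}

x_prev = np.array([1.0, 0.0])                # x_{-1}
x = -R @ x_prev                              # x_0 = -B(x_{-1})
traj = [x]
for _ in range(8):
    x, x_prev = J @ (x - lam * R @ x) - lam * (R @ x - R @ x_prev), x
    traj.append(x)

# The iterates satisfy x_{k+2} = -x_k: the sequence cycles with period 4 and
# never approaches zer(A + B) = {0}.
```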
Interestingly, our original motivation for considering the continuous dynamical system (11) did not arise from its connection to the Douglas–Rachford algorithm, but rather from its connection to the operator splitting method studied in [22] given by
Note that the iterations (21) and (33) look very similar and, in fact, coincide if \(A=0\). For (33), convergence has been established when \(\lambda < \frac{1}{2L}\), which is slightly better than for (21).
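The coincidence at \(A=0\) can be checked directly. Assuming (21) is \(x_{k+1} = J_{\lambda A}(x_k - \lambda B(x_k)) - \lambda (B(x_k) - B(x_{k-1}))\) and (33) is the forward-reflected-backward update \(x_{k+1} = J_{\lambda A}(x_k - \lambda (2B(x_k) - B(x_{k-1})))\) of [22], both reduce to \(x_{k+1} = x_k - \lambda (2B(x_k) - B(x_{k-1}))\) when \(A=0\):

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((3, 3))
S = (M - M.T) / 2.0                     # a random skew, hence monotone, linear B
lam = 0.05

def shadow_dr(x_prev, x, steps):
    """(21) with A = 0, so J_{lam A} = Id."""
    for _ in range(steps):
        x, x_prev = (x - lam * S @ x) - lam * (S @ x - S @ x_prev), x
    return x

def frb(x_prev, x, steps):
    """(33) with A = 0: x_{k+1} = x_k - lam*(2*B(x_k) - B(x_{k-1}))."""
    for _ in range(steps):
        x, x_prev = x - lam * (2 * S @ x - S @ x_prev), x
    return x

# Starting from the same pair (x_{-1}, x_0), the two trajectories agree
# up to floating-point roundoff.
```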
On the other hand, the analysis of dynamical systems corresponding to (33) is more complicated. In particular, a natural candidate for a continuous analogue of (33) is given by
Because we are unable to couple the derivatives \({\dot{x}}(t)\) and \({\dot{y}}(t)\) in (34) in general, it is not clear how to prove existence of its trajectory x(t).
Primal–Dual Algorithms
In this section, we illustrate another application of Lemma 3 in the analysis of a primal–dual algorithm. Consider the bilinear convex–concave saddle point problem
where \(g:{\mathcal {H}}_1\rightarrow (-\infty , +\infty ]\), \(f:{\mathcal {H}}_2 \rightarrow (-\infty , +\infty ]\) are proper convex lsc functions, \(K:{\mathcal {H}}_1 \rightarrow {\mathcal {H}}_2\) is a bounded linear operator with norm \(\left\| K \right\| \), and \(f^*\) denotes the Fenchel conjugate of f. A popular method to solve this problem is the Chambolle–Pock primal–dual method [11] defined by
Under the assumption that the solution set of (35) is nonempty and that \(\tau \sigma \left\| K \right\| ^2 < 1\), one can prove that the sequence \((u_k, v_k)\) converges weakly to a saddle point of (35).
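As a concrete illustration, the Chambolle–Pock iteration can be run on a toy instance of (35) of our choosing, in which both proximal operators reduce to simple scalings:

```python
import numpy as np

# Toy instance of (35): g(u) = ||u - b||^2 / 2 and f(w) = ||w||^2 / 2, so that
# f*(v) = ||v||^2 / 2.  The saddle point satisfies -K^T v = u - b and v = K u,
# i.e. (I + K^T K) u = b.
K = np.array([[1.0, 2.0], [0.0, 1.0]])
b = np.array([1.0, -1.0])
tau = sigma = 0.9 / np.linalg.norm(K, 2)   # tau * sigma * ||K||^2 = 0.81 < 1

u = np.zeros(2)
v = np.zeros(2)
for _ in range(2000):
    u_new = (u - tau * K.T @ v + tau * b) / (1 + tau)      # prox_{tau g}
    v = (v + sigma * K @ (2 * u_new - u)) / (1 + sigma)    # prox_{sigma f*}
    u = u_new

u_star = np.linalg.solve(np.eye(2) + K.T @ K, b)
# u approaches u_star and v approaches K @ u_star
```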
In the spirit of (21), we might consider another primal–dual algorithm:
This algorithm can be viewed as a special case of [18, Algorithm 1] corresponding to the choice of parameters \(\mu =0\) and \(\theta =\lambda =1\) (see also [19]). In what follows, we provide an alternative derivation of its convergence using the same lemma as in the analysis of the shadow DR algorithm. Rather than present the full proof, we will only focus on the most important ingredient—the fact that \((u_k)\), \((v_k)\) remain bounded. Once this is established, the rest follows the same argument as in Theorem 3.
Theorem 4
Let \(g:{\mathcal {H}}_1 \rightarrow (-\infty , +\infty ]\), \(f:{\mathcal {H}}_2\rightarrow (-\infty , +\infty ]\) be proper convex lsc functions and \(K:{\mathcal {H}}_1 \rightarrow {\mathcal {H}}_2\) be a bounded linear operator with norm \(\left\| K \right\| \) such that the solution set of (35) is nonempty. Let \(\tau \sigma \left\| K \right\| ^2 < 1\), let \(u_0\in {\mathcal {H}}_1\), and let \(v_0\in {\mathcal {H}}_2\). Then the sequence \((u_k, v_k)\), generated by (37), converges weakly to a solution of (35).
Proof
Let (u, v) be a saddle point of (35). Then the first-order optimality conditions give \(-K^*v \in \partial g(u)\) and \(Ku \in \partial f^*(v)\). By applying Lemma 3 for a fixed \(k\in {\mathbb {N}}\) with
we obtain
where, instead of (23), we used its equivalent form (26). Similarly, by applying Lemma 3 for a fixed \(k\in {\mathbb {N}}\) with
we obtain
By applying Young’s inequality and using the inequality \(\tau \sigma \left\| K \right\| ^2 <1\), we have
Now, multiplying (38) by \(1/\tau \), (39) by \(1/\sigma \), summing these two inequalities, and then using the estimate (40) yields
By telescoping this inequality, one obtains boundedness of \((u_k)\) and \((v_k)\). In fact, a slightly tighter estimation in (40) would yield \(\left\| u_k-u_{k-1} \right\| \rightarrow 0\) and \(\left\| v_k-v_{k-1} \right\| \rightarrow 0\) (since the inequality \(\tau \sigma \left\| K \right\| ^2 < 1\) is strict). \(\square \)
Concluding Remarks/Future Directions
In this work, we proposed and analyzed a new algorithm for finding a zero of the sum of two monotone operators, one of which is assumed to be Lipschitz continuous. This algorithm naturally arises from a nonstandard discretization of a continuous dynamical system associated with the Douglas–Rachford algorithm. To conclude, we outline possible directions for future work.

Linesearch It would be interesting to incorporate a linesearch procedure in the shadow Douglas–Rachford method. Similarly, it makes sense to consider a continuous dynamical scheme with variable steps, as was done, for example, in [6] for Tseng’s method.

Inertial terms It is important to study extensions of (11) and (21) which incorporate additional inertial and relaxation terms, as was done in the recent work [5] for the forward–backward method. Combining inertial and relaxation effects allows one to go beyond the standard bound of \(\frac{1}{3}\) for the stepsize associated with the inertial term.

Role of reflection Perhaps the most interesting and challenging direction for future work is to understand why the inclusion of a “reflection term” in an algorithm allows for convergence to be proven under milder hypotheses. For instance, applied to the saddle point problem (35), the famous Arrow–Hurwicz algorithm [4] can fail to converge. In contrast, both (36) and (37), which can be viewed as its “reflected” modifications, do converge. Similarly, for the monotone variational inequality \(0\in N_C(x) + B(x)\), where C is a closed convex set and \(N_C\) is its normal cone, the projected gradient algorithm
$$\begin{aligned} x_{k+1} = P_C (x_k - \lambda B(x_k)) \end{aligned}$$ does not work, but its “reflected” modification [21] given by
$$\begin{aligned} x_{k+1} = P_C (x_k - \lambda B(2x_k-x_{k-1})) \end{aligned}$$ does converge to a solution. For the more general monotone inclusion \(0 \in A(x) + B(x)\), the forward-backward method also does not work; however, both of its “reflected” modifications, (21) and (33), do. We note however that although all of the aforementioned algorithms share the same “reflected term”, their analyses are not the same. It would be interesting to understand deeper reasons for their success.
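The contrast for the variational inequality above can be reproduced in a few lines (toy setup of our choosing, with C the unit ball and B a rotation, for which the solution is the origin):

```python
import numpy as np

# VI: 0 in N_C(x) + B(x) with C the closed unit ball and B(x, y) = (-y, x);
# the unique solution is x* = 0.
R = np.array([[0.0, -1.0], [1.0, 0.0]])
lam = 0.3

def P(z):                                   # projection onto the unit ball
    return z / max(1.0, np.linalg.norm(z))

x_pg = np.array([0.5, 0.0])
x = np.array([0.5, 0.0])
x_prev = np.array([0.5, 0.0])
for _ in range(2000):
    x_pg = P(x_pg - lam * R @ x_pg)                    # projected gradient
    x, x_prev = P(x - lam * R @ (2 * x - x_prev)), x   # reflected variant [21]

# The projected gradient iterates spiral outward and then circle the boundary
# of C forever, while the reflected iterates converge to the solution 0.
```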
References
 1.
Abbas, B., Attouch, H., Svaiter, B.F.: Newton-like dynamics and forward-backward methods for structured monotone inclusions in Hilbert spaces. J. Optim. Theory Appl. 161(2), 331–360 (2014)
 2.
Al’ber, Y.I.: Continuous regularization of linear operator equations in a Hilbert space. Math. Notes 4(5), 793–797 (1968)
 3.
Antipin, A.S.: Minimization of convex functions on convex sets by means of differential equations. Diff. Equ. 30(9), 1365–1375 (1994)
 4.
Arrow, K., Hurwicz, L.: Gradient methods for constrained maxima. Op. Res. 5(2), 258–265 (1957)
 5.
Attouch, H., Cabot, A.: Convergence of a relaxed inertial forward-backward algorithm for structured monotone inclusions. Appl. Math. Optim. (2019). https://doi.org/10.1007/s00245-019-09584-z
 6.
Banert, S., Boţ, R.I.: A forward-backward-forward differential equation and its asymptotic properties. J. Convex Anal. 25(2), 371–388 (2018)
 7.
Bauschke, H.H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces, 1st edn. Springer, New York (2011)
 8.
Bello Cruz, J., Díaz Millán, R.: A variant of forward-backward splitting method for the sum of two monotone operators with a new search strategy. Optimization 64(7), 1471–1486 (2015)
 9.
Bolte, J.: Continuous gradient projection method in Hilbert spaces. J. Optim. Theory Appl. 119(2), 235–259 (2003)
 10.
Boţ, R.I., Csetnek, E.R.: A dynamical system associated with the fixed points set of a nonexpansive operator. J. Dyn. Differ. Equ. 29(1), 155–168 (2017)
 11.
Chambolle, A., Pock, T.: A first-order primal-dual algorithm for convex problems with applications to imaging. J. Math. Imaging Vis. 40(1), 120–145 (2011)
 12.
Combettes, P.L., Pesquet, J.C.: Primal-dual splitting algorithm for solving inclusions with mixtures of composite, Lipschitzian, and parallel-sum type monotone operators. Set-Valued Var. Anal. 20(2), 307–330 (2012)
 13.
Douglas, J., Rachford, H.H.: On the numerical solution of heat conduction problems in two and three space variables. Trans. Am. Math. Soc. 82(2), 421–439 (1956)
 14.
Gavurin, M.K.: Nonlinear functional equations and continuous analogues of iterative methods (in Russian). Isvestiya Vuzov Matem 5, 18–31 (1958)
 15.
Granas, A., Dugundji, J.: Fixed Point Theory. Springer, New York (2013)
 16.
Johnstone, P.R., Eckstein, J.: Projective splitting with forward steps: asynchronous and block-iterative operator splitting. arXiv:1803.07043 (2018)
 17.
Johnstone, P.R., Eckstein, J.: Single-forward-step projective splitting: exploiting cocoercivity. arXiv:1902.09025 (2019)
 18.
Latafat, P., Patrinos, P.: Primal-dual proximal algorithms for structured convex optimization: a unifying framework. In: Giselsson, P., Rantzer, A. (eds.) Large-Scale and Distributed Optimization, pp. 97–120. Springer, Cham (2018)
 19.
Latafat, P., Stella, L., Patrinos, P.: New primal-dual proximal algorithm for distributed optimization. In: IEEE 55th Conference on Decision and Control (CDC), 2016, pp. 1959–1964
 20.
Lions, P.L., Mercier, B.: Splitting algorithms for the sum of two nonlinear operators. SIAM J. Numer. Anal. 16(6), 964–979 (1979)
 21.
Malitsky, Y.: Reflected projected gradient method for solving monotone variational inequalities. SIAM J. Optim. 25(1), 502–520 (2015)
 22.
Malitsky, Y., Tam, M.K.: A forwardbackward splitting method for monotone inclusions without cocoercivity. arXiv:1808.04162 (2018)
 23.
Peypouquet, J., Sorin, S.: Evolution equations for maximal monotone operators: asymptotic analysis in continuous and discrete time. J. Convex Anal. 17(3&4), 1113–1163 (2010)
 24.
Polyak, B.T.: Some methods of speeding up the convergence of iteration methods. USSR Comp. Math. Math. Phys. 4(5), 1–17 (1964)
 25.
Tseng, P.: A modified forwardbackward splitting method for maximal monotone mappings. SIAM J. Control Optim. 38, 431–446 (2000)
Acknowledgements
The authors would like to thank the Erwin Schrödinger Institute for their support and hospitality during the thematic program “Modern Maximal Monotone Operator Theory: From Nonsmooth Optimization to Differential Inclusions”. The authors would also like to thank the two anonymous referees for their helpful comments as well as Sebastian Banert for sharing his nice counterexample that we mentioned in Remark 4.
Funding
ERC was supported by Austrian Science Fund Project P 29809-N32. YM was supported by German Research Foundation Grant No. SFB 755-A4.
Csetnek, E.R., Malitsky, Y. & Tam, M.K. Shadow Douglas–Rachford Splitting for Monotone Inclusions. Appl Math Optim 80, 665–678 (2019). https://doi.org/10.1007/s00245019095978
Keywords
 Monotone operator
 Operator splitting
 Douglas–Rachford algorithm
 Dynamical systems
Mathematics Subject Classification
 49M29
 90C25
 47H05
 47J20
 65K15