Abstract
This note provides a comprehensive discussion of the equivalences between some splitting methods. We survey known results concerning these equivalences, which have been studied over the past few decades. In particular, we provide simplified proofs of the equivalence of the ADMM and the Douglas–Rachford method, and of the equivalence of the ADMM with intermediate update of multipliers and the Peaceman–Rachford method.
Notes
1. The adjoint of L is the unique operator \(L^*\colon X\to Y\) that satisfies \(\langle Ly,x\rangle=\langle y,L^*x\rangle\) \((\forall (x,y)\in X\times Y)\).
2. It is straightforward to verify that \((g^\vee)^*=(g^*)^\vee\) (see, e.g., [5, Proposition 13.23(v)]).
3. In passing, we point out that, when X is a finite-dimensional Hilbert space, the condition can be relaxed to . The convergence in this case is proved in [18, Theorem 3.3].
References
Attouch, H., Brézis, H.: Duality for the sum of convex functions in general Banach spaces. In: Aspects of Mathematics and Its Applications 34, pp. 125–133. North-Holland, Amsterdam (1986)
Attouch, H., Théra, M.: A general duality principle for the sum of two operators. J. Convex Anal. 3, 1–24 (1996)
Bauschke, H.H., Combettes, P.L.: A Dykstra-like algorithm for two monotone operators. Pacific J. Optim. 4, 383–391 (2008)
Bauschke, H.H., Boţ, R.I., Hare, W.L., Moursi, W.M.: Attouch–Théra duality revisited: paramonotonicity and operator splitting. J. Approx. Th. 164, 1065–1084 (2012)
Bauschke, H.H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Second Edition. Springer (2017)
Bauschke, H.H., Koch, V.R.: Projection methods: Swiss Army knives for solving feasibility and best approximation problems with halfspaces. In: Infinite Products of Operators and Their Applications, Contemp. Math. vol. 636, pp. 1–40 (2015)
Beck, A.: First-Order Methods in Optimization, SIAM (2017)
Boţ, R.I., Csetnek, E.R.: ADMM for monotone operators: convergence analysis and rates. Adv. Comput. Math. 45, 327–359 (2019)
Boyd, S., Parikh, N., Chu, E., Peleato, B., Eckstein, J.: Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends in Machine Learning 3, 1–122 (2011)
Boyle, J.P., Dykstra, R.L.: A method for finding projections onto the intersection of convex sets in Hilbert spaces. Lecture Notes in Statistics 37, 28–47 (1986)
Chambolle, A., Pock, T.: A first-order primal-dual algorithm for convex problems with applications to imaging, J. Math. Imaging Vis. 40, 120–145 (2011)
Combettes, P.L.: Solving monotone inclusions via compositions of nonexpansive averaged operators. Optimization 53, 475–504 (2004)
Briceño-Arias, L.M., Combettes, P.L.: A monotone + skew splitting model for composite monotone inclusions in duality. SIAM J. Optim. 21, 1230–1250 (2011)
Combettes, P.L., Pesquet, J.-C.: Primal-dual splitting algorithm for solving inclusions with mixtures of composite, Lipschitzian, and parallel-sum type monotone operators. Set-Valued Var. Anal. 20, 307–330 (2012)
Combettes, P.L., Pesquet, J.-C.: A proximal decomposition method for solving convex variational inverse problems. Inverse Problems 24, article 065014 (2008)
Combettes, P.L., Pesquet, J.-C.: Proximal splitting methods in signal processing. In: Fixed-Point Algorithms for Inverse Problems in Science and Engineering, vol. 49, pp. 185–212. Springer, New York (2011)
Combettes, P.L., Vũ, B.-C.: Variable metric forward–backward splitting with applications to monotone inclusions in duality. Optimization 63, 1289–1318 (2014)
Condat, L.: A primal-dual splitting method for convex optimization involving Lipschitzian, proximable and linear composite terms. J. Optim. Th. Appl. 158, 460–479 (2013)
Deutsch, F.: Best Approximation in Inner Product Spaces, Springer (2001)
Eckstein, J.: Splitting Methods for Monotone Operators with Applications to Parallel Optimization, Ph.D. thesis, MIT (1989)
Eckstein, J., Bertsekas, D.P.: On the Douglas–Rachford splitting method and the proximal point algorithm for maximal monotone operators. Math. Prog. 55, 293–318 (1992)
Gabay, D.: Applications of the method of multipliers to variational inequalities. In: Augmented Lagrangian Methods: Applications to the Numerical Solution of Boundary-Value Problems, vol. 15, pp. 299–331. North-Holland, Amsterdam (1983)
Gossez, J.-P.: Opérateurs monotones non linéaires dans les espaces de Banach non réflexifs. J. Math. Anal. Appl. 34, 371–395 (1971)
Lions, P.L., Mercier, B.: Splitting algorithms for the sum of two nonlinear operators. SIAM J. Numer. Anal. 16, 964–979 (1979)
O’Connor, D., Vandenberghe, L.: On the equivalence of the primal-dual hybrid gradient method and Douglas–Rachford splitting. Math. Prog. (Ser. A) (2018), https://doi.org/10.1007/s10107-018-1321-1.
Riesz, F., Sz.-Nagy, B.: Functional Analysis. Dover paperback (1990)
Rockafellar, R.T.: On the maximal monotonicity of subdifferential mappings. Pacific J. Math. 33, 209–216 (1970)
Stoer, J., Bulirsch, R.: Introduction to Numerical Analysis. Third Edition. Springer-Verlag (2002)
Tseng, P.: Applications of a splitting algorithm to decomposition in convex programming and variational inequalities. SIAM J. Control Optim. 29, 119–138 (1991)
Vũ, B.C.: A splitting algorithm for dual monotone inclusions involving cocoercive operators. Adv. Comput. Math. 38, 667–681 (2013)
Acknowledgements
WMM was supported by the Pacific Institute for the Mathematical Sciences (PIMS) Postdoctoral Fellowship and by the DIMACS/Simons Collaboration on Bridging Continuous and Discrete Optimization through NSF grant CCF-1740425.
Appendices
13.1.1 Appendix 1
Let A: X → X be linear. Define the associated quadratic form \(q_A\colon X\to\mathbb{R}\colon x\mapsto\tfrac{1}{2}\langle x,Ax\rangle\).
Recall that a linear operator A: X → X is monotone if \((\forall x\in X)\ \langle x,Ax\rangle\ge 0\), and is strictly monotone if \((\forall x\in X\smallsetminus\{0\})\ \langle x,Ax\rangle>0\). Let h: X → ℝ and let x ∈ X. We say that h is Fréchet differentiable at x if there exists a linear operator Dh(x): X → ℝ, called the Fréchet derivative of h at x, such that \(\lim_{0\neq\|y\|\to 0}\frac{h(x+y)-h(x)-Dh(x)y}{\|y\|}=0\); and h is Fréchet differentiable on X if it is Fréchet differentiable at every point in X.
The following lemma is a special case of [5, Proposition 17.36].
Lemma 13.3
Let A: X → X be linear, strictly monotone, self-adjoint and invertible. Then the following hold:
(i) \(q_A\) and \(q_{A^{-1}}\) are strictly convex, continuous, and Fréchet differentiable. Moreover, \((\nabla q_A,\nabla q_{A^{-1}})=(A,A^{-1})\).
(ii) \(q_{A}^*=q_{A^{-1}}\).
Proof
Note that, like A, \(A^{-1}\) is linear, strictly monotone, self-adjoint (since \((A^{-1})^*=(A^*)^{-1}=A^{-1}\)) and invertible. Moreover, \({\operatorname{ran}} A={\operatorname{ran}} A^{-1}=X\). (i): This follows from [5, Example 17.11 and Proposition 17.36(i)] applied to A and \(A^{-1}\), respectively. (ii): It follows from [5, Proposition 17.36(iii)], [28, Theorem 4.8.5.4] and the invertibility of A that \(q_{A}^*=q_{A^{-1}}+\iota_{{\operatorname{ran}} A}=q_{A^{-1}}+\iota_{X}=q_{A^{-1}}\). □
Proposition 13.3
Let L: Y → X be linear. Suppose that \(L^*L\) is invertible, and let \(g\colon Y\to\left]-\infty,+\infty\right]\) be convex, lower semicontinuous, and proper. Then the following hold:
(i) \(\ker L=\{0\}\).
(ii) \(L^*L\) is strictly monotone.
(iii) \({\operatorname{dom}}(q_{L^*L}+g)^*=X\).
(iv) \(\partial(q_{L^*L}+g)=\nabla q_{L^*L}+\partial g=L^*L+\partial g\).
(v) \((q_{L^*L}+g)^*\) is Fréchet differentiable on X.
(vi) \((L^*L+\partial g)^{-1}\) is single-valued and \({\operatorname{dom}}(L^*L+\partial g)^{-1}=X\).
(vii) \({\operatorname{Prox}}_{g^*\circ L^*}={\operatorname{Id}}-L(L^*L+\partial g)^{-1}L^*\).
(viii) \({\operatorname{Prox}}_{(g^*\circ L^*)^{*}}=L(L^*L+\partial g)^{-1}L^*\).
Proof
(i): Using [5, Fact 2.25(vi)] and the assumption that \(L^*L\) is invertible we have \(\ker L=\ker L^*L=\{0\}\). (ii): Using (i) we have \((\forall x\in X\smallsetminus\{0\})\) \(\langle x,L^*Lx\rangle=\|Lx\|^2>0\), hence \(L^*L\) is strictly monotone. (iii): By (ii) and Lemma 13.3 (i) applied with A replaced by \(L^*L\) we have \({\operatorname{dom}} q_{L^*L}={\operatorname{dom}} q^*_{L^*L}=X\), hence
\({\operatorname{dom}} q_{L^*L}-{\operatorname{dom}} g=X.\) (13.46)
It follows from (13.46), [1, Corollary 2.1] and Lemma 13.3 (ii)&(i) that \({\operatorname{dom}}(q_{L^*L}+g)^*={\operatorname{dom}} q_{L^*L}^*+{\operatorname{dom}} g^{*}={\operatorname{dom}} q_{(L^*L)^{-1}}+{\operatorname{dom}} g^*=X+{\operatorname{dom}} g^*=X\). (iv): Combine (13.46), [1, Corollary 2.1] and Lemma 13.3 (i). (v): Since \(q_{L^*L}\) is strictly convex, so is \(q_{L^*L}+g\), which in view of [5, Proposition 18.9] and (iii) implies that \((q_{L^*L}+g)^*\) is Fréchet differentiable on \(X={\operatorname{int}}{\operatorname{dom}}(q_{L^*L}+g)^*\). (vi): Using (iv), Fact 13.8 (i) applied with f replaced by \(q_{L^*L}+g\), (v) and [5, Proposition 17.31(i)] we have \((L^*L+\partial g)^{-1}=(\partial(q_{L^*L}+g))^{-1}=\partial(q_{L^*L}+g)^*=\{\nabla(q_{L^*L}+g)^*\}\); hence \((L^*L+\partial g)^{-1}\) is single-valued with \({\operatorname{dom}}(L^*L+\partial g)^{-1}=X\).
(vii): Let \(x\in X={\operatorname{dom}}(L^*L+\partial g)^{-1}\) and set \(y=x-L(L^*L+\partial g)^{-1}L^*x\). Then, using (vi), we have
\(y=x-Lu,\ \text{where } u=(L^*L+\partial g)^{-1}(L^*x),\ \text{i.e., } L^*x\in L^*Lu+\partial g(u).\) (13.47)
Consequently, \(L^*y+L^*Lu=L^*x\in L^*Lu+\partial g(u)\), hence \(L^*y\in\partial g(u)\); equivalently, in view of Fact 13.8 (i) applied with f replaced by g, \(u\in(\partial g)^{-1}(L^*y)=\partial g^*(L^*y)\). Combining with (13.47) we learn that
\(x=y+Lu\in y+L\partial g^*(L^*y)=({\operatorname{Id}}+L\circ(\partial g^*)\circ L^*)y.\) (13.48)
Note that [5, Fact 2.25(vi) and Fact 2.26] implies that \({\operatorname{ran}} L^*={\operatorname{ran}} L^*L=X\), hence \(0\in{\operatorname{sri}}({\operatorname{dom}} g^*-{\operatorname{ran}} L^*)\). Therefore one can apply [5, Corollary 16.53(i)] to re-write (13.48) as \(x\in({\operatorname{Id}}+\partial(g^*\circ L^*))y\). Therefore, \(y={\operatorname{Prox}}_{g^*\circ L^*}x\) by [5, Proposition 16.44]. (viii): Apply Fact 13.8 (ii) with f replaced by \(g^*\circ L^*\). □
13.1.2 Appendix 2
Lemma 13.4
Let \(g\colon Y \to \left ]-\infty ,+\infty \right ]\) be convex, lower semicontinuous, and proper. Consider the following statements:
(i) g is strongly convex.
(ii) \(g^*\) is Fréchet differentiable and \(\nabla g^*\) is Lipschitz continuous.
(iii) \(g^*\circ L^*\) is Fréchet differentiable and \(\nabla(g^*\circ L^*)=L\circ(\nabla g^*)\circ L^*\) is Lipschitz continuous.
(iv) \((g^*\circ L^*)^*\) is strongly convex.
Then (i)⇔(ii)⇒(iii)⇔(iv).
Proof
(i)⇔(ii): See [5, Theorem 18.15]. (ii)⇒(iii): Clearly \(g^*\circ L^*\) is Fréchet differentiable. Now let (x, y) ∈ X × X and suppose that β > 0 is a Lipschitz constant of \(\nabla g^*\). It follows from [5, Corollary 16.53] that \(\|\nabla(g^*\circ L^*)x-\nabla(g^*\circ L^*)y\|=\|L\nabla g^*(L^*x)-L\nabla g^*(L^*y)\|\le\beta\|L\|\,\|L^*x-L^*y\|\le\beta\|L\|^2\|x-y\|\). (iii)⇔(iv): Use the equivalence of (i) and (ii) applied with g replaced by \((g^*\circ L^*)^*\). □
13.1.3 Appendix 3
We start by recalling the following well-known fact.
Fact 13.8
Let \(f\colon X\to \left ]-\infty ,+\infty \right ]\) be convex, lower semicontinuous and proper and let γ > 0. Then the following hold:
(i) \((\partial f)^{-1}=\partial f^*\).
(ii) \({\operatorname{Prox}}_{\gamma f}+{\operatorname{Prox}}_{(\gamma f)^*}={\operatorname{Id}}\).
Proof
(i): See, e.g., [27, Remark on page 216] or [23, Théorème 3.1].
(ii): See, e.g., [5, Theorem 14.3(iii)]. □
Lemma 13.5
Let γ > 0. The Douglas–Rachford method given in (13.9) applied to the ordered pair (γf, γg) with a starting point \(x_0\in X\) to solve (13.8) can be rewritten as:
\(x_{n+1}={\operatorname{Prox}}_{\gamma f}x_n-{\operatorname{Prox}}_{(\gamma g)^*}(2{\operatorname{Prox}}_{\gamma f}x_n-x_n).\)
Proof
Using (13.9a), (13.10), and Fact 13.8 (ii) applied with f replaced by g we have
\(x_{n+1}=x_n+{\operatorname{Prox}}_{\gamma g}(2{\operatorname{Prox}}_{\gamma f}x_n-x_n)-{\operatorname{Prox}}_{\gamma f}x_n=x_n+(2{\operatorname{Prox}}_{\gamma f}x_n-x_n)-{\operatorname{Prox}}_{(\gamma g)^*}(2{\operatorname{Prox}}_{\gamma f}x_n-x_n)-{\operatorname{Prox}}_{\gamma f}x_n={\operatorname{Prox}}_{\gamma f}x_n-{\operatorname{Prox}}_{(\gamma g)^*}(2{\operatorname{Prox}}_{\gamma f}x_n-x_n),\)
and the conclusion follows. □
13.1.4 Appendix 4
Proposition 13.4
Let (x, y, z) ∈ X × Y × Z and let B and \(\widetilde {f}\) be defined as in (13.39) and (13.41). Then the following hold:
(i) \(B^*y=(A^*y,C^*y)\).
(ii) \({\operatorname{dom}}\widetilde{f}={\operatorname{dom}} f\times\{0\}\).
(iii) \((\forall(x,z)\in{\operatorname{dom}}\widetilde{f})\) we have z = 0 and \(B(x,z)=Ax\).
(iv) \(B({\operatorname{dom}}\widetilde{f})=A({\operatorname{dom}} f)\).
(v) \(0\in{\operatorname{sri}}({\operatorname{dom}} g-B({\operatorname{dom}}\widetilde{f}))\).
(vi) \({\operatorname{argmin}}(\widetilde{f}+g\circ B)={\operatorname{argmin}}(f+g\circ A)\times\{0\}\neq\varnothing\).
(vii) \({\operatorname{Prox}}_{\widetilde{f}}(x,z)=({\operatorname{Prox}}_f x,0)\).
(viii) \({\operatorname{Prox}}_{(\tau g\circ B)^*}=\tau B^*{\operatorname{Prox}}_{\sigma g^*}(\sigma B)\).
Proof
(i): This clearly follows from (13.39). (ii): It follows from (13.41) that \({\operatorname {dom}}\widetilde {f}={\operatorname {dom}} f\times {\operatorname {dom}}\iota _{\{0\}}={\operatorname {dom}} f\times \{0\}\). (iii): The claim that z = 0 follows from (ii). Now combine with (13.39). (iv): Combine (ii) and (iii). (v): Combine (iv) and (13.34). (vi): We have
where (13.51a) follows from (v) and (13.4) applied with (f, g, L) replaced by \((g,\widetilde {f},B)\), and (13.51b) follows from (13.39) and (13.41). Therefore, \((x,z)\in {\operatorname {argmin}}(\widetilde {f}+g\circ B)\) ⇔ [z = 0 and \(x\in {\operatorname {zer}}(\partial f+A^* \circ \partial g\circ A)\)] ⇔ \((x,z)\in {\operatorname {argmin}}(f+g\circ A) \times \{0\}\). Now combine with (13.4). (vii): Combine (13.41) and [5, Proposition 23.18]. (viii): Indeed, Proposition 13.3 (viii) implies
□
Cite this chapter
Moursi, W.M., Zinchenko, Y. (2019). A Note on the Equivalence of Operator Splitting Methods. In: Bauschke, H., Burachik, R., Luke, D. (eds) Splitting Algorithms, Modern Operator Theory, and Applications. Springer, Cham. https://doi.org/10.1007/978-3-030-25939-6_13