
A Note on the Equivalence of Operator Splitting Methods

Splitting Algorithms, Modern Operator Theory, and Applications

Abstract

This note provides a comprehensive discussion of the equivalences between some splitting methods. We survey known results concerning these equivalences which have been studied over the past few decades. In particular, we provide simplified proofs of the equivalence of the ADMM and the Douglas–Rachford method and the equivalence of the ADMM with intermediate update of multipliers and the Peaceman–Rachford method.


Notes

  1.

    The adjoint of L is the unique operator \(L^*\colon X\to Y\) that satisfies \(\langle Ly, x\rangle = \langle y, L^*x\rangle \) (∀(x, y) ∈ X × Y).

  2.

    It is straightforward to verify that \(g^{\vee *}=(g^*)^{\vee }\) (see, e.g., [5, Proposition 13.23(v)]).

  3.

    In passing, we point out that, when X is a finite-dimensional Hilbert space, the condition can be relaxed to . The convergence in this case is proved in [18, Theorem 3.3].

References

  1. Attouch, H., Brézis, H.: Duality for the sum of convex functions in general Banach spaces. In: Aspects of Mathematics and Its Applications 34, pp. 125–133. North-Holland, Amsterdam (1986)

  2. Attouch, H., Théra, M.: A general duality principle for the sum of two operators. J. Convex Anal. 3, 1–24 (1996)

  3. Bauschke, H.H., Combettes, P.L.: A Dykstra-like algorithm for two monotone operators. Pacific J. Optim. 4, 383–391 (2008)

  4. Bauschke, H.H., Boţ, R.I., Hare, W.L., Moursi, W.M.: Attouch–Théra duality revisited: paramonotonicity and operator splitting. J. Approx. Theory 164, 1065–1084 (2012)

  5. Bauschke, H.H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces, second edition. Springer (2017)

  6. Bauschke, H.H., Koch, V.R.: Projection methods: Swiss Army knives for solving feasibility and best approximation problems with halfspaces. In: Infinite Products of Operators and Their Applications, Contemp. Math. vol. 636, pp. 1–40 (2012)

  7. Beck, A.: First-Order Methods in Optimization. SIAM (2017)

  8. Boţ, R.I., Csetnek, E.R.: ADMM for monotone operators: convergence analysis and rates. Adv. Comput. Math. 45, 327–359 (2019)

  9. Boyd, S., Parikh, N., Chu, E., Peleato, B., Eckstein, J.: Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends in Machine Learning 3, 1–122 (2011)

  10. Boyle, J.P., Dykstra, R.L.: A method for finding projections onto the intersection of convex sets in Hilbert spaces. Lecture Notes in Statistics 37, 28–47 (1986)

  11. Chambolle, A., Pock, T.: A first-order primal-dual algorithm for convex problems with applications to imaging. J. Math. Imaging Vis. 40, 120–145 (2011)

  12. Combettes, P.L.: Solving monotone inclusions via compositions of nonexpansive averaged operators. Optimization 53, 475–504 (2004)

  13. Briceño-Arias, L.M., Combettes, P.L.: A monotone + skew splitting model for composite monotone inclusions in duality. SIAM J. Optim. 21, 1230–1250 (2011)

  14. Combettes, P.L., Pesquet, J.-C.: Primal-dual splitting algorithm for solving inclusions with mixtures of composite, Lipschitzian, and parallel-sum type monotone operators. Set-Valued Var. Anal. 20, 307–330 (2012)

  15. Combettes, P.L., Pesquet, J.-C.: A proximal decomposition method for solving convex variational inverse problems. Inverse Problems 24, article 065014 (2008)

  16. Combettes, P.L., Pesquet, J.-C.: Proximal splitting methods in signal processing. In: Fixed-Point Algorithms for Inverse Problems in Science and Engineering, vol. 49, pp. 185–212. Springer, New York (2011)

  17. Combettes, P.L., Vũ, B.-C.: Variable metric forward–backward splitting with applications to monotone inclusions in duality. Optimization 63, 1289–1318 (2014)

  18. Condat, L.: A primal-dual splitting method for convex optimization involving Lipschitzian, proximable and linear composite terms. J. Optim. Theory Appl. 158, 460–479 (2013)

  19. Deutsch, F.: Best Approximation in Inner Product Spaces. Springer (2001)

  20. Eckstein, J.: Splitting Methods for Monotone Operators with Applications to Parallel Optimization. Ph.D. thesis, MIT (1989)

  21. Eckstein, J., Bertsekas, D.P.: On the Douglas–Rachford splitting method and the proximal point algorithm for maximal monotone operators. Math. Program. 55, 293–318 (1992)

  22. Gabay, D.: Applications of the method of multipliers to variational inequalities. In: Augmented Lagrangian Methods: Applications to the Numerical Solution of Boundary-Value Problems, vol. 15, pp. 299–331. North-Holland, Amsterdam (1983)

  23. Gossez, J.-P.: Opérateurs monotones non linéaires dans les espaces de Banach non réflexifs. J. Math. Anal. Appl. 34, 371–395 (1971)

  24. Lions, P.L., Mercier, B.: Splitting algorithms for the sum of two nonlinear operators. SIAM J. Numer. Anal. 16, 964–979 (1979)

  25. O’Connor, D., Vandenberghe, L.: On the equivalence of the primal-dual hybrid gradient method and Douglas–Rachford splitting. Math. Program. (Ser. A) (2018). https://doi.org/10.1007/s10107-018-1321-1

  26. Riesz, F., Sz.-Nagy, B.: Functional Analysis. Dover (1990)

  27. Rockafellar, R.T.: On the maximal monotonicity of subdifferential mappings. Pacific J. Math. 33, 209–216 (1970)

  28. Stoer, J., Bulirsch, R.: Introduction to Numerical Analysis, third edition. Springer-Verlag (2002)

  29. Tseng, P.: Applications of a splitting algorithm to decomposition in convex programming and variational inequalities. SIAM J. Control Optim. 29, 119–138 (1991)

  30. Vũ, B.C.: A splitting algorithm for dual monotone inclusions involving cocoercive operators. Adv. Comput. Math. 38, 667–681 (2013)


Acknowledgements

WMM was supported by the Pacific Institute of Mathematics Postdoctoral Fellowship and the DIMACS/Simons Collaboration on Bridging Continuous and Discrete Optimization through NSF grant # CCF-1740425.

Corresponding author

Correspondence to Walaa M. Moursi.


Appendices

13.1.1 Appendix 1

Let A: X → X be linear. Define

$$\displaystyle \begin{aligned} q_A\colon X\to \mathbb{R}\colon x\mapsto \tfrac{1}{2}\langle x, Ax\rangle. \end{aligned} $$
(13.45)

Recall that a linear operator A: X → X is monotone if (∀x ∈ X) 〈x, Ax〉≥ 0, and is strictly monotone if \((\forall x\in X\smallsetminus \{0\})\) 〈x, Ax〉 > 0. Let h: X →ℝ and let x ∈ X. We say that h is Fréchet differentiable at x if there exists a linear operator Dh(x): X →ℝ, called the Fréchet derivative of h at x, such that \(\lim _{0\neq \|y\|\to 0}\frac {h(x+y)-h(x)-Dh(x)y}{\|y\|}=0\); and h is Fréchet differentiable on X if it is Fréchet differentiable at every point in X.

The following lemma is a special case of [5, Proposition 17.36].

Lemma 13.3

Let A: X → X be linear, strictly monotone, self-adjoint and invertible. Then the following hold:

  1. (i)

    \(q_A\) and \(q_{A^{-1}}\) are strictly convex, continuous, and Fréchet differentiable. Moreover, \((\nabla q_A,\nabla q_{A^{-1}} )=(A,A^{-1})\).

  2. (ii)

    \(q_{A}^*=q_{A^{-1}}\).

Proof

Note that, like A, the operator \(A^{-1}\) is linear, strictly monotone, self-adjoint (since \((A^{-1})^*=(A^*)^{-1}=A^{-1}\)) and invertible. Moreover, \({\operatorname {ran}} A={\operatorname {ran}} A^{-1}=X\). (i): This follows from [5, Example 17.11 and Proposition 17.36(i)] applied to A and \(A^{-1}\), respectively. (ii): It follows from [5, Proposition 17.36(iii)], [28, Theorem 4.8.5.4] and the invertibility of A that \(q_{A}^*=q_{A^{-1}}+\iota _{{\operatorname {ran}} A} =q_{A^{-1}}+\iota _{X}=q_{A^{-1}}\). □
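
The conjugacy identity in (ii) lends itself to a quick numerical sanity check via the Fenchel–Young inequality \(q_A(x)+q_A^*(u)\geq \langle x,u\rangle \), with equality exactly when \(u=\nabla q_A(x)=Ax\). The following Python sketch is not part of the chapter; it assumes NumPy and takes X = ℝ⁵ with a randomly generated symmetric positive definite A:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)      # symmetric positive definite: strictly monotone, invertible
A_inv = np.linalg.inv(A)

def q(B, x):                     # quadratic form q_B(x) = (1/2) <x, B x>
    return 0.5 * x @ (B @ x)

x = rng.standard_normal(n)
u = A @ x                        # u = grad q_A(x)

# Fenchel-Young holds with equality at u = A x, because q_A^* = q_{A^{-1}}:
assert abs(q(A, x) + q(A_inv, u) - x @ u) < 1e-10

# ... and as a strict inequality at a generic point v != A x:
v = rng.standard_normal(n)
assert q(A, x) + q(A_inv, v) >= x @ v - 1e-10
```

The check exercises both parts of the lemma: the gradient formula \(\nabla q_A=A\) picks the equality point, and \(q_A^*=q_{A^{-1}}\) supplies the conjugate value.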

Proposition 13.3

Let L: Y → X be linear. Suppose that \(L^*L\) is invertible. Let \(g\colon Y \to \left ]-\infty ,+\infty \right ]\) be convex, lower semicontinuous, and proper. Then the following hold:

  1. (i)

    \(\ker L=\{0\}\).

  2. (ii)

    \(L^*L\) is strictly monotone.

  3. (iii)

    \({\operatorname {dom}}(q_{L^*L}+g)^*=X\).

  4. (iv)

    \(\partial (q_{L^*L}+g)=\nabla q_{L^*L}+\partial g=L^*L+\partial g\).

  5. (v)

    \((q_{L^*L}+g^*)^*\) is Fréchet differentiable on X.

  6. (vi)

    \((L^*L+\partial g^*)^{-1}\) is single-valued and \({\operatorname {dom}} (L^*L+\partial g^*)^{-1}=X\).

  7. (vii)

    \({\operatorname {Prox}}_{g^*\circ L^*}={\operatorname {Id}}-L(L^*L+\partial g)^{-1}L^*\).

  8. (viii)

    \({\operatorname {Prox}}_{(g^*\circ L^*)^{*}}=L(L^*L+\partial g)^{-1}L^*\).

Proof

(i): Using [5, Fact 2.25(vi)] and the assumption that \(L^*L\) is invertible we have \(\ker L=\ker L^*L=\{0\}\). (ii): Using (i) we have \((\forall x\in X\smallsetminus \{0\})\) \(\langle x, L^*Lx\rangle =\|Lx\|{}^2>0\), hence \(L^*L\) is strictly monotone. (iii): By (ii) and Lemma 13.3 (i) applied with A replaced by \(L^*L\) we have \({\operatorname {dom}} q_{L^*L}={\operatorname {dom}} q^*_{L^*L}=X\), hence

$$\displaystyle \begin{aligned} {\operatorname{dom}} q_{L^*L}- {\operatorname{dom}} g=X- {\operatorname{dom}} g=X. \end{aligned} $$
(13.46)

It follows from (13.46), [1, Corollary 2.1] and Lemma 13.3 (ii)&(i) that \({\operatorname {dom}} (q_{L^*L}+g)^*={\operatorname {dom}} q_{L^*L}^*+{\operatorname {dom}} g^{*} ={\operatorname {dom}} q_{(L^*L)^{-1}}+{\operatorname {dom}} g^*=X+{\operatorname {dom}} g^*=X\). (iv): Combine (13.46), [1, Corollary 2.1] and Lemma 13.3 (i). (v): Since \(q_{L^*L}\) is strictly convex, so is \(q_{L^*L}+g\), which in view of [5, Proposition 18.9] and (iii) implies that \((q_{L^*L}+g)^*\) is Fréchet differentiable on \(X={\operatorname {int}} {\operatorname {dom}} (q_{L^*L}+g)^*\). (vi): Using (iv), Fact 13.8 (i) applied with f replaced by \(q_{L^*L}+g\), (v) and [5, Proposition 17.31(i)] we have \((L^*L+\partial g)^{-1}=(\partial (q_{L^*L}+g))^{-1} =\partial (q_{L^*L}+g)^* =\{\nabla (q_{L^*L}+g)^*\}\) is single-valued with \({\operatorname {dom}} (L^*L+\partial g)^{-1}=X\).

(vii): Let \(x\in X={\operatorname {dom}} (L^*L+\partial g)^{-1}\) and let y ∈ X such that \(y=x-L(L^*L+\partial g)^{-1}L^*x\). Then using (vi) we have

$$\displaystyle \begin{aligned} u:=(L^*L+\partial g)^{-1}L^*x \quad \text{and}\quad y=x-Lu. \end{aligned} $$
(13.47)

Consequently, \(L^*y+L^*Lu=L^*x\in L^*Lu+\partial g(u)\), hence \(L^*y\in \partial g(u)\); equivalently, in view of Fact 13.8 (i) applied with f replaced by g, \(u\in (\partial g)^{-1}(L^*y)=\partial g^*(L^*y)\). Combining with (13.47) we learn that

$$\displaystyle \begin{aligned} x\in y+L\circ(\partial g^{*})\circ L^* (y). \end{aligned} $$
(13.48)

Note that [5, Fact 2.25(vi) and Fact 2.26] implies that \({\operatorname {ran}} L^*={\operatorname {ran}} L^*L=X\), hence \(0\in {\operatorname {sri}}({\operatorname {dom}} g^*-{\operatorname {ran}} L^*)\). Therefore one can apply [5, Corollary 16.53(i)] to re-write (13.48) as \(x\in ({\operatorname {Id}}+\partial (g^*\circ L^*))y\). Therefore, \(y={\operatorname {Prox}}_{g^*\circ L^*}x\) by [5, Proposition 16.44]. (viii): Apply Fact 13.8 (ii) with f replaced by \(g^*\circ L^*\). □
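
Formulas (vii) and (viii) can be made fully explicit for a quadratic choice of g, which is not from the chapter but convenient for a sanity check. Taking g = ½‖·‖² (so ∂g = Id and g* = ½‖·‖²) and modelling L as a 6×3 real matrix with full column rank (so \(L^*L\) is invertible), \({\operatorname {Prox}}_{g^*\circ L^*}\) is available directly as (Id + LL*)⁻¹, and (vii) reduces to the push-through matrix identity. A Python sketch assuming NumPy:

```python
import numpy as np

rng = np.random.default_rng(1)
m, k = 6, 3                           # L : R^k -> R^m, full column rank => L^T L invertible
L = rng.standard_normal((m, k))
I_k, I_m = np.eye(k), np.eye(m)

x = rng.standard_normal(m)

# With g = (1/2)||.||^2, g* o L* maps x to (1/2)||L^T x||^2, whose prox is (I + L L^T)^{-1} x:
direct = np.linalg.solve(I_m + L @ L.T, x)

# Item (vii): Prox_{g* o L*} x = x - L (L^T L + dg)^{-1} L^T x, with dg = Id here:
via_vii = x - L @ np.linalg.solve(L.T @ L + I_k, L.T @ x)
assert np.allclose(direct, via_vii)

# Item (viii) plus the Moreau decomposition: the two proxes sum to the identity.
via_viii = L @ np.linalg.solve(L.T @ L + I_k, L.T @ x)
assert np.allclose(direct + via_viii, x)
```

The agreement of `direct` and `via_vii` is exactly the identity (I + LLᵀ)⁻¹ = I − L(LᵀL + I)⁻¹Lᵀ.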

13.1.2 Appendix 2

Lemma 13.4

Let \(g\colon Y \to \left ]-\infty ,+\infty \right ]\) be convex, lower semicontinuous, and proper. Consider the following statements:

  1. (i)

    g is strongly convex.

  2. (ii)

    \(g^*\) is Fréchet differentiable and \(\nabla g^*\) is Lipschitz continuous.

  3. (iii)

    \(g^*\circ L^*\) is Fréchet differentiable and \(\nabla (g^*\circ L^*)=L\circ (\nabla g^*)\circ L^*\) is Lipschitz continuous.

  4. (iv)

    \((g^*\circ L^*)^*\) is strongly convex.

Then (i) ⇒ (ii) ⇒ (iii) ⇒ (iv).

Proof

(i)⇒(ii): See [5, Theorem 18.15]. (ii)⇒(iii): Clearly \(g^*\circ L^*\) is Fréchet differentiable. Now let (x, y) ∈ X × X and suppose that β > 0 is a Lipschitz constant of \(\nabla g^*\). It follows from [5, Corollary 16.53] that \(\|\nabla (g^*\circ L^*)x-\nabla (g^*\circ L^*)y\| =\|L(\nabla g^*)(L^*x)-L(\nabla g^*)(L^*y)\| \leq \beta \|L\|{}^2\|x-y\|\). (iii)⇒(iv): Use the equivalence of (i) and (ii) applied with g replaced by \((g^*\circ L^*)^*\). □
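
The Lipschitz bound β‖L‖² in the proof can be observed numerically. The following Python sketch is illustrative only (it assumes NumPy and the hypothetical strongly convex choice g = (σ/2)‖·‖², for which ∇g* = σ⁻¹ Id with Lipschitz constant β = 1/σ):

```python
import numpy as np

rng = np.random.default_rng(3)
sigma = 2.0                                     # g = (sigma/2)||.||^2 is sigma-strongly convex
L = rng.standard_normal((5, 3))

grad_g_conj = lambda u: u / sigma               # grad g*, Lipschitz with beta = 1/sigma
grad_comp = lambda x: L @ grad_g_conj(L.T @ x)  # grad(g* o L*) = L o (grad g*) o L*

beta = 1.0 / sigma
opnorm_L = np.linalg.norm(L, 2)                 # operator (spectral) norm of L
x, y = rng.standard_normal(5), rng.standard_normal(5)

lhs = np.linalg.norm(grad_comp(x) - grad_comp(y))
assert lhs <= beta * opnorm_L**2 * np.linalg.norm(x - y) + 1e-12
```

For this quadratic g the bound is in fact attained along the leading singular direction of L, so β‖L‖² is the exact Lipschitz constant here.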

13.1.3 Appendix 3

We start by recalling the following well-known fact.

Fact 13.8

Let \(f\colon X\to \left ]-\infty ,+\infty \right ]\) be convex, lower semicontinuous and proper and let γ > 0. Then the following hold:

  1. (i)

    \((\partial f)^{-1}=\partial f^*\).

  2. (ii)

    \({\operatorname {Prox}}_{\gamma f}+{\operatorname {Prox}}_{(\gamma f)^*}={\operatorname {Id}}\).

Proof

(i): See, e.g., [27, Remark on page 216] or [23, Théorème 3.1].

(ii): See, e.g., [5, Theorem 14.3(iii)]. □
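
The Moreau decomposition in (ii) is easy to see in coordinates. As a numerical illustration (not from the chapter), take f = ‖·‖₁ on ℝⁿ: then \({\operatorname {Prox}}_{\gamma f}\) is componentwise soft-thresholding, \((\gamma f)^*\) is the indicator of [−γ, γ]ⁿ, and its prox is the projection onto that box. A Python sketch assuming NumPy:

```python
import numpy as np

gamma = 0.7

def prox_gamma_l1(x):        # Prox_{gamma ||.||_1}: componentwise soft-thresholding
    return np.sign(x) * np.maximum(np.abs(x) - gamma, 0.0)

def prox_conj(x):            # (gamma ||.||_1)^* = indicator of [-gamma, gamma]^n; its prox is the projection
    return np.clip(x, -gamma, gamma)

x = np.array([-2.0, -0.3, 0.0, 0.5, 1.9])
# Fact 13.8(ii): the two proxes decompose the identity.
assert np.allclose(prox_gamma_l1(x) + prox_conj(x), x)
```

Componentwise this is immediate: where |xᵢ| ≤ γ the soft-threshold returns 0 and the projection returns xᵢ, and where |xᵢ| > γ the two terms contribute sign(xᵢ)(|xᵢ| − γ) and sign(xᵢ)γ.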

Lemma 13.5

Let γ > 0. The Douglas–Rachford method given in (13.9) applied to the ordered pair (γf, γg) with a starting point \(x_0\in X\) to solve (13.8) can be rewritten as:

$$\displaystyle \begin{aligned} y_n&= {\operatorname{Prox}}_{\gamma f} x_{n} {} \end{aligned} $$
(13.49a)
$$\displaystyle \begin{aligned} x_{n+1}&=y_{n}-{\operatorname{Prox}}_{(\gamma g)^*}(2y_n-x_{n}). {} \end{aligned} $$
(13.49b)

Proof

Using (13.9a), (13.10), and Fact 13.8 (ii) applied with f replaced by g we have

$$\displaystyle \begin{aligned} x_{n+1}&= x_n-{\operatorname{Prox}}_{\gamma f }x_n+{\operatorname{Prox}}_{\gamma g}(2{\operatorname{Prox}}_{\gamma f} x_n-x_n) =x_n-y_n+{\operatorname{Prox}}_{\gamma g}(2y_n-x_n) \\ &=x_n-y_n+2y_n-x_n-{\operatorname{Prox}}_{{(\gamma g)}^*}(2y_n-x_n) =y_{n}-{\operatorname{Prox}}_{(\gamma g)^*}(2y_n-x_{n}), \end{aligned} $$
(13.50)

and the conclusion follows. □
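
The computation in (13.50) can be verified iterate-by-iterate on a small example. The following Python sketch (illustrative only; it assumes NumPy, takes f = ‖·‖₁ and the hypothetical choice g = ½‖· − b‖², whose proxes are soft-thresholding and a closed-form averaging, and obtains \({\operatorname {Prox}}_{(\gamma g)^*}\) from the Moreau decomposition of Fact 13.8 (ii)) runs both forms of the update side by side:

```python
import numpy as np

rng = np.random.default_rng(2)
gamma, b = 0.5, rng.standard_normal(4)

prox_gf = lambda x: np.sign(x) * np.maximum(np.abs(x) - gamma, 0.0)  # f = ||.||_1
prox_gg = lambda x: (x + gamma * b) / (1 + gamma)                    # g = (1/2)||. - b||^2
prox_gg_conj = lambda x: x - prox_gg(x)            # Moreau: Prox_{(gamma g)^*} = Id - Prox_{gamma g}

x = rng.standard_normal(4)
for _ in range(25):
    y = prox_gf(x)                                 # (13.49a)
    step_original = x - y + prox_gg(2 * y - x)     # DR update as in (13.50)
    step_rewritten = y - prox_gg_conj(2 * y - x)   # rewritten form (13.49b)
    assert np.allclose(step_original, step_rewritten)
    x = step_rewritten
```

Since the rewrite is a pointwise identity, the two sequences agree at every iteration, not merely in the limit.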

13.1.4 Appendix 4

Proposition 13.4

Let (x, y, z) ∈ X × Y × Z and let B and \(\widetilde {f}\) be defined as in (13.39) and (13.41). Then the following hold:

  1. (i)

    \(B^*y=(A^*y, C^*y)\).

  2. (ii)

    \({\operatorname {dom}} \widetilde {f}={\operatorname {dom}} f\times \{0\}\).

  3. (iii)

    \((\forall (x,z)\in {\operatorname {dom}} \widetilde {f})\) we have z = 0 and B(x, z) = Ax.

  4. (iv)

    \(B({\operatorname {dom}} \widetilde {f})=A({\operatorname {dom}} f)\).

  5. (v)

    \(0\in {\operatorname {sri}} ({\operatorname {dom}} g-B({\operatorname {dom}} \widetilde {f}))\).

  6. (vi)

    \({\operatorname {argmin}}(\widetilde {f}+g\circ B)={\operatorname {argmin}} (f+g\circ A)\times \{0\} \neq \varnothing \).

  7. (vii)

    \({\operatorname {Prox}}_{\widetilde {f}}(x,z)=({\operatorname {Prox}}_f x,0)\).

  8. (viii)

    \({\operatorname {Prox}}_{(\tau g\circ B)^*}=\tau B^*{\operatorname {Prox}}_{\sigma g^*}(\sigma B)\).

Proof

(i): This clearly follows from (13.39). (ii): It follows from (13.41) that \({\operatorname {dom}}\widetilde {f}={\operatorname {dom}} f\times {\operatorname {dom}}\iota _{\{0\}}={\operatorname {dom}} f\times \{0\}\). (iii): The claim that z = 0 follows from (ii). Now combine with (13.39). (iv): Combine (ii) and (iii). (v): Combine (iv) and (13.34). (vi): We have

$$\displaystyle \begin{aligned} {\operatorname{argmin}}(\widetilde{f}+g\circ B) &={\operatorname{zer}} (\partial \widetilde{f}+B^* \circ \partial g\circ B) {} \end{aligned} $$
(13.51a)
(13.51b)
(13.51c)

where (13.51a) follows from (v) and (13.4) applied with (f, g, L) replaced by \((g,\widetilde {f},B)\), and (13.51b) follows from (13.39) and (13.41). Therefore, \((x,z)\in {\operatorname {argmin}}(\widetilde {f}+g\circ B)\) ⇔ [z = 0 and \(x\in {\operatorname {zer}}(\partial f+A^* \circ \partial g\circ A)\)] ⇔ \((x,z)\in {\operatorname {argmin}}(f+g\circ A) \times \{0\}\). Now combine with (13.4). (vii): Combine (13.41) and [5, Proposition 23.18]. (viii): Indeed, Proposition 13.3 (viii) implies

(13.52a)
(13.52b)


Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter


Cite this chapter

Moursi, W.M., Zinchenko, Y. (2019). A Note on the Equivalence of Operator Splitting Methods. In: Bauschke, H., Burachik, R., Luke, D. (eds) Splitting Algorithms, Modern Operator Theory, and Applications. Springer, Cham. https://doi.org/10.1007/978-3-030-25939-6_13
