Abstract
This note provides a comprehensive discussion of the equivalences between some splitting methods. We survey known results concerning these equivalences, which have been studied over the past few decades. In particular, we provide simplified proofs of the equivalence of the ADMM and the Douglas–Rachford method, and of the equivalence of the ADMM with intermediate update of multipliers and the Peaceman–Rachford method.
Notes
1. The adjoint of L is the unique operator \(L^*\colon X\to Y\) that satisfies \(\langle Ly,x\rangle=\langle y,L^*x\rangle\) \((\forall (x,y)\in X\times Y)\).
2. It is straightforward to verify that \((g^\vee)^*=(g^*)^\vee\) (see, e.g., [5, Proposition 13.23(v)]).
3. In passing, we point out that, when X is a finite-dimensional Hilbert space, the condition can be relaxed to . The convergence in this case is proved in [18, Theorem 3.3].
References
Attouch, H., Brézis, H.: Duality for the sum of convex functions in general Banach spaces. In: Aspects of Mathematics and Its Applications 34, pp. 125–133. North-Holland, Amsterdam (1986)
Attouch, H., Théra, M.: A general duality principle for the sum of two operators. J. Convex Anal. 3, 1–24 (1996)
Bauschke, H.H., Combettes, P.L.: A Dykstra-like algorithm for two monotone operators. Pacific J. Optim. 4, 383–391 (2008)
Bauschke, H.H., Boţ, R.I., Hare, W.L., Moursi, W.M.: Attouch–Théra duality revisited: paramonotonicity and operator splitting. J. Approx. Th. 164, 1065–1084 (2012)
Bauschke, H.H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Second Edition. Springer (2017)
Bauschke, H.H., Koch, V.R.: Projection methods: Swiss Army knives for solving feasibility and best approximation problems with halfspaces. In: Infinite Products of Operators and Their Applications, Contemp. Math. vol. 636, pp. 1–40 (2015)
Beck, A.: First-Order Methods in Optimization, SIAM (2017)
Boţ, R.I., Csetnek, E.R.: ADMM for monotone operators: convergence analysis and rates. Adv. Comput. Math. 45, 327–359 (2019)
Boyd, S., Parikh, N., Chu, E., Peleato, B., Eckstein, J.: Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends in Machine Learning 3, 1–122 (2011)
Boyle, J.P., Dykstra, R.L.: A method for finding projections onto the intersection of convex sets in Hilbert spaces. Lecture Notes in Statistics 37, 28–47 (1986)
Chambolle, A., Pock, T.: A first-order primal-dual algorithm for convex problems with applications to imaging, J. Math. Imaging Vis. 40, 120–145 (2011)
Combettes, P.L.: Solving monotone inclusions via compositions of nonexpansive averaged operators. Optimization 53, 475–504 (2004)
Briceño-Arias, L.M., Combettes, P.L.: A monotone + skew splitting model for composite monotone inclusions in duality. SIAM J. Optim. 21, 1230–1250 (2011)
Combettes, P.L., Pesquet, J.-C.: Primal-dual splitting algorithm for solving inclusions with mixtures of composite, Lipschitzian, and parallel-sum type monotone operators. Set-Valued Var. Anal. 20, 307–330 (2012)
Combettes, P.L., Pesquet, J.-C.: A proximal decomposition method for solving convex variational inverse problems. Inverse Problems 24, article 065014 (2008)
Combettes, P.L., Pesquet, J.-C.: Proximal splitting methods in signal processing. In: Fixed-Point Algorithms for Inverse Problems in Science and Engineering, vol. 49, pp. 185–212. Springer, New York (2011)
Combettes, P.L., Vũ, B.-C.: Variable metric forward–backward splitting with applications to monotone inclusions in duality. Optimization 63, 1289–1318 (2014)
Condat, L.: A primal-dual splitting method for convex optimization involving Lipschitzian, proximable and linear composite terms. J. Optim. Th. Appl. 158, 460–479 (2013)
Deutsch, F.: Best Approximation in Inner Product Spaces, Springer (2001)
Eckstein, J.: Splitting Methods for Monotone Operators with Applications to Parallel Optimization, Ph.D. thesis, MIT (1989)
Eckstein, J., Bertsekas, D.P.: On the Douglas–Rachford splitting method and the proximal point algorithm for maximal monotone operators. Math. Prog. 55, 293–318 (1992)
Gabay, D.: Applications of the method of multipliers to variational inequalities. In: Augmented Lagrangian Methods: Applications to the Numerical Solution of Boundary-Value Problems, vol. 15, pp. 299–331. North-Holland, Amsterdam (1983)
Gossez, J.-P.: Opérateurs monotones non linéaires dans les espaces de Banach non réflexifs. J. Math. Anal. Appl. 34, 371–395 (1971)
Lions, P.L., Mercier, B.: Splitting algorithms for the sum of two nonlinear operators. SIAM J. Numer. Anal. 16, 964–979 (1979)
O’Connor, D., Vandenberghe, L.: On the equivalence of the primal-dual hybrid gradient method and Douglas–Rachford splitting. Math. Prog. (Ser. A) (2018), https://doi.org/10.1007/s10107-018-1321-1.
Riesz, F., Sz.-Nagy, B.: Functional Analysis. Dover paperback (1990)
Rockafellar, R.T.: On the maximal monotonicity of subdifferential mappings. Pacific J. Math. 33, 209–216 (1970)
Stoer, J., Bulirsch, R.: Introduction to Numerical Analysis. Third Edition. Springer-Verlag (2002)
Tseng, P.: Applications of a splitting algorithm to decomposition in convex programming and variational inequalities. SIAM J. Control Optim. 29, 119–138 (1991)
Vũ, B.C.: A splitting algorithm for dual monotone inclusions involving cocoercive operators. Adv. Comput. Math. 38, 667–681 (2013)
Acknowledgements
WMM was supported by the Pacific Institute for the Mathematical Sciences (PIMS) Postdoctoral Fellowship and by the DIMACS/Simons Collaboration on Bridging Continuous and Discrete Optimization through NSF grant CCF-1740425.
Appendices
13.1.1 Appendix 1
Let A: X → X be linear. Define the associated quadratic form \(q_A\colon X\to\mathbb{R}\colon x\mapsto\tfrac{1}{2}\langle x,Ax\rangle\).
Recall that a linear operator A: X → X is monotone if \((\forall x\in X)\ \langle x,Ax\rangle\ge 0\), and is strictly monotone if \((\forall x\in X\smallsetminus\{0\})\ \langle x,Ax\rangle>0\). Let h: X → ℝ and let x ∈ X. We say that h is Fréchet differentiable at x if there exists a linear operator Dh(x): X → ℝ, called the Fréchet derivative of h at x, such that \(\lim_{0\neq\|y\|\to 0}\frac{h(x+y)-h(x)-Dh(x)y}{\|y\|}=0\); and h is Fréchet differentiable on X if it is Fréchet differentiable at every point in X.
The following lemma is a special case of [5, Proposition 17.36].
Lemma 13.3
Let A: X → X be linear, strictly monotone, self-adjoint and invertible. Then the following hold:
(i) \(q_A\) and \(q_{A^{-1}}\) are strictly convex, continuous, and Fréchet differentiable. Moreover, \((\nabla q_A,\nabla q_{A^{-1}})=(A,A^{-1})\).
(ii) \(q_{A}^*=q_{A^{-1}}\).
Proof
Note that, like A, \(A^{-1}\) is linear, strictly monotone, self-adjoint (since \((A^{-1})^*=(A^*)^{-1}=A^{-1}\)) and invertible. Moreover, \({\operatorname{ran}} A={\operatorname{ran}} A^{-1}=X\). (i): This follows from [5, Example 17.11 and Proposition 17.36(i)] applied to A and \(A^{-1}\), respectively. (ii): It follows from [5, Proposition 17.36(iii)], [28, Theorem 4.8.5.4] and the invertibility of A that \(q_{A}^*=q_{A^{-1}}+\iota_{{\operatorname{ran}} A}=q_{A^{-1}}+\iota_{X}=q_{A^{-1}}\). □
Proposition 13.3
Let L: Y → X be linear. Suppose that \(L^*L\) is invertible, and let \(g\colon Y\to\left]-\infty,+\infty\right]\) be convex, lower semicontinuous, and proper. Then the following hold:
(i) \(\ker L=\{0\}\).
(ii) \(L^*L\) is strictly monotone.
(iii) \({\operatorname{dom}}(q_{L^*L}+g)^*=X\).
(iv) \(\partial(q_{L^*L}+g)=\nabla q_{L^*L}+\partial g=L^*L+\partial g\).
(v) \((q_{L^*L}+g)^*\) is Fréchet differentiable on X.
(vi) \((L^*L+\partial g)^{-1}\) is single-valued and \({\operatorname{dom}}(L^*L+\partial g)^{-1}=X\).
(vii) \({\operatorname{Prox}}_{g^*\circ L^*}={\operatorname{Id}}-L(L^*L+\partial g)^{-1}L^*\).
(viii) \({\operatorname{Prox}}_{(g^*\circ L^*)^{*}}=L(L^*L+\partial g)^{-1}L^*\).
Proof
(i): Using [5, Fact 2.25(vi)] and the assumption that \(L^*L\) is invertible we have \(\ker L=\ker L^*L=\{0\}\). (ii): Using (i) we have \((\forall x\in X\smallsetminus\{0\})\) \(\langle x,L^*Lx\rangle=\|Lx\|^2>0\), hence \(L^*L\) is strictly monotone. (iii): By (ii) and Lemma 13.3 (i) applied with A replaced by \(L^*L\) we have \({\operatorname{dom}} q_{L^*L}={\operatorname{dom}} q^*_{L^*L}=X\), hence
\({\operatorname{dom}} q_{L^*L}-{\operatorname{dom}} g=X.\) (13.46)
It follows from (13.46), [1, Corollary 2.1] and Lemma 13.3 (ii)&(i) that \({\operatorname{dom}}(q_{L^*L}+g)^*={\operatorname{dom}} q_{L^*L}^*+{\operatorname{dom}} g^{*}={\operatorname{dom}} q_{(L^*L)^{-1}}+{\operatorname{dom}} g^*=X+{\operatorname{dom}} g^*=X\). (iv): Combine (13.46), [1, Corollary 2.1] and Lemma 13.3 (i). (v): Since \(q_{L^*L}\) is strictly convex, so is \(q_{L^*L}+g\), which in view of [5, Proposition 18.9] and (iii) implies that \((q_{L^*L}+g)^*\) is Fréchet differentiable on \(X={\operatorname{int}}{\operatorname{dom}}(q_{L^*L}+g)^*\). (vi): Using (iv), Fact 13.8 (i) applied with f replaced by \(q_{L^*L}+g\), (v) and [5, Proposition 17.31(i)] we have \((L^*L+\partial g)^{-1}=(\partial(q_{L^*L}+g))^{-1}=\partial(q_{L^*L}+g)^*=\{\nabla(q_{L^*L}+g)^*\}\); hence \((L^*L+\partial g)^{-1}\) is single-valued with \({\operatorname{dom}}(L^*L+\partial g)^{-1}=X\).
(vii): Let \(x\in X={\operatorname{dom}}(L^*L+\partial g)^{-1}\) and set \(y=x-L(L^*L+\partial g)^{-1}L^*x\). Then, using (vi), we have
\(y=x-Lu,\ \text{where } u=(L^*L+\partial g)^{-1}(L^*x),\ \text{i.e., } L^*x\in L^*Lu+\partial g(u).\) (13.47)
Consequently, \(L^*y+L^*Lu=L^*x\in L^*Lu+\partial g(u)\), hence \(L^*y\in\partial g(u)\); equivalently, in view of Fact 13.8 (i) applied with f replaced by g, \(u\in(\partial g)^{-1}(L^*y)=\partial g^*(L^*y)\). Combining with (13.47) we learn that
\(x=y+Lu\in y+L\partial g^*(L^*y)=({\operatorname{Id}}+L\circ(\partial g^*)\circ L^*)y.\) (13.48)
Note that [5, Fact 2.25(vi) and Fact 2.26] implies that \({\operatorname{ran}} L^*={\operatorname{ran}} L^*L=X\), hence \(0\in{\operatorname{sri}}({\operatorname{dom}} g^*-{\operatorname{ran}} L^*)\). Therefore one can apply [5, Corollary 16.53(i)] to re-write (13.48) as \(x\in({\operatorname{Id}}+\partial(g^*\circ L^*))y\). Therefore, \(y={\operatorname{Prox}}_{g^*\circ L^*}x\) by [5, Proposition 16.44]. (viii): Apply Fact 13.8 (ii) with f replaced by \(g^*\circ L^*\). □
13.1.2 Appendix 2
Lemma 13.4
Let \(g\colon Y \to \left ]-\infty ,+\infty \right ]\) be convex, lower semicontinuous, and proper. Consider the following statements:
(i) g is strongly convex.
(ii) \(g^*\) is Fréchet differentiable and \(\nabla g^*\) is Lipschitz continuous.
(iii) \(g^*\circ L^*\) is Fréchet differentiable and \(\nabla(g^*\circ L^*)=L\circ(\nabla g^*)\circ L^*\) is Lipschitz continuous.
(iv) \((g^*\circ L^*)^*\) is strongly convex.
Then (i)⇔(ii)⇒(iii)⇔(iv).
Proof
(i)⇔(ii): See [5, Theorem 18.15]. (ii)⇒(iii): Clearly \(g^*\circ L^*\) is Fréchet differentiable. Now let (x, y) ∈ X × X and suppose that β > 0 is a Lipschitz constant of \(\nabla g^*\). It follows from [5, Corollary 16.53] that \(\|\nabla(g^*\circ L^*)x-\nabla(g^*\circ L^*)y\|=\|L\nabla g^*(L^*x)-L\nabla g^*(L^*y)\|\le\beta\|L\|\,\|L^*x-L^*y\|\le\beta\|L\|^2\|x-y\|\). (iii)⇔(iv): Use the equivalence of (i) and (ii) applied with g replaced by \((g^*\circ L^*)^*\). □
13.1.3 Appendix 3
We start by recalling the following well-known fact.
Fact 13.8
Let \(f\colon X\to \left ]-\infty ,+\infty \right ]\) be convex, lower semicontinuous and proper and let γ > 0. Then the following hold:
(i) \((\partial f)^{-1}=\partial f^*\).
(ii) \({\operatorname{Prox}}_{\gamma f}+{\operatorname{Prox}}_{(\gamma f)^*}={\operatorname{Id}}\).
Proof
(i): See, e.g., [27, Remark on page 216] or [23, Théorème 3.1].
(ii): See, e.g., [5, Theorem 14.3(iii)]. □
Lemma 13.5
Let γ > 0. The Douglas–Rachford method given in (13.9) applied to the ordered pair (γf, γg) with a starting point \(x_0\in X\) to solve (13.8) can be rewritten as:
\(x_{n+1}={\operatorname{Prox}}_{\gamma f}x_n-{\operatorname{Prox}}_{(\gamma g)^*}(2{\operatorname{Prox}}_{\gamma f}x_n-x_n).\)
Proof
Using (13.9a), (13.10), and Fact 13.8 (ii) applied with f replaced by g we have
\(x_{n+1}=x_n+{\operatorname{Prox}}_{\gamma g}(2{\operatorname{Prox}}_{\gamma f}x_n-x_n)-{\operatorname{Prox}}_{\gamma f}x_n=x_n+(2{\operatorname{Prox}}_{\gamma f}x_n-x_n)-{\operatorname{Prox}}_{(\gamma g)^*}(2{\operatorname{Prox}}_{\gamma f}x_n-x_n)-{\operatorname{Prox}}_{\gamma f}x_n={\operatorname{Prox}}_{\gamma f}x_n-{\operatorname{Prox}}_{(\gamma g)^*}(2{\operatorname{Prox}}_{\gamma f}x_n-x_n),\)
and the conclusion follows. □
13.1.4 Appendix 4
Proposition 13.4
Let (x, y, z) ∈ X × Y × Z and let B and \(\widetilde {f}\) be defined as in (13.39) and (13.41). Then the following hold:
(i) \(B^*y=(A^*y,C^*y)\).
(ii) \({\operatorname{dom}}\widetilde{f}={\operatorname{dom}} f\times\{0\}\).
(iii) \((\forall(x,z)\in{\operatorname{dom}}\widetilde{f})\) we have z = 0 and \(B(x,z)=Ax\).
(iv) \(B({\operatorname{dom}}\widetilde{f})=A({\operatorname{dom}} f)\).
(v) \(0\in{\operatorname{sri}}({\operatorname{dom}} g-B({\operatorname{dom}}\widetilde{f}))\).
(vi) \({\operatorname{argmin}}(\widetilde{f}+g\circ B)={\operatorname{argmin}}(f+g\circ A)\times\{0\}\neq\varnothing\).
(vii) \({\operatorname{Prox}}_{\widetilde{f}}(x,z)=({\operatorname{Prox}}_f x,0)\).
(viii) \({\operatorname{Prox}}_{(\tau g\circ B)^*}=\tau B^*{\operatorname{Prox}}_{\sigma g^*}(\sigma B)\).
Proof
(i): This clearly follows from (13.39). (ii): It follows from (13.41) that \({\operatorname {dom}}\widetilde {f}={\operatorname {dom}} f\times {\operatorname {dom}}\iota _{\{0\}}={\operatorname {dom}} f\times \{0\}\). (iii): The claim that z = 0 follows from (ii). Now combine with (13.39). (iv): Combine (ii) and (iii). (v): Combine (iv) and (13.34). (vi): We have
where (13.51a) follows from (v) and (13.4) applied with (f, g, L) replaced by \((g,\widetilde {f},B)\), and (13.51b) follows from (13.39) and (13.41). Therefore, \((x,z)\in {\operatorname {argmin}}(\widetilde {f}+g\circ B)\) ⇔ [z = 0 and \(x\in {\operatorname {zer}}(\partial f+A^* \circ \partial g\circ A)\)] ⇔ \((x,z)\in {\operatorname {argmin}}(f+g\circ A) \times \{0\}\). Now combine with (13.4). (vii): Combine (13.41) and [5, Proposition 23.18]. (viii): Indeed, Proposition 13.3 (viii) implies
□
Cite this chapter
Moursi, W.M., Zinchenko, Y. (2019). A Note on the Equivalence of Operator Splitting Methods. In: Bauschke, H., Burachik, R., Luke, D. (eds) Splitting Algorithms, Modern Operator Theory, and Applications. Springer, Cham. https://doi.org/10.1007/978-3-030-25939-6_13