Abstract
The alternating direction method of multipliers (ADMM) is widely used for solving structured convex optimization problems because of its excellent practical performance. On the theoretical side, however, a counterexample in Chen et al. (Math Program 155(1):57–79, 2016) shows that the multi-block ADMM for minimizing the sum of N \((N\ge 3)\) convex functions with N block variables linked by linear constraints may diverge. It is therefore of great interest to find further sufficient conditions on the problem data that guarantee convergence of the multi-block ADMM. Existing results typically require strong convexity of part of the objective. In this paper, we provide two approaches based on the multi-block ADMM that find an \(\epsilon \)-optimal solution without requiring strong convexity of the objective function. Specifically, we prove the following two results: (1) the multi-block ADMM returns an \(\epsilon \)-optimal solution within \(O(1/\epsilon ^2)\) iterations when applied to a suitable perturbation of the original problem; this case can be seen as using the multi-block ADMM to solve a modified problem; (2) the multi-block ADMM returns an \(\epsilon \)-optimal solution within \(O(1/\epsilon )\) iterations when it is applied to a certain sharing problem, under the condition that the augmented Lagrangian function satisfies the Kurdyka–Łojasiewicz property, which covers most convex optimization models except for some pathological cases; this case can be seen as applying the multi-block ADMM to a special class of problems.
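To make the scheme under discussion concrete, the following is a minimal sketch of the direct three-block extension of ADMM on a toy strongly convex instance. All data here are hypothetical illustrations, not taken from the paper; on this particular instance the direct extension happens to converge, which does not contradict the divergence counterexample of Chen et al. for general convex data.

```python
# Direct 3-block ADMM on a toy instance (illustration only; data are hypothetical):
#   minimize 0.5*(x1-1)^2 + 0.5*(x2-2)^2 + 0.5*(x3-3)^2  s.t.  x1 + x2 + x3 = 3
# KKT conditions give x_i = a_i + lam and x1 + x2 + x3 = b,
# so lam* = -1 and x* = (0, 1, 2).
a1, a2, a3, b = 1.0, 2.0, 3.0, 3.0
gamma = 1.0                  # penalty parameter of the augmented Lagrangian
x1 = x2 = x3 = lam = 0.0
for _ in range(500):
    # Gauss-Seidel sweep: each block minimizes the augmented Lagrangian
    # with the other blocks fixed at their most recent values.
    x1 = (a1 + lam - gamma * (x2 + x3 - b)) / (1.0 + gamma)
    x2 = (a2 + lam - gamma * (x1 + x3 - b)) / (1.0 + gamma)
    x3 = (a3 + lam - gamma * (x1 + x2 - b)) / (1.0 + gamma)
    lam -= gamma * (x1 + x2 + x3 - b)    # dual update on the residual
residual = x1 + x2 + x3 - b              # primal feasibility at the last iterate
```

The closed-form block updates follow from setting the gradient of the augmented Lagrangian in each scalar block to zero; for non-quadratic \(f_i\) each update would itself be a small convex subproblem.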
References
Attouch, H., Bolte, J., Redont, P., Soubeyran, A.: Proximal alternating minimization and projection methods for nonconvex problems: an approach based on the Kurdyka–Lojasiewicz inequality. Math. Oper. Res. 35(2), 438–457 (2010)
Boley, D.: Local linear convergence of the alternating direction method of multipliers on quadratic or linear programs. SIAM J. Optim. 23(4), 2183–2207 (2013)
Bolte, J., Daniilidis, A., Ley, O., Mazet, L.: Characterizations of Lojasiewicz inequalities: subgradient flows, talweg, convexity. Trans. Am. Math. Soc. 362(6), 3319–3363 (2010)
Bolte, J., Sabach, S., Teboulle, M.: Proximal alternating linearization minimization for nonconvex and nonsmooth problems. Math. Program. 146(1–2), 459–494 (2014)
Boyd, S., Parikh, N., Chu, E., Peleato, B., Eckstein, J.: Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn. 3(1), 1–122 (2011)
Cai, X., Han, D., Yuan, X.: The direct extension of ADMM for three-block separable convex minimization models is convergent when one function is strongly convex. Preprint http://www.optimization-online.org/DB_FILE/2014/11/4644.pdf (2014)
Chen, C., He, B., Ye, Y., Yuan, X.: The direct extension of ADMM for multi-block convex minimization problems is not necessarily convergent. Math. Program. 155(1), 57–79 (2016)
Chen, C., Shen, Y., You, Y.: On the convergence analysis of the alternating direction method of multipliers with three blocks. Abstr. Appl. Anal. 2013, Article ID 183961 (2013)
Davis, D., Yin, W.: A three-operator splitting scheme and its optimization applications. Technical report, UCLA CAM Report 15-13 (2015)
Deng, W., Lai, M., Peng, Z., Yin, W.: Parallel multi-block ADMM with \(o(1/k)\) convergence. Technical report, UCLA CAM 13-64 (2013)
Deng, W., Yin, W.: On the global and linear convergence of the generalized alternating direction method of multipliers. J. Sci. Comput. 66(3), 889–916 (2016)
Douglas, J., Rachford, H.H.: On the numerical solution of the heat conduction problem in 2 and 3 space variables. Trans. Am. Math. Soc. 82, 421–439 (1956)
Eckstein, J.: Splitting methods for monotone operators with applications to parallel optimization. Ph.D. thesis, Massachusetts Institute of Technology (1989)
Eckstein, J., Bertsekas, D.P.: On the Douglas–Rachford splitting method and the proximal point algorithm for maximal monotone operators. Math. Program. 55, 293–318 (1992)
Eckstein, J., Yao, W.: Understanding the convergence of the alternating direction method of multipliers: theoretical and computational perspectives. Pac. J. Optim. 11(4), 619–644 (2015)
Fortin, M., Glowinski, R.: Augmented Lagrangian Methods: Applications to the Numerical Solution of Boundary-Value Problems. North-Holland Pub. Co., Amsterdam (1983)
Gabay, D.: Applications of the method of multipliers to variational inequalities. In: Fortin, M., Glowinski, R. (eds.) Augmented Lagrangian Methods: Applications to the Solution of Boundary Value Problems. North-Holland, Amsterdam (1983)
Glowinski, R., Le Tallec, P.: Augmented Lagrangian and Operator-Splitting Methods in Nonlinear Mechanics. SIAM, Philadelphia (1989)
Han, D., Yuan, X.: A note on the alternating direction method of multipliers. J. Optim. Theory Appl. 155(1), 227–238 (2012)
He, B., Hou, L., Yuan, X.: On full Jacobian decomposition of the augmented Lagrangian method for separable convex programming. SIAM J. Optim. 25(4), 2274–2312 (2015)
He, B., Tao, M., Yuan, X.: Alternating direction method with Gaussian back substitution for separable convex programming. SIAM J. Optim. 22, 313–340 (2012)
He, B., Tao, M., Yuan, X.: Convergence rate and iteration complexity on the alternating direction method of multipliers with a substitution procedure for separable convex programming. Preprint http://www.optimization-online.org/DB_FILE/2012/09/3611.pdf (2012)
He, B., Yuan, X.: On the \({O}(1/n)\) convergence rate of Douglas–Rachford alternating direction method. SIAM J. Numer. Anal. 50, 700–709 (2012)
He, B., Yuan, X.: On nonergodic convergence rate of Douglas–Rachford alternating direction method of multipliers. Numer. Math. 130(3), 567–577 (2015)
Hong, M., Chang, T.-H., Wang, X., Razaviyayn, M., Ma, S., Luo, Z.-Q.: A block successive upper bound minimization method of multipliers for linearly constrained convex optimization. Preprint arXiv:1401.7079 (2014)
Hong, M., Luo, Z.: On the linear convergence of the alternating direction method of multipliers. Preprint arXiv:1208.3922 (2012)
Hong, M., Luo, Z.-Q., Razaviyayn, M.: Convergence analysis of alternating direction method of multipliers for a family of nonconvex problems. SIAM J. Optim. 26(1), 337–364 (2016)
Li, M., Sun, D., Toh, K.-C.: A convergent 3-block semi-proximal ADMM for convex minimization with one strongly convex block. Asia Pac. J. Oper. Res. 32, 1550024 (2015)
Lin, T., Ma, S., Zhang, S.: Global convergence of unmodified 3-block ADMM for a class of convex minimization problems. Preprint arXiv:1505.04252 (2015)
Lin, T., Ma, S., Zhang, S.: On the global linear convergence of the ADMM with multi-block variables. SIAM J. Optim. 25(3), 1478–1497 (2015)
Lin, T., Ma, S., Zhang, S.: On the sublinear convergence rate of multi-block ADMM. J. Oper. Res. Soc. China 3(3), 251–274 (2015)
Lions, P.L., Mercier, B.: Splitting algorithms for the sum of two nonlinear operators. SIAM J. Numer. Anal. 16, 964–979 (1979)
Monteiro, R.D.C., Svaiter, B.F.: Iteration-complexity of block-decomposition algorithms and the alternating direction method of multipliers. SIAM J. Optim. 23, 475–507 (2013)
Peaceman, D.H., Rachford, H.H.: The numerical solution of parabolic elliptic differential equations. SIAM J. Appl. Math. 3, 28–41 (1955)
Peng, Y., Ganesh, A., Wright, J., Xu, W., Ma, Y.: RASL: robust alignment by sparse and low-rank decomposition for linearly correlated images. IEEE Trans. Pattern Anal. Mach. Intell. 34(11), 2233–2246 (2012)
Sun, D., Toh, K.-C., Yang, L.: A convergent 3-block semi-proximal alternating direction method of multipliers for conic programming with 4-type of constraints. SIAM J. Optim. 25(2), 882–915 (2015)
Tao, M., Yuan, X.: Recovering low-rank and sparse components of matrices from incomplete and noisy observations. SIAM J. Optim. 21, 57–81 (2011)
Wang, X., Hong, M., Ma, S., Luo, Z.-Q.: Solving multiple-block separable convex minimization problems using two-block alternating direction method of multipliers. Pac. J. Optim. 11(4), 645–667 (2015)
Acknowledgments
The authors are grateful to the associate editor and two anonymous referees for their insightful comments, which have greatly improved the presentation of this paper.
Additional information
Shiqian Ma: Research of this author was supported in part by the Hong Kong Research Grants Council General Research Fund Early Career Scheme (Project ID: CUHK 439513). Shuzhong Zhang: Research of this author was supported in part by the National Science Foundation under Grant Number CMMI-1462408.
Appendix: Proof of Theorem 4.3
We first prove the following lemma.
Lemma 4.8
The following results hold under the conditions in Scenario 2.
1.
The gap between two successive dual iterates can be bounded by the gap between successive primal iterates, i.e.,
$$\begin{aligned} \Vert \lambda ^{k+1} - \lambda ^{k} \Vert ^2 \le L^2 \Vert x_N^{k+1} - x_N^k \Vert ^2, \end{aligned}$$(4.17)
where \(L\) is the Lipschitz constant of \(\nabla f_N\).
2.
The augmented Lagrangian \(L_\gamma \) has a sufficient decrease in each iteration, i.e.,
$$\begin{aligned}&{\mathcal {L}}_\gamma \left( x_1^{k},\ldots , x_{N}^{k};\lambda ^k\right) - {\mathcal {L}}_\gamma \left( x_1^{k+1},\ldots , x_{N}^{k+1};\lambda ^{k+1}\right) \nonumber \\&\quad \ge \frac{\gamma ^2-2L^2}{2\gamma (1+L^2)}\left( \sum \limits _{i=1}^{N-1} \left\| A_i x_i^k - A_i x_i^{k+1} \right\| ^2 + \left\| x_N^k - x_N^{k+1} \right\| ^2 + \left\| \lambda ^k - \lambda ^{k+1} \right\| ^2\right) .\nonumber \\ \end{aligned}$$(4.18)
3.
The augmented Lagrangian \({\mathcal {L}}_\gamma (w^k)\) is uniformly lower bounded, and it holds true that
$$\begin{aligned}&\sum \limits _{k=0}^\infty \left( \sum \limits _{i=1}^{N-1} \left\| A_i x_i^{k+1} - A_i x_i^k \right\| ^2 + \left\| x_N^{k+1} - x_N^k \right\| ^2 + \left\| \lambda ^{k+1}-\lambda ^k \right\| ^2\right) \nonumber \\&\quad \le \frac{2\gamma (1+L^2)}{\gamma ^2-2L^2}\left( {\mathcal {L}}_\gamma (w^0) - L^*\right) \end{aligned}$$(4.19)
where \(L^*\) is the uniform lower bound of \({\mathcal {L}}_\gamma (w^k)\), and hence
$$\begin{aligned} \lim \limits _{k\rightarrow \infty } \left( \sum _{i=1}^{N-1} \left\| A_i x_i^k - A_i x_i^{k+1} \right\| ^2 + \left\| x_N^k - x_N^{k+1} \right\| ^2 + \left\| \lambda ^k - \lambda ^{k+1} \right\| ^2 \right) = 0. \end{aligned}$$(4.20)
Moreover, \(\left\{ \left( x_1^k, x_2^k, \ldots ,x_N^k, \lambda ^k\right) : k=0,1,\ldots \right\} \) is a bounded sequence.
4.
There exists an upper bound for a subgradient of the augmented Lagrangian \({\mathcal {L}}_\gamma \) at each iteration. In fact, we define
$$\begin{aligned} R_i^{k+1}:= & {} \gamma A_i^\top \left( \sum \limits _{i=1}^{N-1} A_i x_i^{k+1} + x_N^{k+1} - b\right) \\&- \gamma A_{i}^{\top }\left( \sum \limits _{j=i+1}^{N-1} A_{j}\left( x_{j}^{k}-x_{j}^{k+1}\right) +\left( x_{N}^{k}-x_{N}^{k+1}\right) \right) \end{aligned}$$and
$$\begin{aligned} R_N^{k+1} := \gamma \left( \sum \limits _{i=1}^{N-1} A_i x_i^{k+1} + x_N^{k+1} - b\right) , \quad R_{\lambda }^{k+1} := b - \sum \limits _{i=1}^{N-1} A_i x_i^{k+1} - x_N^{k+1} \end{aligned}$$for each positive integer k, and \(i = 1,2,\ldots , N\). Then \(\left( R_1^{k+1}, \ldots , R_N^{k+1}, R_\lambda ^{k+1}\right) \in \partial {\mathcal {L}}_\gamma (w^{k+1})\). Moreover, it holds that
$$\begin{aligned}&\left\| \left( R_1^{k+1}, \ldots , R_N^{k+1}, R_\lambda ^{k+1}\right) \right\| \le \sum \limits _{i=1}^N \left\| R_i^{k+1} \right\| + \left\| R_\lambda ^{k+1} \right\| \nonumber \\&\quad \le M\left( \sum \limits _{i=1}^{N-1} \left\| A_i x_i^k - A_i x_i^{k+1}\right\| + \left\| x_N^k - x_N^{k+1} \right\| + \left\| \lambda ^k - \lambda ^{k+1} \right\| \right) , \quad \forall k\ge 0,\nonumber \\ \end{aligned}$$(4.21)
where \(M\) is a constant defined in (4.8).
Proof of Lemma 4.8
1.
2.
From (4.1), by invoking the convexity of \(f_i\), we have for \(i=1,\ldots ,N-1\):
$$\begin{aligned} 0&= \left( x_i^k - x_i^{k+1}\right) ^\top \left[ g_i\left( x_{i}^{k+1}\right) -A_{i}^{\top }\lambda ^{k}+\gamma A_{i}^{\top }\left( \sum _{j=1}^iA_{j}x_{j}^{k+1}+\sum _{j=i+1}^{N-1} A_{j}x_{j}^{k} + x_{N}^k -b\right) \right] \nonumber \\&\le f_i\left( x_i^k\right) - f_i\left( x_i^{k+1}\right) - \left( A_i x_i^k - A_i x_i^{k+1}\right) ^\top \lambda ^k \nonumber \\&\qquad +\, \gamma \left( A_i x_i^k - A_i x_i^{k+1}\right) ^\top \left( \sum _{j=1}^iA_{j}x_{j}^{k+1}+\sum _{j=i+1}^{N-1} A_{j}x_{j}^{k} + x_{N}^k -b \right) \nonumber \\&= {\mathcal {L}}_\gamma \left( x_1^{k+1},\ldots , x_{i-1}^{k+1},x_i^k,\ldots ,\lambda ^k\right) - {\mathcal {L}}_\gamma \left( x_1^{k+1},\ldots ,x_i^{k+1},x_{i+1}^k,\ldots ,\lambda ^k\right) \nonumber \\&\qquad -\, \frac{\gamma }{2}\left\| A_i x_i^k - A_i x_i^{k+1}\right\| ^2. \end{aligned}$$(4.22)
Similarly, from (4.2) we can prove that
$$\begin{aligned} {\mathcal {L}}_\gamma \left( x_1^{k+1},\ldots , x_{N-1}^{k+1},x_N^k;\lambda ^k\right) - {\mathcal {L}}_\gamma \left( x_1^{k+1},\ldots ,x_N^{k+1};\lambda ^k\right) \ge \frac{\gamma }{2}\left\| x_N^k - x_N^{k+1}\right\| ^2.\nonumber \\ \end{aligned}$$(4.23)
Summing (4.22) over \(i=1,\ldots ,N-1\) and adding (4.23), we have
$$\begin{aligned}&{\mathcal {L}}_\gamma \left( x_1^k,\ldots ,x_N^k,\lambda ^k\right) - {\mathcal {L}}_\gamma \left( x_1^{k+1},\ldots , x_N^{k+1},\lambda ^k\right) \nonumber \\&\quad \ge \frac{\gamma }{2}\sum \limits _{i=1}^{N-1}\left\| A_i x_i^k - A_i x_i^{k+1} \right\| ^2 + \frac{\gamma }{2}\left\| x_N^k - x_N^{k+1} \right\| ^2. \end{aligned}$$(4.24)
On the other hand, it follows from (4.2) and (4.17) that
$$\begin{aligned}&{\mathcal {L}}_\gamma \left( x_1^{k+1},\ldots , x_N^{k+1},\lambda ^k\right) - {\mathcal {L}}_\gamma \left( x_1^{k+1},\ldots , x_N^{k+1},\lambda ^{k+1}\right) \nonumber \\&\quad = -\frac{1}{\gamma }\left\| \lambda ^k - \lambda ^{k+1} \right\| ^2 \ge - \frac{L^2}{\gamma }\left\| x_N^k - x_N^{k+1} \right\| ^2. \end{aligned}$$(4.25)
Combining (4.24) and (4.25) yields
$$\begin{aligned}&{\mathcal {L}}_\gamma \left( x_1^k,\ldots , x_N^k,\lambda ^k\right) - {\mathcal {L}}_\gamma \left( x_1^{k+1},\ldots , x_N^{k+1},\lambda ^{k+1}\right) \nonumber \\&\quad \ge \frac{\gamma }{2}\sum \limits _{i=1}^{N-1}\left\| A_i x_i^k - A_i x_i^{k+1} \right\| ^2 + \frac{\gamma ^2 - 2L^2}{2\gamma } \left\| x_N^k - x_N^{k+1} \right\| ^2 \nonumber \\&\quad \ge \frac{\gamma }{2}\sum \limits _{i=1}^{N-1}\left\| A_i x_i^k - A_i x_i^{k+1} \right\| ^2 + \frac{\gamma ^2 - 2L^2}{2\gamma (1+L^2)}\left( \left\| x_N^k - x_N^{k+1} \right\| ^2 + \left\| \lambda ^k - \lambda ^{k+1} \right\| ^2\right) \nonumber \\&\quad \ge \frac{\gamma ^2 - 2L^2}{2\gamma (1+L^2)}\left( \sum \limits _{i=1}^{N-1}\left\| A_i x_i^k - A_i x_i^{k+1} \right\| ^2 + \left\| x_N^k - x_N^{k+1} \right\| ^2 + \left\| \lambda ^k - \lambda ^{k+1} \right\| ^2\right) .\nonumber \\ \end{aligned}$$(4.26)
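The middle inequality in the chain above is a consequence of part 1 of the lemma; assuming the bound \(\Vert \lambda ^k - \lambda ^{k+1}\Vert ^2 \le L^2 \Vert x_N^k - x_N^{k+1}\Vert ^2\), the step can be spelled out as follows.

```latex
% Since \|\lambda^k-\lambda^{k+1}\|^2 \le L^2\|x_N^k-x_N^{k+1}\|^2, we have
\begin{aligned}
\|x_N^k - x_N^{k+1}\|^2 + \|\lambda^k - \lambda^{k+1}\|^2
  &\le (1+L^2)\,\|x_N^k - x_N^{k+1}\|^2, \\
\text{and hence}\quad
\frac{\gamma^2 - 2L^2}{2\gamma}\,\|x_N^k - x_N^{k+1}\|^2
  &\ge \frac{\gamma^2 - 2L^2}{2\gamma(1+L^2)}
       \left(\|x_N^k - x_N^{k+1}\|^2 + \|\lambda^k - \lambda^{k+1}\|^2\right),
\end{aligned}
```

where the last step is valid because \(\gamma > \sqrt{2}L\) makes the common factor \(\gamma ^2 - 2L^2\) positive.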
3.
It follows from (4.2) and the fact that \(\nabla f_N\) is Lipschitz continuous with constant L that,
$$\begin{aligned}&f_N\left( b - \sum \limits _{i=1}^{N-1} A_i x_i^{k+1}\right) \\&\quad \le f_N\left( x_N^{k+1}\right) + \left\langle \nabla f_N\left( x_N^{k+1}\right) , \left( b - \sum \limits _{i=1}^{N-1} A_i x_i^{k+1} - x_N^{k+1}\right) \right\rangle \\&\qquad +\, \frac{L}{2}\left\| b - \sum \limits _{i=1}^{N-1} A_i x_i^{k+1} - x_N^{k+1} \right\| ^2 \\&\quad = f_N\left( x_N^{k+1}\right) - \left\langle \lambda ^{k+1}, \sum \limits _{i=1}^{N-1} A_i x_i^{k+1} + x_N^{k+1} - b\right\rangle + \frac{L}{2} \left\| \sum \limits _{i=1}^{N-1} A_i x_i^{k+1} + x_N^{k+1} - b\right\| ^2. \end{aligned}$$This implies that there exists \(L^*>-\infty \), such that
$$\begin{aligned}&{\mathcal {L}}_\gamma \left( x_1^{k+1},\ldots , x_N^{k+1},\lambda ^{k+1}\right) \nonumber \\&\quad \ge \sum _{i=1}^{N-1} f_i(x_i^{k+1}) + f_N\left( b - \sum \limits _{i=1}^{N-1} A_i x_i^{k+1}\right) + \frac{\gamma -L}{2} \left\| \sum _{i=1}^{N-1} A_i x_i^{k+1} + x_N^{k+1} -b\right\| ^2 \nonumber \\&\quad > L^*, \end{aligned}$$(4.27)
where the last inequality holds since \(\gamma >L\) and \(\inf _{{\mathcal {X}}_i}f_i \ge f_i^* > -\infty \) for \(i=1,2,\ldots ,N\).
Therefore, it directly follows from (4.18) and \(\gamma >\sqrt{2}L\) that,
$$\begin{aligned}&\frac{\gamma ^2-2L^2}{2\gamma (1+L^2)}\sum \limits _{k=0}^K \left( \sum \limits _{i=1}^{N-1} \Vert A_i x_i^{k+1} - A_i x_i^k \Vert ^2 + \Vert x_N^{k+1} - x_N^k \Vert ^2 + \Vert \lambda ^{k+1}-\lambda ^k\Vert ^2\right) \\&\quad \le {\mathcal {L}}_\gamma (w^0) - L^*. \end{aligned}$$Letting \(K\rightarrow \infty \) gives (4.19) and (4.20).
It also follows from (4.27), (4.18) and \(\gamma >\sqrt{2}L\) that \({\mathcal {L}}_\gamma (w^0) - f_N^* \ge \sum _{i=1}^{N-1} f_i(x_i^{k+1})\). This implies that \(\left\{ \left( x_1^k, x_2^k, \ldots ,x_{N-1}^k\right) : k=0,1,\ldots \right\} \) is a bounded sequence by using the coerciveness of \(f_i+\mathbf 1 _{{\mathcal {X}}_i}, i=1,2,\ldots ,N-1\). The boundedness of \(\left( x_N^k, \lambda ^k\right) \) can be obtained by using (4.3), (4.17) and (4.20).
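The sufficient-decrease estimate (4.18) and the resulting summability can also be checked numerically. The sketch below uses a hypothetical toy instance, \(f_i(x)=\tfrac{1}{2}(x-a_i)^2\) with \(A_i = I\), so that \(\nabla f_N\) has Lipschitz constant \(L=1\), and takes \(\gamma = 2 > \sqrt{2}L\); it asserts the per-iteration decrease of the augmented Lagrangian against the constant \((\gamma ^2-2L^2)/(2\gamma (1+L^2))\).

```python
# Numeric sanity check of the sufficient-decrease estimate (4.18) on a toy
# instance (hypothetical data): f_i(x) = 0.5*(x - a_i)^2, constraint
# x1 + x2 + x3 = b, A_i = I, L = 1, and gamma = 2 > sqrt(2)*L.
a, b, gamma, L = (1.0, 2.0, 3.0), 3.0, 2.0, 1.0
c = (gamma**2 - 2 * L**2) / (2 * gamma * (1 + L**2))  # decrease constant in (4.18)

def aug_lag(x, lam):
    """Augmented Lagrangian L_gamma(x; lam) for the toy instance."""
    r = sum(x) - b
    return sum(0.5 * (xi - ai) ** 2 for xi, ai in zip(x, a)) - lam * r + 0.5 * gamma * r * r

x, lam = [0.0, 0.0, 0.0], 0.0
prev_val = aug_lag(x, lam)
for _ in range(100):
    old_x, old_lam = list(x), lam
    for i in range(3):                     # Gauss-Seidel block updates
        others = sum(x) - x[i]
        x[i] = (a[i] + lam - gamma * (others - b)) / (1.0 + gamma)
    lam -= gamma * (sum(x) - b)            # dual update
    val = aug_lag(x, lam)
    diff = sum((xi - oi) ** 2 for xi, oi in zip(x, old_x)) + (lam - old_lam) ** 2
    # (4.18): the decrease dominates c times the squared successive differences.
    assert prev_val - val >= c * diff - 1e-9
    prev_val = val
```

Since \({\mathcal {L}}_\gamma \) is bounded below on this instance, telescoping the asserted inequality reproduces the summability bound (4.19) numerically.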
4.
From the definition of \({\mathcal {L}}_\gamma \), it is clear that for \(i=1,\ldots ,N-1\),
$$\begin{aligned} g_i\left( x_i^{k+1}\right) - A_i^\top \lambda ^{k+1} + \gamma A_i^\top \left( \sum \limits _{i=1}^{N-1} A_i x_i^{k+1} + x_N^{k+1} - b\right) \in \partial _{x_i} {\mathcal {L}}_\gamma (w^{k+1}), \end{aligned}$$and
$$\begin{aligned} \nabla f_N\left( x_N^{k+1}\right) - \lambda ^{k+1} + \gamma \left( \sum \limits _{i=1}^{N-1} A_i x_i^{k+1} + x_N^{k+1} - b\right) = \nabla _{x_N} {\mathcal {L}}_\gamma (w^{k+1}), \end{aligned}$$and
$$\begin{aligned} b - \sum \limits _{i=1}^{N-1} A_i x_i^{k+1} - x_N^{k+1} = \nabla _{\lambda } {\mathcal {L}}_\gamma (w^{k+1}), \end{aligned}$$where \(g_i\in \partial \left( f_i + \mathbf {1}_{{\mathcal {X}}_i}\right) \) for \(i=1,2,\ldots ,N-1\).
Combining these relations with (4.4) and (4.5) yields that
$$\begin{aligned}&R_i^{k+1} := \gamma A_i^\top \left( \sum \limits _{i=1}^{N-1} A_i x_i^{k+1} + x_N^{k+1} - b\right) \\&\quad - \gamma A_{i}^{\top }\left( \sum \limits _{j=i+1}^{N-1} A_{j}(x_{j}^{k}-x_{j}^{k+1})+(x_{N}^{k}-x_{N}^{k+1})\right) \in \partial _{x_i} {\mathcal {L}}_\gamma (w^{k+1}), \\&R_N^{k+1} := \gamma \left( \sum \limits _{i=1}^{N-1} A_i x_i^{k+1} + x_N^{k+1} - b\right) = \nabla _{x_N} {\mathcal {L}}_\gamma (w^{k+1}), \\&R_{\lambda }^{k+1} := b - \sum \limits _{i=1}^{N-1} A_i x_i^{k+1} - x_N^{k+1} = \nabla _{\lambda } {\mathcal {L}}_\gamma (w^{k+1}), \end{aligned}$$for \(i=1,2,\ldots ,N-1\). Therefore, \(\left( R_1^{k+1}, \ldots , R_N^{k+1}, R_\lambda ^{k+1}\right) \in \partial {\mathcal {L}}_\gamma (w^{k+1})\). We now need to bound the norms of \(R_i^{k+1}\), \(i=1,\ldots ,N-1\), \(R_N^{k+1}\) and \(R_\lambda ^{k+1}\). It holds that
$$\begin{aligned} \left\| R_i^{k+1} \right\|\le & {} \gamma \left\| A_i^\top \right\| \left( \sum \limits _{j=i+1}^{N-1} \left\| A_j x_j^{k} - A_j x_j^{k+1}\right\| + \left\| x_N^{k} - x_N^{k+1}\right\| \right) \\&+\, \gamma \left\| A_i^\top \right\| \left\| \sum \limits _{i=1}^{N-1} A_i x_i^{k+1} + x_N^{k+1} - b\right\| \\\le & {} \gamma \left\| A_i^\top \right\| \left( \sum \limits _{j=1}^{N-1} \left\| A_j x_j^{k} - A_j x_j^{k+1}\right\| + \left\| x_N^{k} - x_N^{k+1}\right\| \right) + \left\| A_i^\top \right\| \left\| \lambda ^k - \lambda ^{k+1}\right\| \end{aligned}$$and
$$\begin{aligned} \left\| R_N^{k+1} \right\| = \gamma \left\| \sum \limits _{i=1}^{N-1} A_i x_i^{k+1} + x_N^{k+1} - b\right\| = \left\| \lambda ^k - \lambda ^{k+1}\right\| , \quad \Vert R_\lambda ^{k+1} \Vert = \frac{1}{\gamma } \left\| \lambda ^k - \lambda ^{k+1} \right\| . \end{aligned}$$These relations immediately imply (4.21). \(\Box \)
Proof of Theorem 4.3
1.
It has been proven in Lemma 4.8 that \(\left\{ \left( x_1^k, x_2^k, \ldots ,x_N^k, \lambda ^k\right) :\right. \left. k=0,1,\ldots \right\} \) is a bounded sequence. Therefore, we conclude that \(\Omega (w^0)\) is non-empty by the Bolzano-Weierstrass Theorem. Let \(w^* = \left( x_1^*,\ldots , x_N^*,\lambda ^*\right) \in \Omega (w^0)\) be a limit point of \(\{w^k = \left( x_1^k,\ldots , x_N^k,\lambda ^k\right) :k=0,1,\ldots \}\). Then there exists a subsequence \(\left\{ w^{k_q} = \left( x_1^{k_q},\ldots , x_N^{k_q},\lambda ^{k_q}\right) :q=0,1,\ldots \right\} \) such that \(w^{k_q}\rightarrow w^*\) as \(q\rightarrow \infty \). Since \(f_i, i=1,\ldots ,N-1\), are lower semi-continuous, we obtain that
$$\begin{aligned} \liminf \limits _{q\rightarrow \infty } f_i(x_i^{k_q}) \ge f_i(x_i^*), \quad i=1,2,\ldots , N. \end{aligned}$$(4.28)
From (1.2), we have for any integer k and any \(i=1,\ldots ,N-1\),
$$\begin{aligned} x_i^{k+1} := \mathop {\mathrm{argmin}}\limits _{x_i\in {\mathcal {X}}_i} \ {\mathcal {L}}_\gamma \left( x_1^{k+1},\ldots , x_{i-1}^{k+1}, x_i, x_{i+1}^k,\ldots , x_N^k;\lambda ^k\right) . \end{aligned}$$Letting \(x_i = x_i^*\) in the above, we get
$$\begin{aligned} {\mathcal {L}}_\gamma \left( x_1^{k+1},\ldots , x_i^{k+1}, x_{i+1}^k,\ldots , x_N^k;\lambda ^k\right) \le {\mathcal {L}}_\gamma \left( x_1^{k+1},\ldots , x_{i-1}^{k+1}, x_i^{*}, x_{i+1}^k,\ldots , x_N^k;\lambda ^k\right) , \end{aligned}$$i.e.,
$$\begin{aligned}&f_i\left( x_i^{k+1}\right) - \left\langle \lambda ^k, A_i x_i^{k+1}\right\rangle + \frac{\gamma }{2}\left\| \sum \limits _{j=1}^i A_j x_j^{k+1} + \sum \limits _{j=i+1}^{N-1} A_j x_j^{k} + x_N^k - b \right\| ^2 \\&\quad \le f_i\left( x_i^*\right) - \left\langle \lambda ^k, A_i x_i^*\right\rangle + \frac{\gamma }{2}\left\| \sum \limits _{j=1}^{i-1} A_j x_j^{k+1} + A_i x_i^* + \sum \limits _{j=i+1}^{N-1} A_j x_j^{k} + x_N^k - b \right\| ^2. \end{aligned}$$Choosing \(k=k_q-1\) in the above inequality and letting q go to \(+\infty \), we obtain
$$\begin{aligned} \limsup \limits _{q\rightarrow +\infty }f_i\left( x_i^{k_q}\right) \le \limsup \limits _{q\rightarrow +\infty } \left( \frac{\gamma }{2}\left\| A_i x_i^{k_q} - A_i x_i^* \right\| ^2 + \left\langle \lambda ^{k_q-1}, A_i x_i^{k_q}- A_i x_i^*\right\rangle \right) + f_i\left( x_i^*\right) ,\nonumber \\ \end{aligned}$$(4.29)
for \(i=1,2,\ldots ,N-1\). Here we have used the facts that the sequence \(\{w^k:k=0,1,\ldots \}\) is bounded, that \(\gamma \) is finite, that the distance between two successive iterates tends to zero by (4.20), and that
$$\begin{aligned}&\sum \limits _{j=1}^i A_j x_j^{k+1} + \sum \limits _{j=i+1}^{N-1} A_j x_j^k + x_N^k - b = \sum \limits _{j=i+1}^{N-1} \left( A_j x_j^k - A_j x_j^{k+1}\right) + \left( x_N^k - x_N^{k+1}\right) \\&\quad +\,\frac{1}{\gamma } \left( \lambda ^k - \lambda ^{k+1}\right) . \end{aligned}$$From (4.20) we also have \(x_i^{k_q-1}\rightarrow x_i^*\) as \(q\rightarrow \infty \), hence (4.29) reduces to \(\limsup \limits _{q\rightarrow \infty }f_i(x_i^{k_q})\le f_i(x_i^*)\). Therefore, combining with (4.28), \(f_i(x_i^{k_q})\) tends to \(f_i(x_i^*)\) as \(q\rightarrow \infty \). Hence, we can conclude that
$$\begin{aligned} \lim \limits _{q\rightarrow \infty }{\mathcal {L}}_\gamma (w^{k_q})= & {} \lim \limits _{q\rightarrow \infty }\left( \sum \limits _{i=1}^N f_i\left( x_i^{k_q}\right) - \left\langle \lambda ^{k_q}, \sum \limits _{i=1}^{N-1} A_i x_i^{k_q} + x_N^{k_q} -b\right\rangle \right. \\&\left. +\, \frac{\gamma }{2}\left\| \sum \limits _{i=1}^{N-1} A_i x_i^{k_q} + x_N^{k_q} -b \right\| ^2\right) \\= & {} \sum \limits _{i=1}^N f_i\left( x_i^{*}\right) - \left\langle \lambda ^{*}, \sum \limits _{i=1}^{N-1} A_i x_i^{*}+x_N^{*}-b\right\rangle + \frac{\gamma }{2}\left\| \sum \limits _{i=1}^{N-1} A_i x_i^{*}+x_N^{*}-b \right\| ^2 \\= & {} {\mathcal {L}}_\gamma (w^{*}). \end{aligned}$$On the other hand, it follows from (4.20) and (4.21) that
$$\begin{aligned} \left( R_1^{k+1}, \ldots , R_N^{k+1}, R_\lambda ^{k+1}\right)\in & {} \partial {\mathcal {L}}_\gamma (w^{k+1}) \end{aligned}$$(4.30)
$$\begin{aligned} \left( R_1^{k+1}, \ldots , R_N^{k+1}, R_\lambda ^{k+1}\right)\rightarrow & {} (0,\ldots ,0), \quad k\rightarrow \infty . \end{aligned}$$(4.31)
This implies that \((0,\ldots ,0)\in \partial {\mathcal {L}}_\gamma (x_1^*,\ldots ,x_N^*,\lambda ^*)\) due to the closedness of \(\partial {\mathcal {L}}_\gamma \). Therefore, \(w^* = \left( x_1^*,\ldots ,x_N^*,\lambda ^*\right) \) is a critical point of \({\mathcal {L}}_\gamma (x_1,\ldots ,x_N,\lambda )\).
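The vanishing-subgradient argument can be watched numerically. In the toy sketch below (hypothetical data, \(A_i = I\)), the surrogate \(R_N^{k+1} = \gamma (\sum _i A_i x_i^{k+1} + x_N^{k+1} - b)\) coincides with \(\lambda ^k - \lambda ^{k+1}\) by the dual update and shrinks to zero along the iterates, consistent with (4.30)–(4.31).

```python
# Toy check (hypothetical data) that the subgradient surrogates R_N and
# R_lambda from Lemma 4.8 vanish along the ADMM iterates, so that limit
# points of the sequence are critical points of the augmented Lagrangian.
a, b, gamma = (1.0, 2.0, 3.0), 3.0, 2.0
x, lam = [0.0, 0.0, 0.0], 0.0
for _ in range(200):
    old_lam = lam
    for i in range(3):                 # Gauss-Seidel block updates
        others = sum(x) - x[i]
        x[i] = (a[i] + lam - gamma * (others - b)) / (1.0 + gamma)
    r = sum(x) - b                     # primal residual after the sweep
    lam -= gamma * r                   # dual update
    R_N = gamma * r                    # equals old_lam - lam by construction
    R_lam = -r
    assert abs(R_N - (old_lam - lam)) < 1e-12
final_R_N = gamma * (sum(x) - b)       # essentially zero at the last iterate
```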
2.
The proof of this assertion follows directly from Lemma 5 and Remark 5 of [4]; we omit it here for succinctness.
3.
Let \(\tilde{L}\) be the finite limit of \({\mathcal {L}}_\gamma (x_1^k,\ldots ,x_N^k,\lambda ^k)\) as \(k\) goes to infinity, i.e.,
$$\begin{aligned} \tilde{L} = \lim \limits _{k\rightarrow \infty } {\mathcal {L}}_\gamma \left( x_1^k,\ldots ,x_N^k,\lambda ^k\right) . \end{aligned}$$Take \(w^*\in \Omega (w^0)\). There exists a subsequence \(w^{k_q}\) converging to \(w^*\) as q goes to infinity. Since we have proven that
$$\begin{aligned} \lim \limits _{q\rightarrow \infty }{\mathcal {L}}_\gamma (w^{k_q}) = {\mathcal {L}}_\gamma (w^{*}), \end{aligned}$$and \({\mathcal {L}}_\gamma (w^{k})\) is a non-increasing sequence, we conclude that \({\mathcal {L}}_\gamma (w^{*}) = \tilde{L}\), hence the restriction of \({\mathcal {L}}_\gamma (x_1,\ldots ,x_N,\lambda )\) to \(\Omega (w^0)\) equals \(\tilde{L}\).
\(\square \)
Cite this article
Lin, T., Ma, S. & Zhang, S. Iteration Complexity Analysis of Multi-block ADMM for a Family of Convex Minimization Without Strong Convexity. J Sci Comput 69, 52–81 (2016). https://doi.org/10.1007/s10915-016-0182-0
Keywords
- Alternating direction method of multipliers (ADMM)
- Convergence rate
- Regularization
- Kurdyka–Łojasiewicz property
- Convex optimization