Abstract
The stable principal component pursuit (SPCP) is a non-smooth convex optimization problem whose solution enables one to reliably recover the low-rank and sparse components of a data matrix corrupted by a dense noise matrix, even when only a fraction of the data entries are observable. In this paper, we propose a new algorithm for solving SPCP. The proposed algorithm is a modification of the alternating direction method of multipliers (ADMM) in which we use an increasing sequence of penalty parameters instead of a fixed penalty. The algorithm is based on partial variable splitting and works directly with the non-smooth objective function. We show that both the primal and dual iterate sequences converge under mild conditions on the sequence of penalty parameters. To the best of our knowledge, this is the first convergence result for a variable-penalty ADMM in which the penalties are unbounded, the objective function is non-smooth, and its subdifferential is not uniformly bounded. Together, partial variable splitting and an increasing sequence of penalty multipliers significantly reduce the number of iterations required to achieve feasibility in practice. Our preliminary computational tests show that the proposed algorithm works very well in practice and outperforms ASALM, a state-of-the-art ADMM algorithm for the SPCP problem with a constant penalty parameter.
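To make the idea concrete, the following is a minimal numerical sketch of ADMM with an increasing penalty sequence, applied to the simplified fully observed, noiseless case \(\min \Vert L\Vert _* + \xi \Vert S\Vert _1\) s.t. \(L+S=D\). This is *not* the paper's ADMIP algorithm (which uses partial variable splitting and handles partial observations and noise); all parameter names and settings (`rho0`, `growth`, the toy problem sizes) are illustrative assumptions.

```python
import numpy as np

def shrink(X, tau):
    """Elementwise soft-thresholding: prox of tau * ||.||_1."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svt(X, tau):
    """Singular value thresholding: prox of tau * ||.||_*."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def admm_increasing_penalty(D, xi, rho0=1e-2, growth=1.05, iters=300):
    """Toy ADMM for min ||L||_* + xi*||S||_1 s.t. L + S = D,
    with an increasing penalty sequence rho_k (a sketch, not ADMIP)."""
    m, n = D.shape
    L = np.zeros((m, n)); S = np.zeros((m, n)); Y = np.zeros((m, n))
    rho = rho0
    for _ in range(iters):
        L = svt(D - S - Y / rho, 1.0 / rho)        # L-subproblem
        S = shrink(D - L - Y / rho, xi / rho)      # S-subproblem
        Y = Y + rho * (L + S - D)                  # dual update
        rho *= growth                              # increasing penalty
    return L, S

# Toy instance: rank-2 low-rank part plus 10% sparse spikes.
rng = np.random.default_rng(0)
m = n = 20
L0 = rng.standard_normal((m, 2)) @ rng.standard_normal((2, n))
S0 = np.zeros((m, n))
idx = rng.random((m, n)) < 0.1
S0[idx] = 5.0 * rng.standard_normal(np.count_nonzero(idx))
D = L0 + S0
L, S = admm_increasing_penalty(D, xi=1.0 / np.sqrt(n))
feas = np.linalg.norm(L + S - D) / np.linalg.norm(D)
print(feas)  # growing penalties drive the primal residual toward 0
```

The point this illustrates is the one made in the abstract: because the dual iterates stay bounded while \(\rho _k\) grows, the primal residual \(L_{k+1}-Z_{k+1}=\rho _k^{-1}(Y_{k+1}-Y_k)\) is forced to zero much faster than with a fixed penalty.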
Notes
In an earlier preprint, we named it the Non-Smooth Augmented Lagrangian (NSA) algorithm.
The modified version is available from http://svt.stanford.edu/code.html
References
Aybat, N.S., Iyengar, G.: A unified approach for minimizing composite norms. Math. Progr. Ser. A 144, 181–226 (2014)
Aybat, N.S., Goldfarb, D., Ma, S.: Efficient algorithms for robust and stable principal component pursuit problems. Comput. Optim. Appl. 58, 1–29 (2014)
Aybat, N.S., Zarmehri, S., Kumara, S.: An ADMM algorithm for clustering partially observed networks. In: Proceedings of the 2015 SIAM International Conference on Data Mining, to appear (2015). Preprint available at http://arxiv.org/abs/1410.3898
Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2, 183–202 (2009)
Boyd, S., Parikh, N., Chu, E., Peleato, B., Eckstein, J.: Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn. 3(1), 1–122 (2011)
Boyer, C., Merzbach, U.: A History of Mathematics, 2nd edn, pp. 286–287. Wiley, New York (1991)
Candès, E.J., Li, X., Ma, Y., Wright, J.: Robust principal component analysis? J. ACM 58(3), 1–37 (2011)
Chandrasekaran, V., Sanghavi, S., Parrilo, P., Willsky, A.: Rank-sparsity incoherence for matrix decomposition. SIAM J. Optim. 21(2), 572–596 (2011)
Daubechies, I., Defrise, M., De Mol, C.: An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Commun. Pure Appl. Math. 57, 1413–1457 (2004)
Eckstein, J.: Augmented Lagrangian and alternating direction methods for convex optimization: a tutorial and some illustrative computational results. RUTCOR Research Report RRR 32-2012, Rutgers Center for Operations Research (2012)
Eckstein, J., Bertsekas, D.P.: On the Douglas–Rachford splitting method and the proximal point algorithm for maximal monotone operators. Math. Program. 55, 293–318 (1992)
Fukushima, M.: Application of the alternating direction method of multipliers to separable convex programming problems. Comput. Optim. Appl. 1, 93–111 (1992). doi:10.1007/BF00247655
Glowinski, R.: Augmented Lagrangian Methods: Applications to the Numerical Solution of Boundary-Value Problems. Studies in Mathematics and its Applications. Elsevier Science (2000)
Goldfarb, D., Ma, S., Scheinberg, K.: Fast alternating linearization methods for minimizing the sum of two convex functions. Math. Program. Ser. A 141(1–2), 349–382 (2013)
He, B., Yang, H.: Some convergence properties of a method of multipliers for linearly constrained monotone variational inequalities. Oper. Res. Lett. 23, 151–161 (1998)
He, B., Yang, H., Wang, S.: Alternating direction method with self-adaptive penalty parameters for monotone variational inequalities. J. Optim. Theory Appl. 106(2), 337–356 (2000)
He, B.S., Liao, L.Z., Han, D.R., Yang, H.: A new inexact alternating directions method for monotone variational inequalities. Math. Program. Ser. A 92, 103–118 (2002)
Kontogiorgis, S., Meyer, R.R.: A variable-penalty alternating direction method for convex optimization. Math. Program. 83, 29–53 (1998)
Larsen, R.: Lanczos bidiagonalization with partial reorthogonalization. Technical report DAIMI PB-357, Department of Computer Science, Aarhus University (1998)
Li, L., Huang, W., Gu, I., Tian, Q.: Statistical modeling of complex backgrounds for foreground object detection. IEEE Trans. Image Process. 13, 1459–1472 (2004)
Lin, Z., Ganesh, A., Wright, J., Wu, L., Chen, M., Ma, Y.: Fast convex optimization algorithms for exact recovery of a corrupted low-rank matrix. Tech. rep., UIUC Technical Report UILU-ENG-09-2214 (2009)
Lin, Z., Chen, M., Wu, L., Ma, Y.: The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices. arXiv:1009.5055v2 (2011)
Lions, P.L., Mercier, B.: Splitting algorithms for the sum of two nonlinear operators. SIAM J. Numer. Anal. 16, 964–979 (1979)
Nocedal, J., Wright, S.J.: Numerical Optimization. Springer-Verlag, New York (1999)
Rockafellar, R.: Convex Analysis. Princeton University Press (1997)
Rockafellar, R.T.: Augmented Lagrangians and applications of the proximal point algorithm in convex programming. Math. Oper. Res. 1(2), 97–116 (1976)
Rockafellar, R.T.: Monotone operators and the proximal point algorithm. SIAM J. Control Optim. 14, 877–898 (1976)
Tao, M., Yuan, X.: Recovering low-rank and sparse components of matrices from incomplete and noisy observations. SIAM J. Optim. 21(1), 57–81 (2011)
Tseng, P.: On accelerated proximal gradient methods for convex-concave optimization. SIAM J. Optim. (2008)
Wright, J., Peng, Y., Ma, Y., Ganesh, A., Rao, S.: Robust principal component analysis: exact recovery of corrupted low-rank matrices via convex optimization. In: Proceedings of Neural Information Processing Systems (NIPS) (2009)
Zhou, Z., Li, X., Wright, J., Candès, E., Ma, Y.: Stable principal component pursuit. In: Proceedings of the International Symposium on Information Theory (2010)
Acknowledgments
We would like to thank Min Tao for providing the ASALM code. The work of N. S. Aybat was supported by NSF Grant CMMI-1400217. The work of G. Iyengar was supported by NIH Grant R21 AA021909-01 and NSF Grants CMMI-1235023 and DMS-1016571.
Appendix: Proofs
1.1 Proof of Lemma 1
Suppose \(\delta >0\). Let \((Z^*,S^*)\) be an optimal solution to problem \((P_{ns})\), let \(\theta ^*\) denote the optimal Lagrange multiplier for the constraint \((Z,S)\in \chi \) written as \(\frac{1}{2}\Vert \pi _{\varOmega }\left( Z+S-D\right) \Vert ^2_F\le \frac{\delta ^2}{2}\), and let \(\pi ^*_{\varOmega }\) denote the adjoint operator of \(\pi _{\varOmega }\). Note that \(\pi ^*_{\varOmega }=\pi _{\varOmega }\). Then the KKT conditions for this problem are given by
where (38) and (39) follow from the fact that \(\pi _{\varOmega } \pi _{\varOmega }=\pi _{\varOmega }\).
and
where \(q(\tilde{Z})=\tilde{Z}-\rho ^{-1}~Q\). From (44) it follows that
From the second equation in (45), we get
The Eq. (46) and \(\pi _{{\varOmega }^c}\left( G\right) =\mathbf{0}\) are precisely the first-order optimality conditions for the “shrinkage” problem
The expression for \(S^*\) in (10) is the optimal solution to this “shrinkage” problem, and \(Z^*\) given in (11) follows from the first equation in (43) and the first row of (45). Hence, given optimal Lagrangian dual \(\theta ^*\), \(S^*\) and \(Z^*\) computed from Eqs. (10) and (11), respectively, satisfy KKT conditions (38) and (39).
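As a numerical aside, the closed-form solution of such a "shrinkage" problem is elementwise soft-thresholding. The sketch below (with an illustrative threshold `tau` rather than the specific weight \(\xi \frac{\rho +\theta ^*}{\rho \theta ^*}\) appearing in (10)) verifies the first-order optimality condition \(0\in \tau \,\partial \Vert S\Vert _1+(S-X)\) directly.

```python
import numpy as np

def shrink(X, tau):
    """Soft-thresholding: closed-form minimizer of
       tau * ||S||_1 + 0.5 * ||S - X||_F^2 over S."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

print(shrink(np.array([3.0, -0.5, 1.2]), 1.0))  # yields [2., 0., 0.2]

# Optimality check on a random instance: G = X - S must satisfy
# |G_ij| <= tau everywhere, with G_ij = tau * sign(S_ij) where S_ij != 0,
# i.e. G is tau times a subgradient of ||S||_1 at the minimizer.
rng = np.random.default_rng(1)
X = rng.standard_normal((5, 5))
tau = 0.3
S = shrink(X, tau)
G = X - S
print(np.max(np.abs(G)) <= tau + 1e-12)
```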
Next, we show how to compute the optimal dual \(\theta ^*\). We consider two cases.
(i)
Suppose \(\Vert \pi _{\varOmega }\left( D-q(\tilde{Z})\right) \Vert _F\le \delta \). In this case, let \(\theta ^*=0\). Setting \(\theta ^*=0\) in (10) and (11), we find \(S^*=\mathbf{0}\) and \(Z^*=q(\tilde{Z})\). By construction, \(S^*\), \(Z^*\) and \(\theta ^*\) satisfy conditions (38) and (39). It is easy to check that this choice of \(\theta ^*=0\) trivially satisfies the rest of the conditions as well. Hence, \(\theta ^*=0\) is an optimal Lagrangian dual.
(ii)
Next, suppose \(\Vert \pi _{\varOmega }\left( D-q(\tilde{Z})\right) \Vert _F>\delta \). From (11), we have
$$\begin{aligned} \pi _{\varOmega }\left( Z^*+S^*-D\right) = \frac{\rho }{\rho +\theta ^*}~\pi _{\varOmega }\left( S^*+q(\tilde{Z})-D\right) . \end{aligned}$$(47)Therefore,
$$\begin{aligned}&\Vert \pi _{\varOmega }\left( Z^*+S^*-D\right) \Vert _F \nonumber \\&\quad =\frac{\rho }{\rho +\theta ^*}~\left\| \pi _{\varOmega }\left( S^*+q(\tilde{Z})-D\right) \right\| _F, \nonumber \\&\quad =\frac{\rho }{\rho +\theta ^*} \left\| \pi _{\varOmega }\left( \max \left\{ |D-q(\tilde{Z})| -\xi \frac{(\rho +\theta ^*)}{\rho \theta ^*} E,\ \mathbf{0}\right\} -|D-q(\tilde{Z})|\right) \right\| _F,\nonumber \\&\quad = \frac{\rho }{\rho +\theta ^*}~\left\| \pi _{\varOmega }\left( \min \left\{ \xi \frac{(\rho +\theta ^*)}{\rho \theta ^*}~E,\ |D-q(\tilde{Z})|\right\} \right) \right\| _F,\nonumber \\&\quad =\left\| \min \left\{ \frac{\xi }{\theta ^*}~E,\ \frac{\rho }{\rho +\theta ^*}~\left| \pi _{\varOmega }\left( D-q(\tilde{Z})\right) \right| \right\} \right\| _F, \end{aligned}$$(48)where the second equation is obtained after substituting (10) for \(S^*\) and then componentwise dividing the resulting expression inside the norm by \(\hbox {sgn}\left( D-q(\tilde{Z})\right) \). Define \(\phi :\mathbb {R}_+\rightarrow \mathbb {R}\),
$$\begin{aligned} \phi (\theta ):= \left\| \min \left\{ \frac{\xi }{\theta }~E,\ \frac{\rho }{\rho +\theta }~\left| \pi _{\varOmega }\left( D-q(\tilde{Z})\right) \right| \right\} \right\| _F. \end{aligned}$$(49)It is easy to show that \(\phi \) is a strictly decreasing function of \(\theta \). Since \(\lim _{\theta \rightarrow \infty }\phi (\theta )=0\) and \(\phi (0)=\Vert \pi _{\varOmega }\left( D-q(\tilde{Z})\right) \Vert _F>\delta \), there exists a unique \(\theta ^*>0\) such that \(\phi (\theta ^*)=\delta \). Moreover, since \(\theta ^*>0\) and \(\phi (\theta ^*)=\delta \), (48) implies that \(Z^*\), \(S^*\) and \(\theta ^*\) satisfy the rest of KKT conditions (40), (41) and (42) as well. Thus, the unique \(\theta ^*>0\) that satisfies \(\phi (\theta ^*)=\delta \) is the optimal Lagrangian dual. We now show that \(\theta ^*\) can be computed in \(\mathcal {O}(|{\varOmega }|\log (|{\varOmega }|))\) time. Let \(A:=|\pi _{\varOmega }\left( D-q(\tilde{Z})\right) |\) and \(0\le a_{(1)}\le a_{(2)}\le \cdots \le a_{(|{\varOmega }|)}\) be the \(|{\varOmega }|\) elements of the matrix \(A\) corresponding to the indices \((i,j)\in {\varOmega }\) sorted in increasing order, which can be done in \(\mathcal {O}(|{\varOmega }|\log (|{\varOmega }|))\) time. Defining \(a_{(0)}:=0\) and \(a_{(|{\varOmega }|+1)}:=\infty \), we then have for all \(j\in \{0,1,\ldots ,|{\varOmega }|\}\) that
$$\begin{aligned} \frac{\rho }{\rho +\theta }~a_{(j)} \le \frac{\xi }{\theta } \le \frac{\rho }{\rho +\theta }~a_{(j+1)} \Leftrightarrow \frac{1}{\xi }~a_{(j)}-\frac{1}{\rho } \le \frac{1}{\theta } \le \frac{1}{\xi }~a_{(j+1)}-\frac{1}{\rho }. \end{aligned}$$(50)Let \(\bar{k}:=\max \left\{ j: a_{(j)}\le \frac{\xi }{\rho },\ 0\le j\le |{\varOmega }| \right\} \), and for all \(\bar{k}< j\le |{\varOmega }|\) define \(\theta _j:=\frac{1}{\frac{1}{\xi }~a_{(j)}-\frac{1}{\rho }}\). Then for all \(\bar{k}< j\le |{\varOmega }|\), we have
$$\begin{aligned} \phi (\theta _j)=\sqrt{\left( \frac{\rho }{\rho +\theta _j}\right) ^2~\sum _{i=0}^j a^2_{(i)}+(|{\varOmega }|-j)~\left( \frac{\xi }{\theta _j}\right) ^2}. \end{aligned}$$(51)Also define \(\theta _{\bar{k}}:=\infty \) and \(\theta _{|{\varOmega }|+1}:=0\) so that \(\phi (\theta _{\bar{k}}):=0\) and \(\phi (\theta _{|{\varOmega }|+1})=\phi (0)=\Vert A\Vert _F>\delta \). Note that \(\{\theta _j\}_{\{\bar{k}< j\le |{\varOmega }|\}}\) contains all the points at which \(\phi (\theta )\) may not be differentiable for \(\theta \ge 0\). Define \(j^*:=\max \{j:\ \phi (\theta _j)\le \delta ,\ \bar{k}\le j\le |{\varOmega }|\}\). Then \(\theta ^*\) is the unique solution of the system
$$\begin{aligned} \sqrt{\left( \frac{\rho }{\rho +\theta }\right) ^2~\sum _{i=0}^{j^*} a^2_{(i)}+(|{\varOmega }|-j^*)~\left( \frac{\xi }{\theta }\right) ^2}=\delta \,\hbox {and}\, \theta >0, \end{aligned}$$(52)since \(\phi (\theta )\) is continuous and strictly decreasing in \(\theta \) for \(\theta \ge 0\). Solving the equation in (52) requires finding the roots of a fourth-order polynomial (also known as a quartic function). Lodovico Ferrari showed in 1540 that the roots of a quartic can be expressed in closed form; thus, \(\theta ^*>0\) can be computed in \(\mathcal {O}(1)\) operations. Note that if \(\bar{k}=|{\varOmega }|\), then \(\theta ^*\) is the solution of the equation
$$\begin{aligned} \sqrt{\left( \frac{\rho }{\rho +\theta ^*}\right) ^2~\sum _{i=1}^{|{\varOmega }|} a^2_{(i)}}=\delta , \end{aligned}$$(53)i.e. \(\theta ^*= \rho \left( \frac{\Vert A\Vert _F}{\delta }-1\right) = \rho \left( \frac{\Vert \pi _{\varOmega }\left( D-q(\tilde{Z})\right) \Vert _F}{\delta }-1\right) \).
Hence, we have proved that problem \((P_{ns})\) can be solved efficiently when \(\delta > 0\).
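The construction of \(\theta ^*\) above can be checked numerically. The sketch below does not implement the paper's \(\mathcal {O}(|{\varOmega }|\log (|{\varOmega }|))\) sorting-plus-quartic procedure; instead it exploits the same monotonicity of \(\phi \) used in the proof and brackets \(\theta ^*\) by bisection, then verifies the closed form \(\theta ^*=\rho (\Vert A\Vert _F/\delta -1)\) for the case \(\bar{k}=|{\varOmega }|\). The test data (`a`, `xi`, `rho`, `delta`) are arbitrary illustrative values.

```python
import numpy as np

def phi(theta, a, xi, rho):
    """phi(theta) from (49): Frobenius norm of the elementwise minimum
       of xi/theta and rho/(rho+theta) * a, where a = |pi_Omega(D - q)|."""
    return np.linalg.norm(np.minimum(xi / theta, rho / (rho + theta) * a))

def solve_theta(a, xi, rho, delta, lo=1e-12, hi=1e12):
    """Bisection for the unique theta* > 0 with phi(theta*) = delta,
       valid when phi(0+) = ||a||_F > delta (phi is strictly decreasing)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if phi(mid, a, xi, rho) > delta:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

rng = np.random.default_rng(2)
xi, rho = 0.25, 2.0
a = np.abs(rng.standard_normal(50))
delta = 0.5 * np.linalg.norm(a)            # 0 < delta < phi(0) = ||a||_F
theta = solve_theta(a, xi, rho, delta)
print(abs(phi(theta, a, xi, rho) - delta))  # ~0

# Special case k_bar = |Omega| (all a_(j) <= xi/rho): closed form from (53),
# theta* = rho * (||A||_F / delta - 1).
a_small = np.minimum(a, 0.9 * xi / rho)
delta2 = 0.5 * np.linalg.norm(a_small)
theta_cf = rho * (np.linalg.norm(a_small) / delta2 - 1.0)
print(abs(phi(theta_cf, a_small, xi, rho) - delta2))  # ~0
```

Bisection costs \(\mathcal {O}(|{\varOmega }|)\) per step rather than the proof's one-shot quartic solve, but it makes the role of the strict monotonicity of \(\phi \) transparent.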
Now, suppose \(\delta =0\). Since \(\pi _{\varOmega }\left( Z^*+S^*-D\right) =0\), problem \((P_{ns})\) can be written as
Then (13) and \(Z^*=\pi _{\varOmega }\left( D-S^*\right) +\pi _{{\varOmega }^c}\left( q(\tilde{Z})\right) \) trivially follow from first-order optimality conditions for the above problem.
1.2 Proof of Lemma 2
Let \(W^*:=-Q+\rho (\tilde{Z}-Z^*)\). Then (38) in the proof of Lemma 1 implies that \(W^*=\theta ^*~\pi _{\varOmega }\left( Z^*+S^*-D\right) \). From the first-order optimality conditions of \((P_{ns})\) in (9), we have that \((W^*,-W)\in \partial \mathbf{1}_\chi (Z^*,S^*)\) for some \(W\in \partial \xi \Vert S^*\Vert _1\). From (38) and (39), it follows that \(-W^*\in \partial \xi \Vert S^*\Vert _1\). The definition of \(\chi \), the chain rule for subdifferentials (see Theorem 23.9 in Rockafellar [26]), and \(-W^*\in \partial \xi \Vert S^*\Vert _1\) together imply that \((W^*,W^*)\in \partial \mathbf{1}_\chi (Z^*,S^*)\).
1.3 Proof of Lemma 3
Since \(L_{k+1}\) is the optimal solution to the subproblem in Step 4 of ADMIP corresponding to the \(k\)-th iteration, it follows that
Let \(\theta _k\ge 0\) denote the optimal Lagrange multiplier for the quadratic constraint in Step 5 sub-problem in the \(k\)-th iteration. Since \((Z_{k+1},S_{k+1})\) is the optimal solution, the first-order optimality conditions imply that
From (55), it follows that \(-\hat{Y}_{k+1}\in \partial \Vert L_{k+1}\Vert _*\). From (56) and (57), it follows that \(-Y_{k+1}\in \xi ~\partial \Vert S_{k+1}\Vert _1\). Since \(\partial \Vert L\Vert _*\) and \(\partial \Vert S\Vert _1\) are uniformly bounded sets for all \(L, S\in \mathbb {R}^{m\times n}\), it follows that \(\{\hat{Y}_k\}_{k\in \mathbb {Z}_+}\) and \(\{Y_k\}_{k\in \mathbb {Z}_+}\) are bounded sequences. Moreover, (57) implies that \(\pi _{\varOmega }\left( Y_k\right) =Y_k\) for all \(k\ge 1\).
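The uniform boundedness used in this proof is a standard fact about these two norms: every subgradient of \(\Vert \cdot \Vert _1\) has entries in \([-1,1]\), and every subgradient of \(\Vert \cdot \Vert _*\) has spectral norm at most \(1\). The sketch below checks this for the canonical subgradients \(\hbox {sgn}(S)\) and \(UV^\top \) (valid at a generic point: no zero entries, full rank); the matrices are arbitrary illustrative data.

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 8, 6
S = rng.standard_normal((m, n))
L = rng.standard_normal((m, n))

# Canonical subgradient of ||S||_1 at a generic S (no zero entries): sgn(S).
# Entries lie in {-1, 1}, so its Frobenius norm is sqrt(m*n), independent of S.
G1 = np.sign(S)
print(np.linalg.norm(G1))           # = sqrt(m*n), whatever S is

# Canonical subgradient of ||L||_* at a generic (full-rank) L: U @ Vt from a
# thin SVD. Its singular values are all 1, so its spectral norm is 1 and its
# Frobenius norm is sqrt(min(m, n)), again independent of L.
U, s, Vt = np.linalg.svd(L, full_matrices=False)
G2 = U @ Vt
print(np.linalg.norm(G2, 2))        # spectral norm: 1.0 up to rounding
```

This is exactly why the dual sequences \(\{\hat{Y}_k\}\) and \(\{Y_k\}\) in Lemma 3 are bounded regardless of how the primal iterates behave.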
1.4 Proof of Lemma 4
For all \(k \ge 0\), since \(Y_{k+1}=Y_k+\rho _k(L_{k+1}-Z_{k+1})\) and \(\hat{Y}_{k+1}:=Y_k+\rho _k(L_{k+1}-Z_k)\), we have that \(Y_{k+1}-\hat{Y}_{k+1}=\rho _k(Z_k-Z_{k+1})\). Using these relations, we obtain the following equality
Moreover, we also have
where the second equality follows from rewriting the last term in (59) using (58), and the last equality follows from the relation \(L_{k+1}-Z_{k+1} = \rho _k^{-1}(Y_{k+1}-Y_k)\).
Since \(Y^*\) and \(\theta ^*\) are optimal Lagrangian dual variables, we have
From first-order optimality conditions, we get
Hence, \(-Y^*\in \partial \Vert L^*\Vert _*\) and \(-Y^*\in \xi ~\partial \Vert S^*\Vert _1\). Moreover, from Lemma 3, we also have that \(-Y_k\in \partial \xi ~\Vert S_k\Vert _1\) for all \(k\ge 1\). Since \(\xi ~\Vert .\Vert _1\) is convex, it follows that
Since \(\rho _{k+1}\ge \rho _k\) for all \(k\ge 1\), first adding (61) to (60), then adding and subtracting (62), we get
Lemma 2 applied to the Step 5 sub-problem corresponding to the \(k\)-th iteration gives \((Y_{k+1},Y_{k+1})\in \partial \mathbf{1}_{\chi }(Z_{k+1},S_{k+1})\). Using an argument similar to that used in the proof of Lemma 2, one can also show that \((Y^*,Y^*)\in \partial \mathbf{1}_{\chi }(L^*,S^*)\). Moreover, since \(-Y^*\in \partial \xi ~\Vert S^*\Vert _1\), \(-Y^*\in \partial \Vert L^*\Vert _*\), and \(-Y_{k}\in \partial \xi ~\Vert S_k\Vert _1\), \(-\hat{Y}_{k}\in \partial \Vert L_k\Vert _*\) for all \(k\ge 1\), we have that for all \(k \ge 0\),
This set of inequalities and (63) together imply that \(\{\Vert Z_{k}-L^*\Vert _F^2+\rho _{k}^{-2}\Vert Y_{k}-Y^*\Vert _F^2\}_{k\in \mathbb {Z}_+}\) is a non-increasing sequence. Using this fact, rewriting (63) and summing over \(k\in \mathbb {Z}_+\), we get
This inequality is sufficient to prove the rest of the lemma.
Cite this article
Aybat, N.S., Iyengar, G. An alternating direction method with increasing penalty for stable principal component pursuit. Comput Optim Appl 61, 635–668 (2015). https://doi.org/10.1007/s10589-015-9736-6