Correction to: Journal of Scientific Computing (2021) 88:45, https://doi.org/10.1007/s10915-021-01556-2

The original version of this article [4] unfortunately contained an error. The authors would like to correct the error with this corrigendum.

In [4], the optimization formulation is not presented correctly. We should use the following model

$$\begin{aligned} \min _{X}~\varPhi (X): = \frac{1}{2}\Vert A - X \Vert _F^2 \quad \mathrm{subject~to}\quad \mathrm{rank}(X) \le r,~X \ge 0 \end{aligned}$$
(1)

instead of (2) stated in [4], in order to avoid rank-deficient iteration points: the constraint set \({\mathcal {M}}_r = \{X\in {\mathbb {R}}^{m\times n}\vert \mathrm{rank}(X) = r\}\) used in [4] is not closed (for instance, \(\mathrm{diag}(1, 1/k)\) has rank 2 for every \(k\), yet its limit \(\mathrm{diag}(1, 0)\) has rank 1).

In order to derive the Augmented Lagrangian (AL) method for (1), we need some geometric properties of \({\mathcal {M}}_{\le r} = \{X\in {\mathbb {R}}^{m\times n}\vert \mathrm{rank}(X) \le r\}\), which is a real-algebraic variety [3]. At a rank-deficient point \(X\in {\mathcal {M}}_{\le r}\) with \(\mathrm{rank}(X) = s < r\), the tangent cone \(T_{X}{\mathcal {M}}_{\le r}\) is given by [2]

$$\begin{aligned} T_{X}{\mathcal {M}}_{\le r} = T_X{\mathcal {M}}_s \oplus \{\varXi _{r-s}\in {\mathcal {U}}^{\bot }\otimes {\mathcal {V}}^{\bot }~\vert ~\mathrm{rank}(\varXi _{r-s}) \le r - s\}, \end{aligned}$$

where \(T_{X}{\mathcal {M}}_{s}\) is the tangent space of \({\mathcal {M}}_s\) at \(X\), \({\mathcal {U}}^{\bot }\) and \({\mathcal {V}}^{\bot }\) denote the orthogonal complements of the column and row spaces of \(X\), \(\oplus \) denotes the direct sum and \(\otimes \) denotes the Kronecker product. Then for any \(Z\in {\mathbb {R}}^{m\times n}\), the orthogonal projection of \(Z\) onto \(T_X{\mathcal {M}}_{\le r}\) is given by

$$\begin{aligned} {\mathcal {P}}_{T_X{\mathcal {M}}_{\le r}}\big (Z\big ) = {\mathcal {P}}_{T_X{\mathcal {M}}_{s}}\big (Z\big ) + \varXi _{r - s}, \end{aligned}$$

where \(\varXi _{r-s}\) is a best rank-\((r-s)\) approximation of \(Z - {\mathcal {P}}_{T_X{\mathcal {M}}_{s}}\big (Z\big )\) in the Frobenius norm. For a differentiable function \(\varPhi \), a critical point \(X^*\) of \(\min _{X\in {\mathcal {M}}_{\le r}}\varPhi (X)\) satisfies [3]

$$\begin{aligned} 0 = \Vert \nabla \varPhi (X^*)\Vert ^2 - \mathrm{dist}(-\nabla \varPhi (X^*), T_{X^*}{\mathcal {M}}_{\le r})^2, \end{aligned}$$
(2)

where

$$\begin{aligned} \mathrm{dist}(Z, T_{X^*}{\mathcal {M}}_{\le r}) = \Vert Z - {\mathcal {P}}_{T_{X^*}{\mathcal {M}}_{\le r}}(Z)\Vert _F. \end{aligned}$$

Since \({\mathcal {P}}_{T_{X^*}{\mathcal {M}}_{\le r}}(Z)\) is the orthogonal projection of \(Z\) onto \(T_{X^*}{\mathcal {M}}_{\le r}\), it holds that

$$\begin{aligned} \Vert {\mathcal {P}}_{T_{X^*}{\mathcal {M}}_{\le r}}(-\nabla \varPhi (X^*))\Vert ^2 = \Vert -\nabla \varPhi (X^*)\Vert ^2 - \Vert -\nabla \varPhi (X^*) - {\mathcal {P}}_{T_{X^*}{\mathcal {M}}_{\le r}}(-\nabla \varPhi (X^*))\Vert ^2. \end{aligned}$$

Therefore, (2) is equivalent to

$$\begin{aligned} {\mathcal {P}}_{T_{X^*}{\mathcal {M}}_{\le r}}(-\nabla \varPhi (X^*)) = 0. \end{aligned}$$

If \(\mathrm{rank}(X^*) = r\), according to the definition of \({\mathcal {P}}_{T_{X^*}{\mathcal {M}}_{\le r}}(\cdot )\), it holds that \({\mathcal {P}}_{T_{X^*}{\mathcal {M}}_{r}}(-\nabla \varPhi (X^*)) = 0.\)
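Both the projection onto \(T_X{\mathcal {M}}_{\le r}\) and the distance appearing in (2) reduce to standard SVD computations. The following minimal sketch is an illustration rather than code from [4]; the helper names, the numerical-rank tolerance, and the explicit tangent-space formula \({\mathcal {P}}_{T_X{\mathcal {M}}_{s}}(Z) = P_U Z + Z P_V - P_U Z P_V\) (with \(P_U, P_V\) the orthogonal projectors onto the column and row spaces of \(X\)) are our choices.

```python
import numpy as np

def proj_tangent_cone(X, Z, r, tol=1e-10):
    """Orthogonal projection of Z onto the tangent cone T_X M_{<=r}."""
    U, s_vals, Vt = np.linalg.svd(X, full_matrices=False)
    s = int(np.sum(s_vals > tol))        # numerical rank s = rank(X) <= r
    U1, V1 = U[:, :s], Vt[:s].T          # orthonormal bases of column/row space
    PU, PV = U1 @ U1.T, V1 @ V1.T        # orthogonal projectors
    # Projection onto the tangent space T_X M_s at the rank-s point X.
    P_ts = PU @ Z + Z @ PV - PU @ Z @ PV
    k = r - s
    if k == 0:
        return P_ts
    # The residual (I - PU) Z (I - PV) lies in U^perp (x) V^perp; add its
    # best rank-(r - s) approximation, as in the displayed decomposition.
    R = Z - P_ts
    Ur, sr, Vrt = np.linalg.svd(R, full_matrices=False)
    return P_ts + (Ur[:, :k] * sr[:k]) @ Vrt[:k]

def dist_tangent_cone(X, Z, r):
    """dist(Z, T_X M_{<=r}) = ||Z - P_{T_X M_{<=r}}(Z)||_F."""
    return np.linalg.norm(Z - proj_tangent_cone(X, Z, r))
```

With these helpers, the stationarity test (2) amounts to checking that \(\Vert \nabla \varPhi (X^*)\Vert ^2\) equals `dist_tangent_cone(X_star, -grad_Phi, r) ** 2`.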

The AL subproblem can be rewritten as

$$\begin{aligned} \min _{X, Y}{\mathcal {L}}_{\rho _{k-1}}(X, Y, \varLambda ^{k-1}) + \delta _{{\mathcal {M}}_+}(X) + \delta _{{\mathcal {M}}_{\le r}}(Y). \end{aligned}$$
(3)

According to the above discussion of the geometric properties of \({\mathcal {M}}_+\) and \({\mathcal {M}}_{\le r}\), a stationary point \((X^k_*, Y^k_*)\) of (3) satisfies

$$\begin{aligned} {\left\{ \begin{array}{ll} 0 \in \nabla _{X}{\mathcal {L}}_{\rho _{k-1}}(X^k_*, Y^k_*, \varLambda ^{k-1}) + \partial \delta _{{\mathcal {M}}_+}(X^k_*);\\ 0 = \Vert \nabla _{Y}{\mathcal {L}}_{\rho _{k-1}}(X^k_*, Y^k_*, \varLambda ^{k-1})\Vert ^2 - \mathrm{dist}(-\nabla _{Y}{\mathcal {L}}_{\rho _{k-1}}(X^k_*, Y^k_*, \varLambda ^{k-1}), T_{Y_*^k}{\mathcal {M}}_{\le r})^2. \end{array}\right. } \end{aligned}$$
Fig. 1 Correction to Figure 1 in [4]. Iteration convergence of \(\mathrm{dist}(Y^k, {\mathcal {M}}_+)\) for the face dataset UMist with Gaussian noise \({\mathcal {N}}(0, (30/255)^2)\). Left: \((\varepsilon _p, \varepsilon _f) = (e^{-7}, e^{-9})\); right: \((\varepsilon _p, \varepsilon _f) = (e^{-10}, e^{-13})\)

Fig. 2 Correction to Figure 2 in [4]. CPU time and the recovery qualities RSE, PSNR, and SSIM as functions of the testing rank \(r\) for the dataset YaleB

Fig. 3 Left: correction to Figure 3 in [4]. Right: correction to Figure 5 in [4]

Then Algorithm 1 of [4] can be revised accordingly, substituting \({\mathcal {M}}_{\le r}\) for \({\mathcal {M}}_r\).

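As an illustration only, and not the authors' exact Algorithm 1, the revised method can be sketched as follows. The sketch assumes the splitting \(\varPsi (X, Y) = \frac{1}{4}\Vert X - A\Vert _F^2 + \frac{1}{4}\Vert Y - A\Vert _F^2\) suggested by the stopping criterion below; the inner solver (exact alternating block minimization), the fixed inner iteration count, and the penalty schedule are simplifications.

```python
import numpy as np

def aalm_sketch(A, r, rho=1.0, sigma=1.5, inner_iters=50, max_outer=100, eps=1e-7):
    """Hedged sketch of an AALM-style loop for model (1)."""
    m, n = A.shape
    X = np.maximum(A, 0.0)
    # Rank-r initial point via truncated SVD, so iterates start on M_{<=r}.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    Y = (U[:, :r] * s[:r]) @ Vt[:r]
    Lam = np.zeros((m, n))
    for _ in range(max_outer):
        # Inexactly solve the AL subproblem (3) by alternating minimization.
        for _ in range(inner_iters):
            # X-block: separable quadratic over X >= 0 -> entrywise clamp.
            X = np.maximum((0.5 * A + Lam + rho * Y) / (0.5 + rho), 0.0)
            # Y-block: quadratic over rank(Y) <= r -> truncated SVD of the
            # unconstrained minimizer (Eckart-Young).
            B = (0.5 * A - Lam + rho * X) / (0.5 + rho)
            U, s, Vt = np.linalg.svd(B, full_matrices=False)
            Y = (U[:, :r] * s[:r]) @ Vt[:r]
        Lam = Lam - rho * (X - Y)          # multiplier update
        if np.linalg.norm(X - Y) < eps:    # simplified feasibility test
            break
        rho *= sigma                       # increase the penalty parameter
    return X, Y, Lam
```

Both block updates are exact minimizers of the AL subproblem in one block: the X-block objective is an entrywise separable quadratic over \(X \ge 0\), and the Y-block minimizer over \(\mathrm{rank}(Y)\le r\) is a best rank-\(r\) approximation.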

The main convergence result, Theorem 1 for Algorithm 1 in the original article [4], still holds. After substituting \({\mathcal {M}}_{\le r}\) for \({\mathcal {M}}_r\), the proof is unchanged, except that formula (17) in [4] should be replaced by

$$\begin{aligned} \left| \Vert \frac{1}{2}(Y^k - A) + \varLambda ^{k}\Vert ^2 - \mathrm{dist}(-\frac{1}{2}(Y^k - A) - \varLambda ^{k}, T_{Y^k}{\mathcal {M}}_{\le r})^2\right| \le \epsilon _k/2,~\forall k\in {\mathbb {N}}. \end{aligned}$$
(8)

Recalling (15) and (16) in [4] and taking limits as \({\mathcal {K}}\ni k\rightarrow \infty \) on both sides of (8), there exists \(Z^*\in \partial \delta _{{\mathcal {M}}_+}(X^*)\) such that

$$\begin{aligned} \Vert \nabla \varPhi (X^*)+Z^*\Vert ^2 - \mathrm{dist}(-\nabla \varPhi (X^*)-Z^*, T_{X^*}{\mathcal {M}}_{\le r})^2 = 0, \end{aligned}$$

which implies that \(X^*\) is a stationary point of problem (1).

Table 1 Correction to Table 1 in [4]. Average results over 10 tests on synthetic data with fixed \(m\)
Table 2 Correction to Table 2 in [4]. Average results over 10 tests on synthetic data with fixed \(n\)

Note that the AL subproblem is a composite optimization problem. It can be checked that (a) \(\delta _{{\mathcal {M}}_+}\) and \(\delta _{{\mathcal {M}}_{\le r}}\) are proper and lower semicontinuous; (b) \(\varTheta (X, Y) \triangleq \varPsi (X, Y) - \langle \varLambda ^{k-1}, X - Y\rangle + \frac{\rho _{k-1}}{2}\Vert X - Y\Vert _F^2\) is a \({\mathcal {C}}^1\) function and \(\nabla \varTheta \) is Lipschitz continuous on bounded subsets of \({\mathbb {R}}^{m\times n}\times {\mathbb {R}}^{m\times n}\) (if the penalty parameter \(\rho _k\) tends to infinity, the AL subproblem can be scaled by \(1/\rho _k\) to ensure that \(\nabla _X L_k(X, Y)\) and \(\nabla _Y L_k(X, Y)\) remain Lipschitz continuous); (c) \({\mathcal {M}}_{\le r}\) is a real-algebraic variety [3] and \(L_k(X, Y)\) is a semi-algebraic function, being a finite sum of semi-algebraic functions [1]. Thus the objective of the AL subproblem has the KL property. Therefore, the inner-loop solver is valid.

Since the optimality condition of the AL subproblem changes when \({\mathcal {M}}_{\le r}\) is substituted for \({\mathcal {M}}_r\), Theorem 2(ii) in [4] should be replaced by: (ii) \(\left( X^{k,j}, Y^{k,j}\right) \) converges to a critical point of \(L_k\). Let \(\left( X^{k,*}, Y^{k,*}\right) \) be the limit point of \(\{\left( X^{k,j}, Y^{k,j}\right) \}_{j\in {\mathbb {N}}}\). Then

$$\begin{aligned}&0 \in \nabla _{X}{\mathcal {L}}_{\rho _{k-1}}(X^{k,*}, Y^{k,*}, \varLambda ^{k-1}) + \partial \delta _{{\mathcal {M}}_+}(X^{k,*});\\&0 = \Vert \nabla _{Y}{\mathcal {L}}_{\rho _{k-1}}(X^{k,*}, Y^{k,*}, \varLambda ^{k-1})\Vert ^2 - \mathrm{dist}(-\nabla _{Y}{\mathcal {L}}_{\rho _{k-1}}(X^{k,*}, Y^{k,*}, \varLambda ^{k-1}), T_{Y^{k,*}}{\mathcal {M}}_{\le r})^2. \end{aligned}$$

By substituting \({\mathcal {M}}_{\le r}\) for \({\mathcal {M}}_r\), the stopping criterion for AALM (i.e., formula (19) in the original article [4]) should be replaced by

$$\begin{aligned} \left\{ \begin{array}{l} \Vert X^k - Y^k\Vert< \varepsilon ,\\ \Vert {\mathcal {P}}_{T_{X^k}{\mathcal {M}}_+}\left( \frac{1}{2}(X^k - A) - \varLambda ^k\right) \Vert< \varepsilon ,\\ \left| \Vert \frac{1}{2}(Y^k - A) + \varLambda ^k\Vert ^2 - \mathrm{dist}(-\frac{1}{2}(Y^k - A) - \varLambda ^k, T_{Y^k}{\mathcal {M}}_{\le r})^2\right| < \varepsilon . \end{array} \right. \end{aligned}$$
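For concreteness, the three residuals above can be evaluated as follows. This sketch reuses dist_tangent_cone from the earlier sketch; proj_tangent_nonneg is a hypothetical helper implementing the projection onto the tangent cone of \({\mathcal {M}}_+\) at \(X^k\) (directions are unconstrained on the support of \(X^k\) and clipped at zero elsewhere).

```python
import numpy as np

def proj_tangent_nonneg(X, G, tol=1e-12):
    # Tangent cone of the nonnegative orthant M_+ at X >= 0.
    return np.where(X > tol, G, np.maximum(G, 0.0))

def aalm_should_stop(A, X, Y, Lam, r, eps):
    r1 = np.linalg.norm(X - Y)                    # primal feasibility
    G = 0.5 * (X - A) - Lam                       # X-block test quantity
    r2 = np.linalg.norm(proj_tangent_nonneg(X, G))
    H = 0.5 * (Y - A) + Lam                       # Y-block test quantity
    r3 = abs(np.linalg.norm(H) ** 2
             - dist_tangent_cone(Y, -H, r) ** 2)  # defined earlier
    return max(r1, r2, r3) < eps
```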

Notice that the projections onto both \({\mathcal {M}}_r\) and \({\mathcal {M}}_{\le r}\) can be obtained by SVD. The only difference between solving the AL subproblem over \({\mathcal {M}}_{r}\) and over \({\mathcal {M}}_{\le r}\) occurs when the rank of an iterate falls below \(r\). We have rerun the numerical experiments reported in [4]. Rank deficiency was not observed when the rank of the initial point equals \(r\); hence, the performance of the revised algorithm is similar to that in [4]. The revised numerical results are given in Figs. 1, 2, 3 and Tables 1, 2, 3, 4, corresponding to the figures and tables of [4] that they update.

Table 3 Correction to Table 4 in [4]. Average results for face datasets with different \(r\)
Table 4 Correction to Table 5 in [4]. Denoising results for face datasets