Correction to: Journal of Scientific Computing (2021) 88:45, https://doi.org/10.1007/s10915-021-01556-2

The original version of this article [4] unfortunately contained an error. The authors would like to correct the error with this corrigendum.

In [4], the optimization formulation is not presented correctly. We should use the following model

$$\begin{aligned} \min _{X}~\varPhi (X): = \frac{1}{2}\Vert A - X \Vert _F^2 \quad \mathrm{subject~to}\quad \mathrm{rank}(X) \le r,~X \ge 0 \end{aligned}$$
(1)

instead of (2) stated in [4], in order to avoid rank-deficient iteration points: the constraint set \({\mathcal {M}}_r = \{X\in {\mathbb {R}}^{m\times n}\vert \mathrm{rank}(X) = r\}\) used in [4] is not closed (for instance, \(\mathrm{diag}(1, 1/k)\) has rank 2 for every \(k\), yet its limit \(\mathrm{diag}(1, 0)\) has rank 1).

In order to derive the Augmented Lagrangian (AL) method for (1), we need some geometric properties of \({\mathcal {M}}_{\le r} = \{X\in {\mathbb {R}}^{m\times n}\vert \mathrm{rank}(X) \le r\}\), which is a real-algebraic variety [3]. At a rank-deficient point \(X\in {\mathcal {M}}_{\le r}\) with \(\mathrm{rank}(X) = s < r\), the tangent cone \(T_{X}{\mathcal {M}}_{\le r}\) is given by [2]

$$\begin{aligned} T_{X}{\mathcal {M}}_{\le r} = T_X{\mathcal {M}}_s \oplus \{\varXi _{r-s}\in {\mathcal {U}}^{\bot }\otimes {\mathcal {V}}^{\bot }~\vert ~\mathrm{rank}(\varXi _{r-s}) \le r - s\}, \end{aligned}$$

where \(T_{X}{\mathcal {M}}_{s}\) is the tangent space of \({\mathcal {M}}_s\) at \(X\), \({\mathcal {U}}^{\bot }\) and \({\mathcal {V}}^{\bot }\) denote the orthogonal complements of the column and row spaces of \(X\), \(\oplus \) denotes the direct sum and \(\otimes \) denotes the Kronecker product. Then for any \(Z\in {\mathbb {R}}^{m\times n}\), the orthogonal projection of \(Z\) onto \(T_X{\mathcal {M}}_{\le r}\) is given by

$$\begin{aligned} {\mathcal {P}}_{T_X{\mathcal {M}}_{\le r}}\big (Z\big ) = {\mathcal {P}}_{T_X{\mathcal {M}}_{s}}\big (Z\big ) + \varXi _{r - s}, \end{aligned}$$

where \(\varXi _{r-s}\) is a best rank-\((r-s)\) approximation of \(Z - {\mathcal {P}}_{T_X{\mathcal {M}}_{s}}\big (Z\big )\) in the Frobenius norm. For a differentiable function \(\varPhi \), a critical point \(X^*\) of \(\min _{X\in {\mathcal {M}}_{\le r}}\varPhi (X)\) satisfies [3]

$$\begin{aligned} 0 = \Vert \nabla \varPhi (X^*)\Vert ^2 - \mathrm{dist}(-\nabla \varPhi (X^*), T_{X^*}{\mathcal {M}}_{\le r})^2, \end{aligned}$$
(2)

where

$$\begin{aligned} \mathrm{dist}(Z, T_{X^*}{\mathcal {M}}_{\le r}) = \Vert Z - {\mathcal {P}}_{T_{X^*}{\mathcal {M}}_{\le r}}(Z)\Vert _F. \end{aligned}$$

Since \({\mathcal {P}}_{T_{X^*}{\mathcal {M}}_{\le r}}(Z)\) is the orthogonal projection of \(Z\) onto \(T_{X^*}{\mathcal {M}}_{\le r}\), it holds that

$$\begin{aligned} \Vert {\mathcal {P}}_{T_{X^*}{\mathcal {M}}_{\le r}}(-\nabla \varPhi (X^*))\Vert ^2 = \Vert -\nabla \varPhi (X^*)\Vert ^2 - \Vert -\nabla \varPhi (X^*) - {\mathcal {P}}_{T_{X^*}{\mathcal {M}}_{\le r}}(-\nabla \varPhi (X^*))\Vert ^2. \end{aligned}$$

Therefore, (2) is equivalent to

$$\begin{aligned} {\mathcal {P}}_{T_{X^*}{\mathcal {M}}_{\le r}}(-\nabla \varPhi (X^*)) = 0. \end{aligned}$$

If \(\mathrm{rank}(X^*) = r\), according to the definition of \({\mathcal {P}}_{T_{X^*}{\mathcal {M}}_{\le r}}(\cdot )\), it holds that \({\mathcal {P}}_{T_{X^*}{\mathcal {M}}_{r}}(-\nabla \varPhi (X^*)) = 0.\)
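Both the projection onto \(T_X{\mathcal {M}}_{\le r}\) and the distance appearing in (2) reduce to standard SVD computations. The following minimal sketch is an illustration rather than code from [4]; the helper names, the numerical-rank tolerance, and the explicit tangent-space formula \({\mathcal {P}}_{T_X{\mathcal {M}}_{s}}(Z) = P_U Z + Z P_V - P_U Z P_V\) (with \(P_U, P_V\) the orthogonal projectors onto the column and row spaces of \(X\)) are our choices.

```python
import numpy as np

def proj_tangent_cone(X, Z, r, tol=1e-10):
    """Orthogonal projection of Z onto the tangent cone T_X M_{<=r}."""
    U, s_vals, Vt = np.linalg.svd(X, full_matrices=False)
    s = int(np.sum(s_vals > tol))        # numerical rank s = rank(X) <= r
    U1, V1 = U[:, :s], Vt[:s].T          # orthonormal bases of column/row space
    PU, PV = U1 @ U1.T, V1 @ V1.T        # orthogonal projectors
    # Projection onto the tangent space T_X M_s at the rank-s point X.
    P_ts = PU @ Z + Z @ PV - PU @ Z @ PV
    k = r - s
    if k == 0:
        return P_ts
    # The residual (I - PU) Z (I - PV) lies in U^perp (x) V^perp; add its
    # best rank-(r - s) approximation, as in the displayed decomposition.
    R = Z - P_ts
    Ur, sr, Vrt = np.linalg.svd(R, full_matrices=False)
    return P_ts + (Ur[:, :k] * sr[:k]) @ Vrt[:k]

def dist_tangent_cone(X, Z, r):
    """dist(Z, T_X M_{<=r}) = ||Z - P_{T_X M_{<=r}}(Z)||_F."""
    return np.linalg.norm(Z - proj_tangent_cone(X, Z, r))
```

With these helpers, the stationarity test (2) amounts to checking that \(\Vert \nabla \varPhi (X^*)\Vert ^2\) equals `dist_tangent_cone(X_star, -grad_Phi, r) ** 2`.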

The AL subproblem can be rewritten as

$$\begin{aligned} \min _{X, Y}{\mathcal {L}}_{\rho _{k-1}}(X, Y, \varLambda ^{k-1}) + \delta _{{\mathcal {M}}_+}(X) + \delta _{{\mathcal {M}}_{\le r}}(Y). \end{aligned}$$
(3)

According to the above discussion of the geometric properties of \({\mathcal {M}}_+\) and \({\mathcal {M}}_{\le r}\), a stationary point \((X^k_*, Y^k_*)\) of (3) satisfies

$$\begin{aligned} {\left\{ \begin{array}{ll} 0 \in \nabla _{X}{\mathcal {L}}_{\rho _{k-1}}(X^k_*, Y^k_*, \varLambda ^{k-1}) + \partial \delta _{{\mathcal {M}}_+}(X^k_*);\\ 0 = \Vert \nabla _{Y}{\mathcal {L}}_{\rho _{k-1}}(X^k_*, Y^k_*, \varLambda ^{k-1})\Vert ^2 - \mathrm{dist}(-\nabla _{Y}{\mathcal {L}}_{\rho _{k-1}}(X^k_*, Y^k_*, \varLambda ^{k-1}), T_{Y_*^k}{\mathcal {M}}_{\le r})^2. \end{array}\right. } \end{aligned}$$
Fig. 1 Correction to Figure 1 in [4]. Iteration convergence of \(\mathrm{dist}(Y^k, {\mathcal {M}}_+)\) for the face dataset UMist with Gaussian noise \({\mathcal {N}}(0, (30/255)^2)\). Left: \((\varepsilon _p, \varepsilon _f) = (e^{-7}, e^{-9})\); right: \((\varepsilon _p, \varepsilon _f) = (e^{-10}, e^{-13})\)

Fig. 2 Correction to Figure 2 in [4]. CPU time and the recovery qualities RSE, PSNR, and SSIM as functions of the testing rank \(r\) for the dataset YaleB

Fig. 3 Left: correction to Figure 3 in [4]. Right: correction to Figure 5 in [4]

Then Algorithm 1 of [4] can be revised accordingly, substituting \({\mathcal {M}}_{\le r}\) for \({\mathcal {M}}_r\).

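As an illustration only, and not the authors' exact Algorithm 1, the revised method can be sketched as follows. The sketch assumes the splitting \(\varPsi (X, Y) = \frac{1}{4}\Vert X - A\Vert _F^2 + \frac{1}{4}\Vert Y - A\Vert _F^2\) suggested by the stopping criterion below; the inner solver (exact alternating block minimization), the fixed inner iteration count, and the penalty schedule are simplifications.

```python
import numpy as np

def aalm_sketch(A, r, rho=1.0, sigma=1.5, inner_iters=50, max_outer=100, eps=1e-7):
    """Hedged sketch of an AALM-style loop for model (1)."""
    m, n = A.shape
    X = np.maximum(A, 0.0)
    # Rank-r initial point via truncated SVD, so iterates start on M_{<=r}.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    Y = (U[:, :r] * s[:r]) @ Vt[:r]
    Lam = np.zeros((m, n))
    for _ in range(max_outer):
        # Inexactly solve the AL subproblem (3) by alternating minimization.
        for _ in range(inner_iters):
            # X-block: separable quadratic over X >= 0 -> entrywise clamp.
            X = np.maximum((0.5 * A + Lam + rho * Y) / (0.5 + rho), 0.0)
            # Y-block: quadratic over rank(Y) <= r -> truncated SVD of the
            # unconstrained minimizer (Eckart-Young).
            B = (0.5 * A - Lam + rho * X) / (0.5 + rho)
            U, s, Vt = np.linalg.svd(B, full_matrices=False)
            Y = (U[:, :r] * s[:r]) @ Vt[:r]
        Lam = Lam - rho * (X - Y)          # multiplier update
        if np.linalg.norm(X - Y) < eps:    # simplified feasibility test
            break
        rho *= sigma                       # increase the penalty parameter
    return X, Y, Lam
```

Both block updates are exact minimizers of the AL subproblem in one block: the X-block objective is an entrywise separable quadratic over \(X \ge 0\), and the Y-block minimizer over \(\mathrm{rank}(Y)\le r\) is a best rank-\(r\) approximation.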

The main convergence result, Theorem 1 for Algorithm 1 in the original article [4], still holds. After substituting \({\mathcal {M}}_{\le r}\) for \({\mathcal {M}}_r\), the proof is unchanged, except that formula (17) in [4] should be replaced by

$$\begin{aligned} \left| \Vert \frac{1}{2}(Y^k - A) + \varLambda ^{k}\Vert ^2 - \mathrm{dist}(-\frac{1}{2}(Y^k - A) - \varLambda ^{k}, T_{Y^k}{\mathcal {M}}_{\le r})^2\right| \le \epsilon _k/2,~\forall k\in {\mathbb {N}}. \end{aligned}$$
(8)

Recalling (15) and (16) in [4] and taking limits as \({\mathcal {K}}\ni k\rightarrow \infty \) on both sides of (8), there exists \(Z^*\in \partial \delta _{{\mathcal {M}}_+}(X^*)\) such that

$$\begin{aligned} \Vert \nabla \varPhi (X^*)+Z^*\Vert ^2 - \mathrm{dist}(-\nabla \varPhi (X^*)-Z^*, T_{X^*}{\mathcal {M}}_{\le r})^2 = 0, \end{aligned}$$

which implies that \(X^*\) is a stationary point of problem (1).

Table 1 Correction to Table 1 in [4]. Average results over 10 tests on synthetic data with fixed \(m\)
Table 2 Correction to Table 2 in [4]. Average results over 10 tests on synthetic data with fixed \(n\)

Note that the AL subproblem is a composite optimization problem. It can be checked that (a) \(\delta _{{\mathcal {M}}_+}\) and \(\delta _{{\mathcal {M}}_{\le r}}\) are proper and lower semicontinuous; (b) \(\varTheta (X, Y) \triangleq \varPsi (X, Y) - \langle \varLambda ^{k-1}, X - Y\rangle + \frac{\rho _{k-1}}{2}\Vert X - Y\Vert _F^2\) is a \({\mathcal {C}}^1\) function and \(\nabla \varTheta \) is Lipschitz continuous on bounded subsets of \({\mathbb {R}}^{m\times n}\times {\mathbb {R}}^{m\times n}\) (if the penalty parameter \(\rho _k\) tends to infinity, the AL subproblem can be scaled by \(1/\rho _k\) to ensure that \(\nabla _X L_k(X, Y)\) and \(\nabla _Y L_k(X, Y)\) remain Lipschitz continuous); (c) \({\mathcal {M}}_{\le r}\) is a real-algebraic variety [3] and \(L_k(X, Y)\) is a semi-algebraic function, being a finite sum of semi-algebraic functions [1]. Thus the objective of the AL subproblem has the KL property. Therefore, the inner-loop solver is valid.

Since the optimality condition of the AL subproblem changes when \({\mathcal {M}}_{\le r}\) is substituted for \({\mathcal {M}}_r\), Theorem 2(ii) in [4] should be replaced by: (ii) \(\left( X^{k,j}, Y^{k,j}\right) \) converges to a critical point of \(L_k\). Let \(\left( X^{k,*}, Y^{k,*}\right) \) be the limit point of \(\{\left( X^{k,j}, Y^{k,j}\right) \}_{j\in {\mathbb {N}}}\). Then

$$\begin{aligned}&0 \in \nabla _{X}{\mathcal {L}}_{\rho _{k-1}}(X^{k,*}, Y^{k,*}, \varLambda ^{k-1}) + \partial \delta _{{\mathcal {M}}_+}(X^{k,*});\\&0 = \Vert \nabla _{Y}{\mathcal {L}}_{\rho _{k-1}}(X^{k,*}, Y^{k,*}, \varLambda ^{k-1})\Vert ^2 - \mathrm{dist}(-\nabla _{Y}{\mathcal {L}}_{\rho _{k-1}}(X^{k,*}, Y^{k,*}, \varLambda ^{k-1}), T_{Y^{k,*}}{\mathcal {M}}_{\le r})^2. \end{aligned}$$

By substituting \({\mathcal {M}}_{\le r}\) for \({\mathcal {M}}_r\), the stopping criterion for AALM (i.e., formula (19) in the original article [4]) should be replaced by

$$\begin{aligned} \left\{ \begin{array}{l} \Vert X^k - Y^k\Vert< \varepsilon ,\\ \Vert {\mathcal {P}}_{T_{X^k}{\mathcal {M}}_+}\left( \frac{1}{2}(X^k - A) - \varLambda ^k\right) \Vert< \varepsilon ,\\ \left| \Vert \frac{1}{2}(Y^k - A) + \varLambda ^k\Vert ^2 - \mathrm{dist}(-\frac{1}{2}(Y^k - A) - \varLambda ^k, T_{Y^k}{\mathcal {M}}_{\le r})^2\right| < \varepsilon . \end{array} \right. \end{aligned}$$
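For concreteness, the three residuals above can be evaluated as follows. This sketch reuses dist_tangent_cone from the earlier sketch; proj_tangent_nonneg is a hypothetical helper implementing the projection onto the tangent cone of \({\mathcal {M}}_+\) at \(X^k\) (directions are unconstrained on the support of \(X^k\) and clipped at zero elsewhere).

```python
import numpy as np

def proj_tangent_nonneg(X, G, tol=1e-12):
    # Tangent cone of the nonnegative orthant M_+ at X >= 0.
    return np.where(X > tol, G, np.maximum(G, 0.0))

def aalm_should_stop(A, X, Y, Lam, r, eps):
    r1 = np.linalg.norm(X - Y)                    # primal feasibility
    G = 0.5 * (X - A) - Lam                       # X-block test quantity
    r2 = np.linalg.norm(proj_tangent_nonneg(X, G))
    H = 0.5 * (Y - A) + Lam                       # Y-block test quantity
    r3 = abs(np.linalg.norm(H) ** 2
             - dist_tangent_cone(Y, -H, r) ** 2)  # defined earlier
    return max(r1, r2, r3) < eps
```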

Notice that the projections onto both \({\mathcal {M}}_r\) and \({\mathcal {M}}_{\le r}\) can be obtained by SVD. The only difference between solving the AL subproblem over \({\mathcal {M}}_{r}\) and over \({\mathcal {M}}_{\le r}\) occurs when the rank of an iterate falls below \(r\). We have rerun the numerical experiments reported in [4]. Rank deficiency was not observed when the rank of the initial point equals \(r\); hence, the performance of the revised algorithm is similar to that in [4]. The revised numerical results are given in Figs. 1, 2, 3 and Tables 1, 2, 3, 4, corresponding to the figures and tables of [4] that they update.

Table 3 Correction to Table 4 in [4]. Average results for face datasets with different \(r\)
Table 4 Correction to Table 5 in [4]. Denoising results for face datasets