
Convex–Concave Tensor Robust Principal Component Analysis


Abstract

Tensor robust principal component analysis (TRPCA) aims to recover the underlying low-rank clean tensor and the residual sparse component from an observed tensor. The recovery quality depends heavily on the definition of tensor rank, for which diverse construction schemes exist. Recently, the tensor average rank has been proposed and the tensor nuclear norm has been proven to be its best convex surrogate, and many improved methods based on the tensor nuclear norm have emerged rapidly. Nevertheless, they share three common drawbacks: (1) they neglect the relative relationship between the distribution of the large singular values and the low-rank constraint; (2) the tensor nuclear norm implicitly assumes that all frontal slices are treated equally; (3) they lack a convergence guarantee for the whole iteration sequence in optimization. To address these problems together, in this paper we propose a convex–concave TRPCA method in which the notion of convex–concave singular value separation (CCSVS) plays the dominant role in the objective. It adjusts the distribution of the first several largest singular values relative to the low-rank constraint and collaboratively emphasizes the importance of frontal slices. Remarkably, we provide a rigorous convergence analysis of the whole iteration sequence in optimization. Besides, a low-rank tensor recovery guarantee is established for the proposed CCSVS model. Extensive experiments demonstrate that the proposed CCSVS significantly outperforms state-of-the-art methods on toy data and real-world datasets, and its running time per image is also the fastest.


Data Availability

The Berkeley Segmentation dataset (BSD500) in experiments is publicly available and can be downloaded from the website https://www2.eecs.berkeley.edu/Research/Projects/CS/vision/bsds/. The Scene Background Initialization (SBI) dataset can be found at https://sbmi2015.na.icar.cnr.it/SBIdataset.html.

References

  • Barber, R. F., & Sidky, E. Y. (2020). Convergence for nonconvex ADMM, with applications to CT imaging. arXiv:2006.07278.

  • Belarbi, M. A., Mahmoudi, S., & Belalem, G. (2017). PCA as dimensionality reduction for large-scale image retrieval systems. In Proceedings of the International Joint Conference on Artificial Intelligence.

  • Bhatia, R. (2013). Matrix analysis. Springer.

  • Bouwmans, T., Javed, S., Zhang, H., Lin, Z., & Otazo, R. (2018). On the applications of robust PCA in image and video processing. Proceedings of the IEEE, 106(8), 1427–1457.

  • Candès, E. J., Li, X. D., Ma, Y., & Wright, J. (2011). Robust principal component analysis? Journal of the ACM, 58(3).

  • De la Torre, F., & Black, M. J. (2001). Robust principal component analysis for computer vision. In Proceedings of the IEEE International Conference on Computer Vision.

  • Fan, K. (1951). Maximum properties and inequalities for the eigenvalues of completely continuous operators. Proceedings of the National Academy of Sciences, 37(11), 760–766.

  • Gao, Q., Zhang, P., Xia, W., Xie, D., Gao, X., & Tao, D. (2020). Enhanced tensor RPCA and its application. IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(6), 2133–2140.

  • Gu, S., Xie, Q., Meng, D., Zuo, W., Feng, X., & Zhang, L. (2017). Weighted nuclear norm minimization and its applications to low level vision. International Journal of Computer Vision, 121(2), 183–208.

  • Imaizumi, M., & Maehara, T. (2017). On tensor train rank minimization: Statistical efficiency and scalable algorithm. In Proceedings of Advances in Neural Information Processing Systems.

  • Kilmer, M. E., & Martin, C. D. (2011). Factorization strategies for third-order tensors. Linear Algebra and Its Applications, 435(3), 641–658.

  • Kolda, T. G., & Bader, B. W. (2009). Tensor decompositions and applications. SIAM Review, 51(3), 455–500.

  • Liu, G., Lin, Z., Yan, S., Sun, J., Yu, Y., & Ma, Y. (2013). Robust recovery of subspace structures by low-rank representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(1), 171–185.

  • Liu, J., Musialski, P., Wonka, P., & Ye, J. (2013). Tensor completion for estimating missing values in visual data. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(1), 208–220.

  • Lu, C., Feng, J., Liu, W., Lin, Z., & Yan, S. (2020). Tensor robust principal component analysis with a new tensor nuclear norm. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42(4), 925–938.

  • Lu, C., Feng, J., Yan, S., & Lin, Z. (2018). A unified alternating direction method of multipliers by majorization minimization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(3), 527–541.

  • Lu, H., Plataniotis, K. N., & Venetsanopoulos, A. N. (2008). MPCA: Multilinear principal component analysis of tensor objects. IEEE Transactions on Neural Networks, 19(1), 18–39.

  • Maddalena, L., & Petrosino, A. (2015). Towards benchmarking scene background initialization. In New Trends in Image Analysis and Processing – ICIAP 2015 Workshops.

  • Malik, O. A., & Becker, S. (2018). Low-rank Tucker decomposition of large tensors using TensorSketch. In Proceedings of Advances in Neural Information Processing Systems.

  • Marshall, A., & Olkin, I. (1979). Inequalities: Theory of majorization and its applications. Academic Press.

  • Martin, D., Fowlkes, C., Tal, D., & Malik, J. (2001). A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In Proceedings of the Eighth IEEE International Conference on Computer Vision.

  • Mirsky, L. (1975). A trace inequality of John von Neumann. Monatshefte für Mathematik, 79(4), 303–306.

  • Moslehian, M. S. (2012). Ky Fan inequalities. Linear and Multilinear Algebra, 60(11–12), 1313–1325.

  • Rockafellar, R. T., & Wets, R. (1998). Variational analysis. Springer.

  • Schölkopf, B., Smola, A., & Müller, K.-R. (1997). Kernel principal component analysis. In Artificial Neural Networks – ICANN.

  • Sun, M., Zhao, L., Zheng, J., & Xu, J. (2020). A nonlocal denoising framework based on tensor robust principal component analysis with \(l_{p}\) norm. In Proceedings of the IEEE International Conference on Big Data.

  • Von Neumann, J. (1937). Some matrix-inequalities and metrization of matric space. Tomsk University Review, 1, 286–300.

  • Bredies, K., & Lorenz, D. A. (2008). Linear convergence of iterative soft-thresholding. Journal of Fourier Analysis and Applications, 14, 813–837.

  • Wang, Y., Yin, W., & Zeng, J. (2019). Global convergence of ADMM in nonconvex nonsmooth optimization. Journal of Scientific Computing, 78, 29–63.

  • Wang, L., Zhang, S., & Huang, H. (2021). Adaptive dimension-discriminative low-rank tensor recovery for computational hyperspectral imaging. International Journal of Computer Vision, 129(10), 2907–2926.

  • Yu, H., & Bennamoun, M. (2006). 1D-PCA, 2D-PCA to nD-PCA. In Proceedings of the 18th International Conference on Pattern Recognition (ICPR'06).

  • Zhang, Z., Ely, G., Aeron, S., Hao, N., & Kilmer, M. (2014). Novel methods for multilinear data completion and de-noising based on tensor-SVD. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

  • Zhang, L., & Peng, Z. (2019). Infrared small target detection based on partial sum of the tensor nuclear norm. Remote Sensing, 11(4), 382.

  • Zhou, Y., & Cheung, Y.-M. (2021). Bayesian low-tubal-rank robust tensor factorization with multi-rank determination. IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(1), 62–76.


Acknowledgements

We would like to thank the anonymous reviewers for their valuable comments and suggestions which helped to improve the quality of this paper. This work was supported in part by National Natural Science Foundation of China under Grants 62106081, 62225113, 62106063 and 62122060, by the Guangdong Natural Science Foundation under Grant 2022A1515010819 and by the Shenzhen Science and Technology Program under Grant RCBS20210609103708013.

Author information


Corresponding authors

Correspondence to Bo Du or Yongyong Chen.

Additional information

Communicated by Xavier Pennec.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

1.1 Preliminary Lemmas

Lemma 1

(Fan, 1951) If \({\textbf{H}}\in {\mathbb {C}}^{n\times n}\) is a complex matrix of order n, it holds that, for any \(1\le k\le n\),

$$\begin{aligned} \sum _{i=1}^{k}\sigma _{i}({\textbf{H}})=\max \vert \sum _{i=1}^{k}\langle {\textbf{U}}{\textbf{H}}{\textbf{z}}_{i},{\textbf{z}}_{i}\rangle \vert , \end{aligned}$$
(60)

where the maximization is taken over all unitary matrices \({\textbf{U}}\in {\mathbb {C}}^{n\times n}\) and all orthonormal sets \(\{{\textbf{z}}_{i}\}_{i=1}^{k}\) of k vectors in \({\mathbb {C}}^{n}\).

This lemma is also generalized to the case of multiple matrices as follows.

Lemma 2

(Fan, 1951) If \({\textbf{H}}_{1},{\textbf{H}}_{2},\cdots ,{\textbf{H}}_{t}\) are complex matrices from \({\mathbb {C}}^{n\times n}\), it holds that, for any \(1\le k\le n\),

$$\begin{aligned}&\sum _{i=1}^{k}\sigma _{i}({\textbf{H}}_{1})\sigma _{i}({\textbf{H}}_{2}) \cdots \sigma _{i}({\textbf{H}}_{t})\nonumber \\&\quad =\max \vert \sum _{i=1}^{k}\langle {\textbf{U}}_{1}{\textbf{H}}_{1}{\textbf{U}}_{2}{\textbf{H}}_{2} \cdots {\textbf{U}}_{t}{\textbf{H}}_{t}{\textbf{z}}_{i},{\textbf{z}}_{i}\rangle \vert , \end{aligned}$$
(61)

where the maximization is taken over all unitary matrices \({\textbf{U}}_{1},{\textbf{U}}_{2},\cdots ,{\textbf{U}}_{t}\in {\mathbb {C}}^{n\times n}\) and all orthonormal sets \(\{{\textbf{z}}_{i}\}_{i=1}^{k}\) of k vectors in \({\mathbb {C}}^{n}\).
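Lemma 1 is easy to sanity-check numerically. The sketch below (a minimal illustration assuming NumPy; the matrix size, number k, and random seed are arbitrary) verifies that the choice \({\textbf{U}}={\textbf{Q}}{\textbf{P}}^{*}\) and \({\textbf{z}}_{i}={\textbf{q}}_{i}\), built from the SVD \({\textbf{H}}={\textbf{P}}{{\varvec{\Sigma }}}{\textbf{Q}}^{*}\), attains the maximum, while random unitary choices never exceed it.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 6, 3
H = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

# SVD H = P diag(s) Q^*, with s sorted in decreasing order
P, s, Qh = np.linalg.svd(H)
Q = Qh.conj().T

# U = Q P^* and z_i = q_i attain the maximum: U H q_i = s_i q_i
U = Q @ P.conj().T
attained = abs(sum(np.vdot(Q[:, i], U @ H @ Q[:, i]) for i in range(k)))
assert np.isclose(attained, s[:k].sum())

# Random unitary U and random orthonormal z_i never exceed the top-k sum
for _ in range(100):
    U_rand, _ = np.linalg.qr(rng.standard_normal((n, n))
                             + 1j * rng.standard_normal((n, n)))
    Z, _ = np.linalg.qr(rng.standard_normal((n, k))
                        + 1j * rng.standard_normal((n, k)))
    val = abs(sum(np.vdot(Z[:, i], U_rand @ H @ Z[:, i]) for i in range(k)))
    assert val <= s[:k].sum() + 1e-8
```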

1.2 The Proof of Theorem 1

Proof

Denote \({\widetilde{n}}=\max \{n_{1},n_{2}\}\) and \({\widehat{n}}=\min \{n_{1},n_{2}\}\). We augment \({\textbf{X}}\) to \(\widetilde{{\textbf{X}}}\in {\mathbb {R}}^{{\widetilde{n}}\times {\widetilde{n}}}\) as follows. If the column size is less than the row size, i.e. \(n_{2}<n_{1}\), we form \(\widetilde{{\textbf{X}}}\) by zero-padding along the columns; otherwise, we zero-pad along the rows. Either way, the first \({\widehat{n}}\) singular values of \({\textbf{X}}\) and \(\widetilde{{\textbf{X}}}\) coincide.
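This augmentation step can be checked numerically; a minimal sketch, assuming NumPy (sizes arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
n1, n2 = 5, 3                      # n2 < n1, so pad with zero columns
X = rng.standard_normal((n1, n2))
X_tilde = np.hstack([X, np.zeros((n1, n1 - n2))])   # square n1 x n1

s = np.linalg.svd(X, compute_uv=False)
s_tilde = np.linalg.svd(X_tilde, compute_uv=False)

assert np.allclose(s, s_tilde[:n2])        # first min(n1, n2) values coincide
assert np.allclose(s_tilde[n2:], 0.0)      # padded directions contribute zeros
```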

It follows from the well-known Ky Fan inequality (Moslehian, 2012) that

$$\begin{aligned} \sum _{i=1}^{\min \{n_{1},n_{2}\}}\sigma _{i}(\widetilde{{\textbf{X}}}_{1}+\widetilde{{\textbf{X}}}_{2})\le \sum _{i=1}^{\min \{n_{1},n_{2}\}}\sigma _{i}(\widetilde{{\textbf{X}}}_{1})+\sum _{i=1}^{\min \{n_{1},n_{2}\}}\sigma _{i}(\widetilde{{\textbf{X}}}_{2}). \end{aligned}$$
(62)

Equivalently,

$$\begin{aligned} \sum _{i=1}^{\min \{n_{1},n_{2}\}}\sigma _{i}({\textbf{X}}_{1}+{\textbf{X}}_{2})\le \sum _{i=1}^{\min \{n_{1},n_{2}\}}\sigma _{i}({\textbf{X}}_{1})+\sum _{i=1}^{\min \{n_{1},n_{2}\}}\sigma _{i}({\textbf{X}}_{2}). \end{aligned}$$

Combining this fact with the positive homogeneity of singular value function \(\sigma _{i}(\cdot )\), we have

$$\begin{aligned} \sum _{i=1}^{\min \{n_{1},n_{2}\}}\sigma _{i}(\theta {\textbf{X}}_{1}+(1-\theta ){\textbf{X}}_{2})\le \theta \sum _{i=1}^{\min \{n_{1},n_{2}\}}\sigma _{i}({\textbf{X}}_{1})+(1-\theta )\sum _{i=1}^{\min \{n_{1},n_{2}\}}\sigma _{i}({\textbf{X}}_{2}), \end{aligned}$$
(63)

where \(\theta \in [0,1]\). This implies the convexity of the function \(f_{convex}(\cdot )\).
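Eq. (63) can be verified numerically; the following sketch (assuming NumPy; the name f_convex is used only as a label for the quantity in Eq. (63)) checks the convexity inequality for the sum of all \(\min \{n_{1},n_{2}\}\) singular values on random matrices:

```python
import numpy as np

rng = np.random.default_rng(2)
n1, n2 = 5, 3
X1 = rng.standard_normal((n1, n2))
X2 = rng.standard_normal((n1, n2))

def f_convex(X):
    # Sum of all min(n1, n2) singular values of X
    return np.linalg.svd(X, compute_uv=False).sum()

for theta in np.linspace(0.0, 1.0, 11):
    lhs = f_convex(theta * X1 + (1 - theta) * X2)
    rhs = theta * f_convex(X1) + (1 - theta) * f_convex(X2)
    assert lhs <= rhs + 1e-10
```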

Let

$$\begin{aligned} F({\textbf{X}})=\sum _{i=1}^{r}\omega _{i}\sigma _{i}({\textbf{X}}). \end{aligned}$$
(64)

By Lemma 1, we have

$$\begin{aligned} \sum _{i=1}^{r}\sigma _{i}(\widetilde{{\textbf{X}}})=\max \vert \sum _{i=1}^{r}\langle {\textbf{U}}\widetilde{{\textbf{X}}}{\textbf{z}}_{i},{\textbf{z}}_{i}\rangle \vert . \end{aligned}$$
(65)

Denote \({\textbf{D}}=diag([\omega _{1},\omega _{2},\cdots ,\omega _{r},0,\cdots ,0]) \in {\mathbb {R}}^{{\widetilde{n}}\times {\widetilde{n}}}\). Given \({\textbf{A}}\) and \({\textbf{B}}\) from \({\mathbb {C}}^{n_{1}\times n_{2}}\), we write \({\textbf{C}}={\textbf{A}}+{\textbf{B}}\). By the aforementioned augmentation method, \(\widetilde{{\textbf{C}}}=\widetilde{{\textbf{A}}}+\widetilde{{\textbf{B}}}\). Let \(\widetilde{{\textbf{C}}}=\widetilde{{\textbf{P}}}\widetilde{{{\varvec{\Sigma }}}}\widetilde{{\textbf{Q}}}^{*}\) be the singular value decomposition of \(\widetilde{{\textbf{C}}}\). Then

$$\begin{aligned}&\quad \sum _{i=1}^{r}\omega _{i}\sigma _{i}(\widetilde{{\textbf{C}}})\nonumber \\&=\max \vert \sum _{i=1}^{r}\langle {\textbf{U}}\mathbf {{\widetilde{P}}{\widetilde{\Sigma }} D{\widetilde{Q}}^{*}}{\textbf{z}}_{i},{\textbf{z}}_{i}\rangle \vert \nonumber \\&=\max \vert \sum _{i=1}^{r}\langle {\textbf{U}}\mathbf {({\widetilde{P}}{\widetilde{\Sigma }} {\widetilde{Q}}^{*}){\widetilde{Q}}D{\widetilde{Q}}^{*}}{\textbf{z}}_{i},{\textbf{z}}_{i}\rangle \vert \nonumber \\&=\max \vert \sum _{i=1}^{r}\langle {\textbf{U}}\mathbf {{\widetilde{C}}{\widetilde{Q}} D{\widetilde{Q}}^{*}}{\textbf{z}}_{i},{\textbf{z}}_{i}\rangle \vert \nonumber \\&\le \max \vert \sum _{i=1}^{r}\langle {\textbf{U}}\mathbf {{\widetilde{A}}{\widetilde{Q}}D{\widetilde{Q}}^{*}}{\textbf{z}}_{i}, {\textbf{z}}_{i}\rangle \vert \nonumber \\&\quad +\max \vert \sum _{i=1}^{r}\langle {\textbf{U}}\mathbf {{\widetilde{B}}{\widetilde{Q}}D{\widetilde{Q}}^{*}}{\textbf{z}}_{i}, {\textbf{z}}_{i}\rangle \vert , \end{aligned}$$
(66)

where the first equality holds by Lemma 1 together with the decreasing property of the diagonal elements of \({\textbf{D}}\), and the last inequality is attributed to the triangle inequality.

Hence, in view of Lemma 1 and Eq. (66), we have

$$\begin{aligned} \sum _{i=1}^{r}\omega _{i}\sigma _{i}(\widetilde{{\textbf{C}}}) \le \sum _{i=1}^{r}\sigma _{i}(\mathbf {{\widetilde{A}}{\widetilde{Q}}D{\widetilde{Q}}^{*}}) +\sum _{i=1}^{r}\sigma _{i}(\mathbf {{\widetilde{B}}{\widetilde{Q}}D{\widetilde{Q}}^{*}}). \end{aligned}$$
(67)

Note that

$$\begin{aligned} \sum _{i=1}^{r}\sigma _{i}(\widetilde{{\textbf{A}}})\sigma _{i}(\mathbf {{\widetilde{Q}}} {\textbf{D}}\mathbf {{\widetilde{Q}}^{*}})&=\max \vert \sum _{i=1}^{r}\langle {\textbf{U}}_{1}\widetilde{{\textbf{A}}}{\textbf{U}}_{2}\mathbf {{\widetilde{Q}}} {\textbf{D}}\mathbf {{\widetilde{Q}}^{*}}{\textbf{z}}_{i},{\textbf{z}}_{i}\rangle \vert \nonumber \\&\ge \max \vert \sum _{i=1}^{r}\langle {\textbf{U}}_{1}\widetilde{{\textbf{A}}} \mathbf {{\widetilde{Q}}}{\textbf{D}}\mathbf {{\widetilde{Q}}^{*}}{\textbf{z}}_{i}, {\textbf{z}}_{i}\rangle \vert \nonumber \\&=\sum _{i=1}^{r}\sigma _{i}(\widetilde{{\textbf{A}}}\mathbf {{\widetilde{Q}}} {\textbf{D}}\mathbf {{\widetilde{Q}}^{*}}). \end{aligned}$$
(68)

Here the first equality follows from Lemma 2, the inequality holds by restricting \({\textbf{U}}_{2}\) to the identity matrix, and the last equality follows from Lemma 1. The left-hand side is further equivalent to

$$\begin{aligned} \sum _{i=1}^{r}\sigma _{i}(\widetilde{{\textbf{A}}})\sigma _{i}(\mathbf {{\widetilde{Q}}} {\textbf{D}}\mathbf {{\widetilde{Q}}^{*}})&=\sum _{i=1}^{r}\sigma _{i}(\widetilde{{\textbf{A}}}) \sqrt{\lambda _{i}(\mathbf {{\widetilde{Q}}}{\textbf{D}}^{2}\widetilde{{\textbf{Q}}}^{*})}\nonumber \\&=\sum _{i=1}^{r}\sigma _{i}(\widetilde{{\textbf{A}}})\sqrt{\lambda _{i}({\textbf{D}}^{2})}\nonumber \\&=\sum _{i=1}^{r}\omega _{i}\sigma _{i}(\widetilde{{\textbf{A}}}). \end{aligned}$$
(69)

Then

$$\begin{aligned} \sum _{i=1}^{r}\sigma _{i}(\widetilde{{\textbf{A}}}\mathbf {{\widetilde{Q}}}{\textbf{D}} \mathbf {{\widetilde{Q}}^{*}})\le \sum _{i=1}^{r}\omega _{i}\sigma _{i}(\widetilde{{\textbf{A}}}). \end{aligned}$$
(70)

With a similar argument, we have

$$\begin{aligned} \sum _{i=1}^{r}\sigma _{i}(\widetilde{{\textbf{B}}}\mathbf {{\widetilde{Q}}}{\textbf{D}} \mathbf {{\widetilde{Q}}^{*}})\le \sum _{i=1}^{r}\omega _{i}\sigma _{i}(\widetilde{{\textbf{B}}}). \end{aligned}$$
(71)

Gathering Eqs. (67–71), it holds that

$$\begin{aligned} \sum _{i=1}^{r}\omega _{i}\sigma _{i}(\widetilde{{\textbf{C}}})\le \sum _{i=1}^{r}\omega _{i}\sigma _{i}(\widetilde{{\textbf{A}}})+\sum _{i=1}^{r}\omega _{i} \sigma _{i}(\widetilde{{\textbf{B}}}). \end{aligned}$$
(72)

Because the first \({\widehat{n}}\) singular values of the augmented matrices coincide with those of the original ones, Eq. (72) simplifies to

$$\begin{aligned} \sum _{i=1}^{r}\omega _{i}\sigma _{i}({\textbf{C}})\le \sum _{i=1}^{r}\omega _{i}\sigma _{i}({\textbf{A}})+\sum _{i=1}^{r}\omega _{i}\sigma _{i}({\textbf{B}}). \end{aligned}$$
(73)

In other words,

$$\begin{aligned} F(\mathbf {A+B})\le F({\textbf{A}})+F({\textbf{B}}). \end{aligned}$$
(74)

Considering the positive homogeneity of singular value function \(\sigma _{i}(\cdot )\), we have

$$\begin{aligned} F(\theta {\textbf{A}}+(1-\theta ){\textbf{B}})\le \theta F({\textbf{A}})+(1-\theta )F({\textbf{B}}), \end{aligned}$$
(75)

where \(\theta \in [0,1]\). Therefore, the function \(F(\cdot )\) is convex, which implies that \(g_{concave}(\cdot )\) is concave. \(\square \)
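As a quick numerical sanity check of the subadditivity in Eq. (74), the sketch below (assuming NumPy; the sizes, weights, and trial count are arbitrary illustrations) tests \(F({\textbf{A}}+{\textbf{B}})\le F({\textbf{A}})+F({\textbf{B}})\) for nonincreasing weights \(\omega _{1}\ge \cdots \ge \omega _{r}\ge 0\) on random matrices:

```python
import numpy as np

rng = np.random.default_rng(3)
n, r = 6, 3
omega = np.sort(rng.uniform(0.1, 1.0, size=r))[::-1]  # omega_1 >= ... >= omega_r > 0

def F(X):
    # Weighted sum of the top-r singular values, as in Eq. (64)
    s = np.linalg.svd(X, compute_uv=False)
    return float(omega @ s[:r])

for _ in range(1000):
    A = rng.standard_normal((n, n))
    B = rng.standard_normal((n, n))
    assert F(A + B) <= F(A) + F(B) + 1e-10
```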

1.3 The Remaining Proof of Theorem 6

Since the objective in Eq. (15) is convex–concave, the traditional concept of the subdifferential for convex functions is not suitable for the convergence analysis here. Instead, the following notion of the limiting subdifferential for non-convex and non-smooth functions is a necessary ingredient.

Definition 8

(The Limiting Subdifferential (Rockafellar & Wets, 1998) in Tensor Form) Let \(G:{\mathbb {R}}^{n_{1}\times n_{2}\times n_{3}}\rightarrow {\mathbb {R}}\) be a proper and lower semi-continuous function. The Fréchet subdifferential of G at \({\mathcal {X}}\) is

$$\begin{aligned} {\widehat{\partial }}G({\mathcal {X}})=\left\{ {\mathcal {Z}}\in {\mathbb {R}}^{n_{1}\times n_{2}\times n_{3}}:\liminf _{{\mathcal {Y}}\rightarrow {\mathcal {X}},\,{\mathcal {Y}}\ne {\mathcal {X}}}\frac{G({\mathcal {Y}})-G({\mathcal {X}})-\langle {\mathcal {Z}},{\mathcal {Y}}-{\mathcal {X}}\rangle }{\Vert {\mathcal {Y}}-{\mathcal {X}}\Vert _{F}}\ge 0\right\} . \end{aligned}$$

The limiting subdifferential of G at \({\mathcal {X}}\) is

$$\begin{aligned} \partial G({\mathcal {X}})=\{{\mathcal {Z}}\in {\mathbb {R}}^{n_{1}\times n_{2}\times n_{3}}:\exists \,{\mathcal {X}}^{(k)}\rightarrow {\mathcal {X}},\ G({\mathcal {X}}^{(k)})\rightarrow G({\mathcal {X}}),\ {\mathcal {Z}}^{(k)}\in {\widehat{\partial }}G({\mathcal {X}}^{(k)}),\ {\mathcal {Z}}^{(k)}\rightarrow {\mathcal {Z}}\ \textrm{as}\ k\rightarrow \infty \}. \end{aligned}$$
(76)
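For orientation, a standard one-dimensional illustration (not taken from the paper): for \(G(x)=\vert x\vert \), the Fréchet and limiting subdifferentials coincide,

$$\begin{aligned} {\widehat{\partial }}G(x)=\partial G(x)={\left\{ \begin{array}{ll} \{\textrm{sgn}(x)\},&{}\quad x\ne 0,\\ {[}-1,1{]},&{}\quad x=0, \end{array}\right. } \end{aligned}$$

and applying this rule elementwise yields the subdifferential \(\partial \Vert {\mathcal {E}}\Vert _{1}\) that appears later in the proof.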

The following three propositions are also necessary for the convergence analysis.

Proposition 3

(Fermat’s Rule (Rockafellar & Wets, 1998)) In the non-smooth case, Fermat’s rule still holds. That is, if \({\mathcal {X}}\) is a local minimizer of a non-smooth function G, then \(0\in \partial G({\mathcal {X}})\), i.e. \({\mathcal {X}}\) is a stationary point of G.

Proposition 4

(Rockafellar & Wets, 1998) Let \(\{{\mathcal {X}}_{k}\}_{k\ge 1}\) and \(\{{\mathcal {Z}}_{k}\}_{k\ge 1}\) satisfy \(\lim \limits _{k\rightarrow \infty }{\mathcal {X}}_{k}={\mathcal {X}}\), \(\lim \limits _{k\rightarrow \infty }{\mathcal {Z}}_{k}={\mathcal {Z}}\), \(\lim \limits _{k\rightarrow \infty }G({\mathcal {X}}_{k})=G({\mathcal {X}})\) and \({\mathcal {Z}}_{k}\in \partial G({\mathcal {X}}_{k})\). Then \({\mathcal {Z}}\in \partial G({\mathcal {X}})\).

Proposition 5

(Rockafellar & Wets, 1998) Let G be a proper and lower semi-continuous function \(G:{\mathbb {R}}^{n_{1}\times n_{2}\times n_{3}}\rightarrow {\mathbb {R}}\). Besides, suppose that \(f:{\mathbb {R}}^{n_{1}\times n_{2}\times n_{3}}\rightarrow {\mathbb {R}}\) is continuously differentiable. Then \(\partial (f+G)({\mathcal {X}})=\nabla f({\mathcal {X}})+\partial G({\mathcal {X}})\).

Proof

(The remaining part of the proof of Theorem 6.) First of all, we prove the basic fact that the limiting point pair \(({\mathcal {X}}^{*},{\mathcal {E}}^{*})\) is actually a feasible point pair. By Theorem 5, we have

$$\begin{aligned} \lim _{k\rightarrow \infty }\Vert {\mathcal {Y}}-{\mathcal {X}}^{(k+1)}-{\mathcal {E}}^{(k+1)} \Vert _{F}=\Vert {\mathcal {Y}}-{\mathcal {X}}^{*}-{\mathcal {E}}^{*}\Vert _{F}=0. \end{aligned}$$
(77)

Then the constraint \({\mathcal {Y}}={\mathcal {X}}^{*}+{\mathcal {E}}^{*}\) is satisfied, which implies that \(({\mathcal {X}}^{*},{\mathcal {E}}^{*})\) is a feasible point for the model Eq. (15).

By Proposition 3, it can be derived from Eq. (38) that

$$\begin{aligned} 0\in \partial \left( \frac{1}{\mu ^{(k)}}\textrm{Sep}({\mathcal {X}}^{(k+1)},\Omega ,\vartheta ) +\frac{1}{2}\Vert {\mathcal {X}}^{(k+1)}-{\mathcal {N}}^{(k)}\Vert _{F}^{2}\right) . \end{aligned}$$
(78)

Further, by Proposition 5, Eq. (78) can be reformulated as

$$\begin{aligned} 0\in \frac{1}{\mu ^{(k)}}\partial \textrm{Sep}({\mathcal {X}}^{(k+1)},\Omega ,\vartheta ) +{\mathcal {X}}^{(k+1)}-{\mathcal {N}}^{(k)}. \end{aligned}$$
(79)

Recall that \({\mathcal {N}}^{(k)}={\mathcal {Y}}+\frac{1}{\mu ^{(k)}}{\mathcal {L}}^{(k)}-{\mathcal {E}}^{(k)}\). Then we get

$$\begin{aligned} {\mathcal {L}}^{(k)}\in \partial \textrm{Sep}({\mathcal {X}}^{(k+1)},\Omega ,\vartheta ) +\mu ^{(k)}({\mathcal {X}}^{(k+1)}+{\mathcal {E}}^{(k+1)}-{\mathcal {Y}}). \end{aligned}$$
(80)

Equivalently, considering the expression of \({\mathcal {L}}^{(k+1)}\) in Algorithm 1, we have

$$\begin{aligned} {\mathcal {L}}^{(k+1)}\in \partial \textrm{Sep}({\mathcal {X}}^{(k+1)},\Omega ,\vartheta ). \end{aligned}$$
(81)

We claim that

$$\begin{aligned} \lim _{k\rightarrow \infty }\textrm{Sep}({\mathcal {X}}^{(k+1)},\Omega ,\vartheta ) =\textrm{Sep}({\mathcal {X}}^{*},\Omega ,\vartheta ). \end{aligned}$$
(82)

According to the structural expression in Theorem 1 and the triangle inequality of absolute value, we have

$$\begin{aligned}&\left| \textrm{Sep}({\mathcal {X}}^{(k+1)},\Omega ,\vartheta )-\textrm{Sep} ({\mathcal {X}}^{*},\Omega ,\vartheta )\right| \nonumber \\&\quad \le \underbrace{\omega \left| \Vert {\mathcal {X}}^{(k+1)}\Vert _{*}-\Vert {\mathcal {X}} ^{*}\Vert _{*}\right| }_{J_{0}^{(k)}}\nonumber \\&\quad +\underbrace{\left| \sum _{i=1}^{n_{3}} \vartheta _{i}\sum _{j=1}^{r}{\widetilde{\omega }}_{j}\sigma _{j}(\overline{{\mathcal {X}}^{(k+1)}}^{(i)})- \sum _{i=1}^{n_{3}}\vartheta _{i}\sum _{j=1}^{r}{\widetilde{\omega }}_{j}\sigma _{j} (\overline{{\mathcal {X}}^{*}}^{(i)})\right| }_{J_{1}^{(k)}}. \end{aligned}$$
(83)

On the one hand, it holds that

$$\begin{aligned} J_{0}^{(k)}&\le \omega \Vert {\mathcal {X}}^{(k+1)}-{\mathcal {X}}^{*}\Vert _{*}\nonumber \\&\le \omega \sqrt{\frac{\min \{n_{1},n_{2}\}}{n_{3}}}\Vert {\mathcal {X}}^{(k+1)}-{\mathcal {X}}^{*}\Vert _{F}, \end{aligned}$$
(84)

where the first inequality is attributed to the triangle inequality of the tensor nuclear norm. Because \(\lim \limits _{k\rightarrow \infty }{\mathcal {X}}^{(k+1)}={\mathcal {X}}^{*}\), we have \(\lim \limits _{k\rightarrow \infty }J_{0}^{(k)}=0\).

On the other hand, we have

$$\begin{aligned} J_{1}^{(k)}&\le \sum _{i=1}^{n_{3}}\vartheta _{i}\sum _{j=1}^{r}{\widetilde{\omega }}_{j}\sigma _{j} (\overline{{\mathcal {X}}^{(k+1)}}^{(i)}-\overline{{\mathcal {X}}^{*}}^{(i)})\nonumber \\&\le \vartheta _{\max }\omega _{\max }\sum _{i=1}^{n_{3}}\sum _{j=1}^{r}\sigma _{j} (\overline{{\mathcal {X}}^{(k+1)}}^{(i)}-\overline{{\mathcal {X}}^{*}}^{(i)})\nonumber \\&\le \vartheta _{\max }\omega _{\max }\sum _{i=1}^{n_{3}}\sum _{j=1}^{\min \{n_{1},n_{2}\}}\sigma _{j} (\overline{{\mathcal {X}}^{(k+1)}}^{(i)}-\overline{{\mathcal {X}}^{*}}^{(i)})\nonumber \\&=n_{3}\vartheta _{\max }\omega _{\max }\Vert {\mathcal {X}}^{(k+1)}-{\mathcal {X}}^{*}\Vert _{*}\nonumber \\&\le \sqrt{\min \{n_{1},n_{2}\}n_{3}}\vartheta _{\max }\omega _{\max }\Vert {\mathcal {X}}^{(k+1)}-{\mathcal {X}}^{*}\Vert _{F}, \end{aligned}$$
(85)

where \(\vartheta _{\max }=\max _{i\in [n_{3}]}\vartheta _{i}\) and \(\omega _{\max }=\Vert \Omega \Vert _{\infty }\), in which \(\Omega =\{\omega ,\omega _{1},\cdots ,\omega _{r}\}\) comes from Definition 4. The first inequality holds by the subadditivity of the weighted sum of the top-r singular values, as in Eq. (74). According to the fact \(\lim \limits _{k\rightarrow \infty }{\mathcal {X}}^{(k+1)}={\mathcal {X}}^{*}\), it holds that \(\lim \limits _{k\rightarrow \infty }J_{1}^{(k)}=0\).

Therefore, Eq. (82) holds. Recall that \({\mathcal {L}}^{*}\) is an accumulation point of the sequence \(\{{\mathcal {L}}^{(k)}\}_{k\ge 1}\). Then there must exist a subsequence \(\{{\mathcal {L}}^{(k_{j})}\}_{j\ge 1}\) which converges to \({\mathcal {L}}^{*}\). Select the corresponding subsequences \(\{{\mathcal {X}}^{(k_{j})}\}_{j\ge 1}\) and \(\{{\mathcal {E}}^{(k_{j})}\}_{j\ge 1}\) in \(\{{\mathcal {X}}^{(k)}\}_{k\ge 1}\) and \(\{{\mathcal {E}}^{(k)}\}_{k\ge 1}\), respectively. Using these selected subsequences, by Proposition 4, we have

$$\begin{aligned} 0\in \partial \textrm{Sep}({\mathcal {X}}^{*},\Omega ,\vartheta )-{\mathcal {L}}^{*}. \end{aligned}$$
(86)

By Proposition 3, it can be derived from Eq. (40) that

$$\begin{aligned} 0\in \partial \left( \frac{\lambda }{\mu ^{(k)}}\Vert {\mathcal {E}}^{(k+1)}\Vert _{1}+\frac{1}{2} \Vert {\mathcal {E}}^{(k+1)}-{\mathcal {J}}^{(k)}\Vert ^{2}_{F}\right) . \end{aligned}$$
(87)

Recall that \({\mathcal {J}}^{(k)}={\mathcal {Y}}+\frac{1}{\mu ^{(k)}}{\mathcal {L}}^{(k)} -{\mathcal {X}}^{(k+1)}\). Then it holds that

$$\begin{aligned} {\mathcal {L}}^{(k)}\in \partial \Vert {\mathcal {E}}^{(k+1)}\Vert _{1}+\mu ^{(k)}({\mathcal {X}}^{(k+1)} +{\mathcal {E}}^{(k+1)}-{\mathcal {Y}}). \end{aligned}$$
(88)

Equivalently, considering the expression of \({\mathcal {L}}^{(k+1)}\) in Algorithm 1, we have

$$\begin{aligned} {\mathcal {L}}^{(k+1)}\in \partial \Vert {\mathcal {E}}^{(k+1)}\Vert _{1}. \end{aligned}$$
(89)
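The minimizer characterized by Eqs. (87)–(89) has the standard closed form of elementwise soft-thresholding. A minimal sketch (assuming NumPy; the variable names and sizes are illustrative, not the authors' implementation):

```python
import numpy as np

def soft_threshold(J, tau):
    """Elementwise minimizer of tau * ||E||_1 + 0.5 * ||E - J||_F^2."""
    return np.sign(J) * np.maximum(np.abs(J) - tau, 0.0)

rng = np.random.default_rng(4)
tau = 0.1                                   # plays the role of lambda / mu^(k)
J = rng.standard_normal((8, 8, 4))          # stands in for J^(k)
E_next = soft_threshold(J, tau)             # stands in for E^(k+1)

# Optimality check: (J - E)/tau must be a subgradient of ||.||_1 at E
G = (J - E_next) / tau
assert np.all(np.abs(G) <= 1 + 1e-12)
nz = E_next != 0
assert np.allclose(G[nz], np.sign(E_next[nz]))
```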

We claim that

$$\begin{aligned} 0\in \partial \Vert {\mathcal {E}}^{*}\Vert _{1}-{\mathcal {L}}^{*}. \end{aligned}$$
(90)

Note that

$$\begin{aligned}&\quad \left| \Vert {\mathcal {E}}^{(k+1)}\Vert _{1}-\Vert {\mathcal {E}}^{*}\Vert _{1}\right| \nonumber \\&\le \Vert {\mathcal {E}}^{(k+1)}-{\mathcal {E}}^{*}\Vert _{1}\nonumber \\&\le \sqrt{n_{1}n_{2}n_{3}}\Vert {\mathcal {E}}^{(k+1)}-{\mathcal {E}}^{*}\Vert _{F}, \end{aligned}$$
(91)

where the first inequality holds by triangle inequality.

Considering the aforementioned subsequences \(\{{\mathcal {X}}^{(k_{j})}\}_{j\ge 1}\), \(\{{\mathcal {E}}^{(k_{j})}\}_{j\ge 1}\) and \(\{{\mathcal {L}}^{(k_{j})}\}_{j\ge 1}\), it follows from Proposition 4 that

$$\begin{aligned} 0\in \partial \Vert {\mathcal {E}}^{*}\Vert _{1}-{\mathcal {L}}^{*}. \end{aligned}$$
(92)

\(\square \)

1.4 The Proof of Theorem 7

Let \({\mathcal {A}}\) be a real tensor of size \(n_{1}\times n_{2}\times n_{3}\) and \(\Phi =diag([\theta _{1},\theta _{2},\cdots ,\theta _{n_{3}}])\in {\mathbb {R}}^{n_{3}\times n_{3}}\). We define the tensor product of \(\Phi \) and \({\mathcal {A}}\) as

$$\begin{aligned} \Phi \otimes {\mathcal {A}} = fold\left( \begin{bmatrix} \theta _{1}{\mathcal {A}}^{(1)}\\ \theta _{2}{\mathcal {A}}^{(2)}\\ \vdots \\ \theta _{n_{3}}{\mathcal {A}}^{(n_{3})} \end{bmatrix} \right) . \end{aligned}$$
(93)
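With the frontal slices \({\mathcal {A}}^{(i)}\) stored along the last axis of a NumPy array (an illustrative convention, not prescribed by the paper), Eq. (93) is simply a slice-wise scaling:

```python
import numpy as np

rng = np.random.default_rng(5)
n1, n2, n3 = 4, 5, 3
A = rng.standard_normal((n1, n2, n3))   # frontal slice A^(i) is A[:, :, i]
theta = rng.standard_normal(n3)

Phi_A = A * theta[None, None, :]        # Phi (x) A: scales slice i by theta_i

assert np.allclose(Phi_A[:, :, 1], theta[1] * A[:, :, 1])
```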

Denote \({\textbf{T}}\) by

$$\begin{aligned} {\textbf{T}}=\{{\mathcal {U}}*{\mathcal {Y}}^{*}+{\mathcal {W}}*{\mathcal {V}}^{*}:{\mathcal {Y}}\in {\mathbb {R}}^{n_{2}\times r\times n_{3}},\ {\mathcal {W}}\in {\mathbb {R}}^{n_{1}\times r\times n_{3}}\}. \end{aligned}$$
(94)

Let \({\textbf{T}}^{\bot }\) be the orthogonal complement of \({\textbf{T}}\) in \({\mathbb {R}}^{n_{1}\times n_{2}\times n_{3}}\). Given any \({\mathcal {Z}}\in {\mathbb {R}}^{n_{1}\times n_{2}\times n_{3}}\), the projections of \({\mathcal {Z}}\) onto \({\textbf{T}}\) and \({\textbf{T}}^{\bot }\) are given by

$$\begin{aligned} {\mathcal {P}}_{{\textbf{T}}}{\mathcal {Z}}={\mathcal {U}}*{\mathcal {U}}^{*}*{\mathcal {Z}} +{\mathcal {Z}}*{\mathcal {V}}*{\mathcal {V}}^{*}-{\mathcal {U}}*{\mathcal {U}}^{*} *{\mathcal {Z}}*{\mathcal {V}}*{\mathcal {V}}^{*} \end{aligned}$$
(95)

and

$$\begin{aligned} {\mathcal {P}}_{{\textbf{T}}^{\bot }}{\mathcal {Z}}=({\mathcal {I}}_{n_{1}}-{\mathcal {U}} *{\mathcal {U}}^{*})*{\mathcal {Z}}*({\mathcal {I}}_{n_{2}}-{\mathcal {V}}*{\mathcal {V}}^{*}), \end{aligned}$$
(96)

respectively.
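In the matrix case \(n_{3}=1\), the t-products in Eqs. (95)–(96) reduce to ordinary matrix products. The sketch below (assuming NumPy; sizes arbitrary) implements the two projections for this special case and checks that they are complementary and orthogonal, and that the inequality of Lemma 3 below holds on a random input:

```python
import numpy as np

rng = np.random.default_rng(6)
n1, n2, r = 6, 5, 2

U, _ = np.linalg.qr(rng.standard_normal((n1, r)))  # orthonormal columns
V, _ = np.linalg.qr(rng.standard_normal((n2, r)))

def P_T(Z):
    # Eq. (95) with the t-product replaced by matrix multiplication
    return U @ U.T @ Z + Z @ V @ V.T - U @ U.T @ Z @ V @ V.T

def P_T_perp(Z):
    # Eq. (96) with the t-product replaced by matrix multiplication
    return (np.eye(n1) - U @ U.T) @ Z @ (np.eye(n2) - V @ V.T)

nuclear = lambda M: np.linalg.svd(M, compute_uv=False).sum()

Z = rng.standard_normal((n1, n2))
assert np.allclose(P_T(Z) + P_T_perp(Z), Z)           # the two parts recompose Z
assert np.isclose((P_T(Z) * P_T_perp(Z)).sum(), 0.0)  # and are orthogonal
assert nuclear(P_T_perp(Z)) <= nuclear(Z) + 1e-10     # Lemma 3 in the matrix case
```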

To prove Theorem 7, we need the following auxiliary lemma.

Lemma 3

For any tensor \({\mathcal {H}}\in {\mathbb {R}}^{n_{1}\times n_{2}\times n_{3}}\), we have

$$\begin{aligned} \Vert {\mathcal {H}}\Vert _{*}\ge \Vert {\mathcal {P}}_{{\textbf{T}}^{\bot }}{\mathcal {H}}\Vert _{*}. \end{aligned}$$
(97)

Proof

By the expression of the tensor nuclear norm (see Proposition 2 for details),

$$\begin{aligned} \Vert {\mathcal {P}}_{{\textbf{T}}^{\bot }}{\mathcal {H}}\Vert _{*}=\frac{1}{n_{3}}\sum _{i=1}^{n_{3}}\sum _{j=1}^{\min \{n_{1},n_{2}\}}\sigma _{j}(\overline{{\mathcal {P}}_{{\textbf{T}}^{\bot }}{\mathcal {H}}}^{(i)}). \end{aligned}$$
(98)

By the definition of projection Eq. (96) and Marshall and Olkin (1979),

$$\begin{aligned} \sum _{j=1}^{\min \{n_{1},n_{2}\}}\sigma _{j}(\overline{{\mathcal {P}}_{{\textbf{T}}^{\bot }}{\mathcal {H}}}^{(i)})\le \sum _{j=1}^{\min \{n_{1},n_{2}\}}\sigma _{j}({\textbf{I}}_{n_{1}}-\overline{{\mathcal {U}}}^{(i)}(\overline{{\mathcal {U}}}^{(i)})^{*})\sigma _{j}(\overline{{\mathcal {H}}}^{(i)})\sigma _{j}({\textbf{I}}_{n_{2}}-\overline{{\mathcal {V}}}^{(i)}(\overline{{\mathcal {V}}}^{(i)})^{*}), \end{aligned}$$
(99)

where \({\textbf{I}}_{n_{1}}\) and \({\textbf{I}}_{n_{2}}\) are identity matrices of size \(n_{1}\times n_{1}\) and \(n_{2}\times n_{2}\), respectively.

Note that, for \(1\le j\le \min \{n_{1},n_{2}\}\),

$$\begin{aligned} \sigma _{j}({\textbf{I}}_{n_{1}}-\overline{{\mathcal {U}}}^{(i)}(\overline{{\mathcal {U}}}^{(i)})^{*})\le 1\quad \textrm{and}\quad \sigma _{j}({\textbf{I}}_{n_{2}}-\overline{{\mathcal {V}}}^{(i)}(\overline{{\mathcal {V}}}^{(i)})^{*})\le 1. \end{aligned}$$
(100)

Then we have

$$\begin{aligned} \sum _{j=1}^{\min \{n_{1},n_{2}\}}\sigma _{j}(\overline{{\mathcal {P}}_{{\textbf{T}}^{\bot }}{\mathcal {H}}}^{(i)})\le \sum _{j=1}^{\min \{n_{1},n_{2}\}}\sigma _{j}(\overline{{\mathcal {H}}}^{(i)}). \end{aligned}$$
(101)

Summing over \(i\) and dividing by \(n_{3}\), we further have

$$\begin{aligned} \Vert {\mathcal {H}}\Vert _{*}\ge \Vert {\mathcal {P}}_{{\textbf{T}}^{\bot }}{\mathcal {H}}\Vert _{*}. \end{aligned}$$
(102)

\(\square \)

Now we begin to prove Theorem 7.

Proof

The deduction is divided into two steps as follows.

  • Step 1: Under the assumption of \(\Vert {\mathcal {P}}_{{{\varvec{\Omega }}}} {\mathcal {P}}_{{\textbf{T}}}\Vert \le \frac{1}{2}\) and \(\lambda =\frac{1}{\sqrt{\max \{n_{1},n_{2}\}n_{3}}}\), we prove that \(({\mathcal {X}}_{0},{\mathcal {E}}_{0})\) is the \((\varepsilon ,\delta )\)-asymptotic unique local solution to the proposed CCSVS model Eq. (15) if there exists \(({\mathcal {W}},{\mathcal {F}})\) such that

    $$\begin{aligned} \omega \Phi \otimes ({\mathcal {U}}*{\mathcal {V}}^{*}+{\mathcal {W}})=\lambda (\textrm{sgn} ({\mathcal {E}}_{0})+{\mathcal {F}}+{\mathcal {P}}_{{{\varvec{\Omega }}}}{\mathcal {D}}), \end{aligned}$$
    (103)

    where \(\Phi =diag([\vartheta _{1},\vartheta _{2},\cdots ,\vartheta _{n_{3}}])\in {\mathbb {R}}^{n_{3}\times n_{3}}\) and

    $$\begin{aligned}&{\mathcal {P}}_{{\textbf{T}}}{\mathcal {W}}=0, \Vert {\mathcal {W}}\Vert \le \epsilon _{0}, {\mathcal {P}}_{{{\varvec{\Omega }}}}({\mathcal {F}})=0, \Vert {\mathcal {F}}\Vert _{\infty }\le \epsilon _{0}\nonumber \\&\quad \textrm{and}\ \Vert {\mathcal {P}}_{{{\varvec{\Omega }}}}{\mathcal {D}}\Vert _{F}\le \frac{1}{4}, \end{aligned}$$
    (104)

    in which \(\epsilon _{0}\) is a positive constant such that \(0<\epsilon _{0}<\frac{3}{4}\).

  • Step 2: On the basis of Step 1, it suffices to produce a dual certification \({\mathcal {W}}\) which satisfies

    $$\begin{aligned} \left\{ \begin{aligned}&{\mathcal {W}}\in {\textbf{T}}^{\bot },\\&\Vert {\mathcal {W}}\Vert \le \epsilon _{0},\\&\Vert {\mathcal {P}}_{{{\varvec{\Omega }}}}(\omega \Phi \otimes ({\mathcal {U}} *{\mathcal {V}}^{*}+{\mathcal {W}})-\lambda \textrm{sgn}({\mathcal {E}}_{0}))\Vert _{F}\le \frac{\lambda }{4},\\&\Vert {\mathcal {P}}_{{{\varvec{\Omega }}}^{\perp }}(\omega \Phi \otimes ({\mathcal {U}}*{\mathcal {V}}^{*} +{\mathcal {W}}))\Vert _{\infty }\le \lambda \epsilon _{0}. \end{aligned} \right. \end{aligned}$$
    (105)

First of all, we complete Step 1. For any \({\mathcal {H}}\ne 0\), \(({\mathcal {X}}_{0}+{\mathcal {H}},{\mathcal {E}}_{0}-{\mathcal {H}})\) is a feasible point because \(({\mathcal {X}}_{0}+{\mathcal {H}})+({\mathcal {E}}_{0}-{\mathcal {H}})={\mathcal {X}}_{0}+{\mathcal {E}}_{0}\). To prove that \(({\mathcal {X}}_{0},{\mathcal {E}}_{0})\) is the \((\varepsilon ,\delta )\)-asymptotic unique local solution (see Definition 6 for details) to the proposed CCSVS model Eq. (15), we need to show

$$\begin{aligned}&\textrm{Sep}({\mathcal {X}}_{0}+{\mathcal {H}},\Omega ,\vartheta )+\lambda \Vert {\mathcal {E}}_{0} -{\mathcal {H}}\Vert _{1}>\nonumber \\&\quad \textrm{Sep}({\mathcal {X}}_{0},\Omega ,\vartheta ) +\lambda \Vert {\mathcal {E}}_{0}\Vert _{1}-\delta \end{aligned}$$
(106)

for any \({\mathcal {H}}\in \textrm{B}(0,\varepsilon )-\{0\}\) in the sense of tensor nuclear norm.

Note that

$$\begin{aligned}&\textrm{Sep}({\mathcal {X}}_{0}+{\mathcal {H}},\Omega ,\vartheta )=\textrm{Sep}_{ns}({\mathcal {X}}_{0} +{\mathcal {H}},\Omega ,\vartheta )\nonumber \\&\quad -\sum _{i=1}^{n_{3}}\vartheta _{i}\sum _{j=1} ^{r}{\widetilde{\omega }}_{j}\sigma _{j}(\overline{{\mathcal {X}}}^{(i)}_{0}+\overline{{\mathcal {H}}}^{(i)}), \end{aligned}$$
(107)

where \({\widetilde{\omega }}_{j}=\omega +\omega _{j}\), \(j=1,2,\cdots ,r\) and \(r<\min \{n_{1},n_{2}\}\).

For any \(\omega \Phi \otimes ({\mathcal {U}}*{\mathcal {V}}^{*}+{\mathcal {W}}_{0})\in \partial \textrm{Sep}_{ns}({\mathcal {X}}_{0},\Omega ,\vartheta )\) and \(\textrm{sgn}({\mathcal {E}}_{0})+{\mathcal {F}}_{0}\in \partial \Vert {\mathcal {E}}_{0}\Vert _{1}\), we have

$$\begin{aligned}&\textrm{Sep}_{ns}({\mathcal {X}}_{0}+{\mathcal {H}},\Omega ,\vartheta )+\lambda \Vert {\mathcal {E}}_{0}-{\mathcal {H}}\Vert _{1}\nonumber \\&\quad \ge \textrm{Sep}_{ns}({\mathcal {X}}_{0},\Omega ,\vartheta )+\langle \omega \Phi \otimes ({\mathcal {U}}*{\mathcal {V}}^{*}+{\mathcal {W}}_{0}),{\mathcal {H}}\rangle \nonumber \\&\qquad +\lambda \Vert {\mathcal {E}}_{0}\Vert _{1}-\lambda \langle \textrm{sgn}({\mathcal {E}}_{0})+{\mathcal {F}}_{0},{\mathcal {H}}\rangle . \end{aligned}$$
(108)

By the intermediate result in the proof of Theorem 1, we have

$$\begin{aligned}&\sum _{i=1}^{n_{3}}\vartheta _{i}\sum _{j=1}^{r}{\widetilde{\omega }}_{j}\sigma _{j} (\overline{{\mathcal {X}}}^{(i)}_{0}+\overline{{\mathcal {H}}}^{(i)})\le \sum _{i=1} ^{n_{3}}\vartheta _{i}\sum _{j=1}^{r}{\widetilde{\omega }}_{j}\sigma _{j}(\overline{{\mathcal {X}}} ^{(i)}_{0})\nonumber \\&\quad +\sum _{i=1}^{n_{3}}\vartheta _{i}\sum _{j=1}^{r}{\widetilde{\omega }}_{j}\sigma _{j} (\overline{{\mathcal {H}}}^{(i)}). \end{aligned}$$
(109)

Therefore

$$\begin{aligned}&\textrm{Sep}({\mathcal {X}}_{0}+{\mathcal {H}},\Omega ,\vartheta )+\lambda \Vert {\mathcal {E}}_{0} -{\mathcal {H}}\Vert _{1}\nonumber \\&\quad \ge \textrm{Sep}({\mathcal {X}}_{0},\Omega ,\vartheta )+\langle \omega \Phi \otimes ({\mathcal {U}} *{\mathcal {V}}^{*}+{\mathcal {W}}_{0}),{\mathcal {H}}\rangle \nonumber \\&\qquad +\lambda \Vert {\mathcal {E}}_{0}\Vert _{1}-\lambda \langle \textrm{sgn}({\mathcal {E}}_{0})+{\mathcal {F}}_{0},{\mathcal {H}}\rangle \nonumber \\&\qquad -\sum _{i=1}^{n_{3}}\vartheta _{i}\sum _{j=1}^{r}{\widetilde{\omega }}_{j}\sigma _{j} (\overline{{\mathcal {H}}}^{(i)})\nonumber \\&\quad =\textrm{Sep}({\mathcal {X}}_{0},\Omega ,\vartheta )+\langle {\mathcal {U}}*{\mathcal {V}}^{*} +{\mathcal {W}}_{0},\omega \Phi \otimes {\mathcal {H}}\rangle \nonumber \\&\qquad +\lambda \Vert {\mathcal {E}}_{0}\Vert _{1} -\lambda \langle \textrm{sgn}({\mathcal {E}}_{0}) +{\mathcal {F}}_{0},{\mathcal {H}}\rangle \nonumber \\&\qquad -\sum _{i=1}^{n_{3}}\vartheta _{i}\sum _{j=1}^{r}{\widetilde{\omega }}_{j}\sigma _{j} (\overline{{\mathcal {H}}}^{(i)}). \end{aligned}$$
(110)

According to the property of dual norm on the tensor nuclear norm and \(l_{1}\)-norm,

$$\begin{aligned} \sup _{\Vert {\mathcal {W}}_{0}\Vert \le 1}\langle {\mathcal {W}}_{0},\omega \Phi \otimes {\mathcal {H}}\rangle =\Vert \omega \Phi \otimes {\mathcal {H}}\Vert _{*} \end{aligned}$$
(111)

and

$$\begin{aligned} \sup _{\Vert {\mathcal {F}}_{0}\Vert _{\infty }\le 1}-\langle {\mathcal {F}}_{0},{\mathcal {H}}\rangle =\Vert {\mathcal {H}}\Vert _{1}. \end{aligned}$$
(112)

Note that

$$\begin{aligned} \Vert {\mathcal {H}}\Vert _{1}\ge \Vert {\mathcal {P}}_{{{\varvec{\Omega }}}^{\bot }}{\mathcal {H}}\Vert _{1}. \end{aligned}$$
(113)

Then we have

$$\begin{aligned} \sup _{\Vert {\mathcal {F}}_{0}\Vert _{\infty }\le 1}-\langle {\mathcal {F}}_{0},{\mathcal {H}}\rangle \ge \Vert {\mathcal {P}}_{{{\varvec{\Omega }}}^{\bot }}{\mathcal {H}}\Vert _{1}. \end{aligned}$$
(114)

Further, taking the supremum of the right-hand side of Eq. (110) over the set \(\{{\mathcal {W}}:\Vert {\mathcal {W}}\Vert \le 1\}\) leads to

$$\begin{aligned}&\textrm{Sep}({\mathcal {X}}_{0}+{\mathcal {H}},\Omega ,\vartheta )+\lambda \Vert {\mathcal {E}}_{0}-{\mathcal {H}}\Vert _{1}\nonumber \\&\quad \ge \textrm{Sep}({\mathcal {X}}_{0},\Omega ,\vartheta )+\langle {\mathcal {U}} *{\mathcal {V}}^{*},\omega \Phi \otimes {\mathcal {H}}\rangle \nonumber \\&\qquad +\lambda \Vert {\mathcal {E}}_{0}\Vert _{1} -\lambda \langle \textrm{sgn}({\mathcal {E}}_{0}),{\mathcal {H}}\rangle \nonumber \\&\qquad -\sum _{i=1}^{n_{3}}\vartheta _{i}\sum _{j=1}^{r}{\widetilde{\omega }}_{j}\sigma _{j} (\overline{{\mathcal {H}}}^{(i)})+\sup _{\Vert {\mathcal {W}}\Vert \le 1}\langle {\mathcal {W}}, \omega \Phi \otimes {\mathcal {H}}\rangle \nonumber \\&\qquad +\lambda \sup _{\Vert {\mathcal {F}}\Vert _{\infty } \le 1} \langle -{\mathcal {F}},{\mathcal {H}}\rangle \nonumber \\&\quad =\textrm{Sep}({\mathcal {X}}_{0},\Omega ,\vartheta )+\langle \omega \Phi \otimes ({\mathcal {U}} *{\mathcal {V}}^{*}),{\mathcal {H}}\rangle \nonumber \\&\qquad +\lambda \Vert {\mathcal {E}}_{0}\Vert _{1}-\lambda \langle \textrm{sgn}({\mathcal {E}}_{0}),{\mathcal {H}}\rangle \nonumber \\&\qquad -\sum _{i=1}^{n_{3}}\vartheta _{i}\sum _{j=1}^{r}{\widetilde{\omega }}_{j}\sigma _{j} (\overline{{\mathcal {H}}}^{(i)})+\sup _{\Vert {\mathcal {W}}\Vert \le 1}\vert \langle {\mathcal {W}},\omega \Phi \otimes {\mathcal {H}}\rangle \vert \nonumber \\&\qquad +\lambda \sup _{\Vert {\mathcal {F}}\Vert _{\infty } \le 1}\vert \langle {\mathcal {F}},{\mathcal {H}}\rangle \vert . \end{aligned}$$
(115)

By the assumption on Eq. (103),

$$\begin{aligned}&\langle \omega \Phi \otimes ({\mathcal {U}}*{\mathcal {V}}^{*})-\lambda \textrm{sgn} ({\mathcal {E}}_{0}),{\mathcal {H}}\rangle \nonumber \\&\quad =-\langle {\mathcal {W}},\omega \Phi \otimes {\mathcal {H}}\rangle +\lambda \langle {\mathcal {F}}, {\mathcal {H}}\rangle +\lambda \langle {\mathcal {P}}_{{{\varvec{\Omega }}}}{\mathcal {D}}, {\mathcal {H}}\rangle \nonumber \\&\quad \ge -\sup _{\Vert {\mathcal {W}}\Vert \le \epsilon _{0}}\vert \langle {\mathcal {W}},\omega \Phi \otimes {\mathcal {H}}\rangle \vert -\lambda \sup _{\Vert {\mathcal {F}}\Vert _{\infty }\le \epsilon _{0}} \vert \langle {\mathcal {F}},{\mathcal {H}}\rangle \vert \nonumber \\&\qquad +\lambda \langle {\mathcal {P}}_{{{\varvec{\Omega }}}} {\mathcal {D}},{\mathcal {P}}_{{{\varvec{\Omega }}}}{\mathcal {H}}\rangle \nonumber \\&\quad \ge -\epsilon _{0}\sup _{\Vert \mathcal {{\widehat{W}}}\Vert \le 1}\vert \langle \mathcal {{\widehat{W}}},\omega \Phi \otimes {\mathcal {H}}\rangle \vert -\lambda \epsilon _{0}\sup _{\Vert \mathcal {{\widehat{F}}}\Vert _{\infty }\le 1}\vert \langle \mathcal {{\widehat{F}}},{\mathcal {H}}\rangle \vert -\frac{\lambda }{4}\Vert {\mathcal {P}}_{{{\varvec{\Omega }}}}{\mathcal {H}}\Vert _{F}. \end{aligned}$$
(116)

Then we have

$$\begin{aligned}&\quad \textrm{Sep}({\mathcal {X}}_{0}+{\mathcal {H}},\Omega ,\vartheta )+\lambda \Vert {\mathcal {E}}_{0} -{\mathcal {H}}\Vert _{1}\nonumber \\&\quad \ge \textrm{Sep}({\mathcal {X}}_{0},\Omega ,\vartheta )+\lambda \Vert {\mathcal {E}}_{0}\Vert _{1}\nonumber \\&\qquad -\sum _{i=1}^{n_{3}}\vartheta _{i}\sum _{j=1}^{r}{\widetilde{\omega }}_{j}\sigma _{j} (\overline{{\mathcal {H}}}^{(i)})+\sup _{\Vert {\mathcal {W}}\Vert \le 1}\langle {\mathcal {W}}, \omega \Phi \otimes {\mathcal {H}}\rangle \nonumber \\&\qquad +\frac{\lambda }{2}\sup _{\Vert {\mathcal {F}}\Vert _{\infty } \le 1}\langle -{\mathcal {F}},{\mathcal {H}}\rangle \nonumber \\&\quad \ge \textrm{Sep}({\mathcal {X}}_{0},\Omega ,\vartheta )+\lambda \Vert {\mathcal {E}}_{0}\Vert _{1}\nonumber \\&\qquad -\sum _{i=1}^{n_{3}}\vartheta _{i}\sum _{j=1}^{r}{\widetilde{\omega }}_{j}\sigma _{j} (\overline{{\mathcal {H}}}^{(i)})+(1-\epsilon _{0})\nonumber \\&\quad \left( \Vert \omega \Phi \otimes {\mathcal {H}}\Vert _{*} +\lambda \Vert {\mathcal {P}}_{{{\varvec{\Omega }}}^{\bot }}{\mathcal {H}}\Vert _{1} \right) -\frac{\lambda }{4}\Vert {\mathcal {P}}_{{{\varvec{\Omega }}}}{\mathcal {H}}\Vert _{F}. \end{aligned}$$
(117)

Note that

$$\begin{aligned} \Vert {\mathcal {P}}_{{{\varvec{\Omega }}}}{\mathcal {H}}\Vert _{F}&\le \Vert {\mathcal {P}}_{{{\varvec{\Omega }}}} {\mathcal {P}}_{{\textbf{T}}}{\mathcal {H}}\Vert _{F}+\Vert {\mathcal {P}}_{{{\varvec{\Omega }}}}{\mathcal {P}}_{{\textbf{T}} ^{\bot }}{\mathcal {H}}\Vert _{F}\nonumber \\&\le \frac{1}{2}\Vert {\mathcal {H}}\Vert _{F}+\Vert {\mathcal {P}}_{{\textbf{T}}^{\bot }}{\mathcal {H}}\Vert _{F}\nonumber \\&\le \frac{1}{2}\Vert {\mathcal {P}}_{{{\varvec{\Omega }}}}{\mathcal {H}}\Vert _{F}+\frac{1}{2}\Vert {\mathcal {P}}_{{{\varvec{\Omega }}} ^{\bot }}{\mathcal {H}}\Vert _{F}+\Vert {\mathcal {P}}_{{\textbf{T}}^{\bot }}{\mathcal {H}}\Vert _{F}, \end{aligned}$$
(118)

which implies that

$$\begin{aligned} \Vert {\mathcal {P}}_{{{\varvec{\Omega }}}}{\mathcal {H}}\Vert _{F}&\le \Vert {\mathcal {P}}_{{{\varvec{\Omega }}}^{\bot }}{\mathcal {H}}\Vert _{F}+2\Vert {\mathcal {P}}_{{\textbf{T}}^{\bot }}{\mathcal {H}}\Vert _{F}\nonumber \\&\le \Vert {\mathcal {P}}_{{{\varvec{\Omega }}}^{\bot }}{\mathcal {H}}\Vert _{1}+2\sqrt{n_{3}}\Vert {\mathcal {P}}_{{\textbf{T}} ^{\bot }}{\mathcal {H}}\Vert _{*}\nonumber \\&\le \Vert {\mathcal {P}}_{{{\varvec{\Omega }}}^{\bot }}{\mathcal {H}}\Vert _{1}+2\sqrt{n_{3}}\Vert {\mathcal {H}}\Vert _{*}, \end{aligned}$$
(119)

where the last inequality holds by Lemma 3.

Then

$$\begin{aligned}&\quad \textrm{Sep}({\mathcal {X}}_{0}+{\mathcal {H}},\Omega ,\vartheta )+\lambda \Vert {\mathcal {E}}_{0} -{\mathcal {H}}\Vert _{1}\nonumber \\&\ge \textrm{Sep}({\mathcal {X}}_{0},\Omega ,\vartheta )+\lambda \Vert {\mathcal {E}}_{0}\Vert _{1}\nonumber \\&\quad \underbrace{-\sum _{i=1}^{n_{3}}\vartheta _{i}\sum _{j=1}^{r}{\widetilde{\omega }}_{j}\sigma _{j} (\overline{{\mathcal {H}}}^{(i)})+(1-\epsilon _{0})\Vert \omega \Phi \otimes {\mathcal {H}}\Vert _{*} -\frac{\lambda \sqrt{n_{3}}}{2}\Vert {\mathcal {H}}\Vert _{*}}_{I_{0}}\nonumber \\&\quad +\underbrace{\left( \frac{3}{4}-\epsilon _{0}\right) \lambda \Vert {\mathcal {P}}_{{{\varvec{\Omega }}}^{\bot }} {\mathcal {H}}\Vert _{1}}_{I_{1}}. \end{aligned}$$
(120)

As a matter of fact,

$$\begin{aligned} I_{0}=\sum _{i=1}^{n_{3}}\sum _{j=1}^{r}\kappa _{ij}\sigma _{j}(\overline{{\mathcal {H}}}^{(i)}), \end{aligned}$$
(121)

where

$$\begin{aligned} \kappa _{ij}=\frac{1}{n_{3}}\left( (1-\epsilon _{0})\omega \vartheta _{i}-\frac{\lambda \sqrt{n_{3}}}{2}\right) -\vartheta _{i}{\widetilde{\omega }}_{j}. \end{aligned}$$
(122)

Then

$$\begin{aligned} \vert \kappa _{ij}\vert \le 3\vartheta _{\max }\omega _{\max }+\frac{1}{n_{3}\sqrt{\max \{n_{1},n_{2}\}}}. \end{aligned}$$
(123)

By the given condition \(\Vert {\mathcal {H}}\Vert _{*}\le \varepsilon \), we have

$$\begin{aligned} \vert I_{0}\vert \le \varepsilon \left( 3\vartheta _{\max }\omega _{\max }+\frac{1}{n_{3} \sqrt{\max \{n_{1},n_{2}\}}}\right) =\delta . \end{aligned}$$
(124)

Recall that \(\epsilon _{0}<\frac{3}{4}\). \({\mathcal {H}}\ne 0\) means that the singular values \(\sigma _{j}(\overline{{\mathcal {H}}}^{(i)})\) are not all zero. Hence, \(I_{1}\) must be positive, i.e. \(I_{1}>0\). Then

$$\begin{aligned}&\textrm{Sep}({\mathcal {X}}_{0}+{\mathcal {H}},\Omega ,\vartheta )+\lambda \Vert {\mathcal {E}}_{0} -{\mathcal {H}}\Vert _{1}\nonumber \\&\quad >\textrm{Sep}({\mathcal {X}}_{0},\Omega ,\vartheta )+\lambda \Vert {\mathcal {E}}_{0}\Vert _{1}-\delta , \end{aligned}$$
(125)

which holds for any \({\mathcal {H}}\in \textrm{B}(0,\varepsilon )-\{0\}\) in the sense of the tensor nuclear norm.

The remaining Step 2 is similar to the dual certification step in Lu et al. (2020) and is omitted here. \(\square \)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Liu, Y., Du, B., Chen, Y. et al. Convex–Concave Tensor Robust Principal Component Analysis. Int J Comput Vis 132, 1721–1747 (2024). https://doi.org/10.1007/s11263-023-01960-1
