On Polynomial Time Methods for Exact Low-Rank Tensor Completion

Abstract

In this paper, we investigate the sample size requirement for exact recovery of a high-order tensor of low rank from a subset of its entries. We show that a gradient descent algorithm with initial value obtained from a spectral method can, in particular, reconstruct a \({d\times d\times d}\) tensor of multilinear ranks \((r,r,r)\) with high probability from as few as \(O(r^{7/2}d^{3/2}\log ^{7/2}d+r^7d\log ^6d)\) entries. In the case when the ranks \(r=O(1)\), our sample size requirement matches those for nuclear norm minimization (Yuan and Zhang in Found Comput Math 1031–1068, 2016), or alternating least squares assuming orthogonal decomposability (Jain and Oh in Advances in Neural Information Processing Systems, pp 1431–1439, 2014). Unlike these earlier approaches, however, our method is efficient to compute, is easy to implement, and does not impose extra structures on the tensor. Numerical results are presented to further demonstrate the merits of the proposed approach.

References

1. P. Absil, R. Mahony, and R. Sepulchre. Optimization Algorithms on Matrix Manifolds. Princeton University Press, 2008.
2. Animashree Anandkumar, Rong Ge, Daniel Hsu, Sham M Kakade, and Matus Telgarsky. Tensor decompositions for learning latent variable models. Journal of Machine Learning Research, 15(1):2773–2832, 2014.
3. Boaz Barak and Ankur Moitra. Noisy tensor completion via the sum-of-squares hierarchy. In 29th Annual Conference on Learning Theory, pages 417–445, 2016.
4. Olivier Bousquet. A Bennett concentration inequality and its application to suprema of empirical processes. Comptes Rendus Mathematique, 334(6):495–500, 2002.
5. Emmanuel J Candès and Benjamin Recht. Exact matrix completion via convex optimization. Foundations of Computational Mathematics, 9(6):717–772, 2009.
6. Emmanuel J Candès and Terence Tao. The power of convex relaxation: Near-optimal matrix completion. IEEE Transactions on Information Theory, 56(5):2053–2080, 2010.
7. S. Cohen and M. Collins. Tensor decomposition for fast parsing with latent-variable PCFGs. In Advances in Neural Information Processing Systems, 2012.
8. Victor de la Peña and Evarist Giné. Decoupling: from dependence to independence. Springer Science & Business Media, 1999.
9. Victor H de la Peña and Stephen J Montgomery-Smith. Decoupling inequalities for the tail probabilities of multivariate U-statistics. The Annals of Probability, pages 806–816, 1995.
10. Vin de Silva and Lek-Heng Lim. Tensor rank and the ill-posedness of the best low-rank approximation problem. SIAM Journal on Matrix Analysis and Applications, 30(3):1084–1127, 2008.
11. Alan Edelman, Tomás A Arias, and Steven T Smith. The geometry of algorithms with orthogonality constraints. SIAM Journal on Matrix Analysis and Applications, 20(2):303–353, 1998.
12. Lars Eldén and Berkant Savas. A Newton-Grassmann method for computing the best multilinear rank-(\(r_1,r_2,r_3\)) approximation of a tensor. SIAM Journal on Matrix Analysis and Applications, 31(2):248–271, 2009.
13. Silvia Gandy, Benjamin Recht, and Isao Yamada. Tensor completion and low-n-rank tensor recovery via convex optimization. Inverse Problems, 27(2):025010, 2011.
14. David Gross. Recovering low-rank matrices from few coefficients in any basis. IEEE Transactions on Information Theory, 57(3):1548–1566, 2011.
15. C. Hillar and Lek-Heng Lim. Most tensor problems are NP-hard. Journal of the ACM, 60(6):45, 2013.
16. Prateek Jain and Sewoong Oh. Provable tensor factorization with missing data. In Advances in Neural Information Processing Systems, pages 1431–1439, 2014.
17. Raghunandan H Keshavan, Sewoong Oh, and Andrea Montanari. Matrix completion from a few entries. In 2009 IEEE International Symposium on Information Theory, pages 324–328. IEEE, 2009.
18. Daniel Kressner, Michael Steinlechner, and Bart Vandereycken. Low-rank tensor completion by Riemannian optimization. BIT Numerical Mathematics, 54(2):447–468, 2014.
19. N. Li and B. Li. Tensor completion for on-board compression of hyperspectral images. In 17th IEEE International Conference on Image Processing (ICIP), pages 517–520, 2010.
20. Ji Liu, Przemyslaw Musialski, Peter Wonka, and Jieping Ye. Tensor completion for estimating missing values in visual data. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(1):208–220, 2013.
21. David G Luenberger and Yinyu Ye. Linear and nonlinear programming, volume 228. Springer, 2015.
22. Andrea Montanari and Nike Sun. Spectral algorithms for tensor completion. Communications on Pure and Applied Mathematics, 2016.
23. Cun Mu, Bo Huang, John Wright, and Donald Goldfarb. Square deal: Lower bounds and improved convex relaxations for tensor recovery. Journal of Machine Learning Research, 1:1–48, 2014.
24. Holger Rauhut and Željka Stojanac. Tensor theta norms and low rank recovery. arXiv preprint arXiv:1505.05175, 2015.
25. Holger Rauhut, Reinhold Schneider, and Željka Stojanac. Low rank tensor recovery via iterative hard thresholding. arXiv preprint arXiv:1602.05217, 2016.
26. Benjamin Recht. A simpler approach to matrix completion. Journal of Machine Learning Research, 12(Dec):3413–3430, 2011.
27. Berkant Savas and Lek-Heng Lim. Quasi-Newton methods on Grassmannians and multilinear approximations of tensors. SIAM Journal on Matrix Analysis and Applications, 32(6):3352–3393, 2010.
28. O. Semerci, N. Hao, M. Kilmer, and E. Miller. Tensor-based formulation and nuclear norm regularization for multienergy computed tomography. IEEE Transactions on Image Processing, 23:1678–1693, 2014.
29. N.D. Sidiropoulos and N. Nion. Tensor algebra and multi-dimensional harmonic retrieval in signal processing for MIMO radar. IEEE Transactions on Signal Processing, 58:5693–5705, 2010.
30. Ryota Tomioka, Kohei Hayashi, and Hisashi Kashima. Estimation of low-rank tensors via convex optimization. arXiv preprint arXiv:1010.0789, 2010.
31. Joel A Tropp. User-friendly tail bounds for sums of random matrices. Foundations of Computational Mathematics, 12(4):389–434, 2012.
32. Yi Yu, Tengyao Wang, and Richard J Samworth. A useful variant of the Davis–Kahan theorem for statisticians. Biometrika, 102(2):315–323, 2015.
33. Ming Yuan and Cun-Hui Zhang. On tensor completion via nuclear norm minimization. Foundations of Computational Mathematics, pages 1031–1068, 2016.
34. Ming Yuan and Cun-Hui Zhang. Incoherent tensor norms and their applications in higher order tensor completion. IEEE Transactions on Information Theory, 63(10):6753–6766, 2017.

Author information

Corresponding author

Correspondence to Ming Yuan.

Additional information

Ming Yuan’s research was supported in part by NSF Grant DMS-1721584.

Communicated by Thomas Strohmer.

Appendices

Proof of Lemma 1

The first claim is straightforward. It suffices to prove the second claim. Let \(\mathbf {A}=(\mathbf {U},\mathbf {V},\mathbf {W})\cdot \mathbf {C}\) with \(\mathbf {C}\in \mathbb {R}^{r_1(\mathbf {A})\times r_2(\mathbf {A})\times r_3(\mathbf {A})}\) being the core tensor. Clearly, \(\Vert \mathbf {A}\Vert _{\star }=\Vert \mathbf {C}\Vert _{\star }\) and \(\Vert \mathbf {A}\Vert _{\mathrm{F}}=\Vert \mathbf {C}\Vert _{\mathrm{F}}\). Denote by \(\mathbf {C}_1,\ldots , \mathbf {C}_{r_1(\mathbf {A})}\in \mathbb {R}^{r_2(\mathbf {A})\times r_3(\mathbf {A})}\) the mode-1 slices of \(\mathbf {C}\). By convexity of nuclear norm,

$$\begin{aligned} \Vert \mathbf {C}\Vert _{\star }\le \Vert \mathbf {C}_1\Vert _{\star }+\cdots +\Vert \mathbf {C}_{r_1(\mathbf {A})}\Vert _{\star }. \end{aligned}$$

As a result,

$$\begin{aligned} \Vert \mathbf {C}\Vert _{\star }^2\le & {} r_1(\mathbf {A})\big (\Vert \mathbf {C}_1\Vert _{\star }^2+\cdots +\Vert \mathbf {C}_{r_1(\mathbf {A})}\Vert _{\star }^2\big )\\\le & {} r_1(\mathbf {A})\big (r_2(\mathbf {A})\wedge r_3(\mathbf {A})\big )\big (\Vert \mathbf {C}_1\Vert _{\mathrm{F}}^2+\ldots +\Vert \mathbf {C}_{r_1(\mathbf {A})}\Vert _{\mathrm{F}}^2\big )\\= & {} r_1(\mathbf {A})\big (r_2(\mathbf {A})\wedge r_3(\mathbf {A})\big )\Vert \mathbf {C}\Vert _{\mathrm{F}}^2. \end{aligned}$$

Therefore,

$$\begin{aligned} \Vert \mathbf {C}\Vert _{\star }\le \sqrt{r_1(\mathbf {A})\min \{r_2(\mathbf {A}),r_3(\mathbf {A})\}}\Vert \mathbf {C}\Vert _{\mathrm{F}}. \end{aligned}$$

By the same process on mode-2 and mode-3 slices of \(\mathbf {C}\), we obtain

$$\begin{aligned} \Vert \mathbf {C}\Vert _{\star }\le \sqrt{r_2(\mathbf {A})\min \{r_1(\mathbf {A}), r_3(\mathbf {A})\}}\Vert \mathbf {C}\Vert _{\mathrm{F}}, \end{aligned}$$

and

$$\begin{aligned} \Vert \mathbf {C}\Vert _{\star }\le \sqrt{r_3(\mathbf {A})\min \{r_1(\mathbf {A}), r_2(\mathbf {A})\}}\Vert \mathbf {C}\Vert _{\mathrm{F}}, \end{aligned}$$

which concludes the proof.
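
The slice-wise chain of inequalities in this proof can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes NumPy, the helper name `slice_bound_chain` is hypothetical, and a random array stands in for the core tensor \(\mathbf {C}\). Since the tensor nuclear norm itself is not computed here, the sketch only evaluates the successive upper bounds, starting from the squared sum of the slice nuclear norms.

```python
import numpy as np

def slice_bound_chain(core):
    """Return the successive upper bounds on ||C||_*^2 used in the proof.

    core: array of shape (r1, r2, r3); core[i] is the i-th mode-1 slice.
    """
    r1, r2, r3 = core.shape
    # Matrix nuclear and squared Frobenius norms of each mode-1 slice.
    nuc = np.array([np.linalg.norm(core[i], ord="nuc") for i in range(r1)])
    fro_sq = np.array([np.linalg.norm(core[i], ord="fro") ** 2 for i in range(r1)])

    step0 = nuc.sum() ** 2                      # (sum_i ||C_i||_*)^2
    step1 = r1 * np.sum(nuc ** 2)               # Cauchy-Schwarz over the r1 slices
    step2 = r1 * min(r2, r3) * fro_sq.sum()     # ||M||_*^2 <= rank(M) ||M||_F^2
    step3 = r1 * min(r2, r3) * np.linalg.norm(core) ** 2  # slices partition C
    return step0, step1, step2, step3

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    print(slice_bound_chain(rng.standard_normal((3, 4, 5))))
```

The four returned values should be nondecreasing, with the last two equal up to rounding, mirroring the three displayed lines of the proof.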

Proof of Corollary 1

By the Davis–Kahan theorem (see, e.g., Theorem 2 of [32]),

$$\begin{aligned} d_{\mathrm{p}}\big ({\widehat{\mathbf {U}}},\mathbf {U}\big )\le \frac{2\sqrt{r_1}\Vert {\widehat{\mathbf {N}}}-\mathbf {M}\mathbf {M}^\top \Vert }{\sigma _{\min }(\mathbf {M}\mathbf {M}^\top )}. \end{aligned}$$

Choosing \(m_1=d_1, m_2=d_2d_3\) in Theorem 2 and noting that \(n\ge C_1(\alpha +1)(d_1d_2d_3)^{1/2}\), we obtain

$$\begin{aligned}&\Vert {\widehat{\mathbf {N}}}-\mathbf {M}\mathbf {M}^\top \Vert \le C\alpha ^2 {(d_1d_2d_3)^{3/2}\log d\over n} \\&\quad \times \, \left[ \left( 1+\frac{d_1}{d_2d_3}\right) ^{1/2}+\left( \frac{n}{d_2d_3\log d}\right) ^{1/2}\right] \Vert \mathbf {M}\Vert _{\max }^2 \end{aligned}$$

with probability at least \(1-d^{-\alpha }\). It suffices to control \(\Vert \mathbf {M}\Vert _{\max }\). Recall that \(\mu (\mathbf {T})\le \mu _0\); then,

$$\begin{aligned} \Vert \mathbf {M}\Vert _{\max }=\Vert \mathbf {T}\Vert _{\max }\le \Vert \mathbf {T}\Vert \mu _0^{3/2}\left( r_1r_2r_3\over d_1d_2d_3\right) ^{1/2}. \end{aligned}$$

It is clear by definition that

$$\begin{aligned} {\Vert \mathbf {T}\Vert ^2}/{\sigma _{\min }(\mathbf {M}\mathbf {M}^\top )}\le \kappa ^2(\mathbf {T})\le \kappa _0^2. \end{aligned}$$

As a result, the following bound holds with probability at least \(1-d^{-\alpha }\),

$$\begin{aligned} d_{\mathrm{p}}\big ({\widehat{\mathbf {U}}},\mathbf {U}\big )&\le 2C\alpha ^2\mu _0^3\kappa _0^2r_1^{3/2}r_2r_3{(d_1d_2d_3)^{1/2}\log d\over n} \\&\quad \times \, \left[ \left( 1+\frac{d_1}{d_2d_3}\right) ^{1/2}+\left( \frac{n}{d_2d_3\log d}\right) ^{1/2}\right] \\&\le 2C\alpha ^2\mu _0^3\kappa _0^2r_1^{3/2}r_2r_3\left[ \frac{(d_1d_2d_3)^{1/2}\log d}{n}+\frac{d_1\log d}{n}+\left( \frac{d_1\log d}{n}\right) ^{1/2}\right] . \end{aligned}$$

The claim then follows.
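
For the last inequality in the bound on \(d_{\mathrm{p}}\big ({\widehat{\mathbf {U}}},\mathbf {U}\big )\), a short expansion may be helpful. It relies only on the elementary bound \((1+x)^{1/2}\le 1+x^{1/2}\) for \(x\ge 0\), applied with \(x=d_1/(d_2d_3)\):

$$\begin{aligned} \frac{(d_1d_2d_3)^{1/2}\log d}{n}\left( 1+\frac{d_1}{d_2d_3}\right) ^{1/2}&\le \frac{(d_1d_2d_3)^{1/2}\log d}{n}+\frac{d_1\log d}{n},\\ \frac{(d_1d_2d_3)^{1/2}\log d}{n}\left( \frac{n}{d_2d_3\log d}\right) ^{1/2}&=\left( \frac{d_1\log d}{n}\right) ^{1/2}. \end{aligned}$$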

Proof of Lemma 2

For simplicity, define a random tensor \(\mathbf {E}\in \{0,1\}^{d_1\times d_2\times d_3}\) based on \(\omega \in [d_1]\times [d_2]\times [d_3]\) such that \(\mathbf {E}(\omega )=1\) and all the other entries are 0s. Let \(\mathbf {E}_1,\ldots ,\mathbf {E}_n\) be i.i.d. copies of \(\mathbf {E}\). Equivalently, we write

$$\begin{aligned} \beta _n(\gamma _1,\gamma _2)=\underset{\mathbf {A}\in \mathcal{K}(\gamma _1,\gamma _2)}{\sup }\Big |\frac{1}{n}\sum _{i=1}^n\langle \mathbf {A},\mathbf {E}_i\rangle ^2-\mathbb {E}\langle \mathbf {A},\mathbf {E}\rangle ^2\Big | \end{aligned}$$

which is the supremum of an empirical process indexed by \(\mathcal {K}(\gamma _1,\gamma _2)\). Define \(\delta _{1,j}=2^j\delta _1^-\) for \(j=0,1,2,\ldots ,\lfloor \log \frac{\delta _1^+}{\delta _1^-}\rfloor \) and \(\delta _{2,k}=2^k\delta _2^-\) for \(k=0,1,2,\ldots , \lfloor \log \frac{\delta _2^+}{\delta _2^-}\rfloor \). For each pair \((j,k)\), we derive an upper bound on \(\beta _n(\gamma _1,\gamma _2)\) with \(\gamma _1\in [\delta _{1,j}, \delta _{1,j+1}]\) and \(\gamma _2\in [\delta _{2,k},\delta _{2,k+1}]\). By a union bound, the bound can then be made to hold uniformly for \(\gamma _1\in [\delta _1^-, \delta _1^+]\) and \(\gamma _2\in [\delta _{2}^-, \delta _2^+]\).

Consider \(\gamma _1\in [\delta _{1,j}, \delta _{1,j+1}]\), \(\gamma _2\in [\delta _{2,k},\delta _{2,k+1}]\), and observe that

$$\begin{aligned} \underset{\mathbf {A}\in \mathcal {K}(\gamma _1,\gamma _2)}{\sup }\big |\langle \mathbf {A},\mathbf {E}\rangle ^2-\mathbb {E}\langle \mathbf {A},\mathbf {E}\rangle ^2\big |\le \gamma _1^2. \end{aligned}$$

Moreover,

$$\begin{aligned} \underset{{\mathbf {A}}\in \mathcal {K}(\gamma _1,\gamma _2)}{\sup }\mathrm{Var}\big (\langle \mathbf {A},\mathbf {E}\rangle ^2\big )\le \underset{{\mathbf {A}}\in \mathcal {K}(\gamma _1,\gamma _2)}{\sup }\mathbb {E}\langle \mathbf {A},\mathbf {E}\rangle ^4\le \frac{\gamma _1^2\Vert \mathbf {A}\Vert _{\mathrm{F}}^2}{d_1d_2d_3}\le \frac{\gamma _1^2}{d_1d_2d_3}. \end{aligned}$$

Applying Bousquet’s version of Talagrand’s concentration inequality [4], for any \(t\ge 0\), with probability at least \(1-e^{-t}\),

$$\begin{aligned} \beta _n(\gamma _1,\gamma _2)\le 2\mathbb {E}\beta _n(\gamma _1,\gamma _2)+2\gamma _1\sqrt{\frac{t}{nd_1d_2d_3}}+2\gamma _1^2\frac{t}{n}. \end{aligned}$$

By the symmetrization inequality,

$$\begin{aligned} \mathbb {E}\beta _n(\gamma _1,\gamma _2)\le 2\mathbb {E}\underset{\mathbf {A}\in \mathcal {K}(\gamma _1,\gamma _2)}{\sup }\Big |\frac{1}{n}\sum _{i=1}^n\varepsilon _{i}\langle \mathbf {A},\mathbf {E}_i\rangle ^2\Big |, \end{aligned}$$

where \(\varepsilon _1,\ldots ,\varepsilon _n\) are i.i.d. Rademacher random variables. Since \(|\langle \mathbf {A},\mathbf {E}\rangle |\le \gamma _1\), by the contraction inequality,

$$\begin{aligned} \mathbb {E}\beta _n(\gamma _1,\gamma _2)\le 4\gamma _1\mathbb {E}\underset{\mathbf {A}\in \mathcal {K}(\gamma _1,\gamma _2)}{\sup }\Big |\frac{1}{n}\sum _{i=1}^n\varepsilon _{i}\langle \mathbf {A},\mathbf {E}_i\rangle \Big |. \end{aligned}$$

Denote \({{\varvec{\Gamma }}}=n^{-1}\sum _{i=1}^n\varepsilon _i\mathbf {E}_i\in \mathbb {R}^{d_1\times d_2\times d_3}\). Then,

$$\begin{aligned} \mathbb {E}\underset{\mathbf {A}\in \mathcal {K}(\gamma _1,\gamma _2)}{\sup }\Big |\frac{1}{n}\sum _{i=1}^n\varepsilon _{i}\langle \mathbf {A},\mathbf {E}_i\rangle \Big |\le \mathbb {E}\underset{\mathbf {A}\in \mathcal {K}(\gamma _1,\gamma _2)}{\sup }\Vert {{\varvec{\Gamma }}}\Vert \Vert \mathbf {A}\Vert _{\star }\le \gamma _2\mathbb {E}\Vert {{\varvec{\Gamma }}}\Vert . \end{aligned}$$

It is not difficult to show that

$$\begin{aligned} \mathbb {E}\Vert \varvec{\Gamma }\Vert \le C\Big (\sqrt{\frac{d}{nd_1d_2d_3}}\log d+\frac{\log ^{3/2}d}{n}\Big ). \end{aligned}$$

See, e.g., Lemma 8 of Yuan and Zhang [33]. The above bound holds as long as

$$\begin{aligned} n\ge C\Big \{\mu _0(r_1r_2r_3d_1d_2d_3)^{1/2}\log ^{3/2}d+\mu _0^2r_1r_2r_3d\log ^2d\Big \}. \end{aligned}$$

As a result, with probability at least \(1-e^{-t}\),

$$\begin{aligned} \beta _n(\gamma _1,\gamma _2)\le C\gamma _1\gamma _2\Big (\sqrt{\frac{d}{nd_1d_2d_3}}\log d+\frac{\log ^{3/2}d}{n}\Big )+2\gamma _1\sqrt{\frac{t}{nd_1d_2d_3}}+2\gamma _1^2\frac{t}{n} \end{aligned}$$

for \(\gamma _1\in [\delta _{1,j}, \delta _{1,j+1}]\) and \(\gamma _2\in [\delta _{2,k},\delta _{2,k+1}]\). Now consider all the combinations of j and k; the upper bound can be made uniform over all j and k by adjusting t to \({{\bar{t}}}\) and C to 2C.
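
To make the empirical process in this proof concrete, the sketch below evaluates the quantity inside the supremum for a single fixed tensor \(\mathbf {A}\). It is an illustration only: it assumes NumPy and that the index \(\omega \) is drawn uniformly from \([d_1]\times [d_2]\times [d_3]\) with replacement, so that \(\mathbb {E}\langle \mathbf {A},\mathbf {E}\rangle ^2=\Vert \mathbf {A}\Vert _{\mathrm{F}}^2/(d_1d_2d_3)\). The function name `empirical_deviation` is hypothetical.

```python
import numpy as np

def empirical_deviation(A, n, rng):
    """|(1/n) sum_i <A, E_i>^2 - E<A, E>^2| for one fixed tensor A.

    Assumption: each E_i has a single entry equal to 1 at an index drawn
    uniformly at random (with replacement) from the index set of A.
    """
    flat = A.ravel()
    # <A, E_i> is simply the entry of A at the sampled index omega_i.
    omega = rng.integers(0, flat.size, size=n)
    empirical = np.mean(flat[omega] ** 2)
    # Under uniform sampling, E<A, E>^2 = ||A||_F^2 / (d1 * d2 * d3).
    expected = np.sum(flat ** 2) / flat.size
    return abs(empirical - expected)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.uniform(-1.0, 1.0, size=(4, 5, 6))
    for n in (10, 100, 1000, 10000):
        print(n, empirical_deviation(A, n, rng))
```

The quantity \(\beta _n(\gamma _1,\gamma _2)\) is the supremum of this deviation over the whole class \(\mathcal {K}(\gamma _1,\gamma _2)\), which the sketch does not attempt to compute.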

Proof of lower bound of \(\langle \mathbf {Q}_\mathbf {T}({\widehat{\mathbf {T}}}-\mathbf {T}), \mathbf {H}_1\rangle \)

Recall that

$$\begin{aligned}&\langle \mathbf {Q}_\mathbf {T}({\widehat{\mathbf {T}}}-\mathbf {T}), \mathbf {H}_1\rangle = \Big <(\mathbf {U},\mathbf {V},\mathbf {W})\cdot (\mathbf {C}-\mathbf {G})+(\varvec{\Delta }_\mathbf {X},\mathbf {V},\mathbf {W})\cdot \mathbf {C}+(\mathbf {U},\varvec{\Delta }_\mathbf {Y},\mathbf {W})\cdot \mathbf {C}\\&\quad +(\mathbf {U},\mathbf {V},\varvec{\Delta }_\mathbf {Z})\cdot \mathbf {C}, (\mathbf {D}_\mathbf {X},\mathbf {V},\mathbf {W})\cdot \mathbf {C}+(\mathbf {U},\mathbf {D}_\mathbf {Y},\mathbf {W})\cdot \mathbf {C}+(\mathbf {U},\mathbf {V},\mathbf {D}_\mathbf {Z})\cdot \mathbf {C}\Big >. \end{aligned}$$

Clearly, the right-hand side can be written as \(\zeta _1+\zeta _2+\zeta _3\) where

$$\begin{aligned} \zeta _1&=\Vert (\varvec{\Delta }_\mathbf {X},\mathbf {V},\mathbf {W})\cdot \mathbf {C}+(\mathbf {U},\varvec{\Delta }_\mathbf {Y},\mathbf {W})\cdot \mathbf {C}+(\mathbf {U},\mathbf {V},\varvec{\Delta }_\mathbf {Z})\cdot \mathbf {C}\Vert _{\mathrm{F}}^2\\ \zeta _2&=\Big<(\mathbf {U},\mathbf {V},\mathbf {W})\cdot (\mathbf {C}-\mathbf {G}), (\mathbf {D}_\mathbf {X},\mathbf {V},\mathbf {W})\cdot \mathbf {C}+(\mathbf {U},\mathbf {D}_\mathbf {Y},\mathbf {W})\cdot \mathbf {C}+(\mathbf {U},\mathbf {V},\mathbf {D}_\mathbf {Z})\cdot \mathbf {C}\Big>\\ \zeta _3&=\Big <(\varvec{\Delta }_\mathbf {X},\mathbf {V},\mathbf {W})\cdot \mathbf {C}+(\mathbf {U},\varvec{\Delta }_\mathbf {Y},\mathbf {W})\cdot \mathbf {C}+(\mathbf {U},\mathbf {V},\varvec{\Delta }_\mathbf {Z})\cdot \mathbf {C},(\mathbf {D}_\mathbf {X}-\varvec{\Delta }_{\mathbf {X}},\mathbf {V},\mathbf {W})\cdot \mathbf {C}\\&\quad +\,(\mathbf {U},\mathbf {D}_\mathbf {Y}-\varvec{\Delta }_{\mathbf {Y}},\mathbf {W})\cdot \mathbf {C}+(\mathbf {U},\mathbf {V},\mathbf {D}_\mathbf {Z}-\varvec{\Delta }_{\mathbf {Z}})\cdot \mathbf {C}\Big >. \end{aligned}$$

Clearly,

$$\begin{aligned} \zeta _1&\ge \Vert (\varvec{\Delta }_\mathbf {X},\mathbf {V},\mathbf {W})\cdot \mathbf {C}\Vert _{\mathrm{F}}^2+\Vert (\mathbf {U},\varvec{\Delta }_\mathbf {Y},\mathbf {W})\cdot \mathbf {C}\Vert _{\mathrm{F}}^2+\Vert (\mathbf {U},\mathbf {V},\varvec{\Delta }_\mathbf {Z})\cdot \mathbf {C}\Vert _{\mathrm{F}}^2\\&\quad -\,2\Lambda _{\max }^2(\mathbf {C})\Big (\Vert \mathbf {U}^\top \varvec{\Delta }_\mathbf {X}\Vert _{\mathrm{F}}\Vert \mathbf {V}^\top \varvec{\Delta }_\mathbf {Y}\Vert _{\mathrm{F}}+\Vert \mathbf {U}^\top \varvec{\Delta }_\mathbf {X}\Vert _{\mathrm{F}}\Vert \mathbf {W}^\top \varvec{\Delta }_\mathbf {Z}\Vert _{\mathrm{F}}+\Vert \mathbf {V}^\top \varvec{\Delta }_\mathbf {Y}\Vert _{\mathrm{F}}\Vert \mathbf {W}^\top \varvec{\Delta }_\mathbf {Z}\Vert _{\mathrm{F}}\Big )\\&\ge \Lambda _{\min }^2(\mathbf {C})\Big (\Vert \varvec{\Delta }_\mathbf {X}\Vert _{\mathrm{F}}^2+\Vert \varvec{\Delta }_\mathbf {Y}\Vert _{\mathrm{F}}^2+\Vert \varvec{\Delta }_\mathbf {Z}\Vert _{\mathrm{F}}^2\Big )-8\Lambda _{\max }^2(\mathbf {C})d_{\mathrm{p}}^4\big ((\mathbf {U},\mathbf {V},\mathbf {W}),(\mathbf {X},\mathbf {Y},\mathbf {Z})\big ) \end{aligned}$$

where we used the fact that

$$\begin{aligned} \Vert \mathbf {U}^\top \varvec{\Delta }_\mathbf {X}\Vert _{\mathrm{F}}\le 2d_{\mathrm{p}}^2(\mathbf {U},\mathbf {X}). \end{aligned}$$

Recall from (23) that on the event \(\mathcal{E}_1\cap \mathcal{E}_2\cap \mathcal{E}_3\), we have

$$\begin{aligned} \frac{\Lambda _{\min }}{2}\le \Lambda _{\min }(\mathbf {C})\le \Lambda _{\max }(\mathbf {C})\le 2\Lambda _{\max }. \end{aligned}$$

Then

$$\begin{aligned} \zeta _1\ge \frac{1}{12}\Lambda _{\min }^2 d_{\mathrm{p}}^2\big ((\mathbf {U},\mathbf {V},\mathbf {W}),(\mathbf {X},\mathbf {Y},\mathbf {Z})\big )-32\Lambda _{\max }^2 d_{\mathrm{p}}^4\big ((\mathbf {U},\mathbf {V},\mathbf {W}),(\mathbf {X},\mathbf {Y},\mathbf {Z})\big ). \end{aligned}$$

It also implies that on the event \(\mathcal{E}_1\cap \mathcal{E}_2\cap \mathcal{E}_3\),

$$\begin{aligned} \zeta _1\ge \frac{1}{2}\Big (\Vert (\varvec{\Delta }_\mathbf {X},\mathbf {V},\mathbf {W})\cdot \mathbf {C}\Vert _{\mathrm{F}}^2+\Vert (\mathbf {U},\varvec{\Delta }_\mathbf {Y},\mathbf {W})\cdot \mathbf {C}\Vert _{\mathrm{F}}^2+\Vert (\mathbf {U},\mathbf {V},\varvec{\Delta }_\mathbf {Z})\cdot \mathbf {C}\Vert _{\mathrm{F}}^2\Big ). \end{aligned}$$
(28)

We can control \(|\zeta _3|\) in the same fashion. Indeed,

$$\begin{aligned} |\zeta _3|^2&\le |\zeta _1| \Lambda _{\max }^2(\mathbf {C})(\Vert \mathbf {D}_\mathbf {X}-\varvec{\Delta }_\mathbf {X}\Vert _{\mathrm{F}}^2+\Vert \mathbf {D}_\mathbf {Y}-\varvec{\Delta }_\mathbf {Y}\Vert _{\mathrm{F}}^2+\Vert \mathbf {D}_\mathbf {Z}-\varvec{\Delta }_\mathbf {Z}\Vert _{\mathrm{F}}^2)\\&\le 4|\zeta _1|\Lambda _{\max }^2 d_{\mathrm{p}}^4\big ((\mathbf {U},\mathbf {V},\mathbf {W}),(\mathbf {X},\mathbf {Y},\mathbf {Z})\big ). \\ \end{aligned}$$

If

$$\begin{aligned} d_{\mathrm{p}}\big ((\mathbf {U},\mathbf {V},\mathbf {W}),(\mathbf {X},\mathbf {Y},\mathbf {Z})\big )\le (C\alpha \kappa _0\log d)^{-1} \end{aligned}$$

for large \(C>0\), then under the event \(\mathcal{E}_1\cap \mathcal{E}_2\cap \mathcal{E}_3\),

$$\begin{aligned} \zeta _1\ge \frac{1}{16}\Lambda _{\min }^2d_{\mathrm{p}}^2\big ((\mathbf {U},\mathbf {V},\mathbf {W}),(\mathbf {X},\mathbf {Y},\mathbf {Z})\big ) \quad \mathrm{and}\quad |\zeta _3|\le \frac{\zeta _1}{4} \end{aligned}$$

To control \(\zeta _2\), recall that \(\mathbf {X}^\top \mathbf {D}_\mathbf {X}=\mathbf{0}, \mathbf {Y}^\top \mathbf {D}_\mathbf {Y}=\mathbf{0}\) and \(\mathbf {Z}^\top \mathbf {D}_\mathbf {Z}=\mathbf{0}\). Then,

$$\begin{aligned} |\zeta _2|&\le |\langle (\varvec{\Delta }_\mathbf {X},\mathbf {V},\mathbf {W})\cdot (\mathbf {C}-\mathbf {G}), (\mathbf {D}_\mathbf {X},\mathbf {V},\mathbf {W})\cdot \mathbf {C}\rangle |\\&\quad +\, |\langle (\mathbf {U},\varvec{\Delta }_\mathbf {Y},\mathbf {W})\cdot (\mathbf {C}-\mathbf {G}), (\mathbf {U},\mathbf {D}_\mathbf {Y},\mathbf {W})\cdot \mathbf {C}\rangle |\\&\quad +\, |\langle (\mathbf {U},\mathbf {V},\varvec{\Delta }_\mathbf {Z})\cdot (\mathbf {C}-\mathbf {G}), (\mathbf {U},\mathbf {V},\mathbf {D}_\mathbf {Z})\cdot \mathbf {C}\rangle |\\&\le 2\Vert \mathbf {C}-\mathbf {G}\Vert _{\mathrm{F}}\bigg \{\Big (\Vert (\varvec{\Delta }_\mathbf {X},\mathbf {V},\mathbf {W})\cdot \mathbf {C}\Vert _{\mathrm{F}}+\Vert (\mathbf {U},\varvec{\Delta }_\mathbf {Y},\mathbf {W})\cdot \mathbf {C}\Vert _{\mathrm{F}}+\Vert (\mathbf {U},\mathbf {V},\varvec{\Delta }_\mathbf {Z})\cdot \mathbf {C}\Vert _{\mathrm{F}}\Big )\\&\quad +\,\Lambda _{\max }(\mathbf {C})\Big (\Vert \mathbf {D}_\mathbf {X}-\varvec{\Delta }_\mathbf {X}\Vert _{\mathrm{F}}+\Vert \mathbf {D}_\mathbf {Y}-\varvec{\Delta }_\mathbf {Y}\Vert _{\mathrm{F}}+\Vert \mathbf {D}_\mathbf {Z}-\varvec{\Delta }_\mathbf {Z}\Vert _{\mathrm{F}}\Big )\bigg \}d_{\mathrm{p}}\big ((\mathbf {U},\mathbf {V},\mathbf {W}),(\mathbf {X},\mathbf {Y},\mathbf {Z})\big )\\&\le 2\Vert \mathbf {G}-\mathbf {C}\Vert _{\mathrm{F}}\sqrt{\zeta _1}d_{\mathrm{p}}\big ((\mathbf {U},\mathbf {V},\mathbf {W}),(\mathbf {X},\mathbf {Y},\mathbf {Z})\big )\\&\qquad +\,4\Vert \mathbf {C}-\mathbf {G}\Vert _{\mathrm{F}}\Lambda _{\max } d_{\mathrm{p}}^3\big ((\mathbf {U},\mathbf {V},\mathbf {W}),(\mathbf {X},\mathbf {Y},\mathbf {Z})\big ). \end{aligned}$$

Recall from (19) that under the event \(\mathcal{E}_1\cap \mathcal{E}_2\cap \mathcal{E}_3\),

$$\begin{aligned} \Vert \mathbf {G}-\mathbf {C}\Vert _{\mathrm{F}}\le C\Lambda _{\max }(\alpha \log d)^{1/2}d_{\mathrm{p}}\big ((\mathbf {U},\mathbf {V},\mathbf {W}),(\mathbf {X},\mathbf {Y},\mathbf {Z})\big ). \end{aligned}$$

Therefore, \(|\zeta _2|\le \zeta _1/2\) in view of the lower bound of \(\zeta _1\). In summary, under the event \(\mathcal{E}_1\cap \mathcal{E}_2\cap \mathcal{E}_3\),

$$\begin{aligned} \langle \mathbf {Q}_\mathbf {T}({\widehat{\mathbf {T}}}-\mathbf {T}), \mathbf {H}_1\rangle \ge \frac{1}{4}\zeta _1\ge \frac{1}{64}\Lambda _{\min }^2 d_{\mathrm{p}}^2\big ((\mathbf {U},\mathbf {V},\mathbf {W}),(\mathbf {X},\mathbf {Y},\mathbf {Z})\big ). \end{aligned}$$
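
The constants in this final bound follow by combining the three estimates established in this proof, namely \(|\zeta _2|\le \zeta _1/2\), \(|\zeta _3|\le \zeta _1/4\) and \(\zeta _1\ge \frac{1}{16}\Lambda _{\min }^2d_{\mathrm{p}}^2\big ((\mathbf {U},\mathbf {V},\mathbf {W}),(\mathbf {X},\mathbf {Y},\mathbf {Z})\big )\). Written out as a worked step:

$$\begin{aligned} \zeta _1+\zeta _2+\zeta _3\ge \zeta _1-|\zeta _2|-|\zeta _3|\ge \zeta _1-\frac{\zeta _1}{2}-\frac{\zeta _1}{4}=\frac{\zeta _1}{4}\ge \frac{1}{4}\cdot \frac{1}{16}\Lambda _{\min }^2 d_{\mathrm{p}}^2\big ((\mathbf {U},\mathbf {V},\mathbf {W}),(\mathbf {X},\mathbf {Y},\mathbf {Z})\big ). \end{aligned}$$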

Upper bound of \(\Vert \mathbf {H}_2\Vert _{\mathrm{F}}\)

It is shown in (28) that if \(d_{\mathrm{p}}\big ((\mathbf {U},\mathbf {V},\mathbf {W}),(\mathbf {X},\mathbf {Y},\mathbf {Z})\big )\le (C\alpha \kappa _0\log d)^{-1}\), then

$$\begin{aligned} \zeta _1\ge \frac{1}{2}\Big (\Vert (\varvec{\Delta }_\mathbf {X},\mathbf {V},\mathbf {W})\cdot \mathbf {C}\Vert _{\mathrm{F}}^2+\Vert (\mathbf {U},\varvec{\Delta }_\mathbf {Y},\mathbf {W})\cdot \mathbf {C}\Vert _{\mathrm{F}}^2+\Vert (\mathbf {U},\mathbf {V},\varvec{\Delta }_\mathbf {Z})\cdot \mathbf {C}\Vert _{\mathrm{F}}^2\Big ). \end{aligned}$$

Observe that

$$\begin{aligned} \Vert (\varvec{\Delta }_\mathbf {X},\mathbf {V},\mathbf {W})\cdot \mathbf {C}\Vert _{\mathrm{F}}=\Vert \mathcal{M}_2(\mathbf {C})(\varvec{\Delta }_\mathbf {X}\otimes \mathbf {W})\Vert _{\mathrm{F}}=\Vert \mathcal{M}_3(\mathbf {C})(\varvec{\Delta }_\mathbf {X}\otimes \mathbf {V})\Vert _{\mathrm{F}} \end{aligned}$$

which implies that

$$\begin{aligned} \zeta _1\ge \frac{1}{6}\Big (\Vert \mathcal{M}_2(\mathbf {C})(\varvec{\Delta }_\mathbf {X}\otimes \mathbf {W})\Vert _{\mathrm{F}}+\Vert \mathcal{M}_3(\mathbf {C})(\mathbf {U}\otimes \varvec{\Delta }_\mathbf {Y})\Vert _{\mathrm{F}}+\Vert \mathcal{M}_1(\mathbf {C})( \mathbf {V}\otimes \varvec{\Delta }_\mathbf {Z})\Vert _{\mathrm{F}}\Big )^2 \end{aligned}$$
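
The factor \(1/6\) arises from combining the factor \(1/2\) in (28) with the Cauchy–Schwarz inequality for three nonnegative numbers. Writing \(a, b, c\) for the three Frobenius norms on the right-hand side (shorthand introduced here only for illustration), the step reads:

$$\begin{aligned} (a+b+c)^2\le 3\big (a^2+b^2+c^2\big )\quad \Longrightarrow \quad \frac{1}{2}\big (a^2+b^2+c^2\big )\ge \frac{1}{6}(a+b+c)^2. \end{aligned}$$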

By definition of \(\mathbf {H}_2\), we obtain

$$\begin{aligned} \Vert \mathbf {H}_2\Vert _{\mathrm{F}}&\le \Vert \mathcal{M}_1(\mathbf {C})(\varvec{\Delta }_\mathbf {Y}\otimes \mathbf {W})\Vert _{\mathrm{F}}\Vert \mathbf {D}_\mathbf {X}\Vert _{\mathrm{F}}+\Vert \mathcal{M}_1(\mathbf {C})(\mathbf {V}\otimes \varvec{\Delta }_\mathbf {Z})\Vert _{\mathrm{F}}\Vert \mathbf {D}_\mathbf {X}\Vert _{\mathrm{F}}\\&\quad + \,\Vert \mathcal{M}_2(\mathbf {C})(\varvec{\Delta }_\mathbf {X}\otimes \mathbf {W})\Vert _{\mathrm{F}}\Vert \mathbf {D}_\mathbf {Y}\Vert _{\mathrm{F}}+\Vert \mathcal{M}_2(\mathbf {C})(\mathbf {U}\otimes \varvec{\Delta }_\mathbf {Z})\Vert _{\mathrm{F}}\Vert \mathbf {D}_\mathbf {Y}\Vert _{\mathrm{F}}\\&\quad +\,\Vert \mathcal{M}_3(\mathbf {C})(\varvec{\Delta }_\mathbf {X}\otimes \mathbf {V})\Vert _{\mathrm{F}}\Vert \mathbf {D}_\mathbf {Z}\Vert _{\mathrm{F}}+\Vert \mathcal{M}_3(\mathbf {C})(\mathbf {U}\otimes \varvec{\Delta }_\mathbf {Y})\Vert _{\mathrm{F}}\Vert \mathbf {D}_\mathbf {Z}\Vert _{\mathrm{F}}\\&\quad +\,24\Lambda _{\max }d_{\mathrm{p}}^3\big ((\mathbf {U},\mathbf {V},\mathbf {W}),(\mathbf {X},\mathbf {Y},\mathbf {Z})\big ) \end{aligned}$$

where we used the fact \(\Lambda _{\max }(\mathbf {C})\le 2\Lambda _{\max }\) from (23). Clearly,

$$\begin{aligned} \Vert \mathbf {H}_2\Vert _{\mathrm{F}}&\le 2\sqrt{6\zeta _1}\big (\Vert \mathbf {D}_\mathbf {X}\Vert _{\mathrm{F}}+\Vert \mathbf {D}_\mathbf {Y}\Vert _{\mathrm{F}}+\Vert \mathbf {D}_\mathbf {Z}\Vert _{\mathrm{F}}\big )+24\Lambda _{\max }d_{\mathrm{p}}^3\big ((\mathbf {U},\mathbf {V},\mathbf {W}),(\mathbf {X},\mathbf {Y},\mathbf {Z})\big )\\&\le 4\sqrt{6\zeta _1}d_{\mathrm{p}}\big ((\mathbf {U},\mathbf {V},\mathbf {W}),(\mathbf {X},\mathbf {Y},\mathbf {Z})\big )+24\Lambda _{\max }d_{\mathrm{p}}^3\big ((\mathbf {U},\mathbf {V},\mathbf {W}),(\mathbf {X},\mathbf {Y},\mathbf {Z})\big ). \end{aligned}$$

About this article

Cite this article

Xia, D., Yuan, M. On Polynomial Time Methods for Exact Low-Rank Tensor Completion. Found Comput Math 19, 1265–1313 (2019). https://doi.org/10.1007/s10208-018-09408-6

Keywords

  • Concentration inequality
  • Matrix completion
  • Nonconvex optimization
  • Polynomial time complexity
  • Tensor completion
  • Tensor rank
  • U-statistics

Mathematics Subject Classification

  • Primary 90C25
  • Secondary 90C59
  • 15A52