Reshaped tensor nuclear norms for higher order tensor completion

Abstract

We investigate optimal conditions for inducing low-rankness of higher order tensors by using convex tensor norms with reshaped tensors. We propose the reshaped tensor nuclear norm as a generalized approach to reshape tensors before regularizing them with the tensor nuclear norm. Furthermore, we propose the reshaped latent tensor nuclear norm to combine multiple reshaped tensors using the tensor nuclear norm. We analyze the generalization bounds for tensor completion models regularized by the proposed norms and show that the novel reshaping norms lead to lower Rademacher complexities. Through simulation and real-data experiments, we show that our proposed methods compare favorably with existing tensor norms, consolidating our theoretical claims.

References

  1. Berclaz, J., Fleuret, F., Turetken, E., & Fua, P. (2011). Multiple object tracking using K-shortest paths optimization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33, 1806–1819.

  2. Carroll, J. D., & Chang, J.-J. (1970). Analysis of individual differences in multidimensional scaling via an N-way generalization of “Eckart-Young” decomposition. Psychometrika, 35(3), 283–319.

  3. El-Yaniv, R., & Pechyony, D. (2007). Transductive Rademacher complexity and its applications. Learning Theory, 4539, 157–171.

  4. Fazel, M., Hindi, H., & Boyd, S. P. (2001). A rank minimization heuristic with application to minimum order system approximation. In Proceedings of the 2001 American Control Conference. (Cat. No.01CH37148) (Vol. 6, pp. 4734–4739).

  5. Guo, X., Yao, Q., & Kwok, J. T. (2017). Efficient sparse low-rank tensor completion using the Frank-Wolfe algorithm. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, February 4–9, 2017, San Francisco, California, USA. (pp. 1948–1954).

  6. Hackbusch, W. (2012). Tensor spaces and numerical tensor calculus. Berlin Heidelberg: Springer Series in Computational Mathematics. Springer. ISBN 9783642280276.

  7. Harshman, R. A. (1970). Foundations of the PARAFAC procedure: Models and conditions for an explanatory multimodal factor analysis. UCLA Working Papers in Phonetics, 16, 1–84.

  8. Hillar, C. J., & Lim, L.-H. (2013). Most tensor problems are NP-hard. Journal of the ACM, 60(6). ISSN 0004-5411.

  9. Hitchcock, F. L. (1927). The expression of a tensor or a polyadic as a sum of products. Journal of Mathematics and Physics, 6(1), 164–189.

  10. Imaizumi, M., Maehara, T., & Hayashi, K. (2017). On tensor train rank minimization: Statistical efficiency and scalable algorithm. In NIPS, pp. 3933–3942.

  11. Karatzoglou, A., Amatriain, X., Baltrunas, L., & Oliver, N. (2010). Multiverse recommendation: N-dimensional tensor factorization for context-aware collaborative filtering. In RecSys (pp. 79–86). ACM.

  12. Kolda, T. G., & Bader, B. W. (2009). Tensor decompositions and applications. SIAM Review, 51(3), 455–500.

  13. Latała, R. (2005). Some estimates of norms of random matrices. Proceedings of the American Mathematical Society, 133(5), 1273–1282.

  14. Lim, L., & Comon, P. (2014). Blind multilinear identification. IEEE Transactions on Information Theory, 60(2), 1260–1280.

  15. Liu, J., Musialski, P., Wonka, P., & Ye, J. (2009). Tensor completion for estimating missing values in visual data. In ICCV (pp. 2114–2121).

  16. Mu, C., Huang, B., Wright, J., & Goldfarb, D. (2014). Square deal: Lower bounds and improved relaxations for tensor recovery. In ICML (pp. 73–81).

  17. Oseledets, I. V. (2011). Tensor-train decomposition. SIAM Journal on Scientific Computing, 33(5), 2295–2317, ISSN 1064-8275.

  18. Rai, P., Hu, C., Harding, M., & Carin, L. (2015). Scalable probabilistic tensor factorization for binary and count data. IJCAI’15, pp. 3770–3776. AAAI Press. ISBN 978-1-57735-738-4.

  19. Raskutti, G., Chen, H., & Yuan, M. (2015). Convex regularization for high-dimensional multi-response tensor regression. CoRR, abs/1512.01215v2.

  20. Shamir, O., & Shalev-Shwartz, S. (2014). Matrix completion with the trace norm: Learning, bounding, and transducing. Journal of Machine Learning Research, 15, 3401–3423.

  21. Song, Q., Ge, H., Caverlee, J., & Hu, X. (2017). Tensor completion algorithms in big data analytics. CoRR, abs/1711.10105.

  22. Tomioka, R., & Suzuki, T. (2013). Convex tensor decomposition via structured Schatten norm regularization. In NIPS.

  23. Wimalawarne, K., Sugiyama, M., & Tomioka, R. (2014). Multitask learning meets tensor factorization: Task imputation via convex optimization. In NIPS.

  24. Yang, Y., Feng, Y., & Suykens, J. A. K. (2015). A rank-one tensor updating algorithm for tensor completion. IEEE Signal Processing Letters, 22(10), 1633–1637. ISSN 1070-9908.

  25. Yuan, M., & Zhang, C.-H. (2016). On tensor completion via nuclear norm minimization. Foundations of Computational Mathematics, 16(4), 1031–1068.

  26. Zheng, V. W., Cao, B., Zheng, Y., Xie, X., & Yang, Q. (2010). Collaborative filtering meets mobile recommendation: A user-centered approach. In AAAI, AAAI’10, pp. 236–241. AAAI Press.


Acknowledgements

H.M. has been supported in part by JST ACCEL [Grant Number JPMJAC1503], MEXT Kakenhi [Grant Number 19H04169] and AIPSE by Academy of Finland.

Author information

Correspondence to Kishan Wimalawarne.

Editor: Pradeep Ravikumar.

Appendix

Dual norms of reshaped tensor nuclear norms

In this section, we discuss the dual norm of the proposed reshaped tensor nuclear norm. The dual norm is useful in developing optimization procedures and proving theoretical bounds.

The dual norm of the tensor nuclear norm (Yang et al. 2015; Yuan and Zhang 2016) for a K-mode tensor \({\mathcal {T}} \in {\mathbb {R}}^{n_1 \times \cdots \times n_K}\) is given by

$$\begin{aligned} \Vert {\mathcal {T}}\Vert _{\mathrm {op}} = \max _{\Vert y_{i}\Vert _{2} = 1, 1 \le i \le K} \langle {\mathcal {T}}, y_{1} \otimes y_{2} \otimes \cdots \otimes y_{K} \rangle . \end{aligned}$$
(6)

This definition applies to all tensor nuclear norms including the reshaped norms.
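
To make the definition concrete, the following is a minimal numerical sketch (ours, not from the paper) that estimates \(\Vert {\mathcal {T}}\Vert _{\mathrm {op}}\) by alternating maximization over the unit vectors \(y_1,\ldots ,y_K\). Since the exact tensor operator norm is NP-hard to compute (Hillar and Lim 2013), the routine only returns a local lower bound; the function name tensor_op_norm is our own.

```python
import numpy as np

def tensor_op_norm(T, n_iter=100, n_restarts=8, seed=0):
    """Estimate ||T||_op = max <T, y1 x ... x yK> over unit vectors y_k by
    alternating maximization with random restarts. Exact computation is
    NP-hard (Hillar and Lim 2013), so this only returns a local lower bound."""
    rng = np.random.default_rng(seed)
    best = 0.0
    for _ in range(n_restarts):
        ys = [rng.standard_normal(n) for n in T.shape]
        ys = [y / np.linalg.norm(y) for y in ys]
        for _ in range(n_iter):
            for k in range(T.ndim):
                # contract T with every vector except the k-th; removing
                # axes from the highest index down keeps positions valid
                M = T
                for j in range(T.ndim - 1, -1, -1):
                    if j != k:
                        M = np.tensordot(M, ys[j], axes=([j], [0]))
                nrm = np.linalg.norm(M)
                if nrm > 0:
                    ys[k] = M / nrm
        best = max(best, nrm)  # nrm = <T, y1 x ... x yK> after the last update
    return best
```

For a matrix (K = 2), the routine reduces to power iteration for the largest singular value.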

The next lemma provides the dual norm for the reshaped latent tensor nuclear norm.

Lemma 1

The dual norm of the reshaped latent tensor nuclear norm for a tensor \({\mathcal {W}}\in {\mathbb {R}}^{n_1 \times \cdots \times n_{K}}\) for a collection of G reshaping sets \(D_{\mathrm {L}} = (D^{(1)},\ldots ,D^{(G)})\) is

$$\begin{aligned} \Vert {\mathcal {W}} \Vert _{\mathrm {r\_latent}(D_{\mathrm {L}})^{*}} = \max _{g} \Vert {\mathcal {W}}_{(D^{(g)})} \Vert _{\mathrm {op}}. \end{aligned}$$

Proof

Using the standard formulation of the dual norm, we write the dual norm for \(\Vert {\mathcal {W}} \Vert _{\mathrm {r\_latent}(D_{\mathrm {L}})^{*}}\) as

$$\begin{aligned} \Vert {\mathcal {W}} \Vert _{\mathrm {r\_latent}(D_{\mathrm {L}})^{*}} = \sup \Bigg \langle \sum _{k=1}^{G} {\mathcal {X}}^{(k)}, {\mathcal {W}} \Bigg \rangle \quad \mathrm {s.t.}\; \inf _{{\mathcal {X}}^{(1)} + \cdots + {\mathcal {X}}^{(G)}= {\mathcal {X}}} \sum _{k=1}^{G} \Vert {\mathcal {X}}^{(k)}_{(D^{(k)})} \Vert _{\star } \le 1. \end{aligned}$$
(7)

The supremum in (7) is attained on the boundary of the constraint set \(\inf _{{\mathcal {X}}^{(1)} + \cdots + {\mathcal {X}}^{(G)}= {\mathcal {X}}} \sum _{k=1}^{G} \Vert {\mathcal {X}}^{(k)}_{(D^{(k)})} \Vert _{\star } \le 1\), which is a simplex, and by linearity of the objective one of its vertices is a solution. Hence, we can take any \(g \in 1,\ldots ,G\) such that \({\mathcal {X}}^{(g)} = {\mathcal {X}}\) and all \({\mathcal {X}}^{(k \ne g)} =0\), and rearrange (7) as

$$\begin{aligned} \Vert {\mathcal {W}} \Vert _{\mathrm {r\_latent}(D_{\mathrm {L}})^{*}} = \sup _{g \in 1,\ldots ,G} \Big \langle {\mathcal {X}}_{(D^{(g)})} , {\mathcal {W}}_{(D^{(g)})} \Big \rangle \quad \mathrm {s.t.}\; \Vert {\mathcal {X}}_{(D^{(g)})} \Vert _{\star } \le 1, \end{aligned}$$

which results in the following

$$\begin{aligned} \Vert {\mathcal {W}} \Vert _{\mathrm {r\_latent}(D_{\mathrm {L}})^{*}} = \max _{g \in 1,\ldots ,G} \Vert {\mathcal {W}}_{(D^{(g)})} \Vert _{\mathrm {op}}. \end{aligned}$$

\(\square\)
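
As a usage sketch of Lemma 1 (our illustration; reshape_by_sets and latent_dual_norm are hypothetical helper names, and tensor_op_norm is the heuristic sketched earlier), the dual norm can be evaluated by reshaping the tensor according to each reshaping set and taking the largest operator norm:

```python
import numpy as np

def reshape_by_sets(T, groups):
    """Reshape T by merging the modes in each group of a reshaping set,
    e.g. groups = [(0, 1), (2, 3)] turns an n1 x n2 x n3 x n4 tensor
    into an (n1*n2) x (n3*n4) matrix."""
    perm = [m for g in groups for m in g]
    shape = [int(np.prod([T.shape[m] for m in g])) for g in groups]
    return np.transpose(T, perm).reshape(shape)

def latent_dual_norm(T, reshaping_sets):
    # Lemma 1: the dual of the reshaped latent tensor nuclear norm is the
    # maximum of the operator norms over the individual reshapings.
    return max(tensor_op_norm(reshape_by_sets(T, g)) for g in reshaping_sets)

# example: a 4-mode tensor with two candidate reshapings
T = np.random.default_rng(0).standard_normal((3, 4, 5, 6))
print(latent_dual_norm(T, [[(0, 1), (2, 3)], [(0, 2), (1, 3)]]))
```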

Proofs of theoretical analysis

In this section, we provide proofs of the theoretical analysis in Sect. 4.

First, we prove the following useful lemmas. These lemmas bound the tensor nuclear norm and the reshaped tensor nuclear norms with respect to the multilinear rank of a tensor.

Lemma 2

Let \({\mathcal {X}} \in {\mathbb {R}}^{n_1 \times \cdots \times n_K}\) be a random K-mode tensor with a multilinear rank of \((r_1,\ldots ,r_K)\). Let \(r_{cp}\) be the CP rank of \({\mathcal {X}}\), then

$$\begin{aligned} \Vert {\mathcal {X}} \Vert _{\star }&= \Bigg \{ \sum _{j=1}^{r_{cp}} \gamma _{j} | {\mathcal {X}} = \sum _{j=1}^{r_{cp}} \gamma _{j} u_{1j} \otimes u_{2j} \otimes \cdots \otimes u_{Kj}, \Vert u_{kj}\Vert _{2}^{2} = 1, \gamma _{j} \ge \gamma _{j+1} > 0 \Bigg\}\\ & \le \frac{\prod _{k=1}^{K}r_{k}}{\max _{k = 1,\ldots ,K} r_{k}} \gamma _{1}, \end{aligned}$$

where \(\gamma _i\) is the ith singular value of \({\mathcal {X}}\).

Proof

Let us consider the Tucker decomposition of \({\mathcal {X}}\) as

$$\begin{aligned} {\mathcal {X}} = \sum _{j_1=1}^{r_1}\sum _{j_2=1}^{r_2}\cdots \sum _{j_K=1}^{r_K} {\mathcal {C}}_{j_1,\ldots ,j_K} u_{j_1}^{(1)} \otimes u_{j_2}^{(2)} \otimes \cdots \otimes u_{j_K}^{(K)}, \end{aligned}$$

where \({\mathcal {C}} \in {\mathbb {R}}^{r_1 \times \cdots \times r_K}\) is the core tensor and \(u^{(j)}_{i} \in {\mathbb {R}}^{n_{j}},\;\Vert u^{(j)}_{i}\Vert _{2}=1,\;i=1,\ldots ,r_j,\;j=1,\ldots ,K\) are component vectors.

Following Chapter 8 of Hackbusch (2012), we can express the above Tucker decomposition as

$$\begin{aligned} {\mathcal {X}} = \sum _{j_2=1}^{r_2}\cdots \sum _{j_K=1}^{r_K} \underbrace{\Bigg (\sum _{j_1=1}^{r_1} {\mathcal {C}}_{j_1,\ldots ,j_K} u_{j_1}^{(1)} \Bigg )}_{{\hat{u}}^{(1)}[j_2,\ldots ,j_K] \in {\mathbb {R}}^{n_1}} \otimes u_{j_2}^{(2)} \otimes \cdots \otimes u_{j_K}^{(K)}, \end{aligned}$$
(8)

where we have taken summation over the multiplications of core tensor elements and component vectors of the mode 1. It is easy to see that we can also consider the summation over component vectors of any other mode in a similar manner.

By considering \({\hat{u}}^{(1)}[j_2,\ldots ,j_K] = \gamma [j_2,\ldots ,j_K]\frac{{\hat{u}}^{(1)}[j_2,\ldots ,j_K]}{\Vert {\hat{u}}^{(1)}[j_2,\ldots ,j_K]\Vert _{2}}\) where \(\gamma [j_2,\ldots ,j_K] =\Vert {\hat{u}}^{(1)}[j_2,\ldots ,j_K]\Vert _{2}\), the above arrangement leads to a CP decomposition; regrouping over the mode with the largest rank gives \(r_{cp} = \frac{\prod _{k=1}^{K}r_{k}}{\max _{k = 1,\ldots ,K} r_{k}}\) rank-one terms.

By arranging \(\gamma [j_2,\ldots ,j_K]\) in descending order along component vectors \({\hat{u}}^{(1)}[j_2,\ldots ,j_K]\) and renaming them as \(\gamma _1 \ge \gamma _2 \ge \ldots\) and \(u_{1j}\), respectively, we obtain

$$\begin{aligned} \Vert {\mathcal {X}} \Vert _{\star } = \Bigg \{ \sum _{j=1}^{r_{cp}} \gamma _{j} | {\mathcal {X}} = \sum _{j=1}^{r_{cp}} \gamma _{j} u_{1j} \otimes u_{2j} \otimes \cdots \otimes u_{Kj}, \Vert u_{kj}\Vert _{2}^{2} = 1, \gamma _{j} \ge \gamma _{j+1} > 0 \Bigg \}, \end{aligned}$$

where \(u_{kj} \in \{u^{(k)}_{1},\ldots ,u^{(k)}_{r_{k}}\}\) are component vectors from (8) for each \(k=2,\ldots ,K\).

Then we arrive at the final bound of

$$\begin{aligned} \Vert {\mathcal {X}} \Vert _{\star }&= \Bigg \{ \sum _{j=1}^{r_{cp}} \gamma _{j} | {\mathcal {X}} = \sum _{j=1}^{r_{cp}} \gamma _{j} u_{1j} \otimes u_{2j} \otimes \cdots \otimes u_{Kj}, \Vert u_{kj}\Vert _{2}^{2} = 1, \gamma _{j} \ge \gamma _{j+1} > 0 \Bigg \} \\ & \le \frac{\prod _{k=1}^{K}r_{k}}{\max _{k = 1,\ldots ,K} r_{k}} \gamma _{1} . \end{aligned}$$

\(\square\)
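
The regrouping (8) can be checked numerically; below is a small sanity check (our sketch, with numpy) that regroups the Tucker sum over the mode with the largest rank and recovers \({\mathcal {X}}\) from \(r_{cp} = \prod _k r_k / \max _k r_k\) rank-one terms:

```python
import numpy as np

rng = np.random.default_rng(0)
r, n = (2, 3, 4), (5, 6, 7)           # multilinear rank and dimensions
C = rng.standard_normal(r)            # core tensor
U = [rng.standard_normal((n[k], r[k])) for k in range(3)]
X = np.einsum('abc,ia,jb,kc->ijk', C, *U)

# regroup the sum over mode 3 (the largest rank), as in (8)
terms = []
for a in range(r[0]):
    for b in range(r[1]):
        u_hat = U[2] @ C[a, b, :]     # hat-u vector in R^{n_3}
        terms.append(np.einsum('i,j,k->ijk', U[0][:, a], U[1][:, b], u_hat))

assert len(terms) == np.prod(r) // max(r)   # 24 / 4 = 6 rank-one terms
assert np.allclose(X, sum(terms))
```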

Lemma 3

Let \({\mathcal {X}} \in {\mathbb {R}}^{n \times \cdots \times n}\) be a random K-mode tensor with a multilinear rank of \((r_1,\ldots ,r_K)\). We consider a set of M reshaping sets \(D_i,\;i=1,\ldots ,M\). Let \(r_{cp}\) be the CP rank of \({\mathcal {X}}\), then

$$\begin{aligned}&\Vert {\mathcal {X}}_{(D_1,\ldots ,D_M)} \Vert _{\star } = \Bigg \{ \sum _{j=1}^{r_{cp}} \gamma _{j} | {\mathcal {X}}_{(D_1,\ldots ,D_M)} = \sum _{j=1}^{r_{cp}} \gamma _{j} u_{1j} \otimes u_{2j} \otimes \cdots \otimes u_{Mj},\\&\quad \Vert u_{kj}\Vert _{2}^{2} = 1, \gamma _{j} \ge \gamma _{j+1} > 0 \Bigg \} \le \frac{\prod _{k=1}^{K}r_{k}}{\max _{j = 1,\ldots ,M} \prod _{i \in D_{j}} r_{i}} \gamma _{1}, \end{aligned}$$

where \(\gamma _i\) is the ith singular value of \({\mathcal {X}}_{(D_1,\ldots ,D_M)}\).

Proof

Let us consider the Tucker decomposition of \({\mathcal {X}}\) as

$$\begin{aligned} {\mathcal {X}} = \sum _{j_1=1}^{r_1}\sum _{j_2=1}^{r_2}\cdots \sum _{j_K=1}^{r_K} {\mathcal {C}}_{j_1,\ldots ,j_K} u_{j_1}^{(1)} \otimes u_{j_2}^{(2)} \otimes \cdots \otimes u_{j_K}^{(K)}, \end{aligned}$$

where \({\mathcal {C}} \in {\mathbb {R}}^{r_1 \times \cdots \times r_K}\) is the core tensor and \(u^{(j)}_{i} \in {\mathbb {R}}^{n_{j}},\;\Vert u^{(j)}_{i}\Vert _{2}=1,\;i=1,\ldots ,r_j,\;j=1,\ldots ,K\) are component vectors. We rearrange the Tucker decomposition for the reshaped tensor \({\mathcal {X}}_{(D_1,\ldots ,D_M)}\) as

$$\begin{aligned} {\mathcal {X}}_{(D_1,\ldots ,D_M)} = \sum _{ j'_{a'}, j'_{b'},\ldots \in D_2} \cdots \sum _{ j''_{a''}, j''_{b''},\ldots \in D_M} \underbrace{\Bigg ( \sum _{ j_a, j_b,\ldots \in D_1} {\mathcal {C}}_{j_1,\ldots ,j_K} \varPi _{D_1}( u_{j_a}^{(a)} \otimes u_{j_b}^{(b)} \otimes \cdots ) \Bigg )}_{{\hat{u}}_{1}[D'_2,\ldots ,D'_M] \in {\mathbb {R}}^{\mathrm {prod}(D_1)}} \otimes \varPi _{D_2}( u_{j'_{a'}}^{(a')} \otimes u_{j'_{b'}}^{(b')} \otimes \cdots ) \otimes \cdots \otimes \varPi _{D_M}( u_{j''_{a''}}^{(a'')} \otimes u_{j''_{b''}}^{(b'')} \otimes \cdots ), \end{aligned}$$

 with \({\hat{u}}_{1}[D'_2,\ldots ,D'_M] = \gamma [D'_2,\ldots ,D'_M] \frac{{\hat{u}}_{1}[D'_2,\ldots ,D'_M]}{\Vert {\hat{u}}_{1}[D'_2,\ldots ,D'_M]\Vert _{2}}\) where \(\gamma [D'_2,\ldots ,D'_M] = \Vert {\hat{u}}_{1}[D'_2,\ldots ,D'_M]\Vert _{2}\). We can perform this regrouping over any reshaping set; choosing the set with the largest rank product \(\prod _{i \in D_j} r_i\) yields a CP decomposition with a CP rank of \(r_{cp}= \frac{\prod _{k=1}^{K}r_{k}}{\max _{j = 1,\ldots ,M} \prod _{i \in D_{j}} r_{i}}\).

By arranging \(\gamma [D'_2,\ldots ,D'_M]\) in descending order along with the component vectors \({\hat{u}}_{1}[D'_2,\ldots ,D'_M]\) and renaming them as \(\gamma _1 \ge \gamma _2 \ge \ldots\) and \(u_{1j}\), respectively, we obtain

$$\begin{aligned} \Vert {\mathcal {X}}_{(D_1,\ldots ,D_M)} \Vert _{\star } = \Bigg \{ \sum _{j=1}^{r_{cp}} \gamma _{j} | {\mathcal {X}}_{(D_1,\ldots ,D_M)} = \sum _{j=1}^{r_{cp}} \gamma _{j} u_{1j} \otimes u_{2j} \otimes \cdots \otimes u_{Mj}, \Vert u_{kj}\Vert _{2}^{2} = 1, \gamma _{j} \ge \gamma _{j+1} > 0 \Bigg \}, \end{aligned}$$

where \(u_{kj} \in \{\varPi _{D_k}( u_{1}^{(a')} \otimes u_{1}^{(b')} \otimes \cdots ) , \ldots , \varPi _{D_k}( u_{r_{a'}}^{(a')} \otimes u_{r_{b'}}^{(b')} \otimes \cdots ) \}\) are components for each \(k=2,\ldots ,M\) and \(a',b',\ldots \in D_k\).

Using the above results we arrive at the following bound

$$\begin{aligned}&\Vert {\mathcal {X}}_{(D_1,\ldots ,D_M)} \Vert _{\star } = \Bigg \{ \sum _{j=1}^{r_{cp}} \gamma _{j} | {\mathcal {X}}_{(D_1,\ldots ,D_M)} = \sum _{j=1}^{r_{cp}} \gamma _{j} u_{1j} \otimes u_{2j} \otimes \cdots \otimes u_{Mj}, \\&\quad \Vert u_{kj}\Vert _{2}^{2} = 1, \gamma _{j} \ge \gamma _{j+1} > 0 \Bigg \} \le \frac{\prod _{k=1}^{K}r_{k}}{\max _{j = 1,\ldots ,M} \prod _{i \in D_{j}} r_{i}} \gamma _{1}. \end{aligned}$$

\(\square\)

Lemma 4

Let \({\mathcal {X}} \in {\mathbb {R}}^{n_1 \times \ldots \times n_K}\) be a random K-mode tensor with CP rank of \(r_{cp}\). We consider a set of M reshaping sets \(D_i,\;i=1,\ldots ,M\). Then

$$\begin{aligned} \Vert {\mathcal {X}}_{(D_1,\ldots ,D_M)} \Vert _{\star } \le r_{cp}\gamma _{1}, \end{aligned}$$

where \(\gamma _i\) is the ith singular value of \({\mathcal {X}}_{(D_1,\ldots ,D_M)}\).

Proof

Let us consider \({\mathcal {X}}\) as

$$\begin{aligned} {\mathcal {X}} = \sum _{j=1}^{r_{cp}} \gamma _{j} u_{1j} \otimes u_{2j} \otimes \cdots \otimes u_{Kj}, \end{aligned}$$

with \(\Vert u_{kj}\Vert _{2}^{2} = 1, \gamma _{j} \ge \gamma _{j+1} > 0\). For the reshaping set \((D_1,\ldots ,D_M)\), we rearrange \({\mathcal {X}}\) as

$$\begin{aligned} {\mathcal {X}}_{(D_1,\ldots ,D_M)} = \sum _{j=1}^{r_{cp}} \gamma _{j} (\circ _{i_1 \in D_1} u_{i_1j}) \otimes (\circ _{i_2 \in D_2} u_{i_2j}) \otimes \cdots \otimes (\circ _{i_M \in D_M} u_{i_Mj}), \end{aligned}$$

where \(a \circ b = [a_1b, a_2b, \ldots , a_n b]^{\top }\) is the Khatri-Rao product (Kolda and Bader 2009). It is easy to verify that \(\mathrm {vec}((a \circ b) \otimes (c \circ d)) = \mathrm {vec}(a \otimes b \otimes c \otimes d)\), which indicates that \(\mathrm {vec}({\mathcal {X}}) = \mathrm {vec}({\mathcal {X}}_{(D_1,\ldots ,D_M)})\).
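
This vectorization identity is easy to check numerically; the snippet below (our sketch) verifies it for random vectors, using a common flattening order on both sides (the identity holds as long as the same vectorization convention is used throughout):

```python
import numpy as np

rng = np.random.default_rng(1)
a, b, c, d = (rng.standard_normal(n) for n in (2, 3, 4, 5))

kr = np.kron  # for vectors, a o b = [a_1 b, ..., a_n b]^T equals np.kron
lhs = np.outer(kr(a, b), kr(c, d)).ravel()            # vec((a o b) x (c o d))
rhs = np.einsum('i,j,k,l->ijkl', a, b, c, d).ravel()  # vec(a x b x c x d)
assert np.allclose(lhs, rhs)
```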

Using the fact that \(\mathrm {Rank}(a \otimes b) \le \mathrm {Rank}(a)\mathrm {Rank}(b)\) from Kolda and Bader (2009), we have

$$\begin{aligned} \mathrm {Rank}({\mathcal {X}}_{(D_1,\ldots ,D_M)}) \le \mathrm {Rank}({\mathcal {X}}) = r_{cp}. \end{aligned}$$

This leads to the final observation

$$\begin{aligned} \Vert {\mathcal {X}}_{(D_1,\ldots ,D_M)} \Vert _{\star } \le r_{cp}\gamma _{1}. \end{aligned}$$

\(\square\)
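
A quick numerical check of Lemma 4 (our sketch): reshaping a CP tensor by merging modes never increases its rank.

```python
import numpy as np

rng = np.random.default_rng(2)
r_cp, n = 3, (4, 5, 6, 7)
U = [rng.standard_normal((nk, r_cp)) for nk in n]     # CP factors
X = np.einsum('ir,jr,kr,lr->ijkl', *U)                # CP tensor of rank <= 3

# reshape with D1 = {1, 2}, D2 = {3, 4}; the matricization has rank <= r_cp
M = X.reshape(n[0] * n[1], n[2] * n[3])
assert np.linalg.matrix_rank(M) <= r_cp
```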

In order to prove the Rademacher complexities in Theorems 1 and 2, we use the following lemma from Raskutti et al. (2015).

Lemma 5

(Raskutti et al. 2015) Consider a K-mode tensor \({\mathcal {X}} \in {\mathbb {R}}^{n_1 \times \cdots \times n_K}\) with entries drawn from an i.i.d. Gaussian ensemble. Then

$$\begin{aligned} {\mathbb {E}}\Vert {\mathcal {X}} \Vert _{\mathrm {op}} \le 4\log (4K)\sum _{k=1}^{K}\sqrt{n_k}. \end{aligned}$$
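
As a rough illustration of Lemma 5 (our sketch; note that the alternating-maximization routine above only lower-bounds the true operator norm, so the comparison is one-sided):

```python
import numpy as np

rng = np.random.default_rng(4)
shape = (8, 9, 10)
# heuristic estimates of ||X||_op for a few Gaussian tensors
est = np.mean([tensor_op_norm(rng.standard_normal(shape), seed=s)
               for s in range(5)])
bound = 4 * np.log(4 * len(shape)) * sum(np.sqrt(nk) for nk in shape)
print(f"estimated E||X||_op ~ {est:.1f} <= bound {bound:.1f}")
```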

Given a tensor \({\mathcal {X}} \in {\mathbb {R}}^{n_1 \times \cdots \times n_K}\) with i.i.d. Gaussian entries, we can write

$$\begin{aligned} {\mathcal {X}} = \sum _{i_1,i_2,\ldots ,i_K} {\mathcal {X}}_{i_1,i_2,\ldots ,i_K}\, e_{i_1} \otimes e_{i_2} \otimes \cdots \otimes e_{i_K}, \end{aligned}$$

where \(e_{i_k}\) is the standard basis vector with 1 at the \(i_k\)th element and zeros elsewhere. Since each Gaussian entry is equal in distribution to \(\epsilon _{i_1,\ldots ,i_K} |{\mathcal {X}}_{i_1,\ldots ,i_K}|\) with an independent sign \(\epsilon _{i_1,\ldots ,i_K} \in \{-1,1\}\), we have

$$\begin{aligned} {\mathbb {E}}\Vert {\mathcal {X}}\Vert _{\mathrm {op}} = {\mathbb {E}}_{g}{\mathbb {E}}_{\epsilon }\Bigg \Vert \sum _{i_1,\ldots ,i_K} \epsilon _{i_1,\ldots ,i_K}|{\mathcal {X}}_{i_1,\ldots ,i_K}|\, e_{i_1} \otimes \cdots \otimes e_{i_K}\Bigg \Vert _{\mathrm {op}}. \end{aligned}$$

Using Jensen's inequality, we have

$$\begin{aligned} {\mathbb {E}}_{g}{\mathbb {E}}_{\epsilon }\Bigg \Vert \sum _{i_1,\ldots ,i_K} \epsilon _{i_1,\ldots ,i_K}|{\mathcal {X}}_{i_1,\ldots ,i_K}|\, e_{i_1} \otimes \cdots \otimes e_{i_K}\Bigg \Vert _{\mathrm {op}} &\ge {\mathbb {E}}_{\epsilon }\Bigg \Vert \sum _{i_1,\ldots ,i_K} \epsilon _{i_1,\ldots ,i_K}\,{\mathbb {E}}_{g}|{\mathcal {X}}_{i_1,\ldots ,i_K}|\, e_{i_1} \otimes \cdots \otimes e_{i_K}\Bigg \Vert _{\mathrm {op}} \\ &= \sqrt{\tfrac{2}{\pi }}\,{\mathbb {E}}_{\epsilon }\Bigg \Vert \sum _{i_1,\ldots ,i_K} \epsilon _{i_1,\ldots ,i_K}\, e_{i_1} \otimes \cdots \otimes e_{i_K}\Bigg \Vert _{\mathrm {op}}. \end{aligned}$$

This shows that the Gaussian bound of Lemma 5 can also be used, up to a constant factor, to bound the expected operator norm of tensors with Rademacher (Bernoulli) random entries.
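
The constant above comes from the first absolute moment of a standard Gaussian, \({\mathbb {E}}|g| = \sqrt{2/\pi }\); a one-line Monte Carlo check (our sketch):

```python
import numpy as np

g = np.random.default_rng(3).standard_normal(10**6)
print(np.abs(g).mean(), np.sqrt(2 / np.pi))  # both ~ 0.7979
```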

Next we give the detailed proof of Theorem 1.

Proof of Theorem 1

We expand the Rademacher complexity in (5) as

$$\begin{aligned} R_{\mathrm {S}}(l \circ {\mathcal {W}}) = \frac{1}{|\mathrm {S}|}{\mathbb {E}}_{\sigma }\Bigg [\sup _{{\mathcal {W}} \in \textsf {W}} \sum _{i_1,\ldots ,i_K} \Sigma _{i_1,\ldots ,i_K}\, l({\mathcal {X}}_{i_1,\ldots ,i_K}, {\mathcal {W}}_{i_1,\ldots ,i_K})\Bigg ], \end{aligned}$$

where \(\Sigma _{{i_1, \ldots ,i_K}} = \sigma _{j}\) when \((i_{1},\ldots ,i_{K}) \in \mathrm {S}\) and \(\Sigma _{{i_1, \ldots ,i_K}} = 0\), otherwise.

We analyze the Rademacher complexity

$$\begin{aligned} R_{\mathrm {S}}(l \circ {\mathcal {W}}) &= \frac{1}{|\mathrm {S}|}{\mathbb {E}}_{\sigma }\Bigg [\sup _{{\mathcal {W}} \in \textsf {W}} \sum _{i_1,\ldots ,i_K} \Sigma _{i_1,\ldots ,i_K}\, l({\mathcal {X}}_{i_1,\ldots ,i_K}, {\mathcal {W}}_{i_1,\ldots ,i_K})\Bigg ] \\ &\le \frac{\varLambda }{|\mathrm {S}|}{\mathbb {E}}_{\sigma }\Bigg [\sup _{{\mathcal {W}} \in \textsf {W}} \sum _{i_1,\ldots ,i_K} \Sigma _{i_1,\ldots ,i_K}\, {\mathcal {W}}_{i_1,\ldots ,i_K}\Bigg ] \quad (\text {Rademacher contraction}) \\ &\le \frac{\varLambda }{|\mathrm {S}|}{\mathbb {E}}_{\sigma }\sup _{{\mathcal {W}} \in \textsf {W}} \Vert {\mathcal {W}}_{(D_1,\ldots ,D_M)}\Vert _{\star }\, \Vert \Sigma _{(D_1,\ldots ,D_M)}\Vert _{\star ^{*}} \quad (\text {duality relationship}). \end{aligned}$$
(9)

(a) Given that the tensor has a multilinear rank of \((r_1,\ldots ,r_K)\), using Lemma 3 we know that

$$\begin{aligned} \Vert {\mathcal {W}}_{(D_1,\ldots ,D_{M})} \Vert _{\star } \le \bigg ( \frac{\prod _{k=1}^{K}r_{k}}{\max _{j = 1,\ldots ,M} \prod _{i \in D_{j}} r_{i}} \bigg ) \gamma _1({\mathcal {W}}_{(D_1,\ldots ,D_{M})}) . \end{aligned}$$
(10)

Using Lemma 5 we can bound \({\mathbb {E}}_{\sigma }\Vert \Sigma _{(D_1,\ldots ,D_M)}\Vert _{\star ^{*}}\) as

$$\begin{aligned} {\mathbb {E}}_{\sigma }\Vert \Sigma _{(D_1,\ldots ,D_M)}\Vert _{\star ^{*}} \le 4\log (4M)\sum _{j=1}^{M}\sqrt{\prod _{p \in D_j} n_p}. \end{aligned}$$
(11)

By substituting (10) and (11) into (9), we obtain the following bound

$$\begin{aligned} R_{\mathrm {S}}(l \circ {\mathcal {W}}) \le \frac{c\varLambda }{|\mathrm {S}|}\bigg ( \frac{\prod _{k=1}^{K}r_{k}}{\max _{j = 1,\ldots ,M} \prod _{i \in D_{j}} r_{i}} \bigg ) \gamma _1({\mathcal {W}}_{(D_1,\ldots ,D_{M})})\log (4M)\sum _{j = 1}^{M}\sqrt{ \prod _{p \in D_{j}} n_{p}}. \end{aligned}$$
(12)
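
To see how the two factors of (12) interact, the following sketch (our illustration, with hypothetical dimensions \(n_k = 100\) and ranks \(r_k = 5\)) evaluates the rank factor and the dimension factor for a few reshaping choices of a 4-mode tensor; coarser reshapings shrink the rank factor while growing the dimension factor, which is the trade-off the reshaped norms exploit:

```python
import numpy as np

n, r = [100] * 4, [5] * 4   # hypothetical dimensions and multilinear ranks

def bound_factors(D):
    """Rank and dimension factors of (12) for a reshaping D of modes 0..3."""
    rank_f = np.prod(r) / max(np.prod([r[i] for i in g]) for g in D)
    dim_f = sum(np.sqrt(np.prod([n[p] for p in g])) for g in D)
    return rank_f, dim_f

for D in [((0,), (1,), (2,), (3,)),   # no reshaping
          ((0, 1), (2,), (3,)),       # merge two modes
          ((0, 1), (2, 3))]:          # 'square' reshaping
    print(D, bound_factors(D))
```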

(b) Given that a tensor has a CP rank of \(r_{cp}\), using Lemma 4 we have

$$\begin{aligned} \Vert {\mathcal {W}}_{(D_1,\ldots ,D_{M})} \Vert _{\star } \le r_{cp} \gamma _1({\mathcal {W}}_{(D_1,\ldots ,D_{M})}). \end{aligned}$$
(13)

From Lemma 5, we have

$$\begin{aligned} {\mathbb {E}}_{\sigma }\Vert \Sigma _{(D_1,\ldots ,D_M)}\Vert _{\star ^{*}} \le 4\log (4M)\sum _{j=1}^{M}\sqrt{\prod _{p \in D_j} n_p}. \end{aligned}$$
(14)

By substituting (13) and (14) into (9), we obtain the desired bound

$$\begin{aligned} R_{\mathrm {S}}(l \circ {\mathcal {W}}) \le \frac{c\varLambda }{|\mathrm {S}|}r_{cp}\, \gamma _1({\mathcal {W}}_{(D_1,\ldots ,D_{M})})\log (4M)\sum _{j = 1}^{M} \sqrt{\prod _{p \in D_j} n_{p}}. \end{aligned}$$
(15)

\(\square\)

Next, we give the proof for Theorem 2.

Proof of Theorem 2

We expand the Rademacher complexity in (5) using latent tensors \({\mathcal {W}}^{(1)},\ldots ,{\mathcal {W}}^{(G)}\) for the reshaped latent tensor nuclear norm as

$$\begin{aligned} R_{\mathrm {S}}(l \circ ({\mathcal {W}}^{(1)} + \cdots + {\mathcal {W}}^{(G)})) = \frac{1}{|\mathrm {S}|}{\mathbb {E}}_{\sigma }\Bigg [\sup _{{\mathcal {W}}^{(1)} + \cdots + {\mathcal {W}}^{(G)} = {\mathcal {W}} \in \textsf {W}_{\mathrm {rl}}} \sum _{i_1,\ldots ,i_K} \Sigma _{i_1,\ldots ,i_K}\, l({\mathcal {X}}_{i_1,\ldots ,i_K}, {\mathcal {W}}_{i_1,\ldots ,i_K})\Bigg ], \end{aligned}$$

where \(\Sigma _{{i_1, \ldots ,i_K}} = \sigma _{j}\) when \((i_{1},\ldots ,i_{K}) \in \mathrm {S}\) and \(\Sigma _{{i_1, \ldots ,i_K}} = 0\), otherwise.

We analyze the Rademacher complexity as

$$\begin{aligned} R_{\mathrm {S}}(l \circ ({\mathcal {W}}^{(1)} + \cdots + {\mathcal {W}}^{(G)})) &= \frac{1}{|\mathrm {S}|}{\mathbb {E}}_{\sigma }\Bigg [\sup _{{\mathcal {W}}^{(1)} + \cdots + {\mathcal {W}}^{(G)} = {\mathcal {W}} \in \textsf {W}_{\mathrm {rl}}} \sum _{i_1,\ldots ,i_K} \Sigma _{i_1,\ldots ,i_K}\, l({\mathcal {X}}_{i_1,\ldots ,i_K}, {\mathcal {W}}_{i_1,\ldots ,i_K})\Bigg ] \\ &\le \frac{\varLambda }{|\mathrm {S}|}{\mathbb {E}}_{\sigma }\Bigg [\sup _{{\mathcal {W}}^{(1)} + \cdots + {\mathcal {W}}^{(G)} = {\mathcal {W}} \in \textsf {W}_{\mathrm {rl}}} \sum _{i_1,\ldots ,i_K} \Sigma _{i_1,\ldots ,i_K}\, {\mathcal {W}}_{i_1,\ldots ,i_K}\Bigg ] \quad (\text {Rademacher contraction}) \\ &\le \frac{\varLambda }{|\mathrm {S}|}{\mathbb {E}}_{\sigma }\sup _{{\mathcal {W}}^{(1)} + \cdots + {\mathcal {W}}^{(G)} = {\mathcal {W}} \in \textsf {W}_{\mathrm {rl}}} \Vert {\mathcal {W}}\Vert _{\mathrm {r\_latent}}\, \Vert \Sigma \Vert _{\mathrm {r\_latent}^{*}} \quad (\text {duality relationship}). \end{aligned}$$
(16)

(a) For a tensor with multilinear rank, using Lemma 3 we obtain

$$\begin{aligned} \Vert {\mathcal {W}}\Vert _{\mathrm {r\_latent}}&= \inf _{{\mathcal {W}}^{(1)} + \cdots + {\mathcal {W}}^{(G)}= {\mathcal {W}}} \sum _{g=1}^{G} \Vert {\mathcal {W}}^{(g)}_{(D^{(g)}_{1},\ldots ,D^{(g)}_{M_{g}})} \Vert _{\star } \\&\le \min _{g \in G} \bigg ( \frac{\prod _{k=1}^{K}r_{k}}{\max _{j = 1,\ldots ,M_{g}} \prod _{i \in D^{(g)}_{j}} r_{i}} \bigg ) \gamma _1({\mathcal {W}}^{(g)}_{(D^{(g)}_{1},\ldots ,D^{(g)}_{M_{g}})}) . \end{aligned}$$
(17)

Using Lemmas 1 and 5 we can bound \({\mathbb {E}}_{\sigma }\Vert \Sigma \Vert _{\mathrm {r\_latent}^{*}}\) as

$$\begin{aligned} {\mathbb {E}}_{\sigma }\Vert \Sigma \Vert _{\mathrm {r\_latent}^{*}} = {\mathbb {E}}_{\sigma }\max _{g \in G}\Vert \Sigma _{(D^{(g)}_{1},\ldots ,D^{(g)}_{M_{g}})}\Vert _{\mathrm {op}} \le 4\max _{g \in G}\Bigg (\log (4M_{g})\sum _{j=1}^{M_{g}}\sqrt{\prod _{p \in D^{(g)}_{j}} n_p}\Bigg ). \end{aligned}$$
(18)

By substituting (17) and (18) to (16), we obtain the following bound

$$\begin{aligned}R_{\mathrm {S}}(l \circ {\mathcal {W}}) &\le \frac{c\varLambda }{|\mathrm {S}|}\min _{g \in G} \bigg ( \frac{\prod _{k=1}^{K}r_{k}}{\max _{j = 1,\ldots ,M_{g}} \prod _{i \in D^{(g)}_{j}} r_{i}} \bigg ) \gamma _1({\mathcal {W}}^{(g)}_{(D^{(g)}_{1},\ldots ,D^{(g)}_{M_{g}})})\\&\quad \max _{g \in G} \log (4M_g) \sum _{j = 1}^{M_{g}}\sqrt{ \prod _{p \in D^{(g)}_{j}} n_{p} }. \end{aligned}$$

(b) For a tensor with a CP rank of \(r_{cp}\), using Lemma 4 we obtain

$$\begin{aligned} \Vert {\mathcal {W}}\Vert _{\mathrm {r\_latent}} = \inf _{{\mathcal {W}}^{(1)} + \cdots + {\mathcal {W}}^{(G)}= {\mathcal {W}}} \sum _{g=1}^{G} \Vert {\mathcal {W}}^{(g)}_{(D^{(g)}_{1},\ldots ,D^{(g)}_{M_{g}})} \Vert _{\star } \le \min _{g \in G} r_{cp} \gamma _1\left({\mathcal {W}}^{(g)}_{(D^{(g)}_{1},\ldots ,D^{(g)}_{M_{g}})}\right) . \end{aligned}$$
(19)

By substituting (19) and (18) into (16), we obtain the following bound

$$\begin{aligned} R_{\mathrm {S}}(l \circ {\mathcal {W}}) \le \frac{c\varLambda }{|\mathrm {S}|}\min _{g \in G} r_{cp} \gamma _1({\mathcal {W}}^{(g)}_{(D^{(g)}_{1},\ldots ,D^{(g)}_{M_{g}})})\max _{g \in G} \log (4M_g) \sum _{j = 1}^{M_{g}}\sqrt{ \prod _{p \in D^{(g)}_{j}} n_{p} }. \end{aligned}$$

\(\square\)

Finally, we derive the Rademacher complexity for the tensor completion model regularized by the Schatten TT norm.

Theorem 3

Consider a K-mode tensor \({\mathcal {W}} \in {\mathbb {R}}^{n_1 \times \cdots \times n_K}\) with a multilinear rank of \((r_1,\ldots ,r_K)\). Let us consider the hypothesis class \(\textsf {W}_{\mathrm {TT}} = \{{\mathcal {W}}| \Vert {\mathcal {W}} \Vert _{s,T} \le t \}\). Then the Rademacher complexity is bounded as

$$\begin{aligned} R_{\mathrm {S}}(l \circ {\mathcal {W}}) \le \frac{c'\varLambda }{|\mathrm {S}|} \sum _{k=1}^{K-1} \min \Bigg ( \prod _{i=1}^{k} \sqrt{r_i}, \prod _{j=k+1}^{K} \sqrt{r_j} \Bigg ) B_{{\mathcal {T}}} \min _{k=1,\ldots ,K-1} \Bigg (\sqrt{\prod _{i < k}{n_{i}}} + \sqrt{\prod _{j \ge k}^{K}{n_{j}}} \Bigg ), \end{aligned}$$
(20)

where \(\Vert {\mathcal {W}} \Vert _{\mathrm {F}} \le B_{{\mathcal {T}}}\) and \(c'\) is a constant.

Proof

For this case, we expand the Rademacher complexity over the hypothesis class \(\textsf {W}_{\mathrm {TT}}\) as

$$\begin{aligned} R_{\mathrm {S}}(l \circ {\mathcal {W}}) = \frac{1}{|\mathrm {S}|}{\mathbb {E}}_{\sigma }\Bigg [\sup _{{\mathcal {W}} \in \textsf {W}_{\mathrm {TT}}} \sum _{i_1,\ldots ,i_K} \Sigma _{i_1,\ldots ,i_K}\, l({\mathcal {X}}_{i_1,\ldots ,i_K}, {\mathcal {W}}_{i_1,\ldots ,i_K})\Bigg ], \end{aligned}$$

where \(\Sigma _{{i_1, \ldots ,i_K}} = \sigma _{j}\) when \((i_{1},\ldots ,i_{K}) \in S\) and \(\Sigma _{{i_1, \ldots ,i_K}} = 0\), otherwise.

Now we analyze the Rademacher complexity for the hypothesis class \(\textsf {W}_{\mathrm {TT}}\). We have

$$\begin{aligned} R_{\mathrm {S}}(l \circ {\mathcal {W}}) &= \frac{1}{|\mathrm {S}|}{\mathbb {E}}_{\sigma }\Bigg [\sup _{{\mathcal {W}} \in \textsf {W}_{\mathrm {TT}}} \sum _{i_1,\ldots ,i_K} \Sigma _{i_1,\ldots ,i_K}\, l({\mathcal {X}}_{i_1,\ldots ,i_K}, {\mathcal {W}}_{i_1,\ldots ,i_K})\Bigg ] \\ &\le \frac{\varLambda }{|\mathrm {S}|}{\mathbb {E}}_{\sigma }\Bigg [\sup _{{\mathcal {W}} \in \textsf {W}_{\mathrm {TT}}} \sum _{i_1,\ldots ,i_K} \Sigma _{i_1,\ldots ,i_K}\, {\mathcal {W}}_{i_1,\ldots ,i_K}\Bigg ] \quad (\text {Rademacher contraction}) \\ &\le \frac{\varLambda }{|\mathrm {S}|}{\mathbb {E}}_{\sigma }\sup _{{\mathcal {W}} \in \textsf {W}_{\mathrm {TT}}} \Vert {\mathcal {W}}\Vert _{s,T}\, \Vert \Sigma \Vert _{{s,T}^{*}} \quad (\text {duality relationship}), \end{aligned}$$
(21)

where \(\Vert \cdot \Vert _{{s,T}^{*}}\) is the dual norm of \(\Vert \cdot \Vert _{{s,T}}\). The last step can be obtained by applying Hölder's inequality to the sum of trace norms in the Schatten TT norm.

Considering \(\Vert {\mathcal {W}} \Vert _{s,T}\), we can expand it as

$$\begin{aligned} \Vert {\mathcal {W}} \Vert _{s,T} = \frac{1}{K-1} \sum _{k=1}^{K-1} \Vert Q_{k}({\mathcal {W}}) \Vert _{\mathrm {tr}} = \frac{1}{K-1} \sum _{k=1}^{K-1} \sum _{i_{k}=1}^{{\hat{r}}_{k}} \gamma _{i_k}(Q_{k}({\mathcal {W}})), \end{aligned}$$

where \(Q_{k}: {\mathbb {R}}^{n_1 \times \cdots \times n_K} \rightarrow {\mathbb {R}}^{(\prod _{i < k} n_i) \times (\prod _{j \ge k} n_j)}\) is a reshaping operator, and \(\gamma _{i_k}(\cdot )\) and \({\hat{r}}_k\) are the \(i_k\)th singular value and the rank of the reshaped tensor \(Q_{k}({\mathcal {W}})\), respectively. Using the Cauchy-Schwarz inequality, we have

$$\begin{aligned} \Vert {\mathcal {W}} \Vert _{s,T} \le \frac{1}{K-1} \sum _{k=1}^{K-1} \sqrt{{\hat{r}}_{k}} \sqrt{\sum _{i_{k}=1}^{{\hat{r}}_{k}} \gamma _{i_k}^{2}(Q_{k}({\mathcal {W}}))} \le \frac{1}{K-1} \sum _{k=1}^{K-1} \sqrt{{\hat{r}}_{k}}\, B_{{\mathcal {T}}}, \end{aligned}$$

where \(\Vert {\mathcal {W}} \Vert _{\mathrm {F}} \le B_{{\mathcal {T}}}\). Using Lemmas 2 and 3, we can infer that

$$\begin{aligned} \Vert {\mathcal {W}} \Vert _{s,T} \le \frac{1}{K-1} \sum _{k=1}^{K-1} \min \Bigg ( \prod _{i=1}^{k} \sqrt{r_i}, \prod _{j=k+1}^{K} \sqrt{r_j} \Bigg ) B_{{\mathcal {T}}}. \end{aligned}$$
(22)

Similar to the overlapped trace norm (Tomioka and Suzuki 2013), the Schatten TT norm sums nuclear norms of the same tensor reshaped into different matrices. Hence, we can extend the dual norm of the overlapped trace norm in Tomioka and Suzuki (2013) to the Schatten TT norm and obtain the dual norm of the Schatten TT norm as

$$\begin{aligned} \Vert \Sigma \Vert _{{s,T}^{*}} = \inf _{\Sigma ^{(1)} + \cdots + \Sigma ^{(K-1)} = \Sigma } \sum _{k=1}^{K-1} \Vert Q_{k}(\Sigma ^{(k)}) \Vert _{\mathrm {op}}. \end{aligned}$$

We want to bound

$$\begin{aligned} {\mathbb {E}}\Vert \Sigma \Vert _{{s,T}^{*}} = {\mathbb {E}}\inf _{\Sigma ^{(1)} + \cdots + \Sigma ^{(K-1)} = \Sigma } \sum _{k=1}^{K-1} \Vert Q_{k}(\Sigma ^{(k)}) \Vert _{\mathrm {op}}, \end{aligned}$$

and since we can take any one of \(\Sigma ^{(k)}, k = 1,\ldots ,K-1\) to be equal to \(\Sigma\) (and the rest zero), we have

$$\begin{aligned} {\mathbb {E}}\Vert \Sigma \Vert _{{s,T}^{*}} \le \min _{k = 1,\ldots ,K-1} {\mathbb {E}}\Vert Q_{k}(\Sigma ) \Vert _{\mathrm {op}}. \end{aligned}$$

We apply Latała’s theorem (Latała 2005; Shamir and Shalev-Shwartz 2014) to the reshaped matrix \(Q_{k}(\Sigma )\) and bound \({\mathbb {E}}\Vert Q_{k}(\Sigma ) \Vert _{\mathrm {op}}\) as

$$\begin{aligned} {\mathbb {E}}\Vert Q_{k}(\Sigma ) \Vert _{\mathrm {op}} \le C_{1}\Bigg (\sqrt{\prod _{i < k} n_i} + \sqrt{\prod _{j \ge k}^{K} n_j} + \root 4 \of {|Q_{k}(\Sigma )|}\Bigg ), \end{aligned}$$

where \(|Q_{k}(\Sigma )|\) denotes the number of entries of \(Q_{k}(\Sigma )\),

and since \(\root 4 \of {|Q_{k}(\Sigma )|} \le \root 4 \of { \prod _{i=1}^{K}{n_{i}}} \le \frac{1}{2} \bigg (\sqrt{\prod _{i < k}{n_{i}}} + \sqrt{\prod _{j \ge k}^{K}{n_{j}}} \bigg )\) by the AM-GM inequality, we have

$$\begin{aligned} {\mathbb {E}}\Vert Q_{k}(\Sigma ) \Vert _{\mathrm {op}} \le \frac{3C_{1}}{2}\Bigg (\sqrt{\prod _{i < k} n_i} + \sqrt{\prod _{j \ge k}^{K} n_j}\Bigg ). \end{aligned}$$

This gives us the bound for \({\mathbb {E}}\Vert \Sigma \Vert _{{s,T}^{*}}\) as

$$\begin{aligned} {\mathbb {E}}\Vert \Sigma \Vert _{{s,T}^{*}} \le \min _{k = 1,\ldots ,K-1} \frac{3C_{1}}{2}\Bigg (\sqrt{\prod _{i < k} n_i} + \sqrt{\prod _{j \ge k}^{K} n_j}\Bigg ). \end{aligned}$$
(23)

By substituting (22) and (23) into (21), we obtain

$$\begin{aligned} R_{\mathrm {S}}(l \circ {\mathcal {W}}) \le \frac{c'\varLambda }{|\mathrm {S}|(K-1)} \sum _{k=1}^{K-1} \min \Bigg (\prod _{i=1}^{k} \sqrt{r_i}, \prod _{j=k+1}^{K} \sqrt{r_j}\Bigg ) B_{{\mathcal {T}}}\, \min _{k=1,\ldots ,K-1}\Bigg (\sqrt{\prod _{i < k} n_i} + \sqrt{\prod _{j \ge k}^{K} n_j}\Bigg ). \end{aligned}$$
(24)

\(\square\)

Cite this article

Wimalawarne, K., Mamitsuka, H. Reshaped tensor nuclear norms for higher order tensor completion. Mach Learn 110, 507–531 (2021). https://doi.org/10.1007/s10994-020-05927-y

Keywords

  • Tensor nuclear norm
  • Reshaping
  • CP rank
  • Generalization bounds